public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Peter Todd <pete@petertodd.org>
To: Jeremy Spilman <jeremy@taplink.co>
Cc: bitcoin-development@lists.sourceforge.net
Subject: Re: [Bitcoin-development] Privacy and blockchain data
Date: Fri, 10 Jan 2014 05:10:51 -0500	[thread overview]
Message-ID: <20140110101051.GA25749@savin> (raw)
In-Reply-To: <op.w9cu78gsyldrnw@laptop-air.hsd1.ca.comcast.net>

[-- Attachment #1: Type: text/plain, Size: 3843 bytes --]

On Tue, Jan 07, 2014 at 10:34:46PM -0800, Jeremy Spilman wrote:
> >
> >2) Common prefixes: Generate addresses such that for a given wallet they
> >   all share a fixed prefix. The length of that prefix determines the
> >   anonymity set and associated privacy/bandwidth tradeoff, which
> >   remainds a fixed ratio of all transactions for the life of the
> >   wallet.
> >
> 
> Interesting thought to make the privacy/bandwidth trade-off using
> vanitygen and prefix filters.
> 
> But doesn't this effectively expand the universe of potential spies
> from 'the global attacker' who is watching your SPV queries, to
> simply 'the globe' -- anyone with a copy of the blockchain?

It's a trade-off. Most people are going to use public peers for their
SPV nodes - they're not going to run full nodes. They also are going to
want to limit how much bandwidth they use to sync their wallets; if they
don't care the can use a very short, or no, prefix and the problem goes
away.

If you make that bandwidth/privacy trade-off by using very specific
filters and non-specific addresses then you have a situation where those
public peers are learning a lot of potentially valuable data. It's easy
to imagine, say, the IRS being willing to pay for data on how many
Bitcoins people have in their wallets to try to catch tax cheats for
instance, and that can easily fund a lot of fast and high-quality peers
that don't advertise the fact that they're selling data on the contents
of your wallet.

On the other hand if you use non-specific filters, and prefixed
addresses for incoming payments, then you're not leaking high-quality
information to anyone. I think this makes for a more robust Bitcoin
system, especially as we need things like CoinJoin for privacy that make
*everyones* privacy matter to you; CoinJoin could easily be defeated by
aquiring lots of good info on the contents of wallets through SPV
queries.

> Some stats on UTXO set size:  (slightly stale -- as of block 270733)
> 
>    7.4m unspent outputs
>    2.2m transactions with unspent outputs
>    2.1m unique unspent scriptPubKeys
>    Side note: the top 1,000 scriptPubKeys have 10% of all unspent outputs.
> 
> Let's say you use an 8-bit prefix (1/256) that would be ~10,000
> transactions in the UTXO you would be monitoring. But if I knew a
> few different days / time-periods you transacted, I could figure out
> your prefix.

Actually UTXO isn't the right way to look at this; prefix filters would
be almost certainly matched against all txouts in blocks. Or put another
way, UTXO isn't the right way to look at it because the attacker will
have some rough idea of the time period, and wants to know about
transactions made.

> Of course, anyone you transact with would know your prefix outright.

Well what good, in your example, is it for the attacker to go from "I
know my target gets a paycheck every two weeks for $x" to "His wallet
prefix is abcd with y% probability"? Even once you learn the prefix of
your target's wallet, what funds they actually own is still embedded in
a much larger anonymity set of hundreds to thousands of transactions
that had nothing to do with them.

> Wouldn't this also allow obvious identification of spend versus
> change addresses in a transaction?

No, I specifically said that you don't want to use prefixes with change
txouts for that reason. Fortunately while the set of all scriptPubKey's
ever used for change txouts will grow over time, as long as you are not
watching for new payments on any key in that set you only need to query
for the ones that still have funds on them, and that's only because you
want to be able to detect unauthorized spends of them.

-- 
'peter'[:-1]@petertodd.org
00000000000000028a5c9edabc9697fe96405f667be1d6d558d1db21d49b8857

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 685 bytes --]

      reply	other threads:[~2014-01-10 10:11 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-06  2:53 [Bitcoin-development] Privacy and blockchain data Peter Todd
2014-01-08  6:34 ` Jeremy Spilman
2014-01-10 10:10   ` Peter Todd [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140110101051.GA25749@savin \
    --to=pete@petertodd.org \
    --cc=bitcoin-development@lists.sourceforge.net \
    --cc=jeremy@taplink.co \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox