[Bitcoin-development] Full Clients in the future - Blockchain management

public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed

From: Alan Reiner <etotheipi@gmail.com>
To: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: [Bitcoin-development] Full Clients in the future - Blockchain management
Date: Sat, 02 Jun 2012 11:40:27 -0400	[thread overview]
Message-ID: <4FCA33EB.5030706@gmail.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 2801 bytes --]

Devs,

I have decided to upgrade Armory's blockchain utilities, partly out of 
necessity due to a poor code decision I made before I even decided I was 
making a client.  In an effort to avoid such mistakes again, I want to 
do it "right" this time around, and realize that this is a good 
discussion for all the devs that will have to deal with this eventually...

The part I'm having difficulty with, is the idea that in a few years 
from now, it just may not be feasible to hold transactions 
file-/pointers/ in RAM, because even that would overwhelm standard RAM 
sizes.  Without any degree of blockchain compression, I see that the 
most general, scalable solution is probably a complicated one.

On the other hand, where this fails may be where we have already 
predicted that the network will have to split into "super-nodes" and 
"lite nodes."  In which case, this discussion is still a good one, but 
just directed more towards the super-nodes.  But, there may still be a 
point at which super-nodes don't have enough RAM to hold this data...

(1)  As for how small you can get the data:  my original idea was that 
the entire blockchain is stored on disk as blkXXXX.dat files.  I store 
all transactions as 10-byte "file-references."  10 bytes would be

     -- X in blkX.dat (2 bytes)
     -- Tx start byte (4 bytes)
     -- Tx size bytes (4 bytes)

The file-refs would be stored in a multimap indexed by the first 6 bytes 
of the tx-hash.  In this way, when I search the multimap, I potentially 
get a list of file-refs, and I might have to retrieve a couple of tx 
from disk before finding the right one, but it would be a good trade-off 
compared to storing all 32 bytes (that's assuming that multimap nodes 
don't have too much overhead).

But even with this, if there are 1,000,000,000 transactions in the 
blockchain, each node is probably 48 bytes  (16 bytes + map/container 
overhead), then you're talking about 48 GB to track all the data in 
RAM.  mmap() may help here, but I'm not sure it's the right solution

(2) What other ways are there, besides some kind of blockchain 
compression, to maintain a multi-terabyte blockchain, assuming that 
storing references to each tx would overwhelm available RAM?   Maybe 
that assumption isn't necessary, but I think it prepares for the worst.

Or maybe I'm too narrow in my focus.  How do other people envision this 
will be handled in the future.  I've heard so many vague notions of 
"well we could do /this/ or /that/, or it wouldn't be hard to do /that/" 
but I haven't heard any serious proposals for it.  And while I believe 
that blockchain compression will become ubiquitous in the future, not 
everyone believes that, and there will undoubtedly be users/devs that 
/want/ to maintain everything under all circumstances.

-Alan

[-- Attachment #2: Type: text/html, Size: 3412 bytes --]

next             reply	other threads:[~2012-06-02 15:40 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-02 15:40 Alan Reiner [this message]
2012-06-03 14:17 ` [Bitcoin-development] Full Clients in the future - Blockchain management Mike Hearn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FCA33EB.5030706@gmail.com \
    --to=etotheipi@gmail.com \
    --cc=bitcoin-development@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox