[Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk

public inbox for bitcoindev@googlegroups.com

* [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
@ 2013-03-12  0:18 Pieter Wuille
  2013-03-12  1:01 ` Pieter Wuille
  0 siblings, 1 reply; 18+ messages in thread
From: Pieter Wuille @ 2013-03-12  0:18 UTC (permalink / raw)
  To: Bitcoin Dev, bitcoin-security

Hello everyone,

I've just seen many reports of 0.7 nodes getting stuck around block 225430,
due to running out of lock entries in the BDB database. 0.8 nodes do not
seem to have a problem.

In any case, if you do not have this block:

  2013-03-12 00:00:10 SetBestChain: new
best=000000000000015aab28064a4c521d6a5325ff6e251e8ca2edfdfe6cb5bf832c
 height=225439  work=853779625563004076992  tx=14269257  date=2013-03-11
23:49:08

you're likely stuck. Check debug.log and db.log (look for 'Lock table is
out of available lock entries').
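
The log check described above could be scripted roughly as follows. This is only a sketch: the function name and the idea of passing the data directory as an argument are illustrative assumptions; the file names and the error text are the ones quoted in this message.

```python
from pathlib import Path

# Error text quoted in the message above.
LOCK_ERROR = "Lock table is out of available lock entries"


def find_lock_errors(datadir, log_names=("debug.log", "db.log")):
    """Return (file name, line number, line) for every log line containing LOCK_ERROR.

    `datadir` is assumed to be the directory holding the node's log files.
    """
    hits = []
    for name in log_names:
        path = Path(datadir) / name
        if not path.is_file():
            continue  # a node may not have written both logs
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if LOCK_ERROR in line:
                    hits.append((name, lineno, line.rstrip("\n")))
    return hits
```

An empty result does not prove the node is healthy; it only means the error text was not found in those two files.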

If this is a widespread problem, it is an emergency. We risk having
(several) forked chains with smaller blocks, which are accepted by 0.7
nodes. Can people contact pool operators to see which fork they are on?
Blockexplorer and blockchain.info seem to be stuck as well.

Immediate solution is upgrading to 0.8, or manually setting the number of
lock objects higher in your database. I'll follow up with more concrete
instructions.
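
Until those instructions arrive, here is a hedged sketch of what "setting the number of lock objects higher" could look like. Berkeley DB reads a DB_CONFIG file from its environment directory, and set_lk_max_locks and set_lk_max_objects are real BDB directives. Everything else is an assumption for illustration: the helper name, that the environment directory is the node's data directory, and whatever limit you pass in. No specific value is recommended here.

```python
from pathlib import Path


def write_db_config(datadir, max_locks, max_objects=None):
    """Write a Berkeley DB DB_CONFIG file that raises the lock-table limits.

    Assumption: `datadir` is the BDB environment directory of the node.
    The node should be stopped before the file is changed.
    """
    if max_objects is None:
        max_objects = max_locks  # assumption: use the same ceiling for both
    lines = [
        f"set_lk_max_locks {max_locks}",
        f"set_lk_max_objects {max_objects}",
    ]
    path = Path(datadir) / "DB_CONFIG"
    path.write_text("\n".join(lines) + "\n")
    return path
```

For example, write_db_config("/path/to/datadir", 120000) would write both directives with an arbitrary illustrative limit of 120000.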

If you're unsure, please stop processing transactions.

-- 
Pieter

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12  0:18 [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk Pieter Wuille
@ 2013-03-12  1:01 ` Pieter Wuille
  2013-03-12  9:10   ` Mike Hearn
  0 siblings, 1 reply; 18+ messages in thread
From: Pieter Wuille @ 2013-03-12  1:01 UTC (permalink / raw)
  To: Bitcoin Dev, bitcoin-security

Hello again,

block 000000000000015c50b165fcdd33556f8b44800c5298943ac70b112df480c023
(height=225430) seems indeed to have caused pre-0.8 and 0.8 nodes to fork
(at least mostly). Both chains are being mined on - the 0.8 one growing
faster.

After some emergency discussion on #bitcoin-dev, it seems best to try to
get the majority mining power back on the "old" chain, that is, the one
which 0.7 accepts
(with 00000000000001c108384350f74090433e7fcf79a606b8e797f065b130575932 at
height 225430). That is the only chain every client out there will accept.
BTC Guild is switching to 0.7, so majority should abandon the 0.8 chain
soon.
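
To tell which side of the split a node is on, one can compare the hash it reports at height 225430 with the two hashes quoted in this message. The sketch below assumes you already have that hash as a hex string (for example from the node's getblockhash RPC, if your version provides it); the function name and return labels are made up for illustration.

```python
# Block hashes at height 225430, copied from the message above.
FORK_HEIGHT = 225430
HASH_08_CHAIN = "000000000000015c50b165fcdd33556f8b44800c5298943ac70b112df480c023"
HASH_07_CHAIN = "00000000000001c108384350f74090433e7fcf79a606b8e797f065b130575932"


def classify_chain(hash_at_fork_height):
    """Say which chain a node follows, given its block hash at FORK_HEIGHT."""
    h = hash_at_fork_height.strip().lower()
    if h == HASH_07_CHAIN:
        return "0.7-compatible chain"
    if h == HASH_08_CHAIN:
        return "0.8-only chain"
    # Neither hash: the node may be stuck below the fork height or on another branch.
    return "unknown"
```

A node stuck before height 225430 has no block at that height at all, so it will not match either hash.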

Short advice: if you're a miner, please revert to 0.7 until we at least
understand exactly what causes this. If you're a merchant, and are on 0.8,
stop processing transactions until both sides have switched to the same
chain again. We'll see how to proceed afterwards.

-- 
Pieter


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12  1:01 ` Pieter Wuille
@ 2013-03-12  9:10   ` Mike Hearn
  2013-03-12  9:53     `  Jorge Timón
                       ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Mike Hearn @ 2013-03-12  9:10 UTC (permalink / raw)
  To: Pieter Wuille; +Cc: Bitcoin Dev, bitcoin-security

Just so we're all on the same page, can someone confirm my
understanding  - are any of the following statements untrue?

BDB ran out of locks.
However, only on some 0.7 nodes. Others, perhaps nodes using different
flags, managed it.
We have processed 1mb sized blocks on the testnet.
Therefore it isn't presently clear why that particular block caused
lock exhaustion when other larger blocks have not.

The reason for increasing the soft limit is still present (we have run
out of space).
Therefore transactions are likely to start stacking up in the memory
pool again very shortly, as they did last week.
There are no bounds on the memory pool size. If too many transactions
enter the pool then nodes will start to die with OOM failures.
Therefore it is possible that we have a very limited amount of time
until nodes start dying en masse.
Even if nodes do not die, users have no way to find out what the
current highest fees/bids for block space are, nor any way to change
the fee on sent transactions.
Therefore Bitcoin will shortly start to break for the majority of
users who don't have a deep understanding of the system.

If all the above statements are true, we appear to be painted into a
corner - can't roll forward and can't roll back, with very limited
time to come up with a solution. I see only a small number of
alternatives:

1) Start aggressively trying to block or down-prioritize SatoshiDice
transactions at the network level, to buy time and try to avoid
mempool exhaustion. I don't know a good way to do this, although it
appears that virtually all their traffic is actually coming via
blockchain.info's My Wallet service. During their last outage block
sizes seemed to drop to around 50kb. Alternatively, ask SD to
temporarily suspend their service (this seems like a long shot).

2) Perform a crash hard fork as soon as possible, probably with no
changes in it except a new block size limit. Question - try to lift
the 1mb limit at the same time, or not?


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12  9:10   ` Mike Hearn
@ 2013-03-12  9:53     `  Jorge Timón
  2013-03-12  9:57     ` Peter Todd
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: 	Jorge Timón @ 2013-03-12  9:53 UTC (permalink / raw)
  To: Mike Hearn; +Cc: Bitcoin Dev, bitcoin-security

On 3/12/13, Mike Hearn <mike@plan99.net> wrote:
> 1) Start aggressively trying to block or down-prioritize SatoshiDice
> transactions at the network level, to buy time and try to avoid
> mempool exhaustion. I don't know a good way to do this, although it
> appears that virtually all their traffic is actually coming via
> blockchain.infos My Wallet service. During their last outage block
> sizes seemed to drop to around 50kb. Alternatively, ask SD to
> temporarily suspend their service (this seems like a long shot).

They have a vested interest in bitcoin's success. Can't they be
asked to suspend their operations temporarily until the new hard-fork
is properly prepared?

I thought they had stopped already.




* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12  9:10   ` Mike Hearn
  2013-03-12  9:53     `  Jorge Timón
@ 2013-03-12  9:57     ` Peter Todd
  2013-03-12 10:10       ` Mike Hearn
  2013-03-12 10:13     ` Michael Gronager
  2013-03-12 12:38     ` Gregory Maxwell
  3 siblings, 1 reply; 18+ messages in thread
From: Peter Todd @ 2013-03-12  9:57 UTC (permalink / raw)
  To: Mike Hearn; +Cc: Bitcoin Dev, bitcoin-security

On Tue, Mar 12, 2013 at 10:10:15AM +0100, Mike Hearn wrote:
> There are no bounds on the memory pool size. If too many transactions
> enter the pool then nodes will start to die with OOM failures.
> Therefore it is possible that we have a very limited amount of time
> until nodes start dying en-masse.

Note that nodes dying en masse due to OOM failures is a relatively benign
failure mode, as the point at which any particular node would die is
uncorrelated with other nodes - it won't cause a network fork.

Implementing a simple and stupid "while true; do ./bitcoind; done"
loop combined with ulimit to keep total memory usage to something sane
is a perfectly acceptable hack until proper mempool code with expiration
can be written. Gavin can talk more about his ideas in that regard.
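
For readers who prefer a script to a shell one-liner, the same restart-under-a-memory-cap idea could be sketched like this. It is an illustration only: the function name, parameters and the choice of an address-space limit (roughly what `ulimit -v` sets) are assumptions, and it relies on the Unix-only `resource` module.

```python
import resource
import subprocess
import time


def run_with_memory_cap(cmd, max_bytes, max_runs=None, delay=1.0):
    """Run `cmd` again each time it exits, under an address-space cap.

    Returns the list of exit codes once `max_runs` runs have finished;
    with max_runs=None it loops forever, like the shell loop above.
    """
    def limit_memory():
        # Runs in the child just before exec. Only the soft limit is lowered,
        # and it is never set above the existing hard limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    exit_codes = []
    while max_runs is None or len(exit_codes) < max_runs:
        proc = subprocess.run(cmd, preexec_fn=limit_memory)
        exit_codes.append(proc.returncode)
        time.sleep(delay)  # short pause so a crash loop does not spin the CPU
    return exit_codes
```

A call such as run_with_memory_cap(["./bitcoind"], max_bytes=2 * 1024**3) would keep restarting the node with a 2 GiB cap; the cap value is arbitrary.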

-- 
'peter'[:-1]@petertodd.org


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12  9:57     ` Peter Todd
@ 2013-03-12 10:10       ` Mike Hearn
  2013-03-12 10:17         ` Peter Todd
  0 siblings, 1 reply; 18+ messages in thread
From: Mike Hearn @ 2013-03-12 10:10 UTC (permalink / raw)
  To: Peter Todd; +Cc: Bitcoin Dev, bitcoin-security

However, most nodes are not running in such a loop today. Probably
almost no nodes are.

I suppose you could consider mass node death to be more benign than a
hard fork, but both are pretty damn serious and warrant immediate
action. Otherwise we're going to see the number of nodes drop sharply
over the coming days as unattended nodes die and then don't get
restarted.


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12  9:10   ` Mike Hearn
  2013-03-12  9:53     `  Jorge Timón
  2013-03-12  9:57     ` Peter Todd
@ 2013-03-12 10:13     ` Michael Gronager
  2013-03-12 10:26       ` Peter Todd
                         ` (2 more replies)
  2013-03-12 12:38     ` Gregory Maxwell
  3 siblings, 3 replies; 18+ messages in thread
From: Michael Gronager @ 2013-03-12 10:13 UTC (permalink / raw)
  To: Mike Hearn; +Cc: Bitcoin Dev

Yes, 0.7 (yes 0.7!) was not sufficiently tested: it had an undocumented and unknown criterion for block rejection, hence the upgrade went wrong.

More space in the block is needed indeed, but the real problem you are describing is actually not missing space in the block, but the lack of proper handling of mem-pool transactions. They should be pruned on two criteria:

1. if they get too old (>24hr)
2. if the client is running out of space, then the oldest should probably be pruned

Clients are keeping, and re-relaying, their own transactions anyway, and hence it would mean only little, and only little for clients. Dropping free / old transactions is a much better behavior than dying... Even a scheme where the client dropped all or random mempool txes would be a tolerable way of handling things (dropping all is similar to a restart, except for no user intervention).
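
The two pruning criteria above can be illustrated with a toy data structure. This is a sketch under stated assumptions, not how any Bitcoin client implements its memory pool: the class name, the byte budget and the use of arrival time as "age" are all invented for the example, and timestamps are assumed to be non-decreasing.

```python
import time
from collections import OrderedDict


class BoundedMempool:
    """Toy mempool: expires old transactions and evicts the oldest when full."""

    def __init__(self, max_bytes, max_age_seconds=24 * 60 * 60):
        self.max_bytes = max_bytes
        self.max_age = max_age_seconds
        self.total_bytes = 0
        self._txs = OrderedDict()  # txid -> (arrival time, raw tx), oldest first

    def add(self, txid, raw_tx, now=None):
        now = time.time() if now is None else now
        if txid in self._txs:
            return
        self.expire(now)
        self._txs[txid] = (now, raw_tx)
        self.total_bytes += len(raw_tx)
        # Criterion 2: out of space, so drop the oldest entries first.
        while self.total_bytes > self.max_bytes and self._txs:
            self._drop_oldest()

    def expire(self, now=None):
        now = time.time() if now is None else now
        # Criterion 1: drop everything older than max_age.
        while self._txs:
            arrived, _ = next(iter(self._txs.values()))
            if now - arrived <= self.max_age:
                break
            self._drop_oldest()

    def _drop_oldest(self):
        _, (_, raw_tx) = self._txs.popitem(last=False)
        self.total_bytes -= len(raw_tx)

    def __contains__(self, txid):
        return txid in self._txs
```

Evicting strictly by age ignores fees and priority; a real policy would probably weigh those as well.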

Following that, increase the soft and hard limits to 1 MB and e.g. 10 MB, but miners should be the last to upgrade.

/M



* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 10:10       ` Mike Hearn
@ 2013-03-12 10:17         ` Peter Todd
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Todd @ 2013-03-12 10:17 UTC (permalink / raw)
  To: Mike Hearn; +Cc: Bitcoin Dev, bitcoin-security

On Tue, Mar 12, 2013 at 11:10:47AM +0100, Mike Hearn wrote:
> However, most nodes are not running in such a loop today. Probably
> almost no nodes are.
> 
> I suppose you could consider mass node death to be more benign than a
> hard fork, but both are pretty damn serious and warrant immediate
> action. Otherwise we're going to see the number of nodes drop sharply
> over the coming days as unattended nodes die and then don't get
> restarted.

I'm sure if "mass node death" becomes an issue miners will have plenty
of incentive to temporarily, or permanently, setup some high-memory and
high-bandwidth nodes to accept transactions. The DNS seeds sort by
reliability so it won't be long before nodes are connecting to them.

My home machine has 16GB of ram, bigger than the whole blockchain.

-- 
'peter'[:-1]@petertodd.org


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 10:13     ` Michael Gronager
@ 2013-03-12 10:26       ` Peter Todd
  2013-03-12 10:43         ` Mike Hearn
  2013-03-12 10:40       ` Roy Badami
  2013-03-12 11:44       ` Pieter Wuille
  2 siblings, 1 reply; 18+ messages in thread
From: Peter Todd @ 2013-03-12 10:26 UTC (permalink / raw)
  To: Michael Gronager; +Cc: Bitcoin Dev

On Tue, Mar 12, 2013 at 11:13:09AM +0100, Michael Gronager wrote:
> Following that, increase the soft and hard limit to 1 and eg 10MB, but miners should be the last to upgrade.

We just saw a hard-fork happen because we ran into previously unknown
scaling issues with the current codebase. Why follow that up immediately
with yet another jump into unknown scaling territory?

I suspect the PR fallout from another chain split, let alone multiple
splits, will be far more damaging to Bitcoin than stories along the lines of
"Gee, actually it's kinda expensive to do a Bitcoin transaction these
days due to all the competition. I dunno, I guess it must be really
popular and valuable or something?"

Let's let the issue rest for a while, and we can all have some time to
work on our various approaches to solving the problem. The worst that
will happen is growth temporarily slows - hardly a disaster I think.

-- 
'peter'[:-1]@petertodd.org


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 10:13     ` Michael Gronager
  2013-03-12 10:26       ` Peter Todd
@ 2013-03-12 10:40       ` Roy Badami
  2013-03-12 11:44       ` Pieter Wuille
  2 siblings, 0 replies; 18+ messages in thread
From: Roy Badami @ 2013-03-12 10:40 UTC (permalink / raw)
  To: Michael Gronager; +Cc: Bitcoin Dev

> Clients are keeping, and re-relaying, their own transactions anyway,
> and hence it would mean only little, and only little for clients.

Not all end-user clients are always-on though




* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 10:26       ` Peter Todd
@ 2013-03-12 10:43         ` Mike Hearn
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Hearn @ 2013-03-12 10:43 UTC (permalink / raw)
  To: Peter Todd; +Cc: Bitcoin Dev, Michael Gronager

> We just saw a hard-fork happen because we ran into previously unknown
> scaling issues with the current codebase.

Technically, it was with the previous codebase ;)




* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 10:13     ` Michael Gronager
  2013-03-12 10:26       ` Peter Todd
  2013-03-12 10:40       ` Roy Badami
@ 2013-03-12 11:44       ` Pieter Wuille
  2013-03-12 12:11         ` Mike Hearn
  2013-03-12 12:18         `  Jorge Timón
  2 siblings, 2 replies; 18+ messages in thread
From: Pieter Wuille @ 2013-03-12 11:44 UTC (permalink / raw)
  To: Michael Gronager; +Cc: Bitcoin Dev

On Tue, Mar 12, 2013 at 11:13:09AM +0100, Michael Gronager wrote:
> Yes, 0.7 (yes 0.7!) was not sufficiently tested: it had an undocumented and unknown criterion for block rejection, hence the upgrade went wrong.

We're using "0.7" as a short moniker for all clients, but this was a limitation that all
BDB-based bitcoins ever had. The bug is simply a limit in the number of lock objects
that was reached.

It's ironic that 0.8 was supposed to solve all problems we had due to BDB (except the
wallet...), but now it seems it's still coming back to haunt us. I really hated telling
miners to go back to 0.7, given all efforts to make 0.8 significantly more tolerable...

> More space in the block is needed indeed, but the real problem you are describing is actually not missing space in the block, but the lack of proper handling of mem-pool transactions. They should be pruned on two criteria:
> 
> 1. if they get too old (>24hr)
> 2. if the client is running out of space, then the oldest should probably be pruned
> 
> Clients are keeping, and re-relaying, their own transactions anyway, and hence it would mean only little, and only little for clients. Dropping free / old transactions is a much better behavior than dying... Even a scheme where the client dropped all or random mempool txes would be a tolerable way of handling things (dropping all is similar to a restart, except for no user intervention).

Right now, mempools are relatively small in memory usage, but with small block sizes,
it indeed risks going up. In 0.8, conflicting (=double spending) transactions in the
chain cause clearing the mempool of conflicts, so at least the mempool is bounded by
the size of the UTXO subset being spent. Dropping transactions from the memory pool
when they run out of space seems a correct solution. I'm less convinced about a
deterministic time-based rule, as that creates a double spending incentive at that
time, and a counter incentive to spam the network with your risking-to-be-cleared
transaction as well.
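
The conflict-clearing behaviour described above can be sketched as follows. This is a simplified illustration and not the 0.8 code: the function name and the dictionary representation of the mempool are assumptions, and it does not follow chains of unconfirmed transactions that depend on a removed one.

```python
def remove_conflicts(mempool, block_spent_outpoints):
    """Drop mempool transactions that spend an outpoint already spent by a block.

    mempool: dict mapping txid -> set of (prev_txid, output_index) outpoints it spends.
    block_spent_outpoints: iterable of outpoints spent by the newly connected block.
    Returns the list of removed txids.
    """
    spent = set(block_spent_outpoints)
    # Any overlap means the transaction is either now confirmed or a double spend.
    removed = [txid for txid, inputs in mempool.items() if inputs & spent]
    for txid in removed:
        del mempool[txid]
    return removed
```

Because every remaining transaction must spend outputs that are still unspent, the pool cannot hold more distinct spends than there are spendable outputs, which is the bound mentioned above.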

Regarding the block space, we've seen the percentage of block chain space used by one
single consumer grow simultaneously with the introduction of larger blocks, so I'm not
actually convinced there is right now a big need for larger blocks (note: right now). The
competition for block chain space is mostly an issue for client software which doesn't
deal correctly with non-confirming transactions, and misleads users. It's mostly a
usability problem now, but increasing block sizes isn't guaranteed to fix that; it may
just make more space for spam.

However, the presence of this bug, and the fact that a full solution is available (0.8),
probably helps achieve consensus that fixing it (= a hardfork) is needed, and we should
take advantage of that. But please, let's not rush things...

-- 
Pieter


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 11:44       ` Pieter Wuille
@ 2013-03-12 12:11         ` Mike Hearn
  2013-03-12 12:27           ` Michael Gronager
  2013-03-12 12:18         `  Jorge Timón
  1 sibling, 1 reply; 18+ messages in thread
From: Mike Hearn @ 2013-03-12 12:11 UTC (permalink / raw)
  To: Pieter Wuille; +Cc: Bitcoin Dev, Michael Gronager

I'm not even sure I'd say the upgrade "went wrong". The problem if
anything is the upgrade didn't happen fast enough. If we had run out
of block space a few months from now, or if miners/merchants/exchanges
had upgraded faster, it'd have made more sense to just roll forward
and tolerate the loss of the older clients.

This really reinforces the importance of keeping nodes up to date.


* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 11:44       ` Pieter Wuille
  2013-03-12 12:11         ` Mike Hearn
@ 2013-03-12 12:18         `  Jorge Timón
  2013-03-12 12:40           ` Jay F
  1 sibling, 1 reply; 18+ messages in thread
From: Jorge Timón @ 2013-03-12 12:18 UTC (permalink / raw)
  To: Pieter Wuille; +Cc: Bitcoin Dev, Michael Gronager

A related question...some people mentioned yesterday on #bitcoin-dev
that 0.5 appeared to be compatible with 0.8.
Was that only for the "fatal block" and would have forked 0.8 later
too or is it something else?
I'm having a hard time understanding this 0.5 thing; if someone can
shed some light on it I would appreciate it.

Thanks in advance

On 3/12/13, Pieter Wuille <pieter.wuille@gmail.com> wrote:
> On Tue, Mar 12, 2013 at 11:13:09AM +0100, Michael Gronager wrote:
>> Yes, 0.7 (yes 0.7!) was not sufficiently tested: it had an undocumented and
>> unknown criterion for block rejection, hence the upgrade went wrong.
>
> We're using "0.7" as a short moniker for all clients, but this was a limitation that all
> BDB-based bitcoins ever had. The bug is simply a limit in the number of lock objects
> that was reached.
>
> It's ironic that 0.8 was supposed to solve all problems we had due to BDB (except the
> wallet...), but now it seems it's still coming back to haunt us. I really hated telling
> miners to go back to 0.7, given all efforts to make 0.8 significantly more tolerable...
>
>> More space in the block is needed indeed, but the real problem you are
>> describing is actually not missing space in the block, but proper handling
>> of mem-pool transactions. They should be pruned on two criteria:
>>
>> 1. if they get too old (>24hr)
>> 2. if the client is running out of space, then the oldest should probably
>> be pruned
>>
>> Clients keep and re-relay their own transactions anyway, so pruning would
>> cost them very little. Dropping free / old transactions is much better
>> behavior than dying... Even a scheme where the client dropped all or random
>> mempool txes would be a tolerable way of handling things (dropping all is
>> similar to a restart, except that it needs no user intervention).
>
> Right now, mempools are relatively small in memory usage, but with small block sizes,
> it indeed risks going up. In 0.8, conflicting (=double spending) transactions in the
> chain cause clearing the mempool of conflicts, so at least the mempool is bounded by
> the size of the UTXO subset being spent. Dropping transactions from the memory pool
> when they run out of space seems a correct solution. I'm less convinced about a
> deterministic time-based rule, as that creates a double spending incentive at that
> time, and a counter incentive to spam the network with your risking-to-be-cleared
> transaction as well.
>
> Regarding the block space, we've seen the share of block chain space used by one single consumer
> grow simultaneously with the introduction of larger blocks, so I'm not actually convinced
> there is right now a big need for larger blocks (note: right now). The competition for
> block chain space is mostly an issue for client software which doesn't deal correctly
> with non-confirming transactions, and misleading users. It's mostly a usability problem
> now, but increasing block sizes isn't guaranteed to fix that; it may just make more
> space for spam.
>
> However, the presence of this bug, and the fact that a full solution is available (0.8),
> probably helps achieve consensus that fixing it (=a hardfork) is needed, and we should take
> advantage of that. But please, let's not rush things...
>
> --
> Pieter
>


-- 
Jorge Timón

http://freico.in/
http://archive.ripple-project.org/



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 12:11         ` Mike Hearn
@ 2013-03-12 12:27           ` Michael Gronager
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Gronager @ 2013-03-12 12:27 UTC (permalink / raw)
  To: Mike Hearn; +Cc: Bitcoin Dev

Well a reversed upgrade is an upgrade that went wrong ;)

Anyway, the incident makes it even more important for people to upgrade, well except, perhaps, for miners...

Forks are caused by rejection criteria, hence: 
1. If you introduce new rejection criteria in an upgrade miners should upgrade _first_.
2. If you loosen some rejection criteria miners should upgrade _last_.
3. If you keep the same criteria assume 2.
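
A rough way to see why the order matters: treat each version's validation rules as a predicate over blocks, and ask which blocks a miner on one version could produce that a node on the other version would reject. The sketch below is only an illustration; the function names and the use of a plain integer as a stand-in for "how demanding a block is" are assumptions, not anything from the clients.

```python
def fork_risk(miner_accepts, node_accepts, candidate_blocks):
    """Blocks the miner's rules allow but the node's rules reject."""
    return [b for b in candidate_blocks
            if miner_accepts(b) and not node_accepts(b)]


def upgrade_order(old_accepts, new_accepts, candidate_blocks):
    """Suggest when miners should upgrade, following rules 1-3 above."""
    new_miner_old_node = fork_risk(new_accepts, old_accepts, candidate_blocks)
    old_miner_new_node = fork_risk(old_accepts, new_accepts, candidate_blocks)
    if new_miner_old_node and old_miner_new_node:
        return "incompatible both ways"
    if old_miner_new_node:
        # New version rejects blocks the old one allows: criteria were added.
        return "miners first"
    # Criteria loosened or unchanged: rules 2 and 3.
    return "miners last"


# Hypothetical example: a block is just a number, and each version
# accepts blocks up to some threshold.
blocks = range(30)
strict = lambda b: b <= 10
loose = lambda b: b <= 20

print(upgrade_order(strict, loose, blocks))   # loosened
print(upgrade_order(loose, strict, blocks))   # tightened
print(upgrade_order(strict, strict, blocks))  # unchanged
```

In these terms, moving from a BDB-based client with a lock limit to 0.8 without it is a loosening that nobody knew about, which is the case rule 3 is meant to cover.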

/M

On 12/03/2013, at 13:11, Mike Hearn <mike@plan99.net> wrote:

> I'm not even sure I'd say the upgrade "went wrong". The problem if
> anything is the upgrade didn't happen fast enough. If we had run out
> of block space a few months from now, or if miners/merchants/exchanges
> had upgraded faster, it'd have made more sense to just roll forward
> and tolerate the loss of the older clients.
> 
> This really reinforces the importance of keeping nodes up to date.
> 
> On Tue, Mar 12, 2013 at 12:44 PM, Pieter Wuille <pieter.wuille@gmail.com> wrote:
>> On Tue, Mar 12, 2013 at 11:13:09AM +0100, Michael Gronager wrote:
>>> Yes, 0.7 (yes 0.7!) was not sufficiently tested: it had an undocumented and unknown criterion for block rejection, hence the upgrade went wrong.
>> 
>> We're using "0.7" as a short moniker for all clients, but this was a limitation that all
>> BDB-based bitcoins ever had. The bug is simply a limit in the number of lock objects
>> that was reached.
>> 
>> It's ironic that 0.8 was supposed to solve all problems we had due to BDB (except the
>> wallet...), but now it seems it's still coming back to haunt us. I really hated telling
>> miners to go back to 0.7, given all efforts to make 0.8 significantly more tolerable...
>> 
>>> More space in the block is needed indeed, but the real problem you are describing is actually not missing space in the block, but proper handling of mem-pool transactions. They should be pruned on two criteria:
>>> 
>>> 1. if they get too old (>24hr)
>>> 2. if the client is running out of space, then the oldest should probably be pruned
>>> 
>>> Clients keep and re-relay their own transactions anyway, so pruning would cost them very little. Dropping free / old transactions is much better behavior than dying... Even a scheme where the client dropped all or random mempool txes would be a tolerable way of handling things (dropping all is similar to a restart, except that it needs no user intervention).
>> 
>> Right now, mempools are relatively small in memory usage, but with small block sizes,
>> it indeed risks going up. In 0.8, conflicting (=double spending) transactions in the
>> chain cause clearing the mempool of conflicts, so at least the mempool is bounded by
>> the size of the UTXO subset being spent. Dropping transactions from the memory pool
>> when they run out of space seems a correct solution. I'm less convinced about a
>> deterministic time-based rule, as that creates a double spending incentive at that
>> time, and a counter incentive to spam the network with your risking-to-be-cleared
>> transaction as well.
>> 
>> Regarding the block space, we've seen the share of block chain space used by one single consumer
>> grow simultaneously with the introduction of larger blocks, so I'm not actually convinced
>> there is right now a big need for larger blocks (note: right now). The competition for
>> block chain space is mostly an issue for client software which doesn't deal correctly
>> with non-confirming transactions, and misleading users. It's mostly a usability problem
>> now, but increasing block sizes isn't guaranteed to fix that; it may just make more
>> space for spam.
>> 
>> However, the presence of this bug, and the fact that a full solution is available (0.8),
>> probably helps achieve consensus that fixing it (=a hardfork) is needed, and we should take
>> advantage of that. But please, let's not rush things...
>> 
>> --
>> Pieter
> 




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12  9:10   ` Mike Hearn
                       ` (2 preceding siblings ...)
  2013-03-12 10:13     ` Michael Gronager
@ 2013-03-12 12:38     ` Gregory Maxwell
  2013-03-12 13:00       ` Michael Gronager
  3 siblings, 1 reply; 18+ messages in thread
From: Gregory Maxwell @ 2013-03-12 12:38 UTC (permalink / raw)
  To: Mike Hearn; +Cc: Bitcoin Dev, bitcoin-security

On Tue, Mar 12, 2013 at 2:10 AM, Mike Hearn <mike@plan99.net> wrote:
> BDB ran out of locks.
> However, only on some 0.7 nodes. Others, perhaps nodes using different
> flags, managed it.
> We have processed 1mb sized blocks on the testnet.
> Therefore it isn't presently clear why that particular block caused
> lock exhaustion when other larger blocks have not.

Locks are only mostly related to block size; once I heard what was
happening, I was unsurprised that the max-sized test blocks hadn't
triggered it.
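
To illustrate how lock usage and block size can diverge, here is a toy model. It ASSUMES one lock per distinct index entry touched (each transaction in the block plus each previous transaction it spends from). That assumption is not stated in this thread, and real BDB locking is page-based and depends on database layout, so treat the numbers as illustrative only.

```python
def block_size(block):
    """Total size in bytes (sizes are supplied by hand, not computed)."""
    return sum(tx["size"] for tx in block)


def estimated_locks(block):
    """Count distinct entries touched: each tx plus each tx it spends from.

    ASSUMPTION: one lock per distinct entry, as a rough stand-in for
    whatever the database actually locks.
    """
    touched = set()
    for tx in block:
        touched.add(tx["txid"])
        touched.update(tx["spends"])
    return len(touched)


# Two large transactions, each spending a single previous transaction.
few_big = [
    {"txid": f"big{i}", "size": 400_000, "spends": [f"prev_big{i}"]}
    for i in range(2)
]

# A thousand small transactions, each spending three distinct previous ones.
many_small = [
    {"txid": f"small{i}", "size": 250,
     "spends": [f"prev{i}_{j}" for j in range(3)]}
    for i in range(1000)
]

for name, block in (("few_big", few_big), ("many_small", many_small)):
    print(name, block_size(block), "bytes,", estimated_locks(block), "locks")
```

Under this model the smaller block needs far more locks than the larger one, which would be consistent with max-sized test blocks not triggering the limit.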

> Therefore it is possible that we have a very limited amount of time
> until nodes start dying en-masse.

Scaremongering much? Egads.

On Tue, Mar 12, 2013 at 5:27 AM, Michael Gronager <gronager@ceptacle.com> wrote:
> Forks are caused by rejection criteria, hence:
> 1. If you introduce new rejection criteria in an upgrade miners should upgrade _first_.
> 2. If you loosen some rejection criteria miners should upgrade _last_.
> 3. If you keep the same criteria assume 2.

And ... if you aren't aware that you're making a change ???



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 12:18         `  Jorge Timón
@ 2013-03-12 12:40           ` Jay F
  0 siblings, 0 replies; 18+ messages in thread
From: Jay F @ 2013-03-12 12:40 UTC (permalink / raw)
  To: Jorge Timón; +Cc: Bitcoin Dev

On 3/12/2013 5:18 AM, Jorge Timón wrote:
> A related question...some people mentioned yesterday on #bitcoin-dev
> that 0.5 appeared to be compatible with 0.8.
> Was that only for the "fatal block" and would have forked 0.8 later
> too or is it something else?
> I'm having a hard time understanding this 0.5 thing, if someone can
> bring some light to it I would appreciate it.
>
> Thanks in advance
>
It was reported that not all 0.7 nodes died from the BDB error either. This
will likely take a post-mortem to determine exactly which build
environments and versions are incompatible, by feeding each the bloated
block (hopefully there are lots of snapshots taken while the bad chain was
the best height, for testing; I forgot to get one).



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk
  2013-03-12 12:38     ` Gregory Maxwell
@ 2013-03-12 13:00       ` Michael Gronager
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Gronager @ 2013-03-12 13:00 UTC (permalink / raw)
  To: Gregory Maxwell; +Cc: Bitcoin Dev

>> Forks are caused by rejection criteria, hence:
>> 1. If you introduce new rejection criteria in an upgrade miners should upgrade _first_.
>> 2. If you loosen some rejection criteria miners should upgrade _last_.
>> 3. If you keep the same criteria assume 2.
> 
> And ... if you aren't aware that you're making a change ???

then only half should upgrade :-P

Well I thought I covered that by 3... But the question, of course, is whether we could have been in a situation where 0.8 had been the one rejecting blocks.

So miners could go with a filtering approach: only connect to the network through a node of a version one less than the current. That would still have caused block 225430 to be created, but it would never have been relayed and hence no harm. (and if the issue had been in 0.8 the block would not even have been accepted there in the first place). Downside is some lost seconds.
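
The filtering idea can be sketched as a two-stage check: the miner's own node decides whether a block gets created, and the older gateway node decides whether it gets relayed. Everything below is hypothetical: the predicates, the `locks_needed` field and the `LOCK_LIMIT` value are illustrative stand-ins, not real client behavior or figures from this thread.

```python
def publish(block, miner_accepts, gateway_accepts):
    """Return what happens to a block the miner has just found.

    miner_accepts / gateway_accepts stand in for the validation rules of
    the miner's node and of the older gateway node it connects through.
    """
    if not miner_accepts(block):
        return "not created"
    if not gateway_accepts(block):
        return "created but not relayed"
    return "relayed"


LOCK_LIMIT = 10_000  # illustrative value only


def limited_accepts(block):
    """Stand-in for a version that has a lock limit."""
    return block["locks_needed"] <= LOCK_LIMIT


def unlimited_accepts(block):
    """Stand-in for a version without that limit."""
    return True


heavy = {"locks_needed": 20_000}
light = {"locks_needed": 500}

# Miner on the unlimited version, gateway on the limited one.
print(publish(heavy, unlimited_accepts, limited_accepts))
# If the limit had been in the miner's version instead.
print(publish(heavy, limited_accepts, unlimited_accepts))
# An ordinary block passes both.
print(publish(light, unlimited_accepts, limited_accepts))
```

The cost noted above, "some lost seconds", corresponds to the extra hop through the gateway before a block reaches the rest of the network.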

/M

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-03-12 13:00 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
2013-03-12  0:18 [Bitcoin-development] Warning: many 0.7 nodes break on large number of tx/block; fork risk Pieter Wuille
2013-03-12  1:01 ` Pieter Wuille
2013-03-12  9:10   ` Mike Hearn
2013-03-12  9:53     `  Jorge Timón
2013-03-12  9:57     ` Peter Todd
2013-03-12 10:10       ` Mike Hearn
2013-03-12 10:17         ` Peter Todd
2013-03-12 10:13     ` Michael Gronager
2013-03-12 10:26       ` Peter Todd
2013-03-12 10:43         ` Mike Hearn
2013-03-12 10:40       ` Roy Badami
2013-03-12 11:44       ` Pieter Wuille
2013-03-12 12:11         ` Mike Hearn
2013-03-12 12:27           ` Michael Gronager
2013-03-12 12:18         `  Jorge Timón
2013-03-12 12:40           ` Jay F
2013-03-12 12:38     ` Gregory Maxwell
2013-03-12 13:00       ` Michael Gronager
