* Re: [Bitcoin-development] overall bitcoin client code quality
@ 2011-07-12 21:47 Michael Offel
2011-07-12 23:40 ` Gregory Maxwell
2011-07-13 0:17 ` Matt Corallo
0 siblings, 2 replies; 18+ messages in thread
From: Michael Offel @ 2011-07-12 21:47 UTC (permalink / raw)
To: bitcoin-development
Monday, July 11, 2011, 12:31:20 AM, Jeff Garzik wrote:
>> 2. isolation of modules
> It is a long term goal to move towards 'libbitcoin"
What about creating a branch and start libbtc by implementing a small
module like irc or p2p connection handling and use the new lib in the
client. I think this would be a proper start for a new clean code base
without having a non functional client for some time and it also
provides some kind of red line between libbtc (cleaned up code) and
the old code base, making it easy to maintain order.
Would this approach be accepted for a merge?
Monday, July 11, 2011, 12:36:53 AM, Matt Corallo wrote:
>> While I can guess what the CScript class does I would more like to understand the thoughts behind the decision to implement some crypto macros in a compile time interpreter and
>> why Berkeley db 4 is used at all.
> At the time Bitcoin began being built, Ubuntu 9.04 (or was it 9.10?) was
> used, as it offered the oldest libc on the newest OS. Ubuntu 9.04 just
> happened to only have db4.7. For backward compatibility, db4.7 has been
> used ever since (except, for some reason, the osx builds). In 0.4,
> db4.8 will be used. If you are asking why bdb was used to begin with,
> why not? its an excellent db and why reinvent the wheel?
It was more meant as an rhetorical question. A documented decision
would give anyone the chance of arguing against the usage of a library
instead of asking stupid questions. A mailing list archive suits well
for this type of information, so let me try to get some information
here. Db4 is an excellent choice if you need indexed database
functionality without SQL interface. But compared to an stl map lookup
and fopen, fwrite and fclose it is much harder to understand and
brings a lot performance overhead. This is true as long as your
information are small enough to stay in main memory. A stl map storing
file offsets is also not that hard to write and understand. On the
other side using an SQL interface would bring the advantage of
swapping database providers. An enterprise website could use oracle
while the average user could use sqlite. Also is db4 used for any type
of disk storage, this makes files like wallet.dat some kind of hard to
read. It is in no way more secure than storing private key's in an xml
file. But it is much harder to maintain and understand by the user and
the average programer.
> If you are on Windows, why are you on Windows? ;)
I'm forced to to use windows by the type of clients I'm working for.
And during leisure I like to use a System that does not need much
effort to simply do what it is made for. ;)
>>
>> 6. hardcoded values
>>
>> There are tons of hardcoded values for the official and the testnet block chain. It would be a great step to move all chain related settings to a chain description file. This would allow custom chains and clean up the code.
> This one is an interesting debate. There is no real reason to do this
> aside from some questionable code cleanup. Also, there is no reason to
> encourage improperly-implemented alternate chains. Alternate chains
> should be designed in such a way as to share the main chain's difficulty
> as described by Mike on the forum, not just make a new chain and hope it
> sticks.
It is not that interesting as it looks first. There is no good in
running multiple chains for production use. To share the difficulty is
indeed a good start to solve the problem. That's also one of the
things I don't like off the QBitcoin client. What I meant is just
to have the possibility to have all adjustable parameters in one place
and to be able to quickly build an internal testnet without crazy
firewalling to prevent it from dying. The first would allow to detect
problematic ddos protection settings early and giving the average user
the possibility to adjust all important settings if he knows what he
does. That includes not only alternate chains. One could choose to
include transactions only at a higher fee or at no fee at all.
Everyone could do such things by changing the code anyway. But not all
brilliant administrators or users are programmers.
>> The official Bitcoin client should be some kind of an reference project for other clients and must therefore be extra clean and well documented.
> True, but it is much higher priority to clean up the code than comment
> it better, plus there are various other features/more user-facing issues
> that need fixed as well, so...
Good code is the best documentation anyway.
Monday, July 11, 2011, 3:01:51 AM, Luke-Jr wrote:
>> My overall suggestion is to begin a complete rewrite, inspired by the old
>> code rather than moving a lot of "known to be somehow functional" around.
> There are many rewrites in progress, often with much better designs.
There is no other client that uses C and is meant to be a full
implementation and platform independent except QBitcoin. QBitcoin seems
to have no public repository to work on or I have overlooked it ?!?
Starting a new client on my own is just like starting an other never
ending and never used open source project.
>> The official Bitcoin client should be some kind of an reference project
>> for other clients and must therefore be extra clean and well documented.
> Bitcoin is supposed to be an authorityless project. There is no official.
While there is no authority, there is just one fully working client to
look at. This may lead to an working but instable network if other
clients are trying to interpret net.cpp and fail on it in details.
>> *everything else*
> Fix it yourself and submit the changes. If they don't get merged, fork.
As I said, there is no need for an other never ending story. I would
like to know if my affords have a chance to get merged or accepted
before I start to work on it.
Tuesday, July 12, 2011, 4:31:07 AM, Gavin Andresen wrote:
> We'll just tell everybody to stop using bitcoin so much for six months
> or so while we implement a much better client. It will be exactly
> like the bitcoin we have now, except with a much nicer internal
> architecture and much cleaner code-base, and we're pretty sure we can
> get it done in six months if everything goes exactly as planned.
It is some kind of arrogant to believe that anyone would stop using
bitcoin if some programers decide to stop working for some month. And
it is also fond to not fix bugs in the old code base if they appear.
Also there are lots of people out there using old clients anyway. The
important improvement is more about quick extendibility and therefore
more feature rich secure code. This would not only help the official
code base, it would also improve trust and result in better external
bitcoin related projects.
> Then again, it might be easier to modify the
> CRITICAL_SECTION code to detect and report deadlocks (anybody have
> experience doing that?).
That would be true if possible, but I'm pretty sure that the only way
to detect deadlocks is by either analyzing the code or single step
simulating it, what is really tricky on network applications.
Tuesday, July 12, 2011, 6:19:28 AM, Jeff Garzik wrote:
>> While spying on the old code, I noticed one major problem that could be
>> fixed quite easily. That is, the 1 class-per .h/.cpp rule is completely
>> ignored in main.h/cpp and net.h/cpp If all of the classes in the project
>> were re-factored to their own files,
> This is about the last thing we should do, and it's one of the worst
> coding practices of many C++ projects (and unfortunately carried over
> to Java by force). See Knuth and his work on literate programming.
> Putting logically similar code -together- is often more impactful and
> meaningful than splitting up files into dozens (hundreds or thousands,
> in large projects) of tiny, 20-line files.
We seem to have very different opinions on that, but let's try to be
objective. I belive that every class should be able to stand on its
own. That way it can be reused in other projects or situations in the
same project. And it is much more easy to catch and extend class
behavior if it is isolated to one file instead of multiple files or
mixed between other class methods in one file. On the other hand, what
is bad on having 50-80 code files in bitcoin? In terms of source
control it even gives some kind of easier to read history and fewer
merge conflicts. When you start writing a class for exactly one
propose in one specific situation used by one other class you should
think about writing a nested class, which can and should be
implemented in the same cpp file. That way you can achieve you similar
code in one location while accepting the rule others like.
Another nice side effect is the ability to see a class dependency list
be looking at the include listing.
Tuesday, July 12, 2011, 8:21:12 AM, John Smith wrote:
> re:4) I also don't see why boost would be a 'nonstandard
> dependency'. From what I understand, a large part of the new C++0x
> standard is derived from boost. Also C++ compilers are getting
> faster and more smart all the time, so I absolutely don't see "build speed" as a goal.
Don't get me wrong. If boost would be used for something meaningful
there would be no point in removing it. Everything non questionable
about boost does already find its way into most stl implementations.
And everything that find it's way into C++ 0x does it for the reason
that it is better handled by an language extension than by an boost
construct. Otherwise there would be no point in extending the language.
Michael
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-12 21:47 [Bitcoin-development] overall bitcoin client code quality Michael Offel
@ 2011-07-12 23:40 ` Gregory Maxwell
2011-07-13 0:17 ` Matt Corallo
1 sibling, 0 replies; 18+ messages in thread
From: Gregory Maxwell @ 2011-07-12 23:40 UTC (permalink / raw)
To: Michael Offel; +Cc: bitcoin-development
On Tue, Jul 12, 2011 at 5:47 PM, Michael Offel <michael.offel@web.de> wrote:
> We seem to have very different opinions on that, but let's try to be
> objective. I belive that every class should be able to stand on its
> own. That way it can be reused in other projects or situations in the
> same project. And it is much more easy to catch and extend class
Objectively, your believes have only the weight of the electrons they are
printed on, so long as you're talking and not coding.
I don't mean that as an insult— I'm sure many people value your ideas
but when you disagree with someone who is actually coding you'll
eventually lose every time. Talk is cheap.
(And I'm guilty of this too— but aware of my lacking commits I'm
certainly not going to expect anyone to listen to _coding style_ advice.
I try to keep my comments to crap I can measure and speculate about.)
[snip]
> In terms of source
> control it even gives some kind of easier to read history and fewer
> merge conflicts. When you start writing a class for exactly one
> propose in one specific situation used by one other class you should
Certainly no modern SCM has major issues with merge conflicts due to
shared files.
Bitcoin is a _tiny_ piece of software... on the order of 20kloc. It's a
a scale where someone competent can read it in a day and have a basic
overall understanding of it in a few.
This fact makes the aesthetics talk seem like pointless shed-painting
especially coming from people who are yet doing substantial work.
The proposal about reimplementing parts as libraries and the switching
to them after validating them is a fine one. I suggest you do it.
Having multiple work on such projects would not be wasted effort,
as we'd all learn from the competition in designs/APIs/and targets for
comparative testing.
The interesting logic, however, is not net.cpp... because nothing too
awful happens if peers get confused and drop their connections here
and there. The critical logic is the blockchain validation logic. Which
must make absolutely identical decisions on all nodes and which has a
lot of corner cases which are difficult to test and might expose
behavioral differences.
There is also a lot of neat functionality in the scripts which is
currently disabled because of a lack of confidence about the
security of that code.
So not only are new, clean, secondary implementations of this logic
needed, but good automatic testing shims which can find
inconsistencies between implementations.
(Testing rigs are often an excellent area of work for people new to
a project.)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-12 21:47 [Bitcoin-development] overall bitcoin client code quality Michael Offel
2011-07-12 23:40 ` Gregory Maxwell
@ 2011-07-13 0:17 ` Matt Corallo
2011-07-13 11:50 ` Mike Hearn
1 sibling, 1 reply; 18+ messages in thread
From: Matt Corallo @ 2011-07-13 0:17 UTC (permalink / raw)
To: bitcoin-development
[-- Attachment #1: Type: text/plain, Size: 7044 bytes --]
On Tue, 2011-07-12 at 22:47 +0100, Michael Offel wrote:
> Monday, July 11, 2011, 12:31:20 AM, Jeff Garzik wrote:
> >> 2. isolation of modules
> > It is a long term goal to move towards 'libbitcoin"
> What about creating a branch and start libbtc by implementing a small
> module like irc or p2p connection handling and use the new lib in the
> client. I think this would be a proper start for a new clean code base
> without having a non functional client for some time and it also
> provides some kind of red line between libbtc (cleaned up code) and
> the old code base, making it easy to maintain order.
> Would this approach be accepted for a merge?
I had been planning on, and putting off starting work on, a central hub
infrastructure to Bitcoin until fairly recently. Its a central hub
where net/main/wallet/etc can subscribe to get notified of new
blocks/txes/etc, push new blocks/txes/etc and can get information about
blocks/txes/etc. I started work fairly recently, but hopefully it will
be functional sometime in the not-too-distant future.
As I said earlier, the Bitcoin code base really isn't all that messy,
its only its lack of clear lines between classes that makes it seem that
way. It does some things inefficiently, however its general algorithms
are quite good the way they stand. (though net could probably stand a
ground-up rewrite of the backend). If you want to rewrite for a more
optimized handling of p2p connections/etc, it would be apprecitated and
(assuming its done well) certainly merged.
>
> It was more meant as an rhetorical question. A documented decision
> would give anyone the chance of arguing against the usage of a library
> instead of asking stupid questions. A mailing list archive suits well
> for this type of information, so let me try to get some information
> here. Db4 is an excellent choice if you need indexed database
> functionality without SQL interface. But compared to an stl map lookup
> and fopen, fwrite and fclose it is much harder to understand and
> brings a lot performance overhead. This is true as long as your
> information are small enough to stay in main memory. A stl map storing
> file offsets is also not that hard to write and understand. On the
> other side using an SQL interface would bring the advantage of
> swapping database providers. An enterprise website could use oracle
> while the average user could use sqlite. Also is db4 used for any type
> of disk storage, this makes files like wallet.dat some kind of hard to
> read. It is in no way more secure than storing private key's in an xml
> file. But it is much harder to maintain and understand by the user and
> the average programer.
I can't speak for satoshi here, but I would agree with his decision on
the grounds that BDB offers a good mix. Compared to a sql-driven
library, it offers a much lighter-weight footprint. Compared to rolling
our own, its much, much simpler to use the existing code that people
have spent years writing/optimizing/fixing/etc. It also provides good
Database transactioning which Bitcoin does depend on on some (rare,
though hopefully less so as time goes on) circumstances.
>
> >>
> >> 6. hardcoded values
> >>
> >> There are tons of hardcoded values for the official and the testnet block chain. It would be a great step to move all chain related settings to a chain description file. This would allow custom chains and clean up the code.
> > This one is an interesting debate. There is no real reason to do this
> > aside from some questionable code cleanup. Also, there is no reason to
> > encourage improperly-implemented alternate chains. Alternate chains
> > should be designed in such a way as to share the main chain's difficulty
> > as described by Mike on the forum, not just make a new chain and hope it
> > sticks.
> It is not that interesting as it looks first.
Interesting might have been the wrong word. Let me rephrase that too
"of hot topic if you ask several people who incessantly create new
chains for no reason other than to create new chains".
> There is no good in
> running multiple chains for production use. To share the difficulty is
> indeed a good start to solve the problem. That's also one of the
> things I don't like off the QBitcoin client.
Neither the original client nor any other client or patch currently
implements work-sharing, I don't think you understood my statement here.
I was referring to http://forum.bitcoin.org/?topic=7219.0
> What I meant is just
> to have the possibility to have all adjustable parameters in one place
> and to be able to quickly build an internal testnet without crazy
> firewalling to prevent it from dying. The first would allow to detect
> problematic ddos protection settings early and giving the average user
> the possibility to adjust all important settings if he knows what he
> does.
Those parameters are available, though I don't think they show up in
--help output. If someone had the time to go back and document the
parameters not in --help, it would be much appreciated ;)
> That includes not only alternate chains. One could choose to
> include transactions only at a higher fee or at no fee at all.
> Everyone could do such things by changing the code anyway. But not all
> brilliant administrators or users are programmers.
That is yet another debated issue. The transaction (relay) fees are
there for a reason much more than just for the hell of it. If
transaction (relay) fees were easily changeable, they would serve no
purpose as they would all be set to 0. Transaction fee handling needs a
rethinking and recoding, but offering each user the option to just relay
every transaction off the wire is not an option.
>
> Tuesday, July 12, 2011, 4:31:07 AM, Gavin Andresen wrote:
> > We'll just tell everybody to stop using bitcoin so much for six months
> > or so while we implement a much better client. It will be exactly
> > like the bitcoin we have now, except with a much nicer internal
> > architecture and much cleaner code-base, and we're pretty sure we can
> > get it done in six months if everything goes exactly as planned.
> It is some kind of arrogant to believe that anyone would stop using
> bitcoin if some programers decide to stop working for some month. And
> it is also fond to not fix bugs in the old code base if they appear.
> Also there are lots of people out there using old clients anyway. The
> important improvement is more about quick extendibility and therefore
> more feature rich secure code. This would not only help the official
> code base, it would also improve trust and result in better external
> bitcoin related projects.
That was not at all the point of that comment. Trying to fix bugs on an
old codebase while rewriting a new one is worthless and just creating
way more effort than is necessary.
Matt
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-13 0:17 ` Matt Corallo
@ 2011-07-13 11:50 ` Mike Hearn
2011-07-13 13:04 ` Andy Parkins
0 siblings, 1 reply; 18+ messages in thread
From: Mike Hearn @ 2011-07-13 11:50 UTC (permalink / raw)
To: Matt Corallo; +Cc: bitcoin-development
For what it's worth, BitCoinJ has a NetworkParameters abstraction that
does what you suggest (groups all the constants together):
http://code.google.com/p/bitcoinj/source/browse/trunk/src/com/google/bitcoin/core/NetworkParameters.java
It exists primarily to make unit testing easier. In the test suite, we
often build small chains and other structures. We do this by using a
NetworkParameters that has the easiest difficulty possible. It means
you can solve blocks in a few attempts, easily fast enough to build
test chains of any length you like.
I suspect that as the test suite expands, a similar abstraction will
be introduced to the Satoshi client.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-13 11:50 ` Mike Hearn
@ 2011-07-13 13:04 ` Andy Parkins
2011-07-13 18:37 ` Luke-Jr
0 siblings, 1 reply; 18+ messages in thread
From: Andy Parkins @ 2011-07-13 13:04 UTC (permalink / raw)
To: bitcoin-development
[-- Attachment #1: Type: Text/Plain, Size: 1695 bytes --]
On 2011 July 13 Wednesday, Mike Hearn wrote:
> For what it's worth, BitCoinJ has a NetworkParameters abstraction that
> does what you suggest (groups all the constants together):
"all" is a strong word :-)
I'm doing a similar thing, and so far I have (and it's definitely incomplete)
the following for these magic-constants that are often literals in the offical
client:
TBlock *GenesisBlock;
TBitcoinHash ProofOfWorkLimit;
uint16_t DefaultTCPPort;
uint32_t Magic;
uint8_t BitcoinAddressPrefix;
unsigned int COINBASE_MATURITY;
unsigned int COINBASE_MINIMUM_SCRIPT_SIZE;
unsigned int COINBASE_MAXIMUM_SCRIPT_SIZE;
unsigned int MAX_BLOCK_SIZE;
unsigned int MAX_BLOCK_SIZE_GEN;
unsigned int MAX_BLOCK_SIGOPS;
unsigned int MINIMUM_TRANSACTION_SIZE;
TCoinsElement MIN_MONEY;
TCoinsElement MAX_MONEY;
TCoinsElement MIN_TX_FEE;
TCoinsElement INITIAL_MINING_REWARD;
unsigned int INFLATION_PERIOD;
unsigned int BLOCK_TIMESTAMP_WINDOW;
unsigned int DIFFICULTY_TIMESPAN;
unsigned int NEW_BLOCK_PERIOD;
unsigned int INV_MAX;
unsigned int GETDATA_MAX;
unsigned int GETBLOCKS_RESPONSES_MAX;
unsigned int GETHEADERS_RESPONSES_MAX;
unsigned int ADDR_MAX;
unsigned int ADDR_MIN_TIME;
unsigned int ADDR_MAX_TIME_OFFSET;
unsigned int ADDR_DEFAULT_TIME_PENALTY;
unsigned int ASSUME_OFFLINE_AFTER;
unsigned int OFFLINE_UPDATE_INTERVAL;
unsigned int ONLINE_UPDATE_INTERVAL;
map<unsigned int, TBitcoinHash> Checkpoints;
static const TBitcoinHash NULL_REFERENCE_HASH;
static const unsigned int NULL_REFERENCE_INDEX;
Any suggestions for others gratefully received.
Andy
--
Dr Andy Parkins
andyparkins@gmail.com
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-13 13:04 ` Andy Parkins
@ 2011-07-13 18:37 ` Luke-Jr
2011-07-13 21:41 ` Andy Parkins
0 siblings, 1 reply; 18+ messages in thread
From: Luke-Jr @ 2011-07-13 18:37 UTC (permalink / raw)
To: bitcoin-development
On Wednesday, July 13, 2011 9:04:09 AM Andy Parkins wrote:
> TCoinsElement MAX_MONEY;
This isn't an arbitrary constant, it's the result of a calculation...
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-13 18:37 ` Luke-Jr
@ 2011-07-13 21:41 ` Andy Parkins
0 siblings, 0 replies; 18+ messages in thread
From: Andy Parkins @ 2011-07-13 21:41 UTC (permalink / raw)
To: Luke-Jr; +Cc: bitcoin-development
On Wednesday 13 July 2011 19:37:57 Luke-Jr wrote:
> On Wednesday, July 13, 2011 9:04:09 AM Andy Parkins wrote:
> > TCoinsElement MAX_MONEY;
>
> This isn't an arbitrary constant, it's the result of a calculation...
Don't tell me:
bitcoin/src/main.h:41
static const int64 MAX_MONEY = 21000000 * COIN
21,000,000 seems pretty arbitrary to me.
--
Dr Andy Parkins
andyparkins@gmail.com
^ permalink raw reply [flat|nested] 18+ messages in thread
* [Bitcoin-development] overall bitcoin client code quality
@ 2011-07-10 22:37 Michael Offel
2011-07-10 23:07 ` Douglas Huff
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Michael Offel @ 2011-07-10 22:37 UTC (permalink / raw)
To: bitcoin-development
Hello,
I would like to start a discussion about code quality to catch some opinions and create an codebase cleanup plan.
Let me just start with some points I've seen while reading and stepping throw bitcoin:
1. nearly no code documentation
All I found was the original paper and some partial wiki pages and the overall coding style does not help much here. I would love to see some class hierarchy, method descriptions and thoughts in the code. Instead one can find comments like this...
> // Map ports with UPnP
> if (fHaveUPnP)
> MapPort(fUseUPnP);
This comment is just waste of valuable disk space and time for anyone who reads it. While I can guess what the CScript class does I would more like to understand the thoughts behind the decision to implement some crypto macros in a compile time interpreter and why Berkeley db 4 is used at all.
2. isolation of modules
It would be a lot easier to understand parts of the code if they would use well defined interfaces to communicate. Bitcoin makes use of global variables, public member variables in classes and global external functions what makes understanding the code a lot harder. To give an example here, the irc module has no interface at use it or to use. It gets some kind of magic thread started and pushes received addresses directly to some global function in net.cpp. The code is full of concepts like this. A well defined interface would be an bitcoin unrelated IRC module interface and some kind of translation class between the IRC and peer2peer network interface.
3. poor use of threads
Bitcoin starts a new thread for every little module it has and as soon as there is some serious work to do, it locks some kind of global mutex. While this eliminates nearly every performance advantage, it introduces a high difficulty in writing bug free code. Not only that one has to know which mutex to lock when, there is no documentation about that, one may also accidently add dead locks. This kind of bug is very hard to find, it may run well for a million of tests and crash or just hang on the next one. And my experience tells me that it will not be an developer nor an little user when it occurs. The fist user who hit's the bug will be mt gox doing an emergency transfer. ;) My suggestion is to remove all threads and critical sections and build a sequential easy to follow model. Some modules like the cpu miner may still require to spawn threads, but he should do this internally and hide any synchronization.
4. long build times
It takes longer to build Bitcoin than building some of the million lines of code projects I'm working on. The reasons I did see so far is the use of boost, lack of module isolation and implementations in header files. While the rest is just bad coding style the use of boost is arguable. As far as I can see it is mainly used for threading and FOREACH. I already put threading on the table, but using pthread or making an platform dependent startthread and mutex would be much more lightweight and nobody needs FOREACH. Boost is also an heavy non standard dependency that is an unnecessary barrier for new developers.
5. style guide
There is a style guide but it does not include anything about readability.
I'm talking about one file per class, no methods and single code line longer than a screen page. It should be natural to write code like this and I dislike having a lot of rules but the code shows that there is a need for such thing.
6. hardcoded values
There are tons of hardcoded values for the official and the testnet block chain. It would be a great step to move all chain related settings to a chain description file. This would allow custom chains and clean up the code.
All this is just the tip of the iceberg. Bitcoin is a widely used application and users are transferring millions of dollars. The current code quality is very prone to bug's. To be clear I'm not saying there are a lot of bugs nor do I blame someone for the code quality. It is still a beta version and it was necessary to bring out a working version to attract more developers. And it is hard to analyze the code or just see a bug during development. One can be happy to understand what a method does, but this gives not the confidence to see and report obvious bugs.
Let me also say that I'm not pointing to someone to do all this. I'm willing to spend a lot of time on this promising project but this kind of cleanup is simply too large for one person who is new to the code.
My overall suggestion is to begin a complete rewrite, inspired by the old code rather than moving a lot of "known to be somehow functional" around.
The official Bitcoin client should be some kind of an reference project for other clients and must therefore be extra clean and well documented.
Hopefully I did not hurt someone's feelings.
Michael
___________________________________________________________
Schon gehört? WEB.DE hat einen genialen Phishing-Filter in die
Toolbar eingebaut! http://produkte.web.de/go/toolbar
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-10 22:37 Michael Offel
@ 2011-07-10 23:07 ` Douglas Huff
2011-07-10 23:31 ` Jeff Garzik
` (3 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Douglas Huff @ 2011-07-10 23:07 UTC (permalink / raw)
To: Michael Offel; +Cc: bitcoin-development
[-- Attachment #1: Type: text/plain, Size: 805 bytes --]
On Jul 10, 2011, at 5:37 PM, Michael Offel wrote:
> 4. long build times
>
> It takes longer to build Bitcoin than building some of the million lines of code projects I'm working on. The reasons I did see so far is the use of boost, lack of module isolation and implementations in header files.
I've actually offered (on irc) to fix the latter a few times and been told that a pull removing implementations from the headers would not be merged so have not wasted the effort. (Because it's a rather large one.) Not only does this cause long build times it makes adding new code in the logically "right" places nearly impossible due to the build deps.
I agree with pretty much the entirety of your post and think all of it needs to be discussed further and addressed.
--
Douglas Huff
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 881 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-10 22:37 Michael Offel
2011-07-10 23:07 ` Douglas Huff
@ 2011-07-10 23:31 ` Jeff Garzik
2011-07-10 23:36 ` Matt Corallo
` (2 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Jeff Garzik @ 2011-07-10 23:31 UTC (permalink / raw)
To: Michael Offel; +Cc: bitcoin-development
On Sun, Jul 10, 2011 at 6:37 PM, Michael Offel <Michael.Offel@web.de> wrote:
> 1. nearly no code documentation
agreed -- contributions welcome
> 2. isolation of modules
It is a long term goal to move towards 'libbitcoin"
> 3. poor use of threads
agreed -- contributions welcome
> 4. long build times
very low priority.
> 5. style guide
contributions to existing style guide welcome
> 6. hardcoded values
very low priority
> Let me also say that I'm not pointing to someone to do all this. I'm willing to spend a lot of time on this promising project but this kind of cleanup is simply too large for one person who is new to the code.
> My overall suggestion is to begin a complete rewrite, inspired by the old code rather than moving a lot of "known to be somehow functional" around.
> The official Bitcoin client should be some kind of an reference project for other clients and must therefore be extra clean and well documented.
>
> Hopefully I did not hurt someone's feelings.
We know the code isn't what people would prefer, but that's what we've
inherited. Everybody has suggestions, but given limited development
resources we're swamped as is. (hence all the "contributions welcome"
responses)
--
Jeff Garzik
exMULTI, Inc.
jgarzik@exmulti.com
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-10 22:37 Michael Offel
2011-07-10 23:07 ` Douglas Huff
2011-07-10 23:31 ` Jeff Garzik
@ 2011-07-10 23:36 ` Matt Corallo
2011-07-11 2:01 ` Luke-Jr
2011-07-11 9:33 ` Mike Hearn
4 siblings, 0 replies; 18+ messages in thread
From: Matt Corallo @ 2011-07-10 23:36 UTC (permalink / raw)
To: bitcoin-development
[-- Attachment #1: Type: text/plain, Size: 8332 bytes --]
On Mon, 2011-07-11 at 00:37 +0200, Michael Offel wrote:
> Hello,
>
> I would like to start a discussion about code quality to catch some opinions and create an codebase cleanup plan.
>
> Let me just start with some points I've seen while reading and stepping throw bitcoin:
>
>
> 1. nearly no code documentation
Yep, anyone with the time can gladly comment up the code, it would be
much appreciated, but as it stands now there are more important things
to do...like many of the things here:
> All I found was the original paper and some partial wiki pages and the overall coding style does not help much here. I would love to see some class hierarchy, method descriptions and thoughts in the code.
Yes, thats one of the general development goal, sipa's CWallet was an
excellent start, but much more work needs done in terms of clear
splitting of the code.
> Instead one can find comments like this...
>
> > // Map ports with UPnP
> > if (fHaveUPnP)
> > MapPort(fUseUPnP);
>
> This comment is just waste of valuable disk space and time for anyone who reads it.
My bad, was just following the previous comments...
> While I can guess what the CScript class does I would more like to understand the thoughts behind the decision to implement some crypto macros in a compile time interpreter and
> why Berkeley db 4 is used at all.
At the time Bitcoin began being built, Ubuntu 9.04 (or was it 9.10?) was
used, as it offered the oldest libc on the newest OS. Ubuntu 9.04 just
happened to only have db4.7. For backward compatibility, db4.7 has been
used ever since (except, for some reason, the osx builds). In 0.4,
db4.8 will be used. If you are asking why bdb was used to begin with,
why not? its an excellent db and why reinvent the wheel?
>
> 2. isolation of modules
>
> It would be a lot easier to understand parts of the code if they would use well defined interfaces to communicate. Bitcoin makes use of global variables, public member variables in classes and global external functions what makes understanding the code a lot harder.
> To give an example here, the irc module has no interface at use it or to use. It gets some kind of magic thread started and pushes received addresses directly to some global function in net.cpp. The code is full of concepts like this.
> A well defined interface would be an bitcoin unrelated IRC module interface and some kind of translation class between the IRC and peer2peer network interface.
Though satoshi was clearly brilliant, he didn't care much for code
cleanliness. This is one of the next development goals (IMO).
>
> 3. poor use of threads
>
> Bitcoin starts a new thread for every little module it has and as soon as there is some serious work to do, it locks some kind of global mutex. While this eliminates nearly every performance advantage, it introduces a high difficulty in writing bug free code.
> Not only that one has to know which mutex to lock when, there is no documentation about that, one may also accidently add dead locks.
> This kind of bug is very hard to find, it may run well for a million of tests and crash or just hang on the next one. And my experience tells me that it will not be an developer nor an little user when it occurs.
> The fist user who hit's the bug will be mt gox doing an emergency transfer. ;)
This is something that will come with general code cleanup and
modularization. The locks will become specific to the object (as they
should be) and the performance and clarity will be fixed.
> My suggestion is to remove all threads and critical sections and build a sequential easy to follow model.
> Some modules like the cpu miner may still require to spawn threads, but he should do this internally and hide any synchronization.
Though it would be ideal to rewrite 90% of Bitcoin just to fix code
clarity, that is way more work than anyone has time for, in the mean
time there is more than just code cleanup that needs done. It has to be
done in chunks.
>
> 4. long build times
>
> It takes longer to build Bitcoin than building some of the million lines of code projects I'm working on. The reasons I did see so far is the use of boost, lack of module isolation and implementations in header files.
> While the rest is just bad coding style the use of boost is arguable. As far as I can see it is mainly used for threading and FOREACH. I already put threading on the table,
> but using pthread or making an platform dependent startthread and mutex would be much more lightweight and nobody needs FOREACH. Boost is also an heavy non standard dependency that is an unnecessary barrier for new developers.
Header files could stand to be cleaned up a bit, though all the
implementation stuff is limited to one or two lines (though sometimes
thats too much). If you want to rewrite Bitcoin sans-boost, please do,
however Boost really isnt a huge barrier as its a build-once thing. If
you are on Linux, all you have to do is install a bunch of packages and
build wx. If you are on Windows, why are you on Windows? ;)
>
> 5. style guide
>
> There is a style guide but it does not include anything about readability.
> I'm talking about one file per class, no methods and single code line longer than a screen page. It should be natural to write code like this and I dislike having a lot of rules but the code shows that there is a need for such thing.
Its not due to the current coders, its due to how it was originally
written.
>
> 6. hardcoded values
>
> There are tons of hardcoded values for the official and the testnet block chain. It would be a great step to move all chain related settings to a chain description file. This would allow custom chains and clean up the code.
This one is an interesting debate. There is no real reason to do this
aside from some questionable code cleanup. Also, there is no reason to
encourage improperly-implemented alternate chains. Alternate chains
should be designed in such a way as to share the main chain's difficulty
as described by Mike on the forum, not just make a new chain and hope it
sticks.
>
> All this is just the tip of the iceberg. Bitcoin is a widely used application and users are transferring millions of dollars. The current code quality is very prone to bug's.
> To be clear I'm not saying there are a lot of bugs nor do I blame someone for the code quality. It is still a beta version and it was necessary to bring out a working version to attract more developers.
> And it is hard to analyze the code or just see a bug during development. One can be happy to understand what a method does, but this gives not the confidence to see and report obvious bugs.
>
> Let me also say that I'm not pointing to someone to do all this. I'm willing to spend a lot of time on this promising project but this kind of cleanup is simply too large for one person who is new to the code.
> My overall suggestion is to begin a complete rewrite, inspired by the old code rather than moving a lot of "known to be somehow functional" around.
Really no reason to do that. Although the code is messy in terms of
global usage and poorly-implemented RPC/net/etc, most of the code is
absolutely fine. Just throw it in clearly-defined methods and classes
and it would be much more readable and less prone to mistakes.
Additionally, the things that are poorly-implemented can be slowly
changed over time in a clean and independent fashion instead of having
to rewrite massive chunks at a time. Even if we had a full-time
development team of many, many developers, this isn't the right way to
do it. The code itself is cleaner that it first appears, even if its
global structure is not.
> The official Bitcoin client should be some kind of an reference project for other clients and must therefore be extra clean and well documented.
True, but it is much higher priority to clean up the code than comment
it better, plus there are various other features/more user-facing issues
that need fixed as well, so...
>
> Hopefully I did not hurt someone's feelings.
Don't think so, the code sucks in terms of cleanliness, everyone knows
it, its just a question of who is going to and when its going to get
fixed.
Matt
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-10 22:37 Michael Offel
` (2 preceding siblings ...)
2011-07-10 23:36 ` Matt Corallo
@ 2011-07-11 2:01 ` Luke-Jr
2011-07-12 4:13 ` Alan Grimes
2011-07-11 9:33 ` Mike Hearn
4 siblings, 1 reply; 18+ messages in thread
From: Luke-Jr @ 2011-07-11 2:01 UTC (permalink / raw)
To: bitcoin-development; +Cc: Michael Offel
On Sunday, July 10, 2011 6:37:15 PM Michael Offel wrote:
> why Berkeley db 4 is used at all.
Because it's a good tool for the job? Or you mean the version?
Debian stable: 4.8
Gentoo stable: 4.8
Ubuntu LTS : 4.8
> Boost is also an heavy non standard dependency that is an unnecessary
> barrier for new developers.
Boost is pretty much standard C++ nowadays.
> I'm talking about one file per class, no methods and single code line
> longer than a screen page. It should be natural to write code like this
> and I dislike having a lot of rules but the code shows that there is a
> need for such thing.
Blame your text editor if it can't show long lines sanely. The only problem I
see with the style itself is the use of spaces instead of tabs.
> My overall suggestion is to begin a complete rewrite, inspired by the old
> code rather than moving a lot of "known to be somehow functional" around.
There are many rewrites in progress, often with much better designs.
> The official Bitcoin client should be some kind of an reference project
> for other clients and must therefore be extra clean and well documented.
Bitcoin is supposed to be an authorityless project. There is no official.
> *everything else*
Fix it yourself and submit the changes. If they don't get merged, fork.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-11 2:01 ` Luke-Jr
@ 2011-07-12 4:13 ` Alan Grimes
2011-07-12 5:19 ` Jeff Garzik
0 siblings, 1 reply; 18+ messages in thread
From: Alan Grimes @ 2011-07-12 4:13 UTC (permalink / raw)
To: bitcoin-development
Yeah, I'm starting to run into real design problems in my attempt to
write a bitcoin daemon, namely I'm trying to figure out how to manage
connections and peers and how to interpret and coordinate messages from
peers.
While spying on the old code, I noticed one major problem that could be
fixed quite easily. That is, the 1 class-per .h/.cpp rule is completely
ignored in main.h/cpp and net.h/cpp If all of the classes in the project
were re-factored to their own files, it would be much easier to audit
what was interacting with what. I think this work could be done within
16 programmer hours.
The old codebase doesn't build on my system but I'd be willing to rough
it in anyway.
I think this should be done in parallel to efforts, such as mine, to
re-implement major functionality.
--
E T F
N H E
D E D
Powers are not rights.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-12 4:13 ` Alan Grimes
@ 2011-07-12 5:19 ` Jeff Garzik
0 siblings, 0 replies; 18+ messages in thread
From: Jeff Garzik @ 2011-07-12 5:19 UTC (permalink / raw)
To: Alan Grimes; +Cc: bitcoin-development
On Tue, Jul 12, 2011 at 12:13 AM, Alan Grimes <agrimes@speakeasy.net> wrote:
> While spying on the old code, I noticed one major problem that could be
> fixed quite easily. That is, the 1 class-per .h/.cpp rule is completely
> ignored in main.h/cpp and net.h/cpp If all of the classes in the project
> were re-factored to their own files,
This is about the last thing we should do, and it's one of the worst
coding practices of many C++ projects (and unfortunately carried over
to Java by force). See Knuth and his work on literate programming.
Putting logically similar code -together- is often more impactful and
meaningful than splitting up files into dozens (hundreds or thousands,
in large projects) of tiny, 20-line files.
A project attains zen in the -balance-. Bitcoin was not well served
by "everything in main.cpp" approach -- but neither is it well served
by splitting wallet / transaction / etc. logic across a great many
files. We should not have to ask ourselves, Is This Code In
WalletFactory.cpp, WalletTx.cpp, WalletStore.cpp, WalletKey.cpp,
WalletCrypt.cpp, or in s/$F.cpp/$F.h/ ?
Strict, unthinking rules do not buy anything, and can cost us much.
Plus, no matter how you slice it, bitcoin is Hard To Learn and there's
no getting around that.
--
Jeff Garzik
exMULTI, Inc.
jgarzik@exmulti.com
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-10 22:37 Michael Offel
` (3 preceding siblings ...)
2011-07-11 2:01 ` Luke-Jr
@ 2011-07-11 9:33 ` Mike Hearn
2011-07-12 3:31 ` Gavin Andresen
` (2 more replies)
4 siblings, 3 replies; 18+ messages in thread
From: Mike Hearn @ 2011-07-11 9:33 UTC (permalink / raw)
To: Michael Offel; +Cc: bitcoin-development
> My overall suggestion is to begin a complete rewrite, inspired by the old
> code rather than moving a lot of "known to be somehow functional" around.
This essay is old but still relevant, I think:
http://www.joelonsoftware.com/articles/fog0000000069.html
Despite that, there are efforts to write a fresh implementation. For
example, BitCoinJ:
http://code.google.com/p/bitcoinj/
It is not a complete implementation. It's targeting the "simplified
payment verification" mode as a first base, and is mostly intended for
mobile phones today as that's a niche the current codebase can't meet.
In the (very) long run, it may evolve into a full node.
> Hopefully I did not hurt someone's feelings.
The code was written by Satoshi who is long gone, and I doubt he would
care much for this type of list anyway. He was a do-er rather than a
talker.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-11 9:33 ` Mike Hearn
@ 2011-07-12 3:31 ` Gavin Andresen
2011-07-12 7:21 ` John Smith
2011-07-12 22:50 ` Michael Offel
2 siblings, 0 replies; 18+ messages in thread
From: Gavin Andresen @ 2011-07-12 3:31 UTC (permalink / raw)
To: bitcoin-development
It is SO tempting to start over from scratch, isn't it?
We'll just tell everybody to stop using bitcoin so much for six months
or so while we implement a much better client. It will be exactly
like the bitcoin we have now, except with a much nicer internal
architecture and much cleaner code-base, and we're pretty sure we can
get it done in six months if everything goes exactly as planned.
I think incremental improvement of the "devil we know" is the right
thing to do right now, although I'm going to spend more time thinking
about how to make sure different bitcoin implementations work well
together (I've started working on network-protocol-level testing).
Regarding Michael's specific suggestions: the
lots-of-threads-and-mutexes architecture of the client bothers me
because it is too easy to change code and create a deadlock that is
very hard to debug and fix. Switching to asynchronous IO might be the
right thing to do. Then again, it might be easier to modify the
CRITICAL_SECTION code to detect and report deadlocks (anybody have
experience doing that?).
--
--
Gavin Andresen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-11 9:33 ` Mike Hearn
2011-07-12 3:31 ` Gavin Andresen
@ 2011-07-12 7:21 ` John Smith
2011-07-12 22:50 ` Michael Offel
2 siblings, 0 replies; 18+ messages in thread
From: John Smith @ 2011-07-12 7:21 UTC (permalink / raw)
To: Mike Hearn; +Cc: bitcoin-development, Michael Offel
[-- Attachment #1: Type: text/plain, Size: 1168 bytes --]
On Mon, Jul 11, 2011 at 9:33 AM, Mike Hearn <mike@plan99.net> wrote:
> > My overall suggestion is to begin a complete rewrite, inspired by the old
> > code rather than moving a lot of "known to be somehow functional" around.
>
> This essay is old but still relevant, I think:
>
> http://www.joelonsoftware.com/articles/fog0000000069.html
>
+1
More code documentation would be helpful, and so would making the interfaces
more understandable/readable, and getting rid of the manual locking
(especially in client code!), but I don't see how that would warrant a
complete rewrite.
Some refactoring would be much safer than trying to reproduce every nook and
cranny in a rewrite.
re:4) I also don't see why boost would be a 'nonstandard dependency'. From
what I understand, a large part of the new C++0x standard is derived from
boost. Also C++ compilers are getting faster and more smart all the time, so
I absolutely don't see "build speed" as a goal.
re:6) I've already submitted a few pull requests that replace hardcoded
magic values with constants. Moving the constants to a config file is not
needed IMO because the end-user doesn't need to change them.
JS
[-- Attachment #2: Type: text/html, Size: 1654 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Bitcoin-development] overall bitcoin client code quality
2011-07-11 9:33 ` Mike Hearn
2011-07-12 3:31 ` Gavin Andresen
2011-07-12 7:21 ` John Smith
@ 2011-07-12 22:50 ` Michael Offel
2 siblings, 0 replies; 18+ messages in thread
From: Michael Offel @ 2011-07-12 22:50 UTC (permalink / raw)
To: bitcoin-development
Monday, July 11, 2011, 10:33:04 AM, Mike Hearn wrote:
>> My overall suggestion is to begin a complete rewrite, inspired by the old
>> code rather than moving a lot of "known to be somehow functional" around.
> This essay is old but still relevant, I think:
> http://www.joelonsoftware.com/articles/fog0000000069.html
It is indeed a very good one but I disagree in two points. First
> As a corollary of this axiom, you can ask almost any programmer
> today about the code they are working on. "It's a big hairy mess,"
> they will tell you. "I'd like nothing better than to throw it out and start over."
If someone asks me this question about the project I'm currently
working on I would never answer like that. This is important in the
threads context because it gives me the confidence to say you can
build very large C++ projects with lots of programers attached over
multiple years and still have a very clean and nice code base. And the
article does also accidently points to one of the roots of messy code.
> Back to that two page function. Yes, I know, it's just a simple
> function to display a window, but it has grown little hairs and
> stuff on it and nobody knows why. Well, I'll tell you why: those are
> bug fixes. One of them fixes that bug that Nancy had when she tried
> to install the thing on a computer that didn't have Internet
> Explorer. Another one fixes that bug that occurs in low memory
> conditions. Another one fixes that bug that occurred when the file
> is on a floppy disk and the user yanks out the disk in the middle.
> That LoadLibrary call is ugly but it makes the code work on old
> versions of Windows 95.
Well, if you can not identify the meaning of some hairs, they are
either attached to the wrong place, in an unclear way or are just
missing a line of comment. There are studies about what size of
function and up to what number of variables per function can be
covered by the average code reading programer. And these numbers are
low.
Second, I agree with the point that you can not give up your market
leadership by beginning from scratch and you will if you do so. Unless
you do it like microsoft did in his example by simultaneously extend
the old code base. And microsoft does this all the time, just look at
Windows 9x compared to Windows NT and the dead object oriented Windows
kernel. (They did buy lots of that but that is not the point here)
The mistake is to compare a small project like Bitcoin to any
application like Word or Netscape. The author did explicitly write
that he does not mean that partial rewrite is a bad thing. And
rewriting the Bitcoin client with three or four guys is like a tiny
rewrite in a real world application like Word.
> It's important to remember that when you start from scratch there is
> absolutely no reason to believe that you are going to do a better
> job than you did the first time.
In case of Bitcoin there is reason to believe that a rewrite would be
better. The first version was hacked together by far less programmers
and by at least one who did not care about readability, what tells me
that he possibly did never work on a real project before. And we now
have a known to work protocol, what did for sure slow down the
development a lot and caused rewrites.
Michael
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2011-07-13 21:41 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-12 21:47 [Bitcoin-development] overall bitcoin client code quality Michael Offel
2011-07-12 23:40 ` Gregory Maxwell
2011-07-13 0:17 ` Matt Corallo
2011-07-13 11:50 ` Mike Hearn
2011-07-13 13:04 ` Andy Parkins
2011-07-13 18:37 ` Luke-Jr
2011-07-13 21:41 ` Andy Parkins
-- strict thread matches above, loose matches on Subject: below --
2011-07-10 22:37 Michael Offel
2011-07-10 23:07 ` Douglas Huff
2011-07-10 23:31 ` Jeff Garzik
2011-07-10 23:36 ` Matt Corallo
2011-07-11 2:01 ` Luke-Jr
2011-07-12 4:13 ` Alan Grimes
2011-07-12 5:19 ` Jeff Garzik
2011-07-11 9:33 ` Mike Hearn
2011-07-12 3:31 ` Gavin Andresen
2011-07-12 7:21 ` John Smith
2011-07-12 22:50 ` Michael Offel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox