From: Btc Drak <btcdrak@gmail.com>
To: Jeff Garzik <jgarzik@gmail.com>
Cc: Bitcoin development mailing list <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] libconsensus and bitcoin development process
Date: Tue, 15 Sep 2015 10:55:34 +0100 [thread overview]
Message-ID: <CADJgMzvSUTrPZa0mQvBsQaYkMN3NPXVB_Ay6RJNivq-agUoUww@mail.gmail.com> (raw)
In-Reply-To: <CADm_WcY8Vy+k+5BaBS+jV6D6tmSXrok8rAxoPxxKOzUhyPWgMg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 12157 bytes --]
I also share a lot of Jeff's concerns about refactoring and have voiced
them several times on IRC and in private to Jorge, Wladamir and Greg. I
meant to do a write up but never got around to it. Jeff has quite
eloquently stated the various problems. I would like to share my thoughts
on the matter because we really do need to come up with a plan on how this
issue is dealt with.
Obviously, Bitcoin Core is quite tightly coupled at the moment and
definitely needs extensive modularisation. Such work will inevitably
require lots of bulk code moves and then finer refactoring. However, it
requires proper planning because there are lots of effects and consequences
for other people contributing to Core and also downstream projects relying
on Core:
1. Refactoring often causes other pull requests to diverge and require
rebasing. Continual refactoring can put PRs in "rebase hell" and puts a big
stress on contributors (many of whom are part time).
2. Version to version, Bitcoin Core changes significantly in structure. 0.9
to 0.10 is unrecognisable. 0.10 to 0.11 is even more so. This makes makes
it hard to follow release to release and the net result is less people
upgrade (especially think of miners trying to keep their patch sets working
while trying not to disrupt or risk their mining operations).
3. Continual refactoring increases risk: we're human, and mistakes will
slip through peer review. This is especially concerning with consensus
critical code and this makes it difficult to merge such refactoring often,
which of course exacerbates the problem.
The net negative consequence is it is harder to contribute to Core, harder
for the Core maintainers to merge and harder for downstream/dependent
projects/implementations to keep up.
Suggested Way Forward
---------------------------------
With the understanding that refactored code by definition must not change
behaviour. There are three major kinds of refactoring:
1. code moves (e.g. separating concerns into different files);
2. code style;
3. structural optimisation and consolidation (reducing LOC, separating
concerns, encapsulation etc).
Code moves(1) and CS(2) are easy to peer review and merge quickly. The
third kind(3) requires deeper analysis to ensure that while the code
changed, the behaviour (including any bugs) did not.
We must resist all temptation to fix bugs or tack on minor fixes and tweaks
during refactoring: pull requests should only be refactoring only, with no
net change to behaviour. Keeping discipline makes it much easier to verify
and peer review and this faster to merge.
With respect to Code moves and CS, I believe we should have a "refactoring
fortnight" where we so the bulk of code move-only refactoring plus CS where
necessary. This is by fat the most disruptive kind of change because it
widely affects other PRs mergeability. We should aim to get most of this
done in one go, so that it's not happening in dribs and drabs over months
and many releases. Once done, it gives everyone a good idea to the overall
new structure and where one can expect to find things in the future. The
idea here is to help orientation and not have to continuously hunt for
where things have moved to.
To be clear, I am strongly suggesting code move-only refactoring PRs not be
mixed with anything else. Same for CS changes. This makes the PRs extremely
easy to vet and thus quick to merge.
Towards this end, maybe there should be an IRC meeting to agree the initial
moves, then someone who has the stomach for it can get on and do it -
during that time, we do not merge anything else. We need to bite the bullet
and break the back out of code moves.
With regards to CS, I think we do need to get CS right, because a continual
dribble of CS changes also makes diffs between releases less easy to
follow. Much of CS checking can be automated by the continuous integration
so authors can get it right easily. It can be just like a Travis check.
With respect to the 3rd kind of refactoring, we need to set some standards
and goals and aim for some kind of consistency. Refactoring needs to fulfil
certain goals and criterion otherwise contributors will always find a
reason to fiddle over and over forever. Obvious targets here can be things
like proper encapsulation and separation of concerns.
Overall, refactoring should be merged quickly, but only on a schedule so it
doesn't cause major disruption to others.
Obviously the third kind of refactoring more complex and time consuming and
will need to occur over time, but it should happen in defined steps. As
Jeff said, one week a month, or maybe one month a release. In any case,
refactoring changes should be quickly accepted or rejected by the project
maintainer and not left hanging.
Finally, refactoring should *always* be uncontroversial because essentially
functionality is not changing. If functionality changes (e.g. you try to
sneak in a big fix or feature tweak "because it's small") the PR should be
rejected outright. Additionally, if we break down refactoring into the
three kinds stated above, peer review will be much more straightforward.
On Tue, Sep 15, 2015 at 5:10 AM, Jeff Garzik via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:
> [collating a private mail and a github issue comment, moving it to a
> better forum]
>
> On libconsensus
> ---------------
> In general there exists the reasonable goal to move consensus state
> and code to a specific, separate lib.
>
> To someone not closely reviewing the seemingly endless stream of
> libconsensus refactoring PRs, the 10,000 foot view is that there is a
> rather random stream of refactors that proceed in fits and starts
> without apparent plan or end other than a one sentence "isolate
> consensus state and code" summary.
>
> I am hoping that
> * There is some plan
> * We will not see a five year stream of random consensus code movement
> patches causing lots of downstream developer headaches.
>
> I read every code change in every pull request that comes into
> github/bitcoin/bitcoin with three exceptions:
> * consensus code movement changes - too big, too chaotic, too
> frequent, too unfocused, laziness guarantees others will inevitably
> ACK it without me.
> * some non-code changes (docs)
> * ignore 80% of the Qt changes
>
> As with any sort of refactoring, they are easy to prove correct, easy
> to reason, and therefore quick and easy to ACK and merge.
>
> Refactors however have a very real negative impact.
> bitcoin/bitcoin.git is not only the source tree in the universe.
> Software engineers at home, at startups, and at major companies are
> maintaining branches of their own.
>
> It is very very easy to fall into a trap where a project is merging
> lots of cosmetic changes and not seeing the downstream ripple effects.
> Several people complained to me at the conference about all the code
> movement changes breaking their own work, causing them to stay on
> older versions of bitcoin due to the effort required to rebase to each
> new release version - and I share those complaints.
>
> Complex code changes with longer development cycles than simple code
> movement patches keep breaking. It is very frustrating, and causes
> folks to get trapped between a rock and a hard place:
> - Trying to push non-trivial changes upstream is difficult, for normal
> and reasonable reasons (big important changes need review etc.).
> - Maintaining non-trivial changes out of tree is also painful, for the
> aforementioned reasons.
>
> Reasonable work languishes in constant-rebase hell, and incentivizes
> against keeping up with the latest tree.
>
>
> Aside from the refactor, libconsensus appears to be engineering in the
> dark. Where is any sort of plan? I have low standards - a photo of a
> whiteboard or youtube clip will do.
>
> The general goal is good. But we must not stray into unfocused
> engineering for a non-existent future library user.
>
> The higher priority must be given to having a source code base that
> maximizes the collective developers' ability to maintain The Router --
> the core bitcoin full node P2P engine.
>
> I recommend time-based bursts of code movement changes. See below;
> for example, just submit & merge code movement changes on the first
> week of every 2nd month. Code movement changes are easy to create
> from scratch once a concrete goal is known. The coding part is
> trivial and takes no time.
>
> As we saw in the Linux kernel - battle lessons hard learned - code
> movement and refactors have often unseen negative impact on downstream
> developers working on more complicated changes that have more positive
> impact to our developers and users.
>
>
> On Bitcoin development release cycles & process
> ------------------------------------------------------------------
>
> As I've outlined in the past, the Linux kernel maintenance phases
> address some of these problems. The merge window into git master
> opens for 1 week, a very chaotic week full of merging (and rebasing),
> and then the merge window closes. Several weeks follow as the "dust
> settles" -- testing, bug fixing, moving in parallel OOB with
> not-yet-ready development. Release candidates follow, then the
> release, then the cycle repeats.
>
> IMO a merge window approach fixes some of the issues with refactoring,
> as well as introduces some useful -developer discipline- into the
> development process. Bitcoin Core still needs rapid iteration --
> another failing of the current project -- and so something of a more
> rapid pace is needed:
> - 1st week of each month, merge changes. Lots of rebasing during this
> week.
> - remaining days of the month, test, bug fix
> - release at end of month
>
> If changes are not ready for merging, then so be it, they wait until
> next month's release. Some releases have major features, some
> releases are completely boring and offer little of note. That is the
> nature of time-based development iteration. It's like dollar cost
> averaging, a bit.
>
>
> And frankly, I would like to close all github pull requests that are
> not ready to merge That Week. I'm as guilty of this as any, but that
> stuff just languishes. Excluding a certain category of obvious-crap,
> pull requests tend to default to a state of either (a) rapid merging,
> (b) months-long issues/projects, (c) limbo.
>
> Under a more time-based approach, a better pull request process would be to
> * Only open pull requests if it's a bug fix, or the merge window is
> open and the change is ready to be merged in the developer's opinion.
> * Developers CC bitcoin-dev list to discuss Bitcoin Core-bound projects
> * Developers maintain and publish projects via their own git trees
> * Pull requests should be closed if unmerged after 7 days, unless it
> is an important bug fix etc.
>
> The problem with projects like libconsensus is that they can get
> unfocused and open ended. Code movement changes in particular are
> cheap to generate. It is low developer cost for the developer to
> iterate all the way to the end state, see what that looks like, and
> see if people like it. That end state is not something you would
> merge all in one go. I would likely stash that tree, and then start
> again, seek the most optimal and least disruptive set of refactors,
> and generate and merge those into bitcoin/bitcoin.git in a time-based,
> paced manner. Announce the pace ahead of time - "cosmetic stuff that
> breaks your patches will be merged 1st week of every second month"
>
> To underscore, the higher priority must be given to having a source
> code base and disciplined development process that maximizes the
> collective developers' ability to maintain The Router that maintains
> most of our network.
>
> Modularity, refactoring, cleaning up grotty code generates a deep
> seated happiness in many engineers. Field experience however shows
> refactoring is a never ending process which sometimes gets in the way
> of More Important Work.
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
[-- Attachment #2: Type: text/html, Size: 13799 bytes --]
next prev parent reply other threads:[~2015-09-15 9:55 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-15 4:10 [bitcoin-dev] libconsensus and bitcoin development process Jeff Garzik
2015-09-15 9:55 ` Btc Drak [this message]
2015-09-15 15:26 ` Jeff Garzik
2015-09-15 16:00 ` Eric Lombrozo
2015-09-15 18:26 ` Btc Drak
2015-09-16 22:29 ` Peter Todd
2015-09-18 0:07 ` Wladimir J. van der Laan
2015-09-18 8:42 ` Eric Lombrozo
2015-09-18 16:22 ` Mike Hearn
2015-09-22 18:12 ` Jorge Timón
2015-09-22 23:49 ` Dave Scotese
2015-09-23 17:28 ` Jorge Timón
2015-09-29 13:04 ` Jeff Garzik
[not found] ` <CABsx9T0dHxXzemxJN87mU59j4_KZ=zOdwxpXOUe-NhB0ENVMWw@mail.gmail.com>
2015-09-23 16:58 ` Jorge Timón
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CADJgMzvSUTrPZa0mQvBsQaYkMN3NPXVB_Ay6RJNivq-agUoUww@mail.gmail.com \
--to=btcdrak@gmail.com \
--cc=bitcoin-dev@lists.linuxfoundation.org \
--cc=jgarzik@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox