If code movement is not compressed into a tight time window, code movement becomes a constant stream during development.  A constant stream of code movement is a constant stream of patch breakage, for any patch or project not yet in-tree.  The result is to increase the work and cost on any contributor whose patches are not immediately merged.

For the record, since this is trending reddit, I __do__ support the end result of 0.10 refactoring, the work towards the consensus lib.

My criticism is of a merge flow which _unintentionally_ rewards only certain types of patches, and creates disincentives for working on other types of patches.







On Mon, Dec 15, 2014 at 4:19 PM, Cory Fields <lists@coryfields.com> wrote:
On Mon, Dec 15, 2014 at 2:35 PM, Jeff Garzik <jgarzik@bitpay.com> wrote:
> On Mon, Dec 15, 2014 at 1:42 PM, Cory Fields <lists@coryfields.com> wrote:
>>
>> That's exactly what happened during the modularization process, with
>> the exception that the code movement and refactors happened in
>> parallel rather than in series. But they _were_ done in separate
>> logical chunks for the sake of easier review.
>
>
> "That's exactly what was done except it wasn't"
>
> Yes, in micro, at the pull request level, this happened
> * Code movement
> * Refactor
>
> At a macro level, that cycle was repeated many times, leading to the
> opposite end result:  a lot of tiny movement/refactor/movement/refactor
> producing the review and patch annoyances described.
>
> It produces a blizzard of new files and new data structures, breaking a
> bunch of out-of-tree patches, complicating review quite a bit.  If the vast
> majority of code movement is up front, followed by algebraic
> simplifications, followed by data structure work, further patches are easy
> to review/apply with less impact on unrelated code.
>

I won't argue that at all because it's perfectly logical, but in
practice that doesn't translate from the macro level to the micro
level very well. At the micro level, minor code changes are almost
always needed to accommodate useful code movement. Even if they're not
required, it's often hard to justify code movement for the sake of
code movement with the promise that it will be useful later.

Rather than arguing hypotheticals, let's use a real example:
https://github.com/bitcoin/bitcoin/pull/5118 . That one's pretty
simple. The point of the PR was to unchain our openssl wrapper so that
key operations could be performed by the consensus lib without
dragging in bitcoind's structures. The first commit severs the
dependencies. The second commit does the code movement from the
now-freed wrapper.

I'm having a hard time coming up with a workflow that would handle
these two changes as _separate_ events, while making review easier.
Note that I'm not attempting to argue with you here, rather I'm
genuinely curious as to how you'd rather see this specific example
(which is representative of most of my other code movement for the
libbitcoinconsensus work, i believe) handled.

Using your model above, I suppose we'd do the code movement first with
the dependencies still intact as a pull request. At some later date,
we'd sever the dependencies in the new files. I suppose you'd also
prefer that I group a bunch of code-movement changes together into a
single PR which needs little scrutiny, only verification that it's
move-only. Once the code-movement PRs are merged, I can begin the
cleanups which actually fix something.

In practice, though, that'd be a massive headache for different
reasons. Lots in flux with seemingly no benefits until some later
date. My PR's can't depend on eachother because they don't actually
fix issues in a linear fashion. That means that other devs can't
depend on my PRs either for the same reason. And what have we gained?

Do you find that assessment unreasonable?

Cory


--
Jeff Garzik
Bitcoin core developer and open source evangelist
BitPay, Inc.      https://bitpay.com/