Hi aj, answering slightly out of order:

> what happens if the peer announcing packages to us is dishonest?
> They announce pkg X, say X has parents A B C and the fee rate is garbage. But actually X has parent D and the fee rate is excellent. Do we request the package from another peer, or every peer, to double check? Otherwise we're allowing the first peer we ask about a package to censor that tx from us?

Yes, providing false information shouldn't be worse than not announcing the package at all, otherwise we have a censorship vector. In general, the request logic should not let one peer prevent us from requesting a similar announcement from another peer.
Yes, I was indeed expecting that we would ask for package info from everyone who announces it, until we either accept the package or have full information.
I can see that it's a fair number of messages (request pckginfo, oh it's low fee, request pckginfo from somebody else), but we also need to track announcements / potentially go through the same cycle to handle "notfound"s, right?
In normal operation, the fee filter should stop most honest nodes from announcing low-fee packages to us.

> I think the fix for that is just to provide the fee and weight when announcing the package rather than only being asked for its info? Then if one peer makes it sound like a good deal you ask for the parent txids from them, dedupe, request, and verify they were honest about the parents.
> Likewise, I think you'd have to have the graph info from many nodes if you're going to make decisions based on it and don't want hostile peers to be able to trick you into ignoring txs.

I don't think providing more information up front can ever sufficiently resolve the censorship issue. If we want to prevent any one peer from being able to censor requests to other peers, we need to store all announcements and be prepared to request from everybody.
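
A minimal sketch of the "store all announcements and be prepared to request from everybody" idea. The class and method names are hypothetical and do not correspond to any existing implementation; peers and packages are plain strings here.

```python
from collections import defaultdict, deque

class PackageInfoTracker:
    """Remember every announcer of a package so one peer cannot block requests to others."""

    def __init__(self):
        self.pending = defaultdict(deque)  # package id -> announcers not yet asked
        self.asked = defaultdict(set)      # package id -> announcers already asked
        self.resolved = set()              # packages we accepted or fully learned about

    def on_announcement(self, pkg, peer):
        # Ignore duplicates and packages we no longer need.
        if pkg in self.resolved or peer in self.asked[pkg] or peer in self.pending[pkg]:
            return
        self.pending[pkg].append(peer)

    def next_peer(self, pkg):
        """Return the next announcer to ask for package info, or None if nobody is left."""
        if pkg in self.resolved or not self.pending[pkg]:
            return None
        peer = self.pending[pkg].popleft()
        self.asked[pkg].add(peer)
        return peer

    def on_resolved(self, pkg):
        # Called once the package is accepted or we have full information.
        self.resolved.add(pkg)
        self.pending.pop(pkg, None)
        self.asked.pop(pkg, None)

tracker = PackageInfoTracker()
tracker.on_announcement("X", "peer1")
tracker.on_announcement("X", "peer2")
first = tracker.next_peer("X")   # peer1 answers with bad or low-fee info
second = tracker.next_peer("X")  # so we still get to ask peer2
third = tracker.next_peer("X")   # nobody left to ask
```

If peer1 lies about X, the only cost is one wasted round trip before peer2 is asked.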

Would it be better if we just took out the fee information and had "pckginfo" only consist of transaction ids? Sender tries its best to apply the fee filter? Presumably you have a txInventoryKnown of your peer based on what they've announced to you... just take the ancestor set of a transaction, subtract what they already have, and apply the fee filter to that? Or some kind of algorithm that ensures we don't underestimate? If it's imperfect, the worst case is the receiver downloads a few transactions and rejects them. Given that our goal is just to avoid this case, perhaps opting for simplicity is better than adding a topology graph serialization/deserialization + feerate assessment algorithm on top of this protocol...?
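
A rough sketch of that sender-side filtering, under simplifying assumptions: the function name and data shapes are made up for illustration, the fee filter is expressed in sat/vB, and the whole missing set is treated as a single aggregate.

```python
def package_to_announce(ancestors, peer_known, peer_feefilter):
    """Decide which txids to announce as a package to one peer.

    ancestors:      dict txid -> (fee_sat, vsize) for the tx and its unconfirmed ancestors
    peer_known:     set of txids the peer is believed to have already
    peer_feefilter: the peer's fee filter in sat/vB (assumed unit)
    Returns a sorted list of txids, or [] if nothing is worth announcing.
    """
    # Subtract what the peer already has.
    missing = {txid: fv for txid, fv in ancestors.items() if txid not in peer_known}
    if not missing:
        return []
    fees = sum(fee for fee, _ in missing.values())
    size = sum(vsize for _, vsize in missing.values())
    # Apply the fee filter to the aggregate of what the peer would have to download.
    if fees < peer_feefilter * size:
        return []
    return sorted(missing)

ancestors = {"A": (0, 100), "B": (1500, 100), "X": (500, 100)}
announce = package_to_announce(ancestors, peer_known={"A"}, peer_feefilter=3)
filtered = package_to_announce(ancestors, peer_known={"A"}, peer_feefilter=11)
print(announce)  # ['B', 'X']
print(filtered)  # []
```

If the sender's view of what the peer knows is wrong, the worst case is as described above: the receiver downloads a few transactions and rejects them.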

>>We'd erroneously ask for A+B+C+X, but really we should only take A+B.
>>But wouldn't A+B also be a package that was announced for B?

> In theory, yes, but maybe it was announced earlier (while our node was down?) or had dropped from our mempool or similar, either way we don't have those txs yet.

Hm. It's fine if they have Erlay, since a sender would know in advance that B is missing and announce it as a package. A potential tack-on solution would be to request package information whenever you have a "low fee" error on a parent and "missing inputs" on a child. Or we solve it at the validation level: instead of submitting each tx individually, we submit each ancestor subset. Do you think any of these is sufficient? At least the package properly propagates across nodes which are online when it is broadcast...
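
To make the tack-on rule concrete, a small sketch. The reason labels and the function name are placeholders, not real reject strings or APIs.

```python
LOW_FEE = "low-fee"                # placeholder label for a fee-related rejection
MISSING_INPUTS = "missing-inputs"  # placeholder label for a missing-parent rejection

def children_needing_package_info(rejections, spends):
    """Return children that failed on missing inputs while a parent failed on low fee.

    rejections: dict txid -> rejection reason
    spends:     dict child txid -> set of parent txids it spends from
    """
    return sorted(
        child
        for child, reason in rejections.items()
        if reason == MISSING_INPUTS
        and any(rejections.get(parent) == LOW_FEE for parent in spends.get(child, ()))
    )

rejections = {"A": LOW_FEE, "X": MISSING_INPUTS, "Y": MISSING_INPUTS}
spends = {"X": {"A", "B"}, "Y": {"Z"}}
to_request = children_needing_package_info(rejections, spends)
print(to_request)  # ['X']
```

Y is not selected because its missing parent Z was never rejected for low fee, so package info would not help there.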

Best,
Gloria

On Wed, May 25, 2022 at 11:55 AM Anthony Towns <aj@erisian.com.au> wrote:
On 24 May 2022 5:05:35 pm GMT-04:00, Gloria Zhao via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
>To clarify, in this situation, I'm imagining something like
>A: 0 sat, 100vB
>B: 1500 sat, 100vB
>C: 0 sat, 100vB
>X: 500 sat, 100vB
>feerate floor is 3sat/vB
>
>With the algo:
>>  * is X alone above my fee rate? no, then forget it
>>  * otherwise, s := X.size, f := X.fees, R := [X]
>>  * for P = P1..Pn:
>>   * do I already have P? then skip to the next parent
>>   * s += P.size, f += P.fees, R += [P]
>>  * if f/s above my fee rate floor? if so, request all the txs in R
>
>We'd erroneously ask for A+B+C+X, but really we should only take A+B.
>But wouldn't A+B also be a package that was announced for B?

In theory, yes, but maybe it was announced earlier (while our node was down?) or had dropped from our mempool or similar, either way we don't have those txs yet.
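
For reference, a minimal Python sketch of the algorithm quoted above, run on the A/B/C/X example. The `Tx` type and function name are made up for illustration; it assumes we have none of the parents yet and treats "above" as "at or above" the floor.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    name: str
    fee: int    # satoshis
    vsize: int  # vbytes

def txs_to_request(child, parents, have, floor):
    """Return the txs to request for a child-with-parents package, or []."""
    # Is the child alone above the fee rate floor? If not, forget it.
    if child.fee < floor * child.vsize:
        return []
    size, fees, wanted = child.vsize, child.fee, [child]
    for parent in parents:
        # Skip parents we already have.
        if parent.name in have:
            continue
        size += parent.vsize
        fees += parent.fee
        wanted.append(parent)
    # Request everything if the aggregate fee rate clears the floor.
    return wanted if fees >= floor * size else []

A = Tx("A", 0, 100)
B = Tx("B", 1500, 100)
C = Tx("C", 0, 100)
X = Tx("X", 500, 100)
requested = txs_to_request(X, [A, B, C], have=set(), floor=3)
names = [tx.name for tx in requested]
print(names)  # ['X', 'A', 'B', 'C']
```

The aggregate is 2000 sat over 400 vB, i.e. 5 sat/vB, so all four get requested even though C contributes no fee.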

>Please lmk if you were imagining something different. I think I may be
>missing something.

That's what I was thinking, yes.

So the other thing is what happens if the peer announcing packages to us is dishonest?

They announce pkg X, say X has parents A B C and the fee rate is garbage. But actually X has parent D and the fee rate is excellent. Do we request the package from another peer, or every peer, to double check? Otherwise we're allowing the first peer we ask about a package to censor that tx from us?

I think the fix for that is just to provide the fee and weight when announcing the package rather than only being asked for its info? Then if one peer makes it sound like a good deal you ask for the parent txids from them, dedupe, request, and verify they were honest about the parents.

>> Is it plausible to add the graph in?

Likewise, I think you'd have to have the graph info from many nodes if you're going to make decisions based on it and don't want hostile peers to be able to trick you into ignoring txs.

Other idea: what if you encode the parent txs as a short hash of the wtxid (something like bip152 short ids? perhaps seeded per peer so collisions will be different per peer?) and include that in the inv announcement? Would that work to avoid a round trip almost all of the time, while still giving you enough info to save bw by deduping parents?
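
A sketch of what per-peer short ids could look like. BIP152 uses SipHash-2-4; keyed BLAKE2b from Python's standard library stands in for it here, and the 4-byte length, salt values and function names are assumptions for illustration only.

```python
import hashlib

def short_id(wtxid: bytes, peer_salt: bytes, nbytes: int = 4) -> bytes:
    """Truncated keyed hash of a wtxid; a different salt per peer gives different collisions."""
    return hashlib.blake2b(wtxid, key=peer_salt, digest_size=nbytes).digest()

def unknown_parents(announced_ids, known_wtxids, peer_salt):
    """Return the announced short ids that match nothing we already have."""
    known = {short_id(wtxid, peer_salt) for wtxid in known_wtxids}
    return [sid for sid in announced_ids if sid not in known]

# Stand-in wtxids for parents A, B and C.
wtxid_a, wtxid_b, wtxid_c = (hashlib.sha256(name).digest() for name in (b"A", b"B", b"C"))
salt = b"salt-for-peer-1"

announced = [short_id(w, salt) for w in (wtxid_a, wtxid_b, wtxid_c)]
# We already have A, so only B and C need to be fetched.
to_fetch = unknown_parents(announced, known_wtxids=[wtxid_a], peer_salt=salt)
```

A collision makes us wrongly skip a parent from one peer, but with per-peer salts the same collision is unlikely to repeat with the next peer.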


> For a maximum 25 transactions,
>23*24/2 = 276, seems like 36 bytes for a child-with-parents package.

If you're doing short ids that's maybe 25*4B=100B already, then the above is up to 36% overhead, I guess. Might be worth thinking more about, but maybe more interesting with ancestors than just parents.

>Also side note, since there are no size/count params, wondering if we
>should just have "version" in "sendpackages" be a bit field instead of
>sending a message for each version. 32 versions should be enough right?

Maybe but a couple of messages per connection doesn't really seem worth arguing about?
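
For comparison, a sketch of the bit field variant. It assumes version N maps to bit N of a 32-bit field with versions numbered from 0; neither detail is specified in this thread.

```python
def encode_versions(versions):
    """Pack a list of supported version numbers (0..31) into a 32-bit field."""
    bits = 0
    for version in versions:
        if not 0 <= version < 32:
            raise ValueError("version does not fit in a 32-bit field")
        bits |= 1 << version
    return bits

def common_versions(ours, theirs):
    """Return the version numbers set in both bit fields."""
    both = ours & theirs
    return [v for v in range(32) if (both >> v) & 1]

ours = encode_versions([0, 1])    # 0b011
theirs = encode_versions([1, 2])  # 0b110
shared = common_versions(ours, theirs)
print(shared)  # [1]
```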

Cheers,
aj

