From mboxrd@z Thu Jan 1 00:00:00 1970 Delivery-date: Thu, 01 May 2025 23:48:17 -0700 Received: from mail-yb1-f185.google.com ([209.85.219.185]) by mail.fairlystable.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1uAkBz-0004s0-Ty for bitcoindev@gnusha.org; Thu, 01 May 2025 23:48:17 -0700 Received: by mail-yb1-f185.google.com with SMTP id 3f1490d57ef6-e5740c858besf2450781276.2 for ; Thu, 01 May 2025 23:48:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlegroups.com; s=20230601; t=1746168490; x=1746773290; darn=gnusha.org; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:x-original-sender:mime-version :subject:references:in-reply-to:message-id:to:from:date:sender:from :to:cc:subject:date:message-id:reply-to; bh=vd1pM1x674wLmpfMpkQEJ8Ylvw6T98jBJRRZ+lked38=; b=P5dUpP2wEwQf+118IdPZGxR5tVkWgI2kx9QXAJYhDWkJbTdEUn3JVfyQ0hjysNSxqY Xak/+/th+9Ho7NPhSFu129lnoMIU+d67vbyeONzrobpHU8M9/zmAOaUqILngNi4vFJw4 epO+vmRSTkbXoVXuKcO5ELVZJFICH+RHBelE7Om0t2Qmh/jTiuEogaOn6iCL8D+A/E12 dLElo8pWetacxafCVXZDEpri4XY9UKd2W3oUIekpUXJ6NuqpjiLgS9YOrTmIkmaKyAHT WfyVquYN7JjXUNWRYmysEFDMZCwBFewCmVfmbD+ppUfe+lxteCZ525UKh7jtKeNCeAJx tPoQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1746168490; x=1746773290; darn=gnusha.org; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:x-original-sender:mime-version :subject:references:in-reply-to:message-id:to:from:date:from:to:cc :subject:date:message-id:reply-to; bh=vd1pM1x674wLmpfMpkQEJ8Ylvw6T98jBJRRZ+lked38=; b=O8sgW7Me6d7+ISl+Uj81OC3whHni0Xwpt38bOnrSBKRayNlK2/4yRCUhdoqfpGEggH 58Tk2NMeiFSjirnp0XNvwqHoLVfLun5VlhUosn8Eh5Fkm44UOq3hExGlCJ96QhdEOLl+ 7MgvaaHYb7DVKZHcpXW2jV5NA5a/ivfm5RXFEKJaQh0+kkZng/a4gxkNHEl5hv6eJz2X Nec4gzw5eiApY6aibZGaq5muyHo7Em02Fk8jFGROvjF2lBxmUm+OlEo8+4aw/7/ekDMf FrNryNKL93/yr/UMWClOaLmsn1R85BTPSI/d+uSHmsk+uedpOKzTVzvY6AdrAsGTADQi 8Dsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746168490; x=1746773290; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:x-original-sender:mime-version :subject:references:in-reply-to:message-id:to:from:date:x-beenthere :x-gm-message-state:sender:from:to:cc:subject:date:message-id :reply-to; bh=vd1pM1x674wLmpfMpkQEJ8Ylvw6T98jBJRRZ+lked38=; b=EF+0p2sv9x4i+PDgWfKSb4pmW430Aq3vUB4z5m4lGEwUGQZ5mv1RwTLVAbjcoCLkNO D4F3ycDQqCLgsXm82uIX7hGXXybQgXAdgcNGQaiUvCPtOtJyGSacVTZr/6BlQQAXFyYv esaU15PbON1QxtkN3q0HbxFys7LyO3Q9vfNooGQNPyCpE9yVL9kXXojflbyPfQwjuH2E xzHxSZ43D75EF+kkobU7BuMyuIgIFW8DnwuqNavHd0669AI+ifN9f6reNppdgMzUOt8i YlRN8TKhq5sM7KJC5EeVJ52smXmdu5f4TN4+JCJ8jKd35oijOIA7oTEGU8IUzQY5kEwM ktfg== Sender: bitcoindev@googlegroups.com X-Forwarded-Encrypted: i=1; AJvYcCWc4L6hGuEGz8A45BCSDQdaTYVd4lW7hJdX5CVPAGFsymYqOk/NqdSAIefSNX0e0Y9BtYUay2L43Z9V@gnusha.org X-Gm-Message-State: AOJu0Yy/BUeO4qd2VfTdxAYVHqnwTs11BF6RxnZiALifRy6zqFr1MblS Hc6Lbxi5BPqUFxxItnYpCOxkIjFKpOxKopnrvbSNSUtTv8qaTCzD X-Google-Smtp-Source: AGHT+IEKa31mmaFHkPq1T1hJQcYOvfmWZG7mgmTgQfr/EV9QvKz8hIi+9csItDgcCF58AHKxrVdi1A== X-Received: by 2002:a05:6902:4812:b0:e6d:e985:5eb1 with SMTP id 3f1490d57ef6-e75655f7e32mr2322537276.38.1746168489852; Thu, 01 May 2025 23:48:09 -0700 (PDT) X-BeenThere: bitcoindev@googlegroups.com; h=AVT/gBFFpSGVHdbkGHOAro2TNHahFIkTN/sevrsKlVDefCDVFg== Received: by 2002:a25:7bc1:0:b0:e75:60d4:3256 with SMTP id 3f1490d57ef6-e7560d43323ls165585276.0.-pod-prod-00-us; Thu, 01 May 2025 23:48:06 -0700 (PDT) X-Received: by 2002:a05:690c:6892:b0:702:5886:7804 with SMTP id 00721157ae682-708bcf646bcmr68393037b3.19.1746168486626; Thu, 01 May 2025 23:48:06 -0700 (PDT) Received: by 2002:a81:f801:0:b0:6ef:590d:3213 with SMTP id 00721157ae682-708cfb0b4cems7b3; Thu, 1 May 2025 23:47:22 -0700 (PDT) X-Received: by 2002:a05:690c:a8f:b0:6ff:2777:92b7 with SMTP id 00721157ae682-708ced7059bmr23985057b3.9.1746168441212; Thu, 01 May 2025 23:47:21 -0700 (PDT) Date: Thu, 1 May 2025 23:47:20 -0700 (PDT) From: Greg Maxwell To: Bitcoin Development Mailing List Message-Id: In-Reply-To: References: Subject: [bitcoindev] Re: SwiftSync - smarter synchronization with hints MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_32778_194082566.1746168440844" X-Original-Sender: gmaxwell@gmail.com Precedence: list Mailing-list: list bitcoindev@googlegroups.com; contact bitcoindev+owners@googlegroups.com List-ID: X-Google-Group-Id: 786775582512 List-Post: , List-Help: , List-Archive: , List-Unsubscribe: , X-Spam-Score: -0.5 (/) ------=_Part_32778_194082566.1746168440844 Content-Type: multipart/alternative; boundary="----=_Part_32779_1552355224.1746168440844" ------=_Part_32779_1552355224.1746168440844 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable This sounds like an excellent idea that preserves the assume valid-like=20 security properties but with more performance and more optimization=20 oppturnities. I like particularly -- if I understand it correctly-- that the hints=20 themselves are not security relevant, if they're wrong you'll just fail=20 rather than end up with something incorrect. Also I like the lack of=20 basically anything being normative, it's easier to feel comfortable with=20 something when you won't be stuck with it forever... the whole scheme could= =20 just be reworked every version with no particular harm because its effects= =20 are all ephemeral. At worst you might have problems if you started IBD=20 with one version and tried to complete it with another. I haven't seen any code, but if hash() is a MD style hash function such as= =20 sha256 you should place the salt first so that it can be absorbed then the= =20 midstate reused. For sha256 you could get potentially double the hashing=20 performance and I assume/hope that hash is actually fairly high in the=20 profile cost.. maybe more like a 1/3 improvement considering the size of=20 the entry, care should be taken to try to minimize compression function=20 runs. ... but at least the utxo representation there is entirely=20 implementation defined. You may be able to optimize the size of the hints further with the=20 observation that false positives are harmless as long as they're rare, as= =20 in you can save some extra txouts and if you later find them being spent,= =20 then just go ahead and remove them. So for example ribbon filters give you= =20 a memory space and read efficient hash table that is constructed by solving= =20 a linear system to make sure all the inputs match. One could solve for=20 successively larger filters to target a specific false positive rate. (I= =20 mean just a 2,4 cuckoo filter or similar would also work, but that's two=20 memory accesses required and you don't actually need runtime modification)= . On Wednesday, April 9, 2025 at 10:11:29=E2=80=AFAM UTC Ruben Somsen wrote: > Hi everyone, > > SwiftSync is a new validation method that allows for near-stateless, full= y=20 > parallelizable validation of the Bitcoin blockchain via hints about which= =20 > outputs remain unspent (<100MB total). All other inputs/outputs are=20 > efficiently crossed off inside a single hash aggregate that only reaches= =20 > zero if validation was successful and the hints were correct. > > The main observation is that it can be much easier to validate that a=20 > given UTXO set is correct than to compute it yourself. It allows us to no= =20 > longer require a stateful moment-to-moment UTXO set during IBD and allows= =20 > everything to be order independent. I'll briefly summarize the protocol,= =20 > before sharing the link to the full write-up. > > Each output gets a boolean hint (e.g. committed into Bitcoin Core) about= =20 > whether or not it will still be in the UTXO set after validation complete= s.=20 > If it does, we write it to disk (append-only - it won't be used until=20 > SwiftSync finishes). If it does not, we hash the UTXO data and add it to = an=20 > aggregate. For each input, we once again hash the UTXO data and remove it= =20 > from the aggregate. > > At the end, for every added output there should have been exactly one=20 > removed input, bringing the end total of the aggregate to zero. If this i= s=20 > indeed the case, we will have validated that the hints, and the resulting= =20 > UTXO set, were correct. > > E.g. For spent outputs A, B and inputs C, D we calculate=20 > hash(UTXO_A||salt) + hash(UTXO_B||salt) - hash(UTXO_C||salt) -=20 > hash(UTXO_D||salt) =3D=3D 0 (proving (A=3D=3DC && B=3D=3DD) || (A=3D=3DD = && B=3D=3DC)). > > There is one missing step. The UTXO data is only available when processin= g=20 > the output, but not when processing the input. We resolve this by either= =20 > downloading the outputs that were spent for each block (equivalent to the= =20 > undo data, maybe 10-15% of a block), or we lean on assumevalid, making it= =20 > sufficient to only hash the outpoints (which are available in both the=20 > output and input) rather than the full UTXO data. > > Ignoring bandwidth, the expectation is that the speedup will be most=20 > significant on either RAM limited devices and/or devices with many CPU=20 > cores. Initial PoC benchmarks (thanks to theStack) show a 5.28x speed-up,= =20 > while currently still being largely sequential. > > Many more details are in the full write-up: > https://gist.github.com/RubenSomsen/a61a37d14182ccd78760e477c78133cd > > It will answer the following questions (and more): > > - How the hash aggregate can be made secure against the generalized=20 > birthday problem > - How exactly assumevalid is utilized and what the implications are > - How we can still check for inflation when we skip the amounts with=20 > assumevalid > - How we can validate transaction order while doing everything in paralle= l > - How we can perform the BIP30 duplicate output check without the UTXO se= t > - How this all relates to assumeutxo > > To my knowledge, every validation check involving the UTXO set is covered= ,=20 > but I'd be curious to hear if anything was overlooked or if you spot any= =20 > other issues. > > Thanks for reading, and thanks to everyone who provided invaluable=20 > feedback while the idea was coming together. > > -- Ruben Somsen > --=20 You received this message because you are subscribed to the Google Groups "= Bitcoin Development Mailing List" group. To unsubscribe from this group and stop receiving emails from it, send an e= mail to bitcoindev+unsubscribe@googlegroups.com. To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/= cc2dfa79-89f0-4170-9725-894ea189a0e2n%40googlegroups.com. ------=_Part_32779_1552355224.1746168440844 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
This sounds like an excellent idea that preserves the assume valid-lik= e security properties but with more performance and more optimization opptu= rnities.

I like particularly -- if I understand = it correctly-- that the hints themselves are not security relevant, if they= 're wrong you'll just fail rather than end up with something incorrect.=C2= =A0 Also I like the lack of basically anything being normative, it's easier= to feel comfortable with something when you won't be stuck with it forever= ... the whole scheme could just be reworked every version with no particula= r harm because its effects are all ephemeral.=C2=A0 At worst you might have= problems if you started IBD with one version and tried to complete it with= another.

I haven't seen any code,=C2=A0 but if = hash() is a MD style hash function such as sha256 you should place the salt= first so that it can be absorbed then the midstate reused. For sha256 you = could get potentially double the hashing performance and I assume/hope that= hash is actually fairly high in the profile cost.. maybe more like a 1/3 i= mprovement considering the size of the entry, care should be taken to try t= o minimize compression function runs. ... but at least the utxo representat= ion there is entirely implementation defined.

Yo= u may be able to optimize the size of the hints further with the observatio= n that false positives are harmless as long as they're rare,=C2=A0 as in yo= u can save some extra txouts and if you later find them being spent, then j= ust go ahead and remove them.=C2=A0 So for example ribbon filters give you = a memory space and read efficient hash table that is constructed by solving= a linear system to make sure all the inputs match.=C2=A0 One could solve f= or successively larger filters to target a specific false positive rate.=C2= =A0 (I mean just a 2,4 cuckoo filter or similar would also work, but that's= two memory accesses required and=C2=A0 you don't actually need runtime mod= ification).




<= div class=3D"gmail_quote">
On Wednesd= ay, April 9, 2025 at 10:11:29=E2=80=AFAM UTC Ruben Somsen wrote:
=
H= i everyone,

SwiftSync is a new validation method that a= llows for near-stateless, fully parallelizable validation of the Bitcoin bl= ockchain via hints about which outputs remain unspent (<100MB total). Al= l other inputs/outputs are efficiently crossed off inside a single hash agg= regate that only reaches zero if validation was successful and the hints we= re correct.

The main observation is that it can be much easier to va= lidate that a given UTXO set is correct than to compute it yourself. It all= ows us to no longer require a stateful moment-to-moment UTXO set during IBD= and allows everything to be order independent. I'll briefly summarize = the protocol, before sharing the link to the full write-up.

Each out= put gets a boolean hint (e.g. committed into Bitcoin Core) about whether or= not it will still be in the UTXO set after validation completes. If it doe= s, we write it to disk (append-only - it won't be used until SwiftSync = finishes). If it does not, we hash the UTXO data and add it to an aggregate= . For each input, we once again hash the UTXO data and remove it from the a= ggregate.

At the end, for every added output there should have been = exactly one removed input, bringing the end total of the aggregate to zero.= If this is indeed the case, we will have validated that the hints, and the= resulting UTXO set, were correct.

E.g. For spent outputs A, B and i= nputs C, D we calculate hash(UTXO_A||salt) + hash(UTXO_B||salt) - hash(UTXO= _C||salt) - hash(UTXO_D||salt) =3D=3D 0 (proving (A=3D=3DC && B=3D= =3DD) || (A=3D=3DD && B=3D=3DC)).

There is one missing step.= The UTXO data is only available when processing the output, but not when p= rocessing the input. We resolve this by either downloading the outputs that= were spent for each block (equivalent to the undo data, maybe 10-15% of a = block), or we lean on assumevalid, making it sufficient to only hash the ou= tpoints (which are available in both the output and input) rather than the = full UTXO data.

Ignoring bandwidth, the expectation is that the spee= dup will be most significant on either RAM limited devices and/or devices w= ith many CPU cores. Initial PoC benchmarks (thanks to theStack) show a 5.28= x speed-up, while currently still being largely sequential.

Many mor= e details are in the full write-up:
https://gist.github.com/RubenSomsen/a61a37d14182ccd78760e477c78133cd<= br>
It will answer the following questions (and more):

- How the = hash aggregate can be made secure against the generalized birthday problem<= div>- How exactly assumevalid is utilized and what the implications are
- How we can still check for inflation when we skip the amounts with= assumevalid
- How we can validate transaction order while doing = everything in parallel
- How we can perform the BIP30 duplicate o= utput check without the UTXO set
- How this all relates to assume= utxo

To my knowledge, every validation check involving the UTXO set = is covered, but I'd be curious to hear if anything was overlooked or if= you spot any other issues.

Thanks for reading, and thanks to everyo= ne who provided invaluable feedback while the idea was coming together.
=
-- Ruben Somsen

--
You received this message because you are subscribed to the Google Groups &= quot;Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an e= mail to bitcoind= ev+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/bitcoind= ev/cc2dfa79-89f0-4170-9725-894ea189a0e2n%40googlegroups.com.
------=_Part_32779_1552355224.1746168440844-- ------=_Part_32778_194082566.1746168440844--