From mboxrd@z Thu Jan 1 00:00:00 1970 Delivery-date: Fri, 02 May 2025 08:28:45 -0700 Received: from mail-qv1-f58.google.com ([209.85.219.58]) by mail.fairlystable.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1uAsJe-0006kg-SR for bitcoindev@gnusha.org; Fri, 02 May 2025 08:28:44 -0700 Received: by mail-qv1-f58.google.com with SMTP id 6a1803df08f44-6e91d8a7165sf39311216d6.0 for ; Fri, 02 May 2025 08:28:43 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1746199717; cv=pass; d=google.com; s=arc-20240605; b=VfwvzIJ3bNn0NlA8bGwCO4yz0z77weXNHf1n7UcE7B7zMreD5j3wcA+A1TOAWp811O ykBzHs/0i/pg1q9LTvET4YVarm5GBkh6ork58Ih0+RvvjPoWwa3F6o5vKTFHlNk+N8nG ekb2qAFKkc1R1mXPtwiJwdo+E9Xu1uIi6m/EAcHGKZBgTseNaGZXgSpGU+X6spKyyDgc 46znuY8ShXXV3rIId8ggwJMx20Qtrd8wF+r7+EdnW/kqFSCVqUe3Z/2W4oW8jC2UhLFC m+mop8ulXD3TSFA75OeNKADXBhbEOBXmlFL6AvwVS/pbhAY9nxTryHKNM8Sv4aXp+7Fe K9Pw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:sender:dkim-signature :dkim-signature; bh=43Bj1OsKD1sEGqBEwk4rYWxJLkKhWQ/mdK4oRudt0hk=; fh=xyTV+O/OB6Wv8CviWQ2futicEydi64CUC7XCUSvrOSM=; b=hEyjK+Yl8QdSt986hW7lXKNT/+hoW0pRXgG5oUa5F8YN9vBwJpFaWfWVXR3p2w8X3e XTf9K7SsT3D/b209t3hAWtGU2mCTXnzCTOc2rlM7G0xkkRxk8ZbFrZe7nMvblKujuEFs 8988is5rszMEQI3soD+6NhXPKNa4E1cX4tUrbpo/nWaWSnpP2vlWWSMPCFz3V0XH5lTG /nNi61kBY8wyWjFE4vr385V0f+Qd8K8uhCo0VubMtet/vXk67Gg8p+Ku5EhXmZo4FYd0 rPrZj7tnkkGbItPVQBTt42oLGXH8emO+tkZJPfvy5OcIGecOS3kaNkBEB3bLTdSkQbe1 ZbHw==; darn=gnusha.org ARC-Authentication-Results: i=2; gmr-mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=e11T7dTJ; spf=pass (google.com: domain of saintwenhao@gmail.com designates 2a00:1450:4864:20::536 as permitted sender) smtp.mailfrom=saintwenhao@gmail.com; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=pass header.i=@googlegroups.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlegroups.com; s=20230601; t=1746199717; x=1746804517; darn=gnusha.org; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:x-original-authentication-results :x-original-sender:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:sender:from:to:cc:subject:date:message-id :reply-to; bh=43Bj1OsKD1sEGqBEwk4rYWxJLkKhWQ/mdK4oRudt0hk=; b=cNukKI6dvnGiTvpZ9FSG95CqxbcDoC+yC/18VQeQFBz8ppEBkggLaXt6puHrJriPl7 YzhQV4ofpc7p5xmTG9gwq7d/DASE7VjUOioaCFRp+yriKWZeuG5AkKOL/AHtwy/8jAow xnEyEYhm3EXiV6bqGo6ByN5wVqznASUe5xT5tHnk/iH7Awlk8SuMxAYEdLH75VpKsqSh F1aeZFhZq9jjgVTKL0LfJUdrTIQcVOl4OyUHKV3K+ShhmWUDka69fRxwGE3j3fRHFZ2A /Hj4wVOfsOdsagm3PSSirm4R8mA9WN6s31iy4s1KtEjxZfHjwVDP1p/uUrg2BZ8PqSjL NQmg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1746199717; x=1746804517; darn=gnusha.org; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:x-original-authentication-results :x-original-sender:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:from:to:cc:subject:date:message-id:reply-to; bh=43Bj1OsKD1sEGqBEwk4rYWxJLkKhWQ/mdK4oRudt0hk=; b=LaSM7T/KD91sHUh7LDoEb503baF4/m3hXk02N5bRuKPq/32lDoVcrMuRdx/2x0V6D2 5eN6CJ9HDBdQFOBIp8zaLDQFpzD1L273nAvsrkY6cAo9rUfLcR4+4vHeXT+gewm3Ic57 1u7z4/BGGph9iA3T7g8q9W/TS/y45p42DNoXg8T42SJqex6aMSnEi12r0k+hmqVAeh91 1Tm9Xx+tz+DfZyrF8Lkb+JZsXy/iunmX+SoLAHGRl2OgHKLCGJMnkkQdpJczcnaeSkUe LuCF00cJ6n+xDgnCXW+CuzCze8R//lkcTRk7BcEo2xc5EcgepLpCZccqZ3xVz3uxA9cb njlQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746199717; x=1746804517; h=list-unsubscribe:list-subscribe:list-archive:list-help:list-post :list-id:mailing-list:precedence:x-original-authentication-results :x-original-sender:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:x-beenthere:x-gm-message-state:sender:from :to:cc:subject:date:message-id:reply-to; bh=43Bj1OsKD1sEGqBEwk4rYWxJLkKhWQ/mdK4oRudt0hk=; b=ngmuPvXIeEvbIoO0I27rHqgvYQ9JSMcCKl/IKp21/29NYiT4f5Rm5YFWUdCzzyQvCE mWCYVkCWtaOXQv+mYpEUDVgM7YpO18trNz3ru9UbFeE1XgVuVqkGpDb42G6Bc7Z56B3q 4/NliviusyITaTZZxBp6XXs2II6d3d356mBNbe1ddNmrnzTjB/kJG2xHt+solKuq8lra YIxn8plyhV4gi4xzeokD8RcGuJsrdbIus8sCkZYqyZGzSg98swg7tAgagY43FdxNGstA F3FArNZJSCPkD2NOWp6I5Pdd5TcZS/KYN03W0oU1zhCr7Y+eiDnxn17TeVOG7T5/S437 hW8w== Sender: bitcoindev@googlegroups.com X-Forwarded-Encrypted: i=2; AJvYcCUYaoMhPy7jcgtOejdzn2aSbocqeleUFu6JxaaYSaGKaH99+htIfotHsVbUoJFe60VMt6mwOzSS7Rms@gnusha.org X-Gm-Message-State: AOJu0Ywc7gG+Do2086FmhwoJ3+x4IO2PPie4xxNcCd4/43JauRN+PPjK DDpWDrUnQd3HCjZrHBij3qzAY1ty0U4hOJiuYADSV/ZgMbzVB8uB X-Google-Smtp-Source: AGHT+IFleV9SG57YLZzL5VIUr5X5EggXKgO2t0EIX2BoyuSD4N1J8h14nQ1lkDreuBydv24TnyRmKg== X-Received: by 2002:a05:6214:2508:b0:6f5:107c:7db2 with SMTP id 6a1803df08f44-6f515256d90mr58520486d6.9.1746199716757; Fri, 02 May 2025 08:28:36 -0700 (PDT) X-BeenThere: bitcoindev@googlegroups.com; h=AVT/gBHseIl1IvrqKGLefboCAae+wGBK8k8rk7kK9qk482pgHA== Received: by 2002:a05:6214:248a:b0:6e8:fa98:8af6 with SMTP id 6a1803df08f44-6f5084e1c03ls36262476d6.1.-pod-prod-03-us; Fri, 02 May 2025 08:28:33 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCWuT4ZIGbWrmsKbaDc862eh8gDAChKtQhiPOqEv2Yyp9vW7S4IKaLBCtW1Qdin9mT4IGQeVaIXQMeXm@googlegroups.com X-Received: by 2002:a05:620a:17a7:b0:7c5:3d60:7f8f with SMTP id af79cd13be357-7cad5b412efmr481996385a.18.1746199712980; Fri, 02 May 2025 08:28:32 -0700 (PDT) Received: by 2002:ab3:11b:0:b0:293:3256:5107 with SMTP id a1c4a302cd1d6-2acb761afabmsc7a; Fri, 2 May 2025 06:38:36 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCUNENbqIvSgg4kdjjEM96QR94nmSo8IQAFk2P83S3UE+uFCCSDtZRX+O7KRrZ1EqAC1xNjBAE9sPg3q@googlegroups.com X-Received: by 2002:a2e:be26:0:b0:30d:b31e:2628 with SMTP id 38308e7fff4ca-320c56fe894mr7824561fa.27.1746193114296; Fri, 02 May 2025 06:38:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1746193114; cv=none; d=google.com; s=arc-20240605; b=JO3p4eAuFF7tCt4MECrEcNMOc2mPcb/C5e4mjzIdRdRPCgnVNM994F0VR2An5cSwYL 6oBH21sABDw5Km4US9tvgS8XC2ixnco18BFMhn6i5v1rLK0WgePa3jkGKscHEExZWjpp bmMAgTDlBHZ+uVswea4dhej8Y1u64O3hjk1VttyejB2mcNZNyDXk7xvcP0m5xVjNAxLX hhU9kEVl0/fru7j0MhfS5BmBamSKZvwb8tsQFiKkxDxLtI1BrL9nZRRiOPjERxNXktCn 4EOHOTONiadPj++MSdqmhPC2PkhUXGz2GE0eBnesBXgPnLHkgbtOUjW0n1G3UyINarCR CApg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=9lA+bsG3APj4TXUfXyp1527nQ7vp+/HU7teDKiRORKk=; fh=nD2wJcta0v2E5/SvkLPWVLnGOcSJ0YFHlOmNpe4rceU=; b=gM+fnI7EFHt+vR5vt8NNwre3F2hRHS8skGxNjCi83mOY+4d834lxWDHIGHHR3cv6G+ vjq2ugStHiTD8MCkhnC37YnQ5zXBKRX4c4LX8WWL74wT1030pxESolWL7csE3uMLD6Ck RGv7zMpDQMfYRtnL5/Gq/w43pAMFQE9lPGucxfBOlCtn4V1lSyTdBto5JW3gsDhvPcVI DFTSZL29d6AflWRajLMWVMNOOr/3MWmx3YB9RHhnK9w4/cBSXkZhlU0vaVACXDyu830T 9Js4gUo4ThFF7jPMuYKzATc6nvJtr8wusr4yDSu4qUfF5NdtDETLdmRcHgUho2C8W0r/ m0Rw==; dara=google.com ARC-Authentication-Results: i=1; gmr-mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=e11T7dTJ; spf=pass (google.com: domain of saintwenhao@gmail.com designates 2a00:1450:4864:20::536 as permitted sender) smtp.mailfrom=saintwenhao@gmail.com; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=pass header.i=@googlegroups.com Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com. [2a00:1450:4864:20::536]) by gmr-mx.google.com with ESMTPS id 38308e7fff4ca-3202ab8c0ffsi511971fa.8.2025.05.02.06.38.34 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 02 May 2025 06:38:34 -0700 (PDT) Received-SPF: pass (google.com: domain of saintwenhao@gmail.com designates 2a00:1450:4864:20::536 as permitted sender) client-ip=2a00:1450:4864:20::536; Received: by mail-ed1-x536.google.com with SMTP id 4fb4d7f45d1cf-5e5e63162a0so3047076a12.3 for ; Fri, 02 May 2025 06:38:34 -0700 (PDT) X-Forwarded-Encrypted: i=1; AJvYcCWXwOGq0HVnze7T0pQ2sAXQQuN0OqdlqSEkPIteYlEDZVOmZ7P9NYMgxBBlo264dyiN9r6LKpIlcejt@googlegroups.com X-Gm-Gg: ASbGncu3yEdqbZh/IZ09t3jlZYg1HJd0KfdBzY9CX4zTCgqiQPUIPB1ht/BTM2DVx4W a4k379FDoM+GTtOHnAanMzRrfCGSN8ShMqXFmXeuGkWSlpVoBpNUPn8k5NEhbfr5qA9OA2AjouD 7ra3uPkwmdYIa1dEyNq9qpTmid+EPt3XrO X-Received: by 2002:a05:6402:5187:b0:5f8:e330:ff3d with SMTP id 4fb4d7f45d1cf-5fa780261b0mr2376064a12.11.1746193113263; Fri, 02 May 2025 06:38:33 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Saint Wenhao Date: Fri, 2 May 2025 15:38:20 +0200 X-Gm-Features: ATxdqUEFwHRjft7oSLFclS1fFZn2Fd7uVxUgBuipFlZk2_ky6_DkPvb-Dxl_LGc Message-ID: Subject: Re: [bitcoindev] Re: SwiftSync - smarter synchronization with hints To: Ruben Somsen Cc: Greg Maxwell , Bitcoin Development Mailing List Content-Type: multipart/alternative; boundary="00000000000039b29d06342744c6" X-Original-Sender: saintwenhao@gmail.com X-Original-Authentication-Results: gmr-mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=e11T7dTJ; spf=pass (google.com: domain of saintwenhao@gmail.com designates 2a00:1450:4864:20::536 as permitted sender) smtp.mailfrom=saintwenhao@gmail.com; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com; dara=pass header.i=@googlegroups.com Precedence: list Mailing-list: list bitcoindev@googlegroups.com; contact bitcoindev+owners@googlegroups.com List-ID: X-Google-Group-Id: 786775582512 List-Post: , List-Help: , List-Archive: , List-Unsubscribe: , X-Spam-Score: -0.5 (/) --00000000000039b29d06342744c6 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable > >sha256 you should place the salt first so that it can be absorbed then the midstate reused > Nice. That is a very good point that I had not yet considered. Basically, if you put salt first, and if the size of the salt will be aligned with the size of SHA-256 internal chunks, then you will reach tagged hashes (where salt is just a tag's name). > Is there a faster alternative to sha256? Even if it is, then if you introduce another hash function, which is not used anywhere else, then it will be another moving part, which means more complexity. > Can we get away with 16 byte hashes? If you want to get just that, then you can always strip SHA-256 output, and take just the first/last N bytes. > I could be mistaken, but in theory it seems we could even get away with 4 byte hashes if we allowed for a 1 in 4 billion chance of accidentally accepting an invalid chain. Grinding 2^32 values seems to be too easy (it is just network difficulty equal to one, in case of double SHA-256). Because then, even if someone will not compromise the system, by grinding a matching hash, then still: it can slow things down. And also, if you take the sum of hashes, which should be finally zero, then by grinding UTXOs, someone could make it zero, even if there is some mismatch. And just for that reason, you probably want something bigger than 32-bit, and something at least big enough to be collision and preimage resistant. pt., 2 maj 2025 o 13:01 Ruben Somsen napisa=C5=82(a): > Hi Greg, > > I appreciate your thoughts. > > > >excellent idea that preserves the assume valid-like security properties > > Thanks. Yeah, the assumevalid version is particularly efficient. Without > assumevalid there's a bandwidth tradeoff, but that also clearly seems wor= th > it. > > > >Also I like the lack of basically anything being normative, it's easier > to feel comfortable with something when you won't be stuck with it foreve= r > > Yes, that is a very pleasant property. There is essentially no difference > between the end state that you reach with or without SwiftSync. > > > >the hints themselves are not security relevant, if they're wrong you'll > just fail > > I agree. > > > >sha256 you should place the salt first so that it can be absorbed then > the midstate reused > > Nice. That is a very good point that I had not yet considered. > > > >false positives are harmless as long as they're rare, as in you can sav= e > some extra txouts and if you later find them being spent, then just go > ahead and remove them > > I don't see a practical way to do this without compromising the benefits > of SwiftSync because of the "later find them being spent" part. For one, = it > would require doing a lookup for each input to see if it's in your UTXO > set, something we currently don't need to do at all. Secondly, it would > require blocks to be processed in order, limiting parallelization. The > space savings would also be negligible, considering the hint data is > already <100MB (compressed). > > > I think most of the remaining optimization potential (other than the > engineering work to enable parallel validation) is in the hash > aggregate, like the midstate reuse. Is there a faster alternative to > sha256? Can we get away with 16 byte hashes? I could be mistaken, but in > theory it seems we could even get away with 4 byte hashes if we allowed f= or > a 1 in 4 billion chance of accidentally accepting an invalid chain. Leavi= ng > consensus to chance feels philosophically improper, but it's a pretty saf= e > bet considering it also involves PoW to trigger and even then would only > affect one or two random unlucky people on Earth. > > > Thanks for chiming in. > > Cheers, > Ruben > > On Fri, May 2, 2025 at 8:48=E2=80=AFAM Greg Maxwell = wrote: > >> This sounds like an excellent idea that preserves the assume valid-like >> security properties but with more performance and more optimization >> oppturnities. >> >> I like particularly -- if I understand it correctly-- that the hints >> themselves are not security relevant, if they're wrong you'll just fail >> rather than end up with something incorrect. Also I like the lack of >> basically anything being normative, it's easier to feel comfortable with >> something when you won't be stuck with it forever... the whole scheme co= uld >> just be reworked every version with no particular harm because its effec= ts >> are all ephemeral. At worst you might have problems if you started IBD >> with one version and tried to complete it with another. >> >> I haven't seen any code, but if hash() is a MD style hash function such >> as sha256 you should place the salt first so that it can be absorbed the= n >> the midstate reused. For sha256 you could get potentially double the >> hashing performance and I assume/hope that hash is actually fairly high = in >> the profile cost.. maybe more like a 1/3 improvement considering the siz= e >> of the entry, care should be taken to try to minimize compression functi= on >> runs. ... but at least the utxo representation there is entirely >> implementation defined. >> >> You may be able to optimize the size of the hints further with the >> observation that false positives are harmless as long as they're rare, = as >> in you can save some extra txouts and if you later find them being spent= , >> then just go ahead and remove them. So for example ribbon filters give = you >> a memory space and read efficient hash table that is constructed by solv= ing >> a linear system to make sure all the inputs match. One could solve for >> successively larger filters to target a specific false positive rate. (= I >> mean just a 2,4 cuckoo filter or similar would also work, but that's two >> memory accesses required and you don't actually need runtime modificati= on). >> >> >> >> >> On Wednesday, April 9, 2025 at 10:11:29=E2=80=AFAM UTC Ruben Somsen wrot= e: >> >>> Hi everyone, >>> >>> SwiftSync is a new validation method that allows for near-stateless, >>> fully parallelizable validation of the Bitcoin blockchain via hints abo= ut >>> which outputs remain unspent (<100MB total). All other inputs/outputs a= re >>> efficiently crossed off inside a single hash aggregate that only reache= s >>> zero if validation was successful and the hints were correct. >>> >>> The main observation is that it can be much easier to validate that a >>> given UTXO set is correct than to compute it yourself. It allows us to = no >>> longer require a stateful moment-to-moment UTXO set during IBD and allo= ws >>> everything to be order independent. I'll briefly summarize the protocol= , >>> before sharing the link to the full write-up. >>> >>> Each output gets a boolean hint (e.g. committed into Bitcoin Core) abou= t >>> whether or not it will still be in the UTXO set after validation comple= tes. >>> If it does, we write it to disk (append-only - it won't be used until >>> SwiftSync finishes). If it does not, we hash the UTXO data and add it t= o an >>> aggregate. For each input, we once again hash the UTXO data and remove = it >>> from the aggregate. >>> >>> At the end, for every added output there should have been exactly one >>> removed input, bringing the end total of the aggregate to zero. If this= is >>> indeed the case, we will have validated that the hints, and the resulti= ng >>> UTXO set, were correct. >>> >>> E.g. For spent outputs A, B and inputs C, D we calculate >>> hash(UTXO_A||salt) + hash(UTXO_B||salt) - hash(UTXO_C||salt) - >>> hash(UTXO_D||salt) =3D=3D 0 (proving (A=3D=3DC && B=3D=3DD) || (A=3D=3D= D && B=3D=3DC)). >>> >>> There is one missing step. The UTXO data is only available when >>> processing the output, but not when processing the input. We resolve th= is >>> by either downloading the outputs that were spent for each block >>> (equivalent to the undo data, maybe 10-15% of a block), or we lean on >>> assumevalid, making it sufficient to only hash the outpoints (which are >>> available in both the output and input) rather than the full UTXO data. >>> >>> Ignoring bandwidth, the expectation is that the speedup will be most >>> significant on either RAM limited devices and/or devices with many CPU >>> cores. Initial PoC benchmarks (thanks to theStack) show a 5.28x speed-u= p, >>> while currently still being largely sequential. >>> >>> Many more details are in the full write-up: >>> https://gist.github.com/RubenSomsen/a61a37d14182ccd78760e477c78133cd >>> >>> It will answer the following questions (and more): >>> >>> - How the hash aggregate can be made secure against the generalized >>> birthday problem >>> - How exactly assumevalid is utilized and what the implications are >>> - How we can still check for inflation when we skip the amounts with >>> assumevalid >>> - How we can validate transaction order while doing everything in >>> parallel >>> - How we can perform the BIP30 duplicate output check without the UTXO >>> set >>> - How this all relates to assumeutxo >>> >>> To my knowledge, every validation check involving the UTXO set is >>> covered, but I'd be curious to hear if anything was overlooked or if yo= u >>> spot any other issues. >>> >>> Thanks for reading, and thanks to everyone who provided invaluable >>> feedback while the idea was coming together. >>> >>> -- Ruben Somsen >>> >> -- >> You received this message because you are subscribed to the Google Group= s >> "Bitcoin Development Mailing List" group. >> To unsubscribe from this group and stop receiving emails from it, send a= n >> email to bitcoindev+unsubscribe@googlegroups.com. >> To view this discussion visit >> https://groups.google.com/d/msgid/bitcoindev/cc2dfa79-89f0-4170-9725-894= ea189a0e2n%40googlegroups.com >> >> . >> > -- > You received this message because you are subscribed to the Google Groups > "Bitcoin Development Mailing List" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to bitcoindev+unsubscribe@googlegroups.com. > To view this discussion visit > https://groups.google.com/d/msgid/bitcoindev/CAPv7TjaDGr4HCdQ0rR6_ma5zh2u= mU9r3_529szdswn_GjjnuCw%40mail.gmail.com > > . > --=20 You received this message because you are subscribed to the Google Groups "= Bitcoin Development Mailing List" group. To unsubscribe from this group and stop receiving emails from it, send an e= mail to bitcoindev+unsubscribe@googlegroups.com. To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/= CACgYNOLnV2JO%3Dn4UZhByLGMzEJ2vUXoaB%2BPCnLG2nzXvu8uAsg%40mail.gmail.com. --00000000000039b29d06342744c6 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
> >sha256 you should place the salt first so that it= can be absorbed then the midstate reused
> Nice. That is a very good= point that I had not yet considered.

Basically, if you put salt fir= st, and if the size of the salt will be aligned with the size of SHA-256 in= ternal chunks, then you will reach tagged hashes (where salt is just a tag&= #39;s name).

> Is there a faster alternative to sha256?

Ev= en if it is, then if you introduce another hash function, which is not used= anywhere else, then it will be another moving part, which means more compl= exity.

> Can we get away with 16 byte hashes?

If you want = to get just that, then you can always strip SHA-256 output, and take just t= he first/last N bytes.

> I could be mistaken, but in theory it se= ems we could even get away with 4 byte hashes if we allowed for a 1 in 4 bi= llion chance of accidentally accepting an invalid chain.

Grinding 2^= 32 values seems to be too easy (it is just network difficulty equal to one,= in case of double SHA-256). Because then, even if someone will not comprom= ise the system, by grinding a matching hash, then still: it can slow things= down.

And also, if you take the sum of hashes, which should be fina= lly zero, then by grinding UTXOs, someone could make it zero, even if there= is some mismatch. And just for that reason, you probably want something bi= gger than 32-bit, and something at least big enough to be collision and pre= image resistant.

=
pt., 2 maj 2025 o 13:01=C2=A0Ruben So= msen <rsomsen@gmail.com> nap= isa=C5=82(a):
Hi Greg,

I appreciate your thoughts.


>excellent idea that preserves the a= ssume valid-like security properties

Thanks. Yeah,= the assumevalid version is particularly efficient. Without assumevalid the= re's a bandwidth tradeoff, but that also clearly seems=C2=A0worth it.


>Also I like the lack of basicall= y anything being normative, it's easier to feel comfortable with someth= ing when you won't be stuck with it forever

Ye= s, that is a very pleasant property. There is essentially no difference bet= ween the end state that you reach with or without SwiftSync.

=

>the hints themselves are not security relevan= t, if they're wrong you'll just fail

I= agree.


>sha256 you should place= the salt first so that it can be absorbed then the midstate reused

Nice. That is a very good point that I had not yet consid= ered.


>false positives are harml= ess as long as they're rare,=C2=A0 as in you can save some extra txouts= and if you later find them being spent, then just go ahead and remove them=

I don't see a practical way to do this withou= t compromising the benefits of SwiftSync because of the=C2=A0"later fi= nd them being spent" part. For one, it would require doing a lookup fo= r each input to see if it's in your UTXO set, something we currently do= n't need to do at all. Secondly, it would require blocks to be processe= d in order, limiting parallelization. The space savings would also be negli= gible, considering the hint data is already <100MB (compressed).

=C2=A0
I think most of the remaining optimizati= on potential (other than the engineering work to enable parallel validation= ) is in the hash aggregate,=C2=A0like the midstate reuse. Is there a faster= alternative to sha256? Can we get away with 16 byte hashes? I could be mis= taken, but in theory it seems we could even get away with 4 byte hashes if = we allowed for a 1 in 4 billion chance of accidentally accepting an invalid= chain. Leaving consensus to chance feels philosophically improper, but it&= #39;s a pretty safe bet considering it also involves PoW to trigger and eve= n then would only affect one or two random unlucky people on=C2=A0Earth.


Thanks for chiming in.

=
Cheers,
Ruben

On Fri, May 2, 2025 at 8:48=E2=80= =AFAM Greg Maxwell <gmaxwell@gmail.com> wrote:
This sounds like an excellent idea that preserve= s the assume valid-like security properties but with more performance and m= ore optimization oppturnities.

I like particularly= -- if I understand it correctly-- that the hints themselves are not securi= ty relevant, if they're wrong you'll just fail rather than end up w= ith something incorrect.=C2=A0 Also I like the lack of basically anything b= eing normative, it's easier to feel comfortable with something when you= won't be stuck with it forever... the whole scheme could just be rewor= ked every version with no particular harm because its effects are all ephem= eral.=C2=A0 At worst you might have problems if you started IBD with one ve= rsion and tried to complete it with another.

I hav= en't seen any code,=C2=A0 but if hash() is a MD style hash function suc= h as sha256 you should place the salt first so that it can be absorbed then= the midstate reused. For sha256 you could get potentially double the hashi= ng performance and I assume/hope that hash is actually fairly high in the p= rofile cost.. maybe more like a 1/3 improvement considering the size of the= entry, care should be taken to try to minimize compression function runs. = ... but at least the utxo representation there is entirely implementation d= efined.

You may be able to optimize the size of th= e hints further with the observation that false positives are harmless as l= ong as they're rare,=C2=A0 as in you can save some extra txouts and if = you later find them being spent, then just go ahead and remove them.=C2=A0 = So for example ribbon filters give you a memory space and read efficient ha= sh table that is constructed by solving a linear system to make sure all th= e inputs match.=C2=A0 One could solve for successively larger filters to ta= rget a specific false positive rate.=C2=A0 (I mean just a 2,4 cuckoo filter= or similar would also work, but that's two memory accesses required an= d=C2=A0 you don't actually need runtime modification).




On Wednesday, April 9, 2025 at 10:11:29=E2= =80=AFAM UTC Ruben Somsen wrote:
Hi everyone,

Swi= ftSync is a new validation method that allows for near-stateless, fully par= allelizable validation of the Bitcoin blockchain via hints about which outp= uts remain unspent (<100MB total). All other inputs/outputs are efficien= tly crossed off inside a single hash aggregate that only reaches zero if va= lidation was successful and the hints were correct.

The main observa= tion is that it can be much easier to validate that a given UTXO set is cor= rect than to compute it yourself. It allows us to no longer require a state= ful moment-to-moment UTXO set during IBD and allows everything to be order = independent. I'll briefly summarize the protocol, before sharing the li= nk to the full write-up.

Each output gets a boolean hint (e.g. commi= tted into Bitcoin Core) about whether or not it will still be in the UTXO s= et after validation completes. If it does, we write it to disk (append-only= - it won't be used until SwiftSync finishes). If it does not, we hash = the UTXO data and add it to an aggregate. For each input, we once again has= h the UTXO data and remove it from the aggregate.

At the end, for ev= ery added output there should have been exactly one removed input, bringing= the end total of the aggregate to zero. If this is indeed the case, we wil= l have validated that the hints, and the resulting UTXO set, were correct.<= br>
E.g. For spent outputs A, B and inputs C, D we calculate hash(UTXO_A= ||salt) + hash(UTXO_B||salt) - hash(UTXO_C||salt) - hash(UTXO_D||salt) =3D= =3D 0 (proving (A=3D=3DC && B=3D=3DD) || (A=3D=3DD && B=3D= =3DC)).

There is one missing step. The UTXO data is only available w= hen processing the output, but not when processing the input. We resolve th= is by either downloading the outputs that were spent for each block (equiva= lent to the undo data, maybe 10-15% of a block), or we lean on assumevalid,= making it sufficient to only hash the outpoints (which are available in bo= th the output and input) rather than the full UTXO data.

Ignoring ba= ndwidth, the expectation is that the speedup will be most significant on ei= ther RAM limited devices and/or devices with many CPU cores. Initial PoC be= nchmarks (thanks to theStack) show a 5.28x speed-up, while currently still = being largely sequential.

Many more details are in the full write-up= :
https://gist.github.com/Ruben= Somsen/a61a37d14182ccd78760e477c78133cd

It will answer the follo= wing questions (and more):

- How the hash aggregate can be made secu= re against the generalized birthday problem
- How exactly assumevalid i= s utilized and what the implications are
- How we can still check= for inflation when we skip the amounts with assumevalid
- How we= can validate transaction order while doing everything in parallel
- How we can perform the BIP30 duplicate output check without the UTXO se= t
- How this all relates to assumeutxo

To my knowledge, ev= ery validation check involving the UTXO set is covered, but I'd be curi= ous to hear if anything was overlooked or if you spot any other issues.
=
Thanks for reading, and thanks to everyone who provided invaluable feed= back while the idea was coming together.

-- Ruben Somsen

--
You received this message because you are subscribed to the Google Groups &= quot;Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an e= mail to bitcoindev+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.googl= e.com/d/msgid/bitcoindev/cc2dfa79-89f0-4170-9725-894ea189a0e2n%40googlegrou= ps.com.

--
You received this message because you are subscribed to the Google Groups &= quot;Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an e= mail to bitcoindev+unsubscribe@googlegroups.com.
To view this discussion visit https://= groups.google.com/d/msgid/bitcoindev/CAPv7TjaDGr4HCdQ0rR6_ma5zh2umU9r3_529s= zdswn_GjjnuCw%40mail.gmail.com.

--
You received this message because you are subscribed to the Google Groups &= quot;Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an e= mail to bitcoind= ev+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.google.com/= d/msgid/bitcoindev/CACgYNOLnV2JO%3Dn4UZhByLGMzEJ2vUXoaB%2BPCnLG2nzXvu8uAsg%= 40mail.gmail.com.
--00000000000039b29d06342744c6--