
Stephan Livera and Pieter Wuille discuss Cluster Mempool for Bitcoin Core, its motivations, and its implications for Bitcoin users and miners. Where does the current mempool design have issues? Why is it important to maintain a transparent and reliable open mempool?
Pieter Wuille also explains the complexities of transaction clustering and how the new framework improves efficiency and helps keep bitcoin mining open.
Timestamps:
(00:00) – Intro
(01:05) – What is Cluster Mempool?
(03:05) – What is its impact on everyday Bitcoin users?
(06:21) – How does the mempool work today?
(11:52) – Current mempool heuristics and issues
(16:37) – Censorship resistance and economic demand
(22:56) – Practical implications for exchanges & miners
(26:12) – How does cluster mempool work?
(29:29) – Transaction clusters & mempool dynamics
(37:27) – Should mempools align across the node network?
(43:47) – What about other implementations of Bitcoin?
(48:17) – Other interesting areas for bitcoin development
(52:11) – Closing thoughts
Links:
- https://github.com/bitcoin/bitcoin/pull/33629
- https://bitcoinmagazine.com/print/the-core-issue-cluster-mempool-problems-are-easier-in-chunks
- https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393
- https://delvingbitcoin.org/t/mempool-incentive-compatibility/553
Stephan Livera links:
- Follow me on X: @stephanlivera
- Subscribe to the podcast
- Subscribe to Substack
Transcript:
Stephan Livera (00:00)
Hi everyone and welcome back to the Stephan Livera podcast. Today I’m being joined by Pieter Wuille. He is a long-time Bitcoin developer and researcher, and he is working at Chaincode Labs. And we’re going to be talking a bit about Cluster Mempool, which is a new project that is landing in the new version of Bitcoin Core, v31, I believe. So Pieter, welcome to the show and let’s start with why. Why were you interested in this project of Cluster Mempool?
Pieter Wuille (00:20)
correct?
Yeah, so you call it a new project. We started this just about three years ago. And the immediate motivation for this project was a talk that Suhas gave here at Chaincode Labs about some of the existing issues with the current mempool design.
The immediate motivation was this example of a state you can actually end up in today in a mempool. Obviously not likely, it would need a pathological, adversarially constructed situation, but a state where, if the mempool fills up, the first thing that gets evicted is actually the very first transaction you’d want to mine.
And this is obviously undesirable. It’s undesirable for miners, but it’s actually far worse than that when the network is sort of unable to reason about the proper profitability of transactions. Because you’d want eviction and mining, even if you’re not a mining node, to match up: you would want the things you evict to roughly correspond to the very last things a miner would want to include.
And so that was the motivation. Sure, the current design we have is broken in many ways. That doesn’t necessarily mean exploitable, but it sort of makes it hard to reason about things. And we wanted to address that, came up with a cool design, and it grew into something that ended up replacing basically everything in the mempool internally, touching on some relay policies that change along with it. And so it’s been merged now; it will be in Bitcoin Core 31, which is slated to be released next month, I think. So, pretty excited about that.
Stephan Livera (02:20)
Okay,
great. And so just to motivate this for listeners, why does this project matter for everyday Bitcoin users? Basically, what is the impact that an everyday Bitcoin user will see here?
Pieter Wuille (02:35)
Hopefully nothing. I think the motivation is more a long-term health of the network question than it is directly improving things for users. Now they might see a few changes, the most noticeable I think is how…
today, so up to the current Bitcoin Core release, we have relay policies that include ancestor limits and descendant limits. That means any transaction together with all of its unconfirmed ancestors cannot be more than 25 transactions, and similarly, a transaction together with all its unconfirmed descendants cannot be more than 25 transactions. And this means
sometimes you see these peeling chains, where someone makes a transaction, sends the change back to themselves, uses that change to do another payout, sends it back to themselves, and so on. Hopefully they would batch this and turn it into a single transaction, but if they don’t, this sort of chain is limited to 25 transactions
at any given time. It’s not that it’s impossible to peel further, but you’ll need to wait for some of them to confirm before the later ones propagate across the network. And this changes: with Cluster Mempool, there are no more ancestor or descendant limits. Instead, there are cluster limits. Think of it, to
continue with the analogy of ancestors and descendants in a family tree, as the widest possible extended family. It’s your parents and children, your grandparents and grandchildren, but also aunts and uncles, nieces and nephews, their parents, their children: anything that is related through any combination of “is parent of” or “is child of”. And you can, and sometimes do,
see fairly complicated things, where you have one user that pays a number of other users, and then they spend those coins together with another output that they got from yet another party. They all become related in a cluster. So in Cluster Mempool, we have a limit of 64 transactions in a cluster. That’s a number we picked based on performance characteristics
of the algorithms inside, similar to how the ancestor and descendant limits of 25 in the past were picked based on, well, these are the things we can still efficiently reason about without the computation time for nodes blowing up.
Yeah, the most obvious direct impact is probably that you could build these peeling chains up to 64 rather than 25. But it’s not just increasing the number, because you also have this extended family thing that gets counted along with it.
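To make the extended-family idea concrete, here is a minimal Python sketch (illustrative only; Bitcoin Core’s actual implementation is C++, and the txids and dependency graph here are invented) that groups transactions into clusters, i.e. connected components under the “spends an unconfirmed output of” relation, and checks the 64-transaction cluster limit discussed above:

```python
from collections import defaultdict

def clusters(parents):
    """Group transactions into clusters: connected components of the
    graph whose edges mean 'spends an unconfirmed output of'."""
    # Union-find over transaction ids.
    root = {}
    def find(x):
        root.setdefault(x, x)
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x
    def union(a, b):
        root[find(a)] = find(b)
    for child, ps in parents.items():
        find(child)
        for p in ps:
            union(child, p)
    out = defaultdict(set)
    for tx in list(root):
        out[find(tx)].add(tx)
    return list(out.values())

# Invented example: A pays B and C; B's and C's coins are later spent
# together by D, so all four end up in one cluster. E is unrelated.
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": []}
CLUSTER_LIMIT = 64  # the new policy limit discussed above
for c in clusters(deps):
    assert len(c) <= CLUSTER_LIMIT
```

Anything reachable through any chain of parent/child links lands in the same cluster, which is exactly the aunts-and-uncles effect described above.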
Stephan Livera (05:21)
Okay, and so can we just get an overview of how the mempool works today, and then what are the main issues that you see with it? I think you were touching on some of that around the ancestor and descendant limits, and then we can go into some other concepts to help make this understandable for us. So just an overview of the mempool today and the issues.
Pieter Wuille (05:59)
So, Bitcoin Core uses a design where every node, not just miners, maintains a set of transactions that it expects to be mined in the short to medium term, up to roughly a day’s worth of transactions with default configurations.
It maintains this as a consistent set, so we’ll never have two conflicting transactions in the mempool; it just becomes awfully hard to reason about things if you allow that.
There are many uses for this. One is obviously for miners to pick their transactions from, but maybe more importantly, it’s for nodes in the network to get an idea of what the transaction market at that point is like. It’s used for fee estimation.
It is used for deciding what replacements to make.
If you see two conflicting transactions, one will need to replace the other, or not, and you stick with the old one. How do you reason about whether this is actually an improvement? And if we can decide that this new transaction actually looks better to miners, we should assume it will make its way to those miners. And thus, for us to be able to predict their future behavior, we want it in our mempool too.
So the mempool is basically a model of what we think the block space market is like at this very instant. This is guided by a whole number of heuristic algorithms today. Before Cluster Mempool, for example, mining used something called ancestor-set-based mining, which works like this:
For every transaction, you compute the set of all its ancestors, which, as I explained, was limited to 25 in the past, and look at the average fee rate of that whole set. And the reasoning is: say I pay you, but I attach only a very low fee. And…
you want to spend those coins for some reason, and you want to do so urgently, so you take those coins and pay someone else with them. But my transaction hasn’t confirmed yet, so what you want to do is bump the fee. This is called child pays for parent: you attach a higher fee than you normally would, to sort of pay for the fee I’m missing,
because miners are not allowed to include your transaction without including mine as well. So by attaching a higher fee, you incentivize them to include both of the transactions at the same time. And this is modeled using this ancestor-set-based mining, where the
mining logic, or the block template building logic (this is long before there’s an actual miner involved), will look at the two transactions combined, because they form an ancestor set, see what the average fee rate in this whole set is, and try treating it as a package that gets mined in one go. So this is what makes child-pays-for-parent work today, and it has since 2015.
Practically, how it works: we pre-compute for every transaction in the mempool the set of all its ancestors, compute the average fee rate of each of those ancestor sets, and sort the transactions by their ancestor set fee rate. And we
pretty much eagerly include the highest-ancestor-set-fee-rate package, and then start over, until the block fills up. This is a pretty good approximation, but…
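The ancestor-set-based mining described here can be sketched in a few lines of Python. This is a toy version with invented txids, fees (sats), and sizes (vbytes), and without the real block weight cutoff; it just shows the greedy pick-the-best-ancestor-package loop:

```python
def ancestors(tx, parents, included):
    """All unconfirmed ancestors of tx (plus tx itself) not yet included."""
    stack, anc = [tx], set()
    while stack:
        t = stack.pop()
        if t in anc or t in included:
            continue
        anc.add(t)
        stack.extend(parents[t])
    return anc

def ancestor_set_mining_order(txs):
    """txs: {txid: (fee, size, [parent txids])}.
    Greedily pick the package (tx plus its remaining ancestors) with the
    highest aggregate feerate, include it, repeat: the pre-cluster-mempool
    heuristic described above, minus the block-size cutoff."""
    parents = {t: p for t, (_, _, p) in txs.items()}
    included, order = set(), []
    while len(included) < len(txs):
        def rate(t):
            anc = ancestors(t, parents, included)
            return sum(txs[a][0] for a in anc) / sum(txs[a][1] for a in anc)
        best = max((t for t in txs if t not in included), key=rate)
        pkg = ancestors(best, parents, included)
        # Within a package, parents must still come before children;
        # sorting by total ancestor count is a valid topological key.
        order += sorted(pkg, key=lambda t: len(ancestors(t, parents, set())))
        included |= pkg
    return order

# Low-fee parent P bumped by high-fee child C (child pays for parent):
# the {P, C} package at 5 sat/vB beats the standalone X at 2 sat/vB.
txs = {"P": (100, 100, []), "C": (900, 100, ["P"]), "X": (200, 100, [])}
print(ancestor_set_mining_order(txs))  # ['P', 'C', 'X']
```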
The opposite case is eviction; we would like to do the very opposite there because, again, not every node in the network is a miner, but every node has reasons to try to predict that behavior. In eviction, we do the exact opposite, which is we compute the descendant sets for every transaction, because whenever you want to evict a transaction, you must also evict all its descendants. They become invalid.
I paid you, you paid someone else. If my transaction gets evicted, you don’t get your money anymore, so your transactions need to disappear as well. So we pre-compute for every transaction the descendant set and the average fee rate of that descendant set, and when evicting, we pick the descendant set with the lowest fee rate. And this feels very much like the opposite of the mining algorithm, but it turns
out you can construct weird edge cases that involve diamond-shaped dependencies, where you have one parent and two children and something that spends those two children, that cause things to be counted double. And as a result, these two orderings, highest ancestor set first and lowest descendant set first, are not opposites of each other.
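The eviction heuristic can be sketched the same way, and the diamond shape makes the double counting visible: the grandchild appears in the descendant set of both children. Again a toy Python illustration with invented numbers, not the real implementation:

```python
def descendants(tx, children):
    """tx plus everything that (transitively) spends its outputs."""
    stack, desc = [tx], set()
    while stack:
        t = stack.pop()
        if t in desc:
            continue
        desc.add(t)
        stack.extend(children.get(t, []))
    return desc

def eviction_candidate(txs):
    """txs: {txid: (fee, size, [parent txids])}.
    Pick the tx whose descendant set (which must go with it) has the
    lowest aggregate feerate: the eviction heuristic described above."""
    children = {}
    for t, (_, _, ps) in txs.items():
        for p in ps:
            children.setdefault(p, []).append(t)
    def rate(t):
        d = descendants(t, children)
        return sum(txs[x][0] for x in d) / sum(txs[x][1] for x in d)
    worst = min(txs, key=rate)
    return worst, descendants(worst, children)

# Diamond: parent P, children A and B, and D spending both A and B.
# D shows up in the descendant set of A *and* of B, which is the
# double counting Pieter mentions: these two orderings are not
# guaranteed to be mirrors of each other.
txs = {"P": (100, 100, []), "A": (100, 100, ["P"]),
       "B": (500, 100, ["P"]), "D": (150, 100, ["A", "B"])}
worst, to_evict = eviction_candidate(txs)
print(worst, sorted(to_evict))  # A ['A', 'D']
```

Note that evicting A drags D out with it, exactly the “your transactions need to disappear as well” point above.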
But there are many more heuristics involved. For example, for RBF, we’ve had for the longest time the BIP 125 rules, which give a number of conditions: if you satisfy this and this and this, your transaction will be accepted as a replacement. And it tries to heuristically answer: does this actually make things better for miners in terms of fee income? Because if it does, they’ll want it, and
we don’t want to introduce a reason for people to submit their transactions to miners directly and make the block space market basically private. So to avoid that, we need to be able to reason about this profitability, and BIP 125 was an attempt at that, pre-Cluster-Mempool.
There are more things: fee estimation uses the mempool. There are decisions like, if you suddenly get a flood of transactions at the same time,
we don’t want to overload our peers, so we trickle them out. Which ones do we send first? Well, the transactions that we think are better. So we try to send the ones with a higher fee rate first, and so forth. So that’s sort of how it works today. I think, if I had to summarize:
Stephan Livera (12:30)
Okay.
Pieter Wuille (12:36)
the mempool is an attempt at approximating the current view of demand for the block space market, and it has a whole bunch of heuristic rules, deciding what to keep and what to evict, what to send and what to receive, that try to approximate this economic reality.
I’d say Cluster Mempool is replacing all of that with a framework that can actually reason about things properly, taking arbitrary dependencies into account, doing away with all the heuristics, and replacing them with one rule: does this make things better or not?
Stephan Livera (13:13)
Okay, and so as I’m reading you then, in simple terms: we’ve got this mempool, people put their transactions in there, but as you said, there are certain heuristics around ancestors and descendants and eviction. And
some of these heuristics aren’t necessarily always correct, because they’re heuristics. Maybe as a quote-unquote rule of thumb they mostly work, but there are times where the heuristic goes awry, and there are examples where maybe a miner would have or should have wanted to include a particular transaction and the descendants of that transaction, but because of the heuristics they would cut them out. Is that what you’re getting at?
Pieter Wuille (13:51)
Yeah, yeah, exactly. And this is undesirable, though we do have to make a caveat here, which is that ultimately only the things that people actually do matter. If nobody creates super complicated dependency graphs of transactions, there’s no real economic demand for that,
and it doesn’t matter that we treat those suboptimally, because it’s only the economic reality that matters. But the problem becomes not just that our decisions may be suboptimal; they may be inconsistent. And inconsistency can be a lot worse.
It’s not just, okay, in this weird edge case that nobody really constructs, miners make a tiny bit less money, who cares?
If we’re talking about eviction rules, maybe someone wants to make those transactions and they’re not propagating. So now you’re incentivizing market players to move towards private mempools.
Stephan Livera (14:58)
I see, you’re driving towards private mempools
again. Right.
Pieter Wuille (15:00)
Exactly.
And I think that is really the core issue here: it’s a very tenuous situation. We’re trying to make the public, peer-to-peer, transparent mempool market as
good as possible, and the hope is that if it is sufficiently good, there’s just no incentive for people to bypass it. And we do see attempts at bypassing it, right? There are, what is it, transaction acceleration services…
Stephan Livera (15:30)
MARA Slipstream and
different accelerators that a lot of the pools have had for years. It’s not a new thing.
Pieter Wuille (15:35)
Yeah,
and I think it’s worrying. I don’t think it’s necessarily a problem today; I think the total income from those services is mostly negligible, except maybe at certain times. But we could end up in a situation where
there is so much economic activity going through these private transaction rails
that the network just doesn’t have visibility into it. That ultimately makes it harder to enter the mining market, because say there are three big companies that have a private mempool you can submit things to; as a miner you really have no choice but to contract with one or most of these, because otherwise you just miss out on a substantial portion
of fee income and are unable to compete with those who do. It just makes it harder to enter, especially anonymously. We would very much like it to be possible for anyone in the world (obviously you need access to internet and electricity and hardware, but you shouldn’t need permission from anyone) to enter the mining market if you
think there’s money to be made, or if you are unhappy with the set of transactions that are being censored by others. And now we get to the core issue: censorship resistance.
Right. I would go as far as saying this is the entire reason why Bitcoin has proof of work.
The point of having proof of work is that anyone can join the mining market. Bitcoin’s decentralization and censorship resistance ultimately boil down, if things go really wrong, to a final protection: you can become your own miner. Maybe even at a loss, but hopefully
not at a big one. And so, what is it that incentivizes existing large miners to basically accept every transaction? It is the knowledge that if they start reliably censoring some transactions that have economic demand, someone will just pop up and mine them anyway. But all of this only works when
access to the stream of transactions that users create and have economic demand for is public, when you don’t need to go to a private party to get access to that economic activity. Because if you do, they become the gatekeepers. They could enforce a rule and say, well, you’re an anonymous miner, you don’t get our stream.
And so I think this is ultimately the motivation for all of this work. We want to make it so that the public network is able to reliably relay transactions that users want to create and miners want to mine out into the public so everyone has access to them. Unfortunately,
In a decentralized system, denial-of-service resistance is a very hard problem. We can’t just relay every transaction;
otherwise I could stream my video over the network, encoded into transactions that are all conflicting with one another. A network that will relay every transaction for free just doesn’t work. So, and this has for a long time been the philosophy behind the mempool design in Bitcoin Core,
under the assumption that we can reliably predict what will be mined, because miners are economic actors,
our denial-of-service protection is basically: transactions in our mempool will get confirmed, and they pay a fee. And that is the protection, they pay a fee. Even if it’s not to us as a relaying node, they’re paying the fee to someone, so there’s a finite resource they have to spend.
So we need heuristics to decide what will be relayed and what won’t be, and those boil down to predicting the profitability of those transactions. All the examples I gave about mining and eviction and deciding the order in which things go are
sometimes-faulty heuristics that try to reason about profitability. With Cluster Mempool we can just do that much more reliably and consistently.
Stephan Livera (20:18)
Okay, yeah, so just replaying some of that, the understanding I’m taking from it is: instead of using these heuristics, we’re moving to a new model, Cluster Mempool, which allows, let’s say, that node runner or that miner to more
accurately get an idea of which transactions are likely to be the most profitable. And as you said, it’s not just about optimizing the profit; it’s about making it so that mining remains an open thing that a new miner can join. And I guess, yeah.
Pieter Wuille (20:39)
Yep. Yep.
Exactly. Making it predictable, but
even predictable in the face of possibly complicated constellations of transactions.
So as a very simple example, and not an attack or anything: we talked about child pays for parent, where you can have a single child that bumps the fee of a parent because the parent paid too little. You can also have one child that bumps multiple parents at the same time, but you cannot have multiple children that together bump the same parent, because
in this ancestor set framework, each of the children has its own ancestor set, and they’re never treated as a single set. And I think this was not the original goal of Cluster Mempool, because I haven’t really talked about what it is yet, but the goal wasn’t so much improving or adding more
crazy use cases that someone could construct like children pay for parents, it was just making things consistent. But then along the way we discovered that there were actually much nicer algorithms that became available that can take these much more complicated constructions into account and still deal with them almost optimally.
Stephan Livera (22:14)
Yeah. Okay. And so as I was trying to research and understand where this will have an impact:
as you said, mining is an obvious one. Lightning and L2s, as I understand it, might see some impact. And then maybe, as an example, the exchange withdrawal case. As I understand it, some exchanges might, let’s say I’m the exchange and I’ve got 100 customers, do a big withdrawal, but then continually do either CPFP or RBF to add new outputs to that transaction going out to
Pieter Wuille (22:35)
this peeling, yeah.
Stephan Livera (22:49)
the customers. And I presume Cluster Mempool might help in this context also, because they can do that more reliably.
Pieter Wuille (22:49)
Mm-hmm. Mm-hmm.
Yeah, so there are
a number of things that I’ve seen exchanges do. They might do batching, where you have one transaction that pays out tons of people at the same time; I don’t think that will be affected too much. Or you can have these chains of change outputs. You’re talking more about continual replacement, so I don’t know if they will be affected, but,
I mean, the rules change there, because with BIP 125 it used to be the case that there were certain things that are obvious improvements, like you make a transaction that just pays a higher fee and replaces another one, that nonetheless wouldn’t be accepted,
and the other way around, where you have something that’s clearly not an improvement and yet it would be accepted. I think…
at a high level, things become simpler in the sense that you can adopt the strategy of: I’ll just bump the fee. I do a replacement, it doesn’t make its way through, I bump the fee a bit more. And if it still doesn’t work, I bump the fee more and replace it with another transaction. And this will just work: the replacement will go through as soon as
it’s actually better. In the past you could try to follow the BIP 125 rules, but as far as I know nobody actually does. In practice people already adopt this try-and-see approach: you bump the fee, it doesn’t work, you bump it some more. And so
now it’ll go through whenever you’ve actually improved the situation.
Stephan Livera (24:48)
Yeah, gotcha. Right, so
it’s kind of like in practice people were just doing sort of a trial and error approach anyway on these things. Yeah.
Pieter Wuille (24:54)
Yeah, and
I think this is, in theory, the biggest downside to the new approach: I cannot give you
a simple set of rules anymore of the form: if you satisfy these rules, if the fee rate of this is bigger than the fee rate of that, and the size of this is less than that, and so on, then I guarantee replacement. That’s no longer the case; there are no simple rules. You have to make things better, basically.
Stephan Livera (25:24)
Yeah, okay. So, I mean, we’ve set a bit of context, and I think it was useful for listeners. Let’s actually go now to: how does Cluster Mempool actually work? Can you take us through an overview of that?
Pieter Wuille (25:31)
How does it work? Yep.
Yeah. So the main observation is:
in order to reliably predict things, you need to make judgments like, are these transactions better than those? Will this one get mined before that one?
Today, with the algorithm we have, let’s talk about the eviction case. If I wanted to actually reliably figure out today what is the last transaction that would be mined? Imagine no new transactions come in; I give you a mempool of, whatever,
100,000 transactions, and you want to figure out the very last one that would be mined, because that’s the one I want to evict if I’m somehow resource constrained. Well, the only way you have today is to run the mining algorithm, not for a 1-million-vbyte block but for a 99,999-transaction block, and see what’s the one transaction that doesn’t go in. That’s the one I’d want to evict. Sadly, that’s just completely
computationally infeasible. We cannot, any time we want to evict a single transaction, run this entire block template building thing to see what goes in last.
So what is the obvious answer? As a computer scientist: well, we want to pre-compute things, we want to cache them. How about we just maintain the mempool at all times in a sorted order from good to bad? Then mining becomes take from the front, and eviction becomes drop from the back. Unfortunately, it turns out that doesn’t work.
I don’t think drawing it on a whiteboard here would help listeners, but I’ve given examples in write-ups about this where you can construct pathological mempools with a whole sequence of transactions where
every parent has two children and every child has two parents, and they’re connected as a trellis, where the optimal ordering goes from left to right: the best transactions are on the left, the worst transactions are on the right. And now a single new transaction comes in that pays a huge, huge fee and attaches on the right, and suddenly the best order becomes going from right to left.
And this is pretty unintuitive, but it’s true if you work it out. So the observation here is that a single transaction attaching to a huge cluster of transactions can basically completely overhaul how good things are within that cluster. So this notion of we can just pre-compute what the order is and make small changes to it
doesn’t work because a single thing can come in that requires us to recompute everything.
Stephan Livera (28:36)
I see. And just, I mean, out of curiosity, how does something like mempool.space do it? They kind of project out what they think the blocks will be. Is that just really computationally expensive, or just not feasible?
Pieter Wuille (28:48)
So I don’t know what they run in the background. I assume they have Bitcoin Core nodes that they pull things from, which do something. But it may well be suboptimal, I don’t know. In practice, you can make an approximation for anything and you’ll get some output; it just may not correspond to the actual best thing to do.
Stephan Livera (29:00)
Yeah.
I see.
Pieter Wuille (29:10)
And so,
from this observation: in a hypothetical adversarially constructed mempool (you won’t see this in practice; 90% of the mempool is one- or two-transaction clusters, and there’s no problem there), a single transaction can come in that requires you to recompute everything. And to be clear, when I say recompute everything, this may be tens of milliseconds or something. It’s just not something we can do for every single transaction that comes in. It’s not like I’m claiming this will take hours to compute; it’s tens or hundreds of milliseconds. But that is too much to do all the time, every time a change comes in.
And so that’s where the idea of the clustering comes in, because it turns out we don’t actually need the ordering of the entire mempool; we just need the ordering within every cluster of transactions. If I have a bunch of related transactions with dependencies, but it’s a small group, and I know how to order those, and then I have another cluster and I know how to order those, and yet another one and I know how to
order those, you can very efficiently merge them into a single view of the mempool.
And this is where the idea of Cluster Mempool comes in. If we were just able to limit how big those clusters can get, then any individual transaction that comes in will only affect the order of that one cluster. And 64 transactions is the number we ended up with as our proposal, because for such small clusters (and they’re not tiny, right? 64 transactions is significantly larger than what is
supported today),
when a new transaction comes in, you will just need to do the recomputation for that small cluster. And we don’t have to worry about a single transaction, you know…
Stephan Livera (31:23)
like upending the whole block kind of thing. Instead, you’re just reconsidering that particular cluster, is it?
Pieter Wuille (31:24)
Exactly, yeah. And in
my previous hypothetical example I wasn’t just talking about one block: the entire mempool, hundreds of blocks’ worth of transactions, could need to be upended in a hypothetical situation. And here it becomes limited. And so Cluster Mempool is
Stephan Livera (31:44)
Example, yeah.
Pieter Wuille (31:52)
switching to a policy rule that imposes a limit on how big the clusters can get, based on our intuition, belief, observation that users don’t really care about huge clusters. And then it becomes possible to pre-compute the mining order within every cluster, and
from that you can implicitly define a whole mempool ordering. And now you can decide: is this better than that? Is this an improvement? What’s the best thing to pick? What’s the last thing we should be evicting? And so forth.
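That merge step, combining per-cluster orderings into one mempool-wide view, can be sketched as a k-way merge over per-cluster chunk lists. It relies on the property that chunk feerates within one cluster’s linearization are non-increasing, so it is always safe to take the best front chunk. A simplified Python illustration with invented cluster data (not the real C++ implementation):

```python
import heapq

def merge_clusters(cluster_chunks):
    """cluster_chunks: per cluster, a list of chunks in linearization
    order, each chunk as (fee, size, [txids]). Within one cluster the
    chunk feerates are non-increasing, so a global good-to-bad order is
    a k-way merge: always take the best front chunk."""
    heap = []
    for cid, chunks in enumerate(cluster_chunks):
        if chunks:
            fee, size, _ = chunks[0]
            # Negate the feerate: heapq is a min-heap.
            heapq.heappush(heap, (-fee / size, cid, 0))
    order = []
    while heap:
        _, cid, i = heapq.heappop(heap)
        order.extend(cluster_chunks[cid][i][2])
        if i + 1 < len(cluster_chunks[cid]):
            fee, size, _ = cluster_chunks[cid][i + 1]
            heapq.heappush(heap, (-fee / size, cid, i + 1))
    return order

# Cluster 1: chunk {P1, C1} at 5 sat/vB, then {B1} at 1 sat/vB.
# Cluster 2: a single chunk {X} at 2 sat/vB, slotted in between.
c1 = [(1000, 200, ["P1", "C1"]), (100, 100, ["B1"])]
c2 = [(400, 200, ["X"])]
print(merge_clusters([c1, c2]))  # ['P1', 'C1', 'X', 'B1']
```

Mining takes from the front of this merged order and eviction drops from the back, which is exactly the cached, cheaply-updatable view described above.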
Stephan Livera (32:28)
And then the idea is that within these clusters you’ll have, I guess, is it chunks? And then you assess one chunk against another to say, the fee rate in chunk B is higher than in chunk A, so I’m going to prefer that one, et cetera.
Pieter Wuille (32:41)
Yep,
Yeah, so the chunks are sort of a generalization of CPFP groups of transactions, whenever you have some transactions paying for other transactions. And because…
Say we have a single parent transaction that pays both of us. You and me both get an output. But it’s a fairly low fee transaction and I want to accelerate it so I spend my output but I spend it with a huge fee. You spend yours but you don’t really care, you spend it with a low fee.
What you want is my bumping transaction to bump the parent, but yours doesn’t in the sense that yours is low, you don’t care. So even though this is one cluster, it will become two chunks. The first chunk will be the parent that paid both of us and my transaction that bumps it. And then the second chunk is yours that becomes dependent on it.
And it’s very related to the ordering. I don’t think I should go into the details too much here, but if I give you an ordering of transactions for a cluster, you can find what the chunks are, which things need to be combined. And actually, finding the optimal ordering is the same problem as finding the chunks, it turns out.
So I think the simplest way of describing what a chunk is: take your cluster, which is a whole group of dependent transactions, can be dozens of them. We call a “topological set” a set of transactions that includes all of its own ancestors. So,
Stephan Livera (34:04)
Okay.
Pieter Wuille (34:24)
in our example, just the parent would be a topological set. The parent together with my transaction would be one. The parent together with yours would be one, or all three; those are all topological.
And then among all the topological sets, pick the one with the highest fee rate. And that becomes your first chunk. Remove it from the cluster. And now you redo that same computation on what remains of the cluster. Try to find in what remains the highest fee rate topological sets.
I think intuitively this makes sense, because obviously you can only include topological sets; you cannot include something where a parent is missing. So among all those, pick the one with the highest fee rate. That is the chunking problem, and that is the linearization problem.
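That procedure, repeatedly taking the highest-feerate topological set, can be written down directly as a brute-force sketch. This is illustrative Python only (exponential in cluster size, nothing like the optimized algorithms that actually shipped), using the parent-plus-two-children example from a moment ago:

```python
from itertools import combinations

def linearize(txs):
    """txs: {txid: (fee, size, [parent txids])}. Brute-force chunking
    as described above: among all topological subsets of what remains
    (sets containing all of their own unchunked ancestors), take the
    one with the highest aggregate feerate as the next chunk."""
    remaining = set(txs)
    chunks = []
    while remaining:
        best, best_rate = None, -1.0
        for n in range(1, len(remaining) + 1):
            for sub in combinations(sorted(remaining), n):
                s = set(sub)
                # Topological: every still-unchunked parent must be in s.
                if any(p in remaining and p not in s
                       for t in s for p in txs[t][2]):
                    continue
                rate = sum(txs[t][0] for t in s) / sum(txs[t][1] for t in s)
                if rate > best_rate:
                    best, best_rate = s, rate
        chunks.append((best, best_rate))
        remaining -= best
    return chunks

# One parent pays two children; one child attaches a big fee (it bumps
# the parent), the other a tiny one, as in the discussion above.
txs = {"P": (100, 100, []), "Hi": (900, 100, ["P"]), "Lo": (100, 100, ["P"])}
for chunk, rate in linearize(txs):
    print(sorted(chunk), rate)
# ['Hi', 'P'] 5.0
# ['Lo'] 1.0
```

The first chunk is the parent plus the high-fee child at 5 sat/vB; the low-fee child forms its own later chunk, matching the two-chunk outcome described above.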
Stephan Livera (35:07)
Yeah.
And when you say fee rate, we’re talking not just about absolute fee, but about fee per vbyte, right? Yeah.
Pieter Wuille (35:19)
V-byte, yes.
And I’m talking about the average fee per vbyte across the whole set. So it’s the sum of all the fees of all the transactions in the set divided by the sum of all the vsizes, or weights. Actually, internally all the computation is done with weights.
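The chunking procedure described here can be sketched in code. This is an illustrative brute-force version only: the names and example numbers are invented, and Bitcoin Core's actual linearization algorithm is far more sophisticated. It enumerates every topological set of what remains of the cluster, peels off the one with the highest aggregate feerate (total fee divided by total size) as the next chunk, and repeats, using the shared-parent example from the discussion.

```python
from itertools import combinations

def topological_sets(names, parents):
    """Yield every non-empty subset that contains all of its own ancestors."""
    names = list(names)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            s = set(combo)
            # Topological: each member's parents are inside the set too.
            if all(parents.get(t, set()) <= s for t in s):
                yield s

def feerate(subset, txs):
    """Aggregate feerate: sum of fees divided by sum of sizes."""
    fee = sum(txs[t][0] for t in subset)
    size = sum(txs[t][1] for t in subset)
    return fee / size

def chunk(txs, parents):
    """Repeatedly peel off the highest-feerate topological set."""
    remaining = dict(txs)
    chunks = []
    while remaining:
        # Drop dependencies on already-chunked transactions.
        rem_parents = {t: {p for p in parents.get(t, set()) if p in remaining}
                       for t in remaining}
        best = max(topological_sets(remaining, rem_parents),
                   key=lambda s: feerate(s, remaining))
        chunks.append(sorted(best))
        for t in best:
            del remaining[t]
    return chunks

# The example from the discussion: a low-fee parent paying two children,
# one of which bumps it with a high fee, one of which spends at a low fee.
txs = {                        # name: (fee in sats, size in vbytes)
    "parent":    (200, 200),   # 1 sat/vB
    "mine_bump": (2000, 200),  # 10 sat/vB, spends parent
    "yours_low": (100, 200),   # 0.5 sat/vB, spends parent
}
parents = {"mine_bump": {"parent"}, "yours_low": {"parent"}}

print(chunk(txs, parents))
# -> [['mine_bump', 'parent'], ['yours_low']]
```

The first chunk is the parent plus the bumping child (2200 sats over 400 vB, 5.5 sat/vB together), and the low-fee child is left as its own second chunk, matching the two-chunk outcome described above.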
Stephan Livera (35:38)
Okay, yeah, I think I’m slowly getting the idea. And then I guess the idea is, yeah, so you’re expanding out the, you’re simplifying transaction group limits and taking away certain previous limitations on how many unconfirmed ancestors there can be, because now instead of 25, we’re going up to 64, but that’s at a cluster level. And then I’m…
I guess kind of thinking about L2s and this thing, I’m thinking about maybe it’s not the exact precise technical term, but I’ve heard people talking about these concepts like unfurling or unrolling, this kind of thing, like maybe in the future this would be relevant for that when people are trying to exit out of some kind of shared UTXO. Is that related to this?
Pieter Wuille (36:21)
Hmm.
I,
yes, but I’m not the expert. I’m not very familiar with these.
Stephan Livera (36:31)
Gotcha, right, but I guess just in theory, it would
be, it might ease the pathway for some of those kinds of L2s, that kind of thing. Yeah, okay. And then, let me just see. So this notion as well, like we were touching on this before. Do you believe that mempools should roughly align across the network? Is that like, that’s the rough idea?
Pieter Wuille (36:39)
Possibly, yeah.
We can’t, right? There are many reasons for, obviously no, they cannot, or we cannot guarantee it because if they would, we wouldn’t need a blockchain in the first place. Just accept whatever transactions are in the shared mempool first. The whole consensus problem is solved. We just cannot do that due to, you know, there’s no centralized party that can impose a single ordering on everything. Everybody has their own view. You might have…
conflicts that arise from even just…
double spends, like what if someone has an output, spends it in two different ways, gives one version to me, gives another to you. If one is obviously better than the other, then a replacement might be possible in one or the other direction, but what if they’re equally good? There’s no objective measure for picking one over the other. This is worsened by the fact that we need denial-of-service protection rules, because even if we could define
mathematically optimal sets, we can’t really guarantee convergence, because it might be computationally infeasible to do the work of figuring out whether something is actually better or not. And then obviously there are policy differences, which have good reasons, like different nodes might have different resource limits. So.
No, we can’t guarantee things. And this does mean that there are limitations to the quality of the reasoning we can do. I would not say the goal here is making mempools uniform. It’s really that everyone, for themselves, is now given a better framework to reason about profitability. And there are still other limits
that mean not everyone will find the same outcome. So yeah. No, I don’t think we should, you know, gratuitously introduce inconsistencies between them. But the goal is not uniformity. The goal is, you know, make the best reasoning you can locally. And…
Stephan Livera (38:46)
I see. But as I understand it, yeah.
Okay, and so as I understand and from what I’ve read from different Bitcoin core developers, there seems to be a lot of concern about things like block propagation, making sure that it’s very low latency. So I guess that’s kind of a related idea then that if you, I guess in theory it would be nice if they sort of were roughly the same so that there’s less of a problem around that, but you can’t guarantee that. Is that, okay.
Pieter Wuille (39:25)
Yes, but we can’t guarantee it. Yeah, yeah, yeah.
If you’ve seen it, there’s been a relatively recent idea by Anthony Towns to do this block template sharing, which…
which addresses some of this in a limited way, but it’s a pretty simple idea. So the idea is, like, add a protocol extension that nodes can negotiate with each other, it’s not a consensus change or anything, it’s entirely opt-in, where you can ask your peer, hey,
If you were to create a block right now, what would it contain? Or you can even go a bit further, like, what are the first two blocks you would construct with your mempool right now? And I try, if it includes transactions I haven’t seen yet, I’ll try to insert them into my mempool. Obviously, why wouldn’t I? I’ll try to validate them if I can. But even if I don’t, I hold on to them. So even if somehow you have a somewhat different policy or
There’s a double spend that results in us being unable to come to, you know, an exactly identical view. It will be the case that…
I ask you, what would you put in a block right now? You give me these transactions and, like, huh, some of these I don’t even want to validate, or they’re outside of some policy I have, but I can still hold onto them. So if it then turns out that some of those transactions are found in a block, I already have them and I get the propagation speed benefit, even if not the validation speed benefit, from having fetched them ahead of time.
And so this works as long as you have a network where, at every boundary of mempool divergence (and I shouldn’t say policy differences, because there can be good reasons for mempool divergence that are not related to policy differences), everyone is connected to someone across the border.
Like, if you end up in a world where there are two big groups that are very clustered together, you know, users in group A are very much connected mostly to other users in A, and then there’s B, who are mostly connected to each other, the template sharing will only help on the border between them.
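The template-sharing logic described above can be sketched as a toy model. Everything here is invented for illustration: the function names are hypothetical, `accepts` stands in for the node's full policy and validation checks, and the real proposal by Anthony Towns defines its own P2P messages.

```python
def process_peer_template(template, mempool, extra_pool, accepts):
    """Given the txs a peer would mine right now, keep any we lack.

    template:   {txid: raw_tx}, the peer's hypothetical next block
    mempool:    {txid: raw_tx}, our own mempool
    extra_pool: {txid: raw_tx}, side cache for txs outside our policy
    accepts:    predicate standing in for our mempool policy checks
    """
    for txid, tx in template.items():
        if txid in mempool:
            continue                 # already have it
        if accepts(tx):
            mempool[txid] = tx       # fits our policy: normal acceptance
        else:
            extra_pool[txid] = tx    # hold onto it anyway, so a block
                                     # containing it still relays fast

# Example: the peer's template has one tx we accept and one we reject.
mempool, extra = {"a": "tx_a"}, {}
peer_template = {"a": "tx_a", "b": "tx_b", "c": "tx_c"}
process_peer_template(peer_template, mempool, extra,
                      accepts=lambda tx: tx != "tx_c")
print(sorted(mempool), sorted(extra))
# -> ['a', 'b'] ['c']
```

The key point the sketch captures is the fallback path: even a transaction that fails local policy is cached, so the node gets the propagation-speed benefit if that transaction later appears in a block.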
It’s somewhat related to an older idea called weak blocks where
miners are allowed to propagate blocks that almost meet the definition of proof of work. They’re not quite there, but we can relay them, and because there’s proof of work, they could be relayed across the network everywhere and given preferential treatment. Like, there’s no DoS concern, because there’s proof of work, and proof of work is expensive; even if it’s just 10% of the real difficulty, there’s a substantial amount of proof of work.
But it’s a much more complicated idea that…
I’ve seen some objections against it too, but even ignoring that, it’s much more complicated in that it interacts with miner infrastructure in a much more low-level way than this “tell me what you would put in a block right now”. So I like this idea. It helps somewhat with propagation speed and mempool convergence, but again, it doesn’t guarantee anything.
Stephan Livera (42:52)
Okay. Yeah.
Okay, so coming back to, you were touching on this before, but as I understand, Cluster Mempool, it’s a Bitcoin Core thing. How does that work when we’re talking about, let’s say, other implementations, like, I don’t know, btcd or whoever else is out there? What does it mean if a Bitcoin Core v31 node is talking to other implementations of Bitcoin?
Pieter Wuille (43:23)
Yeah, well, or even what if they’re talking to a Bitcoin Core 30 or 29 node, right? I’ve touched on this before in that it’s about improving local reasoning. So the fact that you are connected to people who use different and probably, I don’t know, inferior.
Stephan Livera (43:30)
Right, an older core version, yeah.
Pieter Wuille (43:49)
reasoning doesn’t prevent you from making better decisions yourself.
in terms of predictability in that.
Once it comes to the point of, like, can network users start soft-relying on cluster mempool being deployed on the network to relay maybe somewhat non-trivial things, that will depend on deployment.
Stephan Livera (44:16)
But I guess you would assume that
in a few years’ time, most of the network would be running v31 or later. And then at that point, maybe that is a more fair assumption to make.
Pieter Wuille (44:27)
Yeah, right, but this is up to individual users really to assess. I mean, it does go to, if it doesn’t, then maybe we have failed in this, you know, the goal of censorship resistance through making the public market transparent enough. If there’s real demand, there has to be demand for…
some complicated constellation of transactions being relayed on the network, and it doesn’t happen, due to non-deployment of this, or maybe due to limitations of cluster mempool: we still have anti-DoS rules that result in certain pinning problems. Some are affected, but they’re not gone. So for whatever reason, if it turns out that there is economic demand for transaction constellations
that we cannot reliably relay, due to non-deployment or due to ineffectiveness or whatever, it sort of fails at its goal of keeping this market public. But that is to be seen. I like how the cluster mempool
Stephan Livera (45:36)
I see.
Pieter Wuille (45:42)
framework just makes it so much easier to do this reasoning. In the past, there’s, for example, been talk about package relay. We have some limited form of package relay now. There are questions about, can we do package RBF? But before Cluster Mempool, I don’t think…
we could really conceptualize what rules would be involved in a package RBF. Like, you have multiple dependent transactions in a mempool and you get another set of dependent transactions in, and some of those might replace some of these. It’s very hard; it’s still pretty hard even with cluster mempool, but at least there’s a framework to talk about those things. I guess that’s what I like most about it: not even
Stephan Livera (46:27)
Yeah, it gets complicated.
Pieter Wuille (46:33)
the deployed capabilities that the network gains, but the fact that we have a mental framework to assess these things.
Stephan Livera (46:43)
Okay, so yeah, I was gonna ask, I mean, while we’re here talking about mempool stuff, any other related mempool ideas? You mentioned package relay, V3 transactions; anything else that you’re interested in or focused on that you wanna discuss there?
Pieter Wuille (46:58)
Yeah, so I touched on the block template sharing. I think we can think of that as a transaction relay aspect. There’s some renewed interest in Erlay as well. These are very orthogonal improvements, right? They’re not about reasoning about how good transactions are. They’re more infrastructure-level improvements to transaction relay.
Yeah, I’m interested in those. Maybe helping out with some of those is something I’ll be focusing on in the future.
Stephan Livera (47:36)
And I guess zooming out a little bit more, like out of just, let’s say, Mempool and Cluster Mempool, what are some other areas that you’re seeing as interesting, like in Bitcoin development, protocol development, just more broadly? Is there anything else that is interesting to you?
Pieter Wuille (47:51)
Sure. Yeah, my interests shift over time. Like, I can tell you the things I’m thinking about now.
I’m considering looking into making validation more asynchronous internally, which has some bearing on, in Bitcoin Core specifically, this would be very much an implementation detail, not a protocol-level change, but it would have some impact in terms of latency where you can have…
What if I send you a slow transaction, a slow-to-validate transaction? To what extent can I slow your node down? That is today an uncomfortable reality due to how synchronously and blockingly things are done. I’d like to look into that, but yeah.
Stephan Livera (48:39)
Yeah, and so is that when you say when you mentioned validation, are we talking like at an IBD stage or just in general?
Pieter Wuille (48:45)
No,
I mean, in general, but I think the impact would mostly be on the steady state.
Stephan Livera (48:53)
Okay,
as in, like, staying at the chain tip correctly, as it’s kind of loosely understood.
Pieter Wuille (48:58)
Yeah,
and the interaction with the network. You being a node connected to many others that give you messages like how quickly can you respond to them? It isn’t about making things faster, it’s more about making them snappier, so to speak. If you can do multiple things at the same time, even if…
dumb example: you give me a very slow transaction and someone else gives me a block. Maybe I want to interrupt processing your transaction and give priority to that block. This is not something we can do today.
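The kind of cooperative interruption Pieter describes can be illustrated with a toy sketch. This is purely hypothetical; the names are invented and Bitcoin Core’s internals do not look like this. The idea is simply that validation proceeds in steps and, between steps, checks whether higher-priority work (like a new block) has arrived.

```python
import queue

def validate_interruptibly(steps, priority_work):
    """Run validation one step at a time; abort if priority work arrives."""
    completed = []
    for step in steps:
        if not priority_work.empty():   # e.g. a new block showed up
            return None                 # set the slow tx aside for now
        completed.append(step())        # do one chunk of validation work
    return completed

priority = queue.Queue()
tx_steps = [lambda: "fetch inputs", lambda: "verify scripts"]

# No block pending: validation runs to completion.
print(validate_interruptibly(tx_steps, priority))
# -> ['fetch inputs', 'verify scripts']

# A block is queued: the slow transaction yields to it.
priority.put("new block")
print(validate_interruptibly(tx_steps, priority))
# -> None
```

The contrast with today’s behavior is that, in a fully synchronous design, the slow transaction would block the node until it finished, delaying the block.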
Stephan Livera (49:30)
I see, yeah, I see, yeah, because at first when you said that I was thinking about like IBD stuff and, you know, like SwiftSync and AssumeUTXO, that kind of thing, or Utreexo, like that kind of area, but as I understand this is a bit of a different area, okay.
Pieter Wuille (49:36)
yeah, yeah, well, yes. This
is much more an implementation detail, not, yeah.
Stephan Livera (49:47)
Yeah, I guess, I mean, other kind of things I see people talking about, ⁓ things like, you know, covenants and things like that. Is any of that interesting to you at all or not really?
Pieter Wuille (49:47)
That’s something I’m thinking of.
I’m trying very hard to sort of, to an extent after SegWit, but even more after Taproot, I’m trying to stay out of all consensus change discussions.
Yeah, I just feel like I’ve had my fair share there and, you know, it’s… Yeah, I’d rather stay out of that.
Stephan Livera (50:12)
Okay, yeah, just in general, yeah.
Sure, yeah. Okay, so I guess let’s say Cluster Mempool comes in in v31, and I guess the next thing for you would be maybe looking at things like you mentioned, the block template sharing idea from Anthony Towns, or this parallel validation. Those are probably the main areas that you’d be interested in?
Pieter Wuille (50:36)
Yeah, there are things that are currently on my mind, but who knows? Maybe in a month I’m thinking about other stuff.
Stephan Livera (50:47)
Yeah. And I guess just zooming out as like at an ecosystem level, even if it’s not consensus related, what are the things do you think people, what are the things do you think the ecosystem needs work on? Like is it wallets? Is it self custody? Is it mining stuff? Is it L2 stuff? Like do you have any thoughts on those areas?
Pieter Wuille (50:48)
I…
Not really. There’s lots of good work. You’re talking about mining; I’m happy to see the evolution with Stratum V2. I hope it takes off. I haven’t had too much contribution myself there, but yeah.
Nothing really comes to mind right now.
Stephan Livera (51:23)
Okay. All right, well, I guess let’s wrap
things up then. Maybe just give an overview, kind of a quick, what’s the key insight for people to take away on Cluster Mempool? If there’s one takeaway, what should they take away from understanding Cluster Mempool?
Pieter Wuille (51:39)
Reasoning about dependent transactions is hard. We now have a framework to do that. Bitcoin Core will be able to use that reasoning in the future and hopefully that results in us keeping the block space market public for longer.
Stephan Livera (51:58)
Excellent. Well, listeners, the links will be in the show notes so you can follow Pieter’s work online and read more about Cluster Mempool. Pieter, thank you for joining me.
Pieter Wuille (52:07)
Yeah, thanks for having me. It was great.