UTXOs, Spam & Bitcoin’s Integrity with Martin Habovstiak

In this conversation, Stephan Livera interviews Bitcoin developer Martin Habovstiak about his website Knotslies and the controversies surrounding data contiguity in Bitcoin transactions. They discuss the legal implications of data storage on the blockchain, the effectiveness of filtering illegal content, and various methods of spamming the Bitcoin chain.

Martin shares his insights on the technical aspects of Bitcoin transactions and the challenges of maintaining standards in the face of evolving practices. They also discuss the complexities of Bitcoin’s transaction mechanisms, particularly focusing on the implications of spam, the role of UTXOs, and the potential effects of BIP 110.

The conversation also highlights the importance of maintaining network integrity, the costs associated with spamming, and the necessity of mining in preserving Bitcoin’s resistance to government influence.

Takeaways:

🔸Martin created Knotslies to address misconceptions about Bitcoin data.

🔸The argument about data contiguity in transactions is flawed.

🔸Splitting data does not make it legal or safe.

🔸Technical understanding is crucial for discussing Bitcoin’s legal risks.

🔸Filters cannot effectively prevent illegal content on the blockchain.

🔸Spamming the Bitcoin chain can lead to larger transaction sizes.

🔸The cost of storing data on Bitcoin is significantly higher than cloud services.

🔸Different methods of spamming have varying costs and implications.

🔸The debate around standards in Bitcoin is ongoing and complex.

🔸People who are putting data in Bitcoin are doing it on purpose.

🔸The calculator simulates what an attacker would do to spam Bitcoin.

🔸Spammers will not be deterred by a 0.4% increase in costs.

🔸Lightning Network is crucial for reducing spam on the Bitcoin network.

🔸Changing Bitcoin due to government fear undermines its purpose.

Timestamps:

(00:00) – Intro

(01:10) – Why did Martin create Knotslies?

(07:46) – Controversies around data contiguity in Bitcoin transactions

(12:04) – The standard way to interpret Bitcoin data?

(20:24) – Can filtering protect node operators from illegal content?

(25:37) – Various methods of data spamming in Bitcoin

(33:01) – What is the Knotslies calculator?

(40:20) – Analyzing spam costs

(48:28) – Alternative solutions instead of BIP 110

(54:25) – Role of Bitcoin mining in resistance to government influence

Transcript:

Stephan Livera (00:01)
Hi everyone, and welcome back to the Stephan Livera Podcast. Joining me on the show today is Martin Habovstiak. Martin and I first met, I believe, around 2019 at the Lightning Conference. I know Martin is a Bitcoin developer and has some interesting views to share. Martin is the creator of the knotslies.com website, so I thought it would be interesting to elaborate on some of these views and have them explained so people can hear a different view. So Martin, welcome to the show. Give us your rationale: why did you want to create this website?

Martin (00:22)
Thank you, hello. Well, probably everyone has heard about the BIP 110 discussion, and people were making various really weird claims around it. One of those claims was that if the data in a transaction is contiguous, there could be legal issues stemming from it. I thought this was a weird argument, because if you are some sort of criminal who splits your files into different chunks, of course you wouldn’t be deemed not guilty just because you split the files. That doesn’t make any sense.

So why would splitting data help? From very early on I knew that even if you try to force the issue of splitting, by chunking the data and putting some garbage between the chunks (garbage from the point of view of the format you are trying to put into the chain, of course), there are techniques to either make it corrupt the file only a little bit, so it doesn’t matter, or even make it so that the data doesn’t get corrupted at all.

I knew that from the beginning, and at the beginning of the discussion I demonstrated an image that had red dots every 520 pixels or something like that. The dots were red because of the encoding of the length: that’s 253, and there is whatever other number for green and blue. If you know RGB (red, green, blue), red is one byte, and it matched the encoding perfectly. So if you encode an image like that, with a red dot every 520 pixels, it damages the image a little bit, but if there is illegal content, the content is still visible. You won’t hide the entire image; there is just a bunch of dots on top of it. And I don’t think you would make an image legal just by putting dots on it, of course.
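To make the "253" concrete, here is a minimal Python sketch. It assumes (the conversation does not say so explicitly) that the length in question is Bitcoin’s variable-length integer prefix, often called CompactSize, in front of a 520-byte item. Under that assumption the three prefix bytes can be read as one RGB pixel whose red channel is 253.

```python
def compact_size(n: int) -> bytes:
    """Encode a length in Bitcoin's variable-length integer (CompactSize) format."""
    if n < 0xFD:
        return n.to_bytes(1, "little")
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


# Assumption: a 520-byte chunk is preceded by its CompactSize length.
prefix = compact_size(520)

# Reading the three prefix bytes as one pixel gives a mostly red dot.
red, green, blue = prefix
print(prefix.hex(), (red, green, blue))
```

An uncompressed image format that stores pixels as consecutive R, G, B bytes would show that prefix as a single reddish pixel between chunks, which is the effect described above.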

And actually I discussed this with a lawyer and he confirmed that this is basically bullshit, that even if you damage the image, it’s still illegal. So that was my argument, but I only posted the image and described what I did.

Then someone objected: yeah, you can do it because you are technical, but others can’t. He didn’t know that I used an LLM that spat it out correctly on the first try. Anyway, I thought the argument was debunked and I moved on, but months later I still saw people making this claim about contiguous data, and I was like, what? This was discussed to death on the mailing list. It was debunked, I didn’t hear any convincing counterargument, and people are still saying this. I got more and more fed up with it. And I could see the problem. The first issue is that most people are not actually reading the mailing list. The second issue is that even if they are, they might assume that this is all theoretical or very vague. It’s hard to convince someone of something that works in theory if they don’t understand the theory.

So despite my real distaste for spam (I really didn’t want to put data on the chain), I thought that doing just one transaction, just once, to prove the point might be reasonable, while making sure that it’s not super trivial to repeat, so not publishing the code, of course. I wrote code to assemble the transaction for me, because it would be pretty tricky to do by hand. Then I posted it to the main chain, but just before that I thought it would be nice to put some clear text into the transaction describing the project, especially because I actually had to put some garbage data into it. That’s the funny thing.

What I did is make the entire transaction into an image, which is quite different from just putting an image somewhere in, let’s say, the witness data. If you do these tricks, trying to make the entire transaction into an image, it imposes some structure on it, and specifically a minimum size of something like 65-something kilobytes. That’s why the transaction is around 66-something kilobytes; there are also some other reasons why the transaction gets even bigger. But it had to have this size. It’s not like it had this size because of the image encoding or something. It had to have this size.

Stephan Livera (06:29)
Yeah, gotcha. So just walking back a little bit, the point of this is that you crafted a transaction, let’s say manually or using an LLM, that creates a 66 kilobyte image, a .tif file, that is now in the chain. Now, of course, there are different counterarguments here. On one hand, I think some were saying, well, hang on, this transaction would not be BIP 110 compliant. The other big one would be that you still had to go out of band to a miner to do this, although I presume that if you had used Libre Relay or that TX pigeon, you might have been able to do it through that also. As I understand it, you first made a version that works under the current rules of Bitcoin Core, or just in general, and then you made a BIP 110 compliant version. So can you explain a bit about those different versions?

Martin (07:26)
Yeah, so basically the current version was made under the current rules just to prove a point: despite the claims that a transaction contains some non-contiguous chunks, so you cannot make the data contiguous, I made even the entire transaction contiguous. So it’s not true that just by splitting the data apart you can force spammers to not put contiguous data, because there are file formats that can skip over these chunks, even to the extreme where the transaction itself can be an image. But of course, if any of you ever looked up how various file formats work, they usually, or probably always, start with some specific bytes. TIFF files, let’s say, start with 49 49 in hex, which is the capital letter I twice. Then there is 42, 00, and then there is some offset into the transaction.

And if you think about it, a Bitcoin transaction also starts with a version number, but the current standardness rules dictate that the version number is standard only for 0, 1 and I think also 2, if I remember correctly; I’m not sure about it. So that doesn’t match up. Therefore, if you want to make the entire transaction an image, you have to use some non-standard relay.

And even Libre Relay still enforces the versioning thing. Which I think is okay, because even the idea of using a weird version number is complete insanity, to be honest. So it was mostly to make it more fun. I could still make it contiguous and just start at some offset, but I thought it would be more fun to make the entire transaction an image, because otherwise people would need to make more effort to figure out where I put the data inside the transaction and so on. But if I make the entire transaction contiguous, and people have been told that the entire transaction supposedly can’t be, then the discrepancy is very obvious.
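The clash between the TIFF header and the transaction version field can be illustrated with a short Python sketch. The eight header bytes below are a made-up example (a little-endian TIFF magic followed by a hypothetical offset of 8), not bytes taken from Martin’s transaction.

```python
import struct

# "II" (0x49 0x49) marks little-endian TIFF; it is followed by the number 42
# stored as a 16-bit little-endian integer (0x2A 0x00).
TIFF_LE_MAGIC = bytes.fromhex("49492a00")


def looks_like_tiff(data: bytes) -> bool:
    """Return True if the data starts with the little-endian TIFF magic."""
    return data[:4] == TIFF_LE_MAGIC


def tx_version(data: bytes) -> int:
    """Interpret the first four bytes as a little-endian transaction version."""
    return struct.unpack("<I", data[:4])[0]


# Hypothetical start of a file that is meant to be both: TIFF magic plus a
# 4-byte offset to the first image directory (the value 8 is illustrative).
header = TIFF_LE_MAGIC + struct.pack("<I", 8)

print(looks_like_tiff(header))  # the image reader is satisfied
print(tx_version(header))       # the same bytes read as a version number
```

Read as a transaction, those four magic bytes become a very large version number rather than a small one like 1 or 2, which is why a transaction that is an image from its very first byte cannot use the usual version values.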

Stephan Livera (10:04)
I see, yeah. So just trying to reflect what I have heard and seen from the Knots camp. I won’t be able to reflect this perfectly, but from what I’ve seen, people like Mechanic have really made it about this contiguous argument. People have been kind of trolling him as well. I think he was at a conference, on a panel on stage, saying something like grandma is going to chop off the first hundred bytes, or that there’s going to be a virus somehow and that’s going to be the vector for delivering malware, that, you know, Ethel is going to be downloading the chain and searching for this particular byte, which seems a bit odd.

But I will say that, also in the Knots camp, Luke Dashjr seems to have a slightly different argument. His argument seems to be more like: given the way data is encoded, you could say almost anything is this illegal data, but what makes it different, in his view, is that it’s the so-called standard way to interpret the data. So for some reason he seems to say that Core v30 is especially endorsing this kind of non-financial data, that it’s the standard way to interpret it. What do you make of these different arguments around the exact reason it’s going to be wrong or illegal, or raise the risk of the state coming after Bitcoin node operators?

Martin (11:27)
Maybe the first question people should ask themselves is: what makes things standard? Why is OP_RETURN supposed to be standard and inscriptions are not? Because if people start using them massively, how is that different from standard? So I need to… That’s the point.

Stephan Livera (11:45)
Yeah, I mean, you’re telling me. And the other thing is that even pre-v30, up to 400 kilobytes is standard. So it doesn’t seem consistent to me.

Martin (11:53)
Yeah, I mean, standard in the standardness rules. What he probably meant is a standard way of interpreting bytes. He made the point that nobody would think of saving a transaction as a TIFF file and trying to open it in an image viewer. That sounds completely crazy, which is true.

Stephan Livera (12:06)
Yeah.

Martin (12:22)
So far. But here’s the thing: at some point it could become not crazy. Some shitcoiners are already trying to rewrite the code that I made; they want to write their own version and try to spin off some NFT tokens out of this. And in that case, if there are, I don’t know, a few thousand or tens of thousands of transactions made this way, then who’s to say that it isn’t standard? It’s just arbitrary definitions. On the flip side, he says that OP_RETURN is a standard way, but that standard doesn’t even have a defined way of distinguishing what kind of file format is there. So you have to do some guessing. Ironically, with this transaction being both a transaction and an image, if you try to guess what kind of file it is just from the data itself, you get ambiguous results, because you cannot tell if this is a Bitcoin transaction or an image. If you have no context, you cannot tell.

Stephan Livera (13:35)
And to be fair, even on your website, knotslies.com, you mention that you can verify it yourself: here’s the transaction ID, and then you have this command, xxd -r -p, writing into a .tiff file, right? Because there are different image formats: there’s TIFF, JPEG, PNG, WebP, and many others. So I guess the point is that this xxd command is reversing the hex encoding, right? So that’s a transformation step, isn’t it?

Martin (14:06)
Yeah, but the hex encoding is really an internal detail. Basically, the only reason there is hex encoding to begin with is that the Bitcoin daemon uses something called JSON-RPC, which has to encode all data as text. That is super inefficient, but it is what it is. You cannot easily encode arbitrary data as text: you could have weird characters that would end up as, I don’t know, emojis, or it could just break your terminal because there would be control characters, like carriage return or something crazy. So because of this, it has to encode it to hex.

So this encoding is temporary. It’s just for the transfer of data. The hex encoding is not stored on disk, because it would be twice as large, which would be insane. And it’s not hex encoded at the P2P network level; the P2P network uses a true binary encoding. It’s just because of this JSON-RPC interface that it has to be temporarily encoded when you are pulling it out. But you can check, for instance, that the mempool.space API provides an endpoint that will give you just the unencoded raw data, and then you don’t have to do any decoding. Other APIs are possible too. You could conceivably retrieve it via the peer-to-peer protocol: connect to Bitcoin as if you were a peer-to-peer node and ask for the transaction that way, and then you would get it not hex encoded.

So the hex encoding doesn’t have anything to do with it. Not to mention that even the scripts themselves are hex encoded. So if someone tried to use this as an argument, that you have to hex decode, guess what? You have to hex decode OP_RETURN as well, or basically anything in scripts. You either have to hex decode if you want to hand parse them, or you can use the decodescript command, which is still another command that you have to run. And even that command will not hex decode the individual fields. It will only tell you the instructions, and if there is a push instruction that contains data, it would again give you hex encoded data. So this cannot be an argument.
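The point that hex is only a transport encoding can be shown in a few lines of Python. The bytes below are a stand-in for raw transaction data (hypothetical, not a real transaction); `bytes.fromhex` does the same job as `xxd -r -p`.

```python
def to_text_transport(raw: bytes) -> str:
    """Hex-encode binary data so it can travel inside a text format such as JSON."""
    return raw.hex()


def from_text_transport(hex_text: str) -> bytes:
    """Undo the hex encoding and recover the original bytes."""
    return bytes.fromhex(hex_text)


# Stand-in for raw transaction bytes, including control characters such as
# carriage return (0x0D) that would be awkward to print or embed as text.
raw = bytes(range(256))

hex_text = to_text_transport(raw)
print(len(raw), len(hex_text))             # hex text is twice as long
print(from_text_transport(hex_text) == raw)  # round trip loses nothing
```

The round trip is lossless and the hex form is exactly twice the size of the raw bytes, which matches the remark that storing hex on disk would double the size.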

Stephan Livera (16:41)
Okay, now a Knots or BIP 110 proponent could come back to you at this point and say, well, the point is to make it harder. The fact that you had to go out of band, that you couldn’t just send this out on the standard Bitcoin relay network, that’s the point. From their perspective, the fact that you had to go out of band, directly to a mining pool, to get them to put this in is actually part of the reasoning. What would you say to that?

Martin (17:10)
Well, that’s the issue. I made the transaction more funny, but maybe I made it more confusing in the process, because it’s because of the version number. I could just get rid of this. Actually, the first iteration of this idea, the first program that I coded, took just one day to write. Although, to be completely fair, it took a bit of time to think of a method to write it: should I assemble it as a transaction first and an image second, or the other way around, by modifying an image encoder? Figuring that out took a bit, but once I realized that just taking an existing image encoder and modifying it to my needs would be easier, it took basically one day of coding. And this was an image that started at some offset; it wasn’t at the beginning of the transaction. So I had to chop off the first bytes and then save it as a .tif file, and it worked.

This first iteration didn’t follow all the standardness rules, because I didn’t know all of them by heart, but it would be a matter of a short while to just look up all the magic constants and adjust them appropriately. And it would be really, like, you are not…

Stephan Livera (18:51)
So I guess what you’re saying there, just so I’m understanding you, is that theoretically you could grind it to a point where you found a way to make it standard while still showing as a contiguous image, if you had put in more effort to grind it out. Right?

Martin (19:05)
Yes, yes, it would just not be the entire transaction. That’s the difference.

Stephan Livera (19:12)
Right, you would still be in that same context of cutting off the first however many bytes, because you’re using a standard version number, and then you could have stuffed it into a transaction that way. And it would still be contiguous.

Martin (19:16)
Yeah, yeah, this was the first version of my code. Maybe it’s a shame that I didn’t think of saving it. I started immediately working on the next one. I could have made another transaction just to prove this point, but whatever.

Stephan Livera (19:41)
Yeah, okay.

So I guess the broad point you’re trying to get at, as I read you, is about filtering. That’s the question: can filtering protect node operators from illegal content?

Martin (20:02)
There is no way it can do that. There will always be many ways of doing it. It’s funny, because the first programming project I ever made that had anything to do with Bitcoin was specifically to steganographically put messages into the Bitcoin blockchain as a series of valid addresses. They are not even fake: they are completely valid addresses that you have private keys to, so you can spend from them. So it’s absolutely indistinguishable, and without the knowledge of where to find it, nobody can find it. It was a toy project; I didn’t actually put anything into the chain. It was just the idea of generating a sequence of addresses that encodes some secret message, so that if you put the sequence back into the decoder and enter the password, it will spit out the message.

Stephan Livera (20:59)
I see, yeah. So I guess this should also answer the question people will be thinking: are you trying to spam the chain, or make it easy for people to spam the chain? What I’m reading you as saying is that if you try to filter it this way, you’re just going to create worse problems somewhere else.

Martin (21:20)
Yeah, exactly. And that’s the issue. Every single technique used to bypass these kinds of restrictions ends up being worse for the network, actually. If someone is rich and bored and wants to buy a monkey picture, they don’t care if it costs 20 bucks or 30 or 50 or 5, whatever. They will pay it if they are rich, and if they aren’t rich, they will just buy cloud storage. Recently I calculated that the transaction I put into the Bitcoin chain cost me 9 million times more than the monthly subscription to a cloud service I pay for. It’s something different, but anyway it’s close to…

Stephan Livera (22:12)
like a Dropbox or a Google Drive kind of thing.

Bit similar to that, yeah.

Martin (22:19)
Three-something euros for 100 gigabytes per month. Basically…

Stephan Livera (22:25)
Yeah. Sorry, you mean more like a VPS service, not like a Dropbox service. Is that what you mean?

Martin (22:30)
No, it’s something like Google Drive, but just encrypted.

Stephan Livera (22:36)
Okay, gotcha. Yeah. So like Proton Drive, as an example, yeah.

Martin (22:38)
That’s the thing, maybe Google Drive has even cheaper prices. I’m not sure; I never looked up their price. I was just interested in comparing it to the price of something I actually pay for and know the price of. The factor was around 9 million times the monthly subscription, so that would pay for I don’t know how many thousands of years of storage. If my goal was just to store data, nothing to do with Bitcoin, I wouldn’t use Bitcoin, because it’s 9 million times more. Well, maybe not 9 million, because that’s a monthly subscription. But even if you say, okay, I will live for 100 years, so 1,200 months, it’s still thousands of times more.

Even if you make the assumption that I will live for 100 years and I want to store it for 100 years, it’s much, much more expensive. So it doesn’t make sense. Therefore, whoever is putting stuff in Bitcoin is putting it there because they want to put stuff into Bitcoin, not because they want to store data somewhere. And therefore making it harder only overloads the network itself.

Because of these restrictions, for instance, if you compare the two transactions, the first one and the second one, the second one is much larger. Well, much: I think it’s about 10 kilobytes larger, which is pretty significant, I think. And it ends up being about 10 kilobytes larger just because I had to bypass the restrictions, which forced me to put more inputs there.

With this method of making the entire transaction into a valid image, the more inputs you have, the bigger it has to be, and not just because of the inputs: some data shifts further and further away, and you have to add more garbage in between. So basically you can put more garbage or more text there, or whatever you like. That’s the irony: the filters made it much bigger. And in some cases this can be even extremely bad for the network.
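The subscription comparison above is simple arithmetic, worked through here in Python. The only input is the "9 million times one month" factor quoted in the conversation; everything else follows from it.

```python
# Factor quoted in the conversation: the on-chain transaction cost about
# 9 million times one month of the cloud subscription.
COST_RATIO_VS_ONE_MONTH = 9_000_000


def ratio_over_months(months: int) -> float:
    """How many times more the transaction cost than `months` of subscription."""
    return COST_RATIO_VS_ONE_MONTH / months


months_in_100_years = 100 * 12
ratio_100_years = ratio_over_months(months_in_100_years)

# Number of years of subscription the same money would have bought.
equivalent_years = COST_RATIO_VS_ONE_MONTH / 12

print(months_in_100_years, ratio_100_years, equivalent_years)
```

So even against a full century of cloud storage, the on-chain transaction is still thousands of times more expensive, which is the basis for the claim that nobody chooses Bitcoin for cheap storage.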

Stephan Livera (25:01)
Yeah, okay. So let’s get to this. Now again, I want to be clear: I have never spammed the chain. I do not earn money out of this. I’m not invested in any company that is spamming or shitcoining. But just so people understand, there are different places where people can spam into the chain, right? There are fake pubkeys, there’s witness spamming, and then there’s OP_RETURN. So can you talk us through some of those different methods, so people understand the trade-offs and the costs of doing this and can at least be aware of the issue?

Martin (25:36)
Yeah, so the first one, maybe the most popular or most well known, is OP_RETURN. That one is actually interesting, because it’s not really financially viable to put large chunks of data into OP_RETURN: at some size it starts to make more sense to put the data into the witness. The reason there is a cutoff is that if you want to put stuff into witness data, you must make another transaction, so you have more overhead from creating a second transaction. If your data is large enough that the savings cover the cost of making another transaction, then it works out. Someone even calculated the exact value, but I forgot what it was. I think it was like

Stephan Livera (26:21)
Gotcha. Yeah, I think it’s like 160 bytes, or, I don’t know, 140 bytes or something like this. It’s in that range. So let me just try and explain it in simple terms for the maybe less technical listeners. In Bitcoin you have inputs and outputs, and OP_RETURN is an output, right? But the interesting thing about witness spamming is that it’s actually on the input side of the transaction. And another piece of context that people should know is that

Martin (26:28)
Yeah, something like that.

Stephan Livera (26:55)
in inscriptions, and again, not endorsing this, there’s this thing called the Taproot inscription envelope. It’s like OP_FALSE, OP_IF, then they stuff the data there, and then I think it’s an OP_ENDIF or something like that. As you mentioned, it’s a two-transaction scheme; I think it’s called a commit-reveal scheme. The idea is that the spammer first does the commit stage, and only on the reveal does he show that the thing that is valid to spend this is this massive thing. That’s where they’re stuffing in the data, so it’s a two-stage process.

But the thing is, that is one of the cheapest ways to spam the chain, right? So again, not endorsing, but objectively speaking, above a certain size, which is probably that 140 bytes or whatever it is, basically anyone who wants to spam images into the chain is generally going to find it cheaper to use witness spamming. I think it’s important that people understand this, because of the SegWit discount: in the witness they pay 25% of the cost of putting something in OP_RETURN, where you pay full fare.
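The break-even reasoning above can be sketched with SegWit weight units, where a non-witness byte counts as 4 weight units and a witness byte counts as 1. The fixed overhead of the extra transaction is a placeholder assumption here (440 weight units, chosen only so the result lands in the 140 to 160 byte range mentioned in the conversation); it is not a measurement, and the sketch ignores push opcodes and the envelope itself.

```python
NON_WITNESS_WU_PER_BYTE = 4  # bytes in outputs such as OP_RETURN
WITNESS_WU_PER_BYTE = 1      # bytes in witness data (the SegWit discount)


def op_return_weight(data_bytes: int) -> int:
    """Weight of the payload when carried in an OP_RETURN output."""
    return NON_WITNESS_WU_PER_BYTE * data_bytes


def witness_weight(data_bytes: int, extra_tx_overhead_wu: int) -> int:
    """Weight of the payload in a witness, plus the cost of a second transaction."""
    return WITNESS_WU_PER_BYTE * data_bytes + extra_tx_overhead_wu


def break_even_bytes(extra_tx_overhead_wu: int) -> int:
    """Smallest payload for which the witness route is strictly cheaper."""
    # 4n > n + overhead  <=>  n > overhead / 3
    return extra_tx_overhead_wu // 3 + 1


ASSUMED_OVERHEAD_WU = 440  # hypothetical overhead of the extra transaction

n = break_even_bytes(ASSUMED_OVERHEAD_WU)
print(n, op_return_weight(n), witness_weight(n, ASSUMED_OVERHEAD_WU))
```

Below the break-even size the second transaction costs more than the discount saves; above it, every additional byte is four times cheaper in the witness, which is why large payloads gravitate there.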

Martin (27:58)
Yep. But there is still a weird edge case: what if you want to put in data that is, let’s say, bigger than 80 bytes and less than whatever the threshold was, let’s say 160? Like what Citrea is doing. If you need to do that, then the cheapest way of doing it is to add another output with a fake public key, and that is extremely harmful to the network. OP_RETURNs can be proven to be unspendable simply by looking at the first instruction: if the first instruction is OP_RETURN, you can be 100% certain that this will never be spent, and you don’t even need to

Stephan Livera (28:33)
It’s like pub key style.

Martin (28:56)
put it into the UTXO set, because this UTXO is as if it didn’t exist. The way Bitcoin works internally, spent UTXOs and non-existent UTXOs are the same thing: once a UTXO is spent, it is deleted from the database as if it never existed.

Basically the same thing happens with OP_RETURN. It even skips the storing. People talk about OP_RETURN being pruned, but it’s not even stored in the UTXO set in the first place. It is still stored on disk if you don’t have pruning enabled, or at least for a while; if you have pruning enabled, it will get deleted after a few blocks. That’s the important part.

If the data size is in between, then this is super harmful for the network, because the fake public key gets stored in the UTXO set, and it can never be pruned from the node, ever. If someone did prune it, that could cause a huge chain split or network split. People would see different versions of transactions, and this would be completely catastrophic and probably kill Bitcoin, basically. So there is no way this will ever get deleted.

Therefore it is better, if people are going to spam Bitcoin… And here’s the thing again, what I said before: people who are putting data in Bitcoin are doing it on purpose, to have it on Bitcoin specifically, not because they want to store data somewhere, wherever, and Bitcoin is the cheapest option. It’s the opposite. They own bitcoin and they will pay whatever they have to in order to get in. So they will do it. And if they will do it regardless, then better let them do it in the OP_RETURN output and not in fake public keys.
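The difference between an OP_RETURN output and a fake-public-key output can be sketched with a toy UTXO set in Python. This is an illustrative model, not Bitcoin Core’s implementation: `ToyUtxoSet`, the transaction ID `"tx1"` and the 20 filler bytes standing in for a fake key hash are all hypothetical.

```python
OP_RETURN = 0x6A  # opcode that makes an output provably unspendable


def is_provably_unspendable(script_pubkey: bytes) -> bool:
    """An output whose script starts with OP_RETURN can never be spent."""
    return len(script_pubkey) > 0 and script_pubkey[0] == OP_RETURN


class ToyUtxoSet:
    """Minimal model: a map from (txid, output index) to the locking script."""

    def __init__(self) -> None:
        self.entries: dict[tuple[str, int], bytes] = {}

    def add_outputs(self, txid: str, script_pubkeys: list[bytes]) -> None:
        for index, script in enumerate(script_pubkeys):
            if is_provably_unspendable(script):
                continue  # never stored, so there is nothing to prune later
            self.entries[(txid, index)] = script

    def spend(self, txid: str, index: int) -> None:
        # A spent output is deleted as if it had never existed.
        del self.entries[(txid, index)]


# Output 0 carries data via OP_RETURN; output 1 looks like an ordinary
# pay-to-witness-key-hash script but its "hash" is just 20 bytes of data.
data_in_op_return = bytes([OP_RETURN, 5]) + b"hello"
data_in_fake_key = b"\x00\x14" + b"D" * 20

utxos = ToyUtxoSet()
utxos.add_outputs("tx1", [data_in_op_return, data_in_fake_key])
print(sorted(utxos.entries))
```

The node cannot tell that the second output’s key hash is fake, so it has to keep that entry forever in case someone spends it, while the OP_RETURN output never enters the set at all.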

Stephan Livera (31:03)
I see. Yeah, and as you mentioned, there are even more harmful ways of spamming the chain; Stamps is a well-known one. There are guys out there who are working on some of these things already, so they’re not just theoretical; they do exist. And I think the Citrea thing is a very specific example, because it’s only this kind of edge case, whereas a lot of the spam is text and it’s already under 83 bytes anyway. Then there are the images, which are fewer by count but more by bytes: there have been literally millions of images spammed into the chain using the inscription envelope.

And that’s the challenge of explaining some of these things. People might look at an OP_RETURN chart showing how many large OP_RETURNs there have been, but the spammers who want to put images in are not spamming in OP_RETURN, they’re spamming in the witness. It’s a very niche technical conversation, and you have to be really into the guts and the weeds of the technical minutiae of Bitcoin to even grasp some of these things.

So I guess that brings us to your calculator. Let’s talk about that, because you have a calculator on the website, at knotslies.com/calculator.html. Links will be in the show notes for listeners. But Martin, tell us about this calculator. What are you doing? What are you showing here? What are some of the key takeaways for people?

Martin (32:44)
Yeah, so after I posted all this stuff, I noticed that many people were saying basically the same thing: yes, but it’s overhead for the spammer as well, so it will get more expensive and therefore spammers will be deterred. And I would say, okay, that would probably be a fair argument if it cost the spammer a hundred times more or something.

But it’s always good to get some real numbers on how much it will cost the spammer. And I decided to try and measure different techniques, because I wasn’t even sure which technique is the best to begin with. And there was this thing with the wording, or rather the meaning, of BIP 110: they wrote it in a way that made it look like they would also restrict the script sizes, and that was the exact thing that made me unsure which technique to pick. If I had known from the beginning that this restriction is not applied to the scripts, then I would have known immediately that putting the data into Taproot or witness version 0 is basically the most efficient, or rather into the script, not into the witness stack. But I still wanted to show various techniques, why not? It wasn’t too hard to code.

So basically what this calculator does is simulate what the attacker would do. The code has these parameters that are in the UI: what the restriction is for the chunk limit, for the push limit, what the data size is, and so on. It takes these restrictions, and based on them and the technique, it chops up the data. There is no real data, it just works with numbers. So it goes: okay, we use a technique that puts stuff into the witness and we have a 256-byte limit, so whatever the number is gets divided by 256, and now we know how many chunks we have. Then there is some processing to figure out if it even fits into the script itself, because even scripts can have some size limits and whatnot. Then it works out: do I need more inputs, or more outputs in the case of OP_RETURN, or whatever? It does all this computation to work out how much overhead there will be. And then it sums up all the sizes and whatnot and outputs the numbers. For convenience, percentages are shown as well.

And this is where it gets to the point. You can see that if you try different techniques, usually one of the techniques is super cheap. Especially for a lot of data: if you put in 10,000 bytes of data and you select pushes in tapscript, you will see that just changing the technique slightly will have a cost of something like 0.4%. Like if you think that…
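The chunk-counting step Martin describes can be sketched in a few lines of Python. This is a minimal illustration and not the calculator's actual code: the function names are hypothetical, the 520-byte baseline is the existing consensus limit on a single script push, the 256-byte limit is the figure mentioned in the conversation, and everything outside the data pushes themselves (inputs, outputs, signatures, script size limits) is ignored. Its percentages are therefore not expected to match the calculator's.

```python
def push_overhead(chunk_len: int) -> int:
    """Script bytes needed to introduce one data push of chunk_len bytes."""
    if chunk_len <= 75:
        return 1  # a single opcode encodes the length directly
    if chunk_len <= 0xFF:
        return 2  # OP_PUSHDATA1 plus a 1-byte length
    if chunk_len <= 0xFFFF:
        return 3  # OP_PUSHDATA2 plus a 2-byte length
    return 5      # OP_PUSHDATA4 plus a 4-byte length


def chunked_size(data_size: int, push_limit: int, extra_per_chunk: int = 0) -> int:
    """Script bytes used when data_size bytes are split into pushes of at most push_limit.

    extra_per_chunk is a hypothetical knob for techniques that need
    additional opcodes per chunk; it defaults to none.
    """
    full_chunks, remainder = divmod(data_size, push_limit)
    total = full_chunks * (push_limit + push_overhead(push_limit) + extra_per_chunk)
    if remainder:
        total += remainder + push_overhead(remainder) + extra_per_chunk
    return total


def relative_increase(data_size: int, old_limit: int, new_limit: int,
                      extra_per_chunk: int = 0) -> float:
    """Fractional growth in script bytes when the push limit drops."""
    old = chunked_size(data_size, old_limit)
    new = chunked_size(data_size, new_limit, extra_per_chunk)
    return (new - old) / old


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        pct = 100 * relative_increase(size, old_limit=520, new_limit=256)
        print(f"{size:>7} bytes: {pct:.2f}% more script bytes")
```

Working with sizes only, as the calculator is described as doing, keeps the model simple: halving the push limit roughly doubles the number of pushes, but each push adds only a few bytes of framing.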

Stephan Livera (36:21)
Yeah, so we need to highlight that just for listeners. If you use the right techniques (now again, not endorsing spamming the chain), above a certain size, if you are using some alternative opcode, then even if, hypothetically, BIP 110 were to be activated on the heaviest chain and everyone was using Bitcoin with BIP 110, spammers who want to economize on their cost could use an alternate inscription method and they would be paying only 0.4% more. It’s actually a bit less than that, but let’s say 0.4%. Right, so listeners, think about that. Do you believe that if someone is spamming the chain today, you are going to deter that person by making them pay 0.4% more?

Martin (37:05)
Exactly.

Stephan Livera (37:06)
So I think that’s the important point that people should learn and understand here. Shout out to Moonsettler for this as well, because he commented under my earlier episode with Charlie talking about this, and explained that basically the gist of it is: as long as we still have Bitcoin Script, and as long as we have the SegWit discount, this is more or less going to be true. As long as we have these two things, that’s more or less the situation we’re in.

So I understand the desire to not have spam in Bitcoin, but there are certain fundamental technical realities on the ground. Unless we’re willing to go back and unwind the SegWit discount, which would have its own problems, since there are reasons we did the SegWit discount, and unless we’re going to unwind Bitcoin Script, there’s not really a lot that you can do here. I think that’s just the fundamental point. So how do you see that? Do you agree or disagree? How would you elaborate on that?

Martin (38:05)
Yeah, there is even a really funny thing. Let’s say we went, okay, let’s just get rid of Bitcoin Script so that there is no more of this stuff. Entertain this idea. What happens if we remove Bitcoin Script?

Stephan Livera (38:24)
I mean, multisig and Lightning and all these things are out the window at that point, right?

Martin (38:25)
Lightning Network breaks. And what happens if Lightning Network breaks? All the coffee transactions suddenly go to the chain. Suddenly you have maybe a thousand times, 10,000 times more spam in the chain from the transactions that would otherwise have been on Lightning. So, you know, the biggest optimization and the biggest anti… That’s a funny thing: Lightning Network is the biggest anti-spam technology in Bitcoin.

Because if you are transacting over the Lightning Network, it’s so much cheaper that it makes economic sense to pay higher fees on the channel opening and closing transactions. And those can then drive out the spam. But if you destroy Lightning Network, then this stops working and you get even more spam.

Here’s another really funny thing: if you look only at the cost of running a full node, a block full of Bitcoin transactions that actually transfer some value is more costly to validate than a block full of spam. That’s really interesting.

Stephan Livera (39:39)
Well, I guess there’s a bit of a nuance with that, isn’t there? If we’re talking about inscriptions, I think that’s true, but if, let’s say, we’re talking about spam such as BRC-20…

Martin (39:42)
It is true. I measured it.

Stephan Livera (39:54)
then that creates a lot of small UTXOs, right? Because if you remember, back in late ’23, early ’24, there were these guys doing this competitive mint, and that’s why people got annoyed, right? Chris Guida and a bunch of these other people were really annoyed about the high fees. Now, to be fair, I was also annoyed about the high fees at that time, because I thought, what is this? These guys are spamming. But then I sort of realized, no, this is not a sustainable thing. These guys are just spending like crazy because they’re gambling. Now, okay, I understand there are people who don’t want gambling in Bitcoin, but again, it comes back to: how are you going to stop it, right?

And so with BRC-20 there’s probably a slight nuance, because the people doing BRC-20s are creating a ton of UTXOs, and that was when the UTXO set blew out from maybe 70 million all the way up to maybe 160 million, something like that. Rough numbers. That was kind of what happened in that time, right?

Martin (40:44)
Yeah, yeah, by spam I meant mainly, like… because even those transactions are, if you are measuring in terms of bytes stored, inefficient. They are paying four times more fees than they would have to. But if people are just using the witness, either tapscript or basically any technique that I put there, then it’s less costly for the network to validate. And it’s for a very simple reason: signature verification is much, much more costly than hashing or any other script operation. It’s orders of magnitude more costly. So the funny thing is like…
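The "four times more fees" figure follows from the SegWit weight rule in BIP 141, where each non-witness byte counts as four weight units and each witness byte as one. The following is a rough sketch of that arithmetic only: the 100-byte transaction skeleton and the fee rate are made-up illustration values, and per-chunk framing is ignored.

```python
import math

WITNESS_SCALE_FACTOR = 4  # BIP 141: a non-witness byte weighs four units


def weight_units(non_witness_bytes: int, witness_bytes: int) -> int:
    """Transaction weight: non-witness bytes count four times, witness bytes once."""
    return non_witness_bytes * WITNESS_SCALE_FACTOR + witness_bytes


def virtual_size(non_witness_bytes: int, witness_bytes: int) -> int:
    """Virtual size in vbytes, the unit fee rates are usually quoted against."""
    return math.ceil(weight_units(non_witness_bytes, witness_bytes) / WITNESS_SCALE_FACTOR)


def fee_sats(non_witness_bytes: int, witness_bytes: int, feerate_sat_per_vb: float) -> float:
    """Approximate fee at a given fee rate."""
    return virtual_size(non_witness_bytes, witness_bytes) * feerate_sat_per_vb


if __name__ == "__main__":
    skeleton = 100     # hypothetical non-witness bytes for the rest of the transaction
    payload = 10_000   # bytes of embedded data
    feerate = 2.0      # hypothetical sat/vB

    in_outputs = fee_sats(skeleton + payload, 0, feerate)
    in_witness = fee_sats(skeleton, payload, feerate)
    print(f"data in outputs: {in_outputs:.0f} sats")
    print(f"data in witness: {in_witness:.0f} sats")
    print(f"ratio: {in_outputs / in_witness:.2f}x")
```

The ratio approaches four as the payload grows relative to the fixed part of the transaction, which is why data carried in outputs is the expensive way to store bytes.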

Stephan Livera (41:22)
It’s very computationally costly for the node here.

Martin (41:34)
I think that the tapscript envelope as used today might be one of the least costly things for the full node to validate, because it can just skip over the instructions that are in the unexecuted branch, and it only needs to hash the entire script once, so there is just one hash.

With the other techniques there, it might happen that the node has to maybe execute some instructions, maybe allocate some memory. It might need to hash stuff, because one of the techniques, which is actually the technique that I used for my image, is just putting the chunks into the witness and then putting hash verification into the witness. And this is much more overhead than the Taproot envelope, but I wanted to prove a point that you don’t need Taproot and you don’t need OP_IF and you don’t need…
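For readers who have not seen one, an envelope of the kind Martin mentions is a run of data pushes wrapped in a branch that never executes. The sketch below builds a simplified version in Python; it is an illustration under stated assumptions and not the exact format any inscription tool uses. Real envelopes add protocol tags and sit inside a script that also contains a spending condition, both of which are omitted here, and the helper names are hypothetical.

```python
OP_FALSE = 0x00
OP_IF = 0x63
OP_ENDIF = 0x68
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D


def push(data: bytes) -> bytes:
    """Encode one data push using the shortest standard push opcode."""
    n = len(data)
    if n <= 75:
        return bytes([n]) + data
    if n <= 0xFF:
        return bytes([OP_PUSHDATA1, n]) + data
    if n <= 0xFFFF:
        return bytes([OP_PUSHDATA2]) + n.to_bytes(2, "little") + data
    raise ValueError("chunk too large for this sketch")


def envelope(payload: bytes, push_limit: int = 520) -> bytes:
    """Wrap payload in OP_FALSE OP_IF ... OP_ENDIF, split into pushes of at most push_limit.

    Because the condition is false, a validating node skips the pushes
    instead of executing them.
    """
    chunks = [payload[i:i + push_limit] for i in range(0, len(payload), push_limit)]
    body = b"".join(push(chunk) for chunk in chunks)
    return bytes([OP_FALSE, OP_IF]) + body + bytes([OP_ENDIF])


if __name__ == "__main__":
    script = envelope(b"\xab" * 1000)
    print(len(script), "script bytes for 1000 payload bytes")
```

Lowering `push_limit` in this sketch only changes how many pushes the body is split into, which is the kind of adjustment discussed in the conversation.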

Stephan Livera (42:30)
Yeah, yeah, and while we’re here, I’ll just quickly mention, because this is also relevant for listeners: I think Shesic ran the numbers. I’ve posted this, I’ve been sharing it as well for people. He ran the numbers on what difference Taproot made, and that number is about 12%. So basically, even without Taproot, you’re paying 12% more. And even without the current inscription envelope, people are just going to be paying 0.4% more. So even if, let’s say, we took away Taproot and we took away OP_IF, you’re forcing the spammers to pay 12.4% more. Is that really going to move the needle? Are they really going to stop? Can you think of any spammer who would, when the price of Bitcoin and the fee market, the block space market, can vary by more than that? Listeners, really think this through. Do you think 12.4% is enough to stop them? I don’t think so, right?

Now yes, there are other arguments people get into, like: well, okay, you’re forcing the spammers to change their inscription envelope. But even there, look at what Casey Rodarmor and some of those other guys have said. He’s the creator of Ordinals and of ord, which you can think of as an indexer, and it’s also the reference implementation for Ordinals. So he could probably push an update in an afternoon. And then most of the rest of the spammer ecosystem, the big exchanges and the wallets that support all this Ordinals spam stuff, would be updated within a few weeks.

So really, the way I’m understanding this is that it’s basically a temporary inconvenience to spammers and a big cost for us to do a consensus change. I think it’s important that people understand these different costs. There’s a big asymmetry here. So that’s what I would say. Do you agree or disagree, or how do you see that?
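To put the 12.4% figure next to ordinary fee-market movement, here is a small worked comparison in Python. The payload size and fee rates are made-up illustration values and the function names are hypothetical; only the 12.4% overhead comes from the conversation. (Compounding 12% and 0.4% instead of adding them gives about 12.45%, which changes nothing material.)

```python
def cost_sats(vbytes: int, feerate_sat_per_vb: float, overhead: float = 0.0) -> float:
    """Fee for a payload whose size is inflated by a fractional overhead."""
    return vbytes * (1.0 + overhead) * feerate_sat_per_vb


def offsetting_feerate_drop(overhead: float) -> float:
    """Fractional fall in fee rate that fully cancels a given size overhead."""
    return 1.0 - 1.0 / (1.0 + overhead)


if __name__ == "__main__":
    vbytes = 10_000  # hypothetical inscription size in vbytes
    base = cost_sats(vbytes, 5.0)
    with_overhead = cost_sats(vbytes, 5.0, overhead=0.124)
    fee_market_doubles = cost_sats(vbytes, 10.0)

    print(f"baseline at 5 sat/vB:       {base:.0f} sats")
    print(f"with 12.4% overhead:        {with_overhead:.0f} sats")
    print(f"no overhead, at 10 sat/vB:  {fee_market_doubles:.0f} sats")
    print(f"fee-rate drop that cancels the overhead: "
          f"{100 * offsetting_feerate_drop(0.124):.1f}%")
```

Under these assumptions, a fee rate that falls by roughly 11% erases the whole overhead, and a doubling of the fee rate costs the spammer far more than the overhead does.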

Martin (44:18)
Yeah, certainly if there was literally zero cost to doing this, then I would be like, okay, let’s make them pay 12% more, why not? Hypothetically. Well, I wouldn’t want to destroy Taproot, that’s another thing, there are privacy gains and so on. But let’s say we could somehow magically create 12% overhead for the spammers without causing any issues whatsoever: no risk, no chain splits, no problems with anything, something super simple in the code that requires almost no engineering or testing. I would be like, okay, whatever, let’s do it. But that is not the case. There are real costs. There is a risk of chain splits, there is a risk of freezing some funds, there is more code complexity.

And also, even if it’s temporary, the code that enforces the temporary rules still has to be in the codebase until the end of time. We cannot remove it afterwards, because then, after many years, if mining gets progressively better, someone could use it to fork the chain and create a fake chain that violates those previously activated rules. So this is not simple stuff.

And we would also be hurting legitimate users. BIP 110 would also force protocols like Citrea to use fake pubkeys and whatnot. Basically it would be…

Stephan Livera (46:00)
Yeah, I mean to be clear, Citrea are already going to use fake pubkeys, but it’s a bit of a nuanced thing. It’s only for their challenge transaction, which is unlikely to even hit the chain.

Martin (46:10)
Yeah.

Stephan Livera (46:10)
So that’s the thing, because that’s also another big point: people are saying they changed it for Citrea, which is not true. Citrea were an example, but it was not changed for Citrea. And the way we know that is that Citrea didn’t change to using large OP_RETURNs. They never asked for it and they didn’t bother changing their protocol for this. It was just an example raised by developers for people to understand.

Martin (46:31)
Yeah, it was mostly kind of a message from the developers: please, would you mind changing your code, just to be nice to Bitcoin? And suddenly there is this huge revolt against it. And so, actually, the BIP 110 crowd has probably already caused more spam on Bitcoin. Well, aside from my transactions, obviously. The thing is, I’m not sure what Citrea is doing or not, but I could totally imagine their CEO or CTO or whoever else going: okay, there is this risk that if we change it to put it in the OP_RETURN, we might need to change it back, because there are those guys who will just soft-fork Bitcoin and it will stop working entirely, so let’s just not do that in the first place. And we are screwed. Well, screwed: thankfully it’s probably not that big of a deal if it’s not supposed to get into the chain in the first place, but it’s worse, basically. So things will be worse just because people were screaming about it. Let’s take a look…

Stephan Livera (47:45)
I see.

So I guess sort of wrapping things up then, do you have an alternate vision for realistic Bitcoin adoption? What would you say instead of BIP 110? What should people do?

Martin (48:02)
Well, if we want to decrease spam, then the only option is to decrease the block size. I wouldn’t decrease the witness discount, as some people propose, because the discount is still important to discourage people from making too many outputs. It might make sense to reduce the SegWit discount if combined with something like cross-input signature aggregation or something else. There are some reasons to do that in that case, but only in that specific case. Other than that, I think just decreasing the block size is much simpler. And it doesn’t involve…

The thing is, any of these changes probably makes coding other libraries and software that work with Bitcoin in any way much more difficult. Since I am a co-maintainer of rust-bitcoin, I can speak for rust-bitcoin: I know the APIs, I know how it works, and I can imagine pretty well what kind of changes would be needed to make this change.

And here’s the funny thing. This is specific to the Rust ecosystem, but maybe some other languages work like this as well: the rust-bitcoin library is the topmost library that all the other libraries use. If we break the API of this library, then every other library has to update, and they have to do it pretty much synchronously. If some libraries are interacting with each other as well, then they have to update, well, not strictly in lockstep, but kind of together. It needs to be coordinated, and this is a huge pain. And if there is any fork that creates these kinds of problems, then it’s a huge burden on the entire ecosystem: not just all the risks and costs to the network and so on, but also to the engineers, to the developers who, instead of making new features on Bitcoin or maybe increasing the security or something, now have to adjust the code because of some nonsensical consensus change.

So even then I would be very careful about doing these adjustments. Just decreasing the maximum block weight is the simplest way of doing it, but it’s not super simple at all. It still requires a bunch of work.

Stephan Livera (50:42)
Yeah, and I mean look, for what it’s worth: if, hypothetically, in 2017 we could have gotten SegWit without a block size increase, I would have been in favor of that. Chatting with Adam Back, I think he mentioned this idea of, oh, well, what if we just had a 500-kilobyte base block size with a discount? But at this point, just being honest, it seems very unlikely that it’s going to happen. If we could get consensus for it, I think maybe, but it just seems unlikely.

And even the CISA idea, the cross-input signature aggregation, from what I understand, it doesn’t seem that anyone’s actually working on that right now. Or at least not right now. Oh, I didn’t know that, yeah, okay.

Martin (51:25)
Actually, there are people working on it, to my knowledge.

I know that one person got a grant for it. I don’t know if it has expired already or not. I hope not.

Stephan Livera (51:37)
Yeah, because from what I recall, I think Jonas Nick was doing some work on that, and maybe Fabian as well. But it didn’t seem that there were a lot of people out there trying to push it. I mean, hey, I’m not opposed to it either. So anyway, I guess the point is…

Martin (51:43)
Hey, I’m Anthony.

Stephan Livera (51:59)
there are not a lot of good ways to stop the spam that don’t also hurt us in some other way monetarily, right? Whether it’s harming our scripting, whether it’s harming Lightning, whether it’s harming, let’s say, Miniscript, Taproot Miniscript, which is also a future pathway for maybe improving self-custody for a lot of people, whether that’s redundancy, inheritance, or just security in general. The harder you try to clamp down on some of these possible ways of spamming the chain, the harder you actually clamp down on our genuine monetary use. And that’s why for me, even though I’ve been Bitcoin-only since 2013 and I do not endorse spamming the chain at all, I think there are genuine risks and problems with BIP 110. I think it is extremely unlikely to even happen.

But I think it’s at least educational if people can understand that it’s not so much about signaling how much you hate spam, it’s about understanding certain technical realities of how Bitcoin works, and that we are not meaningfully increasing the cost for spammers with BIP 110. And it’s also a bad premise, right? It’s a fake emergency, because it’s inconsistent of, let’s say, Luke or other people in that camp to say that what Core has in the default mempool is somehow different to what both Core and Knots already accept as consensus-valid, and have done for a long time. A one-megabyte OP_RETURN has been consensus-valid for a long time. Why are we now making a huge deal about this 100-kilobyte OP_RETURN, when there are literally millions of image inscriptions, and when inscriptions of theoretically up to 400 kilobytes could, even pre-v30, be relayed around the network as standard on most of the network? So to me it just feels like a false narrative. At least that’s how I see it. Do you want to take a few minutes and give us your closing thoughts, any key takeaways for listeners?

Martin (53:55)
Maybe there is another angle to look at it that isn’t brought up very often, but I believe it’s extremely important. What people need to understand is that the reason Bitcoin has mining is to make it resistant to government influence. Let’s say, for instance, that you trust a government, or at least a group of governments. Let’s say you pick the United States, the United Kingdom, France, whatever, Russia, who cares? And you issue a private key for each of these governments, and they basically do a multisig of blocks. They do the same thing that Signet or Liquid is using, where they just sign blocks. There is no mining. And you just trust them.

If you trust the governments, this would be much more efficient, because there would be no mining. The blocks could be much more frequent, because there are no synchronization issues between fewer parties. The filters could just be implemented by the governments. Why not? If you trust the governments, then this would be a much more efficient system, much better than Bitcoin. So why does Bitcoin have mining? Why didn’t Satoshi issue the keys? Because Satoshi didn’t want to involve the governments in the first place, obviously.

And okay, then people are saying stuff about legal and illegal content in the blockchain and so on. But here’s the deal: if you are willing to change your node because you fear that the government will prosecute you for storing illegal content, then you will also be willing to change your node because you fear the government prosecuting you for enforcing the 21 million bitcoin cap. If the government says, okay, whoever runs a Bitcoin node which enforces the 21 million cap will be jailed, so either run our hard fork that removes the cap or go to jail, what will you do? Based on the current situation, the BIP 110 supporters will probably change their node. So there is no cap, actually.

So I think what’s important is for people to realize that Bitcoin is absolutely nonsensical without the ability and the willingness of the node runners to say no to the government and run the nodes covertly, so they cannot be prosecuted. The government cannot go door to door and look at every single computer and analyze the software on it to figure out if it happens to store some version of Bitcoin Core that is enforcing the limit. This is unworkable for the government. So basically, if you run your node in secret, which is how it’s supposed to be, then the government cannot know that you are running it, and even if they find out that you are running it, they cannot prove that you are running a specific version or something. And if this is the case, then you don’t have to fear the government jailing you for possessing illegal content in the Bitcoin blockchain, because they don’t know that you actually do. So that’s kind of the point.

And I think what people don’t realize is that even if all these arguments were valid, in the sense that these were real legal issues, it would still be wrong to change Bitcoin for it. Because if people change Bitcoin because they fear the government, then Bitcoin has already failed.

Stephan Livera (57:41)
All right, well, I guess we’ll leave it there. Listeners, check out Martin’s website, it’s over at knotslies.com, and you can follow him online at K-I-X-U-N-I-L. And Martin, thanks for joining me today.

Martin (57:53)
Thank you too. Bye.

Stephan Livera

UTXOs, Spam & Bitcoin’s Integrity with Martin Habovstiak | SLP729
