In this episode, we explore the latest in Bitcoin security with NVK, including the new MK5 device, firmware updates, and the impact of AI and quantum computing on crypto security. We discuss practical tools, philosophical debates on self-custody, and future tech trends.

Timestamps:

(00:00) – Intro 

(00:21) – What’s new with @COLDCARDwallet?

(04:05) – Miniscript support in Coldcard

(09:48) – Thoughts on Bitkey

(17:49) – @bisq_network protocol exploit

(23:17) – Debunking the quantum FUD

(29:51) – NVK on AI & LLMs

(37:54) – llm-wiki

(45:38) – Practical guides for using AI agents

(52:03) – Closing thoughts 

Transcript:

Stephan Livera (00:00.728)
Hi everyone and welcome back to the Stephan Livera Podcast. Rejoining me on the show is my friend NVK. Many of you will know him as the founder of Coinkite, the creator of the Coldcard and various other Bitcoin security products. NVK, welcome back to the show. So NVK, I know you guys recently put out the Mk5. So tell us what's new with the Mk5.

Hey man, thanks for having me.

NVK (00:22.638)
It was time to refresh that line and we have like a nice screen now, much bigger, gorilla glass on it. We’ve upgraded our industrial design capabilities so the case is a lot harder. It’s way more refined. We’ve built a keyboard from scratch. So it’s a really nice keyboard. We moved the USB to the bottom so that it’s more aligned with like how people use it.

And we've improved the NFC as well. So the NFC is on par with the Q line now. It's very nice NFC, so it can do your push TX and broadcast transactions without needing a computer. What else? And we added weights so that it holds nicely on the table when you're using it. And it's got the slide cover like the Mk4. It's the same security as the Q.

The device is essentially... nobody has produced a key extraction against the Mk4 or the Q line yet. Same for the Mk5. And we're pretty happy with it.

Cool, and so, yes, as I'm understanding it then, it's more like a visual and a feel refresh to the product, but the idea is you're still going to keep them running as separate lines. You'll have the Mk5 on one side and you'll have the Q. Are you planning to keep them both running in parallel? Yes.

Yes, there is like a lot of people, I'd say about 35% of our market, that prefers a smaller device that they can conceal easily, right? And they like the form factor of the Mk line. They just wanted a bigger screen, you know, and a few sort of improvements. So that's essentially what we did. We were listening to our users, right? And then...

NVK (02:18.958)
The other sort of 40, 45% of the market prefers the Q with the full QWERTY keyboard and the scanner and all that stuff.

Yeah, yeah, I think the scanner feature is so cool for me with the Q. But yeah, so let's talk about the firmware side of it also. I know we're still on firmware 5.5.0. What are some of the latest things there? I see you've got BIP-322 proof-of-reserves signing. Yeah.

Yeah, so that was super cool. There are a lot of commercial, sort of enterprise, clients that use our devices that needed to do proof of reserves internally for their own compliance. So we added that. And it's kind of cool because you can just scan a QR to prove reserves too. Super easy to do. So that's there. We also have...

Sorry, just on the proof of reserves, as I understand it's kind of like a PSBT-signing kind of process. That's right. And then, in doing that, you are proving the reserves, and that might be useful, like you said, for compliance or audit reasons. Or I know of maybe even some cases where people are, you know, a customer of a Bitcoin neobank or something like that, and they want to prove ownership of coins. So that's another use case, where maybe the user wants to be able to prove that he controls these coins without actually moving them.

That's right. It's used a lot. And I think there are more Bitcoin features coming that will need some proof of reserves. Not base-layer features, but things that people are building for contracts and for other things of that nature.
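As a rough sketch of the mechanism being discussed: BIP-322 message signing starts from a tagged hash of the message, and that hash is then embedded in a virtual transaction that the wallet signs much like an ordinary PSBT. The stdlib-only helper below computes the tagged hash only; it is an illustrative fragment, not a full BIP-322 implementation.

```python
import hashlib

def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP-340-style tagged hash, as used by BIP-322 message signing."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()

# BIP-322 hashes the message under the "BIP0322-signed-message" tag; the
# 32-byte result goes into a virtual "to_spend" transaction that the
# wallet then signs, proving key control without moving any coins.
msg_hash = tagged_hash("BIP0322-signed-message", b"prove control of these UTXOs")
print(msg_hash.hex())
```

The message text here is made up; in the real protocol a verifier rebuilds the same virtual transaction from the address and message and checks the signature against it.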

Stephan Livera (04:03.874)
Got it. And so anything new on the Miniscript side of things?

Yeah, I mean, we've been sort of keeping on par. We support pretty much... it's essentially Ledger and Coldcard that support, let's call it, the frontier of Miniscript. We...

On the edge firmware to be clear, not the standard firmware.

That's right. We just keep on sort of following the features and trying to make them available to the developers. We also support it because of AnchorWatch and Liana. So, yeah, folks are playing with it. It is starting to see more adoption. But, you know, like, old-school multisig is still king.
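To give a flavor of what Miniscript enables in the wallets mentioned here, a common pattern is "primary key spends any time; recovery key only after a timelock." The sketch below just composes that policy as a policy-language string, with placeholder key names; real wallets compile such a policy down to Miniscript and then to Bitcoin Script.

```python
# Tiny helpers that compose a spending policy in Miniscript's policy
# language. Key names are placeholders, not real public keys.
def pk(key: str) -> str:
    return f"pk({key})"

def older(blocks: int) -> str:
    # relative timelock: this branch becomes valid after `blocks` blocks
    return f"older({blocks})"

def any_of(*branches: str) -> str:
    return f"or({','.join(branches)})"

def all_of(*terms: str) -> str:
    return f"and({','.join(terms)})"

# Primary key can spend any time; recovery key only after ~1 year of blocks.
policy = any_of(pk("KEY_PRIMARY"),
                all_of(pk("KEY_RECOVERY"), older(52560)))
print(policy)  # or(pk(KEY_PRIMARY),and(pk(KEY_RECOVERY),older(52560)))
```

The point of the policy language is exactly this composability: branches like the recovery path can be added or nested without hand-writing Script.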

That's what most people use, that and single-sig plus passphrase. So that's interesting there. One thing that we've been putting a lot of attention into in the recent year is making sure that we have a rock-solid setup for spending policies. And a lot of people use our devices that way now. It's a very common use case where you essentially have maybe one or two devices, where you have one device that's available to you, right?

NVK (05:24.718)
Say on business premises or home-office premises, where the device itself has the keys to sign your coins, but it is locked down to a spending policy. So, you know, you cannot spend more than whatever you define. Say, for example, you define that it can only spend $100,000 or something like that, right?

That means that if you're under duress or something happens and you're forced to spend, you can only spend that limit. They cannot take the rest of the coins of those keys using that device. And what's cool is, in good Coldcard fashion, there is a million ways for you to conceal and confuse attackers.

You know, like, we don't say what the spending policy is at the time of signing, right? So you have to know it. So you can make it fail many times. We also have a bunch of different ways in which you can force, you know, fake fails. You can have it so that it can only send to whitelists. So for example, one use case people are using Coldcards a lot for now is to top up their Bitcoin-backed loans when the Bitcoin price drops.

And that device is set so they can only send to the whitelisted addresses of their loan provider. So if the price drops, they can sign a transaction, send it over, save their loan from getting called. But they cannot send to any other address. And that device is set to that.
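The spending-policy behavior described here can be sketched as a simple pre-signing check: an amount cap plus an address whitelist. The field names, cap, and addresses below are illustrative, not Coldcard's actual firmware logic.

```python
def policy_allows(amount_sats: int, dest: str,
                  max_sats: int, whitelist: set[str]) -> bool:
    """Refuse to sign if the spend exceeds the cap or pays an unknown address."""
    return amount_sats <= max_sats and dest in whitelist

LOAN_PROVIDER = {"bc1q-loan-topup-address"}  # hypothetical whitelist entry
CAP = 100_000_000                            # 1 BTC cap, for illustration

assert policy_allows(50_000_000, "bc1q-loan-topup-address", CAP, LOAN_PROVIDER)
# Wrong destination: refused even though the amount is under the cap.
assert not policy_allows(50_000_000, "bc1q-attacker", CAP, LOAN_PROVIDER)
# Over the cap: refused even for a whitelisted address.
assert not policy_allows(200_000_000, "bc1q-loan-topup-address", CAP, LOAN_PROVIDER)
```

As NVK notes, the real device adds more on top of this, such as not revealing at signing time why a transaction was refused.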

Interesting. So in using that, just for clarity, they could be having it in their office or home office, let's say. And then if they want to spend a larger amount, above that threshold, that's where they've got to go to another device, or like the multisig. How are they setting that part?

NVK (07:26.862)
Well, see, it could be single-sig or multisig, depending, because we support spending policies in both modes. We have a setup where, in multisig, the device can have two keys, and they're treated differently internally, security-wise, so that it can enforce that spending policy. I mean, I'm not going to get into the weeds in a call like this. But...

The point is, the functionality is there, right? So whether you have multisig or single-sig, the device can enforce spending policies, so that you can be in a good spot operationally if the price drops, for example, for your Bitcoin loans.

Any standout user feedback or surprises since launch?

Honestly, no. I mean, like, people get it. We're very lucky to have a very dedicated user base. We're not dealing with the full degen folks using a Trezor or Ledger for their slew of shitcoins. The people who gravitate to our devices tend to care a little bit more about understanding what's going on.

So they truly leverage the features that we build to the max. People really use the stuff we do in the way that we intend. So we end up getting very good feedback that way. And it's rare that you're going to find a Coldcard user that gave away their seed in a DM. That's not our user base.

Stephan Livera (09:04.77)
Yeah, there’s a selection effect at play. Like it tends to be the more intermediate and advanced level.

It's more people who have some money. It ends up being people who actually have coins, not people who just started collecting a few sats, right? I always say, if you don't have enough Bitcoin to justify the purchase of a hardware wallet, just stick with a phone. A phone wallet is fine. And then you upgrade from there. Because as your stack grows, your interest and investment, like knowledge investment, into understanding self-custody is going to grow too.

Yeah, I think that's the common pathway for a lot of people. I also wanted to get your thoughts on the Bitkey, because I know there's been a lot of conversation back and forth with that, and the Bitkey guys recently put out a new version with a touchscreen, and that was one of the big criticisms people had. But I know there are some other, maybe philosophical, differences that you might have with the Bitkey style of approach, where it's like a 2-of-3 with Block's server key, and the seed phrase is not exposed

to the user.

NVK (10:12.846)
Yeah, I mean, I think it's great that you have a pubco trying to create self-custody. I just find it very disappointing that the marketing is sort of... is not honest enough. Because the reality is, like, you know, this is a gigantic company and they're going to do a lot of marketing. They hopefully are going to get a lot of people into Bitcoin. But, you know, they're removing the

unilateral-exit capacity, which is Bitcoin's biggest feature for self-custody. The reality is you cannot export your keys out of a Bitkey. You cannot export your keys out of the app either, and you cannot export the keys out of their servers. So that means if they want to, they can make people stuck. They maintain an APK for Android for people to

load things, if they still have the hardware, and if the hardware doesn't have any malicious firmware, which only they control.

We're talking about, like, a sovereign recovery. I think it's called the emergency exit kit. That's sitting in your cloud, and the idea is, like, if Block goes down, you get this PDF and some other things, and...

That’s right, yeah, the break the glass.

NVK (11:28.16)
It's theater. I mean, the reality is it's theater. Because the problem is you have a device that has auto-update; the user doesn't control the software updates, right? And they control the whole stack. So if they wanted to be malicious and push a bad firmware, right, that doesn't allow people to send coins to certain addresses, they could, right? And then the user is stuck trying to send with their APK, but the APK is controlled by them. You know, it's not ideal,

especially when you're a pubco, right? Because if you're a pubco, you're more scrutinized, you're more vulnerable to, sort of, government concerns, right? So in that sense, I just find it disappointing that there is no true break-the-glass unilateral exit, which, again, is Bitcoin's biggest feature if you're going to call it self-custody. If it's advertised differently, it's like, hey, this is better than a phone app.

It's better than keeping it on an exchange, right? But, you know, we can prevent you from exiting. And, you know, maybe there is a better sort of path that way. I just wish it was a little bit more honestly marketed.

I see. Yeah. So as I'm trying to understand a little bit about how that emergency exit kit works, I'm just kind of quickly looking, I don't know the detail of it. I guess you could have downloaded that app beforehand and kept it, but in practice, who's going to do that?

especially the target market.

Stephan Livera (13:03.978)
Yeah, I see. But I mean, I guess the counter-argument would be, they're gonna bring more people into Bitcoin, right? They are going to...

Well, so is MicroStrategy, right? I mean, MicroStrategy is going to bring more people to Bitcoin. Nobody brought more people to Bitcoin than Coinbase. I think it's like, you know, again, if you're going to have the do-gooder, we-care-about-Bitcoin sort of narrative, right, well, then be honest about it. You know, Coinbase is honest about what they do and how they do it, right? Just be honest about it. Don't confuse people.

Ha

NVK (13:41.659)
I think that's the main beef I've had with that product for a long time, because I was one of its biggest supporters.

Yeah, I recall maybe some of the earlier conversations; some part of your criticism was about how they were marketing it as, let's say, seedless. I think maybe that was part of it. I think, look, what we're getting at here, and I'm happy to host someone from Bitkey and Block on to kind of talk about it from their perspective, but I guess I'm somewhere in the middle. I'm okay with normies kind of using Bitkey, as an example, but...

For me personally, I'm gonna use a Coldcard, and for my family, I'll teach them how to use a Coldcard. But if it's someone who maybe you don't have the time to spend with, yeah, there's just gonna be this fundamental trade-off. And I think, to be fair to you as well, I think there is maybe a fundamental philosophical difference of, like, how much time and effort are you going to put in to just fundamentally learn what is going on under the hood? And I think maybe you, and perhaps even the Nunchuk guys, were also kind of commenting in a similar direction of saying,

no, part of the way you use the Nunchuk app is that you first create a key and then you create a wallet, so that you can understand what's going on under the hood. Whereas maybe with Bitkey, the idea is the user does not have to understand really much of what's happening under the hood.

Yeah, I mean, again, like, I think the best thing that we can have in Bitcoin is honesty, right? So, like, if everybody is just clear about their trade-offs, right, I think it's a big win for the consumer. It's always been like that, you know, and it really is that simple. It's like, hey guys, you know what, we have this super easy to use, beautiful new device, you know, like, that screen is gorgeous.

NVK (15:24.539)
You know, like, the only problem that we have is that, like, you know, to make this this easy, you don't have a lot of...

Yeah, it's kind of like an asterisk thing, because it's like they've got the emergency exit kit. So is that not a unilateral exit? But it's just a bit harder to access, and it's in a different way to the typical, like... maybe you could argue it's more like a vendor-lock-in-esque kind of approach, as opposed to the...

Well, the main problem is that it can be rugged. That unilateral exit is very easy to rug. So it doesn't fit well with a company that has that much government exposure. But that's just splitting hairs. Like, listen, I hope they do well. I hope they onboard people. On the other hand, what they're doing with merchants is incredible.

Yeah.

Stephan Livera (16:14.99)
Right, I think I saw Miles announce they had like over 800,000 merchants.

It’s incredible, absolutely incredible, right? Like they should focus on that. I mean, they’re very good at that.

Yeah, but I guess I see them as kind of... they're doing the ecosystem. They're doing everything, right? They're doing the merchant-processing side, and, I mean, who knows, they're probably going to do loans. They've got the hardware side of it, they've got the mining side of it. Like, they're kind of just doing everything, so...

That's right. I mean, you know, that's how companies go. You know, like, Google used to have "don't be evil" as their tagline, right? Like, this is the problem. This is a publicly traded company. You know, I'm sure Jack has the best intentions in his heart for it, right? But it's not his company, right? It's the shareholders'. You know, look at Twitter, right? I mean, eventually, you know, he's going to get bored of it and move on with his life, right? And then, like, who takes over next? You know, they may not want you to take your coins out, right?

When the government gets in...

NVK (17:08.083)
It’s just not the right vehicle for self-custody.

Yeah, I see. Well, I mean, I'm somewhere in the middle. I'd be okay with, like, you know, people just learning on that, and, you know, maybe over time then figuring out exactly how you want to do your self-custody. I think really, to me, it just comes back to this fundamental philosophical difference of, like, what should be the approach for self-custody for people who have, you know, larger amounts of coin, let's say.

And maybe the answer from the Bitkey side or the Block side would be, look, this is not meant for people who are, you know, holding large, large amounts of coins. I don't know, I'd have to see. But anyway, let's move on. I think there are some other interesting things around Bitcoin security that are worth talking about. I'm sure you might have seen, in the industry, Bisq just recently got hit yesterday. There was a big exploit, I think it was about 11 BTC. It was some kind of negative miner-fee floor in the multisig.

What, from your perspective, does this say about decentralized protocols, and what does it mean for us in hardware security?

I didn't follow that one, but if I remember, Bisq was even using some checkpoint or something like that to make things happen. I lost track of that project.

Stephan Livera (18:19.406)
Yeah, so they have, I think, a colored coin, BSQ. I don't know the details; this is maybe slightly out-of-date information. And I believe they do have some altcoin trading, so maybe it relates to that. I'm not 100% sure though. But I guess, the broader... here's another interesting question for you. Because we don't know for sure, but the Bisq announcement thread mentioned that it is likely an AI-powered attack.

So maybe people using, you know, is it some North Korean group using Claude or Cursor or Mythos or something?

Yeah, but that's like saying that they just encountered an adversary that's a little bit better than they are. The reality is people are trying to break stuff all the time, and if you have Bitcoin to be taken, it just means that they had bets to...

Yeah, but I guess the point would be, has the game changed, right? Is there, you know, now a big new threat that people need to start thinking about, which is basically AI-assisted hacking? And in this case, it's like a decentralized protocol, right? So obviously the keys aren't offline. It's a different thing; it relates to, like, a smart contract, or if they find a bug or an exploit. Whereas obviously when we're dealing with, like, you know, your Coldcard and your long-term secure storage,

you're very, very careful about that. The keys are kept offline. It's a very different thing. But by the same token, I mean, who knows? Maybe in the future, if people start doing more quote-unquote DeFi lending using, you know, Bitcoin on-chain, then maybe there's kind of an angle there. But I guess, do you think AI-assisted hacking is changing the game in some way? Or is it more like, well, there's AI-assisted hacking, and now we need AI-assisted defense, and you need to defend it that way?

NVK (20:03.832)
So it's already happening. A friend got a preview of the Codex cyber thing, or whatever they call it, the KYC'd, NDA'd version of their security assessment tools. You know, the first thing he did was run it on the Coldcard repo. And, honestly, the bug reports were all, like, extremely mediocre stuff. Everything was flagged high,

right? Like, they said everything was, like, high, high, so it's, like, super dangerous. And everything was completely false reporting. The tools are still abhorrent. The quality of this hacking, AI hacking, is still ultra crap. You know, again, if you're going to point that against, you know, WordPress and a bunch of JavaScript stuff, of course you're going to find a bunch of stuff. But, like, if you point, like, an Indian company that does, like,

you know, those assessment, threat-assessment things against these JavaScript tools, they're going to find a bunch of bugs too, right? I just... you know, it's definitely going to get better. It's not to dismiss it; I think it's definitely going to get better. Because the reality is, these machines can try a lot more stuff concurrently than a person can, right?

But what we notice is that these machines have been trained on very poor code. So most of the code they're being trained on is the crap online. It's not the closed-source missile-guidance systems. There's a lot of amazing stuff out there that just isn't public. And these machines cannot

test against hardware yet either. So on all the embedded code, they can find, like, obvious bugs, right? But they cannot test their bugs, because they don't have access to the embedded hardware in a lab, right? On their desk. Because that's how you test embedded hardware. You can't just connect to it, or just run a virtualized version of an MCU on your computer. So anyways, I think it's gonna get a lot better. I think we're gonna see a lot

NVK (22:24.462)
more interesting bug reports in the future. It's going to be very helpful. I'm super optimistic about this stuff. I think it just makes everything better. But we're definitely not there yet. It's only things that are very poorly made that are getting cracked. So if you're Coca-Cola or you're Procter & Gamble, whatever, you have a bunch of cobbled-together garbage running on your enterprise servers. Of course these tools are going to find a bunch of holes.

You know, finding holes in hardened Bitcoin security things, I think, is going to be a lot harder.

Let’s talk about AI. I know you’ve been doing a lot with that and also the Bitcoin quantum debates. You’ve been weighing in on some of that. We’ve seen some FUD from the quantum types who are panicking about it and you’ve been responding back to them using some AI as well. Do you want to talk to us a little bit about the quantum side of it before we then get into the AI stuff?

Yeah, I mean, the quantum stuff... I have a pedestrian, you know, understanding of a lot of the quantum stuff. You know, I don't claim to be an expert on it. I do understand some of the hardware related to that, and, you know, the limitations of it. Like, we are so far from anything that is usable for actual applied cracking. You know, we still need...

I'm actually, like, skeptical enough that I don't even know if we get to proper quantum computing, by the way. You know, we may just not find a way to scale the hardware, like, to scale the computation. There is a very hard... we're at a point now, I think the best analogy is, like, we have essentially, like,

NVK (24:19.266)
you know, time-machine science, like, time-machine math, right? But we cannot build the time machines. Think about it that way. Okay? No, really.

It's the Google paper talking about the quantum circuit. Yeah.

It's all crazy shit. Like, when you actually dive in, you know, it's kind of similar to talking about black holes, you know? It's like, we have some math, you know, and it's also math that we can't prove in either direction either. So, like, we just... you know, we can describe what we think is describable in certain ways. And, you know, like,

after you sort of dissect the realities of the math itself, which is still not as solid as people think it is... including the scientists, like, the people working on this shit, say it themselves. It's not me. You know what I mean? Like, once you get out of Twitter and you actually go research, you look at the size of the preambles, or the assumptions that people put on their papers to be taken seriously, right? It's ginormous. Like, they're like,

listen, you know, assuming this, this, this, and this and that, which is not very solid yet, you know what I mean, we can calculate this, you know? That's where we are with that stuff. And that's not even touching on this insane, absolute, like, you know, space-time-continuum-changing hardware that you need, right? Like, you know, it's just unrealistic to be freaking out on Twitter. I mean, you know, I'm sure somebody had some shorts on Bitcoin; somebody made some money.

NVK (25:54.414)
You know, some people have some quantum, some PQ, some post-quantum math to sell. You know, I would say, at this specific time, I would just put it into the ignore bucket.

Yeah, and it's interesting. I think one of the articles had made an interesting point, which is that experts in a field are often wrong in their technology predictions, and domain insiders are systematically 20 years more optimistic than outsiders.

That's right. I mean, unless we have some gigantic breakthroughs in both the math and also the hardware, I don't see us having quantum computing for, like, 50 years. Remember, we've been working on this stuff for more than 50 years. It's not new. And again, it's just too out there. It's so out there that we cannot quantify and qualify any of the advancements either.

The other thing, obviously, people will want to know, just because you're the guy from Coinkite, people might be curious: if we were to do some kind of quantum mitigation, and just for example, say, let's say it was Jonas Nick's SPHINCS+ as an idea, and if we had to have much bigger signatures, what would that mean for us in hardware terms? Do you have any kind of...

Stephan Livera (27:26.568)
speculation on what that might look like? Like, would it be feasible or not? Like, just if we were to assume, okay, the quantum thing is real and it's coming in, I don't know, 10 or 15 years, do you have any thoughts on what a quantum hardware wallet would look like? No?

I mean, there are, like, you know, 20 different proposals... I don't know how many anymore, like maybe 10, 20, whatever proposals of, like, new cryptography. You know, for the hardware itself, we don't know, right? We're going to essentially just sign what Bitcoin needs signed, right? So if it looks like we need whatever new curve, whatever new spec,

Gotcha.

NVK (28:11.182)
it's going to come to libsecp eventually, right? And then we're going to support it. There is a non-zero chance that it might require new hardware, depending on the requisites of the signature, right? Because these are all big signatures. It's impossible to know. And it's so out there still that it's hard to know. Too early. I mean, I highly recommend people going to bitcoinquantum.space

to even speculate.

NVK (28:40.366)
I have three articles there, fully researched. The experts could not poke holes in them. And it really goes through everything. This is not my research, even though it's written in the first person there. I got the bots to do some very, very well-cited and researched papers, or essays, or whatever you want to call them,

and just review what’s out there. The reality is that an abacus and a dog can do better than a quantum computer.

So, yeah, I know there's a paper about that. Yeah.

This is true. Really, this is where we are. An abacus and a dog can do better math than a quantum computer at this moment.

So, yeah, it definitely puts things in a different light. But let's talk a bit about the AI side of things. So I know you're really into that also. You've got some websites; one of them is called learn to prompt org. And you've also been working on this llm-wiki. So tell us a little bit about that. Give us an overview of what you're doing there with AI.

NVK (29:52.674)
Right. So essentially, there was a... what's his name? Something Karpathy.

Andrej Karpathy, the guy who used to be, like, Tesla's AI guy or something. Yeah.

Yeah, so that guy put out a very, very interesting idea out there, right? Where you use wikis to essentially become the persistent memory for the LLMs, right? For people that don't know, the LLMs, the AI, whatever you want to call it, these large language models, they don't have real persistent memory,

right? They kind of live in a Memento sort of... remember Memento, the movie? Yeah. So every time they wake up, they have no memory, okay? So every time you create a new session, they are back to stock, which is the last date they were trained on, right? So then how do you use them? You tell them to essentially absorb context, right? The context may be the last session's history.

I haven't seen it, but I've heard of it.

NVK (31:00.886)
It might be the current repository you're doing coding in. It could be a list of things you tell it to start looking into. It could be "search the internet for this," that kind of stuff. So that's very restrictive. It really doesn't scratch the surface of how usable these models are. It's like having a Ferrari with no fuel in it.

Like, you put a drop in the tank, it can go very fast for a second, but then it dies out, right? So it's not very usable. And then AIs use vector databases, right? And vector databases are great at some things, say, for example, fuzzy search. They can find things that are related to each other because they find points on a map. But they're not very good at finding precise things, right? So that's why, for example, using time-series data on

LLMs, like "do my taxes," sucks, right? So then what people do is they create plugins, and they create things to try to infer back into the session, right? Like, the data, and try to guide this very smart engine, right, onto the path, right? Into the correct context. So...

What's interesting about the wikis is that wikis are essentially a graph database. And a graph database is more like a relational database. So you have a relation between topics, right? So you're no longer looking for math points on a graph. You're now inferring that, like, you know, the topic of quantum computing, and Bitcoin, and this other topic, they all sort of relate to each other, right? Because

you're literally giving it an index of a wiki that points to these things, and they each point to each other in different ways, right? So anyway, I got the bot, you know, the LLM, to go and start working on what would be a good wiki based on this guy's idea. And, you know, I've been working on that for months now.

NVK (33:20.77)
But that's where llm-wiki was born, llm-wiki.net.
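The wiki-as-graph-database idea can be sketched as an adjacency map: each page lists the pages it links to, and "related topics" becomes link traversal rather than nearest-neighbour vector search. The page names below are invented for illustration; this is a toy model of the concept, not llm-wiki's implementation.

```python
from collections import deque

# Each wiki page lists the pages it links to -- a tiny graph database.
wiki_links = {
    "quantum-computing": ["shor-algorithm", "post-quantum-crypto"],
    "post-quantum-crypto": ["sphincs-plus", "bitcoin"],
    "shor-algorithm": ["quantum-computing"],
    "sphincs-plus": [],
    "bitcoin": ["post-quantum-crypto"],
}

def related(topic: str, depth: int = 2) -> set[str]:
    """All pages reachable from `topic` within `depth` link hops (BFS)."""
    seen, frontier = {topic}, deque([(topic, 0)])
    while frontier:
        page, d = frontier.popleft()
        if d == depth:
            continue
        for nxt in wiki_links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen - {topic}

print(sorted(related("quantum-computing")))
```

Unlike a vector lookup, the relation here is explicit: a topic is "related" because some page actually links to it, which is the precision NVK is contrasting with fuzzy embedding search.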

And so I guess, in simple terms, it's like you can create your own llm-wiki, and it can ingest papers and debates and material, and then you can go back and ask about those things.

So the LLMs are designed to guess the next word. That's literally what they do. They're just very good at guessing. But I don't want them to guess. I want them to research and read and then do the conclusion, do the thinking, based on the actual data. So essentially what I want them to do is do everything from near first principles.

So then the way it works is it's agentic, right? So it has multiple agents, right? So for example, on the research side of llm-wiki, I have five researchers. I have the researcher that looks for the stuff. I have a researcher that looks for news related to the topic. I have a researcher that looks for the technical analysis of it. I have a researcher that looks at it from a different angle. And then I have a researcher that is a contrarian, right?

And then they all kind of keep on talking to each other while they’re researching, almost as if it was a research group, right? And then they come up with their research summary and all the stuff that they found and all the citations and everything, right? So for example, let’s say you’re interested in, I don’t know, using red light therapy, infrared and near-infrared light therapy, for yourself, right? You go into your wiki and say, like, you know, go research infrared and near-infrared light therapy.
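The multi-agent research pattern NVK describes could be sketched roughly like this. The role wording and `call_llm` are illustrative assumptions, not the LLM Wiki’s actual prompts; `call_llm` is a stand-in for any real chat-completion API:

```python
from dataclasses import dataclass, field

# Five researcher roles, per the description above; wording is illustrative.
ROLES = [
    "general researcher: find the core sources on the topic",
    "news researcher: find recent developments",
    "technical analyst: dig into mechanisms and data",
    "alternative-angle researcher: approach from another field",
    "contrarian: look for evidence the others are wrong",
]

def call_llm(prompt: str) -> str:
    """Stand-in for a real chat-completion call (OpenAI, Anthropic, etc.)."""
    return f"[findings for: {prompt[:40]}...]"

@dataclass
class ResearchGroup:
    topic: str
    notes: list = field(default_factory=list)

    def run(self) -> str:
        # Each role sees the topic plus everything gathered so far,
        # which is what lets the agents "talk to each other".
        for role in ROLES:
            prompt = (f"Role: {role}\nTopic: {self.topic}\n"
                      "Shared notes so far:\n" + "\n".join(self.notes))
            self.notes.append(call_llm(prompt))
        # Final pass: merge all findings into one cited summary.
        return call_llm("Summarize with citations:\n" + "\n".join(self.notes))
```

The shared-notes list is the simplest way to get the “research group” effect: each later agent, including the contrarian, reacts to what the earlier ones found.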

NVK (35:07.758)
And so the researchers are going to go out there, search the internet, find the papers. And the really cool thing is that it goes and reads the papers, not just some blog it finds; maybe the blog points to the paper, and then it tries to find more original sourcing. And then it has a confidence scale for the information that it found.

So it can infer, like, okay, I have high confidence that this is true, I have medium confidence, I have low confidence, right? Which is very important. So for example, in some topics, you may find things that are just, unfortunately, low confidence, because you couldn’t find deeper papers and more science on it, right? So then you can tell it, well, go try to find it from this other angle or whatever, right? And try to get to your truth. But what’s cool about this is you can use it for

many different topics. You can use it for technology as well. You can use it for anything, really. Because that’s how people think and how things are built. It’s kind of like an engineering principle applied to things. And then once you have the research done, you can do things like, for example, write me a 5,000-word essay explaining this topic to me. Right?
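The high/medium/low confidence grading mentioned a moment ago could be sketched like this; the source tiers, weights, and thresholds here are illustrative assumptions, not the LLM Wiki’s actual scheme:

```python
# Illustrative weights: closer to original research means a higher weight.
SOURCE_WEIGHT = {"peer_reviewed": 3, "primary_report": 2, "blog": 1}

def confidence(sources: list[str]) -> str:
    """Grade a claim by how close its supporting sources are to original research."""
    score = sum(SOURCE_WEIGHT.get(s, 0) for s in sources)
    if score >= 5:
        return "high"
    if score >= 3:
        return "medium"
    return "low"

print(confidence(["peer_reviewed", "peer_reviewed"]))  # high
print(confidence(["blog"]))                            # low
```

The point of making the grade explicit is exactly what NVK describes: a topic that only ever resolves to “low” tells you to go look from another angle.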

And it will use its research, it creates an outline plan that you can ask it to present to you before it writes, and then it writes it. And it can even change the style. Like, you can do research on writing styles. This is something I like. So I told it to do research on writing styles and also research on how not to sound like an LLM. So then I can say, like, apply this research to this thing,

Ha ha ha.

NVK (36:57.002)
also making sure that you read this paper on not sounding like an LLM. And it’ll produce something for me, right? So it starts to become very usable and useful for you to do things. And that’s how you end up using the LLM Wiki for all your usage of LLMs. Like, you’re no longer going to the GUI version of, like, ChatGPT or whatever, because it starts to look like a very crude

use of LLMs. It’s almost like, right now, people are using LLMs like toys. People go on the LLM and use it as if it was Google or something, right? And the LLM, like, fails miserably, and it’s not great. Now, if you start using it in these very powerful ways, with multi-agent things, it really shows the potential and how far along we are on this stuff. It’s remarkable.

Right, I guess the point here is you’ve got this website, llm-wiki.net, and as you mentioned, it’s building on what Andre did with the LLM wiki idea. So what’s specific about what you have done? Is it a specific flavor, or what’s the thing that you’ve done?

No, yeah, not exactly. I mean, maybe I’m, like, grateful I don’t know the guy. I’m not part of that industry. So I kind of went my own direction. This really takes his idea into how I think, right? So, you know, I created a way for you to plan things into specific specs, so RFCs or a spec.

You know, this thing has a librarian feature where, you know, it upkeeps all the data. It has a thesis mode, which I like. I use it a lot. So, you know, I think of something, you know, stupid, and I go, okay, this is my thesis: you know, like, I don’t know, we can put rockets in space using baseballs, whatever, something completely idiotic, right? And, you know, the LLM Wiki will think it through

NVK (39:07.31)
as a thesis. It’ll double-check that it understood your thesis, right? So you can be very terse. You can be very, sort of, as they call it, retard-maxing on it. And then it will decompose your thesis, and it will try to offer back what it understood, for you to confirm. And then it will start research on it to try to prove or disprove it, right? It will try to find truth in your thesis. Because, like, most of the time, people don’t start research just out of, like,

you know… sure, you may need research on something, but oftentimes it’s because you had a thought, right? And your thought is, like, you know, I think this, right? And so, like, I find that a better path to start research from. So there is that feature. The librarian feature keeps the library going. And I also introduced this freshness thing, because there are a lot of fields where the data changes all the time,

right? So AI or Bitcoin or whatever, like, things are changing at all times. So I support a way for you to qualify the data in terms of freshness. There is audit capability on the wiki, so you can audit the research more deeply, which is great. Lessons is one of my favorite features. So there are essentially learned lessons; the shortcut is LL. So oftentimes you’re using your chatbot,

I use either Claude Code or Codex or Pi. And you spend your two, three hours working through a problem, right? If you’re building something, or whatever you’re doing. And through that process, there were many times the LLM failed, you failed, but then eventually you come to a conclusion, right? Like, you fix the thing, right? So you have this wonderful knowledge

in the chat history, right, of the things that you’ve learned, right? So what you do is you go, like, wiki, lessons learned, right? It will read the history back, and it will extract all the lessons and all the things, like, you know, why did you think that, and why did you get that wrong? And it will also put that into that topic’s wiki. So then in the future, when you’re trying to do similar things…

NVK (41:31.274)
And people bump into this problem all the time. You try to do similar things in a new session of an LLM, and it’s failing on the same stupid stuff, and it’s really pissing you off. So with this feature, it’s all in the wiki. So when you go, like, okay, let’s go back to setting up this environment that we were doing with this problem, please review the wiki first. And it’s like, boom, it’s got this absolute golden-quality context

for you to use, so that you won’t destroy your stuff and, you know, won’t waste your time.
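The lessons-learned flow described here might look roughly like this in code. The transcript format and the failure-detection heuristic are assumptions for illustration; a real implementation would have the LLM do the extraction rather than a keyword check:

```python
from pathlib import Path

def extract_lessons(transcript: list[dict]) -> list[str]:
    """Walk back through a session and keep the resolved failures as lessons."""
    lessons = []
    for turn in transcript:
        # Hypothetical turn shape: a real version would ask the LLM to
        # summarize why something failed and what eventually fixed it.
        if turn.get("failed") and turn.get("fix"):
            lessons.append(f"- {turn['problem']}: fixed by {turn['fix']}")
    return lessons

def append_to_wiki(topic: str, lessons: list[str], wiki_dir: Path) -> Path:
    """Append the lessons to the topic's wiki page for future sessions."""
    page = wiki_dir / f"{topic}.md"
    with page.open("a") as f:
        f.write("\n## Lessons learned\n" + "\n".join(lessons) + "\n")
    return page
```

A future session then just reads the topic page first, which is the “review the wiki before we start” step NVK mentions.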

So I can see the value for a researcher. And I guess, pre-AI, people would do this with, you know, this idea of, like, Obsidian and this kind of quote-unquote second brain, and they’d put all their notes in there. This is kind of like a way to, maybe loosely, automate a lot of that with AI, instead of just directly querying with, say, ChatGPT or Claude or Gemini or whatever.

Everything.

Stephan Livera (42:34.21)
You are setting up your own knowledge base based on all these articles and the AI is kind of collating the material for you and rating it or ranking it based on relevance and things like that. And then you’re able to query for an output based on that. Is that loosely what’s happening?

Yes, and it’s for everyone and everything. What’s cool about the wiki is that we have topics of interest, right? And it’s surprising how many topics a person will have. And I’d say 60% to 80% of the time when people are querying their chatbots, they’re querying related to topics that they queried before.

Right? Maybe it’s, like, probably new ground on that topic, right? What’s cool is that if you do that using the wiki, now you are adding to your corpus of understanding, but also you’re not repeating the same research, you’re not repeating the same mistakes, right? Because, again, the LLMs hallucinate because they’re trying to guess the next thing; they’re not doing the research.

And that’s true for anything you use LLMs for.

So you think it might actually save you tokens?

NVK (43:56.038)
Absolutely. Yes. I mean, it’s kind of a wash, because, you know, the agents use a lot of tokens. Like, this is not a cheap sort of system to use, right? Because the context is big. It likes large-context models. And that’s why it’s so good. But I don’t have a concern about that. Things are going to get cheaper, faster, better. Like, you know, they are every day getting cheaper, faster, better.

Got it. And you can use different AI agents, right? You’ve got Claude Code, OpenCode, Pi. And I guess the idea is you might eventually start supporting even, like, the DeepSeeks of the world, that kind of thing.

So I support.

NVK (44:38.592)
no, no, that stuff is already supported. So the support really is for the harnesses, the coding agents, not for the models. So you can use any model you want with this stuff, as long as they’re generally fast and they have larger context windows. But the actual support itself is for the harness, so the harness knows how to call the wiki agents and stuff.

I really support all the open ones that you can use with any model, like OpenCode, Pi, or just any LLM agent with an agents file. That’s all there. And it works. The repo has now, I don’t know, like 40, 50 forks and 300-plus, almost, yeah, 350 stars on it. So people are using it. That stuff works. And people use it with very different sorts of agents.

Interesting, yeah. Let’s get now to your broader one, where you’ve got learn2prompt.org. So as I’m understanding this, this is more like practical guides for using AI coding agents, but securely and effectively. Can you elaborate a bit on this? What’s the goal with learn2prompt.org?

So, you know, as I started playing around with this stuff, I’m like, people are just running, essentially, like, loose dogs on their user space that’s full of secrets on their computers, especially as devs, right? It’s just ludicrous. It is absolutely ludicrous to run, you know, Claude Code on your computer like that. It’s crazy.

The thing could go and look at your SSH keys. It’s really bad. So I started doing research into how, because, you know, the best way to run LLMs like that is to really use a virtualized way, right? So you use a VM on your computer or something like that. But, you know, sometimes you do want it on your main machine, right? So I started doing some research on how you can accomplish that. And

NVK (46:48.15)
And also I bumped into nono, which is a great little library, like a tool that creates sandboxes for

Sandboxing what you’re trying to work on. And I guess the idea is also for very security-conscious, security-relevant things, like your SSH keys and, I guess, your passwords and things.

That’s right. So essentially, you can do a few different things. So I’m doing a chain of tools there. So first of all, you pin down the tools you’re using. So you have a hash of the tools and a hash of their path. And then you have envchain, which is like a cool little tool to inject the secrets into the session. So for example, you need API keys for these tools to use. So you can inject them that way. I created a fork because that one was not maintained. So it’s envchain-x.
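The pin-down step can be sketched with standard hashing. This is an illustrative version, not CoinKite’s actual tooling, and it only covers the hash pinning; envchain handles the separate job of injecting secrets from the OS keychain:

```python
import hashlib
import json
import shutil
from pathlib import Path

def pin_tools(names: list[str], lockfile: Path) -> dict:
    """Record the absolute path and SHA-256 of each tool in a lockfile."""
    pins = {}
    for name in names:
        path = shutil.which(name)  # assumes the tool is on PATH
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        pins[name] = {"path": path, "sha256": digest}
    lockfile.write_text(json.dumps(pins, indent=2))
    return pins

def verify_tool(name: str, lockfile: Path) -> bool:
    """Re-hash the pinned binary and compare against the lockfile."""
    pin = json.loads(lockfile.read_text())[name]
    current = hashlib.sha256(Path(pin["path"]).read_bytes()).hexdigest()
    return current == pin["sha256"]
```

The lockfile then acts as the “source of truth” NVK mentions later: who is running, which version, and at which absolute path.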

And so this is like AI, but done in a more security-paranoid way.

NVK (48:00.782)
Yeah, pretty much. But the problem that I arrived at is that it’s so bad. You have zero security right now running these things on your computer. Zero. Because Claude is known for essentially ignoring the .env instructions. For example, you go like, ignore these security files, ignore these security folders, don’t read them. You can put that in the Claude spec file, in the config file.

But these tools, once they reach a certain threshold of trying things that don’t work, they literally have an instruction to ignore their own configs.

Yeah, that’s concerning for people, because a lot of people, especially with OpenClaw, it kind of opened up this idea of just yoloing things on your own machine, and connecting not only that, but connecting it to your real world, whether that’s your email or your Telegram and Signal and whatever. So I guess that’s an interesting thing people will have to navigate: they want the AI tool to make their life efficient, but on the other hand, how do you do it in a secure way?

I mean, like, listen, you have your, like, Sparrow files on your computer, you know. Maybe you have Sparrow open, you know, and things are decrypted in memory, and, you know, the machine is uploading everything. And the other problem, too, is that all these LLMs, the way that they function is they will tokenize whatever context, right, be it your drive, your files, whatever they’re working on. And all that stuff gets uploaded as tokens. So, like, you are uploading all your data

to OpenAI, to Anthropic. And they can reverse that process and read those files. So, like, you have this thing with no guardrails on your computer, with the capacity of uploading. It is insane when you actually start looking into it. It is absolutely insane. So anyway, I created a nice little stack there. And the cool thing is you can ask your agent to set up the stack for you. Like, you don’t have to understand

Stephan Livera (49:40.544)
input tokens.

Stephan Livera (49:53.858)
full permission off you go

Stephan Livera (50:07.854)
Right, because you can point it to that site.

I literally have, I wrote at the end of these articles, essentially, an instruction; like, this was written in a format that agents can read, so that they can set it up for you. You can say, like, hey, go to learn2prompt.org and help me set up my environment, just like that.

And so the idea is you could set this up on a VPS, it could be in a virtualized system, or, if you’re yoloing, just straight on your actual physical box.

I mean, if you want to use a virtual machine, just like a virtual setup, then who cares? Then on the virtual machine you can do whatever you want.

Got it. Okay. So I guess you’re saying this would be mainly for if, like others, you’re trying to secure it on your own bare metal at home, kind of thing.

NVK (50:58.606)
So essentially, the pin-down step just creates some sanity with hashes of each part of the stack, right? So you have a source of truth of who is what, right? Like, who’s running, and what version is running, and what the absolute paths of these tools are, right? And then that gets passed on to envchain-x, which essentially is the decider, you know, of whether each tool should see each key, and it uses the OS-level keychain and hash signature checks to do that. This is really cool. And then nono, well, it enforces which paths of the system each agent should have access to. So it’s nice and tidy; everything gets very specific access.
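The path-enforcement idea can be illustrated with a toy policy check. Real sandboxing, as nono does it, happens at the OS level; this only shows the allowlist logic of resolving a requested path and checking it against the directories an agent is permitted to touch:

```python
from pathlib import Path

def is_allowed(requested: str, allowed_roots: list[str]) -> bool:
    """Return True only if the resolved path sits under an allowed root."""
    target = Path(requested).resolve()  # normalize, follow symlinks if any
    for root in allowed_roots:
        try:
            target.relative_to(Path(root).resolve())
            return True
        except ValueError:
            continue  # not under this root; try the next one
    return False

is_allowed("/home/dev/project/src/main.py", ["/home/dev/project"])  # True
is_allowed("/home/dev/.ssh/id_ed25519", ["/home/dev/project"])      # False
```

Resolving before comparing is the important part; otherwise a `../../.ssh` style path would slip past a naive string-prefix check.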

You’re trying to lock it down or sandbox it. That’s right. Okay, interesting. And so, any closing thoughts, or what should people think about with doing AI coding and AI use?

I mean, yeah, like, you know, either do it on a separate machine or lock it down. It’s incredibly powerful. These tools are super fun now. You know, you can build websites. Like, I use LLM Wiki to build the learn2prompt.org website. So I essentially told it to keep track of all the things I was doing on the computer. And then, you know, everything got annotated on the wiki. And then I said, without doxxing my system,

create a cool website for Learn to Prompt so other people can take advantage of this. And then it generated this website. And then every time I do it, I just go, like, now go and update the website. Like, how cool is that, right? We can produce this knowledge for other people to have, you know, with minimum effort. So things are progressing fast. I highly recommend people play around with it. Play with the LLM Wiki. If you don’t like mine, there’s a bunch of others around as well.

NVK (53:09.134)
But it does, like, increase the quality of the output and the output capacity by, you know, like a hundred X. You know, stop using AI like you’re trying to Google something; you’re using the tool wrong, right? So that’s it on the AI stuff. Just have fun. It’s a fun space right now.

Stephan:
Awesome. Well, yes, for those of you listening, the links are learn2prompt.org, llm-wiki.net, bitcoinquantum.space, and of course, you can check out the Coldcard over at coinkite.com. Okay, thanks for joining me on the show today.
