Sharding and RChain

Greg Meredith 00:04
Okay, so we’re just going to talk a little bit about RChain sharding, so that people aren’t laboring with a lot of misconceptions. What’s really important to understand is that without sharding, you can’t get to global scale networks. It just isn’t going to happen. There are computational limitations that with classical computing are going to be really, really hard to overcome without tons and tons and tons of effort.

Greg Meredith 00:38
The RChain mainnet is a full network in and of itself, as we contemplate somewhere between 100 and 1000 nodes in this network. You’ve already seen, if you have seen Theo’s demos, that even with the existing 0.13 Alpha 3 code, this network is fast enough to upload video to any shard and stream the video from the shard. However, if you get the entire internet treating RChain mainnet as if it were YouTube, well, anyone who has used YouTube and uploaded videos to it knows that with all the resources that Google throws at YouTube, and they are considerable, it still takes quite a long time to get videos up to YouTube, and that network is devoted entirely to that problem. And so if you were also to try to put financial transactions, for example, just simple REV transfers, through that network, they would all be stuck behind millions of videos being uploaded every day.

Greg Meredith 02:04
And that’s not sustainable. It’s not reasonable. Google doesn’t route all of the YouTube video requests through the Gmail network. It doesn’t work that way. It doesn’t route all the YouTube video requests through its Google Maps network. It doesn’t work that way. That would be silly. Instead, each of those services has a dedicated network of servers that are serving up those services. That’s how the Internet scales today. And that’s how the blockchain has to scale. However, it doesn’t mean that REV is going to be devalued or not useful, or anything like that. So with a network like RPC, which is devoted to a variety of digital assets, those assets can be shunted to this network, and not involve RChain mainnet, except in certain cases. Right?

Greg Meredith 03:20
So this is what we want to talk about – those cases. So I’ll just use these but then I’m going to add some additional cases. So RChain has spent some time to make sure that there are at least two active shards. And this is a part of our promise to the community that RChain – the 1.0 release – would support sharding. Right?

Greg Meredith 03:46
So we have to deliver actual shards and show that sharding works. That’s a part of what we promised the community, and that’s one of the reasons that we’re doing it. There are also important economic reasons. But now, when someone is just involved in the manipulation of digital assets, it is possibly the case that all of those transactions would remain here. But if you’re talking about real world use cases, for example, a marketplace that is utilizing RPC as a storage network, almost certainly, that market network wants to defray costs of using RPC by using advertisement.

Greg Meredith 04:46
So a user request then goes into the marketplace. And that marketplace ultimately needs to hit the storage network where the digital assets are, right? This costs the marketplace staking token – costs some resources to get down here. So they want to defray the cost of that, right?

Because it would be ridiculous for every data transaction that this does, on behalf of the user, to go back to the user and say: “Hey, pony up…, cough up some more. Okay, cough up some more. Sorry. Now, one more, one more page of data requests, you got to cough up some more.” Right? That would be a terrible user experience. So what’s going to happen in realistic applications is that the user is going to be presented with a variety of sponsored content, which subsidizes their use of the marketplace. Where’s that sponsored content coming from? That’s coming from over here. (DAASL).

Greg Meredith 06:09
So when RPC serves up responses to the data requests, they’re also pulling sponsored content from over here. So that means that any realistic user request is actually hitting two shards. And when it’s hitting two shards, and those transactions need to be coordinated, they have to go through RChain mainnet. That means that every user request to this marketplace has to use REV. Now as we add more and more shards, there are more and more interesting capabilities: for example, let’s say, air transport, or hospitality, like, you know, Airbnb or hotels or things like that, or ground transport. Now let’s imagine that we have a user who’s talking to a travel service. A user who is making a realistic booking request at a travel service is going to hit not just one shard, but all three of these shards. And that needs to be a coordinated transaction.

That means it goes to mainnet. That means that every user request for booking a trip that involves two or more of these kinds of back-end chain services needs to go to the mainnet, which means every user request like that costs REV. So this is important, because what it does is it gives us both the ability to scale and a reasonable economic model for the mainnet.

Okay, now, let’s think a little bit about the existing internet today, and how its existing architecture relates to possible sharding architectures. So the existing internet today is largely shallow, in the sense that you have a lot of top-level services. So from the point of view of this picture, top-level services to the user are things like the marketplace or the travel service that hit multiple different back-end shards. So a top-level service might be Amazon, you know, a top-level service might be Travelocity, or Kayak, or something like that. So users are entering in this way. And they hit back-end things like this. In other words, the vast majority of the internet architecture is broad and shallow. Lots of branches off of a single root, not too deep. The depths have to do with private stuff, not public stuff. So the reason this becomes important from an economic model point of view is that it is absolutely reasonable for there to be sub-shards of RPC. And it’s reasonable for RPC to act in the same way as RChain mainnet for coordinating some of those sub-shard activities.

Greg Meredith 09:55
In which case, you don’t spend REV for those, but most of those have to do with private interactions here, as opposed to the sorts of public stuff that needs to be coordinated at this level. So RPC is not really going to be a competitor, or a risk factor, for RChain. And largely this has to do with the way these kinds of networks tend to evolve. But also for another reason. As you go deeper and deeper into the tree, what you’ll find is that you have more and more niche resources, and fewer and fewer validators that are serving them up, at least in terms of the heterogeneity or diversity of the validators. In which case, from the user’s perspective on security, the further down in the tree you go, the less and less secure the network is. Only specialized users want to go deeper into the tree. The average user, the people who are booking travel and buying things off the marketplace, stays at this shallow level, which means the bulk of the market spends REV. Specialized users will be down in the weeds here, and they’ll be spending staking tokens on these, you know, lower-level shards. But that’s exactly right. It’s exactly as it should be, in order to get the network to scale. So we’ll stop there before I go on. Questions, comments? Does this make sense?

Raphael 11:47
Yes, it does. So yeah, the deeper we go into the levels, the less security or the fewer validators there are, or the more shady it gets, or maybe the more private…

Greg Meredith 12:05
The more private they get, the less heterogeneity there is. So you’re not getting this broad spectrum of community who are validating the network. You have one or two players, right? So they might have 50 validators each, but there are only two players. So the possibility for collusion amongst those validators is very high. And so they have to provide some other measure of integrity, right? Up here, you know, at this level, and at this level, you’re going to have a lot of heterogeneity, a lot of different kinds of parties who have a vested interest in validating these networks. And so they all watch each other’s back, they’re making sure that nobody’s doing any shady stuff, and the coordination cost of all of them colluding to cheat the user is too great for that to happen. So lower down, you get special purpose compute (potentially higher performance compute) but less security. Higher up, you get more security, but you don’t get all the special purpose compute that you could get at the lower levels. That’s the architecture. That’s what we’ve been describing since 2017, and what I’ve been contemplating for much longer than this. Hopefully, this is making sense to folks.
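
As a rough illustration of the coordination rule described above, here is a minimal Python sketch. It is not RChain code: the tree layout, the sub-shard names, and the functions `coordinator` and `fee_token` are all hypothetical. It assumes a request touching several shards is coordinated by the deepest shard that sits above all of them, and that REV is charged only when that coordinator is mainnet.

```python
# Hypothetical shard tree (child -> parent). The sub-shard names are
# made up for illustration; only mainnet, RPC and DAASL come from the talk.
PARENT = {
    "mainnet": None,
    "RPC": "mainnet",
    "DAASL": "mainnet",
    "RPC-sub-1": "RPC",
    "RPC-sub-2": "RPC",
}


def path_to_root(shard):
    """Return the shards from `shard` up to the root, deepest first."""
    path = []
    while shard is not None:
        path.append(shard)
        shard = PARENT[shard]
    return path


def coordinator(shards):
    """Return the deepest shard that is at or above every shard touched."""
    shards = list(shards)
    common = path_to_root(shards[0])
    for s in shards[1:]:
        ancestors = set(path_to_root(s))
        common = [a for a in common if a in ancestors]
    return common[0]


def fee_token(shards):
    """Assumption: REV is charged only when mainnet coordinates the request."""
    c = coordinator(shards)
    return "REV" if c == "mainnet" else f"{c} staking token"


print(fee_token(["RPC", "DAASL"]))          # REV
print(fee_token(["RPC-sub-1", "RPC-sub-2"]))  # RPC staking token
```

In this toy model the marketplace request that hits both RPC and DAASL pays REV, while traffic confined to sub-shards of RPC is coordinated by RPC and pays its staking token.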

Greg Meredith 13:29
Okay, so the final point is, what is the relationship between block merge and sharding? So contracts are listening on channels. The basic mechanism of Rholang and RSpace is that contracts are listening on channels.

Okay, so we’re gonna let this shape represent the contract for us. And so we have a contract that’s listening on a channel and another contract that’s listening on a channel. In fact, let’s use these to represent transactions, not contracts. So these two transactions involve these two distinct channels. What block merge can do is notice whether or not these two channels are the same. If they are the same, then these two transactions have to be sequentialized. If they are not the same, then it is safe to run them side by side. So they can run at the same time. And we can do all of that sort of calculation inside a shard, but the one thing that we have to do across all the transactions is we have to charge REV or the staking token associated with this shard.
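
The channel check described here can be sketched in a few lines of Python. This is only an illustration of the idea, not RChain’s block merge implementation: the `Tx` type, the `conflicts` test, and the greedy `schedule` function are hypothetical, and the sketch ignores cost accounting and the ordering of conflicting transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """A transaction and the set of channels it touches (hypothetical type)."""
    name: str
    channels: frozenset


def conflicts(a, b):
    """Two transactions conflict when they share at least one channel."""
    return not a.channels.isdisjoint(b.channels)


def schedule(txs):
    """Greedily group transactions into batches.

    Transactions in the same batch share no channels, so they could run
    side by side; the batches themselves run one after another.
    """
    batches = []
    for tx in txs:
        for batch in batches:
            if not any(conflicts(tx, other) for other in batch):
                batch.append(tx)
                break
        else:
            batches.append([tx])
    return batches


t1 = Tx("t1", frozenset({"x"}))
t2 = Tx("t2", frozenset({"y"}))
t3 = Tx("t3", frozenset({"x"}))

# t1 and t2 use distinct channels; t3 shares channel "x" with t1.
print([[tx.name for tx in batch] for batch in schedule([t1, t2, t3])])
# [['t1', 't2'], ['t3']]
```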

Greg Meredith 15:00
So every single transaction is going through a common resource. And that common resource is the cost accounting. So the processing of all of these transactions goes through a common resource. And because they have to go through this common resource, that means that there’s some slowdown, because everybody’s touching this common resource, which is the cost accounting. Now we’ve done as much as we can to make it so that this is as efficient as possible. But there’s no way around it. Every shard has a common resource, which is the cost accounting that is involved in charging for every single transaction.

Now, if you have two of these, so this is now shard A, and this is shard B. These two transactions don’t go against this cost accounting resource. So you’ve now basically halved the number of things that are going through this cost accounting resource, right? So you group all of these with this one, and all of these with this one. And it’s the same mechanism.

Greg Meredith 16:25
We have essentially taken the namespace and bifurcated it. So in this case, it’s the names of the channels associated with the cost accounting that have been bifurcated. So we have some names for this cost accounting, some channels for this cost accounting and some channels for this cost accounting. And they are separate. And because they’re in two different shards, that’s the guarantee that they’re separate. So it’s the same mechanism as in block merge, just applied to the cost accounting infrastructure, which allows for scaling. So if you have 10 million transactions going through here, and 10 million transactions going through here, this 10 million doesn’t load up on this resource. So it’s exactly the same mechanism, we’re just applying it at a higher level. So that’s why sharding is necessary. We basically need to do groupings of names and say, we’re going to apply consensus over this grouping, and this cost accounting administration is managing this group of names. And over here, we have another group of names, and this cost accounting mechanism is managing or administrating this group of names. So it’s exactly the same mechanism as block merge, but it’s applied at this higher level.
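
To make the load argument concrete, here is a small Python sketch that counts how many transactions hit each cost accounting resource before and after the namespace is split. It assumes, purely for illustration, that a shard owns every channel whose name starts with a given prefix; the prefixes, shard names, and functions are hypothetical and do not reflect how RChain names channels.

```python
from collections import Counter


def shard_for(channel, shard_prefixes):
    """Return the shard whose namespace prefix matches the channel name."""
    for shard, prefix in shard_prefixes.items():
        if channel.startswith(prefix):
            return shard
    raise KeyError(f"no shard owns channel {channel!r}")


def accounting_load(channels, shard_prefixes):
    """Count how many transactions each shard's cost accounting must process."""
    return Counter(shard_for(c, shard_prefixes) for c in channels)


# One channel per transaction, for simplicity.
channels = ["a/pay1", "a/pay2", "b/pay1", "b/pay2", "b/pay3"]

# A single shard owns the whole namespace: all five go through one resource.
print(accounting_load(channels, {"single": ""}))

# Two shards split the namespace: each accounting resource sees only its own.
print(accounting_load(channels, {"A": "a/", "B": "b/"}))
```

With one shard, the single accounting resource handles all five transactions; with the namespace split, shard A handles two and shard B handles three, and neither loads the other.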

Greg Meredith 17:58
So it’s the same principles that are being employed throughout the entire network to create the scaling effect that is RChain. But we must have this kind of capability, or the network cannot scale. Let’s think about it for a minute, right? So imagine the state of California, the world’s fifth largest economy. So imagine trying to put everybody’s grocery purchases, their movie tickets, their medical expenses, their utility bills, their automobile expenses, bus tickets, airfare, all of the citizens of California, all of those transactions against a single resource. Never going to work. It can’t be done except possibly with quantum computing, but that scale of quantum computing is at least 50 years out. So that is not feasible. And anybody who tells you that it is, is probably also then going to try to sell you a bridge somewhere.

Jacob B 19:33
Yeah, so is the bottleneck the cost accounting, and if so, give us a sense of how much that throttles down the…..

Greg Meredith 19:50
Yes, you have to introduce cost accounting in order to prevent denial of service attacks. That’s what everybody on the internet who has an Internet-facing API does. You have to have tokens. Whether you’re calling Google’s or Amazon’s APIs, they provide you tokens. Now, they’re not cryptographic tokens, like they are in blockchain. But everybody provides tokens to throttle access to their Internet-facing APIs. Got to do that. So that’s thrust upon you from a security point of view. But now, once you have this kind of cost accounting introduced for that particular sub-network, every single transaction must go through that. And again, this is just basic information theory, there’s no way around that architecture.
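
The throttling idea can be illustrated with a toy metered API in Python. This is a generic sketch, not any provider’s API and not RChain’s cost accounting: the `MeteredApi` class and its flat per-call cost are assumptions. It shows why every call has to pass through the one shared balance table.

```python
class MeteredApi:
    """Toy API that charges a flat token cost per call (hypothetical)."""

    def __init__(self, cost_per_call):
        self.cost_per_call = cost_per_call
        self.balances = {}  # shared resource: every call reads and writes it

    def deposit(self, caller, tokens):
        """Credit tokens to a caller's balance."""
        self.balances[caller] = self.balances.get(caller, 0) + tokens

    def call(self, caller, payload):
        """Serve the request only if the caller can pay for it."""
        balance = self.balances.get(caller, 0)
        if balance < self.cost_per_call:
            raise PermissionError(f"{caller} has insufficient tokens")
        self.balances[caller] = balance - self.cost_per_call
        return f"handled {payload}"


api = MeteredApi(cost_per_call=2)
api.deposit("alice", 5)
print(api.call("alice", "request 1"))  # handled request 1
print(api.call("alice", "request 2"))  # handled request 2
# A third call would raise PermissionError: only 1 token is left.
```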

Greg Meredith 20:54
Now, you can do tricks, and believe me, we have done a lot of tricks to make that fast. And I don’t have time to go into all of the vagaries of how to do that. And there are a lot more that we can employ. But we can’t spend five years just optimizing the hell out of cost accounting. We have to build the whole thing, and then optimize different pieces over time. So the cost accounting currently makes it at least half as fast as it could be. In point of fact, it’s actually hundreds of times slower. And the way you get to those numbers is, if you’re familiar with databases and this sort of thing, you can easily get TPC-C benchmarks in the millions of transactions per second, right? So 10 or 20 years ago, people were getting TPC-C benchmarks in the millions of transactions per second. Because they can centralize all of the resources that we distribute.

Greg Meredith 22:21
So we’re distributing over a network of servers, which requires a bunch of sanity checks across the network of servers. So they jettison that overhead. And then they don’t have the cost accounting because they’re inside a trust barrier. And so since they don’t have this overhead, and they don’t have the sanity checks, which are essentially the consensus across the whole network of servers, they can get much, much greater transaction volumes. And that’s why it’s a basic fact that if you’re not crossing trust boundaries, you have to think very hard about whether or not you want a blockchain.

Greg Meredith 23:09
Now, RChain ameliorates that to some extent. So there’s a mitigation of that, because RChain gives you more scalability as you add nodes. So essentially, as long as most of the transactions are isolated, which is going to be the case, yes, you will have to pay this overhead cost. But we can effectively assign a physical thread to each transaction, if we have that many physical threads. So if you have at least as many physical threads, or as many physical threads plus one, as there are nodes in the network, then you can get maximal throughput. And so if you keep adding nodes, and you keep adding physical threads, RChain will just keep getting better, faster, under the assumption that, you know, 90 plus percent of your transactions are isolated. Does that make sense?

Jacob B 24:12
Yes, yeah, I think so. I think the architecture is good for, like, distributing the wealth across the globe, unlike networks like Bitcoin and Ethereum, which are very much flatter. So I think it’s a good thing. But I think the big issue RChain is having is going to market, building the community and the dApps, and I know we’re working on that, but it would be sad to see it not get there.

Greg Meredith 24:54
Oh, I feel confident that we’ll get there. We have to get this vote out. But the main discussion here is, it has come to my attention that despite the fact that we have been talking about this architecture for five years, the vast majority of certain sub-sectors of this community do not understand what the architecture or the value prop is. So a good portion of the RChain community seems to think that the token is a store of value token, and does not understand what the economic and scaling issues are. And so I am making this point, and we’re recording it now, so that it can be made available to people to understand how RChain achieves scale, and what that means in terms of the economic model. Because most of this stuff is, you know, distributed computing 101. We’re not doing anything that isn’t, like, now a part of folklore in distributed computing. I am astounded, to put it bluntly, at the proposed blockchain architectures, because they do not acknowledge decades of understanding of distributed computing. So to me, you know, we’re just starting from what we already know, and making that work.

Jacob B 26:47
But a lot of people don’t understand a lot of things. It’s just the way it is.

Greg Meredith 26:53
Yeah, yeah.

Rich H 26:54
I think, Greg, you just said something very pertinent to anybody watching this, which is that the utility of this network, the fact that it’s a utility network, not a store of value network, is really the key point. The value of the mainnet REV that you see on screen at the top there is because we’re going to have so many users using the layers underneath. …..I think if you can somehow drive that point home, that we just need to continue because we can deliver this utility network, and just like Amazon or WeChat or anything else, the reason REV will rise is because there’s so much throughput going through mainnet, but we need the money to finish it. It’s complicated.

Rich Holden