Greg Meredith discusses cost accounting, Testnet-3, and RChain’s unique features with Isaac DeFrain.
Transcript
Greg: We were initially going to talk about Reflective Proof Theory, which is moving along very nicely. Since we just released Testnet-3, I wanted to talk about the main feature, what it means for the network, and how it relates to the overall arc of development.
Isaac: Sounds good.
Greg: Do you remember a while back when we talked about Polkadot (and some other networks) and tried to do an analysis and comparison?
Isaac: Yes.
Greg: One of the things that we came upon was a guiding principle that it’s all about the data. If you make a network that can demonstrably handle data, then you can specialize into payments, but it’s many times harder to go the other way around. If you make a network that’s optimized for payments and then try to expand into data, you’re going to run into a lot of issues that just aren’t there in payments networks.
Isaac: That makes sense.
Greg: If we start there, then the first thing that you want to do is you want to make a kickass store. Ravi actually picked this up right away. If you think about Rholang plus RSpace, this is an incredibly fancy NoSQL database. It’s a leg up on all the existing NoSQL databases because there are three things that it offers.
Number one: it stores continuations (i.e. code) as well as data. It’s a big step forward if you want to allow for fancy things. This hearkens back to stored procedures in SQL. Everybody says don’t use stored procedures. Then when they optimize their store, they end up using stored procedures. It does it more cleanly so that you don’t run into a lot of the issues that stored procedures raise for SQL databases. The other thing is that most of the NoSQL databases don’t have a query language that has the semantic rigor of SQL.
That’s really problematic for a number of reasons that I’m sure anyone who has used these kinds of stores in production can attest to. You don’t actually know what your code will do until you run it. With SQL databases, the query language has really crisp semantics. You know what your queries should do. There is often a bit of a gap between in principle and in practice. You have to account for that whenever you’re building real systems. In general, people really benefit from the crisp well-definedness of SQL. We offer that.
In particular, Rholang, if you look at it correctly, is a very fancy query language. This is because the Rholang execution semantics map directly onto the RSpace storage semantics. The two semantics are in lockstep. This is the point of correct-by-construction. We extract the Rholang semantics from the RSpace semantics or, vice versa, we extract the RSpace storage semantics from the abstract Rholang semantics. Because they’re isomorphic, there are all kinds of advantages, such as no code injection attacks and other kinds of things. That’s another benefit that you get from this very fancy store.
Finally, it offers concurrent execution. SQL offers concurrent execution as well, but the programmer has no control over the concurrency. There are no concurrency mechanisms (other than locking) that are directly available to the programmer. Rather than using a locking discipline, we offer the comm rule from the Rho calculus as the basic transactional mechanism and the basic concurrency mechanism all rolled into one.
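A minimal sketch of a store that holds continuations as well as data, with a single rule that matches a send against a receive. The names TinySpace, produce, and consume are hypothetical and chosen for illustration; this is a toy model of the idea, not the RSpace implementation.

```python
from collections import defaultdict


class TinySpace:
    """Toy store: each channel holds waiting data or waiting continuations."""

    def __init__(self):
        self.data = defaultdict(list)           # channel -> pending messages
        self.continuations = defaultdict(list)  # channel -> pending code

    def produce(self, channel, message):
        """Send: run a waiting continuation if there is one, else store the data."""
        if self.continuations[channel]:
            continuation = self.continuations[channel].pop(0)
            return continuation(message)  # the comm event: send meets receive
        self.data[channel].append(message)
        return None

    def consume(self, channel, continuation):
        """Receive: run on waiting data if there is any, else store the code."""
        if self.data[channel]:
            message = self.data[channel].pop(0)
            return continuation(message)
        self.continuations[channel].append(continuation)
        return None


space = TinySpace()
space.consume("greeting", lambda name: f"hello, {name}")  # stored as code
result = space.produce("greeting", "Isaac")                # triggers the stored code
```

In this sketch the match between a send and a receive is the only way state changes, which is the sense in which one rule can serve as both the concurrency mechanism and the transactional mechanism.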
The other thing is that you can just view it as a store. In fact, if you look at the arc of the development of RChain, this is the first thing we did. We built RSpace, and we built Rholang and the Rholang interpreter. That represented one of the major releases. Not quite node zero, but close to it. Does that make sense?
Isaac: Yes. I never thought about Rholang as a query language, but that makes sense the way you described it. You’ve talked about it being a specification language before, which I thought was a really cool idea, but I never thought about it as a query language.
Greg: With the pattern matching, it’s a very fancy query language. You can go into the structure of the data at the keys. It ends up being a very powerful query language.
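To illustrate querying by structure, here is a small sketch of a structural pattern matcher over stored records. The matcher, the WILDCARD marker, and the sample records are all assumptions made for illustration; Rholang patterns are richer than this.

```python
WILDCARD = object()  # hypothetical marker meaning "match anything here"


def matches(pattern, value):
    """Return True if value has the shape described by pattern."""
    if pattern is WILDCARD:
        return True
    if isinstance(pattern, dict):
        return isinstance(value, dict) and all(
            key in value and matches(sub, value[key])
            for key, sub in pattern.items()
        )
    if isinstance(pattern, (list, tuple)):
        return (
            isinstance(value, (list, tuple))
            and len(pattern) == len(value)
            and all(matches(p, v) for p, v in zip(pattern, value))
        )
    return pattern == value


def query(store, pattern):
    """Return every stored record whose structure matches the pattern."""
    return [record for record in store if matches(pattern, record)]


store = [
    {"artist": "A", "tags": ["jazz", "live"]},
    {"artist": "B", "tags": ["rock", "live"]},
    {"artist": "A", "tags": ["jazz", "studio"]},
]
live_by_a = query(store, {"artist": "A", "tags": [WILDCARD, "live"]})
```

The pattern reaches into the nested structure of each record, so the query is expressed as a shape rather than as a sequence of lookups.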
That’s the first step of development. You can think of that as basically a standalone kind of thing. You run Rholang plus RSpace on a single computer and then clients can go touch that store. They can use that store as a standalone instance for storing and retrieving data.
The next step is to have multiple instances of these stores. What we’re marching toward is a replicated, fault-tolerant store. We put a comms layer between two instances. What we choose for that comms layer is a state-of-the-art implementation that has some nice features with respect to the transport protocol that it is layered on top of.
Essentially, there’s no rocket science there. There’s no improvement or pushing forward on the state of the art. It’s just: take that and make it work, with some improvements with respect to the transport layer. Which, again, is not rocket science; it’s just good factoring and good abstraction. That’s what Rholang plus RSpace plus comms is. That again corresponds to a particular release within the RChain development.
Isaac: The comms layer doesn’t have anything to do with consensus. This is all the domain of Casper.
Greg: Yes. Consensus is all in Casper. Right now, you have two instances that are connected. If they stomp on each other or they have disagreements about source code, that’s up to the programmer to deal with. But that’s a very good point. It segues into the next step.
The next step is you build a consensus layer. If you want all of those instances to essentially be coordinated so they all agree about whatever it is they’re storing, then you need a consensus layer. That’s what CBC Casper is providing. For those people who haven’t gotten into the details of Casper, the secret ingredient to Casper is the justification structure of the blocks. Because a block has justification pointers to previous blocks, that’s how you can detect equivocation and other kinds of things, and thereby deliver safety.
That same structure is also used to impose synchronization constraints and fairness constraints. That’s the distinguishing aspect of Casper that makes it perform the way it performs. Casper could be used to get consensus on anything. It doesn’t really care what you get consensus on. Casper’s role in RChain is to make sure that all the nodes have agreement on the winners of races. That’s all that’s necessary to deliver this fault-tolerant distributed computer.
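As a rough sketch of how justification pointers expose equivocation: two blocks from the same sender equivocate when neither one acknowledges the other anywhere in its justification history. The Block type and the helper names below are hypothetical, and the model is heavily simplified compared with a real Casper implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    block_id: str
    sender: str
    justifications: tuple = ()  # ids of blocks this block claims to have seen


def ancestors(block_id, blocks):
    """All block ids reachable by following justification pointers."""
    seen, stack = set(), [block_id]
    while stack:
        for parent in blocks[stack.pop()].justifications:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def equivocates(a, b, blocks):
    """Two blocks from one sender, neither of which acknowledges the other."""
    if a.sender != b.sender or a.block_id == b.block_id:
        return False
    return (a.block_id not in ancestors(b.block_id, blocks)
            and b.block_id not in ancestors(a.block_id, blocks))


genesis = Block("g", "alice")
honest = Block("a1", "alice", ("g",))
fork_x = Block("b1", "bob", ("g",))
fork_y = Block("b2", "bob", ("g",))      # bob ignores his own b1
follow_up = Block("b3", "bob", ("b1",))  # properly builds on b1
blocks = {b.block_id: b for b in (genesis, honest, fork_x, fork_y, follow_up)}
```

Here `equivocates(fork_x, fork_y, blocks)` is true because bob produced two blocks that do not see each other, while `equivocates(fork_x, follow_up, blocks)` is false because the later block justifies the earlier one.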
Isaac: Just make sure we’re using resources in a consistent manner.
Greg: That’s exactly right. What you end up with is Rholang plus RSpace plus comms plus CBC Casper. That gives you a distributed fault-tolerant data store. In the world of distributed systems, that’s really cool. Companies like SAP build these kinds of things and tout them to the world. SAP HANA is exactly that. But it doesn’t have all the features that we’ve listed so far.
You couldn’t deploy SAP HANA outside the firewall. It has to be deployed inside an organization’s firewall because such a thing is still subject to denial of service attacks. Someone can put a Rholang program that will go forever or consume a bunch of storage space—there are lots and lots of ways that you could spam this network.
By the way, there’s a particular release where we’ve got all the Casper stuff in. I wish I had release numbers for this discussion, but I don’t have them at my fingertips. A lot of this stuff can be done in parallel. You could do the comms layer while you’re working on Rholang and RSpace. You can do a bunch of Casper independent of those things. Eventually, you have to assemble those parts. If you look at the release schedule of RChain, it doesn’t exactly correspond to this linear path that I’m describing, but it’s darn close.
Back to our denial of service issue. What you do in order to prevent the denial of service is that you make the attacker pay to use the compute and the storage. You instrument the Rholang compute and the RSpace storage. As you’re doing a compute step, which is a comm event, or you’re doing a store step, then what you end up with is a cost for that. Now anyone who attempts to DOS a network has to pay for the attack.
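A minimal sketch of metered execution, assuming made-up prices for a compute step and a store step. The Meter class, the cost constants, and the no-refund behavior are illustrative assumptions, not RChain’s actual cost model.

```python
class OutOfFunds(Exception):
    """Raised when a deploy cannot pay for its next step."""


class Meter:
    def __init__(self, budget):
        self.remaining = budget

    def charge(self, cost):
        if cost > self.remaining:
            raise OutOfFunds(f"need {cost}, have {self.remaining}")
        self.remaining -= cost


COMM_COST = 3   # hypothetical price of one compute step (a comm event)
STORE_COST = 1  # hypothetical price of one store step


def run_deploy(steps, budget):
    """Execute 'comm' or 'store' steps until done or the budget runs out."""
    meter = Meter(budget)
    executed = 0
    try:
        for step in steps:
            meter.charge(COMM_COST if step == "comm" else STORE_COST)
            executed += 1
    except OutOfFunds:
        pass  # in this sketch the deploy halts and keeps nothing it spent
    return executed, meter.remaining


def endless_comms():
    """A deploy that would run forever if nobody charged for it."""
    while True:
        yield "comm"


finite = run_deploy(["comm", "store", "comm"], budget=10)
spam = run_deploy(endless_comms(), budget=10)
```

The endless deploy stops after three steps because the fourth cannot be paid for, so the cost of an attack is bounded by what the attacker is willing to spend.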
By the way, all that I’ve talked about (with the exception of the concurrency semantics and that sort of thing) is industry standard. Whether it’s the Guardian’s data access or Google’s, all of those data points are tokenized. Whenever they are exposed on the internet, you have to have tokens in order to gain access. The tokens are limited to a certain number of requests. That was before there was ever anything like Bitcoin. As soon as people started putting stuff up on the network, they were DOSed.
Isaac: You’re saying that’s to prevent these DOS attacks, right?
Greg: That’s right. This is all stuff we knew for a long time. The insight was the combination: taking that idea of tokenizing online services and marrying it to the micro-transaction structure afforded by something like the blockchain. That’s when you get another innovation step, because now you have a micro-transactions architecture. You could imagine that the Amazon AWS engineers could have come up with this architecture. They’re probably kicking themselves for not doing so. The opportunity was staring them in the face for the last 15 years.
That’s what Testnet-3 is: adding in the costing structure so that an attacker can’t DOS the network. They have to pay for their attacks. This directly impacts the throughput of the network. In particular, if you don’t have cost accounting, then you can merge blocks that have comm events much more easily because most of the comm events are going to be completely separate. When merging two blocks, you just have to check and see that they don’t mention the same names.
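The conservative merge check described here can be sketched as a set-disjointness test. The block layout (a dict with a list of comm events, each a tuple of names) is an assumption made for illustration only.

```python
def mentioned_names(block):
    """Collect every name that appears in any comm event of the block."""
    return {name for event in block["comm_events"] for name in event}


def can_merge(block_a, block_b):
    """Conservative rule: merge only if the blocks share no names at all."""
    return mentioned_names(block_a).isdisjoint(mentioned_names(block_b))


block_a = {"comm_events": [("alice_vault", "coffee_shop")]}
block_b = {"comm_events": [("bob_vault", "book_store")]}
block_c = {"comm_events": [("alice_vault", "book_store")]}
```

In this toy data `block_a` and `block_b` can be merged, while `block_c` overlaps with both and would have to be ordered relative to them under this rule.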
Isaac: By “separate” you just mean they’re not even competing for the same resources. They have nothing to do with one another.
Greg: There’s some subtlety there. You can even merge blocks that do mention the same names if they follow a particular discipline. Mike Stay was very gracious to have analyzed all those situations. He gave us this big spreadsheet of all the situations where you could potentially have conflicts and when you can merge blocks safely. By the way, that analysis comes directly out of the structure of Rholang. The use of names in Rholang is precisely how you can determine that you can merge blocks.
Isaac: You’re talking about mergeability above and beyond something like two sends to the same name.
Greg: Yes. It’s a little bit more subtle than that. You have a linear receive and a send, which updates values. When is that safe to merge with something else? Those kinds of things. We don’t have Mike’s full table implemented and we don’t have complete support in the vault structure, which is the account structure that RChain maintains to do cost accounting. That’s being fixed so that we can separate out.
If you only have one account that everyone’s depositing into, that is a problem. But if you have an account per validator, then you have minimized a lot of the conflict. You can shard up the REV vault where we collect data that way. Those are basically sub-wallets of a larger wallet that holds all the REV that’s collected as a part of the cost accounting for client deploys. Does that make sense?
Isaac: Yes, I think so. Were you saying that there was an attack associated with that solution of separating up all of the account users?
Greg: No, I was saying that we just simply hadn’t done that. Because of that, right now we can’t merge blocks. We have to have a single parent network. As soon as we put that in and then do a little bit of other stuff, we’re back to merging blocks. That dramatically increases the throughput of the network.
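A small sketch of why a single fee-collecting account blocks merging while per-validator accounts do not. The vault naming scheme and the helper functions are hypothetical, and the conflict test is the simple shared-name rule rather than the fuller mergeability analysis.

```python
def names_touched(validator, deploy_names, per_validator_vaults):
    """Names a block mentions: the deploy's names plus the fee-collecting vault."""
    vault = f"rev_vault/{validator}" if per_validator_vaults else "rev_vault"
    return set(deploy_names) | {vault}


def conflicts(names_x, names_y):
    """Two blocks conflict in this sketch if they mention any common name."""
    return not names_x.isdisjoint(names_y)


# One shared vault: every block touches it, so unrelated deploys still collide.
shared_x = names_touched("validator_1", ["song_a"], per_validator_vaults=False)
shared_y = names_touched("validator_2", ["song_b"], per_validator_vaults=False)

# One vault per validator: unrelated deploys no longer share any name.
split_x = names_touched("validator_1", ["song_a"], per_validator_vaults=True)
split_y = names_touched("validator_2", ["song_b"], per_validator_vaults=True)
```

With the shared vault the two blocks conflict even though the deploys are unrelated; with a vault per validator they are disjoint and become candidates for merging.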
Isaac: You get the increase in this throughput because you’re not trying to make changes to just one account. Right now you have several different accounts and they don’t even have anything to do with one another in most cases.
Greg: That’s the idea. You get additional throughput because let’s say that you’ve got some block that everyone agrees on, but two different validators are operating at different speeds. They both may build new blocks on top of that block and then submit that. Then a third validator gets both of those blocks and says, “Instead of sequentializing these, I’ll just merge them.”
As long as you keep the act of merging relatively low cost, merging gives you much higher throughput. If the merge logic starts to really balloon, then you have a problem, and it may just be better to sequence the blocks. That’s a big win. That is coming along post-cost accounting. We’re over the hump. We’re getting those things done. Then we start hardening.
One of the things that we’ll be doing as part of the hardening is to take the RCat application and use it to drive the whole network. One of the things that we’ve done is to rip the player client off of RCat. This has the significant advantage that we can now just make it a very simple client that just hammers RCat requesting assets.
RCat doesn’t know what the thing on the other end is doing with those assets. That gives us an ability to really be quite harsh on the network with these RCat clients. We can have a bunch of them and really load up the network with requests for data assets, which cycles us back to the beginning of what I was trying to say.
The whole point of RSong and RCat was to demonstrate, “if data is king, this is the network that does data.” We demonstrated that as clearly as you can possibly demonstrate it. Now we’re going to go back to that hypothesis and say, “we’re driving this optimized for data,” which makes it easy to do payments, but we’re tackling much harder problems. Having this demonstration with RCat, and especially as a part of the hardening piece of the puzzle, will tie a ribbon around it.
Isaac: My understanding is that RCat is like a boilerplate asset management DApp that RChain wants to support as a platform. A lot of the hardening efforts will be to make sure that all of the features that we have do support everything that we want RCat to do.
Greg: That is exactly right. RCat is this asset management system that we abstracted out of RSong. It’s really agnostic as to what the assets you store are.
I’m hoping to do a simple little application that uses RCat. The idea with InkMe is that users can upload an avatar (a picture of themselves) and then they have exactly one action. What they can do is color other users. They can just tag a user’s profile with a color. They bring up a color wheel or a box of crayons and select a color and say, “right now I think you’re blue” or “right now I think you’re red.” You don’t say what those colors mean. Whatever people feel in the moment. That means a user can also bring up their profile and see how they have been colored by the community. It may be that everybody colors them red. They have a sense that there’s a communal consensus about red and the way they’ve engaged the community. They can begin to discover color and meaning.
Or maybe there’s a rainbow, in which case there’s a heterogeneous view of them depending on how they interact. It’s not a win-lose game. It’s more like discovering what colors mean within this community. Because the client on this is incredibly simple, and the data assets are just pictures plus color tags, it can be done relatively simply. Again, this provides a way to test the network. At the same time, it could be a lot of fun for people.
Isaac: That’s something that you want to deploy on Testnet-3?
Greg: Yes, or just after. Once we get a couple of issues sorted out with Testnet-3, we’ll tag that as a public beta, which will be the hardening piece. It’s probably on that tagged network that we can build this little InkMe application.
It’s been clear to me that not everyone has understood the arc of the development and what the design rationale is. I wanted to put that out there and say why things developed in the sort of shape that they did. There were a lot of people who were confused as to why we were declaring particular victories at particular times. That’s because they didn’t have this data-centric view. That’s really what we’ve been going after. Ultimately, it’s because we really want to support this growing movement that people should be in charge of their own data.
Isaac: This was helpful for me to kind of see the bigger picture. Anybody can go and look at all of the nitty-gritty details of what was in each one of the releases of RNode and see the progress and the logical progression there. It’s hard to put all those little pieces together and see the bigger picture like you’ve been describing here.