Developer Podcast RCast Research

RCast 56: RChain's Most Disruptive Feature

Subscribe to RCast on iTunes | Google Play | Stitcher

Greg Meredith explains nominality and RChain’s unique features during Casper Standup 111.


This part of the talk begins at the 35-minute mark of Casper Standup 111. Please see the video for the equations discussed below. 

Greg: In terms of the development methodology of RChain, we want to get Rholang to the point where user contracts can be correct-by-construction. That plugs the hole that any smart contracting system has, any smart contracting blockchain platform has a massive security hole in it. As long as it allows for user-defined smart contracts, it is going to have this big old hole because code auditing does not catch a serious bug. The Dao is a great example; the Parity wallet error is a great example. They will continue to happen over and over again until you can have some kind of decent correct-by-construction methodology.

What is the most disruptive feature of RChain? It’s this feature that we can plug the security hole opened by smart contracts. Lots of blockchains are going to ultimately get to high TPS numbers. Right now, most of them are getting high TPS numbers by sacrificing decentralization. EOS sacrifices decentralization. Many of these will not have the trust model that Bitcoin has. We’ll advertise these numbers, but the fine print is you don’t have decentralization. In that case, who cares if it’s a blockchain? Why not just use SQL? Why not just use Oracle or SQL server or any one of those? 

Ultimately, your solution will be much faster using those much more hardened technologies. You’ll have a much broader development base, a lot of code that you can reuse, and yada, yada, yada. If you are sacrificing decentralization, it’s not a valuable blockchain. You’re just ticking off a checkbox. You’re not actually doing the real thing. It’s much better in that situation to use these hardened, mature software tools, especially if you’re doing anything mission-critical like high-dollar finance or tracking of mission-critical resources. 

You can’t sacrifice decentralization and still have any right to call yourself a blockchain. Remember that there is an infinite number of these that are economically-secured, leaderless, distributed consensus outlets. Infinity is a large number. Since there are so many of them, eventually there will be blockchains that have consensus algorithms that allow them, like RChain, to have high TPS. Once you have high TPS, now you have this glaring, gaping security hole if you allow for smart contracts. If you’re a payments network, it doesn’t matter. If it’s just pushing one token from one side to the other and the tokens are fixed and there’s no tweaking of behavior, it doesn’t matter at all. You can just build a better PayPal.

A decentralized PayPal clearly has value. There is no question about that. If you want to do something more than PayPal, a decentralized payments network, if you want to be able to build real applications on top of the platform, if you look over the last 20 years of internet applications, what are the major applications? Gmail, Google Maps, Facebook, Spotify—those are just a handful of many of them. 

PayPal as an instance is a very tiny fraction of what’s really going on. What’s really going on is the movement of digital assets. The movement of tokens is a tiny fraction of internet traffic. Over the last 20 years, it’s all been about the data. If you build a network that is optimized for payments, it will not scale to data.

That’s your next requirement. You need to have a network that handles data. That means smart contracting. Once you have smart contracting, you have a major security hole. What is that security hole? Users can really bad contracts (like the Dao contract). If you’re running at Visa speeds, 40,000 transactions per second, and you have a really bad user contract, that can bring the entire network down. The usual security workarounds, like running things in sandboxes, don’t work. The whole point of this network is that smart contracts interact with other smart contracts. If you sandbox it, now you have eliminated its economic viability. 

That kind of security doesn’t work. We need a new approach to security. That’s where correct-by-construction comes in. You plug that hole with correct-by-construction—that is the disruptive feature of RChain. 

The interesting point is that it’s more disruptive than just plugging a security hole. That’s because if you squint, this type system becomes a query language. You have some resource of these behaviors. From our point of view, the programs are just these terms, like C and K. Those terms are behaviors. They could be programs like you store on Github, but they could also be other things. 

For example, models of cell signaling that are currently stored in boutique databases all around the biology community. They could be other kinds of things, like geometric data. They could be terms that represent geometric information, like the kinds of things that Boeing uses to model its airplanes or that Toyota uses to model its cars. All of those kinds of terms are stored in a database.

Github is one example. These boutique biological databases are another example. The geometric information that Boeing and Toyota use are all examples of what I’m talking about. Searching those kinds of databases on the basis of behavior together with structure is impossible today. It can’t be done. Let that sink in. You can’t do that today. That means that searching Github is reduced to tagging things with meta tags, metadata, or social searches. That can only get you so far. It will not get you all the way.

These richer type systems allow you to form a query language. You can now go query for things in your database, which are just terms in a programming language that satisfy a particular specification. Now your specification can be fairly rich and fairly detailed without giving away the farm. If the specification is just a full-on program, then you’re out of luck. The search may take forever. At least you have a search now, but it may take forever. As long as you keep your specifications dialed back in terms of expressiveness, then your searches will complete in finite time and you’ll get very useful answers. 

A really good example of this is Caires’s “Behavioral and Spatial Observations in a Logic for the π-Calculus.” In that paper, he identifies a class of terminating processes and a limit on the logic where that check is terminating. It’s quite rich. You can do quite a lot with that. You can build up hierarchies. If you want to have your specifications be so rich that they’re effectively programs, then you can also rate-limit it using the same strategy we’ve used here. You put a resource on it. It will keep searching as long as you’re willing to spend. 

This gives you a new approach to many things. For example, you can formulate queries of the following for the boutique biological databases: “Find me all the small molecules which, when introduced into a cell-signaling pathway in this concentration, prevent it from reaching this state.” That query is medicine. That query is a cure. You’re talking about introducing a drug into a cell pathway to prevent you from reaching a certain state. 

Let me give you an example of such. In his book, Lifespan, David Sinclair, who is a preeminent Harvard medical researcher, has identified what he believes is the single source of aging. It has to do with epigenetic errors. The cure is essentially error-correction. Let me give you an example of one such pathway. 

In this pathway, there are homologous sirtuin gene structures in yeast, worms, mice, and humans. The sirtuins have two jobs. Job number one is they repair damaged DNA. Job number two is they prevent the cell from duplicating. Essentially, DNA breaks under normal operating conditions in the world. You need some mechanism by which you repair it. That’s the job of the sirtuins. You don’t want the cell to be duplicating in the case that the DNA is damaged because you’ll end up with two copies of a cell with damaged DNA. Maybe one copy has some of the DNA and the other copy has some other DNA. That’s aging—replication of damaged cells—according to Sinclair. Now we have a clear pathway. 

The question is, “Find me all of the small molecules that when you introduce it into that pathway, you don’t get this to happen.” An example of a solution to that query would be NAD. NAD provides additional fuel for the sirtuins so that if they are overwhelmed doing damage, they still have enough juice, so to speak, to prevent the cell from duplicating. That’s an example of a solution to such a query. 

My argument is that if you have a stochastic version of this type system, you can write that down. It’s possible that the query system would find NAD as a solution. To give one more example of what I’m talking about, imagine that you want to reason about pathways for a self-driving car. You want to represent the pathways in some kind of algebraic data structure. It turns out that in the last 20 years, people have been vigorously investigating an algebraic data structure called the Clifford algebras because they have significant advantages for reasoning about geometry. Clifford algebras can be run through this very same algorithm just like the PI calculus, or even vector spaces. We can run these outbreak data structures through this process and get a query language. Now your queries are asking questions about geometry. You can use those queries to reason about paths in space. You can use this to address, for example, navigation systems for self-driving cars. I’ve now given you two far-ranging examples of what you can do with this kind of system. 

The existing behavioral type systems for Rholang, which are derived from this process that I’ve talked about today, allow us to plug a major security hole that all the other blockchains suffer. Even if someone stumbles on a very specific type system that does the trick for their high-TPS blockchain, the bar is already so high that there’s no one there. Even if a competitor or a cooperator were to do that, they still don’t have the power of this algorithm because this algorithm affects everything we do. It affects how we build buildings, cars, and airplanes; it affects how we do medicine; it affects everything. That’s the point. If we get this algorithm rich enough, you can use it as a query language for a huge range of behavior. That’s the truly disruptive capability of RChain.

Christian: The reason why it doesn’t click is because I’m not a programmer. Now it clicks because you’re doing trust at scale with the security, then you’re adding programs on top of that. That’s where it’s disconnected for me because I’m not a programmer. The two examples you just laid out, the more examples that we get that are high-dollar value, especially in the financial industry, it’s going to be a lot easier for me to wrap my head around this. For everything you just said, this is going to make total sense from programmers, right? 

Greg: Just think about it as it gives you a query mechanism for a range of behaviors. Right now we do not have query mechanisms for rich behavior coupled with rich structure. That doesn’t exist. This gives you a query mechanism for rich behavior coupled with rich structure. You can think about the design of RChain as effectively iterating this. If you look at our development path, RSpace is a new storage mechanism. It’s a key-value database with a very important twist. Some of the values that we store our programs are continuations. This is a Delta from all the tuple space models. You get to store data and programs. 

They’re related by a constraint, which is at a particular name at a particular key in the database. You either get data or you can get program, but not both. That’s an integrity constraint of that data source. We built a new kind of database. On top of that database, you can view Rholang as a new query language. Rholang is the analog of SQL. 

When looked at from the lens of database theory, the big difference is that unlike SQL, Rholang is Turing-complete, but its semantics relative to the datastore is perfect. They are in lockstep. There is no real difference between Rholang and the data store. That’s the essential correct-by-construction aspect. You can think of the data store as being extracted from Rholang, or you can think of Rholang as being extracted from the data store. There is no Delta between those two semantics.

That’s why it’s correct. You can’t get a disconnect between Rholang and RSpace. At least in principle. In practice, because we don’t have special tools that are checking things, we can get disconnects. At the design level, there is no way that there is a disconnect. 

Rholang is a query language for that store. These types become query languages for things above that, at the next level up. Essentially, we’re iterating that. If you look carefully at the development of RChain, first we built the data store, then we built the query engine—essentially what you had was a Rholang interpreter—then we built a communication mechanism. That’s the comms layer so that different Rholang interpreters could talk to each other. Then we added in Casper. That makes sure that all of the different copies of this Rholang interpreter, which is a query language for the data store, are in lockstep with each other. If one copy does something, the other copy does the same thing. They’re all in agreement. Casper puts in a consensus about that. You go from a distributed data store to a fault-tolerant distributed data store. If any one copy fails, the others can take on the job. 

But that isn’t resource constraining. Someone can put it in a Rholang program that eats up all the network resources. That’s no good. The final piece that we just added—that’s what Testnet-3 is—we add cost accounting. Cost accounting basically makes each step in executing the Rholang program costs you some REV. Any storage that you do costs you some REV. The only way that someone could eat up all the resources on the network is if they had unbounded amounts of REV. That’s not going to happen. That’s the economic security. The system is secured from those kinds of DOS attacks by this economic limiter. 

That’s the history of the development and how it relates to all of these different pieces. Essentially, we’re just iterating a notion of building query languages, because that’s what people want to do. They want to store stuff and get it back. We’re adding different levels of semantics each time. The first time we’re adding this capability to store data and program and then building the semantics around that. The next time we want to be able to reflectively introspect about the behavior programs. That’s the type-level query language. That’s the onion view of RChain. Hopefully, that makes sense.

Christian: It makes perfect sense. It’s weaved together. It’s very difficult to understand from the outside if you’re not a programmer. But that explanation helps enormously.

Greg: I’m glad. Any comments from anyone else? 

Eric: This is awesome. Can you say how this K  will look like in Rholang? 

Greg: K is par in Rholang. You take the par of the for comprehension and an output. That produces a term. The rewrite rule is the comm rule. 

Eric: I’m having difficulty distinguishing par and the comm rule. 

Greg: Let me do it over. Remember, K takes two terms. A T and a U. Here’s my T for Y from X to P. That’s my T. It’s a lowercase T.  My U is on X spend Q. Now my K is T R U, which is the same as—that’s just a term. Now the comm rule says that that term can evaluate to P in which we have at Q for Y. K is equal to this, which is equal to that. That’s the trick. Just to be super clear, I’ll make a copy of this. Our rewrite rule, Rho—now I have instantiated this for Rho calculus. Everything that is done abstractly here I’ve now specialized to Rho.

Eric K will never happen in some kind of production, right?

Greg: That’s correct. The reason is because we start with ordinary Rho calculus. It doesn’t need any resource R. Then we turn it into a theory, where you don’t get to do your rewrite rule unless you have an R. That’s REV. That’s your token. We start with a theory like this. Then we make a new theory that says, “Oh, by the way, you can’t do this. You have to have one of these in the mix.”

Eric: K and Rho are the same, right?

Greg: That’s right. In ordinary Rho, this goes through. But in the new theory, you can’t do a reduction unless you have an R. The cool thing is that this K now becomes a kind of tensor. You have all that you need to do a computation, but you don’t have the resource. It is literally a reflection; at the same time, it’s a form of tensor.

Christian: For a marketing buzzword, what would this be—trusted computation at scale? I’m trying to package all of this into, “This is what RChain is in three or four words.” 

Greg: Notice that this whole approach, I’m going to have to do some work, but I think that this does better than SNARKS. There’s a reason. My program extraction from my types allows us to do SNARKS. I still have a lot of work to do. 

What I’ve been telling the dev team, just to round out this thought, when a block is proposed by a validator, the validator has already run the comm rule (and possibly several comm rules). The other validators, their jobs are to check that running that collection of comm events will get you to the state that the proposing validator claims. That’s the cryptographic security.

Instead of rerunning the program, what if we just constructed a proof that you could get from the state to the other state? The way we constructed that proof is by starting from the state and running the proof backward. Essentially, doing a standard kind of theory proving trick using resolution techniques to prove that from the starting state we can get to this state without having to run the program. That would dramatically speed up what RChain does today, because replay takes more time than play because it has to check additional constraints about ordering that the play doesn’t have to do. 

If you look here, the play just says, “Is there a match? Okay, there is, I’m picking that one.” Now there might be a bunch of races, but it just picks one of the races based on the stake weights. But the replay has to not only see that there’s a match but also make sure that it’s the match that happened in play. That’s extra computation. If you could avoid that and just construct a proof without having to do the full replay, then it is possible—again, there’s a lot of work to be done to verify this—that this is much faster then the way we’re doing it today. If we can do that, if we have this proof construction, then it’s effectively like SNARKS. 

There is a wide range of applications. Basically, wherever you’re seeing SNARKS being used, this is a place where you could substitute this in for SNARKS. It allows us to handle concurrency and a bunch of other stuff.

Eric: Probably more performance than SNARKS. 

Greg: By a lot. 

Eric: This also sounds like program synthesis when we go backward. 

Greg: That’s exactly right. There’s lots of work to be done. This is why LADL is so critical to RChain. It’s much deeper than just the behavioral types for Rholang. It widens out the scope so RChain can apply to many more things. The reason we want this widened scope of applicability is because we have to change the manufacturing infrastructure. As an example, if you could consult the database instead of running a bunch of chemical tests, that would be potentially far superior because it would have a much lower environmental impact. That’s just one of many examples where having a system like this would make it possible to dramatically impact the manufacturing infrastructure.