Podcast RCast Research

RCast 62: Predicate Calculus for RChain

Subscribe to RCast on iTunes | Google Play | Stitcher

Christian Williams discusses the results of his latest research with Greg Meredith and Isaac DeFrain. This paper was referenced during the talk. The corresponding slides are below the transcript.


Greg: Today, Christian is going to talk to us about a construction that he’s been working on that cleans up some aspects of what Mike and I have been working on with LADL. Maybe it’s a slightly different direction, but Christian, take it away.

Christian: As a quick reminder of who I am, in case people don’t know why I’m here, my name is Christian Williams. I’m a Ph.D. student at UC Riverside, studying category theory with John Baez. I got connected to RChain through Baez’s previous student, Mike Stay, that Greg has been working with. I’m very glad to have been connected with this and very glad to have finally made some kind of significant progress. 

What I’m going to be talking about today is a formalization of this LADL idea that Greg and Mike have been working on. I’m calling this presentation, “Predicate Calculus for RChain.” The main example of the idea that you can find online is Greg’s paper on namespace logic.

As you’ve probably heard Greg talk about before (and explain better), the basic idea is that we want to take the languages that are being used in RChain and equip them with the notion of a predicate so that we can do higher-order reasoning about terms in the language and so that we can form types in the language that are determined by the logical structure of terms. We use these types for checking the inputs and outputs of programs on RChain. This is an extremely useful and powerful thing for the RChain system. 

Basically, we can write these logical formulae that express conditions on processes or names. For example, we can write this formula called Sole Access, where if there’s some area of the internet (or what Greg calls a namespace) that you care about—some local community in your network—then you could write this formula that is filtering and only accepting processes that receive on that namespace fee and don’t receive on any other namespace. This would act as a firewall protecting the community, acting as this filter on the boundary of the community that’s checking for every process that they’re going to interact with. Who else is that process interacting with in order to ensure security?

Greg: I just want to add, Christian, that the difference between this and standard firewall technology is that this is a compile-time phenomenon. We check this before we ever deploy the process. It’s not checked during runtime. 

Christian: Yes. This isn’t a very full explanation of the power of namespace logic. 

Greg: You’re doing great. This is really for the community to understand. What makes this special is that most firewalls are runtime. They’re checking addresses during execution, which slows down execution. Here, we check it before we ever start running the process. It happens at compile time or deployment time for what I would call service-binding time to connect up to processes.

Isaac: I had a comment and a question. The comment begin, this Sole Access predicate is basically saying something like a namespace I can think of as a set of names and what I want to look at is all of the processes that only receive on names that are in this namespace that I’m thinking of; so receive on these names (or some subset of them) and don’t receive on any names that are outside of this namespace, right? 

Christian: Yes, that’s it. 

Isaac: Okay. The question: How far down does this go? Is this just for top-level receives or is this for any receive that can ever manifest in a continuation? Does it apply to them or is this the top-level stuff?

Christian: Good question. Greg, do you want to answer that? 

Greg: Well, the way you’ve written your formula here, it’s not recursing. When you put the Sole Access as the continuation to the receive, then you get a recursive formula and it’s all the way down.

Isaac: Right. Because then you’re saying not only does this top-level receive satisfy that, but also any continuation that comes out of it.

Greg: Yes. That’s a very insightful question.

Christian: Ever since I learned about this idea, I’ve been in love with it. The way I like to explain it to my friends is as a deeper version of Google. Rather than only searching by keywords, you can search by the logical structure of terms. It’s a very deep and intrinsic approach to a typing system for the language. Whereas most languages make up a few base types and then describe some type constructors on them, in this you’re really using the constructors of the Rho calculus itself, plus the basic constructors of first-order logic. From those, you can create arbitrarily fine-grained logical formulae. It’s a very natural and powerful typing system on the Rho calculus. 

The question has been how to formalize this idea in category theory. Greg has used this word LADL; this acronym stands for Logic As a Distributive Law. Greg and Mike were approaching the problem by modeling the language as one monad and some notion of collection as another monad and trying to get the formulas from a distributive law between them. 

What I’m presenting today is a different approach, one that looks like it will work for free without extra conditions on the language or on the kind of logic that we want. There are still some things to be figured out. There might be some limitations, but today is just about the basic idea. 

We’re going to be talking about algebraic theories. Greg, do you remember the last time we talked about algebraic theories on the podcast? 

Greg: It was a while ago. 

Christian: It’s definitely been a while. 

Greg: It’s worth reviewing. 

Christian: An algebraic theory is a way to abstract some kind of structure that you care about and take it into its general abstract form so that you can model the theory in different contexts and you can recognize when an object in some category has this extra structure that you want to specify.

For example, people care a lot about groups in algebra. You could talk about the theory of a group. Something fewer mathematicians know about, but more category theorists care about, is a monoid. A monoid is just something with a binary operation—you can call it multiplication—and a unit, which is some element such that the multiplication is associative. If you have three things, it doesn’t matter the order that you multiply them, so that the multiplication is unital. The unit acts as an identity for the operation. 

This is a very common structure. For example, the natural numbers is the classic example of a monoid where multiplication—you could either choose it to be plus or times—and then the unit would either be zero or one.

Greg: If you don’t mind, I want to jump in and make a couple of comments. First of all, Christian has a couple of nice diagrams [below] and those diagrams express the associative constraint and the unital constraint. These are typically coherence conditions that we demand of these algebras to tell us exactly how they play or that they play nicely, and that all the operations interact with each other nicely. Then it’s a very compact and expressive way of characterizing them, which is why they’re so popular in category theory. 

The other thing is that when we’re talking specifically about monoids and the natural numbers, there’s something magical and mysterious going on. When you think about the natural numbers as a monoid under addition, there’s the identity, which is zero, and exactly one generator, so it’s a very tiny construction. When you think about the natural numbers under multiplication, the unit is one and the generators are infinite. They’re the primes. The relationship between those two monoids has been the subject of a lot of deep and profound number theory.

Christian: It’s pretty amazing how even the highest math is still just thinking about this difference between plus and times. Algebraic theories are a nice, compact, elegant way to encapsulate the kind of structures that you care about. The way that we’re going to be using them is that we are going to be able to write down the Rho calculus or the other languages that are used in RChain as certain kinds of theories. The idea is that we’ll be talking about with this simpler theory will carry over. 

This theory is a category. It has this one object M and then all the other objects are just finite products of that object. That’s so you can talk about in inary operations being certain morphisms from M to the N back to M. This way we package up the structure as a certain category. Then when we want to model this theory, it’s modeled by a functor that preserves the products. 

If we model in the category of sets, we carry this abstract M, this kind of ideal abstract monoid to a real monoid, which is some specific set in the category of sets. Then you carry the multiplication over to a function from that set squared to itself; same for the unit is going to pick out an actual element of the set. We get this duality between the syntax provided by the theory and then the semantics provided by the models.

Isaac: In all of these contexts, I see products preserving as a really important property. I’m not 100% sure I understand why that is.

Christian: The idea is that algebra and these kinds of structures that we care about are really just specified by operations and equations. The only kind of structure that you need to specify operations is products because an operation takes in two things from your monoid and gives you one thing in your monoid. That’s all that we need to preserve. 

Greg: Are you using concrete theories synonymously with Lawvere theories? 

Christian: Yes, because once you start talking about multi-sorted Lawvere theories, they’re basically just categories with finite products. What we’re going to be doing with these theories is thinking about certain trees of operations inside of them. These trees are going to correspond to constructors for these predicates on the theory of a monoid. 

There’s a concept called a sieve in category theory. It’s some class of morphisms into an object such that it’s closed under pre-composition. For example, if we pick our operation M, the multiplication, and then we say, “give me all morphisms, all operations, that end in this multiplication,” they factor through this multiplication. What you’re really specifying there is some sort of predicate that is like M of blank blank. “Give me anything that is the multiplication of two things.” 

It’s really simple, but we can view it as a constructor for predicates. For example, in the Rho calculus, we could think of the par as the multiplication. W could say, “give me everything that is the parallel of two things.” It’s not an extremely useful predicate in and of itself, but you can make useful predicates from it as constructor. 

To skip ahead, the one that we’re going to be caring about is prime. This is an example of a predicate on monoids that you would care about. “Give me all elements that are not the identity, not the unit, and not the multiplication, not the product of two things that are not the unit.” That’s what it means to be prime. 

We would like a framework in which we can specify this predicate and actually extract the prime elements of a monoid. The way that we’re getting these constructors is through these things called sieves. Let me just pause and see if this part makes sense. 

Greg: Yes. It’s quite clear. The intuition I use is the old-style construction in standard undergraduate or first-year graduate mathematics where you lift an operation pointwise. If you have an operation like a multiplication defined on a monoid, you can lift it to a multiplication defined on subsets of the monoid by doing it pointwise.

Christian: Exactly. That’s a good way to say it.

Isaac: I understand the idea that you’re describing. I’m not really sure I understand the connection of this sieve object to the idea you’re describing.

Christian: I probably should have drawn this out a little more explicitly. What we’re specifying when I draw this tree of operations, all operations that end in M, what we’re getting is the set of all operations such that they are of the form M of two other operations. You have something going into M that factors as something into M squared and then M.

Isaac: By that multiplication operation.

Christian: Right. The things in this sieve are everything of the form M of something times something else. 

Greg: This will give you what I called the spatial connectives will give you all the predicates that are associated with term constructors. What it won’t give you is logical predicates like “and” or “not” or “implication.”

Isaac: No modalities or anything like that? 

Christian: Not yet. 

Isaac: Okay. Cool. I’m on the same page.

Christian: Cool. The way that we’re getting logic into the mix is through this important equivalence. There’s an equivalence between these sieves and sub-functors of a certain special functor. If we think about theory, like our theory, of blank comma M, where I’m using this notation to mean the thing that gives you hom-sets. This is a functor that takes in M to the N, some other object in your theory, and gives you the set of all inary operations. 

This would be like the “true” in namespace logic. It would give you all operations accepting anything. The idea is that what we’re really caring about is the lattice of sub-functors of this functor. Weird words that most people don’t think about very much. The way to picture this is that a sub-functor is just like a subset, except on the level of functors.

This means for every N you’re getting a subset of the inary operations in a functorial way. It’s just saying this tree of operations that you were picking out, for each end that was picking out a subset of the inary operations that satisfied that formula. You took all of the inary ones and you checked, “okay, which ones are of the form M of something times something.” What we’re really thinking about is this structure, this lattice of sub-functors of this universal one that’s picking out all inary operations.

Isaac: You’re saying that it’s sub-functor because that last top-level operation has to be the multiplication, like in this particular example. 

Christian: Yes.

Isaac: Okay, that makes sense. In general, that last operation or top-most operation could literally be any operation from your theory.

Christian: Yes. This lattice is where we get the operations of first-order logic.

Isaac: Maybe I should ask you a question here. Why are you emphasizing the lattice structure?

Christian: Mainly because it’s a shorter word than hating algebra, but it just means we’re going to have intersections, unions, and then we’re going to have top and bottom. A universal one that gives all inary operations, and then the empty one that’s just always empty. Then we also have implication, which is a really useful one. It’s the partial-order version of being Cartesian-closed. It’s like a proposition. If we were thinking of these sub-functors as propositions, then forming the implication would be forming the implication of propositions.

Isaac: Okay. That corresponds to some kind of subset relation—you said something like a poset, right?

Christian: Yes. The actual poset structure, like when you say that P is less than or equal to the universal presheaf, for example, you’re saying P is a sub-functor of that.

Isaac: That makes sense.

Christian: If you think about the power set of a set, instead of all subsets, this forms a nice lattice structure. It’s a Boolean algebra. People would tend to draw it similar to how it’s being drawn here, where the arrows mean inclusion. 

Greg: Sorry to interrupt. That because you’re taking sets of operations. If you were taking some other kind of gadget of operations, like a list of operations, you might have something different. 

Christian: Yes.

Isaac: You naturally have that extension of subset notion to sub-functor because you’re taking set structure.

Christian: Yes.

Isaac: That makes sense.

Christian: We’re inheriting these nice operations from the category of sets. Right now the place where this structure is living is that all of these things are functors from theory op; they’re contravariant to set. These are called presheaves on our theory. Anytime you take presheaves on a category, it forms this nice category called a topos, which has this rich internal logic. That’s what we’re utilizing here. This theory blank M the universal has this lattice of sub-functors. We’re using its logical structure. 

For example, here, if we want to think about prime—being prime means that you’re not the unit and you’re not the product of two non-units. For that first part, not being the unit, you think about this same example with multiplication, but you think about the sieve generated by the unit operation going from one to M. You say, “give me all operations that end in the unit.” that corresponds to one of these sub-functors. Then we can use this negation, where here we define it to be implying the empty presheaf, like implying contradiction, that negation gives you another sub-functor somewhere else on the lattice. 

To form the other one, you notice that here, this part of the formula is a little bit different, because we’re also going to want to lift the operations of the theory to act on these predicates themselves. What we want is a full predicate calculus where we can not only do logical operations on the predicates but we can actually still do the algebraic operations that we had in our theory. This is that pointwise multiplication that Greg was talking about. 

If we have some multiplication on elements, there’s a natural way to lift it to subsets by defining it pointwise. That’s the last piece of the puzzle that we need to specify. Once we have that, we take the intersection of these two sub-functors. We get a certain functor where you plug it in N and it gives you precisely the inary operations that fall under this definition of prime. When you plug in one, you would get the prime elements of your theory.

Greg: I like this construction a lot. There’s this modularity to the construction, which is the thing that I’m looking for, where the source of the spatial predicates is distinct from the source of the logical predicates. We want to be able to vary those independently in some sense. I’d very much like to be able to take a look at what would happen if instead of taking a set of operations, we take some other kind of gadget, like a tree or a graph or a list.

Christian: The last piece of the puzzle to specify the predicate prime is talking about what it means to lift the operations of your theory to act on our predicates. One thing that I forgot to say earlier is the thing that we’re really always using here is the correspondence between a predicate—like X is greater than three—and the set of terms that satisfies the predicate—all numbers that are greater than three. 

Here we’re implicitly going back and forth between those things. We’re thinking of our predicates as the subset of terms that satisfies them. Like Greg was saying, if you have a multiplication on your monoid and you have two subsets of your monoid, there’s a natural way to take the “product” of them, where you just take the product of the subsets, you take all possible pairs of something in your subset A and something in your subset B, and then you multiply all of those pairs. You get all things of the form little A times little B. In this way, you’re lifting M, your multiplication, to your subsets. 

Everything that I just said in the case of subsets extends to this framework of sub-functors. It’s all the same idea. Now we can actually make sense of this prime predicate. We can take the sub-functor corresponding to the unit, negate it, and then take the intersection with the negation of the lifted multiplication of the two non-units. 

The funny thing is when you do this in the theory of a monoid, it’s actually empty. That’s not a problem. The thing that’s going on is that there are no elements in the theory about monoid besides the identity. When you try to say “not the identity” and you work out what that means, every operation, if you just load in N copies of the identity into it, it’s equivalent to the identity. There’s nothing that can really escape the identity. It’s only when you actually get to models of your theory that you get non-identity elements and then you get the actual prime elements.

Greg: Another way to approach it would be to go ahead and put in your generators. If you have operations from one to the generator class, that would do the trick.

Christian: That’s why it’s not going to be a problem for the Rho calculus, because there are plenty of processes that are not the null process without having to go to the models. This is just talking about for this more general theory, being a useful thing to do in general, there is a way to transfer your predicates and your theory to predicates on the models. Then you can get the actual prime elements. 

The reason why we would care about primeness in RChain is if you think about your unit being the null process and your multiplication being the par, then when we say, “all processes that are not the null process and they’re not the parallel of two non-null processes,” then you’re specifying all single-threaded processes that are not the parallel of two things.

This is a very important set of processes to get a hold of because they’re the building blocks of the other processes. Every time you want to think about race conditions or the complexities involved in the parallel terms, you start from getting a hold of these building blocks and then building them up one by one and checking for those kinds of race conditions. 

These predicates that we’ve been talking about can be applied to the theory of the Rho calculus and space calculus and the other kinds of languages that we’ll be using in RChain. This paper called “Namespace Logic” that I highly recommend everyone look at will be implemented through the framework that we’ve been describing.

Greg: There’s a little bit of subtlety here, which is you’re not talking about binding operators. You’ve got the reflection and the reification, but you don’t have the modal operators that would address binding.

Christian: Do you mean the Mu? 

Greg: No, Mu is for a fixed point. You can easily get those in this construction. It’s for after this action there’s a substitution.

Christian: I wasn’t planning on getting too much into this, but the framework that I’m going forward with is Marcelo Fiore’s second-order algebraic theories. The basic idea is instead of thinking about categories with finite products, your generator objects, you specify that they’re exponentiable objects so that you have evaluation maps that do substitution. In this lower left, we can specify the input operation as being of the form N times this exponential P to the N, which means “processes with a free variable of sort name.”

When it comes to defining the communication, it’s a certain communitive diagram that uses the evaluation that comes built-in from specifying that N was an exponentiable object. I can attach that second-order algebraic theories paper to the blog if people want to look at it. This is the way it’s looking like we’re going to have a theory of the Rho calculus.

Greg: That’s awesome. That would mean that we should be able to, within that basic framework, implement the notion of freshness around the observation that a process cannot contain a name that contains the quote of the process (because of the way the grammar works). From that observation, it’s easy to build names that are fresh relative to the names in a given process. You should be able to completely lift that construction up to the predicate level. We should be able to give a predicate for freshness, which is deeper in some sense than the freshness construction that we get in nominal constructions, because for them it’s a built-in idea, whereas here it derives from the structure. That actually would be quite lovely to have both: here’s the essence of freshness, which is that a process cannot contain a name that contains the quote of the process, and then build a predicate for that, and then build up a more generic predicate for freshness. That to me feels like a very pleasant construction.

Christian: Yes, that sounds nice.

Isaac: Does a predicate for freshness have any relation to the hidden name predicate in Caires’s work?

Greg: Yes, that’s exactly right. But for Caires, that’s a primitive.

Isaac: I see.

Greg: Whereas for us, we get to build it out of the smaller parts.

Isaac: Oh, yeah. That’s pretty cool.

Greg: Exactly. We explain more phenomena. He has to assume certain phenomena, where we explain it.

Isaac: That makes sense.

Christian: The more I’ve been thinking about this the past few weeks, the more excited I get, because laying out the framework clarifies what can and cannot be said. It’s turning out that you can say a lot more in this than I would have guessed. 

For example, if we think about this input operation, in the original example, we were thinking about “input on some name space, input on some predicate, and then do some process.” Here, if you lift this operation to predicates, that corresponds to restricting that first N to some predicate. You’re listening on a certain namespace. You could also do it on this P to the N. Rather than just accepting anything coming down the pipe, you can only accept things that fed a certain predicate.

Greg: Yes. Filtration systems. That’s exactly right. Which goes along with that pattern matching. 

Christian: That just seems really powerful. Is that like the deeper Google search that we’ve talked about?

Greg: Oh, no. The deeper Google search is one level up. The way in which I’m talking about generalizing Google is not at the level of a specific model, but at the level of being able to search for code on the basis of its structure and its behavior. It doesn’t matter whether the code is Rho calculus or Rust or JavaScript. Right now the only way to search for code is either through metadata or social—I know a developer who knows a developer.  That’s not sustainable. As the amount of code grows, our capacity to hold in in mind what that code is is limited. This gives us the ability to maintain much larger codebases because we can now search on the basis of what the code does and how it behaves.

Christian: Not only could you view this as a powerful query, but could you also be viewing this as lifting the whole Rho calculus to the level of predicates, to be this second-order version of the Rho calculus where you have a term that’s like receive? You take this first term and rather than just viewing it as denoting a set of processes that satisfies it, you view it as a term in itself that can communicate with a huge set of processes.

Greg: That’s exactly right. I’m sure you guys are familiar with logic programming. Now we’re lifting the idea of logic programming off of the logic that’s associated with simply typed Lambda calculus to any logic that’s associated with some particular rewrite system. The reason that’s interesting is because we want to be able to program at the ensemble level. If you think about programming biological systems, you don’t tell them what to do. You throw in a bunch of entities and because of their particular characteristics they self-organize. The way in which they self-organize satisfies certain criteria, and the way in which you express those criteria is at the formula level. The formula level allows us to do ensemble-based programming, which is what we have to do if we’re going to program either at the biological scale or even at the much smaller internet scale. 

Christian: It just seems so powerful. 

Greg: Yes, exactly. This is the essence of realizability. Realizability is demanding that the meaning of our predicates is a bunch of programs. As long as we start thinking about our predicates as characteristics for allowing agents to self-organize and self-assemble, then we can use it as this higher-order programming model.

Isaac: I honestly like hearing you guys talk about this stuff. This is all really cool. My thought, and maybe this is just a naive thought, is why it hasn’t been done already?

Greg: In some sense, I’m following a thread that is many decades old. Abramsky came up with this domain theory in logical form, in which you could generate a logic from a domain theoretic specification along with computation. It was very restricted in the sense that he was basically only considering models of computation that were essentially like the Lambda calculus, which is fine, except that if you take into account the natural world, most computational systems are concurrent. That’s too restricted a class of notions of computation to be practical in a realistic setting. Certainly, by the time you get to programming the internet, it’s got to be concurrent. There’s just no way around it. 

By the way, the approach is brilliant. Many people still are rightfully in awe of his construction. I’m only criticizing it because it’s so good. It’s worthy of critique. The other thing about it is that domain theory is largely semantic as opposed to syntactic. The shift in the LADL approach is to be predominantly syntactic since most notions of computation ended up being symbolic. Lisp is clearly a symbolic form of computation. Java straddles the fence on this. Certainly Prologue and other things, they’re symbolic forms of computation. The Pi calculus is decidedly symbolic. The Rho calculus is symbolic. 

Shifting over to the syntactic side is actually really important because you are able to get a tighter connection between what you’re doing and the operational semantics. The denotational semantics erases a lot of information that’s available in the operational semantics. That information is exactly what we need to reason about things like complexity. It’s the same idea. Caires picks it up again with the spatial behavior logic. They just didn’t think to address it using category theory. The thread has been there for a long time and it took someone foolish enough to say, “Hey, I think the scope of this is much larger.”

Isaac: That shift to the syntactical approach also lends itself to doing all of these checks at compile time. You just need the code and that’s it.

Greg: That’s exactly right. You’re not going to some intermediate format that is more semantic in nature.