diff --git a/podcast/66/transcript.markdown b/podcast/66/transcript.markdown new file mode 100644 index 00000000..bfee6b53 --- /dev/null +++ b/podcast/66/transcript.markdown @@ -0,0 +1,217 @@ +*Mike Sperber (0:00:15)*: Welcome to the Haskell Interlude. I’m here with Niki Vazou, and we talked to Daniele Micciancio, who’s a professor at UC San Diego. He’s been using Haskell for 20 years and works in lattice cryptography. We talked to him about how he got into Haskell, using Haskell for teaching theoretical computer science and, of course, for his research, and the role type systems and maybe comonads or coinduction could play in the design of cryptographic algorithms. Along the way, he gave an accessible introduction to post-quantum cryptography, which we really enjoyed, and we hope you do too. + +*Niki Vazou (0:00:48)*: Hello, Daniele. Do you want to tell us how you started with Haskell? + +*Daniele Micciancio (0:00:51)*: Yeah, sure. That’s in fact an interesting story. I discovered Haskell about 20 years ago. It was 2000, the early 2000s, and this was at an FCRC event, the Federated Computing Research Conference, where they have many different conferences in the same location, which happened to be in San Diego. So I’m in theoretical computer science, cryptography. That’s my general area. So I’m not exactly a PL or Haskell person, but it was a great opportunity to meet people from other conferences. And interestingly, lunch at the event was outdoors. All the food was outdoors in San Diego, so typically, you sit outside. They had these big tables for maybe, I don’t know, 10 people, where people would gather from different conferences. So there were maybe two people, me and somebody else from the theoretical computer science conference, and people from other areas. And then, just to break the ice, somebody said, “Hey, let’s just do a round of introductions.” And everybody says, “What’s your favorite programming language?” which is kind of a silly question. It’s like asking people, “What’s your favorite color?” And people started going around. And when it came to me, I said, “Okay, theoretical computer science. We use LaTeX to write papers, and then we compile it to PostScript.” At the time, PDF was still not the standard. PostScript was the target implementation language. And then after me, there were these two folks from Galois, the company. At the time, it was still called Galois Connections. And they said, “Oh yeah, we work at Galois Connections, and there we use Haskell, which is a functional programming language.” Nobody had ever heard of it before. So there were people who asked, “So what is this?” This was, keep in mind, the early 2000s. So this was a time when Haskell already had a pretty good community, but it was not as big as a – + +*NV (0:02:51)*: A smaller one. + +*DM (0:02:52)*: Yeah. There was big growth shortly after that, I think. And people asked about it. “Oh, is it like, I don’t know, the dot-com?” “Oh, no, no, no. Haskell is a pure functional programming language. You don’t have – yeah. Everything is a pure function.” And from the little that I remembered from studying or reading about functional programming (which had not been my area, so it was not much), I asked them, “Oh, is it the one that uses strings to do input/output?” And they said, “Oh, no, no, no. Haskell uses monads,” which sounded like a joke, but no, they were serious about it, and I was really, really intrigued. These are people from industry. If they’re using a programming language, it is because it’s useful.
And how can a language be useful if, just to do Hello World, you need to use monads? I wanted to understand. So I just took down the name “Haskell,” then I went to look for it. That night or the following day in my office, I searched for “Haskell,” and I started looking into it. And the first time I looked at a Haskell program, I looked at it and I thought, “Oh, I know this language. I’ve seen this before. I know exactly what it is.” The language was something that I definitely knew. I simply couldn’t remember where I had seen it before. And then, thinking about it, I started remembering that when I was an undergraduate—so that was in the early to mid-’90s—we had a course where we had to implement a compiler or interpreter of sorts for a programming language, and the implementation language that we had to use was Miranda. So that was before Haskell, but the syntax is very, very close. And then, when I remembered the name, I searched for “Miranda,” and I found out, oh yeah, Haskell was actually standardized around that time. Maybe Haskell was already out there, already standardized (it was maybe 1992), but Miranda was still around. So that was my first exposure to Haskell, and I really liked the Miranda language.  + +*NV (0:05:05)*: And where was that? Where was the Miranda course? + +*DM (0:05:09)*: So this was in Pisa, the University of Pisa, the computer science program. And that was perhaps the only course that used a functional programming language. And I remember I liked the course. I liked the language at the time. In fact – of course, it was lazy. And it is one of the courses from which I actually do remember something from when I was doing it, which was this: we had to implement the language, which had functions. I forget if it was to implement recursion or higher-order functions or something, but the standard way that we had studied and needed to implement was to use closures. Then, when working on the project with my teammates, Marco and Paolo, we figured out that if the language is lazy and you have a store, you don’t really need closures. You can just put a loop in the store, and it’s still going to work, which felt a little bit like cheating on the assignment. So before doing that, we asked the professor, the TA, if we were allowed to do that. Of course, the professor was very happy and surprised that we had figured out that we could do that.  + +I remember we were not even sure ourselves if it would work. I mean, I suggested doing that, and then we had doubts about, well, isn’t it going into an infinite loop? Is it going to work? Keep in mind, at the time we were undergrad students who did not have much experience. And there was a very simple way to see if it worked, which was to just do it and run it. I thought it was going to work, but it’s only after we actually ran the thing, and it didn’t loop and gave the correct result on the test program, that I felt sure, 100% sure, that yes, that’s how lazy evaluation works, and you can do some very, very amazing things with it. So, that’s how I started.  + +But then it didn’t really click at the time. I don’t know why. I guess there were just no opportunities to use Miranda, Haskell, or any similar languages after that for me, and maybe for other people in other places. Over there in college, I worked a little bit on logic programming, Prolog.
That’s what I did as a bachelor’s thesis with Giorgio Levi at Pisa. And that’s what most people – from what I remember, that’s what most people were doing there in the department. Logic programming was the topic that many people were interested in at the time, more so than functional programming. So I simply took a long break from functional programming. I had a feeling that I liked functional programming better than Prolog, than logic programming. Prolog seemed much more specialized. Functional programming feels a little bit more like a general-purpose programming language, more so than Prolog. + +*NV (0:08:19)*: I mean, now there are many trends that combine these two languages. + +*DM (0:08:23)*: Yeah, yeah, yeah. Of course. So you can combine the two things. And I remember – in fact, okay. So when I was in grad school, that was something totally different. I was at MIT. I was working on cryptography. That’s when I discovered cryptography, and I started working on that. But anyway, moving forward to Haskell. When I was in college, every now and then I would check to see, “Okay, so what’s going on? Is there something that combines logic and functional programming?” And I remember already back then there was something, but there was nothing that people were actually using. I mean, again, I was doing this not as a programming language researcher but just as a potential user. And before you get something you can actually use, it takes a little bit more than just coming up with a language and having an implementation.  + +But anyway, back to Haskell. So when I rediscovered Haskell much, much later, I started looking into it, and I was pleasantly surprised by how nice the language was. And I had to learn it again. I did remember something from back then, but not from using it. I basically learned Haskell 20 years ago. + +*NV (0:09:38)*: So when you had this discussion with the Galois people, you knew what a monad was? + +*DM (0:09:44)*: No.  + +*NV (0:09:44)*: Because I’m surprised. Like, what you said is that you hear somebody telling you, “We don’t have side effects, we have monads, and this is how we do it.”  + +*DM (0:09:53)*: No, no, no, no. + +*NV (0:09:53)*: And instead of discouraging you, you said, “Oh, that sounds amazing.” This is usually what discourages people, no? + +*DM (0:09:58)*: No, no, no. Yeah, no, no, no, no. I said that to me, it sounded like a joke, and I meant it literally. I mean, I knew the word “monad,” but it was not from programming languages. That was from Leibniz, from philosophy.  + +*MS (0:10:12)*: Okay. I was wondering the same thing.  + +*DM (0:10:13)*: It’s nothing to do with – yeah, it’s the same word. I’m not sure if there is any technical connection between the two – I don’t think so. + +*NV (0:10:21)*: I think there is, but I don’t know it. + +*DM (0:10:24)*: Yeah, I don’t know enough about philosophy. I should probably brush up on my philosophy. + +*NV (0:10:29)*: It goes through category theory for sure, but I think there is. I don’t know. Okay. + +*DM (0:10:35)*: Okay. So yeah. So I knew the word. I had no clue what a monad was, and it took me a little bit. I mean, there were monads in Miranda.  + +*MS (0:10:43)*: Really?  + +*DM (0:10:44)*: I think that’s something that came up later, correct? + +*NV (0:10:47)*: Hmm. + +*MS (0:10:48)*: There were no monads in Miranda, right? + +*DM (0:10:49)*: Yeah, yeah. So that’s something that was totally new for me. And so I had to learn Haskell.
So what I did was to pick this – I found this Gentle Introduction to Haskell, which I think was great. And then later on, I read that people were complaining about it not being gentle at all and being very hard to read, saying it should be called the “Brutal Introduction to Haskell.” But honestly, I thought it was great. It was exactly what I needed and what people like me need. You need something concise and to the point for people who know what programming is. They already have the concepts; they just need to learn the language. Plus, there are some new concepts that are used in Haskell, which I was not familiar with, but that’s something you can pick up within a survey. And I know there is often a lot of discussion about making things easier for everybody, but I mean, who’s taking care of people like me? I think that people who are not already in the field but are experts in something else within computer science are often left behind. So I thought that was a great way to learn Haskell. And then, I think it was written there – the suggestion was: from here, just read the standard library to learn more about Haskell. Which at first seemed a little bit daunting. You read code to learn the language.  + +But one of the nice things about Haskell is that you can actually learn a lot from reading programs, and the programs are readable and nicely written. That’s probably one of the things I like most about Haskell: it gives you a good way to express programs and algorithmic ideas as real code, and at the same time, it is concise and readable. There was a time when Real World Haskell was being written. So I remember reading the chapters of the book as they were being posted online for comments. I remember even sending a few typos to the authors. Okay, so that’s how I learned Haskell and started using it. I really liked the language. I thought it was very nice, especially because it has a small core, which is very easy to learn. And everything else is built on top of it. The language has some very good abstraction mechanisms. So there’s no limit. I mean, there are limits, okay? When you get into dependent types, of course, it starts being a little bit tight. But compared to the typical imperative programming language, Haskell is way above in terms of what you can do with abstraction. And you can start very, very small and then go up a little step at a time. And when you go up, it is more about ideas that are implemented on top of Haskell rather than extending the language itself with new features. And of course, GHC has all these sorts of language extensions that were added to it. But even without that, you can really go a long way with core Haskell and its type system.
And the papers, I know where they are. There is this Cryptology ePrint Archive where everybody posts their papers. So I wrote a little script: I give it the ePrint number, it downloads the paper, and it also downloads the title, the authors, all the information. And then, in order to keep the papers organized, it produces some markdown to keep an index on the tablet. It downloads the paper, the PDF, and puts it there. And all of this was done with a few lines of Haskell. It could have been done in many other languages, but I find Haskell convenient, especially since the code is small, so there is not that much to remember in terms of syntax. One of the problems with using a programming language when it’s not something that you do on a daily basis is that you don’t remember exactly what syntax you should use here and there, and Haskell, from that point of view, is very, very minimal. And the syntax is also well designed. It makes sense.  + +So I’ve been using it also in – so yes, you said teaching. I tried to use it in teaching. Probably the most interesting example is the theory of computation course. So this is not a programming language course. It’s a course where students study how to model computation mathematically. So a good part of the course is on automata theory, and from there we move into Turing machines. And a typical topic in the course is to show that regular languages, languages that can be described and recognized by automata, are closed under various properties. So they are closed under union, they’re closed under concatenation, and from there you go into regular expressions, and then you can transform regular expressions into automata. And all of this is done on paper, typically. So these are textbooks. They have pictures. They have pseudocode describing how to build automata to do one thing or another. These are small automata. Usually, if there is a picture describing the automaton, the automaton is not doing anything interesting: for example, checking if the number of A’s in the input string is even. I mean, that’s something that you don’t really typically care about. And of course, automata can be used to do a lot of interesting things. They’re used everywhere. They’re used in lexical analysis, in parsing, in context-free grammars, and there are many other things you can do with them.  + +So when it comes to closure properties, if you have, for example, an automaton for language A and an automaton for language B, typically the way you prove that the union of the two languages is still regular is by giving a construction that combines those two automata and produces an automaton for the union of the two languages. And so the proof is constructive. It not only gives a proof that the union is regular (regular means it is the language of a finite automaton), it also gives a way to build it. And typically, the interesting part of the proof is the construction. And then, once you have the construction, you prove by induction that the construction is correct. But typically, that’s pretty intuitive. The construction is the idea.  + +Now, usually, this is done as a mathematical proof written on paper.
So what I did was to use Haskell to basically do this, but in a way that you can program the construction and give assignments where students are asked to show that languages are closed under a certain property by writing a program that takes as input two automata and produces as output the automaton for the combination, which normally would be a pretty complex task for just a simple homework assignment. So what I did was to write a small parsing library for automata drawn using an editor. There are various editors for automata, various pieces of software to work with automata theory, mostly developed for educational purposes, where you can draw state transition diagrams on the screen and then save them in some form of XML format. Just using some XML parsing library, it’s very easy to import that into Haskell and turn it into an abstract data structure, which is very close to the way automata are described mathematically in the book. So the definition is: an automaton is given by a set of states (Q) and a transition function that, given the current state and the input symbol, tells you which state you should move to. So it’s just a simple function. And then, of course, you have the input alphabet. You have a set of accepting states, which is just a collection of objects. And importantly, the set of states should be finite, which is not something that can be easily captured in Haskell. But that’s fine. We were not getting to that level of detail in the implementation.  + +But something nice that you can do is to use the type system to perform these constructions, using the Haskell types to do what you want in mathematics. In the mathematical constructions, often you want to take the Cartesian product of the sets of states of the two input automata, or maybe the disjoint union of those states. When you build the concatenation of two automata, you first go through the first automaton, then you move to the second one. So you want the disjoint union of the two state sets. And this is something that can be done very easily using sum and product types: pairs, and Left and Right. + +*NV (0:21:28)*: Not dependent types, just polymorphism, basically. You’re using the polymorphism. + +*DM (0:21:33)*: Yeah, yeah. Just using polymorphism. So the input automaton is polymorphic in the state type. And the output of the construction uses a state type which is a combination of the two input state types. And you don’t need to go through numbering the states and maintaining a mapping between the numbers from one to n and the actual states of the automaton. Of course, with a more powerful language, with dependent types, you could define a finite set type to show that certain operations preserve the finiteness of the set, but that’s too much. This is a course for all students, not just theory students, and these are students who have never seen Haskell before. So I just wanted to use the minimal subset of the language that could be used to describe these constructions. In a way, it was almost identical to the textbook. So the solution typically was three or four lines of Haskell, lines that were even syntactically very similar to how the textbook was describing the construction mathematically. So you have to describe the transition function of the output automaton as a combination of the two, which can be done easily. You combine those two functions, and you define a new function. So Haskell was a very, very good fit for that.
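+ +A minimal sketch of the construction described here, with illustrative names (`DFA`, `unionDFA`, `equivalent`) rather than the actual course code; it also includes the symmetric-difference equivalence check that comes up below: + +```haskell
+import qualified Data.Set as Set
+
+-- An automaton polymorphic in its state type: a start state, a
+-- transition function, and the accepting states as a predicate.
+data DFA st sym = DFA
+  { start  :: st
+  , delta  :: st -> sym -> st
+  , accept :: st -> Bool
+  }
+
+-- Running an automaton is a left fold of the transition function.
+accepts :: DFA st sym -> [sym] -> Bool
+accepts m = accept m . foldl (delta m) (start m)
+
+-- Union via the Cartesian product of the two state types: run both
+-- automata in lockstep and accept if either accepts. (Concatenation
+-- would use the sum type, Either s1 s2, for the disjoint union.)
+unionDFA :: DFA s1 sym -> DFA s2 sym -> DFA (s1, s2) sym
+unionDFA m1 m2 = DFA
+  { start  = (start m1, start m2)
+  , delta  = \(q1, q2) c -> (delta m1 q1 c, delta m2 q2 c)
+  , accept = \(q1, q2) -> accept m1 q1 || accept m2 q2
+  }
+
+-- Emptiness by reachability over an explicitly supplied alphabet:
+-- the language is empty iff no reachable state is accepting.
+emptyLanguage :: Ord st => [sym] -> DFA st sym -> Bool
+emptyLanguage alphabet m = not (any (accept m) (Set.toList reach))
+  where
+    reach = go (Set.singleton (start m)) [start m]
+    go seen [] = seen
+    go seen (q : qs) =
+      let new = Set.fromList [delta m q c | c <- alphabet]
+                  `Set.difference` seen
+      in go (seen `Set.union` new) (qs ++ Set.toList new)
+
+-- Equivalence via symmetric difference: accept exactly where the
+-- two automata disagree, then check that the language is empty.
+equivalent :: (Ord s1, Ord s2) => [sym] -> DFA s1 sym -> DFA s2 sym -> Bool
+equivalent alphabet m1 m2 = emptyLanguage alphabet diff
+  where
+    diff = (unionDFA m1 m2)
+             { accept = \(q1, q2) -> accept m1 q1 /= accept m2 q2 }
+```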
+ +And then, once you can do this programmatically, you can do examples that are much more interesting than what you can do by hand. For example, you can do the following using automata. You can also define non-deterministic automata, where the transition function gives you a set, and you can use non-deterministic automata, for example, to model a noisy channel: a channel where you send an input string and the output string will contain some errors, but not too many. It will contain at most one error every five symbols.  + +And then you can ask students to design an automaton that works as an error-correcting code. It takes an input string and encodes it in a redundant way. And also a decoding algorithm, also coded as an automaton, that takes a noisy encoded string and should correct the errors and recover the original input. And once you have these automata, if you know that automata are closed under concatenation, you can run the program on these three automata and combine them together into a single one, which should just be the identity function. It should map any input to itself while correcting the errors in the process. And verification, checking if two automata are equivalent, is also something which is easily done using the closure properties of automata. You can build an automaton for the symmetric difference of the two languages and then check for emptiness. Check that there is no path taking the automaton to an accepting state.  + +So this is something that would not have been feasible to do by hand. Writing programs that implement these closure properties is a good way to see these proofs and constructions hands-on, being able to run them. Then you can design your own input automata, pass them as input, look at the output, and run it. And I thought it was a great way to bridge the gap between a fully theoretical approach and something more concrete for a general computer science audience.  + +I’m not sure this attempt was particularly successful because, even though I was using a very small subset of Haskell, I encountered some resistance from the students. The fact that it was a new and different programming language and their programs wouldn’t compile is not something that was making the students happy. They’d already gone through two years of struggle to get their Java programs to compile. Finally, they could write programs in Java without getting error messages, and getting back to the error message stage felt like the programming language’s fault. + +What the programming language was actually doing was providing a kind of error message that goes beyond pure syntax. If the syntax is correct, then in Haskell, you start getting error messages about the types. Type errors. The type errors typically say something about the mathematics that you’re trying to encode with the construction. These were exactly the errors that I was seeing students make when solving these problems by hand: writing a sequence of mathematical symbols that you cannot even parse, that makes no mathematical sense. I mean, you have something which is a function, and you put it in a place where instead you should have a set, or the other way around. And this discipline of using the right mathematical object where it should be used is something that is a very good match for what you get from simple polymorphic Haskell.  + +We were not using monads. We were not using type classes.
It was really basic Haskell. So students were given just a one-hour introduction to basic Haskell in a discussion session by a teaching assistant, and they were supposed to go from there. This is not a first-year course. This is a course for third- and fourth-year students. So I was also assuming that a student at that point can learn a programming language on their own, as most people do in regular life. Taking classes is not how you typically learn a new programming language. But students were a bit frustrated at first. And those who had enough energy to make it through this first barrier, in the end, were very happy about it. In fact, I still remember a nice story about a student who once came to my office hours after complaining, complaining, complaining, and blaming Haskell for all the problems of the world. One day, he came to my office and was very happy and told me, “Professor, professor, I finally found the trick to writing programs in Haskell.” “Oh, there is a trick? So what is the trick? That’s very useful. We should tell all the other students in the class.” He told me, “The trick to writing programs in Haskell is that before writing the program, you should think about it,” which is kind of obvious. In retrospect, that’s actually a very good feature of Haskell. You should think before writing any line of code. Thinking is good. So the fact that the language, just with the type system, forces you to spend a little bit more time thinking about what you want to do before you start doing it, that’s a good thing. Sure, it makes it harder to write the first line of code. It will take more time. But overall, you save a lot of time in the long run. And that’s one of the reasons I like Haskell. Sure, I have to think a little bit more. I never thought about it explicitly that, oh, one good feature of Haskell is that – I mean, I typically try to think about the problem that I want to solve before I start writing the solution. In some sense, with Haskell, you have no other option. So that was one of my experiences in using Haskell in education.  + +It was working very well for some of the students. Some students really liked it. But at some point, I started making it an optional part of the course because some students simply didn’t want to do it. In a big class of 200 students, only 100 students would submit the first Haskell assignment, and then only 50 students would submit the second Haskell assignment, then only 25. It was going down by a factor of two with every assignment. + +*NV (0:30:10)*: So what was the alternative? To do it on paper or to do it in another language, or not to do it at all?  + +*DM (0:30:18)*: No, the alternative – okay. So some students were asking, “Oh, I know how to do this, but I’d like to do it in Java.” And I told them, “Yeah, sure. I mean, I don’t think it’s easier. I think it’s harder to do. You have to write much more. And getting it right is also more difficult.” And once given the option, typically it became clear to the students that it was just wishful thinking that the difficulty was having to use Haskell. And no, no, no. I still think that Haskell is a very good language for this. And of course, with dependent types, you could do something that more closely matches the mathematics, but that would require more machinery. In some sense, from an educational point of view, I think it makes a lot of sense to use a language like Haskell that has a formal syntax and a formal semantics, where it is clear exactly what it means, but which is close to how you would write things mathematically.
So the goal was to focus on the mathematics. I didn’t want the programming language to be a distraction. And I think that doing this in a mainstream programming language like Java would have been a distraction for the course. Plus, it would not have helped the students learn how to write things mathematically. So writing programs in Haskell was also a way for students to learn something for when they’re asked to write things on paper, which typically they have to do in a final exam, in an in-class exam. So, interestingly, when asked to write things mathematically in the exams, when I was doing this, some students would write actual Haskell. + +*NV (0:32:04)*: It is easier. + +*DM (0:32:05)*: Or they would write some math-like, Haskell-like mathematics, using different symbols. So for Cartesian products, instead of using the times symbol, they were using the comma, the Haskell syntax, which was fine. I mean, that’s just a small syntactic difference, but it’s clear what they meant, and they were using the right mathematical structures. + +*MS (0:32:34)*: So that’s teaching, but can you tell us a little bit about your research work, maybe in terms that Haskell programmers might understand, and whether Haskell plays any role in that? Presumably, algorithms there could also be written as programs, right? + +*DM (0:32:48)*: Yeah, yeah, there are algorithms to be written. So, generally speaking, algorithms are theoretical objects. They’re typically written in pseudocode. Pseudocode is something I don’t quite like. The reason is you want to use something concise, so something short, and you want to abstract from irrelevant details. But the downside is that pseudocode is just free-form writing. You can write whatever you want. There are no constraints, and many times you end up with things where it is not even clear what they mean. It’s ambiguous, it is hard to understand, and it is nothing like code. You would think that in theoretical computer science, in mathematics, you should be precise. Pseudocode is exactly the opposite of being precise. And so languages like Haskell, I think, sit somewhere in between: they meet the need to write things concisely and abstract from details which are not relevant, while at the same time having precision, knowing exactly what things mean. So I’ve been trying, if not to use Haskell, at least to use ideas from Haskell to find better ways to write programs and to write the code that is used in my research papers.  + +And so my main research area is lattice cryptography. Okay. So that’s something that is based on using lattices. Lattices are like vector spaces, but over the integers: vectors with integer coordinates. So you can add them up, you can subtract them, but you cannot divide them. You cannot divide them by a scalar. You can multiply them by an integer, but you cannot scale them down. You cannot divide them by two. So it’s something very, very simple. And the reason these objects turn out to be interesting and useful in cryptography is that restricting problems to the integers makes the problems harder. Intuitively, you may think that it makes them easier, but that’s not quite so, especially when you focus on small integers. Consider the simplest case. I give you a linear equation, AX = B, with many variables. Over the reals, that’s very easy to solve. You can plug in any numbers you want. Then you divide by A, and you find your solution. Over the integers, it’s a little bit harder. You can plug in some numbers.
You have to do some form of GCD computation to find an integer solution, but you can still do it efficiently. But what if I ask you to find a small solution to the linear equation, for example, a (0, 1) solution, a solution with coefficients which are 0 and 1? I give you a sequence of numbers, A1, A2, A3, and you have to find 0-1 multipliers that produce, as a sum, a target value. So that’s exactly the subset sum problem. That’s the problem of finding a subset of the input numbers that adds up to B, which is a classic NP-hard problem, a problem for which there is no known efficient polynomial-time algorithm. And not only is no such algorithm known; assuming that P is different from NP, such an algorithm does not even exist. So that makes the problem hard.  + +And you can think of lattices as a sort of higher-dimensional generalization of subset sum. So in some sense, the set of solutions to subset sum is already a lattice. If I give you an equation, AX = B, and I ask you to find all the integer solutions – let’s do AX = 0, a homogeneous equation. Then the set of solutions is closed under addition. You can add two solutions; it’s still a solution. You can multiply a solution by an integer; it is still a solution. So the set of solutions forms a lattice. You cannot divide a solution by two, because if you divide by two, it may not be an integer solution anymore. So that’s a lattice. And the classic problem on lattices is: given a lattice, find a short vector in it. For example, it could be a vector with 0-1 coordinates. So it is a hard problem. The problem doesn’t seem very useful. It looks like you can – + +*NV (0:37:19)*: If you are dealing with small numbers and everything, why don’t you try them all?  + +*DM (0:37:22)*: Because you have many of them. So it is true that you can try 0 and 1. But if you have an equation with n variables, say 100 variables, each one of them 0-1, you have 2^n possibilities to try them all. So the dimension of the lattice, the number of coordinates, is actually used as a security parameter in cryptography, because the number of candidate solutions goes up exponentially in the number of variables. And in practice, lattices are actually being used these days, or are close to being used, because they have recently been picked as the basis for one of the standards for post-quantum cryptography. So the type of cryptography which is used everywhere these days is based on number theory, which theoretically can be broken using quantum computers. We still don’t have quantum computers capable enough to solve those problems, to, say, factor numbers. I think the biggest number that has been factored using quantum computers so far has been 21, 3 times 7. + +*NV (0:38:28)*: Oh, that is something. + +*DM (0:38:31)*: So, yeah, there are major challenges in building large-scale quantum computers where you can run efficient factoring algorithms. But the efficient quantum factoring algorithms have already been discovered, a long time ago. Peter Shor, in the mid-’90s, found such an algorithm, and that’s what brought attention to post-quantum cryptography in general. It was not yet called post-quantum. Post-quantum means that it is cryptography that you can run on regular computers. You don’t need a quantum computer to run it. But it will remain secure even when quantum computers get built. And one of the reasons people worry about it now, before they are built, is because you use encryption to protect your data.
You want to keep it hidden and secret. And these days, typically, you transmit your data over the internet. The internet is an open network. Now, anybody can collect this encrypted data being transmitted. So even if the quantum computer does not exist yet, you can store the data. And then, when the quantum computer is built, you can decrypt it. So you can perform a type of—they call it—”harvest now, decrypt later” attack. So this is considered a concern even before quantum computers are built, if this is something that may happen in the foreseeable future, right? Now, I’m not an expert in quantum computation, especially quantum engineering, the construction of quantum computers. But if you ask experts working in the area when we will have a large-scale quantum computer, they typically tell you, “Oh, we’ll have it in 10 years,” which is not that much in the future. I mean, I heard this 20 years ago. I heard it 10 years ago. It’s not entirely clear if it’s 10 years from today, or whatever “today” is, but anyway. + +*NV (0:40:34)*: So lattice-based cryptography is protected from quantum computers, or no? + +*DM (0:40:42)*: Yeah. This was in the mid-’90s, which was also the time when I was in grad school at MIT, and that’s when I discovered cryptography. I didn’t know anything about cryptography before. So this quantum factoring algorithm was discovered. And then there was also a major breakthrough on the construction of cryptography based on lattices, by Miklós Ajtai. So building cryptographic functions is a very difficult task, because you need not only a hard problem. The cryptanalysis problem should be hard, but it should be hard in a very strong sense. In algorithms, typically, you say you have solved a problem if you have an algorithm that, given an input instance of the problem, produces the correct output, and you expect the algorithm to always give the correct result. If for some input it gives the wrong result, then you cannot claim to have solved the problem algorithmically.  + +Now, in cryptography, the problem is: given a key, a cryptographic key, break it, for some definition of breaking it. And the key has been chosen at random. In this setting, even if you can solve the problem only with probability 1%, or even 0.001%, that means that you will break a good fraction of the cryptographic keys that are being used. So, for a problem to be usable in cryptography, it should be cryptographically hard, meaning it should be hard to solve even with a very small probability when the input is chosen at random. And finding problems which have this type of hardness, or hardness in the average case, is a much more difficult task than finding problems that we don’t know how to solve algorithmically. Those are everywhere. In fact, typically, the problem is that we wish we had a better solution, but we don’t have it.  + +So Ajtai was able to show that for a function, which is pretty much the subset sum function I described to you before, if you can solve random instances – so I give you a random equation A and you can find a small solution. If you can do that, even with a very small probability, then you can solve lattice problems in the worst case. You can use that to build an algorithm that, given any input lattice, finds a short vector in it with probability exponentially close to one.
So, assuming that there is no strong algorithmic solution that solves lattice problems with probability practically one on every input instance, then you can build lattice problems, in the form of a subset sum-like function, which are extremely hard. They are hard for randomly chosen instances with probability very close to one. So the probability of breaking them will be very close to zero. So this pinpointed lattices as a problem to be used in cryptography. And since then, lattice problems have been at the top of the list of problems that people working in quantum computing, in quantum algorithms, want to solve. So people have been trying to find quantum algorithms for these problems for 30 years by now, with very, very little progress. There are some specialized versions of lattice problems that got solved using quantum computers, but those that were being used to motivate and support lattice-based cryptography are still considered secure. Now, we don’t have a proof that there is no quantum algorithm that breaks these things, the same way as we don’t have any proof that P is different from NP. If P is equal to NP, you can guess the key, and then everything is broken. You can guess the secret key non-deterministically, and then everything is broken.  + +But anyway, so this is to – yeah. You asked me what I do in lattice cryptography. Lattice cryptography with a post-quantum crypto angle is often in the news. There is another thing about lattice cryptography that got a lot of attention, which is the ability to compute on encrypted data. So you can build encryption schemes where you can encrypt the data, and then, without knowing the secret key, without decrypting the messages, in fact, without even knowing what the messages are, you can take two ciphertexts and add them or multiply them together. You can perform operations on encrypted data. And once you can do addition and multiplication, or basic Boolean gates—AND, OR, NOT—then you can do any computation modeled as a circuit, a Boolean circuit or an arithmetic circuit. And to this day, essentially, lattices are the only type of cryptography that supports this type of application.  + +We don’t know how to build homomorphic encryption—that’s the name of this—using pre-quantum cryptography, using, say, the hardness of factoring integers. So these two things together got a lot of attention. And homomorphic encryption is still a long way from being widely usable on a large scale, meaning that it is not efficient enough to be used. In principle, you could use it to do an encrypted Google search. You want to search for something, you encrypt your query, you send it to your search engine. Google being – I mean, you choose where to send it. And without knowing your query, they run their search algorithm, they find the result, and the result is still encrypted. They send the result back to you, and you decrypt it. So in principle, this would allow you to perform a web search without revealing the query or the answer to the query. But if you were to do this at that scale, then the program would be way too slow. Plus, it would have to scan the entire internet.  + +One of the difficulties is that when working on encrypted data, if you want to keep everything hidden, even the control path of your computation should be hidden, because the control path reveals information about the input.
So for this reason, the computations are typically described as circuits, not as sequential programs with loops, recursion, or other sequential control structures, because those are things that reveal the control flow of the program. And if the data is encrypted, you cannot even tell what the control flow of the program should be. Should you terminate the loop or go for another round? You don’t know. The data is encrypted. You can perform the test in the loop, but the result of the test is still encrypted. You don’t know if it’s saying repeat the loop or not. So this is something that, of course, completely changes the programming model.  + +None of this is about Haskell. Haskell is something that I try to use. Okay, so something that I do whenever there’s a new programming language that I want to try: I try to write a lattice reduction algorithm, LLL. This is a classic algorithm which was discovered in the early ‘80s by László Lovász and Arjen and Hendrik Lenstra, and it is an algorithm that has had a lot of applications. It solves lattice problems, but only approximately: it finds a vector which is not the shortest vector in the lattice, but is at most, like, a million times longer than the shortest vector in the lattice, for some large approximation factor. The approximation factor is actually exponential in the dimension. Still, it is highly nontrivial to find such vectors. And the algorithm turned out to be useful in many, many applications outside of cryptography, too. And of course, it is relevant to the cryptanalysis of lattice-based cryptography.  + +I don’t even know how many different versions of the LLL algorithm I have implemented in Haskell, trying to see if it could be done well. That was not a good experience. That’s something I was doing mostly just to play around with Haskell. And you would normally expect that, oh, I’m doing mathematics, Haskell is a mathematically nice language, it should be a good fit. When it comes to working with vectors, integer vectors, and performing operations really, really fast, I know that there are a number of projects that have explored the use of Haskell, especially when it comes to parallelization, but I’ve never been successful with it. I mean, the nice thing is that I could write this LLL algorithm, which is highly nontrivial, in a very small number of lines of code. The code was also fairly readable. And what was written corresponded to meaningful mathematical concepts like Gram-Schmidt orthogonalization and so on. So the program was readable, but the performance was not good, and I never figured out why that was so. I somehow suspected that the reason was that lattice cryptography often uses integers which are very big, beyond 64 bits, and I had the feeling that the way this was being compiled in Haskell, every time you were performing an operation on two vectors, it would create a new integer for the result. It was not overwriting in place.  + +So this is something that would work very well in Haskell using machine-sized integers. It is not the language; it’s the compiler. But when working with an arbitrary-precision integer library, it was not properly optimizing the code. So it was allocating a lot of memory. The memory was also being deallocated, so there was no memory leak, but still, the memory allocation was taking a lot of time. So this was a long time ago. Perhaps it is better now.
Perhaps I should try to compile my programs from 15 years ago, 10 years ago, again and see if now they are any faster.  + +*NV (0:51:28)*: But you were using in-place update vectors, no? You were using Haskell’s vectors or your own? + +*DM (0:51:34)*: Yeah. So I was trying to use arrays. I also wrote a version with lists. But yeah, I tried to use arrays, the mutable arrays, doing the in-place updates and so on, and I think what was happening is that the – what is the technical term? The thunk? So the array is an array of fixed-size values, which are pointers to the integers. And that was being overwritten in place, most likely. But chances are that it was pointing to a different location every time. So the really time-consuming part was allocating the space: these are integers that can even grow in size and shrink in size. And when these programs are implemented in C using GMP or some other C-based, highly optimized library, the library goes to great lengths to try to avoid allocating new memory. So there are special versions of the operations that say: write it in place. And you assert in the program that, no, there is not going to be an overflow, you don’t need to allocate new memory, just do that. And to me, this always seemed like the type of problem that would have been a great fit for Liquid Haskell.  + +*NV (0:52:54)*: Yeah.  + +*DM (0:52:54)*: You don’t need the full power of dependent types. You want to keep track of the size of the vectors, the size of the numbers involved. + +*NV (0:53:03)*: Yeah. But then you need to tell the compiler, “Okay, that’s good. I know what I’m doing. Do not go.” + +*DM (0:53:09)*: Yeah. And there is no convenient way to do that. The other thing for which I didn’t find a convenient way is that, on the implementation side of this research, certain things you can write in assembly and get very good performance. And it is not that hard. I mean, I’m a theoretician, but I could do that. So there was a little bit of a surprise when – I mean, I looked a little bit at assembly when I was a child, with 8-bit computers. So when I looked at assembly again 20 years ago, I was surprised to see that instructions used to be three letters, three consonants. Now they’re still all consonants, no vowels in the assembly instructions, but now they are between seven and 12 letters. They have these very, very long names on Intel, especially when it comes to ADX-type instructions, when you want to use register-level parallelism, which is a very good fit for lattice cryptography. So you can really gain a lot by using those instructions.  + +And cryptographic functions tend to be very small. It is not a lot of code to write. So you can write highly optimized functions. And I was not really writing assembly. I was writing C but then compiling to assembly and looking at the result, because I didn’t know the instructions. Of course, I’m not going to read a manual. Then, from the instructions, I was trying to figure out the assembly, and then I was tweaking the C programs and putting in the right directives to get what I wanted, since it was clear to me what I wanted. + +So I was not able to do anything like this with Haskell. In fact, I’ve never been able to even figure out how to use these ADX, these streaming instructions. They’re supported, and you can get them to some extent using LLVM, but the level of control that you have as a programmer is not enough to do what I wanted. I’m sure that there are some projects that can do that.
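+ +As a rough illustration of the allocation issue (a sketch under assumptions, not Daniele’s actual code): with unboxed, machine-sized `Int`s, a mutable vector from the vector package really is updated in place, while a vector of arbitrary-precision `Integer`s holds pointers, and each arithmetic result is a freshly allocated heap number, which is the behavior described above. + +```haskell
+import qualified Data.Vector.Unboxed.Mutable as MV
+import Control.Monad (forM_)
+
+-- One step of the kind of vector arithmetic LLL performs,
+-- v := v - c*u, done in place. With unboxed Int elements this
+-- mutates the buffer directly; a boxed Integer version would
+-- allocate a new heap object for every result instead.
+subScaled :: MV.IOVector Int -> Int -> MV.IOVector Int -> IO ()
+subScaled v c u =
+  forM_ [0 .. MV.length v - 1] $ \i -> do
+    vi <- MV.read v i
+    ui <- MV.read u i
+    MV.write v i (vi - c * ui)
+```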
+ +*MS (0:55:02)*: Well, there’s a trade-off, right? I mean, there is high-performance numerics with vectors for Haskell in the Accelerate library, but it means that you need to write much more explicitly how computations happen. And I mean, you started out by saying that Haskell is a good alternative to pseudocode, so you really are interested in the readability of the code. And of course, you could fiddle with your Haskell code to make it more performant, but then you also lose some of the readability. + +*DM (0:55:35)*: Yeah. Actually, in this case, I would argue that readability would still be fine. I think Haskell would still be a good way to do things, because the computations they are performing are very structured. So even when you design the function or the way to do it, you think of it in a very structured way. You think of it as a circuit. + +*MS (0:55:53)*: Right. Then it seems it would be interesting to get together with Conal Elliott, who was on this podcast just a little while ago. + +*DM (0:56:01)*: Yeah, that was the last – yeah, I heard him talking about that. + +*MS (0:56:05)*: His Compiling to Categories framework gives you, among other things, a way to write down circuits. And then there are also ways to compile that to highly performant numeric code. But still, all this stuff is just a long way from really practical usage. So you need to put in a lot of arcane Haskell voodoo knowledge in order to make that sort of practical. + +*DM (0:56:31)*: Which is something that has been on my to-do list for when I find the time. But for me, this is not real work. So this is something that, for me, would be considered entertainment. My real work is lattice cryptography, meaning coming up with or analyzing mathematical constructions based on lattices. Programming doesn’t count as real work, as real research work.  + +But something that I’d like to mention in this context is that there has been a very interesting attempt by Chris Peikert and his student at the time. They wrote a lattice-based cryptography library in Haskell, and in fact, it is a very interesting library. It is the only library that implements lattice cryptography for a wide range of mathematical structures. So I described integer lattices. You can build lattices on larger rings. Instead of using integers, you can use polynomials with integer coefficients, polynomials in some quotient ring, typically cyclotomic polynomial rings. And cyclotomic polynomials in a dimension which is a power of two are particularly nice and simple, and they make the cryptography very, very efficient. And that’s one of the first things that I implemented, almost in assembly. They were not really used anywhere, these power-of-two cyclotomics, until 2007, when, with Chris and my student at the time, Vadim Lyubashevsky, and Alon Rosen, we implemented a lattice-based cryptographic function on power-of-two cyclotomic polynomials, and the code was pretty much written in assembly. And even though it was a mathematical construction, it was as fast as AES block cipher-type constructions, which are typically cooked up by simply putting together a sequence of operations that nobody understands, so it must be hard to break. Of course, this is an oversimplification and a distortion of that type of cryptographic construction. But in comparison to mathematical constructions, that’s what they are. And we showed that for power-of-two cyclotomics, you could implement things very, very, very fast.
So the code to compute these FFTs – you could count the number of clock cycles, like 200 clock cycles, and you get your FFT done, using a number of optimizations. And now these are implemented and used in pretty much all the mainstream lattice cryptography libraries, which are written in C, C++. There’s one of them written in Go. People have started using Rust a little bit, but they’re pretty much all the same thing.  + +And so Chris implemented – so they built this library that implements arbitrary cyclotomic rings, not just power-of-two ones. And when you move to arbitrary cyclotomic rings, you have to look into the factorization of the dimension to break the ring into a tensor product of smaller-dimensional rings. And all of this is encoded at the type level. They had to use singleton types and all the overhead that is needed to do some kind of dependent typing in Haskell. So the ring is parametrized at the type level. It’s a very interesting experiment from a library design point of view. But even with the C++ backend, they were never able, I think, to get a level of performance that was anywhere close to the other libraries. I think part of it is that using Haskell keeps most of the programming community cut out of it. So there are far fewer people who can possibly contribute to the project. People still think of Haskell as making things happen. In cryptography, Haskell is not popular at all. + +*NV (01:00:42)*: It is, but because it doesn’t have the libraries, no?  + +*DM (01:00:45)*: No. Okay. So let’s drop lattice cryptography. Let’s get to cryptography, which I think is where the interesting problems are, between cryptography and programming languages and perhaps things like Haskell. People working in cryptography are mostly theoretical computer scientists of the complexity theory type. So they work with Turing machines. That’s their model of computation, which makes it very easy to measure the running time of your program. Okay, so it took this many steps, this many instructions. So from a purely theoretical point of view, that’s fine. Of course, nobody is ever going to write an actual Turing machine to describe an algorithm. They’re unreadable, unwritable, and almost certainly wrong. They don’t give the correct result. It is not a convenient way to describe concrete programs. It’s fine as a generic model of computation. But that’s what people are familiar with. If they know a programming language, it’s C. Otherwise, it is imperative-style pseudocode. So if they see something written in Haskell, or even worse, Lisp with all those parentheses, they will think of it as something arcane. So why would you write something like this? Just to make it hard to read for other people? And if you have never seen it before, of course, you cannot read it. + +And moving from pseudocode to something that is more like actual code, for the sake of precision, something that is well defined, is something that a number of people in the community think would be nice. But typically, people will suggest, “Oh, why don’t we start using Python instead of pseudocode?” which, I mean – yeah, I’m not exactly – okay. I don’t want to say anything about it. + +*NV (01:02:34)*: So if there is some Haskeller that is listening to us right now, and they think that all this lattice and cryptography stuff is very, very cool, do you have any suggestion on what they should do, or how they can use Haskell and cryptography together?
+ +*DM (01:02:53)*: So outsiders think that cryptography, when it comes to cryptography in practice, is about implementing cryptographic functions. That’s not the main job of a cryptographer. That’s not the hard part of doing cryptography. The hard part, the part where the ingenuity is, is cryptographic security definitions. How do you specify what is achieved by your cryptographic construction? And you can define what security means mathematically – so this was done starting in the early ‘80s, with the work of Goldwasser and Micali on probabilistic encryption. For the first time, they showed that you can give a definition of what security means, and it’s a mathematical definition. And this definition is something that is not part of the program.  + +So ideally, I think that a good way to use programming languages together with cryptography would be to have a lightweight method to capture security properties, for example, at the type level. But these should be cryptographic properties, properties that are cryptographically meaningful and that correspond to what cryptographers mean when they say that something is secure. And it shouldn’t take much more than the equivalent of a type annotation to describe what you’re doing. There have been a number of things done in the past using types to describe security properties, but there has always been a little bit of a mismatch between what is done on the programming language side and what cryptographers would like to have, what is meant for something to be secure on the security side. And I have tried to do some work along those lines: trying to come up with symbolic models of cryptography, where cryptography is treated as an abstract data type, but the model is computationally sound, meaning that if two cryptographic expressions are equivalent symbolically, then they are secure in the way cryptographers mean it. And then you can try to use that inside the programming language. So it’s all open. It’s all work that is still to be done. + +*NV (01:05:12)*: I guess you want: if your program type checks, it is cryptographically secure. + +*DM (01:05:17)*: Yes, exactly.  + +*NV (1:05:18)*: Easy.  + +*DM (1:05:20)*: Okay. It’s not the type checking that is the hard part. I think that the barrier right now is: what does it mean to be secure? People, I mean, commoners, not cryptographers, use security as a catch-all word. Oh, something is secure, just one bit: secure, yes. I don’t have to worry about it. Let’s move on. In cryptography, security can have many, many different meanings. You can have authenticity of your messages, unforgeability of signatures, the fact that ciphertexts do not reveal any information about the encoded message. So there is something to it. Saying that something is secure is similar to saying, “Oh, let’s do a strongly typed programming language where there is only one type.” If there is only one type, it is not really doing anything interesting. So you would like to have a type system that captures the security properties.  + +And what makes it hard to do, compared to the work that has been done so far in programming languages, is that the cryptographic definitions are given in terms of what an adversary cannot do. So you want to say that no adversary can tell the difference between the encryption of zero and the encryption of one, for example. So it’s not about what the program does, but about the adversary, which is not running your program. It can do anything.
+
+And what makes it hard to do, compared to the work that has been done so far in programming languages, is that cryptographic definitions are given in terms of what an adversary cannot do. So you want to say that no adversary can tell the difference between the encryption of zero and the encryption of one, for example. So it’s not about what the program does; it is about what the adversary can do, and the adversary is not running your program. It can do anything. It can do things written in a different programming language. What the adversary cannot do – that’s what you want to try to capture in your program. And the type checker should say that, yes, the designer wrote the program in this nice programming language with the type checking and everything, and you can conclude that when it is used on a network, exchanging messages with possibly adversarial parties, things will not break down. You still have certain types of guarantees. Perhaps I should have talked less about the other things and focused on –
+
+*NV (01:07:19)*: Yeah. Oh my gosh.
+
+*MS (1:07:21)*: I’m very happy with this episode.
+
+*NV (1:07:23)*: I want to say now that this sounds like monadic encapsulation, but maybe we should have another episode about that, because the monad captures what can go in and what can go out.
+
+*DM (01:07:36)*: Yeah. Let me tell you something, and then you can say whether we keep talking or leave it for another one. It doesn’t have to be an episode; I’m happy to talk about this outside of the episode. So I have a cryptography paper that has the word “comonad” in the title.
+
+*NV (01:07:52)*: Ah.
+
+*DM (01:07:52)*: Comonad. And I can tell you briefly what the intuition is. So why comonads? These symbolic models of cryptography are something that has been done before. There was a first paper trying to build a computationally sound, cryptographically secure symbolic model of cryptography by Abadi and Rogaway 25 years ago. They considered an adversary that, given some encrypted data, maybe recovers some keys, then tries to recover more and more keys, and so gets some view of the communication. And they were trying to prove that if you do this symbolically, just treating the cryptographic expressions as symbolic terms, then if two expressions look the same symbolically, no adversary can tell them apart when they are implemented using real cryptography. And the result required assuming that the expressions didn’t contain encryption cycles. If you were encrypting K1 with K2, K2 with K3, and so on, and then K5 back with K1, the result wouldn’t work.
+
+At some point, I realized that the real difficulty is that in cryptography, when you reason about what the adversary cannot do, you should go the other way around. You should start from the assumption that, hypothetically, all the keys are known to the adversary – the adversary can decrypt under all the keys. And then, using the same key recovery operator used by Abadi and Rogaway for the symbolic treatment, instead of starting from the empty set and getting more and more keys, you start from the set of all keys. It’s a monotone operator, so you will get smaller and smaller sets, and you will reach a greatest fixed point of the key recovery operator, which tells you that the keys outside of this set – the adversary surely cannot get to them. And this allows you to establish a similar cryptographic soundness result without having to assume that the expression has no encryption cycles. And taking greatest fixed points is a form of coinduction. Okay. Sorry, I was wrong. I didn’t use the word “comonad.” I used the word “coinduction.”
+
+*NV (1:10:14)*: Coinduction.
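+
+*(Editor’s note: a minimal Haskell sketch of the greatest-fixed-point idea Daniele describes; it is not code from his paper. The `Expr` syntax and the `recoverable` operator are simplified assumptions.)*
+
+```haskell
+import           Data.Set (Set)
+import qualified Data.Set as Set
+
+type Key = Int
+
+-- Simplified symbolic expressions: a key sent in the clear, or an
+-- encryption of an expression under a key.
+data Expr = K Key | Enc Key Expr
+
+-- Keys the adversary can extract, assuming it already knows the keys
+-- in `known` and can therefore decrypt under them.
+recoverable :: [Expr] -> Set Key -> Set Key
+recoverable es known = Set.fromList (concatMap open es)
+  where
+    open (K k) = [k]
+    open (Enc k e)
+      | k `Set.member` known = open e
+      | otherwise            = []
+
+-- Greatest fixed point: start from the set of *all* keys occurring in
+-- the expressions and iterate. The operator is monotone, so the sets
+-- only shrink, and on a finite key set the iteration terminates. Keys
+-- outside the result are beyond the adversary's reach, even when the
+-- expressions contain encryption cycles.
+gfpKeys :: [Expr] -> Set Key -> Set Key
+gfpKeys es s
+  | s' == s   = s
+  | otherwise = gfpKeys es s'
+  where
+    s' = recoverable es s
+```
+
+*(Starting from the empty set and growing instead yields the least fixed point used in the original Abadi and Rogaway treatment.)*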
+
+*DM (1:10:15)*: Then, based on that – it is true that you can think of using monads to model a certain type of execution. What I was trying to do – I had the intuition that perhaps one should use comonads instead, to do this in cryptography in a way that is cryptographically meaningful. Actually, I was never able to figure out how to use comonads to do this, but I have the feeling that it is something that may be interesting to do. Okay, so perhaps you should edit out the part of the interview where I said I have a paper with the word “comonad.”
+
+*NV (01:10:54)*: Actually, it’s good enough. Coinduction comes by default in Haskell. You know that, no? Because everything can be infinite. So maybe you don’t need comonads to implement it, but yeah.
+
+*DM (01:11:04)*: Yeah, no, the comonads would have been a combination of coinduction together with monads. The monads are interesting for other reasons. Everything is probabilistic, so sure, you have probability monads. And computation in the presence of an adversary is typically modeled as having access to some kind of oracle, so oracle-based computation is also something that can be given some kind of monadic treatment. So I think the abstraction mechanisms of Haskell are a very good way to try to formalize what you typically find described in cryptography papers in a totally unstructured way – just plain English words describing some complex interaction model.
+
+*NV (01:11:53)*: Cool. So thank you very much. This was a very interesting interview.
+
+*DM (01:11:58)*: Okay. Well, thanks for having me. I look forward to your next episodes. I’ve been listening to your podcast since I discovered it, and I still have to catch up on some of the other episodes. They are great. You’ve been interviewing some very interesting people, and as I told you before, I feel a little bit out of place – I am an intruder, not exactly a Haskell person. But cryptography is an interesting topic overall.
+
+*NV (01:12:26)*: Yeah. Cool. Thank you very much.
+
+*Narrator (01:12:32)*: The Haskell Interlude Podcast is a project of the Haskell Foundation, and it is made possible by the generous support of our sponsors, especially the Gold-level sponsors: Input Output, Juspay, and Mercury.