diff --git a/lectures/lec06.tex b/lectures/lec06.tex index 69a117b..7135490 100644 --- a/lectures/lec06.tex +++ b/lectures/lec06.tex @@ -1,4 +1,414 @@ + \chapter{RSA Signatures} -\label{sec:rsa} -TBD. +In this chapter, we will discuss the RSA digital-signature scheme. +% +%\paragraph{Why study RSA?} +Even though the RSA cryptosystem is going out of style, for reasons +we will discuss, it is still worth studying: +\begin{itemize} + \item RSA's security is related to the problem of factoring large integers, + which is (arguably) the most natural ``hard'' computational problem + out there. + + \item RSA gives the only known instantiation of a \emph{trapdoor one-way permutation}, + which we will define shortly. + + \item RSA has a number of esoteric properties that are useful for advanced + cryptographic constructions. For example, it gives a ``group of unknown order.'' + See + \href{https://toc.cryptobook.us/book.pdf#page=436}{Boneh-Shoup, Chapter 10.9} for details. + + \item RSA signatures are used on the vast majority of public-key certificates today.\footnote{As + of today, around 94\% of certificates in the Certificate Transparency logs use RSA signatures: +\url{https://ct.cloudflare.com/}.} + +\section{Background: RSA} +\paragraph{1974:} Ralph Merkle introduced public key exchange in a 1974 + undergraduate project report at Berkeley~\autocite{M78}. + He gave a key-exchange protocol based on one-way functions in + which the honest parties run in time $n$ and the best attack + runs in time $\Omega(n^2)$. + +\paragraph{1976:} Diffie and Hellman, in their \emph{New Directions} paper~\autocite{DH76}, + defined public key exchange, public-key encryption, and digital signatures. + They constructed a key-exchange scheme from discrete log with conjectured security + against all poly-time adversaries: honest parties run in time $n$, + the attacker runs in superpolynomial time. 
+ +\paragraph{1977:} Rivest, Shamir, and Adleman (RSA)~\autocite{G77,RSA} give the \emph{first} construction + of public-key encryption and digital signatures from a problem + related to the hardness of factoring integers. + + Later results from Lamport, Merkle, Naor and Yung, and others showed that + it is possible to build digital-signature schemes from one-way functions alone---i.e., + just from standard hash functions. + Today, we still do not know how to construct public-key encryption or key exchange + from one-way functions. + +\paragraph{2011:} Google stops using RSA-based key exchange by default on their front-end web servers. + Instead, they use RSA-based key exchange only for backwards compatibility with old clients. + (Most HTTPS servers today still use RSA for digital signatures to authenticate + the messages in a Diffie-Hellman key exchange.) + +\end{itemize} + +\section{Trapdoor one-way permutations} + +\subsection{Definition} +RSA implements a \emph{trapdoor one-way permutation} (``trapdoor OWP''), which we will now define. + +A trapdoor one-way permutation over input space $\calX$ is a triple of +efficient algorithms:\marginnote{If we wanted to be completely formal, the +input space would be parameterized by the security parameter $\lambda$. +So we would have a family of input spaces $\{\calX_\lambda\}_{\lambda \in \N}$---% +one for each choice of $\lambda$. This way the input space can grow with~$\lambda$.} + +\begin{itemize} + \item $\Gen(1^\lambda) \to (\sk, \pk)$. + The key-generation algorithm takes as input the security parameter $\lambda \in \N$, + expressed as a unary string, and outputs a secret key and a public key. + \item $F(\pk, x) \to y$.\marginnote{In the RSA construction, the input space $\calX$ +depends on the public key, but we elide that technical detail here.} + The evaluation algorithm $F$ takes as input the + public key $\pk$ and an input $x \in \calX$, and outputs + a value $y \in \calX$. + \item $I(\sk, y) \to x'$. 
+ The inversion algorithm $I$ takes as input the secret key $\sk$ + and a point $y \in \calX$, and outputs its inverse $x' \in \calX$. +\end{itemize} + +\paragraph{Correctness.} +Informally, for keypairs $(\sk, \pk)$ output by $\Gen$, +we want $F(\pk, \cdot)$ and $I(\sk, \cdot)$ to be inverses of each other. +More formally, for all $\lambda \in \N$, $(\sk, \pk) \gets \Gen(1^\lambda)$, and $x \in \calX$, +we require: +\[ I(\sk, F(\pk, x))= x.\] + +\paragraph{Security.} +Security requires that $F(\pk, \cdot)$ is hard to invert (in the sense of a one-way function) +on a randomly sampled input in the input space $\calX$, even when the adversary +is given the public key $\pk$. +That is, for all efficient adversaries $\calA$, there exists a negligible function $\negl(\cdot)$ +such that +\[ \Pr\left[ +\calA(\pk, F(\pk, x)) = x +\colon \begin{aligned} + (\sk, \pk) &\gets \Gen(1^\lambda)\\ + x &\getsr \calX +\end{aligned}\right] \leq \negl(\lambda).\] + + +\paragraph{\textbf{IMPORTANT}:} +Just as a one-way function is only hard to invert on a \emph{randomly sampled input}, +a trapdoor one-way permutation is only hard to invert on a randomly sampled input. +Many of the cryptographic failures of RSA come from assuming that the RSA +one-way function is hard to invert on non-random inputs. + + +\subsection{Digital signatures from trapdoor one-way permutations} + +This construction is called ``full-domain hash.''\autocite{BR93} +The construction makes use of a hash function $H$, and the resulting +signature scheme is secure, provided that we model the hash function~$H$ +as a ``random oracle.'' + +In other words, to argue security, it is not sufficient to show that +$H$ is, for example, collision resistant. +Instead, we can only prove security provided that we pretend that $H$ +is a truly random function---i.e., in the random-oracle model. 
+When we instantiate the hash function $H$ with some concrete cryptographic +hash function, such as SHA256, we hope that the resulting signature +scheme is still secure. +In practice, this approach works quite well. + +One way to think about it is that if a signature scheme is secure +in the random-oracle model, then the concrete signature scheme +is in some sense secure against attacks that do not exploit the peculiarities +of the hash function. + +\medskip + +In the construction, we use: +\begin{itemize} + \item a trapdoor one-way permutation $(\Gen, F, I)$, and + \item a hash function $H \colon \zo^* \to \calX$, + which we model as a random oracle in the + security analysis. +\end{itemize} + +\paragraph{Construction.} +We construct a digital-signature scheme $(\Gen, \Sign, \Ver)$ as follows: +\begin{itemize} + \item $\Gen$ -- Just run the key-generation algorithm for the + trapdoor one-way permutation. + \item $\Sign(\sk, m) \to \sigma$. + Hash the message down to an element $h$ of the input space $\calX$ + of the trapdoor OWP using the hash function $H$. Then invert + the trapdoor OWP at that point: + \begin{itemize} + \item Compute $h \gets H(m)$. + \item Output $\sigma \gets I(\sk, h)$. + \end{itemize} + \item $\Ver(\pk, m, \sigma) \to \zo.$ + \begin{itemize} + \item Compute $h' \gets H(m)$. + \item Accept if $F(\pk, \sigma) = h'$. + \end{itemize} +\end{itemize} + +Notice that the use of a hash function here is \textbf{critical} to +security, since (in the random-oracle model) it means that forging a signature +is as hard as inverting $F$ on a random point in its co-domain. 
+Without the hash function, forging a signature is only as hard as +inverting $F$ on an attacker-chosen point in its co-domain, which +could be easy.\marginnote{In fact, inverting $F$ at attacker-chosen points +\emph{is} easy when $F$ is the RSA function.} + +\paragraph{Correctness.} +For all $\lambda \in \N$, $(\sk, \pk) \gets \Gen(1^\lambda)$, and $m \in \zo^*$, +we have: +\begin{align*} +\Ver(\pk, m, \Sign(\sk, m)) &= 1\{ F(\pk, I(\sk, H(m))) = H(m) \}\\ +&= 1\{ I(\sk, F(\pk, I(\sk, H(m)))) = I(\sk, H(m)) \} +\intertext{and by correctness of the trapdoor one-way permutation:} +&= 1\{ I(\sk, H(m)) = I(\sk, H(m)) \} = 1. +\end{align*} + +\paragraph{Security.} +The intuition here is that if the adversary cannot invert $F$, +it cannot find the preimage of $H(m)$ under $F$ for any message +on which it has not seen a signature. +See \href{https://toc.cryptobook.us/book.pdf#page=550}{Boneh-Shoup Chapter 13.3} +for the full security analysis. + +\section{The RSA construction: Forward direction} + +The algorithms for +key-generation and +for evaluating the RSA permutation +in the forward direction are not too complicated. + +In what follows, we present RSA with +public exponent $e=5$. +The same construction works with many other choices of $e$, +just by replacing all of the ``5''s below with some other +small prime: 3, 7, 13, etc. +A popular choice of the public exponent $e$ in practice is $e=2^{16}+1$. +The complexity of computing the RSA function in the forward +direction scales with the size of $e$, so we prefer small +choices of $e$. 
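+
+To make the cost comparison concrete, here is a short worked count. It assumes
+that we evaluate the RSA function by standard repeated squaring; the notes do not
+fix a particular exponentiation method, so treat this as an illustration only.
+For $e = 2^{16}+1$ we can write
+\[ x^{2^{16}+1} = \bigl(x^{2^{16}}\bigr) \cdot x \bmod N, \]
+where computing $x^{2^{16}}$ takes 16 successive squarings, for a total of
+$16 + 1 = 17$ multiplications modulo $N$.
+For $e = 5$, the same method needs two squarings and one multiplication,
+or three multiplications in total.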
+ +\begin{itemize} + \item $\Gen(1^\lambda) \to (\sk, \pk)$.\marginnote{In practice, + we usually take the bitlength of primes to be $\lambda=1024$ + or $\lambda = 2048$.} + \begin{itemize} + \item Sample two random $\lambda$-bit primes $p$ and $q$ + such that $p \equiv q \equiv 4 \pmod 5$.\marginnote{% + Standard RSA implementations require + the weaker condition that the public exponent $e$ + shares no prime factors with $p-1$ and $q-1$. + Using the stronger condition here simplifies + the inversion algorithm.} + \item Set $N \gets p \cdot q$. + \item Output $\sk \gets (p, q)$ and $\pk \gets N$. + \end{itemize} + \item $F(\pk = N, x) \to y$. +\begin{itemize} + \item The input space for the RSA function is \\ + $\calX = \Z_N = \{0, 1, 2, 3, \dots, N-1\}$.\marginnote{% + To be completely precise, we should write that the + input space of the RSA function is $\Z^*_N$, which is the + set of numbers in $\Z_N$ that are relatively prime to the modulus~$N$. + The distinction is harmless because we only ever sample random numbers + from $\calX$, and the probability that a random sample + from $\Z_N$ is not also in $\Z_N^*$ is +\begin{align*} + 1 - \frac{\abs{\Z^*_N}}{\abs{\Z_N}} &= 1 - \frac{(p-1)(q-1)}{N}\\ +&= 1 - \frac{N - p - q + 1}{N}\\ +&\leq \frac{p + q}{N}\\ +&\approx 2/\sqrt{N}\\ +&\approx 2^{-\lambda}\end{align*} + which is negligible in the security parameter $\lambda$. + In other words, you are as likely to hit one of these ``bad'' + elements as you are to guess a prime factor of $N$. + } + \item Output $y \gets x^5 \bmod N$. +\end{itemize} +\end{itemize} + +\begin{remark} +The key-generation algorithm relies on us being able +to sample large random primes. +One perhaps surprising fact is that there are many, many +large primes. 
+In particular, if you pick a random $\lambda$-bit number, +the probability that it is prime is roughly $1/\lambda$.\marginnote{For more +on this, look up the Prime Number Theorem.} + +We can sample a random $\lambda$-bit prime by just +picking random integers in the range $[2^{\lambda - 1}, 2^{\lambda})$ +until we find a prime. +We can test for primality in $\approx \lambda^4$ time using +the Miller-Rabin primality test. +We also need that there are infinitely many primes congruent to +$4 \bmod 5$, but fortunately there are. +Generating RSA keys is expensive---it can take +a few seconds even on a modern machine. +\end{remark} + +Notice that computing the RSA function in the forward direction is +relatively fast: it just requires three multiplications +modulo a 2048-bit number $N$. That is, to compute $x^5 \bmod N$, we compute: +\[ (x^2)^2 \cdot x = x^5\mod N.\] + +\medskip + +Before describing the RSA inversion algorithm, we discuss +why the RSA trapdoor one-way permutation should be hard +to invert without the secret key. + +\subsection{Why should the RSA function be hard to invert?} +To invert the RSA function, the attacker is effectively given +a value $y \getsr \Z_N$ and must find a value $x$ such that $x^5 = y \bmod N$. +Or, put another way, the attacker's task is essentially the following: +\begin{itemize} + \item \textbf{Given:} A polynomial $p(X) \deq X^5 - y \in \Z_N[X]$, + for $y \getsr \Z_N$. + \item \textbf{Find:} A value $x \in \Z_N$ such that $p(x) = 0 \in \Z_N$. +\end{itemize} + +So the attacker must find a root of a polynomial modulo a composite integer~$N$. + +The premise of RSA-style cryptosystems is that we only +know of essentially two ways to find roots of polynomials modulo $N$: +\begin{itemize} +\item \textbf{Factor $N$ into primes} and find a root modulo each of the primes. + (We will say more on this in a moment.) 
+ Since the best algorithms for factoring run in time roughly $2^{\sqrt[3]{\log N}} \approx 2^{\sqrt[3]{\lambda}}$, + this approach is infeasible at present for moduli of the sizes used in practice. + +\item \textbf{Find a root over the integers} and reduce it modulo $N$.\marginnote{Actually, + it suffices to find a root over the rational numbers, but the distinction isn't important here.} + For example, it is easy to find a root of polynomials such as:\\ +\begin{align*} + X + 4 &= 3 &&\mod N,&\\ + X + 2Y &= 5 &&\mod N,&\\ + X^2 &= 9 &&\mod N\text{, and}&\\ + X^2-3X+2 &= (X-2)(X-1) = 0 &&\mod N. +\end{align*}\marginnote{There are many clever attacks +for solving polynomial equations modulo composites +that work in certain special cases, but for most purposes +these are the two known attacks.} +When $y \getsr \Z_N$, the probability that $y$ is a perfect +5-th power, and thus that there is an integral root to $X^5 - y$, +is $\sqrt[5]{N}/N \approx 2^{-8\lambda/5}$ (recall that $N \approx 2^{2\lambda}$), which is negligible +in the security parameter~$\lambda$. +So solving this equation over the integers is a dead end. + +\paragraph{Is inverting the RSA function as hard as factoring the modulus?} +No one knows---the question has been open since the invention of RSA. +We do know that finding roots of certain polynomial equations, such as +$p(X) \deq X^2 - y \bmod N$ for random $y \getsr \Z_N$, \emph{is} as +hard as factoring the modulus $N$. +But for RSA-type polynomials, the answer is unclear. + +\end{itemize} + +\section{The RSA construction: Inverse direction} + +To understand how the inversion algorithm works, we will need some +number-theoretic tools. + + +\subsection{Tools from number theory} +For a natural number $N$, let $\phi(N)$ denote the number of integers +in $\Z_N$, written here as $\{1, 2, 3, \dots, N\}$ with $N$ standing in for $0$, that are relatively prime to $N$.\marginnote{Two +natural numbers are \emph{relatively prime} if they share no prime factors.} +When $p$ is prime, $\phi(p) = p-1$. 
+The function $\phi(\cdot)$ is called \emph{Euler's totient function}. + +When $N=pq$ is the product of two distinct primes, $\phi(N) = (p-1)(q-1)$. +That is so because all numbers in $\Z_N$ are relatively prime to $N$ except +$N$ and the multiples of $p$ and $q$: +\[ p, 2p, 3p, \dots, (q-1)p,\ \ \ q, 2q, 3q, \dots, (p-1)q.\] +So there are $N - (q-1) - (p-1) - 1 = (p-1)(q-1)$ numbers +in $\Z_N$ relatively prime to $N$. + +\begin{theorem}[Euler's Theorem]\label{thm:euler} +Let $N$ be a natural number. Then for all $a \in \Z^*_N$, +\[ a^{\phi(N)} = 1 \mod N.\] +\end{theorem} +\begin{proof} +Consider the sets $\Z^*_N$ and $\{ax \bmod N \mid x \in \Z^*_N\}$. +These sets are equal, so the product of the elements in the two +sets is equal. +Let $X \gets \prod_{x \in \Z^*_N} x \bmod N$. +Then +\[ X = a^{\phi(N)}X \bmod N \quad \Rightarrow\quad 1 = a^{\phi(N)} \bmod N,\] +where the implication holds because $X$ is a product of elements of $\Z^*_N$ and is therefore invertible modulo $N$. +\end{proof} + +\begin{lemma}\label{lemma:inv} +Let $p$ and $q$ be distinct primes congruent to $4$ modulo $5$, and let $N = pq$. +Define $d = \frac{\phi(N) - 4}{5} + 1$. +Then $5d \equiv 1 \bmod \phi(N)$. +\end{lemma} + +\begin{proof} +Observe that +\[ p \equiv q \equiv 4 \bmod 5 \quad \Rightarrow \quad \phi(N) = (p-1)(q-1) \equiv 3 \cdot 3 \equiv 4 \bmod 5,\] +so $\frac{\phi(N) - 4}{5}$ is an integer and thus $d$ is well defined. +Then $5 d = \phi(N) - 4 + 5 = \phi(N) + 1 \equiv 1 \bmod \phi(N)$. +\end{proof} + + +\subsection{Inverting the RSA function} + +With the number theory out of the way, we can now describe how +to invert the RSA function. +All we have to do is to show how to compute a fifth root of $y \bmod N$. + +\begin{itemize} + \item $I(\sk, y) \to x$. +\begin{itemize} + \item The secret key $\sk$ consists of the prime factors $p,q$ of $N$. + Recall that $\phi(N) = (p-1)(q-1)$. + \item Compute the integer $d \gets \frac{\phi(N) - 4}{5} + 1$, as in + \cref{lemma:inv}.\marginnote{% + We sometimes call $d$ the \emph{private exponent} in RSA.} + + \item Return $y^d \bmod N$. +\end{itemize} +\end{itemize} + +It is not obvious why the inversion algorithm is correct. 
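+
+\begin{remark}
+A toy instance may help before the general argument. The numbers below are
+chosen only for illustration and are far too small to be secure.
+Take $p = 19$ and $q = 29$, both congruent to $4$ modulo $5$. Then
+\begin{align*}
+  N &= 19 \cdot 29 = 551, \\
+  \phi(N) &= 18 \cdot 28 = 504, \\
+  d &= \frac{504 - 4}{5} + 1 = 101,
+\end{align*}
+and indeed $5d = 505 = \phi(N) + 1$.
+For the input $x = 2$ we get $y = 2^5 = 32$, and inverting gives
+\[ y^d = 32^{101} = 2^{505} = 2 \cdot 2^{504} = 2 \bmod 551, \]
+where the last step uses $2^{504} = 1 \bmod 551$, which follows from \cref{thm:euler}
+since $2$ is relatively prime to $551$.
+\end{remark}
+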
+Say that $y = x^5 \bmod N$. +Then: +\begin{align*} + y^d &= (x^5)^d &&\bmod N\\ + &= x^{5d} &&\bmod N\\ +&= x^{k \cdot \phi(N) + 1} &&\bmod N, \qquad \text{for some $k \in \Z$, by \cref{lemma:inv}}\\ + &= x \cdot (x^{\phi(N)})^k &&\bmod N\\ +&= x &&\bmod N, \qquad\text{by \cref{thm:euler}}.\\ +\end{align*} +We could write $5d = k \phi(N) + 1$ because from \cref{lemma:inv}, +we know that $5d \equiv 1 \bmod \phi(N)$. + + +\paragraph{Is inverting RSA as hard as factoring the modulus $N$?} +The inversion algorithm we showed here requires knowing the prime factors +of the modulus $N$. +Inverting RSA is thus \emph{no harder than} factoring $N$. + +Is inverting RSA \emph{as hard as} factoring $N$? +In particular, if we have an efficient algorithm $\calA$ that inverts +RSA, can we use $\calA$ to factor the modulus $N$? +No one knows! + +Most cryptographers, I would guess, believe that inverting the RSA function +is as hard as factoring. +But for all we know, it could be that computing fifth roots modulo $N$ is \emph{easier} +than factoring the modulus. + + diff --git a/ref.bib b/ref.bib index 753cfda..ec87e1f 100644 --- a/ref.bib +++ b/ref.bib @@ -144,3 +144,54 @@ @article{HILL99 pages = {1364--1396}, year = 1999, } + + +@article{RSA, + title={A method for obtaining digital signatures and public-key cryptosystems}, + author={Rivest, Ronald L. 
and Shamir, Adi and Adleman, Leonard}, + journal={Communications of the ACM}, + volume={21}, + number={2}, + pages={120--126}, + year={1978}, + publisher={ACM New York, NY, USA} +} + +@article{G77, + title={A new kind of cipher that would take millions of years to break}, + author={Gardner, Martin}, + journal={Scientific American}, + volume={237}, + number={8}, + pages={120--124}, + year={1977} +} + + +@article{M78, + title={Secure communications over insecure channels}, + author={Merkle, Ralph C.}, + journal={Communications of the ACM}, + volume={21}, + number={4}, + pages={294--299}, + year={1978}, + publisher={ACM New York, NY, USA}, + note={See original project report at \url{https://www.ralphmerkle.com/1974/}} +} + +@article{K88, + title={Historical development of the {Chinese} {Remainder Theorem}}, + author={Kangsheng, Shen}, + journal={Archive for History of Exact Sciences}, + pages={285--305}, + year={1988} +} + +@inproceedings{BR93, + title={Random oracles are practical: A paradigm for designing efficient protocols}, + author={Bellare, Mihir and Rogaway, Phillip}, + booktitle={ACM Conference on Computer and Communications Security}, + year={1993} +} +