The RSA cryptosystem

Try a demo of encryption signature

The math and definitions explained

In cryptography we often encrypt and sign messages that contains characters, but asymmetric key cryptosystems (Alice and Bob uses different keys) such as RSA and ElGamal are based on arithmetic operations on integer. And symmetric key cryptosystems (Alice and Bob uses the same key) such as DES and AES are based on bitwise operations on bits (a bit is either equal 0 or 1 and is an abbreviation for binary digit). Therefore, we have to convert the characters into integers or bits before we apply the cryptosystem to the message.

Because a computer can only handle bits, there already exists a method for converting characters into integers or bits, that uses the ASCII (American Standard Code for Information Interchange) table. A part of the ASCII table from Wikipedia is listed in the following table (notice that the binary representation on the Wikipedia site is only seven bits but we use eight bits, i.e. we have prepended an extra 0):

Character Decimal Hexadecimal Binary
(space) 32 20 00100000
! 33 21 00100001
\( \vdots \) \( \vdots \) \( \vdots \) \( \vdots \)
A 65 41 01000001
B 66 42 01000010
\( \vdots \) \( \vdots \) \( \vdots \) \( \vdots \)
a 97 61 01100001
b 98 62 01100010
\( \vdots \) \( \vdots \) \( \vdots \) \( \vdots \)

E.g. if Alice wants to send the message "Hey Bob!" encrypted to Bob with a symmetric key cryptosystem, she first converts it into its integer representation and then into its binary representation:

\( \eqalign{ H &\rightarrow 72 &&\rightarrow 01001000 \\ e &\rightarrow 101 &&\rightarrow 01100101 \\ y &\rightarrow 121 &&\rightarrow 01111001 \\ &\rightarrow 32 &&\rightarrow 00100000 \\ B &\rightarrow 66 &&\rightarrow 01000010 \\ o &\rightarrow 111 &&\rightarrow 01101111 \\ b &\rightarrow 98 &&\rightarrow 01100010 \\ ! &\rightarrow 33 &&\rightarrow 00100001 } \)

I.e. the message "Hey Bob!" is represented with integers as "72 101 121 32 66 111 98 33" and in binary as "01001000 01100101 01111001 00100000 01000010 01101111 01100010 00100001".

The exponent of a number says how many times to multiply the number by it self. Ex:

\( 4^{3} = 4 \cdot 4 \cdot 4 = 64 \)

where 3 is the exponent (or power) and 4 is the base. In words we say 4 to the power 3.

Here are the laws of exponents:

Law Example
\( x^{1} = x \) \( 3^{1} = 3 \)
\( x^{0} = 1 \) \( 4^{0} = 1 \)
\( x^{-1} = \frac{1}{x} \) \( 5^{-1} = \frac{1}{5} \)
\( x^{m} \cdot x^{n} = x^{m+n} \) \( x^{2} \cdot x^{3} = (x \cdot x) \cdot (x \cdot x \cdot x) = x \cdot x \cdot x \cdot x \cdot x = x^{5} = x^{2+3} \)
\( \frac{x^{m}}{x^{n}} = x^{m-n} \) \( \frac{x^{4}}{x^{2}} = \frac{x \cdot x \cdot x \cdot x}{x \cdot x} = x \cdot x \cdot \frac{x \cdot x}{x \cdot x} = x \cdot x \cdot 1 = x^{2} = x^{4-2} \)
\( (x^{m})^{n} = x^{m \cdot n} \) \( (x^{2})^{3} = (x \cdot x) \cdot (x \cdot x) \cdot (x \cdot x) = x \cdot x \cdot x \cdot x \cdot x \cdot x = x^{6} = x^{2 \cdot 3} \)
\( (x \cdot y)^{n} = x^{n} \cdot y^{n} \) \( (x \cdot y)^{2} = (x \cdot y) \cdot (x \cdot y) = x \cdot x \cdot y \cdot y = x^{2} \cdot y^{2} \)
\( (\frac{x}{y})^{n} = \frac{x^{n}}{y^{n}} \) \( (\frac{x}{y})^{3} = (\frac{x}{y}) \cdot (\frac{x}{y}) \cdot (\frac{x}{y}) = \frac{x \cdot x \cdot x}{y \cdot y \cdot y} = \frac{x^{3}}{y^{3}} \)
\( x^{-n} = \frac{1}{x^{n}} \) \( x^{-3} = (x^{-1})^{3} = (\frac{1}{x})^{3} = \frac{1}{x} \cdot \frac{1}{x} \cdot \frac{1}{x} = \frac{1 \cdot 1 \cdot 1}{x \cdot x \cdot x} = \frac{1}{x^{3}} \)

You already use modulo computation when you look at the clock and e.g. needs to figure out what time it's 3 hours after 11 o'clock, which is 2 o'clock. In math we write that as:

\( (11 + 3) \: mod \: 12 = 2 \)

where 12 is the modulus because we want the time as an integer between 0 and 11 (12 o'clock is in this case denoted by 0). In words we say 11 plus 3 modulo 12 is equal 2. The result of a modulo computation is an integer between 0 and the modulus minus 1. E.g. with the modulus 3 we have that:

  • \( 1 \: mod \: 3 = 1 \)
  • \( 2 \: mod \: 3 = 2 \)
  • \( 3 \: mod \: 3 = 0 \)
  • \( 4 \: mod \: 3 = 1 \)
  • \( 5 \: mod \: 3 = 2 \)
  • \( 6 \: mod \: 3 = 0 \)
  • etc.

If we e.g. look at \( 27 \: mod \: 5 \) then modulo computes the number of times 5 divides 27 and then returns the remainder of the result which is 2 in this case, i.e. \( 27 \: mod \: 5 = 2 \). But how did we get this result?

First we compute the number of times it's possible to multiply 5 with the number \( x \) such that we get an integer as close as possible to 27 without exceeding it, i.e. we have to find the maximun value of \( x \) such that \( 5 \cdot x \leq 27 \). In this case we have that \( x = 5 \) because \( 5 \cdot 5 = 25 \leq 27 \). Then by subtracting 27 with 25 we get the answer \( 27 - 25 = 2\).

If the integer is negative e.g. \( -27 \: mod \: 5 \) we have to do it slightly different and the answer is \( -27 \: mod \: 5 = 3 \). In this case the integer \( x \) is negative and should be the closest integer that exceed -27, i.e. we have to find the minimum value of \( -x \) such that \( 5 \cdot -x \geq -27 \). Now we have that \( -x = -6 \) because \( 5 \cdot -6 = -30 \geq -27 \). Then by subtracting -27 with -30 we get the answer \( -27 - (-30) = -27 + 30 = 3\).

It's important that \( x \) or \( -x \) is an integer such as \( -14, 3, 17 \) etc. and NOT a fraction or float such as \( \frac{1}{4}, \frac{-3}{7}, 2.5, 5.1 \) etc.

If two integers \( a \) and \( b \) modulo the same modulus \( c \) returns the same remainder \( r \), then we say that \( a \) and \( b \) are congruent modulo \( c \). I.e. if \( a \: mod \: c = r \) and \( b \: mod \: c = r \) then \( a \equiv b \: (mod \: c) \). Also, notice that if the modulus \( c \) is greater than the integer \( a \), i.e. \( c > a \), the result will always be equal \( a \: mod \: c = a \).

A prime number is an integer greater than 1, which only can be divided evenly by 1 and itself. Divided evenly means that the result must not be a float. E.g. if you try to divide 13 with 3 it returns the float \( \frac{13}{3} = 4.333 \). We have that 13 is a prime number because it can only be divided evenly by 1 and itself, i.e. \( \frac{13}{1} = 13 \) and \( \frac{13}{13} = 1 \).

If an integer is not a prime number it's called a composite number. E.g. the integer 6 is a composite number because it can be divided evenly by 1, 2, 3 and 6:

\( \frac{6}{1} = 6 \), \( \frac{6}{2} = 3 \), \( \frac{6}{3} = 2 \) and \( \frac{6}{6} = 1 \)

The term composite means "something made by combining things". So, a composite number is an integer made by multiplying prime numbers:

  • 2 is a prime
  • 3 is a prime
  • 4 is a composite: \( 2 \cdot 2 = 4 \)
  • 5 is a prime
  • 6 is a composite: \( 2 \cdot 3 = 6 \)
  • 7 is a prime
  • 8 is a composite: \( 2 \cdot 2 \cdot 2 = 8 \)
  • 9 is a composite: \( 3 \cdot 3 = 9 \)
  • etc.

Therefore, an integer is either a prime number or a product of prime numbers (a composite number).

The famous Greek mathematician Euclid proved that there are infinite many prime numbers.

Michael Rabin and Gary Miller have developed an algorithm that decides whether an integer is a prime or composite number by testing the integer with multiple bases denoted by \( a \). The algorithm is called the Rabin-Miller primality test.

Before we describe what the greatest common divisor of two integers are, we first define what we mean by a divisor. In this context is a divisor of an integer \( x \) some integer that divides \( x \) evenly, i.e. the result must not be a float. E.g. if you try to divide \( x=12 \) with 5 it returns the float \( \frac{12}{5} = 2.4 \) and 5 is therefore not a divisor of \( x=12 \). For \( x=12 \) we have the divisors 1, 2, 3, 4, 6 and 12 because \( \frac{12}{1} = 12 \), \( \frac{12}{2} = 6 \), \( \frac{12}{3} = 4 \), \( \frac{12}{4} = 3 \), \( \frac{12}{6} = 2 \) and \( \frac{12}{12} = 1 \).

Likewise the divisors of 16 is 1, 2, 4, 8 and 16 because \( \frac{16}{1} = 16 \), \( \frac{16}{2} = 8 \), \( \frac{16}{4} = 4 \), \( \frac{16}{8} = 2 \) and \( \frac{16}{16}=1 \).

The greatest common divisor of 12 and 16 is therefore 4, because it is the largest integer of the common divisors. In math we write that as \( \gcd(12, 16) = 4 \).

Two integers with greatest common divisor 1 are called relatively prime numbers or co-primes. E.g. 15 and 28 are relatively prime numbers because \( \gcd(15, 28) = 1 \) (notice that 28 is not a prime number).

If one of the two integers is a prime number the greatest common divisor will always be 1, i.e. \( \gcd(a, p) = 1 \) where \( a \) is an integer (either a prime number or a composite number) and \( p \) is a prime number.

One method to compute the greatest common divisor of two integers is by using the Euclidean algorithm developed by the famous Greek mathematician Euclid. See "The extended Euclidean algorithm" for more information about how to compute the greatest common divisor of two integer.

The extended Euclidean algorithm is an extended version of the Euclidean algorithm, which only returns the greatest common divisor of two integers. Given two integers \( a \) and \( b \) the extended Euclidean algorithm returns the integers \( a \), \( b \), \( \lambda \) and \( \mu \) such that:

\( a \cdot \lambda + b \cdot \mu = \gcd(a, b) \)

where \( \lambda \) and \( \mu \) are called the Bézout coefficients for \( a \) and \( b \). Only if \( a \) and \( b \) are relatively prime numbers, i.e. \( \gcd(a, b) = 1 \), then:

\( a \cdot \lambda + b \cdot \mu = 1 \)

and \( \lambda \; mod \; b \) is the inverse of \( a \), denoted \( a^{-1} = \lambda \; mod \; b \), and \( \mu \: mod \: a \) is the inverse of \( b \), denoted \( b^{-1} = \mu \: mod \: a \) (see "Modulo computation" for more information about the \( mod \) operator). One useful property of an integer and its inverse is that \( a \cdot a^{-1} \; mod \; b = 1 \) and \( b \cdot b^{-1} \; mod \; a = 1 \).

You can easily compute \( \gcd(a, b) \), \( \lambda \) and \( \mu \) for e.g. \( a=5 \) and \( b=39 \) with a simple table. So, let us first create a table with 3 columns (we do not yet know how many rows there will be in the table). Let us denote the entry in the first row and first column for [1,1], the entry in the first row and second column for [1,2], the entry in the second row and first column for [2,1] and so on.

Next we write \( b=39 \) in entry [1,1] and \( a=5 \) in entry [2,1]. Then we try to find the biggest integer \( q_{1} \), such that \( q_{1} \cdot a \leq b \). We have that \( q_{1}=7 \), which we write in entry [2,2], because \( 7 \cdot 5 = 35 \leq 39 \) and a remainder of \( r_{1}=4 \), which we write in entry [3,1].

Again we try to find the biggest integer \( q_{2} \), such that \( q_{2} \cdot r_{1} \leq a \). We have that \( q_{2}=1 \), which we write in entry [3,2], because \( 1 \cdot 4 = 4 \leq 5 \) and a remainder of \( r_{2}=1 \) that we write in entry [4,1]. Notice that we just computed the same as before, just with integers in a lower row.

The next computation returns a remainder of \( r_{3} = 0 \) because \( q_{3} \cdot r_{2} = 4 \cdot 1 = 4 \leq 4 = r_{1} \). We have now computed \( \gcd(5, 39)=r_{2}=1 \) since \( r_{3} = 0 \). And because 5 and 39 are relatively prime numbers, we know that \( \lambda \) and \( \mu \) exists and we can then start using the last column.

First we write \( x_{1}=0 \) in entry [4,3] and \( x_{2}=1 \) in entry [3,3]. Then we write \( x_{3}=q_{2} \cdot x_{2} + x_{1} = 1 \cdot 1 + 0 = 1 \) in entry [2,3]. For entry [1,3] we compute the same as before, just with integers from the row above, i.e. \( x_{4}=q_{1} \cdot x_{3} + x_{2} = 7 \cdot 1 + 1 = 8 \).

Finally we have that \( a \cdot x_{4} \pm b \cdot x_{3} = r_{2} \), where we need to decide whether it should be plus or minus between the two terms. Because \( a \cdot x_{4} = 5 \cdot 8 = 40 \), \( b \cdot x_{3} = 39 \cdot 1\) and \( 40 \geq 39 \) we have that \( 5 \cdot 8 - 39 \cdot 1 = 1 \) (which is the same as \( 5 \cdot 8 + 39 \cdot (-1) = 1 \)) and the Bézout coefficients are \( \lambda=8 \) and \( \mu=-1 \). Notice that \( a^{-1} = \lambda \; mod \; b = 8 \; mod \; 39 = 8\) and \( b^{-1} = \mu \; mod \; a = -1 \: mod \: 5 = 4\) where \( a \cdot a^{-1} \; mod \; b = 5 \cdot 8 \; mod \; 39 = 1 \) and \( b \cdot b^{-1} \; mod \; a = 39 \cdot 4 \; mod \; 5 = 1 \).

The table for computing \( 5 \cdot \lambda + 39 \cdot \mu = \gcd(5, 39) \) is:

\( b=39 \) \( x_{4}=8 \)
\( a=5 \) \( q_{1}=7 \) \( x_{3}=1 \)
\( r_{1}=4 \) \( q_{2}=1 \) \( x_{2}=1 \)
\( r_{2}=1 \) \( q_{3}=4 \) \( x_{1}=0 \)
\( r_{3}=0 \)

The set of integers \( \{ \dots, -2, -1, 0, 1, 2, \dots \} \) is denoted by the symbol \( \mathbb{Z} \), i.e. \( \mathbb{Z} = \{ \dots, -2, -1, 0, 1, 2, \dots \} \), and is called a ring of integers or a group. The group of integers modulo \( n \) is a subset of \( \mathbb{Z} \) and is denoted by the symbol \( \mathbb{Z}/n\mathbb{Z} \), but we use the shorthand version \( \mathbb{Z}_{n} \). This subset contains the following elements (because we compute modulo \( n \)):

\( \mathbb{Z}_{n} = \{ 0, 1, 2, \dots, n - 1 \}\)

Notice that whenever we make an addition or multiplication in \( \mathbb{Z}_{n} \), we always compute the addition or multiplication modulo \( n \) to obtain an integer in \( \mathbb{Z}_{n} \). To illustrate this we look at \( n = 5 \):

\( \mathbb{Z}_{5} = \{ 0, 1, 2, 3, 4 \}\)

When adding \( 3 \) with \( 4 \) in \( \mathbb{Z}_{5} \) we do the following: \( (3 + 4) \: mod \: 5 = 7 \: mod \: 5 = 2 \). Likewise, when multiplying \( 3 \) with \( 4 \) in \( \mathbb{Z}_{5} \) we do the following: \( (3 \cdot 4) \: mod \: 5 = 12 \: mod \: 5 = 2 \).

An integer \( a \) in \( \mathbb{Z}_{n} \) has an inverse if the greatest common divisor of \( a \) and \( n \) is 1, i.e. \( \gcd(a, n) = 1 \) (see "The extended Euclidean algorithm" for more information). The integer \( a \) is then called an unit. The set of units (all integers with an inverse in \( \mathbb{Z}_{n} \)) is denoted by the symbol \( (\mathbb{Z}/n\mathbb{Z})^{*} \). Again, we will use the shorthand version \( \mathbb{Z}_{n}^{*} \). If \( a_{1} \) and \( a_{2} \) are units then is the multiplication \( (a_{1} \cdot a_{2}) \: mod \: n \) also always an unit (i.e. \( a_{1} \cdot a_{2} \) is in the group \( \mathbb{Z}_{n}^{*} \)), but the addition \( (a_{1} + a_{2}) \: mod \: n \) may NOT be an unit (i.e. \( a_{1} + a_{2} \) is in the group \( \mathbb{Z}_{n} \) but may not be in the group \( \mathbb{Z}_{n}^{*} \)). We see the difference in the two sets \( \mathbb{Z}_{8} \) and \( \mathbb{Z}_{8}^{*} \):

  • \( \mathbb{Z}_{8} = \{ 0, 1, 2, 3, 4, 5, 6, 7 \}\)
  • \( \mathbb{Z}_{8}^{*} = \{ 1, 3, 5, 7 \} \)

If we chose \( n \) to be a prime number \( p \) then for all integers \( a \) excluding 0 in \( \mathbb{Z}_{p} \) we have that \( \gcd(a, p) = 1 \), which result in that \( \mathbb{Z}_{p}^{*} \) contains all integers from \( \mathbb{Z}_{p} \) excluding 0, i.e.:

\( \mathbb{Z}_{p}^{*} = \{ 1, 2, 3, \dots, p - 1 \}\)

E.g. for \( p=7 \) we see that the only difference in the two sets \( \mathbb{Z}_{7} \) and \( \mathbb{Z}_{7}^{*} \) is the integer 0:

  • \( \mathbb{Z}_{7} = \{ 0, 1, 2, 3, 4, 5, 6 \}\)
  • \( \mathbb{Z}_{7}^{*} = \{ 1, 2, 3, 4, 5, 6 \} \)

The number of elements in \( \mathbb{Z}_{n}^{*} \) is denoted by the symbol \( \phi(n) \) after a famous Swiss mathematician called Euler and is called Euler's phi function. As we saw previously, then if \( n \) is a prime number \( p \) then \( \phi(p) = p-1 \). The number of elements in a group \( G \) is also called the order of \( G \) and is written as \( \left| G \right| \). The order of:

  • \( \mathbb{Z}_{n} \) is \( \left| \mathbb{Z}_{n} \right| = n \) where \( n \) is a composite number
  • \( \mathbb{Z}_{n}^{*} \) is \( \left| \mathbb{Z}_{n}^{*} \right| = \phi(n) \) where \( n \) is a composite number
  • \( \mathbb{Z}_{n}^{*} \) is \( \left| \mathbb{Z}_{n}^{*} \right| = \phi(n) = (p-1) \cdot (q-1) \) where \( n = p \cdot q \) and \( p \) and \( q \) are prime numbers
  • \( \mathbb{Z}_{p}^{*} \) is \( \left| \mathbb{Z}_{p}^{*} \right| = \phi(p) = p-1 \) where \( p \) is a prime number

If we e.g. chose the elements \( x \), \( a \) and \( b \) from the group \( \mathbb{Z}_{p}^{*} \) and want to compute \( x^{a + b} \: mod \: p \), then in the exponent we actually compute \( a + b \) modulo the order of the group, i.e. \( a + b \: mod \: (p-1) \) because \( \left| \mathbb{Z}_{p}^{*} \right| = \phi(p) = p-1 \). So, what we actually compute is \( x^{a + b \: mod \: (p-1)} \: mod \: p \). The same is true if we had chosen one of the other groups; We always compute modulo the order of the group in the exponent. Therefore, next time you look at an equation from a cryptosystem and wonder why they suddenly compute e.g. modulo \( p-1 \) instead of modulo \( p \), then it's because the equation is used in the exponent of some integer.

Let \( n \) be the product of two large prime numbers \( p \) and \( q \), i.e. \( n = p \cdot q \). Because \( n \) is a product of relatively prime numbers, i.e. \( \gcd(p, q)=1 \), we can use the Chinese Remainder Theorem for the equation \( X \: mod \: n \) where \( X \) is some expression.

The Chinese Remainder Theorem says that solving \( X \: mod \: n \) is the same as solving \( X \: mod \: p \) and \( X \: mod \: q \) (which may be easier). And hence, if we can find a solution for \( X \: mod \: p \) and \( X \: mod \: q \) then we also have found a solution for \( X \: mod \: n \).

Let \( n \) be the product of two large prime numbers \( p \) and \( q \), i.e. \( n = p \cdot q \). Then, if you know the value of \( n \) it's hard to compute \( p \) and \( q \) from \( n \).

The problem of computing \( p \) and \( q \) is called the prime factorization problem.

Security of an asymmetric key (public-key) cryptosystem such as RSA and ElGamal is measured with respect to a chosen plaintext attack (CPA) and a chosen ciphertext attack (CCA).

In a chosen plaintext attack (sometimes called a semantic attack) is Alice and Bob's adversary Eve passive, i.e. she only observe the sent ciphertexts between Alice and Bob and tries to guess the plaintexts. We say that an asymmetric cryptosystem is CPA-secure if Eve's advantage in guessing correct in the following game is negligible:

Eve sends a message \( m \) to an oracle and it returns a ciphertext \( c \) where \( c \) is either an encryption of \( m \) or a random message with the same length as \( m \). Eve then tries to guess what \( c \) is an encryption of.

Hence, if a asymmetric cryptosystem is CPA-secure, the only information that a ciphertext leaks is the length of the encrypted message.

In a chosen ciphertext attack is Eve active and it's therefore a stronger attack compared to a chosen plaintext attack. Now Eve can get a ciphertext decrypted by the oracle and we say that an asymmetric cryptosystem is CCA-secure if Eve's advantage in guessing correct in the following game is negligible:

Eve sends a ciphertext \( c_{i} \) to an oracle and it returns the decrypted message \( m_{i} \) of \( c_{i} \). This is repeated as many times as Eve wants. Then Eve sends a message \( m \) to the oracle and it returns a ciphertext \( c \) where \( c \) is either an encryption of \( m \) or a random message with the same length as \( m \). Finally Eve sends a ciphertext \( c_{i}' \), that has to be different from \( c \), to the oracle and it returns the decrypted message \( m_{i}' \) of \( c_{i}' \). This is repeated as many times as Eve wants. Eve then tries to guess what \( c \) is an encryption of.

Again, if a asymmetric cryptosystem is CCA-secure, the only information that a ciphertext leaks is the length of the encrypted message.

Security of a digital signature such as RSA and ElGamal is measured with respect to a chosen plaintext attack (CPA).

In a chosen plaintext attack is Alice and Bob's adversary Eve passive, i.e. she only observe the sent messages and signatures between Alice and Bob and tries to forge a signature. We say that a digital signature is CPA-secure if Eve's advantage in forging a signature in the following game is negligible:

Eve sends a message \( m_{i} \) to an oracle and it returns its signautre \( \sigma_{i} \). This is repeated as many times as Eve wants. Finally Eve outputs a message \( m \), which has to be different from the messages \( m_{i} \), and its signature \( \sigma \). Eve wins the game if \( \sigma \) is a valid signature of the message \( m \).

The encryption scheme explained

The RSA cryptosystem is named after its inventors Ron Riverst, Adi Shamir and Leonard Adleman who first described the algorithm in 1977. RSA is a public key cryptosystem based on the prime factorization problem, i.e. every person has a key pair \( (sk, pk) \), where \( sk \) is the secret key and \( pk \) is the public key, and given only the public key one has to find the prime factors (solve the prime factorization problem) to get the secret key.

RSA is both an encryption scheme (this section) which helps Alice and Bob with the problem of exchanging sensitive information over an insecure channel eavesdropped by their adversary Eve and a digital signature scheme (the next section) which helps them with creating digital signatures.

In the encryption scheme Alice first chooses two large secret prime numbers \( p \) and \( q \) and computes the product \( n = p \cdot q \). She then chooses a public encryption exponent \( e \) such that \( 3 \leq e \leq n-1 \) and \( \gcd(e, \phi(n)) = 1 \) where \( \phi(n) = (p-1) \cdot (q-1) \) is called Euler's phi function. The reason why the public encryption exponent \( e \) has to be relative prime to \( \phi(n) \) (i.e. \( \gcd(e, \phi(n)) = 1 \)) is because Alice needs the corresponding secret decryption exponent \( d \), which is the inverse of \( e \). And in the case of the RSA cryptosystem \( e \) only has an inverse when \( \gcd(e, \phi(n)) = 1 \).

Alice then publish the public key \( pk = (n, e) \) and stores the secret key \( sk = (n, d) \). Notice that both Alice's friend Bob and their adversary Eve know the public key, but both have to compute the prime factors \( p \) and \( q \) of \( n \) (solving the prime factorization problem) to compute the secret decryption exponent \( d \) because \( d \) is uniquely determined by \( e \), \( p \) and \( q \) (this is shown later on).

Suppose that Bob has a secret message he wants to send to Alice where he would like to hide the content of the message for Eve. First Bob converts the message into an integer \( m \) such that \( 1 \leq m \leq n-1 \) where \( m \) is called the plaintext. He then encrypts \( m \) using the RSA encryption algorithm \( E \) with the public key \( pk = (n, e) \) by \( c = E_{pk}(m) = m^{e} \: mod \: n \) where \( c \) is called the ciphertext, which he sends to Alice.

For Alice's secret key \( sk = (n, d) \) we haven't discussed how the secret decryption exponent \( d \) is computed: Remember that Alice chooses the two prime numbers \( p \) and \( q \), i.e. she can compute Euler's phi function \( \phi(n) = (p-1) \cdot (q-1) \). She then uses the extended Euclidean algorithm to compute \( d \). The extended Euclidean algorithm gives her the equation \( e \cdot \lambda + \phi(n) \cdot \mu = \gcd(e, \phi(n)) \) where \( d = \lambda \; mod \; \phi(n) \) is called the inverse of \( e \) because \( e \cdot d \: mod \: \phi(n) = 1 \) (remember that \( d \) only exists because \( \gcd(e, \phi(n)) = 1 \). And hence, \( d \) is uniquely determined by \( e \), \( p \) and \( q \)).

Now it's easy for Alice to decrypt the received ciphertext \( c \) with the RSA decryption algorithm \( D \) and the secret key \( sk = (n, d) \) by \( m' = D_{sk}(c) = c^{d} \: mod \: n \) where \( m' \) is equal the plaintext \( m \), which we will prove in the following: As described above is \( e \cdot d \: mod \: \phi(n) = 1 \) because \( d \) is the inverse of \( e \). This is the same as \( e \cdot d = 1+k \cdot \phi(n) \) for some integer \( k \geq 1 \), and hence:

\( \eqalign{ m' &= c^{d} \: mod \: n &&(c = m^{e}) \\ &= (m^{e})^{d} \: mod \: n &&(\mbox{exponent rule}) \\ &= m^{e \cdot d} \: mod \: n &&(e \cdot d \: mod \: \phi(n) = 1+k \cdot \phi(n))\\&= m^{1+k \cdot \phi(n)} \: mod \: n &&(\mbox{see below})\\ &= m } \)

The step \( m^{1 + k \cdot \phi(n)} \: mod \: n = m \: mod \: n \) maybe needs an explanation: First notice that instead of solving modulo \( n \) we can solve modulo \( p \) and modulo \( q \) according to the Chinese Remainder Theorem because \( n = p \cdot q \) (\( n \) is a product of prime numbers) and \( \gcd(p, q) = 1 \) (the factors \( p \) and \( q \) are relative prime to each other). I.e. we have to solve the two equations \( m^{1+k \cdot \phi(n)} \: mod \: p = m \: mod \: p \) and \( m^{1+k \cdot \phi(n)} \: mod \: q = m \: mod \: q \) where the proof for both cases are the same. Therefore, we only needs to prove that \( m^{1+k \cdot \phi(n)} \: mod \: p = m \: mod \: p \). We have two cases: either \( m \: mod \: p = 0 \) (\( p \) divides \( m \)) or \( m \: mod \: p \neq 0 \). If \( m \: mod \: p = 0 \) then:

\( \eqalign{ m^{1+k \cdot \phi(n)} \: mod \: p &= m \: mod \: p &&(m \: mod \: p = 0) \\ 0^{1+k \cdot \phi(n)} \: mod \: p &= 0 \: mod \: p \\ 0 \: mod \: p &= 0 \: mod \: p } \)

Otherwise if \( m \: mod \: p \neq 0 \) then \( \gcd(m, p) = 1 \) and Euler's Theorem tells us that \( m^{\phi(p)} \: mod \: p = 1 \) for all \( 1 \leq m \leq p-1 \) where \( \phi(p) = p-1 \):

\( \eqalign{ m^{1+k \cdot \phi(n)} \: mod \: p &= m \: mod \: p &&(\phi(n) = (p-1) \cdot (q-1) \\ m^{1+k \cdot (p-1) \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\phi(p) = (p-1) \\ m^{1+k \cdot \phi(p) \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\mbox{exponent rule}) \\ m^{1} \cdot m^{k \cdot \phi(p) \cdot (q-1)} \: mod \: p &= m \: mod \: p \\ m^{1} \cdot m^{\phi(p) \cdot k \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\mbox{exponent rule}) \\ m^{1} \cdot (m^{\phi(p)})^{k \cdot (q-1)} \: mod \: p &= m \: mod \: p &&(\mbox{Euler's Theorem}) \\ m^{1} \cdot 1^{k \cdot (q-1)} \: mod \: p &= m \: mod \: p \\ m \: mod \: p &= m \: mod \: p \\ } \)

The encryption algorithm \( E \) is most efficient when the public encryption exponent \( e \) is small and similarly the decryption algorithm \( D \) is most efficient when the secret decryption exponent \( d \) is small. But because \( d \) is determined by the value of \( e \) Alice can't choose both of them to be small and \( e \) also has to be relatively prime to \( \phi(n) \) so it has to be strictly bigger than \( 2 \), i.e. \( e \geq 3 \). Therefore, the smallest possible value is \( e = 3 \). However, if Alice wants to send the same message \( m \) encrypted to, say 3 different people with each their public keys \( pk_{1} = (e, n_{1}) \), \( pk_{2} = (e, n_{2}) \) and \( pk_{3} = (e, n_{3}) \) where \( e = 3 \), Eve can easily compute the message \( m \) from the 3 different ciphertexts \( c_{1} = m^e \: mod \: n_{1} \), \( c_{2} = m^e \: mod \: n_{2} \) and \( c_{3} = m^e \: mod \: n_{3} \) by using the Chinese Remainder Theorem: Eve first computes a number \( t \) such that \( t = m^{3} \: mod \: (n_{1} \cdot n_{2} \cdot n_{3}) \) and then the original message \( m \) by \( m = \sqrt[3]{t} \). Therefore, to be secure against this type of attack and still have a fast encryption the value \( e = 2^{16} + 1 = 65537 \) is often used as the encryption exponent, which is a prime number.

The RSA cryptosystem is also called a one-way trapdoor function because it is easy to compute the ciphertext \( c \) from the plaintext \( m \) and hard vice versa. However, with special information (the "trapdoor" information), which in this case is information about the two prime numbers \( p \) and \( q \), it's easy to compute the plaintext \( m \) from the ciphertext \( c \).

Since the above "textbook" form of RSA is a deterministic cryptosystem, i.e. if we encrypt the same message twice it always returns the same ciphertext because no randomness is used, it's neither CPA-secure or CCA-secure because Eve could first encrypt every message by herself using the public key and then compare them to the observed ciphertext. However, it's possible to use the "textbook" form of RSA to build a CPA- and CCA-secure version. The CCA-secure version is called OAEP (Optimal Asymmetric Encryption Padding) and in the following we will describe the CPA-secure version:

Alice first generates a key pair \( (sk, pk) \) as in the "textbook" version and sends the public key \( pk = (n,e) \) to Bob and stores the secret key \( sk = (n,d) \). One of the main difference between this version and the "textbook" version is that Bob only can send one bit encrypted to Alice, i.e. the message is either \( b = 0 \) or \( b = 1 \). Bob encrypts the bit by first choosing a random integer \( m_{b} \) between \( 1 \) and \( n-1 \) such that the least significant bit of \( m_{b} \) is equal to \( b \), i.e. \( lsb(m_{b}) = b \). He then sends the ciphertext \( c = E_{pk}(m_{b}) \) to Alice, who decrypts it by \( m_{b} = D_{sk}(c) \) and computes \( b = lsb(m_{b}) \).

This CPA-secure version is of course not very efficient because we can only encrypt one bit. However in practice is RSA often used to encrypt keys for symmetric cryptosystem such as DES and AES where the key is usually much shorter than \( n \). We can then locate the key in the least significant end of the message that we want to send encrypted, but as described are the keys much shorter so we pad with random bits until we get the right length. In the above CPA-secure version was the key only one bit, however it has been proven that it's still secure with keys of size up to \( \log_2(n) \)-bit.

RSA is also a partially homomorphic cryptosystem because:

\( \eqalign{ E_{pk}(m_{1}) \cdot E_{pk}(m_{2}) \: mod \: n &= m_{1}^{e} \cdot m_{2}^{e} \: mod \: n &&(\mbox{exponent rule}) \\ &= (m_{1} \cdot m_{2})^{e} \: mod \: n \\ &= E_{pk}(m_{1} \cdot m_{2} \: mod \: n) } \)

i.e. multiplying two ciphertexts is the same as multiplying the two corresponding plaintexts. RSA is partially homomorphic and not fully homomorphic because it's only multiplication that have this property (and not addition).

This property is both an advantage and a disadvantage of the cryptosystem: It's an advantage when e.g. Alice have some private data \( m_{1} \) she wants a cloud service to make some computations on. First she computes \( c_{1} = E_{pk}(m_{1}) \) and then sends the ciphertext \( c_{1} \) to the could service. The cloud service then make some computations on Alice's data by \( c_{1} \cdot c_{2} \: mod \: n \) for some \( c_{2} = E_{pk}(m_{2}) \). Alice then receives and decrypts her ciphertext which returns the data \(m_{1} \cdot m_{2} \: mod \: n \). On the other hand it's a disadvantage when e.g. Eve intercept a ciphertext \( c_{1} \) sent from Alice to Bob and computes \( c_{1} \cdot c_{2} \: mod \: n \) for some \( c_{2} =E_{pk}(m_{2}) \) which she sends to Bob. When Bob decrypts the ciphertext he get the plaintext \( m_{1}\cdot m_{2} \: mod \: n \) instead of the plaintext \( m_{1} \).

The protocol
Key creation
Alice:
Chooses the secret prime numbers \( p \) and \( q \).
Computes \( n = p \cdot q \) and \( \phi(n) = (p-1) \cdot (q-1) \).
Chooses the public encryption exponent \( 3 \leq e \leq n-1 \) such that \( \gcd(e, \phi(n)) = 1 \).
Alice sends the public key \( pk = (n, e) \) to Bob.
Encryption
Bob:
Uses Alice's public key \( pk = (n, e) \) to compute the ciphertext \( c = E_{pk}(m) = m^{e} \: mod \: n \) where \( m \) is the plaintext and \( 1 \leq m \leq n-1 \)
Bob sends the ciphertext \( c \) to Alice.
Decryption
Alice:
Computes the secret decryption exponent \( d \) with the extended Euclidean algorithm and set the secret key as \( sk = (n, d) \).
Computes the plaintext \( m' = D_{sk}(c) = c^{d} \: mod \: n \) where \( m' = m \).

Try a demo of the encryption scheme here.

Example

Alice chooses two secret prime numbers \( p = 277 \) and \( q = 317 \) and computes \( n = p \cdot q = 277 \cdot 317 = 87809 \). She then chooses the public encryption exponent \( e = 19907 \) such that \( \gcd(e, \phi(n)) = 1 \) where \( \phi(n) = (p-1) \cdot (q-1) = (277-1) \cdot (317-1) = 87216 \). Finally she publish the public key \( pk = (n, e) = (87809, 19907) \).

Bob, a friend of Alice, find Alice's public key because he wants to send a secret message to Alice. First he converts the message into the integer \( m = 12345 \) satisfying \( 1 \leq m \leq n-1 \). He then uses Alice's public key to compute the ciphertext \( c = E_{pk}(m) = m^{e} \: mod \: n = 12345^{19907} \: mod \: 87809 = 66975 \) which he sends to Alice.

Before Alice can decrypt the received ciphertext, she needs the secret decryption exponent \( d \) which is a part of the secret key \( sk = (n, d) \). She uses the extended Euclidean algorithm that gives her the equation \( \gcd(e, \phi(n)) = e \cdot \lambda + \phi(n) \cdot \mu = 19907 \cdot 18011 + 87216 \cdot (-4111) = 1 \) where the secret decryption exponent is \( d = \lambda \; mod \; \phi(n) = 18011 \; mod \; 87216 = 18011 \). Alice then uses the secret decryption exponent \( d = 18011 \) to compute the plaintext \( m' = D_{sk}(c) = c^{d} \: mod \: n = 66975^{18011} \: mod \: 87809 = 12345 \).

The signature scheme explained

Two person, Samantha and Victor, uses a digital signature scheme when one of them, say Samantha, needs to send a digital signed piece of data, a document or message, to Victor and it's important that Victor know it's from Samantha. The case could e.g. be that Victor has installed a program developed by Samantha's software company and the company then publish a new update. Before Victor installs the new update he wants to make sure that the update is from Samantha's software company and not the evil hacker Eve who tries to install virus on peoples computers disguised as updates from Samantha's software company.

One such digital signature scheme is the RSA signature scheme which is more or less the same as the RSA encryption scheme. Therefore, for correctness of the digital signature scheme we refer to the description of the encryption scheme in the previous section.

In the signature scheme Samantha first chooses two large secret prime numbers \( p \) and \( q \) and computes the product \( n = p \cdot q \). She then chooses the public verification exponent \( v \) such that \( 3 \leq v \leq n-1 \) and \( \gcd(v, \phi(n)) = 1 \) where \( \phi(n) = (p-1) \cdot (q-1) \) is called Euler's phi function. Next she uses the extended Euclidean algorithm to compute the secret signing exponent \( s \). The extended Euclidean algorithm gives her the equation \( v \cdot \lambda + \phi(n) \cdot \mu = \gcd(v, \phi(n)) \) where \( s = \lambda \; mod \; \phi(n) \) is the inverse of \( v \) because \( v \cdot s \: mod \: \phi(n) = 1 \) (remember that in the RSA case \( s \) only exists because \( \gcd(v, \phi(n)) = 1 \)). Samantha then publish the public key \( pk = (n, v) \) and stores the secret key \( sk = (n, s) \).

As you may have noticed corresponds the public verification exponent \( v \) to the public encryption exponent \( e \) in the encryption scheme, and likewise the secret signing exponent \( s \) to the secret decryption exponent \( d \).

To sign a message \( m \) Samantha first computes the fingerprint \( \mathcal{H}(m) \) of the message with a public known hash function \( \mathcal{H} \) such that \( 1 \leq \mathcal{H}(m) \leq n-1 \). One property of the hash function is that given the same input, the hash function always return the same output. Another property is that the length (or size) of the input can be arbitrary long, but the output will have a fixed length. So, no matter how large the message Samantha wants to sign, the signing algorithm will be equally efficient because Samantha signs the fingerprint \( \mathcal{H}(m) \) instead of the message \( m \).

After Samantha has computed the fingerprint \( \mathcal{H}(m) \) of the message \( m \) she signs the fingerprint by using the RSA signing algorithm \( S \) with the secret signing key \( sk = (n, s) \) by \( \sigma = S_{sk}(\mathcal{H}(m)) = \mathcal{H}(m)^{s} \: mod \: n \) where \( \sigma \) is called the signature of \( \mathcal{H}(m) \). She then sends the message \( m \) and the signature \( \sigma \) to Victor.

It's important that Samantha signs the fingerprint instead of the message because otherwise could Eve easily forge a signature by using the partially homomorphic property of the RSA cryptosystem (see the previous section about the homomorphic property of RSA): Assume that Samantha has computed the signatures \( \sigma_{1} = S_{sk}(m_{1}) \) and \( \sigma_{2} = S_{sk}(m_{2}) \) (notice that the signatures are not of the fingerprints). Now if Eve intercepts the two messages \( m_{1} \) and \( m_{2} \) and their signatures \( \sigma_{1} \) and \(\sigma_{2} \) (when Samantha tries to send them to Victor) she can easily compute a valid signature of the message \( m_{1} \cdot m_{2} \: mod \: n \) by computing \( \sigma_{1} \cdot \sigma_{2} \: mod \: n \) because:

\( \eqalign{ S_{sk}(m_{1}) \cdot S_{sk}(m_{2}) &= m_{1}^{s} \cdot m_{2}^{s} \: mod \: n &&(\mbox{exponent rule}) \\ &= (m_{1} \cdot m_{2})^{s} \: mod \: n \\ &= S_{sk}(m_{1} \cdot m_{2} \: mod \: n) } \)

However, if Samantha computes the signatures of the fingerprints instead, i.e. \( \sigma_{1} = S_{sk}(\mathcal{H}(m_{1})) \) and \( \sigma_{2} =S_{sk}(\mathcal{H}(m_{2})) \), the signature \( \sigma_{1} \cdot \sigma_{2} \: mod \: n \) is no longer a valid signature of \( m_{1} \cdot m_{2} \: mod \: n \) because \( \mathcal{H}(m_{1}) \cdot \mathcal{H}(m_{2}) \neq \mathcal{H}(m_{1} \cdot m_{2}) \) (this is a slightly abuse of notation because it comes from the fact that the hash function is collision resistant):

\( \eqalign{ S_{sk}(\mathcal{H}(m_{1})) \cdot S_{sk}(\mathcal{H}(m_{2})) &= \mathcal{H}(m_{1})^{s} \cdot \mathcal{H}(m_{2})^{s} \: mod \: n &&(\mbox{exponent rule}) \\ &= (\mathcal{H}(m_{1}) \cdot \mathcal{H}(m_{2}))^{s} \: mod \: n \\ &\neq (\mathcal{H}(m_{1} \cdot m_{2}))^{s} \: mod \: n \\ &= S_{sk}(\mathcal{H}(m_{1} \cdot m_{2} \: mod \: n)) } \)

Another possible attack by Eve when Samantha signs messages instead of fingerprints is to choose a random signature \( \sigma \) and then compute the message by \( \sigma^{v} \: mod \: n = m \) (remember that the public key is \( pk = (n, v) \)). Now \( \sigma \) is a valid signature of \( m \), i.e. Eve has forged a signature of Samantha. But if a secure hash function \( \mathcal{H} \) is used then \( \sigma^{v} \: mod \: n = \mathcal{H}(m) \) and it's hard for Eve to compute \( m \) from \( \mathcal{H}(m) \), i.e. to compute the preimage.

Before Victor can verify the signature \( \sigma \) of the message \( m \) he uses the same hash function \( \mathcal{H} \) as Samantha to compute the fingerprint \( \mathcal{H}(m) \) of the message \( m \). Victor then verifies the signature by computing \( \sigma^{v} \: mod \: n \) and checking that it's equal to \( \mathcal{H}(m) \). Only if \( \sigma^{v} \: mod \: n = \mathcal{H}(m) \) is the signature valid.

The protocol
Key creation
Samantha:
Chooses the secret prime numbers \( p \) and \( q \).
Computes \( n = p \cdot q \) and \( \phi(n) = (p-1) \cdot (q-1) \).
Chooses the public verification exponent \( 3 \leq v \leq n-1 \) such that \( \gcd(v, \phi(n)) = 1 \).
Samantha sends the public key \( pk = (n, v) \) to Victor.
Signing
Samantha:
Computes the fingerprint \( \mathcal{H}(m) \) of the message \( m \) such that \( 1 \leq \mathcal{H}(m) \leq n-1 \).
Computes the secret signing exponent \( s \) with the extended Euclidean algorithm.
Signs the fingerprint \( \mathcal{H}(m) \) by computing \( \sigma = S_{sk}(\mathcal{H}(m)) = \mathcal{H}(m)^{s} \: mod \: n \).
Samantha sends the signature \( \sigma \) and the message \( m \) to Victor.
Verification
Victor:
Computes the fingerprint \( \mathcal{H}(m) \) of the message \( m \).
Uses Samantha's public key \( pk = (n, v) \) to verify the signature \( \sigma \) of the fingerprint \( \mathcal{H}(m) \) by checking that \( \sigma^{v} \: mod \: n = \mathcal{H}(m) \).

Try a demo of the signature scheme here.

Example

Samantha chooses two secret prime numbers \( p = 191 \) and \( q = 373 \) and computes \( n = p \cdot q = 191 \cdot 373 = 71243 \). She then chooses the public verification exponent \( v = 28261 \) such that \( \gcd(v, \phi(n)) = 1 \) where \( \phi(n) = (p-1) \cdot (q-1) = (191-1) \cdot (373-1) = 70680 \). Finally she publish the public key \( pk = (n, v) = (71243, 28261) \).

Before Samantha signs the message \( m = \mbox{"This update is authorized by Samantha"} \) she computes the fingerprint \( \mathcal{H}(m) = 14556 \) of the message with a public known hash function \( \mathcal{H} \) such that \( 1 \leq \mathcal{H}(m) \leq n-1 \). She also computes the secret signing exponent \( s \) with the extended Euclidean algorithm. The extended Euclidean algorithm gives her the equation \( \gcd(v, \phi(n)) = v \cdot \lambda + \phi(n) \cdot \mu = 28261 \cdot 15421 + 70680 \cdot (-6166) = 1 \) where the secret signing exponent is \( s = \lambda \; mod \; \phi(n) = 15421 \; mod \; 70680 = 15421 \).

Samantha then uses the secret key \( sk = (n, s) = (71243, 15421) \) to compute the signature \( \sigma = S_{sk}(\mathcal{H}(m)) = \mathcal{H}(m)^{s} \: mod \: n = 14556^{15421} \: mod \: 71243 = 41455 \) of the fingerprint which she sends to Victor together with the message \( m \).

Victor uses the same hash function \( \mathcal{H} \) as Samantha to compute the fingerprint \( \mathcal{H}(m) = 14556 \) of the message. He then uses Samantha's public key to verify that \( \mathcal{H}(m) = 14556 \) is equal to \( \sigma^{v} \: mod \: n = 41455^{28261} \: mod \: 71243 = 14556 \).