Elliptic Curve Cryptography: finite fields and discrete logarithms

Oo璀璨星海oO


Reposted from: https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/

This post is the second in the series ECC: a gentle introduction.

In the previous post, we have seen how elliptic curves over the real numbers can be used to define a group. Specifically, we have defined a rule for point addition: given three aligned points, their sum is zero ($P + Q + R = 0$). We have derived a geometric method and an algebraic method for computing point additions.

We then introduced scalar multiplication ($nP = P + P + \cdots + P$) and we found out an "easy" algorithm for computing scalar multiplication: double and add.

Now we will restrict our elliptic curves to finite fields, rather than the set of real numbers, and see how things change.

The field of integers modulo p

A finite field is, first of all, a set with a finite number of elements. An example of a finite field is the set of integers modulo $p$, where $p$ is a prime number. It is generally denoted as $\mathbb{Z}/p$, $GF(p)$ or $\mathbb{F}_p$. We will use the latter notation.

In fields we have two binary operations: addition (+) and multiplication (·). Both are closed, associative and commutative. For both operations, there exists a unique identity element, and every element has a unique inverse (for multiplication, every element except 0). Finally, multiplication is distributive over addition: $x \cdot (y + z) = x \cdot y + x \cdot z$.

The set of integers modulo $p$ consists of all the integers from 0 to $p - 1$. Addition and multiplication work as in modular arithmetic (also known as "clock arithmetic"). Here are a few examples of operations in $\mathbb{F}_{23}$:

Addition: $(18 + 9) \bmod 23 = 4$
Subtraction: $(7 - 14) \bmod 23 = 16$
Multiplication: $4 \cdot 7 \bmod 23 = 5$
Additive inverse: $-5 \bmod 23 = 18$

Indeed: $(5 + (-5)) \bmod 23 = (5 + 18) \bmod 23 = 0$
Multiplicative inverse: $9^{-1} \bmod 23 = 18$

Indeed: $9 \cdot 9^{-1} \bmod 23 = 9 \cdot 18 \bmod 23 = 1$
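If you want to double-check these examples by machine, here is a minimal Python sketch. It assumes Python 3.8 or later, where the built-in pow accepts a negative exponent together with a modulus; the variable names are only illustrative.

```python
p = 23

total = (18 + 9) % p        # addition
difference = (7 - 14) % p   # subtraction; with a positive modulus, % never returns a negative number
product = (4 * 7) % p       # multiplication
negative = -5 % p           # additive inverse of 5
inverse = pow(9, -1, p)     # multiplicative inverse of 9 (Python 3.8+)

print(total, difference, product, negative, inverse)  # 4 16 5 18 18
```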

If these equations don't look familiar to you and you need a primer on modular arithmetic, check out Khan Academy.

As we already said, the integers modulo $p$ are a field, and therefore all the properties listed above hold. Note that the requirement for $p$ to be prime is important! The set of integers modulo 4 is not a field: 2 has no multiplicative inverse (i.e. the equation $2 \cdot x \bmod 4 = 1$ has no solutions).
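A brute-force check makes the difference between a prime and a composite modulus visible. The helper below (units is a name chosen here, not something from the original post) simply tries every candidate inverse, so it is only meant for tiny moduli.

```python
def units(n):
    """Return the elements of {0, ..., n-1} that have a multiplicative inverse modulo n."""
    return [x for x in range(n)
            if any((x * y) % n == 1 for y in range(n))]


print(units(4))  # [1, 3]: 2 is missing, so the integers modulo 4 are not a field
print(units(5))  # [1, 2, 3, 4]: every non-zero element is invertible
```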

Division modulo p

We will soon define elliptic curves over $\mathbb{F}_p$, but before doing so we need a clear idea of what $x/y$ means in $\mathbb{F}_p$. Simply put: $x/y = x \cdot y^{-1}$, or, in plain words, $x$ over $y$ is equal to $x$ times the multiplicative inverse of $y$. This fact is not surprising, but gives us a basic method to perform division: find the multiplicative inverse of a number and then perform a single multiplication.

Computing the multiplicative inverse can be "easily" done with the extended Euclidean algorithm, which is $O(\log p)$ (or $O(k)$ if we consider the bit length) in the worst case.

We won't go into the details of the extended Euclidean algorithm, as it is off-topic, but here's a working Python implementation:

def extended_euclidean_algorithm(a, b):
    """
    Returns a three-tuple (gcd, x, y) such that
    a * x + b * y == gcd, where gcd is the greatest
    common divisor of a and b.

    This function implements the extended Euclidean
    algorithm and runs in O(log b) in the worst case.
    """
    s, old_s = 0, 1
    t, old_t = 1, 0
    r, old_r = b, a

    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t

    return old_r, old_s, old_t


def inverse_of(n, p):
    """
    Returns the multiplicative inverse of
    n modulo p.

    This function returns an integer m such that
    (n * m) % p == 1.
    """
    gcd, x, y = extended_euclidean_algorithm(n, p)
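    # Bezout's identity holds exactly, not just modulo p. Checking it
    # exactly also lets n == 0 reach the ValueError below.
    assert n * x + p * y == gcd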

    if gcd != 1:
        # Either n is 0, or p is not a prime number.
        raise ValueError(
            '{} has no multiplicative inverse '
            'modulo {}'.format(n, p))
    else:
        return x % p
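With an inverse available, division in $\mathbb{F}_p$ is a single multiplication. The sketch below is self-contained, so instead of the inverse_of function above it relies on the built-in pow(y, -1, p), which is available from Python 3.8; divide_mod is a hypothetical name used only for this illustration.

```python
def divide_mod(x, y, p):
    """Return x / y in F_p, i.e. x times the multiplicative inverse of y."""
    if y % p == 0:
        raise ZeroDivisionError('division by zero in F_{}'.format(p))
    return (x * pow(y, -1, p)) % p


quotient = divide_mod(5, 9, 23)
print(quotient)             # 21
print((quotient * 9) % 23)  # 5: multiplying back by 9 recovers the numerator
```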

Elliptic curves in $\mathbb{F}_p$

Now we have all the necessary elements to restrict elliptic curves over $\mathbb{F}_p$. The set of points, that in the previous post was:

$$\{(x, y) \in \mathbb{R}^2 \mid y^2 = x^3 + ax + b,\ 4a^3 + 27b^2 \ne 0\} \cup \{0\}$$

now becomes:

$$\{(x, y) \in (\mathbb{F}_p)^2 \mid y^2 \equiv x^3 + ax + b \pmod{p},\ 4a^3 + 27b^2 \not\equiv 0 \pmod{p}\} \cup \{0\}$$

where 0 is still the point at infinity, and $a$ and $b$ are two integers in $\mathbb{F}_p$.
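The two conditions in the set definition translate almost literally into code. This is a sketch under a few assumptions of mine: the point at infinity is represented by None, points are (x, y) tuples, and the function names are illustrative.

```python
def is_nonsingular(a, b, p):
    """Check the condition 4a^3 + 27b^2 != 0 (mod p)."""
    return (4 * a**3 + 27 * b**2) % p != 0


def is_on_curve(point, a, b, p):
    """Check whether point satisfies y^2 = x^3 + ax + b (mod p)."""
    if point is None:
        # None stands for the point at infinity, which is always on the curve.
        return True
    x, y = point
    return (y * y - (x**3 + a * x + b)) % p == 0


# A small test case: y^2 = x^3 + 2x + 3 over F_97.
print(is_nonsingular(2, 3, 97))        # True
print(is_on_curve((3, 6), 2, 3, 97))   # True, because 36 = 27 + 6 + 3
print(is_on_curve((3, 7), 2, 3, 97))   # False
```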

Figure (Elliptic curves in $\mathbb{F}_p$): The curve $y^2 \equiv x^3 - 7x + 10 \pmod{p}$ with $p = 19, 97, 127, 487$. Note that, for every $x$, there are at most two points. Also note the symmetry about $y = p/2$.

Figure (Singular curve in $\mathbb{F}_p$): The curve $y^2 \equiv x^3 \pmod{29}$ is singular and has a triple point in $(0, 0)$. It is not a valid elliptic curve.

What previously was a continuous curve is now a set of disjoint points in the $xy$-plane. But we can prove that, even if we have restricted our domain, elliptic curves in $\mathbb{F}_p$ still form an abelian group.

Point addition

Clearly, we need to change our definition of addition a bit in order to make it work in $\mathbb{F}_p$. With reals, we said that the sum of three aligned points was zero ($P + Q + R = 0$). We can keep this definition, but what does it mean for three points to be aligned in $\mathbb{F}_p$?

We can say that three points are aligned if there's a line that connects all of them. Now, of course, lines in $\mathbb{F}_p$ are not the same as lines in $\mathbb{R}$. We can say, informally, that a line in $\mathbb{F}_p$ is the set of points $(x, y)$ that satisfy the equation $ax + by + c \equiv 0 \pmod{p}$ (this is the standard line equation, with the addition of "$\pmod{p}$").

Figure (Point addition for elliptic curves in $\mathbb{Z}/p$): Point addition over the curve $y^2 \equiv x^3 - x + 3 \pmod{127}$, with $P = (16, 20)$ and $Q = (41, 120)$. Note how the line $y \equiv 4x + 83 \pmod{127}$ that connects the points "repeats" itself in the plane.
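To make "aligned" concrete, the following sketch walks along the line $y \equiv 4x + 83 \pmod{127}$ and keeps the points that also lie on the curve. It is a brute-force illustration with variable names of my choosing, not an efficient method.

```python
p, a, b = 127, -1, 3

# Every x in F_p gives exactly one point on the line y = 4x + 83 (mod p).
line = [(x, (4 * x + 83) % p) for x in range(p)]

# Keep only the points that also satisfy y^2 = x^3 + ax + b (mod p).
aligned = [(x, y) for (x, y) in line
           if (y * y - (x**3 + a * x + b)) % p == 0]

print(aligned)  # [(16, 20), (41, 120), (86, 46)]
```

The first two points are $P$ and $Q$; the third is the point $R$ for which $P + Q + R = 0$.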

Given that we are in a group, point addition retains the properties we already know:

$Q + 0 = 0 + Q = Q$ (from the definition of identity element).
Given a non-zero point $Q$, the inverse $-Q$ is the point having the same abscissa but opposite ordinate. Or, if you prefer, $-Q = (x_Q, -y_Q \bmod p)$. For example, if a curve in $\mathbb{F}_{29}$ has a point $Q = (2, 5)$, the inverse is $-Q = (2, -5 \bmod 29) = (2, 24)$.
Also, $P + (-P) = 0$ (from the definition of inverse element).

Algebraic sum

The equations for calculating point additions are exactly the same as in the previous post, except for the fact that we need to add "$\bmod p$" at the end of every expression. Therefore, given $P = (x_P, y_P)$, $Q = (x_Q, y_Q)$ and $R = (x_R, y_R)$, we can calculate $P + Q = -R$ as follows:

$$\begin{aligned} x_R &= (m^2 - x_P - x_Q) \bmod p \\ y_R &= [y_P + m(x_R - x_P)] \bmod p \\ &= [y_Q + m(x_R - x_Q)] \bmod p \end{aligned}$$

If $P \ne Q$, the slope $m$ assumes the form:

$$m = (y_P - y_Q)(x_P - x_Q)^{-1} \bmod p$$

Else, if $P = Q$, we have:

$$m = (3 x_P^2 + a)(2 y_P)^{-1} \bmod p$$

It's not a coincidence that the equations have not changed: in fact, these equations work in every field, finite or infinite (with the exception of $\mathbb{F}_2$ and $\mathbb{F}_3$, which are special cased). Now I feel I have to provide a justification for this fact. The problem is: proofs for the group law generally involve complex mathematical concepts. However, I found a proof by Stefan Friedl that uses only elementary concepts. Read it if you are interested in why these equations work in (almost) every field.

Back to us: we won't define a geometric method, because there are a few problems with that. For example, in the previous post, we said that to compute $P + P$ we needed to take the tangent to the curve in $P$. But without continuity, the word "tangent" does not make any sense. We can work around this and other problems, but a purely geometric method would be too complicated and not practical at all.

Instead, you can play with the interactive tool I've written for computing point additions.
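The algebraic formulas above can also be sketched in a few lines of Python. The conventions here are assumptions of mine rather than part of the original post: None represents the point at infinity, coordinates are already reduced to the range 0 to $p - 1$, and inverses come from the built-in pow(x, -1, p) of Python 3.8+.

```python
def point_add(P, Q, a, p):
    """Return P + Q on the curve y^2 = x^3 + ax + b over F_p (b is not needed)."""
    if P is None:
        return Q                      # 0 + Q = Q
    if Q is None:
        return P                      # P + 0 = P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                   # P + (-P) = 0; also covers doubling a point with y = 0
    if P == Q:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p   # "tangent" slope
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p          # slope through P and Q
    x3 = (m * m - x1 - x2) % p
    y3 = (y1 + m * (x3 - x1)) % p
    return (x3, -y3 % p)              # P + Q = -R, so flip the ordinate of R


# The example from the figure above: y^2 = x^3 - x + 3 over F_127.
print(point_add((16, 20), (41, 120), -1, 127))  # (86, 81)
```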

The order of an elliptic curve group

We said that an elliptic curve defined over a finite field has a finite number of points. An important question that we need to answer is: how many points are there exactly?

Firstly, let's say that the number of points in a group is called the order of the group.

Trying all the possible values for $x$ from 0 to $p - 1$ is not a feasible way to count the points, as it would require $O(p)$ steps, and this is "hard" if $p$ is a large prime.

Luckily, there's a faster algorithm for computing the order: Schoof's algorithm. I won't go into the details of the algorithm; what matters is that it runs in polynomial time, and this is what we need.
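For toy-sized primes, the naive $O(p)$ count is still handy for experiments. To be clear, the sketch below is the brute-force approach and not Schoof's algorithm; count_points is a name I made up for it.

```python
def count_points(a, b, p):
    """Count the points of y^2 = x^3 + ax + b over F_p, including the point at infinity."""
    # For each value v, record how many y in F_p satisfy y^2 = v (mod p).
    roots = {}
    for y in range(p):
        v = (y * y) % p
        roots[v] = roots.get(v, 0) + 1

    total = 1  # the point at infinity
    for x in range(p):
        total += roots.get((x**3 + a * x + b) % p, 0)
    return total


print(count_points(-1, 1, 29))  # 37
```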

Scalar multiplication and cyclic subgroups

As with reals, multiplication can be defined as:

$$nP = \underbrace{P + P + \cdots + P}_{n\ \text{times}}$$

And, again, we can use the double and add algorithm to perform multiplication in $O(\log n)$ steps (or $O(k)$, where $k$ is the number of bits of $n$). I've written an interactive tool for scalar multiplication too.
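Here is a minimal sketch of double and add over $\mathbb{F}_p$. It assumes a non-negative $n$, represents the point at infinity as None, and bundles a compact point_add helper so that the block runs on its own (Python 3.8+ for the modular inverse via pow).

```python
def point_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + ax + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, -(y1 + m * (x3 - x1)) % p)


def scalar_mult(n, P, a, p):
    """Return nP using double and add (n is assumed to be non-negative)."""
    result = None      # running sum, starts at the point at infinity
    addend = P         # holds P, 2P, 4P, 8P, ... as the loop proceeds
    while n:
        if n & 1:      # this bit of n is set: add the current power-of-two multiple
            result = point_add(result, addend, a, p)
        addend = point_add(addend, addend, a, p)   # double
        n >>= 1
    return result


# A small test curve: y^2 = x^3 + 2x + 3 over F_97.
print(scalar_mult(2, (3, 6), 2, 97))  # (80, 10)
```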

Multiplication over points for elliptic curves in $\mathbb{F}_p$ has an interesting property. Take the curve $y^2 \equiv x^3 + 2x + 3 \pmod{97}$ and the point $P = (3, 6)$. Now calculate all the multiples of $P$:

Figure (Cyclic subgroup): The multiples of $P = (3, 6)$ are just five distinct points ($0$, $P$, $2P$, $3P$, $4P$) and they are repeating cyclically. It's easy to spot the similarity between scalar multiplication on elliptic curves and addition in modular arithmetic.

$0P = 0$
$1P = (3, 6)$
$2P = (80, 10)$
$3P = (80, 87)$
$4P = (3, 91)$
$5P = 0$
$6P = (3, 6)$
$7P = (80, 10)$
$8P = (80, 87)$
$9P = (3, 91)$
...

Here we can immediately spot two things: firstly, the multiples of $P$ are just five: the other points of the elliptic curve never appear. Secondly, they are repeating cyclically. We can write:

$5kP = 0$
$(5k + 1)P = P$
$(5k + 2)P = 2P$
$(5k + 3)P = 3P$
$(5k + 4)P = 4P$

for every integer $k$. Note that these five equations can be "compressed" into a single one, thanks to the modulo operator: $kP = (k \bmod 5)P$.

Not only that, but we can immediately verify that these five points are closed under addition. Which means: however I add $0$, $P$, $2P$, $3P$ or $4P$, the result is always one of these five points. Again, the other points of the elliptic curve never appear in the results.

The same holds for every point, not just for $P = (3, 6)$. In fact, if we take a generic $P$:

$$nP + mP = \underbrace{P + \cdots + P}_{n\ \text{times}} + \underbrace{P + \cdots + P}_{m\ \text{times}} = (n + m)P$$

Which means: if we add two multiples of $P$, we obtain a multiple of $P$ (i.e. multiples of $P$ are closed under addition). This is enough to prove that the set of the multiples of $P$ is a cyclic subgroup of the group formed by the elliptic curve.
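The closure property is easy to check experimentally. The sketch below lists the multiples of a point by repeated addition until the sum returns to the point at infinity, then verifies that adding any two of them stays inside the list. As before, None stands for the point at infinity and the helper names are illustrative.

```python
def point_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + ax + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, -(y1 + m * (x3 - x1)) % p)


def generate_subgroup(P, a, p):
    """Return [0, P, 2P, ...], stopping just before the multiples start repeating."""
    points = [None]
    Q = P
    while Q is not None:
        points.append(Q)
        Q = point_add(Q, P, a, p)
    return points


subgroup = generate_subgroup((3, 6), 2, 97)
print(subgroup)  # [None, (3, 6), (80, 10), (80, 87), (3, 91)]

# Closure: the sum of any two multiples of P is again a multiple of P.
closed = all(point_add(A, B, 2, 97) in subgroup
             for A in subgroup for B in subgroup)
print(closed)    # True
```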

A "subgroup" is a group which is a subset of another group. A "cyclic subgroup" is a subgroup whose elements repeat cyclically, as shown in the previous example. The point $P$ is called generator or base point of the cyclic subgroup.

Cyclic subgroups are the foundations of ECC and other cryptosystems. We will see why in the next post.

Subgroup order

We can ask ourselves what the order of a subgroup generated by a point $P$ is (or, equivalently, what the order of $P$ is). To answer this question we can't use Schoof's algorithm, because that algorithm only works on whole elliptic curves, not on subgroups. Before approaching the problem, we need a few more bits:

So far, we have defined the order as the number of points of a group. This definition is still valid, but within a cyclic subgroup we can give a new, equivalent definition: the order of $P$ is the smallest positive integer $n$ such that $nP = 0$. In fact, if you look at the previous example, our subgroup contained five points, and we had $5P = 0$.
The order of $P$ is linked to the order of the elliptic curve by Lagrange's theorem, which states that the order of a subgroup is a divisor of the order of the parent group. In other words, if an elliptic curve contains $N$ points and one of its subgroups contains $n$ points, then $n$ is a divisor of $N$.

These two pieces of information together give us a way to find out the order of a subgroup with base point $P$:

1. Calculate the elliptic curve's order $N$ using Schoof's algorithm.
2. Find out all the divisors of $N$.
3. For every divisor $n$ of $N$, compute $nP$.
4. The smallest $n$ such that $nP = 0$ is the order of the subgroup.

For example, the curve $y^2 = x^3 - x + 3$ over the field $\mathbb{F}_{37}$ has order $N = 42$. Its subgroups may have order $n = 1$, $2$, $3$, $6$, $7$, $14$, $21$ or $42$. If we try $P = (2, 3)$ we can see that $P \ne 0$, $2P \ne 0$, ..., $7P = 0$, hence the order of $P$ is $n = 7$.

Note that it's important to take the smallest divisor, not a random one. If we proceeded randomly, we could have taken $n = 14$, which is not the order of the subgroup, but one of its multiples.
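The procedure above can be sketched as follows. The curve order $N$ is taken as an input here (computing it is Schoof's job), divisors are found by plain trial division, which is only reasonable for small $N$, and the helper functions are bundled so the block is self-contained.

```python
def point_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + ax + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, -(y1 + m * (x3 - x1)) % p)


def scalar_mult(n, P, a, p):
    """Return nP using double and add (n >= 0)."""
    result, addend = None, P
    while n:
        if n & 1:
            result = point_add(result, addend, a, p)
        addend = point_add(addend, addend, a, p)
        n >>= 1
    return result


def subgroup_order(P, N, a, p):
    """Return the smallest divisor n of N such that nP is the point at infinity."""
    for n in range(1, N + 1):          # divisors are visited in increasing order
        if N % n == 0 and scalar_mult(n, P, a, p) is None:
            return n
    raise ValueError('N is not the order of the curve containing P')


# The example from the text: y^2 = x^3 - x + 3 over F_37, N = 42, P = (2, 3).
print(subgroup_order((2, 3), 42, -1, 37))  # 7
```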

Another example: the elliptic curve defined by the equation $y^2 = x^3 - x + 1$ over the field $\mathbb{F}_{29}$ has order $N = 37$, which is a prime. Its subgroups may only have order $n = 1$ or $37$. As you can easily guess, when $n = 1$, the subgroup contains only the point at infinity; when $n = N$, the subgroup contains all the points of the elliptic curve.

Finding a base point

For our ECC algorithms, we want subgroups with a high order. So in general we will choose an elliptic curve, calculate its order ($N$), choose a high divisor as the subgroup order ($n$) and eventually find a suitable base point. That is: we won't choose a base point and then calculate its order, but we'll do the opposite: we will first choose an order that looks good enough and then we will hunt for a suitable base point. How do we do that?

Firstly, we need to introduce one more term. Lagrange's theorem implies that the number $h = N/n$ is always an integer (because $n$ is a divisor of $N$). The number $h$ has a name: it's the cofactor of the subgroup.

Now consider that for every point of an elliptic curve we have $NP = 0$. This happens because $N$ is a multiple of any candidate $n$. Using the definition of cofactor, we can write:

$$n(hP) = 0$$

Now suppose that $n$ is a prime number (for reasons that will be explained in the next post, we prefer prime orders). This equation, written in this form, is telling us that the point $G = hP$ generates a subgroup of order $n$ (except when $G = hP = 0$, in which case the subgroup has order 1).

In the light of this, we can outline the following algorithm:

1. Calculate the order $N$ of the elliptic curve.
2. Choose the order $n$ of the subgroup. For the algorithm to work, this number must be prime and must be a divisor of $N$.
3. Compute the cofactor $h = N/n$.
4. Choose a random point $P$ on the curve.
5. Compute $G = hP$.
6. If $G$ is 0, then go back to step 4. Otherwise we have found a generator of a subgroup with order $n$ and cofactor $h$.

Note that this algorithm only works if $n$ is a prime. If $n$ wasn't a prime, then the order of $G$ could be one of the divisors of $n$.
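A sketch of this algorithm for toy curves follows. Several parts are simplifications of mine: $N$ and $n$ are passed in rather than computed, and the "random point" is found by picking a random $x$ and searching for a matching $y$ by brute force, which is neither uniform over the group nor usable for large $p$.

```python
import random


def point_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + ax + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, -(y1 + m * (x3 - x1)) % p)


def scalar_mult(n, P, a, p):
    """Return nP using double and add (n >= 0)."""
    result, addend = None, P
    while n:
        if n & 1:
            result = point_add(result, addend, a, p)
        addend = point_add(addend, addend, a, p)
        n >>= 1
    return result


def random_point(a, b, p):
    """Pick a random x and return the first (x, y) on the curve, retrying if none exists."""
    while True:
        x = random.randrange(p)
        rhs = (x**3 + a * x + b) % p
        for y in range(p):
            if (y * y) % p == rhs:
                return (x, y)


def find_base_point(N, n, a, b, p):
    """Return a generator of a subgroup of prime order n, given the curve order N."""
    h = N // n                           # step 3: the cofactor
    while True:
        P = random_point(a, b, p)        # step 4
        G = scalar_mult(h, P, a, p)      # step 5
        if G is not None:                # step 6: retry if G is the point at infinity
            return G


G = find_base_point(42, 7, -1, 3, 37)
print(G, scalar_mult(7, G, -1, 37))  # some point of order 7, followed by None
```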

Discrete logarithm

As we did when working with continuous elliptic curves, we are now going to discuss the question: if we know $P$ and $Q$, what is $k$ such that $Q = kP$?

This problem, which is known as the discrete logarithm problem for elliptic curves, is believed to be a "hard" problem, in that there is no known polynomial time algorithm that can run on a classical computer. There are, however, no mathematical proofs for this belief.

This problem is also analogous to the discrete logarithm problem used with other cryptosystems such as the Digital Signature Algorithm (DSA), the Diffie-Hellman key exchange (D-H) and the ElGamal algorithm. It's not a coincidence that they have the same name. The difference is that, with those algorithms, we use modular exponentiation instead of scalar multiplication. Their discrete logarithm problem can be stated as follows: if we know $a$ and $b$, what's $k$ such that $b = a^k \bmod p$?

Both these problems are "discrete" because they involve finite sets (more precisely, cyclic subgroups). And they are "logarithms" because they are analogous to ordinary logarithms.
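To get a feel for the problem, here is the most naive attack imaginable: try $k = 0, 1, 2, \dots$ until $kP$ equals $Q$. This is a sketch for toy curves only; it needs a number of steps proportional to the subgroup order, which is exactly what makes it useless against real parameters. The function name is hypothetical.

```python
def point_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + ax + b over F_p; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, -(y1 + m * (x3 - x1)) % p)


def discrete_log_bruteforce(P, Q, a, p):
    """Return the smallest k >= 0 with kP == Q, trying every k in turn."""
    R, k = None, 0                 # invariant: R == kP
    while R != Q:
        R = point_add(R, P, a, p)
        k += 1
        if R is None:              # we wrapped around without meeting Q
            raise ValueError('Q is not a multiple of P')
    return k


# On y^2 = x^3 + 2x + 3 over F_97 with P = (3, 6):
print(discrete_log_bruteforce((3, 6), (80, 87), 2, 97))  # 3
```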

What makes ECC interesting is that, as of today, the discrete logarithm problem for elliptic curves seems to be "harder" if compared to other similar problems used in cryptography. This implies that we need fewer bits for the integer $k$ in order to achieve the same level of security as with other cryptosystems, as we will see in detail in the fourth and last post of this series.

More next week!

Enough for today! I really hope you enjoyed this post. Leave a comment if you didn't.

Next week's post will be the third in this series and will be about ECC algorithms: key pair generation, ECDH and ECDSA. That will be one of the most interesting parts of this series. Don't miss it!

Read the next post of the series »
