[2020][ASIACRYPT] Estimating quantum speedups for lattice sieves (talk transcript)
Eamonn W. Postlethwaite
Hello, everybody, and thank you for watching this YouTube video. I'm going to be presenting work on "Estimating quantum speedups for lattice sieves". This is joint with my wonderful co-authors Martin R. Albrecht, Vlad Gheorghiu and John M. Schanck, and I'm Eamonn.
So what we are doing is trying to understand lattice sieves, and in particular their quantum variants, in the non-asymptotic regime, so for particular dimensions of lattices that might be interesting for cryptanalysis. We're going to approach this problem by designing particular quantum circuits for some subroutines of these lattice sieves, and we're also going to design some software which then optimizes these quantum circuits with respect to some germane cost metrics. And finally, we're doing this because a number of lattice-based cryptosystems have their concrete security estimated using such sieves.
So, to begin with: what is a lattice for the purposes of this talk? A lattice is a discrete additive subgroup of real space, here two-dimensional real space. It's usually described by a basis, which is any collection of linearly independent real vectors whose integer span gives all of the points of the lattice. And any lattice of dimension two or greater has an infinite number of such bases.
Now, the problems that lattice sieves solve are shortest vector type problems; on screen currently is the exact form of this problem, whereby you are given a basis of a lattice and asked to find a non-zero vector in that lattice which is shorter than, or as short as, every other non-zero vector. Here short means Euclidean, and it does throughout this talk. The sieves that we consider actually solve an approximate variant of this problem, but it's really much the same.
And briefly, how do these sieves work? Well, the first thing that they do is sample a very large number of lattice vectors, a number that is exponential in the dimension of the lattice. If you're given a basis that has no special structure, which it shouldn't really if you're thinking in terms of cryptanalysis, then the vectors you sample in this way are going to be very long; in particular, they're going to be longer than the shortest vector in the lattice by a factor that is again exponential in the dimension of the lattice. This operation is cheap, at least in the sense that it is cheap to sample a single lattice vector. And a heuristic introduced in the Nguyen-Vidick work of 2008 says that all of these long vectors that you sample using the basis are going to be approximately the same length, so you can consider them as lying in some thin annulus, or even better on the surface of some very large sphere; and then, to ease analysis, you normalize, and you end up with all of these lattice vectors assumed to lie on the surface of the unit sphere.
And then what a lattice sieve does is call a nearest neighbor search subroutine (NNS). Now, in generality, this is some procedure that, given a list, tries to find tuples of elements from that list that satisfy some condition, and this condition usually encodes some idea of being close or being near. The sieves that we consider are going to be 2-sieves, so these tuples are going to be pairs, and the nearness condition is that the inner product of these lattice vectors is greater than or equal to $\cos\frac{\pi}{3}$. And this is because, on the surface of the unit sphere, this condition simply means that their difference has length one or less. So a lattice sieve calls a nearest neighbor search subroutine, and somewhere within this nearest neighbor search subroutine there is going to be a list search.
Calculating inner products of the lattice vectors to decide nearness in this way is only meaningful on the surface of the unit ball; how, then, does it apply to the SVP problem?
Now, the idea of a lattice sieve is to start with sufficiently many points that you can find enough pairs of points satisfying this nearness condition, so that after you find these pairs you actually take their differences and end up with a similarly sized list of slightly shorter lattice vectors; and then you iterate this procedure, so you iterate this nearest neighbor search within your lattice.
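As an illustration of the idea just described, here is a minimal sketch of one naive 2-sieve iteration; it is not code from the talk or the paper, and it ignores all of the bookkeeping and nearest neighbor data structures that real sieves use:

```python
import numpy as np

def sieve_step(vectors):
    """One naive 2-sieve iteration: keep the differences of pairs whose
    normalized inner product is at least cos(pi/3); for roughly equal-length
    vectors this difference is no longer than the vectors themselves."""
    threshold = np.cos(np.pi / 3)
    unit = [v / np.linalg.norm(v) for v in vectors]   # heuristic: treat points as on a sphere
    out = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):          # the double loop over the list
            if np.dot(unit[i], unit[j]) >= threshold:
                out.append(vectors[i] - vectors[j])   # a (hopefully shorter) lattice vector
    return out
```

Iterating `sieve_step` on a sufficiently large starting list is the basic shape of the Nguyen-Vidick style sieve mentioned below.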
And indeed, a lot of the progress that's been made in practical lattice sieving over the past decade or so has come from finding more and more efficient ways of implementing this nearest neighbor search subroutine. We're going to consider three: the Nguyen-Vidick style, which sort of started off practical lattice sieving and is a double loop over the list; and then these last two, random bucket and list decoding, which are nearest neighbor search subroutines that use something called locality sensitive filtering (LSF). This is the idea that you somehow pre-process your list of lattice points, which maybe costs you something in complexity, but also gives you sub-lists in which points are more likely to satisfy the nearness condition. We pick random bucket because it was implemented in the General Sieve Kernel, and list decoding because it represents the fastest known asymptotic nearest neighbor search subroutine.
Now, something important to notice here is that the quantum variants of all of these lattice sieves take the nearest neighbor search subroutine, and somewhere within that nearest neighbor search subroutine there is a subroutine that searches through a list; at some point you must do that. The quantum variants replace that classical list search with Grover's search algorithm. Another thing to notice is that all of the lattice sieves we're going to consider, and indeed all lattice sieves, require exponential space; the exact value of the exponent is not important, but the space grows exponentially with the dimension of the lattice.
So, now that we have a brief understanding of how lattice sieves operate, how are we going to get to this non-asymptotic understanding of their complexity? Well, we're going to introduce a series of search routines, ending with a filtered quantum search routine. And just to give a frame of reference, this function $f$, which we're calling a predicate, is going to encode the condition at the bottom of the screen (referring to: find pairs $(v_i, v_j)$ such that $\lVert v_i - v_j \rVert \leq 1 \Leftrightarrow \langle v_i, v_j \rangle \geq \cos\frac{\pi}{3}$), so this inner product condition ultimately.
But we're going to speak about it in generality for a little while. We're going to call $f$, defined on $[N] = \{1,\cdots,N\}$, an unstructured predicate, so we don't know any internal structure or workings of $f$. We're going to call its roots all of the elements on which it evaluates to 0, and we're going to call its kernel the collection of all of these roots.
So, depending on whether we're in the classical or quantum world, we find a root in a different way. Classically, we can certainly find a root, if one exists, by simply evaluating $f$ on its entire domain. And quantumly, we can actually measure this expression here; this is Grover's algorithm in a succinct form. $\mathbf{D}|0\rangle$ means putting the elements 1 to $N$ in an equal superposition. $\mathbf{G}(f)$ is the Grover iteration encoding the predicate $f$, so it's some reversible implementation of $f$ followed by a reflection operator. And if you apply this to the equal superposition $j$ times and measure, then you're meant to receive a root of $f$.
In particular, suppose we're in the regime where the kernel of $f$ is small, so there are perhaps a small constant number of roots. Then classically we expect this process to take on the order of $N$ queries, and quantumly we expect to have to apply the Grover iteration on the order of $\sqrt{N}$ times. And this is the magic of Grover's algorithm. But of course, the relative costs of $f$ in the classical world and $\mathbf{G}(f)$ in the quantum world are important.
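To make that succinct form explicit, here is a standard back-of-the-envelope statement (the $\pi/4$ constant is the textbook Grover value, not a figure from the talk): writing $K = |\ker f|$ for the number of roots among the $N$ elements, one runs

$$ \mathrm{measure}\Bigl( \mathbf{G}(f)^{\,j}\,\mathbf{D}\,|0\rangle \Bigr), \qquad j \approx \left\lfloor \frac{\pi}{4}\sqrt{\frac{N}{K}} \right\rfloor, $$

which returns a root of $f$ with probability close to 1, compared with the roughly $N/K$ classical evaluations of $f$ one expects to need.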
So, staying in the classical world for a little while longer: if $f$ is very expensive, we might somehow like to filter the calls we make to $f$. To do this we define some second predicate, which we're calling a filter, $g$, with the same domain and codomain, and we demand that it has at least one root in common with $f$. Classically, we could then think about evaluating $g(1)$ and only evaluating $f(1)$ if we've hit a root of $g$ at 1, and continuing this process.
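A minimal sketch of that classical filtered search, under the convention above that a root is an element on which a predicate evaluates to 0 ($f$ and $g$ are arbitrary stand-ins, not functions from the paper):

```python
def filtered_classical_search(N, f, g):
    """Scan 1..N, evaluating the cheap filter g first and only calling the
    expensive predicate f on elements that are roots of g."""
    for x in range(1, N + 1):
        if g(x) == 0 and f(x) == 0:  # short-circuit: f is only evaluated when g(x) == 0
            return x                 # x is a common root of g and f
    return None
```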
Now, this is not always a good idea; it depends on the exact properties of $g$. But there are certain properties which, if $g$ has them, mean this is probably going to be a good filter. In particular, if $g$ is cheap to evaluate, and certainly cheaper than $f$ (otherwise you would just evaluate $f$), then this may be a good idea; and also, if the false positive and false negative rates are small, this again points to it being a good filter.
So just to take one of these quantities, $\rho_f(g)$, the false positive rate of $g$: if this is large, it says that not many of the roots of $g$ are roots of $f$, and therefore a lot of the time when you hit a root of $g$, you're going to evaluate $f$ and you're not going to get a root of $f$. Whereas if this is small, it says that most of the roots of $g$ are roots of $f$. And the false negative rate is somehow the opposite.
So, the problem with applying this kind of idea immediately to the quantum world is that you can't perform this kind of conditional branching within Grover's algorithm. And one of the first contributions of our work is to give a technical lemma that roughly says the following:

Let $g$ be a filter for a predicate $f$, and suppose two conditions hold involving the real numbers $P$, $Q$ and $\gamma$, which I shall come back to in a second. The upshot of this lemma is that we can find a root of $f$, with some constant probability, at a cost dominated by $\frac{\gamma}{2}\sqrt{\frac{N}{Q}}$ calls to the Grover iteration; but it is the Grover iteration that encodes the filter and not the predicate, and in particular, if the filter is a much cheaper quantum circuit than the predicate, then this might be a good thing.

Now, there are some details being swept under the rug here, the most pertinent of which is that we actually have to make sure that this is the dominating cost; we do this in our analysis, so I'm presenting it this way for the sake of clarity. These two conditions using $P$, $Q$ and $\gamma$ effectively encode how much we know about the sizes of the kernels and of the intersections of various kernels. $\gamma$ can go as small as 1, and as $\gamma$ gets smaller we're saying we know more about the size of the kernel of $g$; we might know, for example, its expected value or something like this, and if $\gamma$ is 1 then we know the size of the kernel of $g$ exactly. Indeed, the cost decreases with $\gamma$. Similarly, the larger $Q$ can be, the bigger the intersection of the kernel of the predicate and that of the filter; and the larger this is, if you think back to the false positive and false negative rates on the previous slide, the lower those rates are going to be.
And so, yes, the crucial thing now is that we're talking about this filtered quantum search routine in terms of the cost of the Grover iteration of the filter, and not of the predicate itself.
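To see why this matters, here is a toy cost comparison; it is only a sketch, the function names and unit costs are hypothetical, and the only ingredient taken from the lemma above is the $\frac{\gamma}{2}\sqrt{N/Q}$ count of calls to the filter's Grover iteration:

```python
from math import sqrt

def unfiltered_cost(N, cost_G_f):
    """Roughly sqrt(N) calls to the Grover iteration of the predicate f
    (assuming a small, constant-size kernel)."""
    return sqrt(N) * cost_G_f

def filtered_cost(N, Q, gamma, cost_G_g):
    """Roughly (gamma/2) * sqrt(N/Q) calls to the Grover iteration of the
    *filter* g, assuming the remaining terms of the lemma are non-dominant."""
    return (gamma / 2) * sqrt(N / Q) * cost_G_g

# Hypothetical numbers purely for illustration: the filter circuit is far cheaper
# than the predicate circuit, so the filtered search wins despite gamma > 1.
N, Q, gamma = 2**40, 2**10, 2.0
print(unfiltered_cost(N, cost_G_f=10_000))
print(filtered_cost(N, Q, gamma, cost_G_g=500))
```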
So, to move back to the case of lattice sieving, and in particular nearest neighbor search subroutines: we have this large list of lattice vectors. We're going to fix one of them and call it $u$; our predicate is going to be $f_u$; and then we have this other list of vectors $\{v_1,\cdots,v_N\}$. And we're trying to find the $v_i$ such that the inner product with $u$ is greater than or equal to $\cos\frac{\pi}{3}$.
Now, so this is the predicate; what's the filter? We're going to use something called XOR and popcount, which was introduced to the sieving world in a paper of Fitzpatrick et al. The idea here is that you have a certain number of planes, represented by $n$; in this 2-dimensional example, these are the lines $h_1$ and $h_2$. And you have a threshold $k$, and the idea is that if some $v_i$ lies on the same side of at least $k$ planes as $u$ does, then it passes the filter. So in particular, $v_1$ and $v_2$ pass the filter here, because they're on the same side of at least one plane as $u$, but $v_0$ doesn't, because it's on the same side of no planes as $u$.
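A minimal sketch of this filter, following the talk's convention that a vector passes if it lies on the same side of at least $k$ of the $n$ planes as $u$; real implementations pack the sign bits into machine words and use an XOR followed by a popcount instruction, and the names and parameters below are purely illustrative:

```python
import numpy as np

def sign_sketch(v, hyperplanes):
    """One bit per plane: which side of plane h_i the vector v lies on."""
    return hyperplanes @ v >= 0          # boolean vector of length n

def popcount_filter(u, v, hyperplanes, k):
    """Pass if u and v lie on the same side of at least k of the n planes;
    equivalently, the popcount of the XOR of their sketches is at most n - k."""
    agree = sign_sketch(u, hyperplanes) == sign_sketch(v, hyperplanes)
    return int(agree.sum()) >= k

# Illustrative use in dimension d with n random planes through the origin.
d, n, k = 64, 31, 20
rng = np.random.default_rng(0)
hyperplanes = rng.standard_normal((n, d))
u, v = rng.standard_normal(d), rng.standard_normal(d)
print(popcount_filter(u, v, hyperplanes, k))
```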
So, we had a filtered quantum search routine that was costed in terms of the Grover iteration encoding the filter, and I've just introduced a filter called popcount. The natural thing to do next is to design a quantum circuit for this filter; this is what's on screen at the moment. I think this is the case $n=31$, and it's reversible; this is just the forward direction. Given this, we now know the size of the quantum circuit for popcount, and of the quantum circuit for the Grover iteration of popcount. Another thing that we do in our work is to give a heuristic analysis of the false positive and false negative rates of this filter as a function of its two parameters, $k$, the cutoff, and $n$, the number of planes, but also of the dimension $d$ of the lattice that we're sitting in.
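The analysis of those rates in the paper is analytical; as a rough cross-check one can also estimate them empirically under the heuristic that sieve vectors look like uniform points on the unit sphere. A minimal Monte Carlo sketch (the dimension, parameters, and the particular conditioning used to define the rates are illustrative choices, not values from the paper):

```python
import numpy as np

def estimate_filter_rates(d=16, n=31, k=20, trials=200_000, seed=1):
    """Monte Carlo estimate of the popcount filter's false positive / negative
    rates against the predicate <u, v> >= cos(pi/3), for uniform unit vectors."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((n, d))                   # random plane normals
    U = rng.standard_normal((trials, d))
    V = rng.standard_normal((trials, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)     # heuristic: points on the unit sphere
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    predicate = (U * V).sum(axis=1) >= np.cos(np.pi / 3)
    passes = ((U @ H.T >= 0) == (V @ H.T >= 0)).sum(axis=1) >= k
    false_pos = (~predicate[passes]).mean()           # passes the filter, fails the predicate
    false_neg = (~passes[predicate]).mean()           # satisfies the predicate, fails the filter
    return false_pos, false_neg

print(estimate_filter_rates())
```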
And so we almost have all of the pieces of the puzzle. The final thing that we need is a collection of cost metrics, and we're going to follow the work of Jaques and Schanck from CRYPTO last year, which costs a quantum computation in terms of the classical control required to run it. Depending on what assumptions you're willing to make about quantum memory, this leads to a number of different cost metrics.
The first of these is the so-called gates metric, and in this world you're assuming that self-correcting quantum memory exists; i.e., if you just have an identity wire in your circuit, then ensuring that the wire maintains its correct state for that layer of depth is a free operation, because with self-correcting quantum memory you don't have to do anything. And so the cost of your circuit is just the classical cost of enacting all of the quantum gates in your quantum circuit, and you assign some constant cost to each gate.
Now, it's an open problem whether self-correcting quantum memory exists or not, and if you don't think that it does, then you end up in a world where you have to actively correct your quantum memory. In this world, not only the gates but also the identity wires in your circuit, that is, wires in a layer of depth of your circuit with no gate on them, have themselves to be acted upon to make sure they maintain the correct quantum state. And so the first metric in this world of active error correction is the depth-width metric, which simply says that every part of your circuit needs classical control somehow, and so the cost of your circuit is some constant times the depth times the width of your circuit.
But then you can concretize further and actually talk about how you're going to ensure that all the correct states are kept. To do this you need some sort of error correction routine, and in particular we use the surface code in this work. Roughly, how this works is that you have a logical qubit for each wire in your quantum circuit, formed of something like $\log^2(DW)$ noisy physical qubits, so the number of physical qubits required for each logical qubit actually grows as your circuit grows. And for each layer of quantum depth, some number of these noisy physical qubits have to be measured, some classical routine has to be run on those measurements, and they have to be re-initialized; this keeps the noise of those physical qubits low enough that the logical qubit maintains the correct state. And so, if you think about that procedure, then every quantum gate and every identity wire in your circuit actually costs you something like $\Omega(\log^2(DW))$.
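As a rough illustration of how these three metrics scale, here is a toy sketch; the constants, function names, and circuit shape are hypothetical, and the paper's actual surface-code accounting, following Gidney and Ekerå, is considerably more detailed:

```python
from math import log2

def g_cost(num_gates, per_gate=1):
    """Gates metric: self-correcting memory assumed, so only gates cost anything."""
    return per_gate * num_gates

def dw_cost(depth, width, per_slot=1):
    """Depth-width metric: every (wire, layer) slot needs classical control."""
    return per_slot * depth * width

def surface_code_cost(depth, width, per_slot=1):
    """Toy surface-code style metric: each slot carries an extra ~log^2(DW)
    overhead for measuring and re-initializing the noisy physical qubits."""
    return per_slot * depth * width * log2(depth * width) ** 2

# Hypothetical circuit shape, purely for illustration.
depth, width, num_gates = 10**6, 10**3, 10**8
print(g_cost(num_gates), dw_cost(depth, width), surface_code_cost(depth, width))
```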
But the whole point of this work is to move past asymptotics as far as possible. So we concretize even further and look at a recent and high-fidelity study of the surface code for particular quantum computations, which was given by Gidney and Ekerå, and we adapt their scripts to our use case.
That is, we use the error correction model from [GE19] and the Clifford+T gate set to realize the quantum computation.
And so now we find ourselves in the position where we have a filtered quantum search routine; we have a quantum circuit for the filter we want to use, and therefore an understanding of its costs; we have an understanding of its false positive and false negative rates; and we have a series of cost metrics that represent different realities. And we write some software which is, effectively, an optimization piece of software: we give it access to the $k$ and $n$ which determine your popcount filter, and to any other internal parameters of your nearest neighbor search subroutine, and it is asked, for a given $d$ and a given metric, what is the cheapest way this can be performed.
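Schematically, that optimization can be thought of as a brute-force search over parameters. The sketch below is a paraphrase of the description above, not the actual scripts from the repository, and all names are illustrative:

```python
def cheapest_configuration(d, metric_cost, candidates):
    """For a lattice dimension d and a chosen cost metric, try every candidate
    parameter set (popcount n and k, plus any parameters internal to the
    nearest neighbor search subroutine) and return the cheapest one found."""
    best_cost, best_params = float("inf"), None
    for params in candidates:
        cost = metric_cost(d, params)   # cost of one NNS call under this metric
        if cost < best_cost:
            best_cost, best_params = cost, params
    return best_cost, best_params
```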
And so, we produce a number of figures in our work, and we're going to reproduce some of them here; we're going to look only at the list decoding sieve, which has the fastest known asymptotic nearest neighbor search subroutine. The dashed lines are the asymptotic complexities, classical and quantum; the solid lines are the complexities suggested by our optimization software. And in this depth-width metric, the quantum crossover, i.e. the point at which the quantum variant becomes faster, is quite low down in the dimensions. But this was the more optimistic of the two active error correction cost metrics that we considered.
If we instead look at the Gidney-Ekerå study of the surface code, then this crossover point moves quite far up in the dimensions.
Now, what to take away from the estimates in this paper is that they suggest a great deal less quantum advantage than the asymptotics would suggest. So here, 0.292 is the asymptotic exponent for the classical list decoding sieve and 0.265 for the quantum. But we also must say that they don't entirely rule out the relevance of quantum sieves in cryptanalytically interesting ranges.
So, here are some hopefully pertinent numbers. We're again talking about list decoding, as in the previous two figures, and the metric runs down the left. In the Gidney-Ekerå metric, the crossover point, where the classical and quantum nearest neighbor search routines have roughly the same cost, is at about dimension 312. Then, if you consider the range of cryptanalytically interesting time complexities, $2^{128}$ to $2^{256}$, you see that at the bottom of that range you really gain very little advantage quantumly, but as you get towards the top end of that range your quantum advantage begins to grow. However, if you consider constraining a different thing, such as memory, be that classical or quantumly accessible classical RAM, then the picture is a little different. In the Gidney-Ekerå metric you gain an advantage of about $2^7$, whereas in the depth-width metric, which was the more optimistic of the two active error correction metrics, you have a similar sort of advantage to what you had at the top end of the time complexities for the Gidney-Ekerå metric. And so, depending on what you're willing to constrain, you get much more or much less quantum advantage.
Now, we do stress that what we give in this work are ultimately estimates. We have made some approximations throughout, and we have appealed to some heuristics. We've also set some costs to be 0, although the costs that we set to 0 tend to be costs that would be exactly the same in the quantum and the classical versions of these algorithms, and so including them, and considering some maximum cost, would only reduce the range of dimensions for which there would be a quantum advantage. But ultimately, the quantum advantage could grow, could shrink, or could disappear altogether.
How might the quantum advantage shrink?
And here are three reasons why we think that, in a more perfect or more complete analysis, the quantum advantage would actually shrink further. So, while we're able to capture error correction in our analysis, what we don't capture are the relative costs of qRAM and RAM. qRAM is the act of making a query and receiving a superposition of registers; making a query and receiving a register is RAM. We need these queries to enact the algorithms as we've described them, and in our work we assign both of these operations unit cost. Now, neither of them really has unit cost, but all roads point towards qRAM having a far higher cost than RAM, so if this were incorporated into our analysis, we think the quantum advantage would shrink.

Similarly, I described how in the surface code cost model you have to measure a certain number of physical qubits per layer of quantum circuit depth and perform some classical computation on those measurements before re-initializing them. Well, the time that this classical computation takes sets the clock speed for your progression through your quantum circuit, and our model doesn't capture that in any way at all.

And finally, we don't apply any depth constraints to our circuits; a depth constraint says the depth of your circuit cannot exceed some bound. For example, in the NIST post-quantum standardization process, the most lenient depth constraint they suggest is $2^{96}$. Now, in the face of depth constraints, classical search can be trivially parallelized, you just split the search space, and it's known that Grover's search doesn't parallelize as well under depth constraints. So this is another thing that, were we to take it into account, we think would lessen the quantum advantage.
A final subtlety to note is that what we've tried to cost is the nearest neighbor search subroutine within lattice sieves. While I think it's always the most expensive part of a lattice sieve, it's not the full story, and it shouldn't necessarily be used as a direct stand-in for the cost of solving shortest vector type problems. For example, within a lattice sieve this nearest neighbor search subroutine is iterated many times, and so our figures could be an underestimate for the cost of SVP. There are also other subroutines, such as some parts of the locality sensitive filtering (LSF) that some nearest neighbor search subroutines use, and also the cost of sampling the vectors to begin with, that we don't account for.
But equally, there are ways in which it could be an overestimate for the cost of solving SVP. For example, there's the dimensions for free technique that Ducas introduced, which says that you can use nearest neighbor search in some dimension $d$ to help solve the shortest vector problem in a slightly larger dimension $d'$; although we note that if you understand, for a concrete instance, the relationship between $d'$ and $d$, then you can use our analysis in this case. There are also many heuristic tricks, both in implementations and in the literature, that are simply not captured by our model. And so we recommend caution when using the nearest neighbor search subroutine costs that we present in this work as a stand-in for the cost of the shortest vector problem.
So, all that remains for me to say is thank you. All of our data and software can be found at this GitHub repo, and the paper at this ePrint address. I look forward to viewing a lot of the talks from the excellent schedule that Asiacrypt has this year, and I look forward to your questions. So, thank you!