On September 12, 2020, Origin Quantum launched China's first quantum cloud computing platform to the world, an important step in moving the country's quantum computing technology from the laboratory toward the market.
This article gives a brief introduction to the handwritten pattern recognition application offered on the Origin Quantum Cloud, together with a concrete implementation of the handwritten digit recognition algorithm. The application is available at: https://qcloud.qubitonline.cn/main/chartIndex

Using the brush tool, the user writes digits into the three panels and clicks the recognition button to have each digit identified.

Handwriting recognition is nothing new on classical computers, and by the standards of modern pattern recognition it is an almost trivial task, but it is an important sign that small-scale quantum computers already have some practical value: this implementation shows that a classical computer and a quantum computer, working together, can solve certain concrete problems. As earlier articles on this public account have explained, a quantum computer has a kind of computational power different from a classical computer's, growing exponentially with the number of qubits.
See the earlier article in this series: When Quantum Computing Meets High-Performance Computing (6): Quantum Supremacy.
A quantum computer is also evaluated differently from a classical one: its key performance indicators are not just the amount of computation completed per second but, more importantly, the computational error introduced by the probabilistic nature of quantum measurement.
The algorithm described below is also available as a PDF: http://handwritten.qubitonline.cn/files/digits_number/MNIST_RECOGNITION.pdf
Handwritten Digit Recognition Algorithm
1. Introduction to Handwritten Digit Recognition
1.1 The Handwritten Digit Recognition Task
Digit recognition is a computer's ability to take input from sources such as paper documents and photographs and then interpret the digits they contain. Handwritten digit recognition is a widely studied, typical image-classification task, used in applications such as card-number and postal-code recognition, where it greatly improves processing efficiency.
Task input: a set of handwritten digit images, each a 28x28-pixel matrix.
Task output: after processing, the corresponding digit label from 0 to 9.
1.2 The MNIST Dataset
The most widely used dataset for this task is MNIST, which contains 60,000 training samples and 10,000 test samples, annotated and curated by Yann LeCun and collaborators.
Each sample consists of a 28x28-pixel image and its label. A model is trained on the training samples, and the test samples are then used to measure how well the trained model performs.
2. The Problem Solved Here
Using the trained model's weight parameters, a handwritten digit 0 or 1 is converted into a 28x28-pixel image, and the model is loaded and run to obtain a prediction.
3. Implementation Steps
First, a quantum network is trained to find the circuit parameter values that realize each target expectation value. Then a classical neuron model compresses the 28x28-pixel input image into a quantum circuit parameter: the neuron weights are trained so that applying them to an input image yields the circuit parameter for that image's class. Finally, the quantum circuit is run with that parameter to produce the prediction.
3.1 Building a Parameterized Quantum Circuit and Computing Its Expectation Value
Define the quantum circuit that computes the target expectation value. The circuit applies a Hadamard gate followed by a parameterized RY(theta) rotation to a single qubit and then measures it in the computational basis.
Running the circuit yields the basis states and their probabilities; summing the state values weighted by their probabilities gives the expectation value, the circuit's output. The code follows.
def cal_circuit(thetas):
    # Initialize a CPU-based quantum virtual machine
    pq.init(pq.QMachineType.CPU)
    qubitlist = pq.qAlloc_many(1)
    prog = pq.QProg()
    # Apply H followed by a parameterized RY rotation to each qubit
    for theta, qubit in zip(thetas, qubitlist):
        prog.insert(pq.H(qubit))
        prog.insert(pq.RY(qubit, theta))
    # Run the circuit and read back the probability of each basis state
    result = pq.prob_run_dict(prog, qubitlist)
    states = []
    probabilities = []
    for key, val in result.items():
        states.append(int(key, 2))
        probabilities.append(val)
    states = np.array(states, 'float')
    probabilities = np.array(probabilities)
    # Expectation value: basis-state values weighted by their probabilities
    expectation = np.sum(states * probabilities)
    pq.finalize()
    return np.array([expectation])
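For this one-qubit circuit the expectation value can be worked out in closed form, which makes a useful sanity check. Starting from |0>, the H gate produces (|0> + |1>)/√2, and applying RY(θ) makes the probability of measuring |1>

    P(1) = |(cos(θ/2) + sin(θ/2))/√2|² = (1 + sin θ)/2.

Since the basis states take the values 0 and 1, the expectation value is E(θ) = P(1), which lies in [0, 1]; the targets y = 0 and y = 1 are approached as θ approaches -π/2 and +π/2, respectively.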
3.2 Quantum Gradient Computation
Compute the gradient of the loss from the current parameters and the target value by shifting each circuit parameter and re-running the circuit. The code follows:
def gradient(x, y):
    # Forward pass: current expectation value of the circuit
    z = cal_circuit(x)
    input_list = np.array(x)
    # Parameter-shift rule: evaluate the circuit at theta +/- pi/2
    shift_right = input_list + np.ones(input_list.shape) * np.pi / 2
    shift_left = input_list - np.ones(input_list.shape) * np.pi / 2
    gradients = []
    for i in range(len(input_list)):
        expectation_right = cal_circuit([shift_right[i]])
        expectation_left = cal_circuit([shift_left[i]])
        gradient = expectation_right - expectation_left
        gradients.append(gradient)
    gradients = np.array(gradients)
    # Chain rule with the squared-error loss of Section 3.3
    return gradients * (z[0] - y)
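The shifts of ±π/2 implement the parameter-shift rule, under which the exact derivative of the circuit's expectation value is

    dE/dθ = [E(θ + π/2) - E(θ - π/2)] / 2.

Combined with the squared-error loss L = (z - y)² of Section 3.3, the chain rule gives dL/dθ = 2(z - y) · dE/dθ = (z - y) · [E(θ + π/2) - E(θ - π/2)]; the factors of 2 cancel, and this product is exactly what the function returns.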
3.3 Loss Computation
The loss is the squared difference between the forward prediction and the target value. The code follows.
def loss(z, y):
    # Squared-error loss between prediction z and target y
    error = z - y
    cost = error * error
    return cost
3.4 Training the Quantum Parameters
Using the input target value, train the circuit parameter until the circuit's output matches the target. The code follows.
def train(y):
    # Random initial value for the single RY rotation angle
    thetas = np.random.randn(1)
    losses = []
    for i in range(300):
        z = cal_circuit(thetas)
        cost = loss(z[0], y)
        losses.append(cost)
        # One gradient-descent step with learning rate 0.5
        grad = gradient(thetas, y).reshape(1, -1)[0]
        thetas = thetas - 0.5 * grad
        print('iter{:3d} , loss = {:.4f}'.format(i, cost))
    return thetas

# Train one circuit parameter per class label (0 and 1)
THETAS = []
for i in range(2):
    THETAS.append(train(i))
print(THETAS[0])
Training yields the parameter value of the parameterized quantum circuit for each class; in the complete listing in Section 4 these come out to roughly -1.455 and +1.455, close to the ideal values of -π/2 and +π/2 derived in Section 3.1. These parameters then serve as the target values for training the classical neurons, described next.
3.5 Data Processing
3.5.1 Reading the Dataset Files
The MNIST dataset is read and preprocessed to fit the requirements of the training model.
First, the dataset files are read, keeping only the samples whose labels match the requested digits. The code follows.
def load_mnist(dataset="training_data", digits=np.arange(2), path=".\\MNIST_data"):
    if dataset == "training_data":
        fname_image = os.path.join(path, 'train-images.idx3-ubyte')
        fname_label = os.path.join(path, 'train-labels.idx1-ubyte')
    elif dataset == "testing_data":
        fname_image = os.path.join(path, 't10k-images.idx3-ubyte')
        fname_label = os.path.join(path, 't10k-labels.idx1-ubyte')
    else:
        raise ValueError("dataset must be 'training_data' or 'testing_data'")
    # Parse the big-endian IDX label file: 8-byte header, then one byte per label
    flbl = open(fname_label, 'rb')
    magic_nr, size = struct.unpack(">II", flbl.read(8))
    lbl = pyarray("b", flbl.read())
    flbl.close()
    # Parse the IDX image file: 16-byte header, then rows*cols bytes per image
    fimg = open(fname_image, 'rb')
    magic_nr, size, rows, cols = struct.unpack(">IIII", fimg.read(16))
    img = pyarray("B", fimg.read())
    fimg.close()
    # Keep only the samples whose label is in `digits` (default: 0 and 1)
    ind = [k for k in range(size) if lbl[k] in digits]
    N = len(ind)
    images = zeros((N, rows, cols), dtype=uint8)
    labels = zeros((N, 1), dtype=int8)
    for i in range(len(ind)):
        images[i] = array(img[ind[i] * rows * cols: (ind[i] + 1) * rows * cols]).reshape((rows, cols))
        labels[i] = lbl[ind[i]]
    return images, labels
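A minimal usage sketch of the loader, assuming the four MNIST IDX files have been downloaded and unpacked into .\MNIST_data:

images, labels = load_mnist("training_data", digits=np.arange(2))
print(images.shape)        # (N, 28, 28): only the 0 and 1 samples are kept
print(labels[:5].ravel())  # the first few labels, each 0 or 1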
3.5.2 Processing the Loaded Data
Reshape the loaded data and normalize it. The code follows.
def load_samples(dataset="training_data"):
    image, label = load_mnist(dataset)
    # Flatten each 28x28 image to a (784, 1) column vector and scale to [0, 1]
    X = [np.reshape(x, (28 * 28, 1)) for x in image]
    X = [x / 255.0 for x in X]
    if dataset == "training_data":
        # Training targets are the circuit parameters for each image's class
        Y = [THETAS[y[0]] for y in label]
        pair = list(zip(X, Y))
        return pair
    elif dataset == 'testing_data':
        pair = list(zip(X, label))
        return pair
    else:
        print('Something wrong')
3.6 Defining the Network Structure
Initialize the neural network's structure and number of layers. The code follows.
def __init__(self, sizes):
    # `sizes` lists the number of neurons per layer, e.g. [784, 40, 1]
    self.sizes_ = sizes
    self.num_layers_ = len(sizes)
    # One weight matrix and one bias vector per pair of adjacent layers
    self.w_ = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]
    self.b_ = [np.random.randn(y, 1) for y in sizes[1:]]
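As a concrete illustration of the resulting parameter shapes, using the [784, 40, 1] architecture instantiated in Section 4:

net = NeuralNet([28 * 28, 40, 1])
print([w.shape for w in net.w_])  # [(40, 784), (1, 40)]
print([b.shape for b in net.b_])  # [(40, 1), (1, 1)]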
3.7 The Activation Function and Its Derivative
Define a scaled sigmoid activation function and its derivative for the nonlinear processing of intermediate results. The code follows.
def sigmoid(self, z):
    # Sigmoid scaled and shifted to the range (-1.5, 1.5)
    return 3.0 / (1.0 + np.exp(-z)) - 1.5

def sigmoid_prime(self, z):
    # Derivative of the scaled sigmoid: 3 * sigma(z) * (1 - sigma(z))
    return 3 / (1.0 + np.exp(-z)) * (1 - 1 / (1.0 + np.exp(-z)))
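The scaling matters here: with σ(z) = 1/(1 + e^(-z)), the activation 3σ(z) - 1.5 ranges over (-1.5, 1.5), so the output layer can reach the trained circuit parameters of roughly ±1.455, which the standard sigmoid's (0, 1) range could not. Differentiating gives d/dz [3σ(z) - 1.5] = 3σ(z)(1 - σ(z)), which is exactly what sigmoid_prime computes.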
3.8 Forward Computation
Compute the network's output from the input and the weights. The code follows.
def feedforward(self, x):
    # Propagate the input through each layer in turn
    for b, w in zip(self.b_, self.w_):
        x = self.sigmoid(np.dot(w, x) + b)
    return x
3.9 Backpropagation
Compute the parameter gradients from the input and the target value by backpropagation. The code follows.
def backprop(self, x, y):
    nabla_b = [np.zeros(b.shape) for b in self.b_]
    nabla_w = [np.zeros(w.shape) for w in self.w_]
    # Forward pass, storing all pre-activations (zs) and activations
    activation = x
    activations = [x]
    zs = []
    for b, w in zip(self.b_, self.w_):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = self.sigmoid(z)
        activations.append(activation)
    # Output-layer error, then propagate it backwards layer by layer
    delta = self.cost_derivative(activations[-1], y) * self.sigmoid_prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].transpose())
    for l in range(2, self.num_layers_):
        z = zs[-l]
        sp = self.sigmoid_prime(z)
        delta = np.dot(self.w_[-l + 1].transpose(), delta) * sp
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l - 1].transpose())
    return (nabla_b, nabla_w)
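This is standard backpropagation for a squared-error cost. Writing a^l for the activations and z^l for the pre-activations of layer l, the code computes

    δ^L = (a^L - y) ⊙ σ'(z^L)
    δ^l = ((w^(l+1))^T δ^(l+1)) ⊙ σ'(z^l)
    ∂C/∂b^l = δ^l,    ∂C/∂w^l = δ^l (a^(l-1))^T

where ⊙ denotes element-wise multiplication and L is the last layer.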
3.10 Updating the Weights
Update the weights from the current mini-batch and the learning rate. The code follows.
def update_mini_batch(self, mini_batch, eta):
    nabla_b = [np.zeros(b.shape) for b in self.b_]
    nabla_w = [np.zeros(w.shape) for w in self.w_]
    # Accumulate the gradient over the whole mini-batch
    for x, y in mini_batch:
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    # Gradient-descent step with the averaged gradient
    self.w_ = [w - (eta / len(mini_batch)) * nw for w, nw in zip(self.w_, nabla_w)]
    self.b_ = [b - (eta / len(mini_batch)) * nb for b, nb in zip(self.b_, nabla_b)]
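Each call therefore performs one gradient-descent step with the gradient averaged over the mini-batch of size m:

    w → w - (η/m) Σ ∂C_x/∂w,    b → b - (η/m) Σ ∂C_x/∂b

where the sums run over the samples x in the mini-batch and η is the learning rate eta.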
3.11 Evaluating the Training Results
Measure the quality of the trained model on the test set. The code follows.
def evaluate(self, test_data):
    # Predicted class = the circuit's expectation value rounded to the nearest integer
    test_results = [(np.around(cal_circuit(self.feedforward(x))), y) for (x, y) in test_data]
    return sum(int(x == y) for (x, y) in test_results)
3.12 Difference Between Output and Target
Compute the difference between the network's output and the target value. The code follows.
def cost_derivative(self, output_activations, y):
    # Derivative of the squared-error cost with respect to the output activation
    return (output_activations - y)
3.13 Training
Iteratively train the model parameters on the training data. The code follows.
def SGD(self, training_data, epochs, mini_batch_size, eta, test_data=None):
    if test_data:
        n_test = len(test_data)
    n = len(training_data)
    for j in range(epochs):
        # Shuffle, then split the training data into mini-batches
        random.shuffle(training_data)
        mini_batches = [training_data[k:k + mini_batch_size] for k in range(0, n, mini_batch_size)]
        for mini_batch in mini_batches:
            self.update_mini_batch(mini_batch, eta)
        if test_data:
            print("Epoch {0}: {1} / {2}".format(j, self.evaluate(test_data), n_test))
        else:
            print("Epoch {0} complete".format(j))
3.14 Saving the Trained Model Parameters
Save the trained weights under the given file name. The code follows.
def save(self, file_name):
    # Store the weight and bias lists in a NumPy .npz archive
    np.savez(file_name, w=self.w_, b=self.b_)
3.15 Loading the Trained Model Parameters
Load the saved weight file into memory for later prediction. The code follows.
def load(self, file_name):
    # allow_pickle=True is needed because w and b were saved as object arrays
    f = np.load(file_name, allow_pickle=True)
    w = f['w']
    b = f['b']
    return w, b
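Continuing the sketch above, a save/load round trip using the file name expected by the demo in Section 4:

net.save('0_1_w_b_5.npz')           # writes w_ and b_ into an .npz archive
w_, b_ = net.load('0_1_w_b_5.npz')  # object arrays, hence allow_pickle=True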
3.16 Image File Processing
Convert an input 28x28-pixel image file into a normalized binary grayscale image reshaped to the required input shape. The code follows.
def image_read(file_name):
    # Read as grayscale, then binarize: pixels above 127 map to 1, the rest to 0
    image = cv2.imread(file_name, 0)
    t, rst = cv2.threshold(image, 127, 1, cv2.THRESH_BINARY)
    x = rst.reshape(28 * 28, 1)
    return x
3.17 Prediction
Given an input image, load the weights, run the classical forward pass, and feed the result into the quantum circuit to predict the digit. The code follows.
def predict(x, w_, b_):
    # Display the input image
    plt.imshow(x.reshape(28, 28), cmap='gray')
    plt.show()
    # Classical forward pass compresses the image to a single circuit parameter
    for b, w in zip(b_, w_):
        x = net.sigmoid(np.dot(w, x) + b)
    # The rounded expectation value of the quantum circuit is the predicted digit
    result = np.around(cal_circuit(x))
    print('Predicted:', result)
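Putting the last two pieces together, classifying a new image could look like the following (reusing the net object from the sketches above); 'my_digit.png' is a placeholder name for any 28x28 image of a handwritten 0 or 1:

w_, b_ = net.load('0_1_w_b_5.npz')
x = image_read('my_digit.png')  # hypothetical input file
predict(x, w_, b_)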
4. Complete Code and Test Results
The complete code:
import pyqpanda.pyQPanda as pq
import numpy as np
import random
import os, struct
from array import array as pyarray
from numpy import array, int8, uint8, zeros
import matplotlib.pyplot as plt
import cv2
class NeuralNet(object):
    def __init__(self, sizes):
        self.sizes_ = sizes
        self.num_layers_ = len(sizes)
        self.w_ = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]
        self.b_ = [np.random.randn(y, 1) for y in sizes[1:]]

    def sigmoid(self, z):
        return 3.0 / (1.0 + np.exp(-z)) - 1.5

    def sigmoid_prime(self, z):
        return 3 / (1.0 + np.exp(-z)) * (1 - 1 / (1.0 + np.exp(-z)))

    def feedforward(self, x):
        for b, w in zip(self.b_, self.w_):
            x = self.sigmoid(np.dot(w, x) + b)
        return x

    def backprop(self, x, y):
        nabla_b = [np.zeros(b.shape) for b in self.b_]
        nabla_w = [np.zeros(w.shape) for w in self.w_]
        activation = x
        activations = [x]
        zs = []
        for b, w in zip(self.b_, self.w_):
            z = np.dot(w, activation) + b
            zs.append(z)
            activation = self.sigmoid(z)
            activations.append(activation)
        delta = self.cost_derivative(activations[-1], y) * self.sigmoid_prime(zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        for l in range(2, self.num_layers_):
            z = zs[-l]
            sp = self.sigmoid_prime(z)
            delta = np.dot(self.w_[-l + 1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l - 1].transpose())
        return (nabla_b, nabla_w)

    def update_mini_batch(self, mini_batch, eta):
        nabla_b = [np.zeros(b.shape) for b in self.b_]
        nabla_w = [np.zeros(w.shape) for w in self.w_]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.w_ = [w - (eta / len(mini_batch)) * nw for w, nw in zip(self.w_, nabla_w)]
        self.b_ = [b - (eta / len(mini_batch)) * nb for b, nb in zip(self.b_, nabla_b)]

    def SGD(self, training_data, epochs, mini_batch_size, eta, test_data=None):
        if test_data:
            n_test = len(test_data)
        n = len(training_data)
        for j in range(epochs):
            random.shuffle(training_data)
            mini_batches = [training_data[k:k + mini_batch_size] for k in range(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print("Epoch {0}: {1} / {2}".format(j, self.evaluate(test_data), n_test))
            else:
                print("Epoch {0} complete".format(j))

    def evaluate(self, test_data):
        test_results = [(np.around(cal_circuit(self.feedforward(x))), y) for (x, y) in test_data]
        return sum(int(x == y) for (x, y) in test_results)

    def cost_derivative(self, output_activations, y):
        return (output_activations - y)

    def save(self, file_name):
        np.savez(file_name, w=self.w_, b=self.b_)

    def load(self, file_name):
        f = np.load(file_name, allow_pickle=True)
        w = f['w']
        b = f['b']
        return w, b
def cal_circuit(thetas):
    pq.init(pq.QMachineType.CPU)
    qubitlist = pq.qAlloc_many(1)
    prog = pq.QProg()
    for theta, qubit in zip(thetas, qubitlist):
        prog.insert(pq.H(qubit))
        prog.insert(pq.RY(qubit, theta))
    result = pq.prob_run_dict(prog, qubitlist)
    states = []
    probabilities = []
    for key, val in result.items():
        states.append(int(key, 2))
        probabilities.append(val)
    states = np.array(states, 'float')
    probabilities = np.array(probabilities)
    expectation = np.sum(states * probabilities)
    pq.finalize()
    return np.array([expectation])

def gradient(x, y):
    z = cal_circuit(x)
    input_list = np.array(x)
    shift_right = input_list + np.ones(input_list.shape) * np.pi / 2
    shift_left = input_list - np.ones(input_list.shape) * np.pi / 2
    gradients = []
    for i in range(len(input_list)):
        expectation_right = cal_circuit([shift_right[i]])
        expectation_left = cal_circuit([shift_left[i]])
        gradient = expectation_right - expectation_left
        gradients.append(gradient)
    gradients = np.array(gradients)
    return gradients * (z[0] - y)

def loss(z, y):
    error = z - y
    cost = error * error
    return cost

def train(y):
    thetas = np.random.randn(1)
    losses = []
    for i in range(300):
        z = cal_circuit(thetas)
        cost = loss(z[0], y)
        losses.append(cost)
        grad = gradient(thetas, y).reshape(1, -1)[0]
        thetas = thetas - 0.5 * grad
        print('iter{:3d} , loss = {:.4f}'.format(i, cost))
    return thetas
def load_mnist(dataset="training_data", digits=np.arange(2), path=".\\MNIST_data"):
    if dataset == "training_data":
        fname_image = os.path.join(path, 'train-images.idx3-ubyte')
        fname_label = os.path.join(path, 'train-labels.idx1-ubyte')
    elif dataset == "testing_data":
        fname_image = os.path.join(path, 't10k-images.idx3-ubyte')
        fname_label = os.path.join(path, 't10k-labels.idx1-ubyte')
    else:
        raise ValueError("dataset must be 'training_data' or 'testing_data'")
    flbl = open(fname_label, 'rb')
    magic_nr, size = struct.unpack(">II", flbl.read(8))
    lbl = pyarray("b", flbl.read())
    flbl.close()
    fimg = open(fname_image, 'rb')
    magic_nr, size, rows, cols = struct.unpack(">IIII", fimg.read(16))
    img = pyarray("B", fimg.read())
    fimg.close()
    ind = [k for k in range(size) if lbl[k] in digits]
    N = len(ind)
    images = zeros((N, rows, cols), dtype=uint8)
    labels = zeros((N, 1), dtype=int8)
    for i in range(len(ind)):
        images[i] = array(img[ind[i] * rows * cols: (ind[i] + 1) * rows * cols]).reshape((rows, cols))
        labels[i] = lbl[ind[i]]
    return images, labels
THETAS = [[-1.455], [1.455]]  # circuit parameters obtained from the training in Section 3.4
def load_samples(dataset="training_data"):
    image, label = load_mnist(dataset)
    X = [np.reshape(x, (28 * 28, 1)) for x in image]
    X = [x / 255.0 for x in X]
    if dataset == "training_data":
        Y = [THETAS[y[0]] for y in label]
        pair = list(zip(X, Y))
        return pair
    elif dataset == 'testing_data':
        pair = list(zip(X, label))
        return pair
    else:
        print('Something wrong')
def image_read(file_name):
    image = cv2.imread(file_name, 0)
    t, rst = cv2.threshold(image, 127, 1, cv2.THRESH_BINARY)
    x = rst.reshape(28 * 28, 1)
    return x

def predict(x, w_, b_):
    plt.imshow(x.reshape(28, 28), cmap='gray')
    plt.show()
    for b, w in zip(b_, w_):
        x = net.sigmoid(np.dot(w, x) + b)
    result = np.around(cal_circuit(x))
    print('Predicted:', result)
if __name__ == '__main__':
    INPUT = 28 * 28
    OUTPUT = 1
    net = NeuralNet([INPUT, 40, OUTPUT])
    train_set = load_samples(dataset='training_data')
    test_set = load_samples(dataset='testing_data')
    # Load weights trained and saved beforehand (see Sections 3.13-3.14)
    w_, b_ = net.load('0_1_w_b_5.npz')
    fig, ax = plt.subplots(2, 5)
    for i in range(10):
        x = test_set[i][0]
        for b, w in zip(b_, w_):
            x = net.sigmoid(np.dot(w, x) + b)
        result = int(np.around(cal_circuit(x))[0])
        title = 'Predict: %d' % result
        ax[i // 5][i % 5].imshow(test_set[i][0].reshape(28, 28), cmap='gray')
        ax[i // 5][i % 5].set_title(title)
        ax[i // 5][i % 5].set_xticks([])
        ax[i // 5][i % 5].set_yticks([])
    plt.tight_layout()
    plt.show()
Test results:
The images show the true digits; the title above each image shows the predicted value.
5. Summary
A hybrid quantum-classical neural network implements recognition of the handwritten digits 0 and 1: the quantum network trains the circuit parameters, the classical network learns to map input data onto those circuit parameters, and the two combine to produce the overall prediction.