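My child was born recently and we hired a nanny. She buys groceries every day and afterwards sends me screenshots of her WeChat Pay records so that I can reconcile the expenses; a sample screenshot is shown below. Because she shops at a wet market, each day produces many separate payments, and adding them up by hand is tedious. As a programmer, I naturally wanted to solve this with code.
Looking at a payment screenshot, the phone's status bar and the "服务通知" (service notifications) header are fixed and we do not need them. We do not need the two red boxes either, and their width is fixed. Open the screenshot in the Paint program that ships with Windows and move the mouse to the position marked by the arrow: the x coordinate there is roughly 340 pixels. The same method gives a width of 400 pixels for the orange box and a height of 260 pixels for the blue box. The JDK's BufferedImage class can crop the image. Since the payment amounts are rendered in black, we also binarize the image while cropping, which strips out more unwanted details such as the timestamp "下午1:40" (1:40 PM). The code is as follows: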
private static void cropAndBinarization() throws IOException
{
    for (File imageFile : Configuration.BILL_SRC.listFiles()) {
        BufferedImage src = ImageIO.read(imageFile);
        // Crop: skip 340 px on the left and the top 260 px, keep a 400 px wide strip
        int cropWidth = 400;
        int cropHeight = src.getHeight() - 260;
        int startX = 340;
        int startY = 260;
        BufferedImage cropImage = new BufferedImage(cropWidth, cropHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D tg = cropImage.createGraphics();
        tg.drawImage(src,
                0, 0, cropWidth, cropHeight,
                startX, startY, startX + cropWidth, startY + cropHeight,
                null);
        tg.dispose();
        // Binarize: a pixel counts as black only when R, G and B are all <= 100.
        // BLACK and WHITE are constants defined elsewhere in the class (not shown here).
        for (int y = 0; y < cropHeight; y++) {
            for (int x = 0; x < cropWidth; x++) {
                int rgb = cropImage.getRGB(x, y);
                int value = 0xff000000 | rgb;
                int r = (value >> 16) & 0xFF;
                int g = (value >> 8) & 0xFF;
                int b = (value >> 0) & 0xFF;
                boolean isBlack = r <= 100 && g <= 100 && b <= 100;
                cropImage.setRGB(x, y, isBlack ? BLACK : WHITE);
            }
        }
        // Write the result to the pre-processed folder
        File cropFile = new File(Configuration.CROP_FOLDER, imageFile.getName());
        ImageIO.write(cropImage, "png", cropFile);
    }
}
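The threshold rule can be tried on its own with a tiny synthetic image. This is only a sketch: the listing above uses BLACK and WHITE without showing their definitions, so the values below (opaque black and opaque white in packed ARGB form) are an assumption, and the class name BinarizeDemo and the helper binarize are made up for illustration.

```java
import java.awt.image.BufferedImage;

final class BinarizeDemo {
    // Assumed values; the original listing does not show how BLACK and WHITE are defined.
    static final int BLACK = 0xFF000000;
    static final int WHITE = 0xFFFFFFFF;

    /** Returns BLACK when all three colour channels are at or below the threshold, else WHITE. */
    static int binarize(int rgb, int threshold) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (r <= threshold && g <= threshold && b <= threshold) ? BLACK : WHITE;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(3, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0x1A1A1A); // dark pixel, like the amount text
        img.setRGB(1, 0, 0xB2B2B2); // lighter grey pixel
        img.setRGB(2, 0, 0xFFFFFF); // white background
        for (int x = 0; x < img.getWidth(); x++) {
            img.setRGB(x, 0, binarize(img.getRGB(x, 0), 100));
        }
        for (int x = 0; x < img.getWidth(); x++) {
            System.out.println(x + ": " + (img.getRGB(x, 0) == BLACK ? "BLACK" : "WHITE"));
        }
    }
}
```

Only the first pixel survives as black; anything lighter than the threshold is flattened to white, which is how grey text disappears from the cropped image.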
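The cropped and binarized image is shown below.
The next step is extracting the amounts. First, slice the image horizontally into individual lines of text. How do we find the cut points? With the projection method; see the article "【OpenCV】利用投影法进行字符分割" (character segmentation by projection with OpenCV) for an introduction. If that article has gone offline or you would rather not read it, the gist is: count the black pixels in each row and store the counts in an array, for example [0,0,0,0,1,2,3,4,4,3,2,0,0,0,0,1,2,2,2,0,0,0]. An index where the count goes from 0 to non-zero starts a segment, and the last non-zero index before the count drops back to 0 ends it, so for this array the segments are (4,10) and (15,18). There is also a shortcut: the payment amount is set in a larger font than everything else, about 94 pixels tall, so keeping only segments taller than 90 pixels isolates the amounts. Cropping each of them with the same technique used in pre-processing yields every amount as its own image. The code is as follows: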
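private static class Partition {
    int s;      // first index of the run (inclusive)
    int e;      // last index of the run (inclusive)
    int length; // e - s + 1
}
private static List<Partition> partition(int[] blackCounts)
{
    List<Partition> result = new ArrayList<>();
    int len = blackCounts.length;
    int i = 0;
    while (i < len) {
        if (blackCounts[i] == 0) {
            i++;
            continue;
        }
        // i starts a run of non-zero counts; scan to the end of the run.
        // Stopping at len as well keeps a run that reaches the last index.
        int j = i + 1;
        while (j < len && blackCounts[j] != 0) {
            j++;
        }
        Partition p = new Partition();
        p.s = i;
        p.e = j - 1;
        p.length = p.e - p.s + 1;
        result.add(p);
        i = j;
    }
    return result;
}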
private static void cropMoneyValue() throws IOException
{
    int counter = 0;
    for (File imageFile : Configuration.CROP_FOLDER.listFiles()) {
        BufferedImage image = ImageIO.read(imageFile);
        int h = image.getHeight();
        int w = image.getWidth();
        // Horizontal projection: number of black pixels in each row
        int[] blackCounts = new int[h];
        for (int y = 0; y < h; y++) {
            int blackCount = 0;
            for (int x = 0; x < w; x++) {
                int rgb = image.getRGB(x, y);
                if (rgb == BLACK) {
                    ++blackCount;
                }
            }
            blackCounts[y] = blackCount;
        }
        List<Partition> ps = partition(blackCounts);
        for (Partition p : ps) {
            // Only the payment amount (about 94 px tall) is taller than 90 px
            if (p.length > 90) {
                BufferedImage itemImage = new BufferedImage(w, p.length, BufferedImage.TYPE_INT_RGB);
                Graphics2D g = itemImage.createGraphics();
                // The second source corner is exclusive, so use p.e + 1 to copy rows p.s..p.e
                g.drawImage(image,
                        0, 0, w, p.length,
                        0, p.s, w, p.e + 1,
                        null);
                g.dispose();
                File file = new File(Configuration.MONEY_VALUE_FOLDER, counter++ + ".png");
                ImageIO.write(itemImage, "png", file);
            }
        }
    }
}
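To check the projection idea against the example array from the text, here is a small standalone sketch. The class name ProjectionDemo and the method runs are made up for illustration; the method returns plain {start, end} pairs instead of Partition objects, and it also keeps a run that reaches the last index of the array.

```java
import java.util.ArrayList;
import java.util.List;

final class ProjectionDemo {
    /** Returns inclusive {start, end} index pairs, one per run of non-zero counts. */
    static List<int[]> runs(int[] counts) {
        List<int[]> result = new ArrayList<>();
        int i = 0;
        while (i < counts.length) {
            if (counts[i] == 0) {
                i++;
                continue;
            }
            int j = i;
            // Extend the run while the next entry is still non-zero
            while (j + 1 < counts.length && counts[j + 1] != 0) {
                j++;
            }
            result.add(new int[] {i, j});
            i = j + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] counts = {0, 0, 0, 0, 1, 2, 3, 4, 4, 3, 2, 0, 0, 0, 0, 1, 2, 2, 2, 0, 0, 0};
        for (int[] r : runs(counts)) {
            System.out.println("(" + r[0] + "," + r[1] + ") height=" + (r[1] - r[0] + 1));
        }
        // prints "(4,10) height=7" and "(15,18) height=4"
    }
}
```

With real screenshots the height of each run is what matters: only runs taller than 90 rows are kept as payment amounts.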
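The cropped images look like this:
The last step is working out the amount shown in each small image. We split each payment amount vertically into individual characters and write them to a folder. Because the first symbol is always ¥, output starts from the second segment. The code is as follows: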
public static List<BufferedImage> toNumbers(File imageFile) throws IOException
{
    List<BufferedImage> result = new ArrayList<>();
    BufferedImage image = ImageIO.read(imageFile);
    int h = image.getHeight();
    int w = image.getWidth();
    // Vertical projection: number of black pixels in each column
    int[] blackCounts = new int[w];
    for (int x = 0; x < w; x++) {
        int blackCount = 0;
        for (int y = 0; y < h; y++) {
            int rgb = image.getRGB(x, y);
            if (rgb == BLACK) {
                ++blackCount;
            }
        }
        blackCounts[x] = blackCount;
    }
    List<Partition> ps = partition(blackCounts);
    // Start at 1 to skip the leading ¥ symbol
    for (int i = 1; i < ps.size(); i++) {
        Partition p = ps.get(i);
        BufferedImage itemImage = new BufferedImage(p.length, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = itemImage.createGraphics();
        // The second source corner is exclusive, so use p.e + 1 to copy columns p.s..p.e
        g.drawImage(image,
                0, 0, p.length, h,
                p.s, 0, p.e + 1, h,
                null);
        g.dispose();
        result.add(itemImage);
    }
    return result;
}
private static void printNumbers() throws IOException
{
    int counter = 0;
    for (File imageFile : Configuration.MONEY_VALUE_FOLDER.listFiles()) {
        List<BufferedImage> images = toNumbers(imageFile);
        for (BufferedImage image : images) {
            File file = new File(Configuration.NUMBERS_FOLDER, counter++ + ".png");
            ImageIO.write(image, "png", file);
        }
    }
}
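The vertical split is the same projection applied to columns instead of rows. The sketch below builds a tiny synthetic image with two black bars standing in for glyphs and prints the per-column counts. The names ColumnProjectionDemo, columnCounts, sample and paint are hypothetical, and the BLACK and WHITE values are assumed (opaque packed ARGB), since the original listing does not show them.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;

final class ColumnProjectionDemo {
    // Assumed values for the constants the article's code relies on.
    static final int BLACK = 0xFF000000;
    static final int WHITE = 0xFFFFFFFF;

    /** Counts the black pixels in every column of the image. */
    static int[] columnCounts(BufferedImage image) {
        int[] counts = new int[image.getWidth()];
        for (int x = 0; x < image.getWidth(); x++) {
            for (int y = 0; y < image.getHeight(); y++) {
                if (image.getRGB(x, y) == BLACK) {
                    counts[x]++;
                }
            }
        }
        return counts;
    }

    /** Paints the inclusive rectangle x1..x2, y1..y2 black. */
    static void paint(BufferedImage img, int x1, int x2, int y1, int y2) {
        for (int y = y1; y <= y2; y++) {
            for (int x = x1; x <= x2; x++) {
                img.setRGB(x, y, BLACK);
            }
        }
    }

    /** Builds a white 12x5 image with two black bars standing in for two glyphs. */
    static BufferedImage sample() {
        BufferedImage img = new BufferedImage(12, 5, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 5; y++) {
            for (int x = 0; x < 12; x++) {
                img.setRGB(x, y, WHITE);
            }
        }
        paint(img, 2, 4, 1, 4); // columns 2..4, rows 1..4
        paint(img, 8, 9, 0, 4); // columns 8..9, rows 0..4
        return img;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(columnCounts(sample())));
        // prints [0, 0, 4, 4, 4, 0, 0, 0, 5, 5, 0, 0]
    }
}
```

The two non-zero runs, columns 2 to 4 and columns 8 to 9, are the character boundaries. In a real amount image the first run is the ¥ symbol, which is why the loop over segments starts at index 1.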
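The segmented digits look like this:
Next, pick 11 images from this folder, one for each digit 0 to 9 plus one for the decimal point, rename each after the character it shows (for example, the image of 0 becomes 0.png), and put them in a new folder. The decimal point image is named dot.png, which the code below maps to ".".
Finally, we recognize the digits by computing the cosine similarity between image vectors.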
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.imageio.ImageIO;
public class Calculator {
    private static BufferedImage resizeImage(BufferedImage originalImage, int targetWidth, int targetHeight)
    {
        Image resultingImage = originalImage.getScaledInstance(targetWidth, targetHeight, Image.SCALE_SMOOTH);
        BufferedImage outputImage = new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2d = outputImage.createGraphics();
        g2d.drawImage(resultingImage, 0, 0, null);
        g2d.dispose();
        return outputImage;
    }
    public static double[] imageToVector(BufferedImage image)
    {
        // Scale every glyph to 100x100 so that all vectors have the same length
        image = resizeImage(image, 100, 100);
        int width = image.getWidth();
        int height = image.getHeight();
        double[] vector = new double[width * height];
        int index = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // The packed ARGB int is used directly as the feature value
                int rgb = image.getRGB(x, y);
                vector[index++] = rgb;
            }
        }
        return vector;
    }
    public static double[] imageToVector(File imageFile) throws IOException
    {
        BufferedImage image = ImageIO.read(imageFile);
        return imageToVector(image);
    }
    // Compute the cosine similarity of two vectors of equal length
    public static double cosineSimilarity(double[] vecA, double[] vecB)
    {
        double dotProduct = 0.0;
        for (int i = 0; i < vecA.length; i++) {
            dotProduct += vecA[i] * vecB[i];
        }
        double normA = 0.0;
        double normB = 0.0;
        for (double v : vecA) {
            normA += v * v;
        }
        for (double v : vecB) {
            normB += v * v;
        }
        normA = Math.sqrt(normA);
        normB = Math.sqrt(normB);
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dotProduct / (normA * normB);
    }
    public static double recognitionNumber(Map<String, double[]> vecMap, File imageFile) throws IOException
    {
        List<BufferedImage> numbers = PreProcessor.toNumbers(imageFile);
        StringBuilder result = new StringBuilder();
        for (BufferedImage number : numbers) {
            double[] vec = imageToVector(number);
            // Pick the sample whose vector is most similar to this glyph
            double maxS = -1;
            String value = null;
            for (Map.Entry<String, double[]> nameVec : vecMap.entrySet()) {
                double[] v = nameVec.getValue();
                double s = cosineSimilarity(v, vec);
                if (s > maxS) {
                    maxS = s;
                    value = nameVec.getKey();
                }
            }
            result.append(value);
        }
        return Double.parseDouble(result.toString());
    }
    public static void main(String[] args) throws IOException
    {
        // Load the sample images; the file name (without extension) is the character
        Map<String, double[]> vec = new HashMap<>();
        for (File file : Configuration.SAMPLES_FOLDER.listFiles()) {
            double[] vector = imageToVector(file);
            String name = file.getName().substring(0, file.getName().lastIndexOf("."));
            if (name.equals("dot")) {
                name = ".";
            }
            vec.put(name, vector);
        }
        double all = 0d;
        for (File imageFile : Configuration.MONEY_VALUE_FOLDER.listFiles()) {
            double value = recognitionNumber(vec, imageFile);
            all += value;
            System.out.println(value);
        }
        System.out.println("all = " + all);
    }
}
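The matching step can be seen in isolation with toy vectors. The sketch below is not the project's code: the names CosineDemo, cosine and nearest are made up, and the 3x3 "glyphs" use 1 for ink and 0 for background, which is an assumption made for readability (the listing above feeds the packed RGB value of each pixel into the vector instead).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CosineDemo {
    /** Cosine similarity of two equal-length vectors; 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the label of the template most similar to the query vector. */
    static String nearest(Map<String, double[]> templates, double[] query) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : templates.entrySet()) {
            double s = cosine(e.getValue(), query);
            if (s > bestScore) {
                bestScore = s;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, double[]> templates = new LinkedHashMap<>();
        templates.put("1", new double[] {0, 1, 0,  0, 1, 0,  0, 1, 0});
        templates.put("7", new double[] {1, 1, 1,  0, 0, 1,  0, 0, 1});
        // A "1" with one extra ink pixel in the bottom-right corner
        double[] noisy = {0, 1, 0,  0, 1, 0,  0, 1, 1};
        System.out.println(nearest(templates, noisy)); // prints 1
    }
}
```

The noisy glyph scores about 0.87 against the "1" template and about 0.45 against the "7" template, so a single stray pixel does not change the result.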
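As you can see, every payment amount and the total are recognized. In the end this proved that the nanny had not over-claimed. Could it be that she actually under-counted?
All the code has been uploaded to gitee; the project address is: https://gitee.com/doraemon_unexpected/wechat-bill-recognition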
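One caveat when the total is compared against a hand-checked figure: summing amounts as double can introduce small binary rounding artifacts. A possible alternative, sketched here with made-up amounts and a hypothetical class TotalDemo, is to keep each recognized string and add the values as BigDecimal.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

final class TotalDemo {
    /** Adds decimal strings exactly, without binary floating-point rounding. */
    static BigDecimal total(List<String> amounts) {
        BigDecimal sum = BigDecimal.ZERO;
        for (String amount : amounts) {
            sum = sum.add(new BigDecimal(amount));
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up amounts, standing in for the recognized strings
        List<String> recognized = Arrays.asList("0.10", "0.20", "12.50");
        System.out.println(total(recognized)); // prints 12.80
        System.out.println(0.1 + 0.2);         // prints 0.30000000000000004
    }
}
```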