Keras freezing: some experience deploying a Keras classification model in C++

weixin_39830225

Tags: Keras freezing

Article link: https://blog.csdn.net/weixin_39830225/article/details/111652675

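I recently worked on a small program that involves CAPTCHA recognition. I trained on a little over ten thousand images and ended up with a model of a few megabytes, and the prediction module was written as well. After packaging with pyinstaller, however, the result turned out to be more than 400 MB (with tensorflow-cpu 2.3). A size that large made me want to optimize further, so I decided to embed the model and the prediction code in a C++ DLL. The obvious first plan was to build TensorFlow by hand, freeze the Keras model, convert it to a TensorFlow pb model, and so on. None of that is especially hard in itself, but building TensorFlow has too many pitfalls, and the resulting files would not necessarily be much smaller. After some searching on GitHub I found a library called "frugally-deep". According to its description, it can run the forward pass, that is, prediction, without TensorFlow being involved. Here is the link: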

frugally-deep (github.com)

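Because it depends on a number of other libraries, here is a tutorial on setting it up. If you follow it step by step you should not run into many problems: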

"Reading a Keras model with frugally-deep in C++ and running predictions (detailed)", from the CSDN blog of 1037号森林里一段干木头 (blog.csdn.net)

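Note, however, that this tutorial uses OpenCV. That library is also very large and has to be linked dynamically as DLLs, which clearly runs against my goal of a smaller file size, and my needs do not call for such a heavyweight library anyway. In fact, frugally-deep itself does not depend on any image library. For the image-related work I found a library called spot on GitHub.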

spot (github.com)
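
Whichever decoder is used, the network only needs the decoded pixels as floating-point values, which is why a small image library is enough. The following is a minimal standard-C++ sketch of that conversion step, a linear map from 8-bit channel values to a chosen range. It is an illustration only: bytes_to_floats is a hypothetical helper, not part of frugally-deep or spot, and the assumption that this matches what frugally-deep does internally when building a tensor from bytes is mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a frugally-deep or spot function):
// linearly maps 8-bit channel values from [0, 255] to [low, high].
std::vector<float> bytes_to_floats(const std::uint8_t* data, std::size_t count,
                                   float low, float high)
{
    assert(data != nullptr || count == 0);  // a null pointer is only valid for an empty buffer
    std::vector<float> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        const float unit = static_cast<float>(data[i]) / 255.0f;  // 0.0 .. 1.0
        out.push_back(low + (high - low) * unit);
    }
    return out;
}
```

With low = 0 and high = 1 the values are normalized to the unit interval; with low = 0 and high = 255 they keep their original scale. Which range is correct depends on how the model was trained.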

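My DLL embeds two models as resource files and calls the LoadModelResourceToMemory function from DllMain to load them into memory. The predict function does the prediction. The sample code is as follows: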

#include "resource.h"
#include <fdeep/fdeep.hpp>
#include <cstdlib>
#include "spot.hpp"
#include <Windows.h>
#include <tlhelp32.h>
#include <tchar.h>
#include <ShellAPI.h>
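HINSTANCE hd;  // module handle of this DLL, set in DllMain
// Character set of the first model (30 classes; '0', 'g', 'i', 'l', 'o' and 'q' are not used).
const char a[30] = { '1','2','3','4','5','6','7','8','9',
       'a','b','c','d','e','f','h','j','k','m','n','p','r','s','t','u','v','w','x','y','z' };
// Character set of the second model (digits only).
const char b[10] = { '0','1','2','3','4','5','6','7','8','9' };
// Returns the index of the largest value in the tensor (argmax).
int assist_func(const fdeep::tensor& t)
{
    const auto& xs = *t.as_vector();  // reference instead of copying the whole vector
    std::size_t maxcount = 0;
    for (std::size_t i = 1; i < xs.size(); i++)
    {
        if (xs[maxcount] < xs[i])
            maxcount = i;
    }
    return static_cast<int>(maxcount);
}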
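// Decodes an encoded image held in memory and converts it to an RGB tensor.
fdeep::tensor gettensor(const void* img, int len)
{
    spot::image simg(img, len);
    std::vector<unsigned char> pixels = simg.rgb();
    // low = 0, high = 255: the tensor keeps the original 0..255 value range.
    return fdeep::tensor_from_bytes(pixels.data(), simg.h, simg.w, 3,
        0.0f, 255.0f);
}
// Same as above, but loads the image from a file path.
fdeep::tensor gettensor(const std::string& path)
{
    spot::image img(path);
    std::vector<unsigned char> pixels = img.rgb();
    return fdeep::tensor_from_bytes(pixels.data(), img.h, img.w, 3,
        0.0f, 255.0f);
}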
// Raw model data for both networks, copied out of the DLL's resources.
struct modeldatapointer
{
    std::string model_str1;
    std::string model_str2;
    bool loadsucc = true;
};
modeldatapointer md;
// Copies the two "MODEL" resources embedded in this DLL into md.
// On any failure md.loadsucc is set to false.
void LoadModelResourceToMemory()
{
    auto hres1 = FindResourceA(hd, MAKEINTRESOURCEA(IDR_MODEL1), "MODEL");
    auto hres2 = FindResourceA(hd, MAKEINTRESOURCEA(IDR_MODEL2), "MODEL");
    if (!hres1 || !hres2) { md.loadsucc = false; return; }
    HGLOBAL hresdata1 = LoadResource(hd, hres1);
    HGLOBAL hresdata2 = LoadResource(hd, hres2);
    if (!hresdata1 || !hresdata2) { md.loadsucc = false; return; }
    const char* pres1 = static_cast<const char*>(LockResource(hresdata1));
    const char* pres2 = static_cast<const char*>(LockResource(hresdata2));
    if (!pres1 || !pres2) { md.loadsucc = false; return; }
    // FreeResource is obsolete on 32/64-bit Windows; the resource data stays
    // valid until the module is unloaded, so nothing has to be freed here.
    md.model_str1.assign(pres1, SizeofResource(hd, hres1));
    md.model_str2.assign(pres2, SizeofResource(hd, hres2));
}
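fdeep::model* p_mod1 = nullptr;
fdeep::model* p_mod2 = nullptr;
// Returns a NUL-terminated string allocated with new[], or nullptr on bad input.
// The caller must hand the pointer back to release().
extern "C" __declspec(dllexport) char* __stdcall predict(const void* img, int len)
{
    if (!img || len <= 0 || !p_mod1 || !p_mod2)
        return nullptr;
    if (((const char*)(img))[0]!='G')
    {
        // First model: 5 characters taken from table a.
        auto input = gettensor(img,len);
        const auto pre_class = p_mod1->predict({ input });
        char* temp1 = new char[6];
        for (int i = 0; i < 5; i++)
            temp1[i] = a[assist_func(pre_class[i])];
        temp1[5] = '\0';  // string terminator, not the character '0'
        return temp1;
    }
    else
    {
        // Data whose first byte is 'G' goes to the second model: 4 digits from table b.
        auto input = gettensor(img,len);
        const auto pre_class = p_mod2->predict({ input });
        char* temp1 = new char[5];
        for (int i = 0; i < 4; i++)
            temp1[i] = b[assist_func(pre_class[i])];
        temp1[4] = '\0';
        return temp1;
    }
}
extern "C" __declspec(dllexport) void __stdcall release(void* p)
{
    // The buffer was allocated with new char[], so it must be freed with delete[] on a char*.
    delete[] static_cast<char*>(p);
}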
BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        hd = hModule;
        if (!p_mod1)
        {
            LoadModelResourceToMemory();
            if (!md.loadsucc)
                return FALSE;  // resources could not be read: fail the DLL load
            // Parse both models once; the statics live until the DLL is unloaded.
            static auto model = fdeep::load_model_string(md.model_str1);
            static auto model1 = fdeep::load_model_string(md.model_str2);
            p_mod1 = &model;
            p_mod2 = &model1;
        }
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}
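
Three details in the listing above are easy to get wrong: dispatching on a single first byte, terminating the returned string, and freeing memory across the DLL boundary. The following is a minimal standard-C++ sketch of one way to make each of them more defensive. It is not the author's code: looks_like_gif, argmax, decode and copy_to_buffer are hypothetical names, plain std::vector<float> stands in for the model outputs, and the idea that the 'G' check is meant to detect GIF data is my assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// True if the buffer starts with a GIF signature ("GIF87a" or "GIF89a").
bool looks_like_gif(const void* data, std::size_t len)
{
    if (data == nullptr || len < 6) return false;
    return std::memcmp(data, "GIF87a", 6) == 0 || std::memcmp(data, "GIF89a", 6) == 0;
}

// Index of the largest score; returns 0 for an empty vector.
std::size_t argmax(const std::vector<float>& scores)
{
    std::size_t best = 0;
    for (std::size_t i = 1; i < scores.size(); ++i)
        if (scores[i] > scores[best]) best = i;
    return best;
}

// One score vector per character position -> decoded text.
// Returns an empty string if a score vector is empty or does not match the alphabet size.
std::string decode(const std::vector<std::vector<float>>& outputs,
                   const std::string& alphabet)
{
    std::string text;
    for (const auto& scores : outputs)
    {
        if (scores.empty() || scores.size() != alphabet.size()) return std::string();
        const std::size_t idx = argmax(scores);
        assert(idx < alphabet.size());
        text.push_back(alphabet[idx]);
    }
    return text;
}

// Caller-provided buffer: copies text plus the terminating '\0' into buf.
// Returns the number of characters written, or -1 if buf is null or too small.
int copy_to_buffer(const std::string& text, char* buf, std::size_t buf_len)
{
    if (buf == nullptr || buf_len < text.size() + 1) return -1;
    text.copy(buf, text.size());
    buf[text.size()] = '\0';
    return static_cast<int>(text.size());
}
```

With a caller-provided buffer the DLL never hands out memory it allocated itself, so no separate release function is needed. The trade-off is that the caller has to know the maximum result length in advance, which is easy here because the two models produce 5 and 4 characters.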

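Not counting the models, a DLL that simply combines the frugally-deep and spot libraries is only about 1 MB, which is pleasantly compact. Interacting with this DLL from Python is also simple. My sample Python code is as follows: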

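import ctypes
import queue
import threading
from ctypes import c_int, c_void_p, cast, string_at


class Identify(threading.Thread):

    def __init__(self):
        super().__init__()
        self.queue = queue.Queue()    # input: responses carrying the image bytes
        self.queue2 = queue.Queue()   # output: recognized text as bytes
        self.dll = ctypes.windll.LoadLibrary(r'.\predict.dll')
        # predict returns a char* that must be handed back to release,
        # so keep it as a raw pointer instead of letting ctypes convert it.
        self.dll.predict.restype = c_void_p
        self.dll.predict.argtypes = [c_void_p, c_int]
        self.dll.release.restype = None
        self.dll.release.argtypes = [c_void_p]
        self.mod = 0

    def run(self):
        while True:
            r = self.queue.get()   # r is the response from requests.get(); r.content is the image byte stream
            if r is None:
                print('S2:-----------------------sub thread exit')
                return
            p = self.dll.predict(cast(r.content, c_void_p), len(r.content))
            text = string_at(p)    # copy the NUL-terminated result into a bytes object
            self.dll.release(p)    # then let the DLL free its own buffer
            self.queue2.put(text)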

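After repackaging, my program is only 60 MB. Compared with the original 400 MB, the space savings are significant. However, what I have described here is only a simple classification model. For generative or detection models, my guess is that this approach will not work. The frugally-deep readme.md lists the layer types it does not support, which you can consult.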
