Training ResNet50 with Caffe to Classify Pet Dogs

Training environment

  • Hardware
    • GPU: RTX 3090
    • RAM: 32 GB
  • Software
    • Driver: 460.56
    • CUDA: V11.1.105
    • cuDNN: 8.0.5
    • OpenCV: 4.5.2-pre (building Caffe with OpenCV support is optional for training)
    • OS: Manjaro

Caffe build configuration (Makefile.config):


## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
USE_OPENCV := 1
USE_LEVELDB := 1
USE_LMDB := 1

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#       You should not set this flag if you will be reading LMDBs with any
#       possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3 or 4
OPENCV_VERSION := 4
USE_PKG_CONFIG := 1

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the lines after *_35 for compatibility.
CUDA_ARCH := -gencode arch=compute_80,code=sm_80

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
# BLAS := atlas
BLAS := open
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
BLAS_INCLUDE := /opt/OpenBLAS/include
BLAS_LIB := /opt/OpenBLAS/lib

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
# PYTHON_INCLUDE := /usr/include/python2.7 \
#               /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME)/miniconda3
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
                $(ANACONDA_HOME)/include/python3.8 \
                $(ANACONDA_HOME)/lib/python3.8/site-packages/numpy/core/include \

# Uncomment to use Python 3 (default is Python 2)
PYTHON_LIBRARIES := boost_python38 python3.8
# PYTHON_INCLUDE := /usr/include/python3.8 \
#                 /usr/lib/python3/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
# PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/local/include/opencv4 /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @

One thing to note here: if you build with OpenCV support, you must set OPENCV_VERSION to 4 as above; this value is not arbitrary. A source build of OpenCV usually does not generate a pkg-config file named opencv.pc, so you have to create one by hand; otherwise the OpenCV shared libraries cannot be located and the Caffe build fails with undefined references to libopencv_xxx. If you enabled OpenCV and, like me, use the latest OpenCV 4, create /usr/local/lib/pkgconfig/opencv4.pc manually.

The contents of my opencv4.pc:

prefix=/usr/local
exec_prefix=${prefix}
includedir=${prefix}/include
libdir=${exec_prefix}/lib

Name: opencv
Description: The opencv library
Version: 4.5.2-pre
Cflags: -I${includedir}/opencv4 -I${includedir}/opencv4/opencv2
Libs: -L${libdir} -lopencv_bgsegm -lopencv_bioinspired -lopencv_calib3d -lopencv_core -lopencv_cudaarithm -lopencv_cudabgsegm -lopencv_cudacodec -lopencv_cudafeatures2d -lopencv_cudafilters -lopencv_cudaimgproc -lopencv_cudalegacy -lopencv_cudaobjdetect -lopencv_cudaoptflow -lopencv_cudastereo -lopencv_cudawarping -lopencv_cudev -lopencv_datasets -lopencv_dnn_objdetect -lopencv_dnn -lopencv_dnn_superres -lopencv_dpm -lopencv_face -lopencv_features2d -lopencv_flann -lopencv_freetype -lopencv_gapi -lopencv_hdf -lopencv_highgui -lopencv_imgcodecs -lopencv_imgproc -lopencv_intensity_transform -lopencv_mcc -lopencv_ml -lopencv_objdetect -lopencv_optflow -lopencv_photo -lopencv_plot -lopencv_quality -lopencv_rapid -lopencv_reg -lopencv_rgbd -lopencv_saliency -lopencv_sfm -lopencv_shape -lopencv_stereo -lopencv_stitching -lopencv_superres -lopencv_text -lopencv_tracking -lopencv_videoio -lopencv_video -lopencv_videostab -lopencv_world -lopencv_xfeatures2d -lopencv_ximgproc
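
With the file in place, a quick sanity check that pkg-config can resolve it (not part of the build itself):

pkg-config --modversion opencv4
pkg-config --cflags --libs opencv4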

Note likewise: I built OpenCV with support for nearly everything, so my list of shared libraries is long. If you did not build every module, fill in only the libraries you actually have (an ls /usr/local/lib/libopencv* or similar will show them); otherwise you will again hit undefined references.
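
A minimal sketch for generating the Libs line from whatever is actually installed, assuming the /usr/local prefix used above:

# turn every installed libopencv_*.so into a -l flag for the .pc file
ls /usr/local/lib/libopencv_*.so | sed -E 's|.*/lib(opencv_[A-Za-z0-9_]+)\.so|-l\1|' | tr '\n' ' '; echo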

If, like me, you run the excellent Manjaro, there is one more thing to watch. The configuration above gets Caffe to compile, but it does not guarantee Caffe can load its libraries at runtime; running caffe directly will typically fail with "libopencv_xxx not found", because your OpenCV libraries are not in the standard library path /usr/lib. You can add the path temporarily with export LD_LIBRARY_PATH='/usr/local/lib':${LD_LIBRARY_PATH}, or persistently via a caffe.conf (create it yourself) under /etc/ld.so.conf.d. Pay close attention to ordering here: the search path with the newest libraries must come first. If you need libpython3.8 and you put Anaconda's lib directory first, the system will default to Anaconda's older libraries; on Manjaro this shows up as sddm failing to start, leaving you unable to reach the desktop after a reboot. So keep the Anaconda library path on the last line.
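
A minimal sketch of the persistent route (the path assumes the default /usr/local install prefix):

# register /usr/local/lib with the dynamic linker, then rebuild its cache
echo '/usr/local/lib' | sudo tee /etc/ld.so.conf.d/caffe.conf
sudo ldconfig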

The dataset is the pet dataset. It can be used for segmentation, detection, and classification, is reasonably large, and will serve as my benchmark going forward. It is fairly big, so I won't provide a download link; uploading it to Baidu Netdisk would be slower than downloading it from abroad yourself. Note that a few images in the pet dataset cannot be read, probably due to an encoding issue (they appear to be PNG data behind a .jpg suffix, so decoding yields an empty image). So once you have the data, I recommend reading every image and checking for empty ones first. I did not investigate further and simply deleted them; otherwise darknet or mmsegmentation will throw errors. I debugged this for ages, convinced my mmsegmentation install was broken.
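
Below is a minimal sketch of such a check; cv2 and the dataset path are assumptions on my part, adjust them to your setup:

# check_images.py -- delete images that cannot be decoded
import glob
import os
import cv2

dataset_path = os.path.expanduser('~/Datasets/pets/images')  # assumed location
for image_path in glob.glob(os.path.join(dataset_path, '*.jpg')):
    # cv2.imread returns None when decoding fails (e.g. PNG data behind
    # a .jpg suffix, as happens with a few files in this dataset)
    if cv2.imread(image_path) is None:
        print('unreadable, removing:', image_path)
        os.remove(image_path)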

Preprocessing script

import glob
import shutil
from os.path import join, basename, exists
import os
import json
import shlex
import subprocess
import argparse
import caffe_pb2  # generated from caffe.proto via protoc --python_out
from google.protobuf import text_format


def datapreprocess(dataset_path, output_path, rate=0.8):
    images = glob.glob("{}/*.jpg".format(dataset_path))
    class_names = set()
    dataset_dict = {}
    # Image names look like "<class_name>_<index>.jpg"; everything before
    # the trailing index is the class name.
    def image_to_name(x): return "_".join(basename(x).split('_')[:-1])
    for image in images:
        class_name = image_to_name(image)
        class_names.add(class_name)
        if class_name not in dataset_dict:
            dataset_dict[class_name] = [image]
        else:
            dataset_dict[class_name].append(image)
    if not exists(output_path):
        os.makedirs(output_path)
    data = sorted(list(class_names))
    class_to_num = {class_name: num for num, class_name in enumerate(data)}
    with open(join(output_path, 'label.json'), 'w') as f:
        json.dump(fp=f, obj=class_to_num)

    label_handler = {phase: open(
        join(output_path, '{}.txt'.format(phase)), 'w') for phase in ['train', 'val']}
    for key in dataset_dict:
        images = dataset_dict[key]
        num_train = int(len(images)*rate)
        split_image = {'train': images[:num_train], 'val': images[num_train:]}
        for phase in ['train', 'val']:
            output_path_tmp = join(output_path, phase)
            if not exists(output_path_tmp):
                os.makedirs(output_path_tmp)
            for image in split_image[phase]:
                shutil.copy(image, output_path_tmp)
                # convert_imageset expects "<file name> <label>" per line
                label_handler[phase].write("{} {}\n".format(
                    basename(image), class_to_num[image_to_name(image)]))
    for handler in label_handler.values():
        handler.close()
    result = {'train_image_label': join(output_path, 'train.txt'),
              'val_image_label': join(output_path, 'val.txt'),
              'label_to_num': join(output_path, 'label.json'),
              'train_dataset': join(output_path, 'train'),
              'val_dataset': join(output_path, 'val')}
    return result


def findfile(start, name):
    # os.walk already yields full directory paths, so join the dirpath
    # directly with the file name; keep the last match found.
    res = None
    for dirpath, dirs, files in os.walk(start):
        if name in files:
            res = os.path.normpath(os.path.abspath(os.path.join(dirpath, name)))
    return res


def convert_dataset(dataset_path, dataset_store, shape=(227, 227)):
    # 227x227 matches the CaffeNet reference model; ResNet-50 usually takes 224x224.
    output_res = datapreprocess(dataset_path, dataset_store)
    caffe_home = os.path.expanduser('~/caffe')
    if not os.path.exists(caffe_home):
        caffe_home = os.path.expanduser('~/caffe-env')

    convert_tool = findfile(caffe_home, 'convert_imageset')
    assert convert_tool is not None, "Can't find convert_imageset"
    if not exists(dataset_store):
        os.makedirs(dataset_store)
    for phase in ['train', 'val']:
        output_lmdb = join(dataset_store, '{}_lmdb'.format(phase))

        if exists(output_lmdb):
            shutil.rmtree(output_lmdb)

        command = "{} --shuffle --resize_height={} --resize_width={}  {}/ {}  {}".format(
            convert_tool, shape[0], shape[1], output_res['{}_dataset'.format(phase)], output_res['{}_image_label'.format(phase)], output_lmdb)
        output_res["{}_lmdb".format(phase)] = output_lmdb
        args = shlex.split(command)
        # convert_imageset logs through glog, which writes to stderr
        with open('log.err', 'w') as ferror:
            p_data = subprocess.Popen(args, stderr=ferror)
            # the LMDB must be fully written before the mean is computed
            p_data.wait()
        compute_image_mean = findfile(caffe_home, 'compute_image_mean')
        if compute_image_mean is not None:
            mean_file = join(dataset_store, 'mean_{}.binaryproto'.format(phase))
            mean_args = shlex.split("{} {} {}".format(
                compute_image_mean, output_lmdb + "/", mean_file))
            subprocess.Popen(mean_args).wait()
            output_res['{}_mean'.format(phase)] = mean_file
    return output_res


def caffe_home():
    home_path = os.path.expanduser('~/')
    caffe_path = findfile(home_path, 'caffe')
    return caffe_path


def base_network(in_path, output_path, args):
    train_val = join(in_path, 'train_val.prototxt')
    network_module = caffe_pb2.NetParameter()
    solver_module = caffe_pb2.SolverParameter()
    with open(train_val, 'r') as f:
        text_format.Parse(f.read(), network_module)
    for layer in network_module.layer:
        if layer.type == 'Data':
            for phase_mesg in layer.include:
                if phase_mesg.phase == 0:  # caffe.TRAIN
                    layer.transform_param.mean_file = args['train_mean']
                    layer.data_param.source = args['train_lmdb']
                    layer.data_param.batch_size = 16
                else:  # caffe.TEST
                    layer.transform_param.mean_file = args['val_mean']
                    layer.data_param.source = args['val_lmdb']
        if layer.type == 'InnerProduct':
            # Re-head the 1000-way ImageNet classifier for the 37 pet classes.
            if layer.inner_product_param.num_output == 1000:
                layer.inner_product_param.num_output = 37
    with open(join(output_path, 'train_val.pbtxt'), 'w') as f:
        f.write(text_format.MessageToString(network_module))

    backup_path = join(output_path, 'model')
    if not exists(backup_path):
        os.makedirs(backup_path)
    with open(join(in_path, 'solver.prototxt'), 'r') as f:
        text_format.Parse(f.read(), solver_module)
        solver_module.net = join(output_path, 'train_val.pbtxt')

        solver_module.snapshot_prefix = backup_path
    with open(join(output_path, 'solver.prototxt'), 'w') as f:
        f.write(text_format.MessageToString(solver_module))


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description='Prepare the pet dataset and base network for Caffe.')
    parser.add_argument('--output_path', '-o', default='/tmp/pet', type=str)
    parser.add_argument('--dataset_path', '-d',
                        default='~/Datasets/pets/images', type=str)
    parser.add_argument('--base_network', '-b',
                        default='~/caffe/models/bvlc_reference_caffenet', type=str)
    args = parser.parse_args()
    # argparse does not expand '~', so do it explicitly
    dataset_path = os.path.expanduser(args.dataset_path)
    base_network_path = os.path.expanduser(args.base_network)
    config = convert_dataset(dataset_path, args.output_path)
    with open(join(args.output_path, 'info.json'), 'w') as f:
        json.dump(obj=config, fp=f)

    base_network(base_network_path, args.output_path, config)
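
For reference, a typical invocation of the script above (I call the file preprocess.py here, that name is my own; the paths are just the argparse defaults):

python preprocess.py \
    -d ~/Datasets/pets/images \
    -o /tmp/pet \
    -b ~/caffe/models/bvlc_reference_caffenet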

One-click training script

CAFFE_ROOT=${HOME}/caffe
SOLVER_ROOT=${CAFFE_ROOT}/models/resnet50
${CAFFE_ROOT}/.build_debug/tools/caffe train --solver=${SOLVER_ROOT}/solver.prototxt --gpu=0

I won't paste the solver.prototxt here; you need to find a ResNet-50 solver and train_val yourself and fill in the paths generated by the Python script above. If the script above did not generate the mean file, use the following command:

${CAFFE_ROOT}/.build_debug/tools/compute_image_mean /tmp/caffe_dataset/train_lmdb  /tmp/caffe_dataset/mean_train.binary
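
If you want to inspect the resulting mean file from Python, here is a small sketch using the same caffe_pb2 module as the script above (the file path is just the one from the command):

import numpy as np
import caffe_pb2

blob = caffe_pb2.BlobProto()
with open('/tmp/caffe_dataset/mean_train.binary', 'rb') as f:
    blob.ParseFromString(f.read())
# BlobProto stores the data flat; reshape to (channels, height, width)
mean = np.array(blob.data, dtype=np.float32).reshape(
    blob.channels, blob.height, blob.width)
print(mean.shape, mean.mean(axis=(1, 2)))  # per-channel BGR means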

One last remark: if the Caffe build fails with undefined references to cblas_xxx, double-check your BLAS setup. If you use OpenBLAS, verify that its libraries are at the path given in Makefile.config; otherwise, download the OpenBLAS source and build it yourself. And of course, if you are on Manjaro, the almighty pacman can solve any installation problem for you.
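
Two quick checks I find useful here, assuming the /opt/OpenBLAS prefix from Makefile.config above:

# is OpenBLAS visible to the dynamic linker at all?
ldconfig -p | grep -i openblas
# do the directories referenced by Makefile.config actually exist?
ls /opt/OpenBLAS/include /opt/OpenBLAS/lib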
