1. Background
I'm a beginner. A while ago, when my PyCharm license expired, I tried to crack the software following instructions found online. The result: PyCharm would no longer open, and after uninstalling it, parts of my environment configuration were gone and my code started throwing errors, so I had to set up the dependencies all over again. The two links below are my colleague's setup notes from that time; I basically followed them step by step, but still hit a few errors along the way, so I am writing up my own notes here.
TensorFlow Object Detection Win10 environment setup
Tutorial: training on your own data with the TensorFlow Object Detection API
Key points to note:
(1) Create a new conda/miniconda virtual environment and install the dependencies into it one by one (see the sketch after this list).
(2) The TensorFlow, CUDA, Python, and models versions must all match each other exactly; see the following link:
TensorFlow / models / CUDA / Python version correspondence table
(3) Download the protobuf build that matches your operating system; for Win10 that is the win32.zip package. The correct package contains a bin folder; any package without one is the wrong one.
Download link
(4) The models repository does not need to sit in the same path as TensorFlow or your project folder, but you must pick the correct release for your TensorFlow version, and check that the release actually contains a research folder. Some releases are missing this folder, and those releases will not work!
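For step (1), here is a minimal sketch of creating and activating such an environment. The environment name tfod is my own placeholder; pick the Python version from the correspondence table above:
conda create -n tfod python=3.6
conda activate tfod
Then install the dependencies into this environment one by one, as described in section 3.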
2. Preparation
2.1 Install Miniconda3 following this tutorial:
Installing Miniconda3 on Windows 10
2.2 Configure your IDE's interpreter to point at the environment under the install path.
2.3 Target environment:
Python 3.6, CUDA 9.0, tensorflow-gpu 1.12, models 1.12
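Before going further, it is worth confirming that the CUDA toolkit actually installed is the expected one. Assuming nvcc is on your PATH, the following should report release 9.0:
nvcc --version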
3. Configuration steps
3.1 Install tensorflow-gpu 1.12; you can switch to a mirror inside China to speed up the download:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow-gpu==1.12
If the install fails with an error saying this version does not exist, the likely cause is that no TensorFlow build is published for your Python version. Open the index URL above, check which versions are available, and install one of those.
There is also the case where no version at all can be found, which means the Python version of your virtual environment is too low. On Ubuntu I could install tensorflow-gpu 1.14 under Python 2.7 without any trouble, but on Windows, Python 2.7 finds no version at all. I eventually found the solution on this site: upgrade Python. I created a fresh virtual environment based on Python 3.7, and tensorflow-gpu 1.14 then installed without problems.
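Once the install succeeds, a minimal sanity check (using the TF 1.x API) that the package imports and can see the GPU:
import tensorflow as tf
print(tf.__version__)                # should print the version you installed, e.g. 1.12.0
print(tf.test.is_gpu_available())    # True means CUDA/cuDNN are wired up correctly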
3.2 Download protobuf. The exact version is not critical; I downloaded 3.6.0. Download link
The extracted directory looks like the figure below; copy the protoc.exe file from bin into C:\Windows.
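To confirm that protoc is now reachable from anywhere, open a new command prompt and run:
protoc --version
It should print libprotoc 3.6.0 (or whichever release you downloaded); if the command is not found, the copy into C:\Windows did not take effect.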
3.3 Download the models release matching your TensorFlow version; I downloaded 1.12. If you use a different TensorFlow version, the models version must correspond to it.
You can download the matching version via the table in this link [models download 1]; if it is not there, you can also find the version you need on this site: models download 2
The download locations for each version are shown in the figure below.
3.4 Install the research package
In the research folder:
cd research
python setup.py build
python setup.py install
If your virtual environment is Python 2.7, this step may fail with:
error: Setup script exited with Beginning with Matplotlib 3.4, Python 3.7 or above is required. You are using Python 2.7.18.
Solution: the command-line hint tells you to update pip, but even after updating pip the error persists. The real solution is here:
pip install matplotlib==2.1.0
In addition, under Python 2.7 the OpenCV install may also fail with ERROR: Command errored out with exit status 1.
The solution is here: you need to install a cv version compatible with Python 2.7 (see the pin below).
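For reference, as far as I know opencv-python 4.2.0.32 was the last release that still shipped Python 2.7 wheels, so pinning it like this should work:
pip install opencv-python==4.2.0.32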
3.5 Compile the protos files with protobuf
3.5.1 Install slim
cd slim
python setup.py build
python setup.py install
3.5.2 Compile the models protos. There are two methods; both aim to generate the *_pb2.py files under the object_detection/protos folder.
Method 1: from models/research/, run: protoc object_detection/protos/*.proto --python_out=.
Method 2: follow step 4.2 in this link.
The generated .py files are shown in the figure below.
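One Windows-specific caveat: cmd.exe does not expand the *.proto wildcard, so if method 1 generates nothing, a per-file loop is a common workaround. Either way, you can verify the result with a quick import (both run from models/research/ at the command prompt):
for /f %i in ('dir /b object_detection\protos\*.proto') do protoc object_detection\protos\%i --python_out=.
python -c "from object_detection.protos import anchor_generator_pb2"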
3.6 Test whether the installation succeeded
From the research directory:
cd research
python object_detection/builders/model_builder_test.py
4. Errors encountered
4.1 The installation test may fail saying certain files were not found, even though the files do exist at the reported path. The common advice online is to upgrade TensorFlow, and that is how I solved it too: I upgraded TensorFlow and models from 1.8.0 to 1.12.0.
For example: gru_ops.so not found
4.2 If the installation test fails with cannot import name 'anchor_generator_pb2', the fix is as follows:
Link to the solution
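As far as I can tell, this error simply means the compiled anchor_generator_pb2.py does not exist yet, i.e. the protoc step in 3.5.2 was skipped or run from the wrong directory, so re-running it from models/research/ usually clears the error:
protoc object_detection/protos/*.proto --python_out=.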
4.3 The error shown in the figure below
I finally solved this one by rewriting my code against an object detection template. The original code worked before my mishap, but after reconfiguring the environment it stopped working again, which was baffling. In the end I modified the code to follow the template, and the program ran successfully.
The template code is attached below:
import time
import numpy as np
import tensorflow as tf
import cv2
import pyrealsense2 as rs
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
# Set up the RealSense camera: colour stream at 1280x720, 6 fps
pipeline = rs.pipeline()
config = rs.config()
# config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 6)
profile = pipeline.start(config)
align_to = rs.stream.color
align = rs.align(align_to)
def get_aligned_images():
    """Grab one frame from the camera and return it as a BGR numpy array."""
    frames = pipeline.wait_for_frames()
    # aligned_frames = align.process(frames)
    # aligned_depth_frame = aligned_frames.get_depth_frame()
    color_frame = frames.get_color_frame()
    # depth_image = np.asanyarray(aligned_depth_frame.get_data())
    # depth_image_8bit = cv2.convertScaleAbs(depth_image, alpha=0.03)
    # pos = np.where(depth_image_8bit == 0)
    # depth_image_8bit[pos] = 255
    color_image = np.asanyarray(color_frame.get_data())
    # depth_image_3d = np.dstack((depth_image, depth_image, depth_image))  # depth is 1 channel, color is 3
    return color_image
# Paths to the exported model and label map; adjust these to your own files
MODEL_NAME = 'data/mask_rcnn_inception_v2_coco'
PATH_TO_FROZEN_GRAPH = "data/frozen_inference_graph.pb"
PATH_TO_LABELS = "data/tonsil_label_map.pbtxt"
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
def net_init(graph):
    """Collect the input placeholder and output tensors of the detection graph."""
    with graph.as_default():
        # Get handles to input and output tensors
        ops = tf.get_default_graph().get_operations()
        all_tensor_names = {output.name for op in ops for output in op.outputs}
        tensor_dict = {}
        for key in [
                'num_detections', 'detection_boxes', 'detection_scores',
                'detection_classes', 'detection_masks'
        ]:
            tensor_name = key + ':0'
            if tensor_name in all_tensor_names:
                tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                    tensor_name)
        if 'detection_masks' in tensor_dict:
            # The following processing is only for a single image
            detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
            detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
            # Reframing translates the masks from box coordinates to image coordinates
            real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
            detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
            detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
            detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                detection_masks, detection_boxes, 720, 1280)  # image height, width
            detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
            # Follow the convention by adding back the batch dimension
            tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
        image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')
        return image_tensor, tensor_dict
def run_inference_for_single_image(image_tensor, tensor_dict, graph, image, sess):
    with graph.as_default():
        output_dict = sess.run(tensor_dict,
                               feed_dict={image_tensor: np.expand_dims(image, 0)})
        # All outputs are float32 numpy arrays, so convert types as appropriate
        output_dict['num_detections'] = int(output_dict['num_detections'][0])
        output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
        output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
        output_dict['detection_scores'] = output_dict['detection_scores'][0]
        if 'detection_masks' in output_dict:
            output_dict['detection_masks'] = output_dict['detection_masks'][0]
        return output_dict
# Load the frozen graph and create a session on it
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
sess = tf.Session(graph=detection_graph)
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS)
image_tensor, tensor_dict = net_init(detection_graph)
while True:
    t0 = time.time()
    rgb = get_aligned_images()
    # Actual detection on the current frame
    output_dict = run_inference_for_single_image(image_tensor, tensor_dict,
                                                 detection_graph, rgb, sess)
    # Visualization of the results of a detection
    vis_util.visualize_boxes_and_labels_on_image_array(
        rgb,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    t1 = time.time()
    print("inference took %f s" % (t1 - t0))
    cv2.imshow('test', rgb)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
pipeline.stop()
cv2.destroyAllWindows()