TensorFlow Pitfall Diary: differences between offline and online results of a TensorFlow pb model (CSDN Blog)

Article link: https://blog.csdn.net/ChenLuLiang/article/details/78813270

Pitfall 1: UnicodeEncodeError: 'utf-8' codec can't encode character '\udcce' in position 1936: surrogates not a

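This one cost me a long time. At first I assumed it was a Python version problem, but after tracing through with print statements I found that the failure was in label_map_util.py, at:

with tf.gfile.FastGFile(path, 'r') as fid:
    label_map_string = fid.read()

Since this was where things went wrong, I pulled it out to test in isolation and wrote a small script of my own: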

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw JPEG bytes. Note: this path contains the typo discussed below
# (an extra "s"), and in a non-raw string '\1' is parsed as an octal escape.
image_raw_data_jpg = tf.gfile.FastGFile('D:\我的图片s\1.jpg', 'rb').read()

with tf.Session() as sess:
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)  # decode the JPEG
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.uint8)
    encode_image_jpg = tf.image.encode_jpeg(img_data_jpg)  # re-encode as JPEG
    encode_image_png = tf.image.encode_png(img_data_jpg)  # re-encode as PNG

    with tf.gfile.GFile('output.jpg', 'wb') as f:
        f.write(encode_image_jpg.eval())

    with tf.gfile.GFile('output.png', 'wb') as f:
        f.write(encode_image_png.eval())

It still failed with:

UnicodeEncodeError: 'utf-8' codec can't encode character '\udcd5' in position 1942: surrogates not allowed

Then I noticed that I had typed an extra s in the path:

D:\我的图片s\1.jpg

So I guessed that the wrong path was causing the read error, and passed the path in directly:

# In[ ]:
label_map = label_map_util.load_labelmap("D:\\models-master\\research\\object_detection\\data\\mscoco_label_map.pbtxt")
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

That fixed it and the code ran.

Cause: an incorrect path led to the read error.
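Since the real cause was a bad path hidden behind an unrelated-looking encoding error, it may help to check the path before handing it to TensorFlow. The following is a minimal sketch, not part of the original code; `require_file` is a hypothetical helper name. It also shows why Windows paths are safer as raw strings or with doubled backslashes, as in the working call above.

```python
import os


def require_file(path):
    """Return path unchanged if it names an existing file, else raise."""
    if not os.path.isfile(path):
        raise FileNotFoundError("No such file: {!r}".format(path))
    return path


# In a normal string literal, '\1' is an octal escape (the single character
# chr(1)), so a Windows path ending in '\1.jpg' silently changes meaning.
assert '\1' == chr(1)
# A raw string keeps the backslash and the digit as two separate characters.
assert r'\1' == '\\1'
```

Wrapping the path, for example `label_map_util.load_labelmap(require_file(path))`, would then fail early with a message that names the missing file.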

Pitfall 2: Bugs when installing the GPU version of TensorFlow on Windows: No module named "_pywrap_tensorflow"; DLL load failed.

After reinstalling the operating system I had to reinstall TensorFlow. Under Python 3.5 I ran pip install tensorflow directly, and the final test, import tensorflow, raised an error:

import tensorflow as tf
Traceback (most recent call last):
  File "D:\Program Files\Python35\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 18, in swig_import_helper
    return importlib.import_module(mname)
  File "D:\Program Files\Python35\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 577, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 906, in create_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Program Files\Python35\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "D:\Program Files\Python35\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 21, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "D:\Program Files\Python35\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 20, in swig_import_helper
    return importlib.import_module('_pywrap_tensorflow_internal')
  File "D:\Program Files\Python35\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: No module named '_pywrap_tensorflow_internal'

I found an article online saying that uninstalling TensorFlow and reinstalling it fixes the problem, though not with plain pip install tensorflow. The command was:

pip install --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-0.12.0rc0-cp35-cp35m-win_amd64.whl

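A side note here: the error

ImportError: No module named '_pywrap_tensorflow_internal'

turned out to be a TensorFlow version problem. Plain pip install tensorflow installs the latest release, but my models-master checkout targets 1.5, so running model_builder_test.py produced the error above. To go back to the matching 1.5 release:

pip install tensorflow==1.5.0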
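Because the failure came from a mismatch between the installed TensorFlow release and the one the models code expects, a quick version comparison can catch it before any tests are run. This is a rough sketch under my own assumptions: `major_minor` and `same_release` are hypothetical helper names, and it only compares the first two version components.

```python
def major_minor(version):
    """Return (major, minor) as ints from a version string such as '1.5.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def same_release(installed, expected):
    """True when both version strings share the same major.minor release."""
    return major_minor(installed) == major_minor(expected)


# Illustrative values only: 1.7.0 installed while the code expects 1.5.
print(same_release("1.7.0", "1.5.0"))
print(same_release("1.5.0", "1.5.0"))
```

With a working install, the check would be `same_release(tf.__version__, "1.5.0")` after `import tensorflow as tf`.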

Pitfall 3: TypeError: None has type NoneType, but expected one of: int, long when building the record dataset with generate_tfrecord.py

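When I printed the log, I found that the error was raised at ship434. I opened that XML file and found that the second bounding box had been mislabeled: its name was not Ship, a slip of the hand while annotating. After correcting it, the dataset CSV has to be regenerated as well.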
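Many copies of generate_tfrecord.py map label names to integer ids with a small helper that returns None for any name it does not recognize, which would explain the NoneType error. Assuming that is the case here, and assuming Pascal VOC style annotations where each object element carries a name element, a scan like the one below can find mislabeled boxes before the dataset is built. The function name, the sample XML, and the misspelled label "Shp" are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def unexpected_labels(xml_text, allowed):
    """Return the object names in one annotation that are not in `allowed`."""
    root = ET.fromstring(xml_text)
    names = [obj.findtext("name") for obj in root.iter("object")]
    return [name for name in names if name not in allowed]


# Hypothetical annotation: the second box carries a misspelled label.
SAMPLE = """<annotation>
  <filename>ship434.jpg</filename>
  <object><name>Ship</name></object>
  <object><name>Shp</name></object>
</annotation>"""

print(unexpected_labels(SAMPLE, {"Ship"}))
```

Running this over every annotation file and printing the non-empty results points at the files to fix before regenerating the CSV.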

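Pitfall 4: win7 + tensorflow1.7 + python3.5 raises

ImportError: No module named '_pywrap_tensorflow_internal'

This one was absurd. win7 + tensorflow1.5 + python3.5 works fine, but as soon as tensorflow1.7 is installed the error above appears.

I tried every fix I could find on Baidu and installed several add-ons, even VS2015, and nothing helped. Then I read a foreign article saying that the CPU does not support something beginning with "A" (I did not remember the word; it was probably AVX, since the prebuilt TensorFlow binaries from 1.6 onward are compiled with AVX instructions).

In the end I gave up on 1.7 and went back to tensorflow1.5.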

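Pitfall 5: The GPU is fully occupied, so eval.py cannot evaluate the model.

The cause: unless you add the following line to trainer.py before training,

session_config.gpu_options.per_process_gpu_memory_fraction = 0.5  # limit GPU memory use to 50%

TensorFlow claims all of the GPU memory by default. If you then try to run eval.py to evaluate the model, it fails because the GPU memory is already full.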
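If training was already started without the memory cap, another option is to run the evaluation on the CPU. This is not what the original setup did: it relies on the CUDA_VISIBLE_DEVICES environment variable, which hides GPUs from a process when set to an empty string. The sketch below only builds the environment for such a child process; `cpu_only_env` is a hypothetical helper name.

```python
import os


def cpu_only_env(base_env=None):
    """Copy an environment and hide all CUDA devices from the child process."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ""  # empty string: no GPU is visible
    return env


# Illustrative base environment; by default the current one is copied.
env = cpu_only_env({"PATH": "/usr/bin"})
print(sorted(env))
```

The result can be passed as the `env` argument of `subprocess.run` when launching eval.py with its usual arguments. Evaluation will be slower on the CPU, but it no longer competes with training for GPU memory.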

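Pitfall 6: Fixing OOM errors

1. batch_size is set too large. The default is 24; if you still get OOM after lowering it to around 15, your images are probably too large (my own understanding).

2. The GPU memory is already fully occupied, which also produces OOM.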
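To see why both batch size and image size matter, here is a back-of-the-envelope estimate of the memory taken by the input batch alone, assuming float32 values (4 bytes each) and 3 channels. The image sizes used are made-up examples, and real usage is much higher because activations, gradients, and weights are not counted.

```python
def input_batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold one dense input batch (float32 by default)."""
    return batch_size * height * width * channels * bytes_per_value


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024.0 * 1024.0)


# Hypothetical settings: (batch_size, square image side in pixels).
for batch, side in [(24, 300), (24, 1024), (15, 1024)]:
    print(batch, side, round(to_mib(input_batch_bytes(batch, side, side)), 1))
```

The estimate grows linearly with batch size but quadratically with the image side, which fits the observation that lowering the batch size alone may not be enough when the images are large.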

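Pitfall 7: Predictions shown in TensorBoard differ from the results of the exported pb model

I recently hit a big one. I trained a model with tensorflow==1.7 and the results in TensorBoard looked very good, but after exporting the model, the same images gave predictions that differed from the earlier ones. This cost me a long time. An expert eventually pointed out that 1.7 had changed the eval code, while my inference code was something I had found online and did not match the latest inference code. Running the object_detection_tutorial.ipynb that ships with 1.7 gave results that matched the predictions, whereas the exported pb model, run with my own inference code, gave completely different ones.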

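Pitfall 8: Loading an image in different formats gives different predictions

Loading the same image with Image.open from PIL and with cv2.imread gives different prediction results, because PIL returns the pixels in RGB order while OpenCV returns them in BGR order. An image loaded with cv2.imread therefore needs to be converted first. The borrowed code is below.

Converting a PIL.Image to OpenCV format: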

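import cv2
from PIL import Image
import numpy

image = Image.open("plane.jpg")  # PIL gives the pixels in RGB order
image.show()
# OpenCV expects BGR, so swap the channel order before displaying
img = cv2.cvtColor(numpy.asarray(image), cv2.COLOR_RGB2BGR)
cv2.imshow("OpenCV", img)
cv2.waitKey()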


Converting an OpenCV image to PIL.Image format:

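import cv2
from PIL import Image
import numpy

img = cv2.imread("plane.jpg")  # OpenCV gives the pixels in BGR order
cv2.imshow("OpenCV", img)
# PIL expects RGB, so swap the channel order before wrapping the array
image = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
image.show()
cv2.waitKey()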
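The two conversions above come down to reversing the channel axis, which can be shown with NumPy alone. This is a small sketch for intuition rather than a replacement for cv2.cvtColor; `swap_rb` is a hypothetical helper name and the 1x2 test image is made up.

```python
import numpy as np


def swap_rb(image):
    """Reverse the last (channel) axis: RGB -> BGR or BGR -> RGB."""
    # Copy into a contiguous array, since the reversed slice is only a view.
    return np.ascontiguousarray(image[..., ::-1])


# Tiny 1x2 image in RGB order: one pure red pixel, one pure blue pixel.
rgb = np.zeros((1, 2, 3), dtype=np.uint8)
rgb[0, 0] = (255, 0, 0)
rgb[0, 1] = (0, 0, 255)

bgr = swap_rb(rgb)
print(bgr[0, 0].tolist())  # the red pixel, now stored in BGR order
```

If a model trained on RGB input is fed a BGR array without this swap, red and blue are exchanged, which is enough to change the predictions.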

-------------------------------------------------To be continued-------------------------------------------------