Bug log for "21 Projects to Master TensorFlow"


Environment

Windows 10 + Python 3.6 + TensorFlow 1.4.

Bug history

Chapter 3

Running data_convert.py fails with:
Traceback (most recent call last):
  File "E:/03personal/DeepLearning/03IMG/data_prepare/data_convert.py", line 35, in <module>
    main(args)
  File "E:\03personal\DeepLearning\03IMG\data_prepare\src\tfrecord.py", line 409, in main
    command_args.validation_shards, command_args.labels_file, command_args)
  File "E:\03personal\DeepLearning\03IMG\data_prepare\src\tfrecord.py", line 361, in _process_dataset
    filenames, texts, labels = _find_image_files(directory, labels_file, command_args)
  File "E:\03personal\DeepLearning\03IMG\data_prepare\src\tfrecord.py", line 341, in _find_image_files
    random.shuffle(shuffled_index)
  File "C:\ProgramData\Anaconda3\lib\random.py", line 275, in shuffle
    x[i], x[j] = x[j], x[i]
TypeError: 'range' object does not support item assignment

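Fix
In Python 3, range() returns an immutable range object rather than a list, so it cannot be shuffled in place. Change shuffled_index = range(len(filenames)) to shuffled_index = list(range(len(filenames)))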
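
The failure and the fix can be reproduced without the book's code. The sketch below is only an illustration: the file names and the seed are made up, and only shuffled_index and filenames mirror names from tfrecord.py.

```python
import random

# Hypothetical file list standing in for the one built in tfrecord.py.
filenames = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]

# Python 3: range objects are immutable, so in-place shuffling fails.
try:
    random.shuffle(range(len(filenames)))
except TypeError as exc:
    print("shuffle on a range failed:", exc)

# Converting to a list first gives shuffle something it can reorder.
shuffled_index = list(range(len(filenames)))
random.seed(12345)  # arbitrary seed, only to make the run repeatable
random.shuffle(shuffled_index)
filenames = [filenames[i] for i in shuffled_index]
```
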
Next error:

Traceback (most recent call last):
  File "E:/03personal/DeepLearning/03IMG/data_prepare/data_convert.py", line 35, in <module>
    main(args)
  File "E:\03personal\DeepLearning\03IMG\data_prepare\src\tfrecord.py", line 409, in main
    command_args.validation_shards, command_args.labels_file, command_args)
  File "E:\03personal\DeepLearning\03IMG\data_prepare\src\tfrecord.py", line 362, in _process_dataset
    _process_image_files(name, filenames, texts, labels, num_shards, command_args)
  File "E:\03personal\DeepLearning\03IMG\data_prepare\src\tfrecord.py", line 259, in _process_image_files
    for i in xrange(len(spacing) - 1):
NameError: name 'xrange' is not defined

Fix
xrange no longer exists in Python 3. Change for i in xrange(len(spacing) - 1): to for i in range(len(spacing) - 1):
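
If the same file also has to run under Python 2, a common alternative to editing every loop is to alias the name once at the top of the module. This is a sketch of that idiom, not what the author did; the spacing values are invented for the example.

```python
# Alias xrange to range when running on Python 3.
try:
    xrange  # exists on Python 2 only
except NameError:
    xrange = range

# Hypothetical shard boundaries, similar in shape to `spacing` in tfrecord.py.
spacing = [0, 25, 50, 75, 100]
ranges = [[spacing[i], spacing[i + 1]] for i in xrange(len(spacing) - 1)]
```
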

Running data_convert.py again produces the following errors:
UnicodeDecodeError: 'gbk' codec can't decode byte 0xff in position 0: illega
TypeError (tf.train.Feature): 'RGB' has type str, but expected one of: bytes
TypeError: 'water' has type str, but expected one of: bytes

The following places need to change:

tfrecord.py line 160: change to  with open(filename, 'rb') as f:
tfrecord.py lines 94 and 96: change to  colorspace = b'RGB'     image_format = b'JPEG'
tfrecord.py line 104: change to  'image/class/text': _bytes_feature(str.encode(text)),
tfrecord.py line 106: change to  'image/filename': _bytes_feature(os.path.basename(str.encode(filename)))
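
All four changes come down to the Python 3 split between str and bytes: image files must be opened in binary mode (a JPEG starts with byte 0xff, which the default gbk text codec cannot decode), and every value stored in a bytes feature must be bytes. The sketch below shows both points without TensorFlow; to_bytes is a hypothetical helper and the file name is invented.

```python
import os
import tempfile


def to_bytes(value):
    """Return value as bytes, encoding str as UTF-8 (hypothetical helper)."""
    return value if isinstance(value, bytes) else value.encode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "water_001.jpg")
    # Write the first bytes of a JPEG header so there is something to read.
    with open(filename, "wb") as f:
        f.write(b"\xff\xd8\xff\xe0")
    # 'rb' returns raw bytes; text mode would try to decode them and fail.
    with open(filename, "rb") as f:
        image_data = f.read()
    base_name = to_bytes(os.path.basename(filename))

text = to_bytes("water")
colorspace = b"RGB"
```
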
Running train_image_classifier.py fails with:

Cannot assign a device for operation 'InceptionV3/AuxLogits/Conv2d_2b_1x1/weights/RMSProp1': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available

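Fix: pass a session config with allow_soft_placement=True so that ops without a GPU kernel fall back to the CPU.

# Modified code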

    ###########################
    # Kicks off the training. #
    ###########################
    config = tf.ConfigProto(allow_soft_placement=True)  # changed here
    slim.learning.train(
        train_tensor,
        logdir=FLAGS.train_dir,
        master=FLAGS.master,
        is_chief=(FLAGS.task == 0),
        init_fn=_get_init_fn(),
        summary_op=summary_op,
        number_of_steps=FLAGS.max_number_of_steps,
        log_every_n_steps=FLAGS.log_every_n_steps,
        save_summaries_secs=FLAGS.save_summaries_secs,
        save_interval_secs=FLAGS.save_interval_secs,
        sync_optimizer=optimizer if FLAGS.sync_replicas else None,
        session_config=config)

Chapter 4

protoc installation problem

Installation tutorial: https://blog.csdn.net/mr_jor/article/details/79071963
While setting up protoc, run the following command in cmd from the models/research directory:
protoc object_detection/protos/*.proto --python_out=.

E:\03personal\DeepLearning\05ObjectDec\models\research>protoc object_detection/protos/*.proto --python_out=.
object_detection/protos/*.proto: No such file or directory

protoc versions above 3.5 have a bug here; use 3.4 instead. Download: https://github.com/google/protobuf/releases/tag/v3.4.0

E:\03personal\DeepLearning\05ObjectDec\models\research>protoc object_detection/protos/*.proto --python_out=.

E:\03personal\DeepLearning\05ObjectDec\models\research>
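
The error message suggests that the *.proto pattern reached protoc unexpanded (cmd does not expand wildcards itself). If downgrading is not an option, one possible workaround is to expand the pattern in Python and call protoc once per file. This is a sketch under that assumption, not the author's method; build_protoc_commands and compile_protos are hypothetical helpers.

```python
import glob
import os
import subprocess


def build_protoc_commands(research_dir):
    """Return one protoc command per .proto file under object_detection/protos."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    commands = []
    for path in sorted(glob.glob(pattern)):
        # protoc is run from research_dir, so pass paths relative to it.
        rel_path = os.path.relpath(path, research_dir).replace(os.sep, "/")
        commands.append(["protoc", rel_path, "--python_out=."])
    return commands


def compile_protos(research_dir):
    """Run the commands; requires protoc to be on PATH."""
    for command in build_protoc_commands(research_dir):
        subprocess.check_call(command, cwd=research_dir)


# Usage (not executed here):
# compile_protos(r"E:\03personal\DeepLearning\05ObjectDec\models\research")
```
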
model_builder_test.py problem
(base) E:\03personal\DeepLearning\05ObjectDec\models\research>python object_detection/builders/model_builder_test.py
C:\ProgramData\Anaconda3\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "object_detection/builders/model_builder_test.py", line 23, in <module>
    from object_detection.builders import model_builder
ModuleNotFoundError: No module named 'object_detection'

(base) E:\03personal\DeepLearning\05ObjectDec\models\research>SET PYTHONPATH=%cd%;%cd%\slim

When running object_detection/builders/model_builder_test.py:

object_detection/builders/model_builder_test.py:None (object_detection/builders/model_builder_test.py)
model_builder_test.py:23: in <module>
    from object_detection.builders import model_builder
model_builder.py:22: in <module>
    from object_detection.builders import box_predictor_builder
box_predictor_builder.py:20: in <module>
    from object_detection.predictors import convolutional_box_predictor
..\predictors\convolutional_box_predictor.py:19: in <module>
    from object_detection.core import box_predictor
..\core\box_predictor.py:137: in <module>
    class KerasBoxPredictor(tf.keras.Model):
E   AttributeError: module 'tensorflow.python.keras' has no attribute 'Model'

TensorFlow needs to be upgraded to version 1.12; the current 1.14 release has other problems.

pip install -U tensorflow==1.12

Running model_builder_test.py again fails with No module named 'nets':

object_detection/builders/model_builder_test.py:None (object_detection/builders/model_builder_test.py)
ImportError while importing test module 'E:\03personal\DeepLearning\05ObjectDec\models\research\object_detection\builders\model_builder_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
model_builder_test.py:25: in <module>
    from object_detection.builders import model_builder
model_builder.py:35: in <module>
    from object_detection.models import faster_rcnn_inception_resnet_v2_feature_extractor as frcnn_inc_res
..\models\faster_rcnn_inception_resnet_v2_feature_extractor.py:28: in <module>
    from nets import inception_resnet_v2
E   ModuleNotFoundError: No module named 'nets'

Add the following at the very top of model_builder_test.py:

import sys
sys.path.append("E:/03personal/DeepLearning/05ObjectDec/models")
sys.path.append("E:/03personal/DeepLearning/05ObjectDec/models/research/slim")
sys.path.append("E:/03personal/DeepLearning/05ObjectDec/models/research")
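
The three append calls can also be wrapped in a small helper that skips directories already on sys.path, which keeps the list clean when the file is imported more than once. This is only a sketch: add_to_sys_path is a hypothetical helper, and MODELS_DIR is the author's local path, to be replaced with your own.

```python
import os
import sys


def add_to_sys_path(*directories):
    """Append each directory to sys.path once; return the ones actually added."""
    added = []
    for directory in directories:
        directory = os.path.normpath(directory)
        if directory not in sys.path:
            sys.path.append(directory)
            added.append(directory)
    return added


# The author's local checkout; adjust to your own machine.
MODELS_DIR = "E:/03personal/DeepLearning/05ObjectDec/models"
added = add_to_sys_path(
    MODELS_DIR,
    MODELS_DIR + "/research/slim",
    MODELS_DIR + "/research",
)
```
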

Running the tutorial

The backend was *originally* set to 'Qt5Agg' by the following code:
  File "E:/03personal/DeepLearning/05ObjectDec/models/research/object_detection/object_detection_tutorial.py", line 12, in <module>
    from matplotlib import pyplot as plt
  File "C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\pyplot.py", line 71, in <module>
    from matplotlib.backends import pylab_setup
  File "C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\backends\__init__.py", line 16, in <module>
    line for line in traceback.format_stack()
  import matplotlib; matplotlib.use('Agg')  # pylint: disable=multiple-statements

Fix: select the backend before pyplot is imported:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

Chapter 6

The environment used for Chapter 6 was:
Windows 7 + PyCharm + TensorFlow 1.6 + Python 3.6.4
If importing facenet fails:

Traceback (most recent call last):
  File "src/align/align_dataset_mtcnn.py", line 34, in <module>
    import facenet

ImportError: No module named 'facenet'

Add the following at the top of the file:
import sys
sys.path.append("I:/github/DL_21tensorflow/06FaceDect/src")
sys.path.append("I:/github/DL_21tensorflow/06FaceDect")
Run the script from Anaconda Prompt. With the files in the right places, run the following from I:\github\DL_21tensorflow\06FaceDect:

python  src/align/align_dataset_mtcnn.py   datasets/lfw/raw  datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order

If you see an error like "Failed to get convolution algorithm. This is probably because cuDNN failed to initialize.....", the GPU may not have enough memory. You can cap the GPU memory used by the process:

Traceback (most recent call last):
  File "src/align/align_dataset_mtcnn.py", line 155, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "src/align/align_dataset_mtcnn.py", line 104, in main
    bounding_boxes, _ = align.detect_face.detect_face(img, minsize, pnet, rnet, onet, threshold, factor)
  File "E:/03personal/DeepLearning/06FaceDec/src\align\detect_face.py", line 336, in detect_face
    out = pnet(img_y)
  File "E:/03personal/DeepLearning/06FaceDec/src\align\detect_face.py", line 299, in <lambda>
    pnet_fun = lambda img: sess.run(('pnet/conv4-2/BiasAdd:0', 'pnet/prob1:0'), feed_dict={'pnet/input:0': img})
  File "D:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "D:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "D:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[node pnet/conv1/Conv2D (defined at E:/03personal/DeepLearning/06FaceDec/src\align\detect_face.py:154) ]]
         [[node pnet/prob1 (defined at E:/03personal/DeepLearning/06FaceDec/src\align\detect_face.py:215) ]]

Append the argument --gpu_memory_fraction 0.6 to the command. The value is a number no greater than 1; adjust it to suit your machine:

python src/align/align_dataset_mtcnn.py datasets/lfw/raw datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.6
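
For readers wondering what such a flag does, here is a minimal sketch of how a script might parse and validate it with argparse. It is not the facenet source: the fraction validator and this parse_arguments are written for illustration, and the TensorFlow 1.x lines are left as comments so the example runs without a GPU.

```python
import argparse


def fraction(value):
    """argparse type: accept a float in the interval (0, 1]."""
    number = float(value)
    if not 0.0 < number <= 1.0:
        raise argparse.ArgumentTypeError("must be in (0, 1], got %r" % value)
    return number


def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu_memory_fraction", type=fraction, default=1.0,
                        help="Upper bound on the share of GPU memory to use.")
    return parser.parse_args(argv)


args = parse_arguments(["--gpu_memory_fraction", "0.6"])

# In TensorFlow 1.x the value would typically be handed to the session:
# gpu_options = tf.GPUOptions(
#     per_process_gpu_memory_fraction=args.gpu_memory_fraction)
# sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
```
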

To validate on the dataset, run validate_on_lfw.py from the project root as well: python validate_on_lfw.py datasets/lfw/lfw_mtcnnpy_160 models
The first argument is the dataset folder and the second is the model folder.

compare

Chapter 7

Running eval.py fails with:

Traceback (most recent call last):
  File "I:/github/DL_21tensorflow/07StyleWand/eval.py", line 76, in <module>
    tf.app.run()
  File "C:\Program Files (x86)\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "I:/github/DL_21tensorflow/07StyleWand/eval.py", line 46, in main
    generated = model.net(image, training=False)
  File "I:\github\DL_21tensorflow\07StyleWand\model.py", line 102, in net
    conv1 = relu(instance_norm(conv2d(image, 3, 32, 9, 1)))
  File "I:\github\DL_21tensorflow\07StyleWand\model.py", line 9, in conv2d
    x_padded = tf.pad(x, [[0, 0], [kernel / 2, kernel / 2], [kernel / 2, kernel / 2], [0, 0]], mode=mode)
  File "C:\Program Files (x86)\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1896, in pad
    tensor, paddings, mode="REFLECT", name=name)
  File "C:\Program Files (x86)\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 3341, in _mirror_pad
    "MirrorPad", input=input, paddings=paddings, mode=mode, name=name)
  File "C:\Program Files (x86)\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 609, in _apply_op_helper
    param_name=input_name)
  File "C:\Program Files (x86)\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 60, in _SatisfiesTypeConstraint
    ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'paddings' has DataType float32 not in list of allowed values: int32, int64

In Python 3, kernel / 2 is true division and returns a float, while tf.pad only accepts integer paddings. In model.py, change

x_padded = tf.pad(x, [[0, 0], [kernel / 2, kernel / 2], [kernel / 2, kernel / 2], [0, 0]], mode=mode)

to

        x_padded = tf.pad(x,
                          [[0, 0], [kernel // 2, kernel // 2], [kernel // 2, kernel // 2],
                           [0, 0]], mode=mode)
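
The difference between the two division operators is easy to check in plain Python. The helper below is hypothetical and only builds the paddings list that model.py passes to tf.pad; it does not call TensorFlow.

```python
def reflect_paddings(kernel):
    """Build integer paddings for an NHWC tensor (hypothetical helper)."""
    half = kernel // 2  # floor division keeps the result an int
    return [[0, 0], [half, half], [half, half], [0, 0]]


kernel = 9
true_division = kernel / 2    # float in Python 3
floor_division = kernel // 2  # int
paddings = reflect_paddings(kernel)
```
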

Chapter 8

Running the command
python main.py --input_height 96 --input_width 96
–output_height 48 --output_width 48
–dataset anime --crop -–train
–epoch 300 --input_fname_pattern "*.jpg"
fails with:

    _sys.exit(main(argv))
  File "E:/03personal/DeepLearning/08GAN/main.py", line 86, in main
    raise Exception("[!] Train a model first, then run test mode")
Exception: [!] Train a model first, then run test mode

The command needs to be corrected:

python main.py --input_height 96 --output_height 48 --dataset anime --crop True --train True --epoch 10

Chapter 12

Running sample.py fails with:

AttributeError: 'str' object has no attribute 'decode'

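In Python 3 a str is already Unicode text and has no decode method, so comment out the first line of the main function.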

# FLAGS.start_string = FLAGS.start_string.decode('utf-8')
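
If the script should keep working when the start string arrives as bytes (as it did under Python 2), an alternative to commenting the line out is to decode only when needed. This is a sketch; to_text is a hypothetical helper, not part of sample.py.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it only if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Both inputs end up as the same Python 3 str.
from_bytes = to_text("诗".encode("utf-8"))
from_str = to_text("诗")
```

In sample.py this would read FLAGS.start_string = to_text(FLAGS.start_string).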