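Oddly enough, whether you write your own dataset or use one of the built-in dataset utilities, wrapping it in a DataLoader normally gives you an iterable object.
In my case, though, I was sure the dataset could be printed and that constructing the DataLoader raised no error, yet iterating over the dataloader still failed: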
from torch.utils.data import DataLoader
# `dataset` is the dataset object defined earlier
dataloader = DataLoader(dataset, batch_size=123, shuffle=True, num_workers=6, drop_last=True)
for i in dataloader:
    print(123)
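This raises an error similar to the one reported in this issue: https://github.com/AliaksandrSiarohin/first-order-model/issues/197
I can no longer find my own error output, so here is the traceback copied from that issue: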
RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/imageio/plugins/ffmpeg.py in _read_frame_data(self)
620 raise RuntimeError(
--> 621 "Frame is %i bytes, but expected %i." % (len(s), framesize)
622 )
RuntimeError: Frame is 0 bytes, but expected 12288.
During handling of the above exception, another exception occurred:
CannotReadFrameError Traceback (most recent call last)
5 frames
/usr/local/lib/python3.6/dist-packages/imageio/plugins/ffmpeg.py in _read_frame_data(self)
626 err2 = self._stderr_catcher.get_text(0.4)
627 fmt = "Could not read frame %i:\n%s\n=== stderr ===\n%s"
--> 628 raise CannotReadFrameError(fmt % (self._pos, err1, err2))
629 return s, is_new
630
CannotReadFrameError: Could not read frame 1178:
Frame is 0 bytes, but expected 12288.
=== stderr ===
ffmpeg version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
libavutil 55. 78.100 / 55. 78.100
libavcodec 57.107.100 / 57.107.100
libavformat 57. 83.100 / 57. 83.100
libavdevice 57. 10.100 / 57. 10.100
libavfilter 6.107.100 / 6.107.100
libavresample 3. 7. 0 / 3. 7. 0
libswscale 4. 8.100 / 4. 8.100
libswresample 2. 9.100 / 2. 9.100
libpostproc 54. 7.100 / 54. 7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/content/gdrive/My Drive/first-order-motion-model/DBnuggest_1.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41
creation_time : 2020-07-28T17:25:57.000000Z
Duration: 00:00:19.71, start: 0.000000, bitrate: 368 kb/s
Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, smpte170m), 64x64, 186 kb/s, 60 fps, 60 tbr, 60k tbn, 120 tbc (default)
Metadata:
creation_time : 2020-07-28T17:25:58.000000Z
handler_name : Alias Data Handler
encoder : AVC Coding
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, mono, fltp, 158 kb/s (default)
Metadata:
creation_time : 2020-07-28T17:25:58.000000Z
handler_name : Alias Data Handler
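In a nutshell, the dataset was the problem.
Iterating over the dataloader is what actually loads the dataset's contents, and the part of the dataset that reads the data hit an exception that was never raised to the caller. The linked issue handles this well: just raise the exception.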
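As a rough illustration of "just raise the exception", here is a minimal sketch of a wrapper that re-raises any loading error with the failing index attached. It is not the code from the linked issue. `FlakyDataset` and `SafeDataset` are hypothetical names made up for this example, and plain Python classes with `__len__` and `__getitem__` stand in for a real map-style dataset so the snippet runs without PyTorch.

```python
class FlakyDataset:
    """Hypothetical stand-in for a dataset whose loader fails on some samples."""

    def __init__(self, size, bad_indices):
        self.size = size
        self.bad_indices = set(bad_indices)

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if index < 0 or index >= self.size:
            raise IndexError(index)
        if index in self.bad_indices:
            # Simulates a reader that cannot decode this sample.
            raise ValueError("Frame is 0 bytes")
        return index * 2


class SafeDataset:
    """Hypothetical wrapper: re-raises load errors with the sample index attached."""

    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, index):
        try:
            return self.base[index]
        except IndexError:
            # Leave out-of-range errors untouched so plain iteration still stops.
            raise
        except Exception as exc:
            # Raise instead of swallowing, and say which sample failed.
            raise RuntimeError(f"failed to load sample {index}: {exc!r}") from exc


safe = SafeDataset(FlakyDataset(size=5, bad_indices=[3]))
print(safe[0])
try:
    safe[3]
except RuntimeError as err:
    print(err)
```

With a wrapper like this, the error message names the sample that could not be read, which is usually the first thing you need to know.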
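So if you find that a dataloader will not work in PyTorch, consider that the dataset's data-loading step may be hitting an exception, then either raise it or fix the cause.
It looks simple, but I really did not expect a dataloader to hide a problem like this. Be careful when defining your own dataset.
Ultimately, the data-loading part of the dataset failed for some samples and worked for others, which is why I could print the dataset's contents and still get the error. Wherever you can, just raise an exception.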
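Since only some samples fail, printing a few items proves nothing. One way to check is to index every sample once in the main process and record which ones raise. The sketch below assumes a map-style dataset, meaning any object with `__len__` and `__getitem__`. `find_bad_indices` and `PartlyBrokenDataset` are hypothetical names used only for illustration.

```python
def find_bad_indices(dataset):
    """Index every sample once and collect (index, error) for the ones that fail."""
    bad = []
    for index in range(len(dataset)):
        try:
            dataset[index]
        except Exception as exc:
            bad.append((index, repr(exc)))
    return bad


class PartlyBrokenDataset:
    """Hypothetical dataset: most samples load, a few raise."""

    def __init__(self, size, bad_indices):
        self.size = size
        self.bad_indices = set(bad_indices)

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if index in self.bad_indices:
            raise ValueError(f"cannot read sample {index}")
        return index


dataset = PartlyBrokenDataset(size=8, bad_indices=[2, 5])
print(dataset[0])  # the first sample loads fine, so a quick print looks healthy
for index, error in find_bad_indices(dataset):
    print(index, error)
```

A full scan like this can be slow on a large dataset, but it runs without worker processes, so each failure is reported directly against its index.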