Recording Kinect video with ffmpeg and dshow: audio/video out of sync, and the timestamp pitfalls I hit

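I am building a dataset and planned to capture color video with a Microsoft Kinect V2. To save budget I did not simply buy a decent USB camera, and this relic has been tormenting me ever since.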

Be sure to install Kinect Runtime 2.2, which is a separate install. With only Kinect SDK 2.0 installed the sensor works, but not well, and pykinect may throw errors.

After that I gave up on that route and went straight to the ffmpeg command line:

ffmpeg -rtbufsize 2000M -f dshow -video_size 1920x1080 -framerate 30 -sample_rate 96000 -sample_size 16 -channel_layout stereo -audio_buffer_size 50 -use_video_device_timestamps false -i video="Kinect V2 Video Sensor":audio="麦克风阵列 (2- Xbox NUI Sensor)" -c:v hevc_nvenc -preset p7  -profile:v main10 -level 4.1 -tier main -b:v 8000k -pix_fmt p010le -c:a flac -ar 96000 -ac 2 -sample_fmt s16 rawvideo.mkv

The dshow device "Kinect V2 Video Sensor" only shows up after Runtime 2.2 is installed.

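Features:

Very large real-time buffer (-rtbufsize 2000M)

96 kHz audio, encoded as FLAC

NVIDIA HEVC hardware encoding: preset p7 (best quality), Main 10 profile, level 4.1 at Main tier, 8 Mbit/s bitrate, 10-bit color depth with YUV 4:2:0 sampling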
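To get a feel for what the "very large buffer" buys, here is a rough back-of-the-envelope calculation. It is only a sketch under two assumptions that are not stated in the post: that the device delivers uncompressed frames at 2 bytes per pixel (a packed 4:2:2 format such as YUY2), and that ffmpeg reads the M suffix as 10^6.

```python
# Rough arithmetic for -rtbufsize, assuming uncompressed 2-bytes-per-pixel frames.
WIDTH, HEIGHT, FPS = 1920, 1080, 30
BYTES_PER_PIXEL = 2           # assumption: packed 4:2:2 such as YUY2
RTBUF_BYTES = 2000 * 10**6    # assumption: "2000M" read as 2000 * 10^6 bytes


def raw_video_rate(width, height, fps, bytes_per_pixel):
    """Bytes per second of uncompressed video at the given mode."""
    return width * height * bytes_per_pixel * fps


def buffer_seconds(buf_bytes, rate):
    """How many seconds of input the buffer can absorb at the given rate."""
    return buf_bytes / rate


rate = raw_video_rate(WIDTH, HEIGHT, FPS, BYTES_PER_PIXEL)
print(f"raw video: {rate / 1e6:.1f} MB/s")
print(f"buffer holds about {buffer_seconds(RTBUF_BYTES, rate):.1f} s of video")
```

Under these assumptions the buffer covers roughly 16 seconds of raw video, which gives the encoder some slack if it stalls briefly.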

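Why hevc_nvenc? Because I wanted to tinker and keep the GPU busy. The cost is that fewer pixel formats are supported: the yuv422p10le I had my heart set on would require libx265.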

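I originally wanted yuv444p16le, but Windows does not support it out of the box, and I was afraid that nobody would be able to use the dataset if the format were too exotic.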

Pitfalls:

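1. Be clear about which ffmpeg options modify what. Options that apply to the input must go before -i; encoding options are written only after the input device has been specified.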

2. -isync [num]: synchronization between inputs. The official ffmpeg documentation describes it as follows:

-isync input_index (input)

Assign an input as a sync source.

This will take the difference between the start times of the target and reference inputs and offset the timestamps of the target file by that difference. The source timestamps of the two inputs should derive from the same clock source for expected results. If copyts is set then start_at_zero must also be set. If either of the inputs has no starting timestamp then no sync adjustment is made.

Acceptable values are those that refer to a valid ffmpeg input index. If the sync reference is the target index itself or -1, then no adjustment is made to target timestamps. A sync reference may not itself be synced to any other input.

Default value is -1.

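This option is meant for multiple inputs, i.e. -f dshow -i "" -f dshow -i "", rather than -f dshow -i "video=XXX:audio=XXX". The latter form synchronizes timestamps by default, meaning that ffmpeg computes the offset from the initial timestamps.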
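As an illustration of the two-input form, here is a sketch that builds such a command with the audio input synced to the video input. It is untested on the Kinect; the function name build_two_input_cmd is made up for this example, the -map choices are my assumption, and -isync needs a reasonably recent ffmpeg build.

```python
def build_two_input_cmd(video_dev, audio_dev, out_path, sync_ref=0):
    """Return an ffmpeg argv list with two dshow inputs; input 1 syncs to sync_ref."""
    # Input 0: video only. This is the sync reference.
    video_in = ["-f", "dshow", "-i", f"video={video_dev}"]
    # Input 1: audio only. -isync is an input option, so it goes before this -i.
    audio_in = ["-isync", str(sync_ref),
                "-f", "dshow", "-i", f"audio={audio_dev}"]
    # Pick video from input 0 and audio from input 1.
    mapping = ["-map", "0:v", "-map", "1:a"]
    return ["ffmpeg", *video_in, *audio_in, *mapping, out_path]


cmd = build_two_input_cmd("Kinect V2 Video Sensor",
                          "麦克风阵列 (2- Xbox NUI Sensor)",
                          "two_inputs.mkv")
print(cmd)
```

Per the documentation quoted above, the adjustment only gives the expected result when both inputs take their timestamps from the same clock source, which is exactly what is in doubt with this device.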

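3. -copyts: copy timestamps. It is mentioned both in the official documentation and in this article, but the two descriptions contradict each other.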

Official: -copyts

Do not process input timestamps, but keep their values without trying to sanitize them. In particular, do not remove the initial start time offset value.

Note that, depending on the vsync option or on specific muxer processing (e.g. in case the format option avoid_negative_ts is enabled) the output timestamps may mismatch with the input timestamps even when this option is selected.

Community (the quote writes the flag as "-copy_ts"; the actual option name is -copyts):

Also this note that the input string is in the format video=<video device name>:audio=<audio device name>. It is possible to have two separate inputs (like -f dshow -i audio=foo -f dshow -i video=bar) though some limited tests had shown a difference in synchronism between the two options at times. Possibly you can overcome it using the "-copy_ts" flag. The reason this works is that each "input" is assumed to start "at its first input time" and FFmpeg, by default, basically normalizes it "from its first input" as meaning "0.0 seconds." Because ffmpeg is using two different dshow inputs, it basically starts one up, then starts up the second *after* so it might start sending in packets a fraction of a second later, and FFmpeg happily treats its "later starting" timestamps as also 0.0 so mixing them doesn't work well if they start off set. So if you use -copy_ts then it will start them with "relative to machine start time" timestamps which should be able to mix accurately in theory. Ping me if you want it fixed to come more than one audio and one video in an input and thus not need these work arounds rogerdpack@gmail.com

Synchronizing

The "copyts" flag might be useful to helping streams keep their input timestamps. Especially if you have multiple "-f dshow -i XXX -f dshow -i YYY" style inputs, the latter capture graph might get started up slightly after the former. If you desire to have more than "2 inputs, one audio, one video" to increase synconicity please request so.

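With this option enabled, the timestamps of the recorded video were completely broken, even for a recording of only ten seconds, and the audio/video desync was not fixed. I also noticed that the audio stream's metadata reported "Delay relative to video: -652 ms"; after manually shifting the audio by that amount, the streams were in sync. Why does this happen?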
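The post does not say how the manual adjustment was made. One common way to do it, shown here purely as an assumption, is to remux the file with ffmpeg's -itsoffset input option, reading the same file twice and shifting only the copy that supplies the audio. The function name build_audio_shift_cmd is made up, and the sign of the shift has to be found by trying it on the actual recording.

```python
def build_audio_shift_cmd(src, dst, audio_shift_s):
    """Return an ffmpeg argv list that remuxes src with its audio shifted.

    A positive audio_shift_s delays the audio relative to the video;
    a negative value moves it earlier.
    """
    return [
        "ffmpeg",
        "-i", src,                              # input 0: supplies the video
        "-itsoffset", f"{audio_shift_s:.3f}",   # applies to the next input only
        "-i", src,                              # input 1: supplies the audio
        "-map", "0:v", "-map", "1:a",
        "-c", "copy",                           # remux without re-encoding
        dst,
    ]


# 652 ms is the magnitude reported in the metadata; the direction is a guess.
cmd = build_audio_shift_cmd("rawvideo.mkv", "synced.mkv", 0.652)
print(cmd)
```

Because the streams are copied rather than re-encoded, trying the opposite sign costs only a few seconds.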

In any case, the takeaway is that the Kinect's timestamps are unreliable.

4. -use_video_device_timestamps false/true

I then found this option. The official description:

If set to false, the timestamp for video frames will be derived from the wallclock instead of the timestamp provided by the capture device. This allows working around devices that provide unreliable timestamps.

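In other words, video frames are stamped with the computer's clock instead of the video device's. It works, and that settles the problem.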

However, the "Delay relative to video" value in the metadata got even larger: 1.6 s.

A few more commands I have been using recently, noted here for reference:

ffmpeg -encoders

Lists the available encoders.

ffmpeg -h encoder=hevc_nvenc

Shows the help for a specific encoder.

ffmpeg -list_devices true -f dshow -i dummy

Lists the available dshow devices.

ffmpeg -list_options true -f dshow -i video="Kinect V2 Video Sensor"

Lists the capture modes the device supports.
