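Build environment: OS: Windows 10
Toolchain: MinGW
I have recently been working on video stream transcoding and wanted to use the GPU to raise the transcoding speed.
I originally planned to use the CUDA video SDK, but found that it only handles the video part and cannot process audio,
so I transcode with ffmpeg directly and use the GPU for acceleration. I had not done this before and ran into some problems, so I am recording the process here.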
-----------------------------------------------------------------------------------------------------------------------------
I followed these two tutorials:
For ffmpeg on Windows, see: http://blog.chinaunix.net/uid-20718335-id-2980793.html
The GPU part is mainly based on: http://www.cnx-software.com/2016/01/04/faster-h-265hevc-video-encoding-with-nvidia-gtx960-gpu-and-ffmpeg/
-------------------------------------------------------------------------------------------------------------------------------
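Detailed steps: to be added later.
Three things to watch out for:
1. Copy the video SDK header files into the corresponding msys directory:
    cp ../nvidia_video_sdk_6.0.1/Samples/common/inc/*.h yourdir/MinGW/msys/1.0/local/include/
2. When running ./configure, you must add --enable-nvenc --enable-nonfree.
3. The two parameters
    --extra-cflags=-Id:/ffmpeg/include
    --extra-ldflags=-Ld:/ffmpeg/lib
I mistakenly thought these pointed to the directory where the finished ffmpeg is installed. They actually point to the headers and libraries that ffmpeg needs at compile time. After a normal msys build those live under yourdir/MinGW/msys/1.0/local/. If you use the d:/ffmpeg paths above, you have to copy include and lib from yourdir/MinGW/msys/1.0/local/ to d:/ffmpeg/include and d:/ffmpeg/lib.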
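To make the third note concrete, here is a minimal sketch that derives both flags from a single dependency prefix. The helper name `deps_flags` is hypothetical and only illustrates the idea; the paths are the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the --extra-cflags/--extra-ldflags pair for a
# directory that holds the *dependencies* (headers and libraries) of ffmpeg.
# This is not the directory where the finished ffmpeg binary gets installed.
deps_flags() {
    deps_prefix=$1
    printf '%s %s\n' \
        "--extra-cflags=-I${deps_prefix}/include" \
        "--extra-ldflags=-L${deps_prefix}/lib"
}

# Example: point configure at the msys local tree directly.
deps_flags "yourdir/MinGW/msys/1.0/local"
```

Pointing the flags at the msys local tree this way should avoid the extra copy to d:/ffmpeg, assuming all dependencies were installed there.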
H.265 promises the same video quality as H.264 at half the bitrate, so you may have thought about converting your H.264 videos to H.265/HEVC in order to reduce the space they use. However, if you’ve ever tried transcoding videos with tools such as HandBrake, you’ll know the process can be painfully slow, and a single movie may take several hours even on a machine with a powerful processor. There is a better and faster solution thanks to the hardware-accelerated encoding available in some Intel and Nvidia graphics cards. For this purpose, GearBest sent me a Maxsun MS-GTX960 graphics card, a second-generation Maxwell GPU that supports H.265 accelerated video encoding and promises up to 500 fps video encoding. So I’ve put the graphics card to the test in a computer running Ubuntu 14.04, and report some of my findings here. Similar instructions can also be followed in Windows.
In order to leverage Nvidia Maxwell 2 GPU capabilities you’ll need to download and install Nvidia Video Codec SDK. The latest version (6.0.1) requires Nvidia Drivers 358.xx or greater, and my system had version 352.xx, so I followed some instructions to install the latest drivers in Ubuntu 14.04.
    sudo add-apt-repository ppa:graphics-drivers/ppa
    sudo apt-get update
    sudo apt-get install nvidia-358 nvidia-settings
Upon restart I had the latest 358.16 drivers installed.
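The version requirement can be checked before installing the SDK. The following is a small sketch, not part of the original instructions: `driver_ok` is a hypothetical helper that compares only the major number of a driver version string against the 358 minimum quoted above.

```shell
#!/bin/sh
# Succeed when an NVIDIA driver version string such as "358.16" has a major
# number of at least the required one (358 for Video Codec SDK 6.0.1).
driver_ok() {
    version=$1
    required_major=${2:-358}
    major=${version%%.*}          # text before the first dot
    [ "$major" -ge "$required_major" ]
}

# The two versions mentioned in the text.
for v in 352.xx 358.16; do
    if driver_ok "$v"; then
        echo "$v: new enough for SDK 6.0.1"
    else
        echo "$v: too old, upgrade the driver"
    fi
done
```

On a machine with the driver installed, the version string is the one shown in the header of the nvidia-smi output.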
Somehow the fonts were very small right after installation as xorg.conf was missing, so I recreated it with the command:
    sudo nvidia-xconfig --no-use-edid-dpi
Then I adjusted the font sizes further with Unity Tweak Tool.
The next step is to download and extract nvidia_video_sdk_6.0.1.zip into a working directory:
    unzip nvidia_video_sdk_6.0.1.zip
    cd nvidia_video_sdk_6.0.1
The instructions in the Readme simply tell you to go to Samples directory, and type make in order to build the samples, but I had to do a few more steps:
    sudo apt-get install libxmu-dev freeglut3 freeglut3-dev
    export LDFLAGS="-L /usr/lib/nvidia-358/"
I also had to modify Samples/NvTranscoder/Makefile to replace := by += in front of LDFLAGS.
    ifeq ($(OS_SIZE),32)
        LDFLAGS += -L/usr/lib64 -lnvidia-encode -ldl -lpthread
        CCFLAGS := -m32
    else
        LDFLAGS += -L/usr/lib64 -lnvidia-encode -ldl -lpthread
        CCFLAGS := -m64
    endif
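The reason for the change is that a `:=` assignment inside the Makefile replaces the LDFLAGS value exported in the shell, while `+=` appends to it. The edit itself can be sketched as a sed substitution. `ldflags_append` is a hypothetical helper name, and the "before" lines in the demonstration are inferred from the description above rather than copied from the SDK.

```shell
#!/bin/sh
# Rewrite "LDFLAGS :=" as "LDFLAGS +=" and leave every other assignment alone.
# Reads Makefile text on stdin and writes the patched text to stdout.
ldflags_append() {
    sed 's/^\([[:space:]]*LDFLAGS[[:space:]]*\):=/\1+=/'
}

# Demonstration on a fragment shaped like the NvTranscoder Makefile.
printf '%s\n' \
    'ifeq ($(OS_SIZE),32)' \
    '    LDFLAGS := -L/usr/lib64 -lnvidia-encode -ldl -lpthread' \
    '    CCFLAGS := -m32' \
    'endif' | ldflags_append
```

Only the LDFLAGS line changes; the CCFLAGS assignment keeps its `:=`.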
and finally I could successfully build the samples:
    cd Samples
    make
There are several samples in the SDK: NvEncoder, NvEncoderCudaInterop, NvEncoderD3DInterop, NvEncoderLowLatency, NvEncoderPerf, NvTranscoder, NvDecodeD3D9, and NvDecodeGL. For the purpose of this post I used NvTranscoder to convert H.264 video to H.265 using the GPU.
At first I had some issues with the error:
    cuInit(0, __CUDA_API_VERSION, hHandleDriver) has returned CUDA error 999
I followed a workaround provided on Blender, and it did not work at first, but after using NvTranscoder with sudo once, I could use the tool as a normal user thereafter.
Here’s the output when transcoding an H.264 1080p video with the High Quality preset.
    time ./NvTranscoder -i h264_1080p_sample.m4v -o h265_1080p_sample.ts -codec 1 -preset hq
    Encoding input           : "h264_1080p_sample.m4v"
             output          : "h265_1080p_sample.ts"
             codec           : "HEVC"
             size            : 1920x1088
             bitrate         : 5000000 bits/sec
             vbvMaxBitrate   : 0 bits/sec
             vbvSize         : 0 bits
             fps             : 90000 frames/sec
             rcMode          : CONSTQP
             goplength       : INFINITE GOP
             B frames        : 0
             QP              : 28
             preset          : HQ_PRESET

    Total time: 31314.338000ms, Decoded Frames: 4901, Encoded Frames: 4901, Average FPS: 156.509775

    real    0m31.959s
    user    0m33.667s
    sys     0m1.429s
The video lasts 2 minutes 43 seconds (4901 frames in total), and encoding was done in about 32 seconds meaning about 5 times faster than real-time, and at 156.5 fps on average.
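The two figures quoted here follow directly from the log. As a quick sanity check, this sketch recomputes them with awk; `ratio` is a hypothetical helper, and the inputs are the numbers from the run above.

```shell
#!/bin/sh
# ratio NUM DEN: print NUM / DEN rounded to one decimal place.
ratio() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", n / d }'
}

# Speed relative to real time: 2 min 43 s = 163 s of video, 31.959 s wall clock.
echo "speed-up: $(ratio 163 31.959)x"

# Average frame rate: 4901 frames in 31.314338 s of encoder time.
echo "average:  $(ratio 4901 31.314338) fps"
```

This prints a speed-up of 5.1x and an average of 156.5 fps, matching the encoder's own report.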
I repeated the same test with the High Performance preset.
    time ./NvTranscoder -i h264_1080p_sample.m4v -o h265_1080p_sample_fast.ts -codec 1 -preset hp
    Encoding input           : "h264_1080p_sample.m4v"
             output          : "h265_1080p_sample_fast.ts"
             codec           : "HEVC"
             size            : 1920x1088
             bitrate         : 5000000 bits/sec
             vbvMaxBitrate   : 0 bits/sec
             vbvSize         : 0 bits
             fps             : 90000 frames/sec
             rcMode          : CONSTQP
             goplength       : INFINITE GOP
             B frames        : 0
             QP              : 28
             preset          : HP_PRESET

    Total time: 23886.508000ms, Decoded Frames: 4901, Encoded Frames: 4901, Average FPS: 205.178589

    real    0m24.433s
    user    0m26.159s
    sys     0m1.104s
Transcoding took around 24 seconds at 205 fps. It looked pretty good, but I tried the same test with HandBrake using H.265 with RF quality set to 25, and it took 4 minutes and 30 seconds to encode the video, or about 9 times slower than with the GPU. For reference, my computer is based on an AMD FX8350 octa-core processor clocked at 4.0 GHz.
But when I tried to play the videos, I could not find any tool that would play them: NvTranscoder appears to generate raw H.265 video data. As I did not want to write my own little program, I looked further and found that ffmpeg also supports nvenc, just not by default, so you have to compile it yourself.
There are instructions to build ffmpeg with nvenc in Ubuntu 15.10, but they did not work on Ubuntu 14.04 so I mixed those with ffmpeg Ubuntu compilation guide to build it for my computer.
First we’ll need to install some dependencies and create a working directory:
    sudo apt-get -y --force-yes install autoconf automake build-essential libass-dev libfreetype6-dev \
      libsdl1.2-dev libtheora-dev libtool libva-dev libvdpau-dev libvorbis-dev libxcb1-dev libxcb-shm0-dev \
      libxcb-xfixes0-dev pkg-config texinfo zlib1g-dev yasm
    mkdir ffmpeg_sources
You’ll also need to download and install/compile some extra packages depending on the codecs we want to enable. I’ll skip H.264 and H.265 since this will be handled by Nvidia GPU instead, and will enable AAC and MP3 audio encoders, VP8/VP9 and XviD video decoders and encoders, and libopus decoder and encoder as explained in the building guide:
    cd ffmpeg_sources
    wget -O fdk-aac.tar.gz https://github.com/mstorsjo/fdk-aac/tarball/master
    tar xzvf fdk-aac.tar.gz
    cd mstorsjo-fdk-aac*
    autoreconf -fiv
    ./configure --prefix="$HOME/ffmpeg_build" --disable-shared
    make
    make install
    make distclean
    cd ..
    sudo apt-get install libmp3lame-dev
    wget http://storage.googleapis.com/downloads.webmproject.org/releases/webm/libvpx-1.5.0.tar.bz2
    tar xjvf libvpx-1.5.0.tar.bz2
    cd libvpx-1.5.0
    PATH="$HOME/bin:$PATH" ./configure --prefix="$HOME/ffmpeg_build" --disable-examples --disable-unit-tests
    PATH="$HOME/bin:$PATH" make
    make install
    make clean
    cd ..
    sudo apt-get install libxvidcore-dev
    sudo apt-get install libopus-dev
Now I’ll download and extract ffmpeg snapshot (January 3, 2016) and copy the required NVENC 6.0 SDK header files into /usr/local/include:
    wget http://ffmpeg.org/releases/ffmpeg-snapshot.tar.bz2
    tar xjvf ffmpeg-snapshot.tar.bz2
    sudo cp ../nvidia_video_sdk_6.0.1/Samples/common/inc/*.h /usr/local/include/
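If the headers are missing, the nvenc part of the build cannot succeed, so it can help to check for them before going further. This is a minimal sketch: `find_header` is a hypothetical helper, and it assumes that nvEncodeAPI.h is the NVENC header among the files copied from Samples/common/inc.

```shell
#!/bin/sh
# find_header NAME DIR...: print the first DIR that contains NAME.
# Returns non-zero when none of the directories has it.
find_header() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Look in the directory used above, then in the system default.
if dir=$(find_header nvEncodeAPI.h /usr/local/include /usr/include); then
    echo "nvEncodeAPI.h found in $dir"
else
    echo "nvEncodeAPI.h not found; copy the SDK headers before running configure"
fi
```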
Then configure and build ffmpeg with nvenc enabled:
    cd ffmpeg
    PATH="$HOME/bin:$PATH" PKG_CONFIG_PATH="$HOME/ffmpeg_build/lib/pkgconfig" ./configure \
      --prefix="$HOME/ffmpeg_build" \
      --pkg-config-flags="--static" \
      --extra-cflags="-I$HOME/ffmpeg_build/include" \
      --extra-ldflags="-L$HOME/ffmpeg_build/lib" \
      --bindir="$HOME/bin" \
      --enable-gpl \
      --enable-libass \
      --enable-libfdk-aac \
      --enable-libfreetype \
      --enable-libmp3lame \
      --enable-libopus \
      --enable-libtheora \
      --enable-libvorbis \
      --enable-libvpx \
      --enable-nvenc \
      --enable-libxvid \
      --enable-nonfree
    make -j9
You can also optionally install it (which I did):
    sudo make install
This will install it in $HOME/bin/ffmpeg. Now we can verify nvenc support for H.264 and H.265 is enabled:
    ffmpeg -codecs | grep nvenc
    ffmpeg version N-77671-g97c162a Copyright (c) 2000-2016 the FFmpeg developers
      built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04)
      configuration: --prefix=/home/jaufranc/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/jaufranc/ffmpeg_build/include --extra-ldflags=-L/home/jaufranc/ffmpeg_build/lib --bindir=/home/jaufranc/bin --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-nvenc --enable-nonfree --enable-libxvid
      libavutil      55. 12.100 / 55. 12.100
      libavcodec     57. 21.100 / 57. 21.100
      libavformat    57. 21.100 / 57. 21.100
      libavdevice    57.  0.100 / 57.  0.100
      libavfilter     6. 23.100 /  6. 23.100
      libswscale      4.  0.100 /  4.  0.100
      libswresample   2.  0.101 /  2.  0.101
      libpostproc    54.  0.100 / 54.  0.100
     DEV.LS h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (decoders: h264 h264_vdpau ) (encoders: nvenc nvenc_h264 )
     DEV.L. hevc                 H.265 / HEVC (High Efficiency Video Coding) (encoders: nvenc_hevc )
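The same check can be scripted so that a build without nvenc is caught early. This is a sketch under stated assumptions: `has_encoder` is a hypothetical helper, and the encoder names are the ones this 2016 snapshot reports (newer FFmpeg releases list them as h264_nvenc and hevc_nvenc instead).

```shell
#!/bin/sh
# has_encoder NAME: read "ffmpeg -codecs" style text on stdin and succeed
# when NAME appears as a whole word.
has_encoder() {
    grep -qw -- "$1"
}

# Check against the hevc line from the listing above.
sample=' DEV.L. hevc   H.265 / HEVC (High Efficiency Video Coding) (encoders: nvenc_hevc )'
if printf '%s\n' "$sample" | has_encoder nvenc_hevc; then
    echo "sample listing: nvenc_hevc available"
fi

# Check a real build; this part only runs when ffmpeg is on PATH.
if command -v ffmpeg >/dev/null 2>&1; then
    if ffmpeg -hide_banner -codecs 2>/dev/null | has_encoder nvenc_hevc; then
        echo "installed ffmpeg: nvenc_hevc available"
    else
        echo "installed ffmpeg: nvenc_hevc missing"
    fi
fi
```

Matching whole words matters here: a plain search for nvenc would also hit nvenc_hevc and nvenc_h264.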
Perfect. Time for a test with our 1080p H.264 video sample, and encoding at 2000 kbps.
    time ffmpeg -i h264_1080p_sample.m4v -vcodec nvenc_hevc -b:v 2000k h265_1080p_sample.mkv
    ffmpeg version N-77671-g97c162a Copyright (c) 2000-2016 the FFmpeg developers
      built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04)
      configuration: --prefix=/home/jaufranc/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/jaufranc/ffmpeg_build/include --extra-ldflags=-L/home/jaufranc/ffmpeg_build/lib --bindir=/home/jaufranc/bin --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-nvenc --enable-nonfree --enable-libxvid
      libavutil      55. 12.100 / 55. 12.100
      libavcodec     57. 21.100 / 57. 21.100
      libavformat    57. 21.100 / 57. 21.100
      libavdevice    57.  0.100 / 57.  0.100
      libavfilter     6. 23.100 /  6. 23.100
      libswscale      4.  0.100 /  4.  0.100
      libswresample   2.  0.101 /  2.  0.101
      libpostproc    54.  0.100 / 54.  0.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'h264_1080p_sample.m4v':
      Metadata:
        major_brand     : mp42
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        creation_time   : 2015-12-29 10:35:15
        title           : MVI_0820
        encoder         : HandBrake 7412svn 2015082501
      Duration: 00:02:43.53, start: 0.000000, bitrate: 5870 kb/s
        Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 5703 kb/s, 29.97 fps, 29.97 tbr, 90k tbn, 180k tbc (default)
        Metadata:
          creation_time   : 2015-12-29 10:35:15
          handler_name    : VideoHandler
        Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 160 kb/s (default)
        Metadata:
          creation_time   : 2015-12-29 10:35:15
          handler_name    : Stereo
    Output #0, matroska, to 'h265_1080p_sample.mkv':
      Metadata:
        major_brand     : mp42
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        title           : MVI_0820
        encoder         : Lavf57.21.100
        Stream #0:0(und): Video: hevc (nvenc_hevc) (Main), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 29.97 fps, 1k tbn, 29.97 tbc (default)
        Metadata:
          creation_time   : 2015-12-29 10:35:15
          handler_name    : VideoHandler
          encoder         : Lavc57.21.100 nvenc_hevc
        Side data:
          unknown side data type 10 (24 bytes)
        Stream #0:1(eng): Audio: vorbis (libvorbis) (oV[0][0] / 0x566F), 48000 Hz, stereo, fltp (default)
        Metadata:
          creation_time   : 2015-12-29 10:35:15
          handler_name    : Stereo
          encoder         : Lavc57.21.100 libvorbis
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (native) -> hevc (nvenc_hevc))
      Stream #0:1 -> #0:1 (aac (native) -> vorbis (libvorbis))
    Press [q] to stop, [?] for help
    ....
    video:39825kB audio:2327kB subtitle:0kB other streams:0kB global headers:4kB muxing overhead: 0.295402%

    real    0m30.338s
    user    1m19.296s
    sys     0m1.555s
It took 30 seconds, or about the same time as with NvTranscoder, but this time I had a watchable video with audio, and I could not notice any visual quality degradation.
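The stream mapping in the log shows that the audio was re-encoded from AAC to Vorbis on the way. If that is not wanted, ffmpeg can copy the audio stream unchanged with `-c:a copy`. The sketch below only prints such a command rather than running it; `transcode_cmd` is a hypothetical helper and does not quote file names that contain spaces.

```shell
#!/bin/sh
# transcode_cmd INPUT OUTPUT BITRATE: print (do not run) an ffmpeg command
# that encodes video with nvenc_hevc and copies the audio stream as-is.
transcode_cmd() {
    printf 'ffmpeg -i %s -c:v nvenc_hevc -b:v %s -c:a copy %s\n' "$1" "$3" "$2"
}

transcode_cmd h264_1080p_sample.m4v h265_1080p_sample.mkv 2000k
```

Here `-c:v` is the newer spelling of the `-vcodec` option used in the test above.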
I repeated the test with an H.264 1080p movie lasting 1 hour 57 minutes 29 seconds. The movie’s H.264 stream was encoded at 2150 kbps, so to decrease the file size by half I encoded the movie at 1075 kbps (-b:v 1075k option). The encoding only took 13 minutes and 12 seconds, or about 9 times faster than real-time, at 218 fps.
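The choice of 1075 kbps and the resulting size can be estimated up front from size = bitrate × duration / 8. The following sketch does the arithmetic with awk; both helper names are hypothetical, and the estimate covers the video stream only, so audio and container overhead come on top.

```shell
#!/bin/sh
# half_bitrate KBPS: target bitrate for halving the video stream size.
half_bitrate() {
    awk -v b="$1" 'BEGIN { printf "%d\n", b / 2 }'
}

# stream_size_mb KBPS SECONDS: approximate stream size in MB (10^6 bytes).
stream_size_mb() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b * 1000 * s / 8 / 1000000 }'
}

duration=$((1 * 3600 + 57 * 60 + 29))   # 1 h 57 min 29 s = 7049 s
target=$(half_bitrate 2150)
echo "target bitrate: ${target}k"
echo "estimated video stream: $(stream_size_mb "$target" "$duration") MB"
```

For this movie the estimate comes to roughly 947 MB of video at 1075 kbps.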
I also checked some GPU details during the transcoding:
    nvidia-smi
    Mon Jan  4 11:57:40 2016
    +------------------------------------------------------+
    | NVIDIA-SMI 358.16     Driver Version: 358.16         |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 960     Off  | 0000:01:00.0      On |                  N/A |
    | 42%   38C    P2    33W / 120W |    658MiB /  2047MiB |     10%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |    0      1705    G   /usr/bin/X                                     265MiB |
    |    0      2582    G   compiz                                         117MiB |
    |    0     10267    G   /usr/lib/firefox/plugin-container               22MiB |
    |    0     22818    G   totem                                           35MiB |
    |    0     22917    C   ffmpeg                                         200MiB |
    +-----------------------------------------------------------------------------+
This shows, for example, that transcoding does not max out the GPU power consumption (P2 mode: 33 Watts). My processor load was however a bit higher than expected, although not at 100% all the time as would have been the case for software video transcoding.
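To put the power figure in proportion, the snapshot's 33 W can be expressed as a share of the 120 W cap. This is a small sketch with a hypothetical helper, `power_pct`, using the two numbers from the nvidia-smi table.

```shell
#!/bin/sh
# power_pct DRAW_W CAP_W: power draw as a percentage of the board power cap.
power_pct() {
    awk -v d="$1" -v c="$2" 'BEGIN { printf "%.1f\n", 100 * d / c }'
}

# Figures from the nvidia-smi snapshot: 33 W drawn out of a 120 W cap.
# On a machine with the driver installed, both values can also be read with:
#   nvidia-smi --query-gpu=power.draw,power.limit --format=csv,noheader,nounits
echo "power: $(power_pct 33 120)% of cap"
```

That is 27.5% of the cap while the hardware encoder is busy.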
Besides saving time, transcoding videos with a GPU should also reduce your electricity bill. How much exactly will depend on your video library size, electricity rate, and overall computer power consumption.
While the original file size was 2.0GB, the H.265 video was only 985 MB large, and video quality appeared to be very close to the one of the original video.
Finally, I transcoded a 4K H.264 video @ 30 fps (big_buck_bunny_4k_H264_30fps.mp4) at slightly less than half the bitrate (3500 kbps for H.265 vs 7480 kbps for H.264), and it took 6 minutes and 56 seconds to encode the 10 minutes 30 seconds video. While checking quality, the main problem was that my computer struggled to cope with the H.265 4K video in the Totem and VLC video players, with lots of artifacts at times and sound cuts, but the videos played just fine with ffplay and Kodi.
I’d like to thank GearBest for providing the Maxsun MS-GTX960 graphics card, which sells for $240.04 on their website.