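I had long heard about H.265's compression efficiency and wanted to use it in our project, so I spent a few hours evaluating whether x265 is suitable for use on Android. The conclusion is that it is not: CPU usage is too high and the frame rate is very low. I then looked at several CPUs on Qualcomm's site, and for now they all support H.265 only as hardware decoding with software encoding. Since we currently need encoding more than decoding, we have to give up on H.265 for the time being.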
Linux test
- Test CPU
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 60
Model name: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
Stepping: 3
CPU MHz: 3593.250
CPU max MHz: 3600.0000
CPU min MHz: 800.0000
BogoMIPS: 6400.21
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
- Installing x265 on Ubuntu
apt-get install x265
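- Test parameters
The ultrafast preset is used, with B-frames disabled, a bitrate of 4 Mbps (4000 kbps), 1080p resolution, and a frame rate of 50 (the frame rate setting does not affect encoding performance).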
x265 -p ultrafast --bframes 0 --bitrate 4000 --input-res 1920x1080 --fps 50 i420_1920x1080_50.yuv -o out.h265
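As a sanity check on the raw input, the expected size of an 8-bit I420 file follows directly from the resolution and the frame count. The sketch below is only illustrative: the helper names are mine (not part of x265), it assumes even dimensions, and it uses the 501 frames that x265 reports for this clip.

```python
def i420_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit I420 frame: a full-size Y plane plus
    quarter-size U and V planes (assumes even width and height)."""
    luma = width * height
    chroma = (width // 2) * (height // 2)
    return luma + 2 * chroma


def i420_file_bytes(width: int, height: int, frames: int) -> int:
    """Expected size of a headerless I420 file with the given frame count."""
    return i420_frame_bytes(width, height) * frames


if __name__ == "__main__":
    print(i420_frame_bytes(1920, 1080))      # 3110400
    print(i420_file_bytes(1920, 1080, 501))  # 1558310400
```

If the file on disk is not a multiple of the per-frame size, the --input-res value probably does not match the actual content.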
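- Test analysis
- x265 uses x86 assembly acceleration (using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2).
- The average encoding frame rate on the i5-4570 is only 16.64 fps, whereas hardware H.264 encoding on a phone easily reaches 60 fps.
- CPU usage is extremely high (about 346% in top, i.e. most of all four cores), so the machine is of little use for anything else while the encoder runs.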
yuv [info]: 1920x1080 fps 50000/1000 i420p8 frames 0 - 500 of 501
x265 [info]: HEVC encoder version 1.5
x265 [info]: build info [Linux][GCC 4.9.2][64 bit] 8bpp
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2
x265 [info]: Main profile, Level-4.1 (Main tier)
x265 [info]: WPP streams / frame threads / pool : 34 / 2 / 4
x265 [info]: CTU size / RQT depth inter / intra : 32 / 1 / 1
x265 [info]: ME / range / subpel / merge : dia / 25 / 0 / 2
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 0
x265 [info]: Lookahead / bframes / badapt : 10 / 0 / 0
x265 [info]: b-pyramid / weightp / weightb / refs: 0 / 0 / 0 / 1
x265 [info]: Rate Control / AQ-Strength / CUTree : ABR-4000 kbps / 0.0 / 0
x265 [info]: tools: rd=2 psy-rd=0.30 early-skip deblock fast-intra tmvp
x265 [info]: frame I: 3, Avg QP:34.18 kb/s: 16426.80
x265 [info]: frame P: 498, Avg QP:34.74 kb/s: 3945.84
x265 [info]: global : 501, Avg QP:34.74 kb/s: 4020.58
x265 [info]: consecutive B-frames: 100.0%
encoded 501 frames in 30.10s (16.64 fps), 4020.58 kb/s
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12164 liu 20 0 764752 282116 4840 S 346.3 3.5 1:28.37 x265
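To put the numbers in the log into perspective, the summary line can be parsed and compared with the source frame rate. This is a minimal sketch: the function names are my own, the regular expression only assumes the summary format shown in the log above, and the 50 fps, 346.3 %CPU and 4 cores are taken from this test.

```python
import re


def parse_summary(line: str):
    """Extract (frames, seconds, fps, kbps) from an x265 summary line
    such as 'encoded 501 frames in 30.10s (16.64 fps), 4020.58 kb/s'."""
    m = re.search(
        r"encoded (\d+) frames in ([\d.]+)s \(([\d.]+) fps\), ([\d.]+) kb/s",
        line,
    )
    if m is None:
        raise ValueError("not an x265 summary line: %r" % line)
    return int(m.group(1)), float(m.group(2)), float(m.group(3)), float(m.group(4))


def realtime_factor(encode_fps: float, source_fps: float) -> float:
    """Values below 1.0 mean the encoder cannot keep up with the source."""
    return encode_fps / source_fps


def cpu_share(top_percent: float, cores: int) -> float:
    """Fraction of the whole machine used, given top's per-process %CPU."""
    return top_percent / (100.0 * cores)


if __name__ == "__main__":
    frames, seconds, fps, kbps = parse_summary(
        "encoded 501 frames in 30.10s (16.64 fps), 4020.58 kb/s"
    )
    print(round(realtime_factor(fps, 50), 2))  # 0.33
    print(cpu_share(346.3, 4))                 # roughly 0.87
```

In other words, the desktop encodes at about a third of real time for a 50 fps source while using most of all four cores.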
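Android test
Not ready to give up, I went on to test on Android. The test phone is an SM-G9009W, whose CPU is the Qualcomm 8974.
- Codec capabilities of the 8974
In the tables below, hevc appears only among the decoders and only up to 1080p30, so there is a reason H.265 is not yet widespread.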
8974 Encoder capabilities
______________________________________________________
| Codec | W H fps Mbps MB/s |
|__________|_________________________________________|
| h264 | 3840 2160 30 100 972000 |
| | 4096 2160 24 100 829440 |
| mpeg4 | 1920 1088 30 40 244800 |
| vp8 | 1920 1088 30 20 244800 |
| h263 | 864 480 30 2 48600 |
|__________|_________________________________________|
8974 Decoder capabilities
______________________________________________________
| Codec | W H fps Mbps MB/s |
|__________|_________________________________________|
| h264 | 3840 2160 30 100 972000 |
| | 4096 2160 24 100 829440 |
| hevc | 1920 1088 30 6 244800 |
| mpeg4 | 1920 1088 60 60 489600 |
| vc1 | 1920 1088 60 60 489600 |
| vp8 | 3840 2160 30 20 972000 |
| divx3 | 720 480 30 2 40500 |
| div4/5/6 | 1920 1088 30 10 244800 |
| h263 | 864 480 30 2 48600 |
| mpeg2 | 1920 1088 30 40 244800 |
|__________|_________________________________________|
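The MB/s column appears to be a macroblock rate, not megabytes per second. That reading is my assumption, but with 16x16 macroblocks it reproduces the listed values, and it also explains why 1080p is listed with a height of 1088 (a multiple of 16). A minimal sketch, with a helper name of my own:

```python
def macroblocks_per_second(width: int, height: int, fps: int, mb_size: int = 16) -> int:
    """Macroblock throughput for a given resolution and frame rate.
    Dimensions are rounded up to whole macroblocks (16x16 assumed)."""
    mbs_wide = -(-width // mb_size)   # ceiling division
    mbs_high = -(-height // mb_size)
    return mbs_wide * mbs_high * fps


if __name__ == "__main__":
    print(macroblocks_per_second(3840, 2160, 30))  # 972000
    print(macroblocks_per_second(4096, 2160, 24))  # 829440
    print(macroblocks_per_second(1920, 1088, 30))  # 244800
    print(macroblocks_per_second(864, 480, 30))    # 48600
```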
- Test CPU
Processor : ARMv7 Processor rev 1 (v7l)
processor : 0
BogoMIPS : 38.40
processor : 1
BogoMIPS : 38.40
processor : 2
BogoMIPS : 38.40
processor : 3
BogoMIPS : 38.40
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
CPU implementer : 0x51
CPU architecture: 7
CPU variant : 0x2
CPU part : 0x06f
CPU revision : 1
Hardware : Qualcomm MSM8974PRO-AC
Revision : 000a
Serial : 0000083000009994
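- Test parameters
The phone already had an 832x480 YUV file on it, so I used that directly for the test. The bitrate is lowered to 500 kbps and the frame rate is 30; the other parameters are the same as on Linux.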
./x265 -p ultrafast --bframes 0 --bitrate 500 --input-res 832x480 --fps 30 832_480.i420 -o out.h265
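- Test analysis
- Although the log reports using cpu capabilities: none, NEON acceleration is in fact used. NEON is not shown because only x86 capabilities are reported; see the x265_report_simd function.
- At 832x480, the frame rate is surprisingly only 9 fps.
- CPU usage is extremely high, so the phone is of little use for anything else while the encoder runs.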
mes 0 --bitrate 500 --input-res 832x480 --fps 30 832_480.i420 -o out.h265 <
yuv [info]: 832x480 fps 30000/1000 i420p8 frames 0 - 500 of 501
raw [info]: output file: out.h265
x265 [info]: HEVC encoder version X265_VERSION
x265 [info]: build info [Linux][GCC 4.9.0][32 bit][noasm] 8bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main profile, Level-3 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(15 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 32 / 16
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : dia / 57 / 0 / 2
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 0
x265 [info]: Lookahead / bframes / badapt : 5 / 0 / 0
x265 [info]: b-pyramid / weightp / weightb : 0 / 0 / 0
x265 [info]: References / ref-limit cu / depth : 1 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.0 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-500 kbps / 0.60
x265 [info]: tools: rd=2 psy-rd=2.00 early-skip rskip tmvp fast-intra
x265 [info]: tools: strong-intra-smoothing deblock
x265 [info]: frame I: 3, Avg QP:34.72 kb/s: 6350.88
x265 [info]: frame P: 498, Avg QP:37.60 kb/s: 491.82
x265 [info]: consecutive B-frames: 100.0%
encoded 501 frames in 55.63s (9.01 fps), 526.91 kb/s, Avg QP:37.58
31580 2 96% S 8 67568K 40544K shell ./x265
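Because the two tests use different resolutions, the frame rates are not directly comparable. A rough way to compare them is pixel throughput. The sketch below is a back-of-the-envelope estimate only: encoding cost does not scale exactly linearly with pixel count, and the function names are mine.

```python
def pixel_rate(width: int, height: int, fps: float) -> float:
    """Luma pixels encoded per second."""
    return width * height * fps


def projected_fps(rate: float, width: int, height: int) -> float:
    """Frame rate at another resolution, assuming constant pixel throughput."""
    return rate / (width * height)


if __name__ == "__main__":
    desktop = pixel_rate(1920, 1080, 16.64)  # i5-4570 result
    phone = pixel_rate(832, 480, 9.01)       # MSM8974 result
    print(desktop / phone)                   # roughly 9.6
    print(projected_fps(phone, 1920, 1080))  # roughly 1.7
```

Under this assumption the phone would manage less than 2 fps at 1080p, which is consistent with the conclusion that software H.265 encoding is not practical on this device.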
Building x265 for Android
- Download
hg clone https://bitbucket.org/multicoreware/x265
- Android.mk
LOCAL_PATH := $(call my-dir)
#---------- static module ----------#
COMMON_CPP_SRCS := \
common/cpu.cpp \
common/ipfilter.cpp \
common/threadpool.cpp \
common/param.cpp \
common/picyuv.cpp \
common/framedata.cpp \
common/bitstream.cpp \
common/pixel.cpp \
common/predict.cpp \
common/quant.cpp \
common/constants.cpp \
common/md5.cpp \
common/dct.cpp \
common/loopfilter.cpp \
common/primitives.cpp \
common/scalinglist.cpp \
common/piclist.cpp \
common/frame.cpp \
common/slice.cpp \
common/common.cpp \
common/threading.cpp \
common/lowres.cpp \
common/intrapred.cpp \
common/wavefront.cpp \
common/winxp.cpp \
common/shortyuv.cpp \
common/yuv.cpp \
common/deblock.cpp \
common/cudata.cpp \
common/version.cpp
COMMON_ARM_SRCS := \
common/arm/asm-primitives.cpp \
common/arm/asm.S \
common/arm/blockcopy8.S \
common/arm/cpu-a.S \
common/arm/dct-a.S \
common/arm/ipfilter8.S \
common/arm/mc-a.S \
common/arm/pixel-util.S \
common/arm/sad-a.S \
common/arm/ssd-a.S
COMMON_X86_SRCS := \
common/x86/blockcopy8.asm \
common/x86/const-a.asm \
common/x86/cpu-a.asm \
common/x86/dct8.asm \
common/x86/intrapred16.asm \
common/x86/intrapred8_allangs.asm \
common/x86/intrapred8.asm \
common/x86/ipfilter16.asm \
common/x86/ipfilter8.asm \
common/x86/loopfilter.asm \
common/x86/mc-a2.asm \
common/x86/mc-a.asm \
common/x86/pixel-32.asm \
common/x86/pixel-a.asm \
common/x86/pixeladd8.asm \
common/x86/pixel-util8.asm \
common/x86/sad16-a.asm \
common/x86/sad-a.asm \
common/x86/ssd-a.asm \
common/x86/x86inc.asm \
common/x86/x86util.asm
ENCODER_CPP_SRCS := \
encoder/analysis.cpp \
encoder/api.cpp \
encoder/bitcost.cpp \
encoder/dpb.cpp \
encoder/encoder.cpp \
encoder/entropy.cpp \
encoder/frameencoder.cpp \
encoder/framefilter.cpp \
encoder/level.cpp \
encoder/motion.cpp \
encoder/nal.cpp \
encoder/ratecontrol.cpp \
encoder/reference.cpp \
encoder/sao.cpp \
encoder/search.cpp \
encoder/sei.cpp \
encoder/slicetype.cpp \
encoder/weightPrediction.cpp
include $(CLEAR_VARS)
LOCAL_MODULE := common
LOCAL_ARM_MODE := arm
LOCAL_CFLAGS := -Wall -Wextra -Wshadow -std=gnu++98 -fPIC -Wno-array-bounds -ffast-math -fno-exceptions -fpermissive -frtti -Wno-maybe-uninitialized
LOCAL_CFLAGS += -DEXPORT_C_API=1 -DHAVE_INT_TYPES_H=1 -DHIGH_BIT_DEPTH=0 -DX265_DEPTH=8 -DX265_NS=x265 -D__STDC_LIMIT_MACROS=1 -DHAVE_STRTOK_R
LOCAL_EXPORT_CFLAGS := $(LOCAL_CFLAGS)
LOCAL_SRC_FILES := $(COMMON_CPP_SRCS)
$(info arm = $(TARGET_ARCH_ABI))
ifneq (, $(findstring $(TARGET_ARCH_ABI),armeabi armeabi-v7a))
LOCAL_CFLAGS += -DHAVE_NEON -DX265_ARCH_ARM
LOCAL_SRC_FILES += $(COMMON_ARM_SRCS)
endif
ifeq ($(TARGET_ARCH_ABI),x86)
LOCAL_CFLAGS += -UX86_64 -DX265_ARCH_X86
LOCAL_SRC_FILES += $(COMMON_X86_SRCS)
endif
LOCAL_C_INCLUDES := $(LOCAL_PATH) $(LOCAL_PATH)/common $(LOCAL_PATH)/encoder
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_C_INCLUDES)
include $(BUILD_STATIC_LIBRARY)
#---------- static module ----------#
include $(CLEAR_VARS)
LOCAL_MODULE := encoder
LOCAL_ARM_MODE := arm
LOCAL_SRC_FILES := $(ENCODER_CPP_SRCS)
LOCAL_STATIC_LIBRARIES := common
include $(BUILD_STATIC_LIBRARY)
#---------- static module ----------#
include $(CLEAR_VARS)
LOCAL_MODULE := input
LOCAL_ARM_MODE := arm
LOCAL_SRC_FILES := \
input/input.cpp \
input/y4m.cpp \
input/yuv.cpp
LOCAL_C_INCLUDES := $(LOCAL_PATH)
LOCAL_STATIC_LIBRARIES := common
include $(BUILD_STATIC_LIBRARY)
#---------- static module ----------#
include $(CLEAR_VARS)
LOCAL_MODULE := output
LOCAL_ARM_MODE := arm
LOCAL_SRC_FILES := \
output/reconplay.cpp \
output/raw.cpp \
output/y4m.cpp \
output/yuv.cpp \
output/output.cpp
LOCAL_C_INCLUDES := $(LOCAL_PATH)
LOCAL_STATIC_LIBRARIES := common
include $(BUILD_STATIC_LIBRARY)
include $(CLEAR_VARS)
LOCAL_MODULE := x265
LOCAL_ARM_MODE := arm
LOCAL_WHOLE_STATIC_LIBRARIES := encoder input output
include $(BUILD_SHARED_LIBRARY)
#---------- binary module ----------#
include $(CLEAR_VARS)
LOCAL_MODULE := x265_test
LOCAL_ARM_MODE := arm
LOCAL_SRC_FILES := x265-extras.cpp x265.cpp
LOCAL_C_INCLUDES := $(LOCAL_PATH)
LOCAL_STATIC_LIBRARIES := encoder input output
include $(BUILD_EXECUTABLE)
- Build output
- libx265.so: the shared library usable on Android
- x265_test: the test program I used above
[armeabi-v7a] SharedLibrary : libx265.so
[armeabi-v7a] Install : libx265.so => out/libs/armeabi-v7a/libx265.so
[armeabi-v7a] Compile++ thumb: x265_test <= x265-extras.cpp
[armeabi-v7a] Compile++ thumb: x265_test <= x265.cpp
[armeabi-v7a] Executable : x265_test
[armeabi-v7a] Install : x265_test => out/libs/armeabi-v7a/x265_test