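I. What is memory stomping
Memory stomping means accessing (typically writing to) an invalid address, or put more plainly, an address that does not belong to you. If that address has been allocated to another variable, the write corrupts that variable's data, which can make the program misbehave, hang, output corrupted images, and so on.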
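The effect can be shown safely with a small model. The sketch below is illustrative only: `Arena`, `writeToA`, and the 8-byte region sizes are hypothetical names and values chosen for the demonstration. Two logical owners share one 16-byte block, and an unchecked write into region A that is longer than 8 bytes silently overwrites the start of region B. Everything stays inside one array, so the demonstration itself has no undefined behavior.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout: one 16-byte block shared by two logical owners.
// Region A owns bytes [0, 8), region B owns bytes [8, 16).
struct Arena {
    std::array<std::uint8_t, 16> bytes{};
    std::uint8_t* a() { return bytes.data(); }
    std::uint8_t* b() { return bytes.data() + 8; }
};

// Writes n bytes into region A WITHOUT checking A's 8-byte capacity.
// The assert only keeps the demo inside the arena; it does not protect B.
void writeToA(Arena& arena, std::uint8_t fill, std::size_t n) {
    assert(n <= arena.bytes.size());
    std::memset(arena.a(), fill, n);
}
```

Calling `writeToA(arena, 0xAA, 12)` leaves region A looking fine while the first four bytes of region B now hold 0xAA. The owner of B sees corrupted data with no hint of who wrote it, which is why such bugs are hard to trace.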
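II. Common causes of memory stomping
- Out-of-bounds array access;
- Out-of-bounds string operations;
- Wild pointers;
- Double free;
- Incorrect pointer type casts;
- Stack overflow;
- Heap overflow;
- Use after free;
- Data read and written by multiple threads without protection;
- Calling thread-unsafe functions from multiple threads;
- Others;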
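III. How to investigate
1. Check ylog to narrow the scope of the investigation;
2. Add debug logs to narrow the scope further; print the relevant memory addresses and values;
3. Mark the stomped memory as read-only, then check the call stack to see who is writing to it;
4. Link against ASan (AddressSanitizer), a tool for detecting memory stomping, to help locate the problem;
5. Others;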
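Step 3 can be sketched with `mprotect` on Linux/Android. This is a minimal illustration under stated assumptions, not the procedure used in the case below: the function name `trapWriteToReadOnlyPage` and the `onFault` handler are hypothetical, the "victim" is a freshly mapped anonymous page, and the handler simply records the faulting address and jumps back so the example can finish. `mprotect` works on whole pages, so the protected buffer has to be page-aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <setjmp.h>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

namespace {
sigjmp_buf g_resume;                      // where to continue after a trapped write
volatile std::uintptr_t g_faultAddr = 0;  // address reported by the kernel

// Hypothetical handler: record the faulting address, then unwind to the caller.
void onFault(int /*sig*/, siginfo_t* info, void* /*ctx*/) {
    g_faultAddr = reinterpret_cast<std::uintptr_t>(info->si_addr);
    siglongjmp(g_resume, 1);
}
}  // namespace

// Returns true if a write into a read-only page was trapped inside that page.
bool trapWriteToReadOnlyPage() {
    const long pageSize = sysconf(_SC_PAGESIZE);
    assert(pageSize > 0);
    const std::size_t len = static_cast<std::size_t>(pageSize);
    void* page = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return false;

    struct sigaction action = {};
    struct sigaction oldSegv = {};
    struct sigaction oldBus = {};
    action.sa_sigaction = onFault;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    sigaction(SIGSEGV, &action, &oldSegv);
    sigaction(SIGBUS, &action, &oldBus);  // some systems report SIGBUS instead

    bool trapped = false;
    volatile char* victim = static_cast<volatile char*>(page) + 16;
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(page);
    if (mprotect(page, len, PROT_READ) == 0) {  // the page is now read-only
        if (sigsetjmp(g_resume, 1) == 0) {
            *victim = 0x5A;                     // the "stomping" write faults here
        } else {
            trapped = (g_faultAddr >= base && g_faultAddr < base + len);
        }
    }

    sigaction(SIGSEGV, &oldSegv, nullptr);      // restore the previous handlers
    sigaction(SIGBUS, &oldBus, nullptr);
    munmap(page, len);
    return trapped;
}
```

In a real investigation there is usually no need for a custom handler: once the suspect region is read-only, the offending write crashes the process, and the resulting crash dump or debugger backtrace shows the writer's call stack.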
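IV. How to avoid it
- Follow the coding standards strictly;
- Check bounds on every array access, and guard against overflow when passing numeric values;
- Check for null when allocating and using memory, and set the pointer to null after freeing it;
- Initialize variables as soon as they are defined, especially structs;
- Protect data that is accessed from multiple threads;
- Others;
Memory stomping is very expensive to track down, and a serious case can take a great deal of time and effort to locate.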
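A few of the habits listed above can be sketched in a handful of lines. All names here (`FaceFrameInfo`, `makeFrameInfo`, `copyChecked`, `releasePixels`) are hypothetical and exist only for illustration; they are not taken from FaceHal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical struct used only for this illustration.
struct FaceFrameInfo {
    unsigned long phyAddr;
    int width;
    int height;
    unsigned char* pixels;
};

// Value-initialize at the point of definition so no field holds garbage.
FaceFrameInfo makeFrameInfo() {
    FaceFrameInfo info{};  // every member is zero / nullptr
    assert(info.pixels == nullptr);
    return info;
}

// Bounds-checked copy: refuses null pointers and sizes beyond the destination.
bool copyChecked(unsigned char* dst, std::size_t dstCapacity,
                 const unsigned char* src, std::size_t n) {
    if (dst == nullptr || src == nullptr || n > dstCapacity) return false;
    std::memcpy(dst, src, n);
    return true;
}

// Free and reset the pointer, so a later free or use sees nullptr
// instead of a dangling address.
void releasePixels(FaceFrameInfo& info) {
    std::free(info.pixels);  // free(nullptr) is a no-op
    info.pixels = nullptr;
}
```

With this pattern a second call to `releasePixels` is harmless, and a caller that checks `info.pixels` before use gets a clean failure instead of a use-after-free.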
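V. FaceHal memory stomping case study
1. Related bugs
Issue 1: [K2] After enrolling a face and enabling liveness detection, face unlock fails
Issue 2: [K2] After enrolling a face, when no face is detected or a non-enrolled face is presented, no prompt is shown and the icon does not animate
2. Symptoms
(1) On the K2 platform, face unlock fails with high probability; other platforms do not show this problem;
(2) The logs show that every platform passes the physical address to FaceHal through a tag, using metadata.update(ANDROID_XXX_TO_FACEIDSERVICE_PHYADDR,&mUnlockPhyaddr, 1);
FaceHal then passes the physical address and the related statistics to the FaceUnlock algorithm. When the algorithm cannot unlock based on that information, it sets the result status to AUTH_CONTINUE and then rotates the buffer through the
CameraHelper::getInstance()->freeAlgoBuffer(main_addr,sub_addr) interface.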
M008192 10-09 05:43:12.004 335 1533 D Cam3SingleFaceIdU: 1091, processCaptureResultMain: callback phy addr=0xbf6fc000
M008193 10-09 05:43:12.004 335 1533 D Cam3SingleFaceIdU: 1100, processCaptureResultMain: check phy addr=0xbf6fc000
M008216 10-09 05:43:12.018 359 1507 D CameraHelper: onCameraResultAvailable: getConstEntry entryAddrToFaceID bf6fc000.
M008217 10-09 05:43:12.018 359 1507 D CameraHelper: onCameraResultAvailable: getConstEntry resultAeState 2.
M0082B7 10-09 05:43:12.032 359 511 I CameraHelper: AlgoHandler onMessageReceived ALGO_AUTH_REQUEST .
M0082BA 10-09 05:43:12.032 359 511 D FaceIdHal: get_virtual_address result:0xad617000 , in_size:fd200 , real size:fe000 ,pagesize:1000
M0082BB 10-09 05:43:12.032 359 511 D FaceIdHal: get_virtual_address input phy addr:bf6fc000
M0082BC 10-09 05:43:12.032 359 511 D FaceIdHal: auth_proc_run help interact info, 0 - 0 -0 -0 -0 - 0 .
M0082BD 10-09 05:43:12.033 359 511 I FaceID : 485, getHelpInfo: aeStable: 0, backlightPro : 0, brightValue: 548, blEnable: 0, faceLum: 0
M0082D1 10-09 05:43:12.033 359 511 I FaceID : 247, getFaceShape: detect face num = 0
M0082D3 10-09 05:43:12.046 359 511 D FaceIdHal: auth_proc_run single camera, ret is 1
M0082D4 10-09 05:43:12.046 359 511 I FaceIdHal: auth_proc_run duration:1352 ms , max dura:8000 ms , end:41169726393 ,start:39817445240
M0082D5 10-09 05:43:12.046 359 511 I FaceIdHal: face_do_authenticate_process(), result: 2
M008323 10-09 05:43:12.052 359 511 I CameraHelper: freeAlgoBuffer set PhyAddrFromFaceID with main_addr: bf6fc000.
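The odd part is that on S1 and other platforms this interface rotates the face buffer normally, while on K2 the face buffer does not rotate. When ACaptureRequest_setEntry_i64 and ACaptureRequest_setEntry_i32 set the metadata, the native framework reports errors: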
M008323 10-09 05:43:12.052 359 511 I CameraHelper: freeAlgoBuffer set PhyAddrFromFaceID with main_addr: bf6fc000.
M008325 10-09 05:43:12.052 359 511 E CamComm1.0-VTDesc: getTagType: Vendor descriptor id is missing!
M008327 10-09 05:43:12.052 359 511 E CamComm1.0-VTDesc: getTagName: Vendor descriptor id is missing!
M008328 10-09 05:43:12.052 359 511 E CamComm1.0-MD: Mismatched tag type when updating entry (null) (-2147483583) of type byte; got type int64 data instead
M0083D5 10-09 05:43:12.118 359 511 E CamComm1.0-VTDesc: getTagType: Vendor descriptor id is missing!
M0083D7 10-09 05:43:12.118 359 511 E CamComm1.0-VTDesc: getTagName: Vendor descriptor id is missing!
M0083D8 10-09 05:43:12.118 359 511 E CamComm1.0-MD: Mismatched tag type when updating entry (null) (-2147483597) of type byte; got type int32 data instead
M0083FF 10-09 05:43:12.122 538 1555 E camera_metadata: validate_camera_metadata_structure: Entry index 0 had 0 items, but offset was non-0 (196608), tag name: mode
M008400 10-09 05:43:12.122 538 1555 E cameraserver: convertFromHidl: Malformed camera metadata received from HAL
M008401 10-09 05:43:12.122 538 1555 E cameraserver: copyPhysicalCameraSettings: Unable to convert physicalCameraSettings from HIDL to AIDL.
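The numbers in parentheses in the "Mismatched tag type" lines look like tag IDs printed as signed 32-bit integers. Reinterpreting them as unsigned makes them easier to read. The sketch below assumes that reading, and also assumes the usual Android convention that vendor tags start at 0x80000000 (VENDOR_SECTION 0x8000 shifted left by 16). The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption: vendor tags occupy the range starting at VENDOR_SECTION << 16.
constexpr std::uint32_t kVendorSectionStart = 0x8000u << 16;

// The log prints the tag as a signed int; recover the unsigned tag value.
std::uint32_t tagFromLoggedValue(std::int32_t logged) {
    return static_cast<std::uint32_t>(logged);
}

bool isVendorTag(std::uint32_t tag) {
    return tag >= kVendorSectionStart;
}

// Formats a tag as "0x" followed by eight hex digits.
std::string tagToHex(std::uint32_t tag) {
    char buf[11];
    const int written = std::snprintf(buf, sizeof buf, "0x%08x",
                                      static_cast<unsigned>(tag));
    assert(written == 10);
    return std::string(buf);
}
```

Under these assumptions, -2147483583 corresponds to 0x80000041 and -2147483597 to 0x80000033. Both fall in the vendor range, which is consistent with the "Vendor descriptor id is missing!" errors logged just before each mismatch.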