Real-Time 3D Reconstruction on Mobile with KinectFusion-ios (2): Algorithm Invocation and Framework

       Source code: https://github.com/sjy234sjy234/KinectFusion-ios

       Original paper: "KinectFusion: Real-time dense surface mapping and tracking."

       This post covers how to invoke the KinectFusion-ios interfaces, and the overall algorithm framework of KinectFusion.

1. Algorithm Invocation Example

       As introduced in the previous post, the sample invocation code lives in ViewController.mm. The part that reads the depth-frame stream from depth.bin is iOS plumbing and is not covered here. This section explains how KinectFusion is invoked, chiefly the initialization and use of the FusionProcessor class.

       First, the FusionProcessor initialization code in viewDidLoad:

    self.fusionProcessor = [FusionProcessor shareFusionProcessorWithContext: _metalContext];
    [self.fusionProcessor setRenderBackColor: {24.0 / 255, 31.0 / 255, 50.0 / 255, 1}];
    simd::float4 cube = {-107.080887, -96.241348, -566.015991, 223.474106};
    [self.fusionProcessor setupTsdfParameterWithCube: cube];

        The first line initializes the class; the second sets the background color used for rendering; lines 3–4 define a cube (x, y, z, w). KinectFusion requires an initial cubic bounding box, and the cube in the code is a precomputed bounding box around a face, which can be obtained with computer-vision methods not covered here. (x, y, z) is one vertex of the cube and w is its edge length, which together determine an axis-aligned cube in 3D space.

        Next, the callback that fires as the depth data stream is read:

- (void)stream:(NSStream *)stream handleEvent:(NSStreamEvent)eventCode {
    switch(eventCode) {
        case NSStreamEventHasBytesAvailable:
        {
            //read every frame from depth.bin, which contains one single disparity frame of 640 x 480 x float16,
            //we can easily derive depth from disparity: depth = 1.0 / disparity;
            int frameLen = PORTRAIT_WIDTH * PORTRAIT_HEIGHT * 2;
            uint8_t* buf = new uint8_t[frameLen];
            NSInteger len = 0;
            len = [(NSInputStream *)stream read:buf maxLength:frameLen];
            if(len == frameLen)
            {
                BOOL isFusionOK = [self.fusionProcessor processDisparityData:buf withIndex:m_fusionFrameIndex withTsdfUpdate: YES];
                if(isFusionOK)
                {
                    id<MTLTexture> textureAfterFusion=[self.fusionProcessor getColorTexture];
                    [self.scanningRenderer render: textureAfterFusion];
                    m_fusionFrameIndex++;
                }
                else
                {
                    NSLog(@"Fusion Failed");
                }
            }
            delete[] buf;
            break;
        }
        default:
            if(m_fusionFrameIndex > 0)
            {
                m_isFusionComplete = YES;
            }
    }
}

        Lines 7–10 read one frame of depth data from the stream; a single frame is 480 x 640 x 2 bytes. Line 13 performs one single-frame KinectFusion processing call; lines 16–17 render the currently reconstructed model; line 18 advances the frame index m_fusionFrameIndex.

        Next is the operation that resets the FusionProcessor. It is the only button callback in the app, and it starts a fresh scan:

- (IBAction)onResetScan:(id)sender {
    if(m_isFusionComplete)
    {
        m_fusionFrameIndex = 0;
        m_isFusionComplete = NO;
        simd::float4 cube = {-107.080887, -96.241348, -566.015991, 223.474106};
        [self.fusionProcessor setupTsdfParameterWithCube: cube];
        [self setUpStreamForFile: self.streamPath];
    }
}

        In line 4, the frame index m_fusionFrameIndex must be reset to 0: FusionProcessor checks whether m_fusionFrameIndex is 0 and initializes itself automatically, so no explicit initialization call is needed. Then lines 6–7 set a new cube, after which a fresh scan can begin.

2. Algorithm Framework Overview

        Here, from an implementation point of view, is a flowchart of the KinectFusion algorithm framework:

        [Flowchart: KinectFusion processing pipeline]

        In the chart, circles denote data streams and rectangles denote processing modules. As it shows, the algorithm consists of four processing modules: data preparation, TSDF, ICP, and MarchingCube.

        As described in an earlier post, the project's FusionProcessor folder contains the KinectFusion source. All four processing modules live in its FusionComputer subfolder, whose four subfolders map one-to-one to the module implementations: FuPreProcess, FuTsdfFusioner, FuICPMatrix, and FuMarchingCube. Finally, FusionProcessor.mm, placed directly in the FusionProcessor folder, implements the entire pipeline from the flowchart, wiring the modules together into the complete KinectFusion algorithm.

       Note that the flowchart should be read alongside the source code; it is not necessarily exhaustive, so do not read too much into it.

       First, the two main public interfaces of the FusionProcessor class:

- (BOOL) processDisparityData: (uint8_t *)disparityPixelBuffer withIndex: (int) fusionFrameIndex withTsdfUpdate:(BOOL)isTsdfUpdate;

- (BOOL) processDisparityPixelBuffer: (CVPixelBufferRef)disparityPixelBuffer withIndex: (int) fusionFrameIndex withTsdfUpdate:(BOOL)isTsdfUpdate;

        The first is the interface used in the invocation example above: it takes each new depth frame as a byte stream and updates the reconstructed 3D scene. The second is the real-time interface for depth data of type CVPixelBufferRef obtained directly from the iPhone X. The two perform exactly the same operation and differ only in input format. Their implementations are:

- (BOOL) processDisparityData: (uint8_t *)disparityData withIndex: (int) fusionFrameIndex withTsdfUpdate:(BOOL)isTsdfUpdate
{
    id<MTLBuffer> inDisparityMapBuffer = [_metalContext.device newBufferWithBytes:disparityData
                                                                           length:PORTRAIT_WIDTH*PORTRAIT_HEIGHT*2
                                                                          options:MTLResourceOptionCPUCacheModeDefault];
    return [self processFrame: inDisparityMapBuffer withIndex: fusionFrameIndex withTsdfUpdate: isTsdfUpdate];
}

- (BOOL) processDisparityPixelBuffer: (CVPixelBufferRef)disparityPixelBuffer withIndex: (int) fusionFrameIndex withTsdfUpdate:(BOOL)isTsdfUpdate
{
    id<MTLBuffer> inDisparityMapBuffer = [_metalContext bufferWithF16PixelBuffer: disparityPixelBuffer];
    return [self processFrame: inDisparityMapBuffer withIndex: fusionFrameIndex withTsdfUpdate: isTsdfUpdate];
}

        As shown, the two methods only marshal the data: both convert the depth input into an MTLBuffer and then call the private processFrame method, which performs the single-frame KinectFusion processing.

        processFrame is the core routine of the KinectFusion framework:

- (BOOL) processFrame: (id<MTLBuffer>) inDisparityMapBuffer withIndex: (int) fusionFrameIndex withTsdfUpdate:(BOOL)isTsdfUpdate
{
    //pre-process
    [_fuMeshToTexture drawPoints: m_mCubeExtractPointBuffer normals: m_mCubeExtractNormalBuffer intoColorTexture: m_colorTexture andDepthTexture: m_depthTexture withTransform: m_projectionTransform * m_globalToFrameTransform];
    [_fuTextureToDepth compute: m_depthTexture intoTexture: m_preDepthMapPyramid[0] with: m_cameraNDC2Depth];
    [_fuDisparityToDepth compute: inDisparityMapBuffer intoDepthMapBuffer: m_currentDepthMapPyramid[0]];
    for(int level=1;level<PYRAMID_LEVEL;++level)
    {
        [_fuPyramidDepthMap compute: m_currentDepthMapPyramid[level - 1] intoDepthMapBuffer: m_currentDepthMapPyramid[level] withLevel: level];
        [_fuPyramidDepthMap compute: m_preDepthMapPyramid[level - 1] intoDepthMapBuffer: m_preDepthMapPyramid[level] withLevel: level];
    }
    for(int level=0;level<PYRAMID_LEVEL;++level)
    {
        [_fuDepthToVertex compute: m_currentDepthMapPyramid[level] intoVertexMapBuffer: m_currentVertexMapPyramid[level] withLevel: level andIntrinsicUVD2XYZ: m_intrinsicUVD2XYZ[level]];
        [_fuVertexToNormal compute: m_currentVertexMapPyramid[level] intoNormalMapBuffer: m_currentNormalMapPyramid[level] withLevel: level];
        [_fuDepthToVertex compute: m_preDepthMapPyramid[level] intoVertexMapBuffer: m_preVertexMapPyramid[level] withLevel: level andIntrinsicUVD2XYZ: m_intrinsicUVD2XYZ[level]];
        [_fuVertexToNormal compute: m_preVertexMapPyramid[level] intoNormalMapBuffer: m_preNormalMapPyramid[level] withLevel: level];
    }
    
    //icp
    if(fusionFrameIndex<=0)
    {
        //first frame, no icp
        NSLog(@"first frame, fusion reset");
        [self reset];
    }
    else
    {
        //icp iteration
        BOOL isSolvable=YES;
        simd::float3x3 currentF2gRotate;
        simd::float3 currentF2gTranslate;
        simd::float3x3 preF2gRotate;
        simd::float3 preF2gTranslate;
        simd::float3x3 preG2fRotate;
        simd::float3 preG2fTranslate;
        matrix_transform_extract(m_frameToGlobalTransform,currentF2gRotate,currentF2gTranslate);
        matrix_transform_extract(m_frameToGlobalTransform, preF2gRotate, preF2gTranslate);
        matrix_transform_extract(m_globalToFrameTransform, preG2fRotate, preG2fTranslate);
        for(int level=PYRAMID_LEVEL-1;level>=0;--level)
        {
            uint iteratorNumber=ICPIteratorNumber[level];
            for(int it=0;it<iteratorNumber;++it)
            {
                uint occupiedPixelNumber = [_fuICPPrepareMatrix computeCurrentVMap:m_currentVertexMapPyramid[level] andCurrentNMap:m_currentNormalMapPyramid[level] andPreVMap:m_preVertexMapPyramid[level] andPreNMap:m_preNormalMapPyramid[level] intoLMatrix:m_icpLeftMatrixPyramid[level] andRMatrix:m_icpRightMatrixPyramid[level] withCurrentR:currentF2gRotate andCurrentT:currentF2gTranslate andPreF2gR:preF2gRotate andPreF2gT:preF2gTranslate andPreG2fR:preG2fRotate andPreG2fT:preG2fTranslate  andThreshold:m_icpThreshold andIntrinsicXYZ2UVD:m_intrinsicXYZ2UVD[level] withLevel:level];
                if(occupiedPixelNumber==0)
                {
                    isSolvable=NO;
                }
                if(isSolvable)
                {
                    [_fuICPReduceMatrix computeLeftMatrix:m_icpLeftMatrixPyramid[level] andRightmatrix:m_icpRightMatrixPyramid[level] intoLeftReduce:m_icpLeftReduceBuffer andRightReduce:m_icpRightReduceBuffer withLevel:level andOccupiedNumber:occupiedPixelNumber];
                    float result[6];
                    isSolvable=matrix_float6x6_solve((float*)m_icpLeftReduceBuffer.contents, (float*)m_icpRightReduceBuffer.contents, result);
                    if(isSolvable)
                    {
                        simd::float3x3 rotateIncrement=matrix_float3x3_rotation(result[0], result[1], result[2]);
                        simd::float3 translateIncrement={result[3], result[4], result[5]};
                        currentF2gRotate=rotateIncrement*currentF2gRotate;
                        currentF2gTranslate=rotateIncrement*currentF2gTranslate+translateIncrement;
                    }
                }
            }
            if(!isSolvable)
            {
                break;
            }
        }
        if(isSolvable)
        {
            matrix_transform_compose(m_frameToGlobalTransform, currentF2gRotate, currentF2gTranslate);
            m_globalToFrameTransform=simd::inverse(m_frameToGlobalTransform);
        }
        else
        {
            NSLog(@"lost frame");
            return NO;
        }
    }
    
    if(isTsdfUpdate||fusionFrameIndex<=0)
    {
        //tsdf fusion updater
        [_fuTsdfFusioner compute:m_currentDepthMapPyramid[0] intoTsdfVertexBuffer:m_tsdfVertexBuffer withIntrinsicXYZ2UVD:m_intrinsicXYZ2UVD[0] andTsdfParameter:m_tsdfParameter andTransform:m_globalToFrameTransform];
        
        //marching cube
        int activeVoxelNumber = [_fuMCubeTraverse compute:m_tsdfVertexBuffer intoActiveVoxelInfo:m_mCubeActiveVoxelInfoBuffer withMCubeParameter:m_mCubeParameter];
        if(activeVoxelNumber==0)
        {
            NSLog(@"alert: no active voxel");
            m_mCubeExtractPointBuffer = nil;
            m_mCubeExtractNormalBuffer = nil;
            return NO;
        }
        else if(activeVoxelNumber>=m_mCubeParameter.maxActiveNumber)
        {
            NSLog(@"alert: too many active voxels");
            m_mCubeExtractPointBuffer = nil;
            m_mCubeExtractNormalBuffer = nil;
            return NO;
        }
        else
        {
            void *baseAddress=m_mCubeActiveVoxelInfoBuffer.contents;
            ActiveVoxelInfo *activeVoxelInfo=(ActiveVoxelInfo*)baseAddress;
            for(int i=1;i<activeVoxelNumber;++i)
            {
                activeVoxelInfo[i].vertexNumber=activeVoxelInfo[i-1].vertexNumber+activeVoxelInfo[i].vertexNumber;
            }
            uint totalVertexNumber=activeVoxelInfo[activeVoxelNumber-1].vertexNumber;
            m_mCubeExtractPointBuffer = [_metalContext.device newBufferWithLength: 3 * totalVertexNumber * sizeof(float) options:MTLResourceOptionCPUCacheModeDefault];
            m_mCubeExtractNormalBuffer = [_metalContext.device newBufferWithLength: 3 * totalVertexNumber * sizeof(float) options:MTLResourceOptionCPUCacheModeDefault];
            [_fuMCubeExtract compute: m_mCubeActiveVoxelInfoBuffer andTsdfVertexBuffer: m_tsdfVertexBuffer withActiveVoxelNumber: activeVoxelNumber andTsdfParameter: m_tsdfParameter andMCubeParameter: m_mCubeParameter andOutMCubeExtractPointBufferT: m_mCubeExtractPointBuffer andOutMCubeExtractNormalBuffer: m_mCubeExtractNormalBuffer];
        }
    }
    
    return YES;
}

        The four processing modules are called at the commented locations: lines 3–18 are the data-preparation module, lines 20–79 the ICP module, lines 83–84 the TSDF module, and lines 86–114 the MarchingCube module.

        This method is fairly long, so it is not unpacked all at once here. In follow-up posts, the principles and implementation of each of the four modules will be explained in detail, one module per part.
