3D Face Reconstruction: A Close Reading of the Code in pipeline.py

live_for_myself

Published 2021-12-23 22:58:57


Article link: https://blog.csdn.net/landing_guy_/article/details/122116915


Please credit the source when reposting.

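Preface

First, the coordinate systems and how they relate. There are usually four: the world coordinate system, the camera coordinate system, the image physical coordinate system (origin at the center of the image), and the image pixel coordinate system (origin at the top-left corner of the image).
Here the 3D world coordinate system has its origin at the center, and the camera is centered as well. Whenever this post says "image coordinate system" it means the pixel coordinate system.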

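Transformation pipeline

Going from a mesh file to an image takes the following steps:

Coordinate transformation of the 3D points (scale, rotate, translate). Example code: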

transformed_vertices = s * vertices.dot(R.T) + t3d[np.newaxis, :]

Transformation from the world coordinate system to the camera coordinate system. Example code:

transformed_vertices = vertices - eye # translation
transformed_vertices = transformed_vertices.dot(R.T)

Projection from the camera coordinate system to the image pixel coordinate system, either orthographic or perspective. Example code:

image_vertices = mesh.transform.to_image(projected_vertices, h, w)

Rendering, i.e. interpolating the discrete 2D points onto the pixels of the screen. Example code:

rendering = mesh.render.render_colors(image_vertices, triangles, lit_colors, h, w)

Code walkthrough

Some non-essential code is omitted.

1. Load the mesh data

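import numpy as np
import scipy.io as sio

C = sio.loadmat('Data/example1.mat')
vertices = C['vertices']; colors = C['colors']; triangles = C['triangles']
colors = colors/np.max(colors)
# Normalize. The maximum is in fact already 1; np.max without an axis takes the maximum over all elements.

The data is stored as a dictionary, with keys including vertices, colors, triangles and full_triangles.
Among them, full_triangles is triangles together with the mouth-region triangles.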

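2. Vertex transformation

Given s, R and t, scale, rotate and translate the 3D coordinates. The value of s determines how large the face appears in the final image.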

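s = 180/(np.max(vertices[:,1]) - np.min(vertices[:,1]))
# max y minus min y, so after scaling the face spans 180 units along y

R = mesh.transform.angle2matrix([0, 30, 0])
# rotate 30° counterclockwise about the y axis

# translation vector
t = [0, 0, 0]

transformed_vertices = mesh.transform.similarity_transform(vertices, s, R, t)
# similarity transform: the positions of the 3D points after scaling, rotation and translation
# 3D: s*R.dot(X) + t

For the rotation matrices it is enough to remember the formulas. I summarized them, more concisely than before, in another post of mine.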
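
As a concrete check of the formula s*R.dot(X) + t, here is a minimal NumPy sketch. The helpers rotation_y and similarity_transform are hypothetical stand-ins written for this example, not the library's angle2matrix or mesh.transform.similarity_transform, and the right-handed sign convention used for the y rotation is an assumption.

```python
import numpy as np

def rotation_y(degrees):
    """Right-handed rotation about the y axis (hypothetical helper)."""
    a = np.deg2rad(degrees)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def similarity_transform(vertices, s, R, t):
    """Apply s * R @ x + t to every row x of vertices [nver, 3]."""
    t = np.asarray(t, dtype=float)
    # Rows are points, so R @ x for each row is written as vertices.dot(R.T).
    return s * vertices.dot(R.T) + t[np.newaxis, :]

vertices = np.array([[0.0, 0.0, 1.0],   # a point on the +z axis
                     [0.0, 2.0, 0.0]])  # a point on the rotation axis
out = similarity_transform(vertices, s=2.0, R=rotation_y(90), t=[10, 0, 0])
print(np.round(out, 6))
```

With a 90° turn about y, the point on the +z axis lands on +x, is doubled by s and then shifted by t. The point on the y axis is only scaled and shifted.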

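3. Color/texture transformation

So far the points only have coordinates. We do have color values, but those are affected by lighting.
The light intensity changes the original colors of the 3D points, so the position and intensity of the point light determine how the stored colors change (this is mainly diffuse reflection).
Here "texture" simply means the per-vertex color.

# position of the point light in world coordinates
light_positions = np.array([[-128, -128, 300]])

# intensity of the point light
light_intensities = np.array([[1, 1, 1]])

# transform the colors under the point light defined above
lit_colors = mesh.light.add_light(transformed_vertices, triangles, colors, light_positions, light_intensities)
# the vertex normals are computed inside

Let's look at how add_light works.

For background on diffuse reflection, see my other blog post on the topic.
Shading is done per vertex here, so the shading frequency is Gouraud shading.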

def add_light(vertices, triangles, colors, light_positions = 0, light_intensities = 0):
    # Only diffuse reflection is used; there is no ambient or specular term.
    nver = vertices.shape[0]
    normals = get_normal(vertices, triangles) # [nver, 3]
    # get_normal computes the normal vector of every vertex

    # diffuse (the diffuse term, which changes the color)
    # Ld = kd*(I/r^2)max(0, n*l)
    direction_to_lights = vertices[np.newaxis, :, :] - light_positions[:, np.newaxis, :]
    # [nlight, nver, 3] light direction: the vector between each light position and each vertex
    # (as written, vertex minus light position)

    # the next two lines normalize the light directions to unit length
    direction_to_lights_n = np.sqrt(np.sum(direction_to_lights**2, axis = 2)) # [nlight, nver]
    direction_to_lights = direction_to_lights/direction_to_lights_n[:, :, np.newaxis]

    # the next two lines compute the dot product of normal and light direction:
    # element-wise multiply, then sum over the last axis
    normals_dot_lights = normals[np.newaxis, :, :]*direction_to_lights # [nlight, nver, 3]
    normals_dot_lights = np.sum(normals_dot_lights, axis = 2) # [nlight, nver]: one light direction and one normal per vertex; nlight is 1 here

    # Unlike the formula above, this code has no 1/r^2 falloff and clamps only at the end.
    diffuse_output = colors[np.newaxis, :, :]*normals_dot_lights[:, :, np.newaxis]*light_intensities[:, np.newaxis, :]
    diffuse_output = np.sum(diffuse_output, axis = 0)
    # [nver, 3] sums the contributions over the nlight axis (only one light is defined here)

    lit_colors = diffuse_output # only diffuse part here.
    lit_colors = np.minimum(np.maximum(lit_colors, 0), 1)
    return lit_colors
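
To make the formula Ld = kd*(I/r^2)max(0, n*l) concrete, the sketch below evaluates it for a single surface point. It follows the textbook form, with l pointing from the surface toward the light and with the 1/r^2 falloff included, so it is not a copy of add_light above. The function name lambert_diffuse and all the sample values are made up for this example.

```python
import numpy as np

def lambert_diffuse(point, normal, albedo, light_pos, light_intensity):
    """Textbook diffuse term for one point and one point light."""
    to_light = light_pos - point             # from the surface toward the light
    r2 = float(np.dot(to_light, to_light))   # squared distance to the light
    l = to_light / np.sqrt(r2)               # unit light direction
    n = normal / np.linalg.norm(normal)      # unit surface normal
    cos_term = max(0.0, float(np.dot(n, l))) # surfaces facing away get 0
    return albedo * (light_intensity / r2) * cos_term

point = np.array([0.0, 0.0, 0.0])
normal = np.array([0.0, 0.0, 1.0])
albedo = np.array([1.0, 0.5, 0.25])
intensity = np.array([4.0, 4.0, 4.0])

# Light in front of the surface, at distance 2 along the normal.
front = lambert_diffuse(point, normal, albedo, np.array([0.0, 0.0, 2.0]), intensity)
# Same light moved behind the surface.
back = lambert_diffuse(point, normal, albedo, np.array([0.0, 0.0, -2.0]), intensity)
print(front, back)
```

With the light directly in front, n*l is 1 and the falloff 4/2^2 is 1, so the stored color comes back unchanged. With the light behind the surface, the max(0, ...) clamp leaves the point black.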

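Next, how get_normal computes the normals.

It returns per-vertex normals: a cross product gives the normal of each face, and a vertex shared by several faces gets the combination (effectively an average) of their normals.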

def get_normal(vertices, triangles):

    # triangles holds the vertex indices that make up each triangle
    pt0 = vertices[triangles[:, 0], :] # [ntri, 3]
    pt1 = vertices[triangles[:, 1], :] # [ntri, 3]
    pt2 = vertices[triangles[:, 2], :] # [ntri, 3]

    # np.cross is the vector (cross) product.
    # Its result is perpendicular to the plane of the two vectors, i.e. the normal.
    tri_normal = np.cross(pt0 - pt1, pt0 - pt2) # [ntri, 3]. normal of each triangle
    # the cross product of two triangle edges is the face normal

    normal = np.zeros_like(vertices, dtype = np.float32).copy() # [nver, 3]

    for i in range(triangles.shape[0]):
        # Indices repeat across triangles, so a shared vertex accumulates the normals
        # of all its faces. After the normalization below this is effectively their
        # average (weighted by face area, since the cross products are not normalized first).
        normal[triangles[i, 0], :] = normal[triangles[i, 0], :] + tri_normal[i, :]
        normal[triangles[i, 1], :] = normal[triangles[i, 1], :] + tri_normal[i, :]
        normal[triangles[i, 2], :] = normal[triangles[i, 2], :] + tri_normal[i, :]

    # normalize to unit length
    mag = np.sum(normal**2, 1) # [nver]

    zero_ind = (mag == 0) # boolean mask of length nver marking zero normals; usually there are none
    mag[zero_ind] = 1
    normal[zero_ind, 0] = np.ones((np.sum(zero_ind)))

    normal = normal/np.sqrt(mag[:,np.newaxis])

    return normal
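
The Python loop over triangles can be slow on large meshes. One possible vectorized variant, assuming the same accumulate-then-normalize scheme, uses np.add.at, which adds correctly even when a vertex index appears many times. The name vertex_normals and the tiny test mesh are made up for this sketch.

```python
import numpy as np

def vertex_normals(vertices, triangles):
    """Per-vertex unit normals, accumulated from face normals without a Python loop over faces."""
    pt0 = vertices[triangles[:, 0]]
    pt1 = vertices[triangles[:, 1]]
    pt2 = vertices[triangles[:, 2]]
    tri_normal = np.cross(pt0 - pt1, pt0 - pt2)      # [ntri, 3], one normal per face

    normal = np.zeros(vertices.shape, dtype=np.float64)
    for k in range(3):
        # Unbuffered add: repeated vertex indices all contribute.
        np.add.at(normal, triangles[:, k], tri_normal)

    mag = np.sum(normal**2, axis=1)                  # squared lengths, [nver]
    zero_ind = mag == 0                              # vertices used by no face
    mag[zero_ind] = 1
    normal[zero_ind, 0] = 1                          # same fallback as above: (1, 0, 0)
    return normal / np.sqrt(mag)[:, np.newaxis]

# A unit square in the xy plane made of two triangles, plus one unused vertex.
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [5, 5, 5]], dtype=np.float64)
triangles = np.array([[0, 1, 2], [0, 2, 3]])
normals = vertex_normals(vertices, triangles)
print(normals)
```

The four corners of the square all get the normal (0, 0, 1), including vertices 0 and 2, which are shared by both triangles. The unused vertex falls back to (1, 0, 0).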

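4. Vertex transformation (to the camera)

We now have the coordinates of every point and its color after lighting. Next we move the points into the camera coordinate system.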

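camera_vertices = mesh.transform.lookat_camera(transformed_vertices, eye = [0, 0, 200], at = np.array([0, 0, 0]), up = None)
# eye is the position of the eye (camera). Its distance from the object does not matter here,
# because the projection below is orthographic rather than perspective.

# Project the object from 3D onto the 2D projection plane; this can be orthographic or perspective.
# Here it is orthographic: x and y are used as they are and z plays no part in the image position
# (z is still carried along, because the renderer later reads it for the depth buffer).
projected_vertices = mesh.transform.orthographic_project(camera_vertices)

Now for the function that transforms from the world coordinate system to the camera coordinate system.

Both systems start with the same origin. By convention the camera's up direction is y and it looks along -z.
In the code below the camera z axis points from the target back toward the eye, so a larger z means closer to the camera. This makes little difference elsewhere; it only matters for the z-buffer comparison.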

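def lookat_camera(vertices, eye, at = None, up = None):
    '''
    The camera sits at the origin of the camera coordinate system,
    with y pointing up, looking along -z.
      vertices: [nver, 3]
      eye: [3,] position of the eye (camera), on the positive z side; the value passed above is 200, but any positive number works here
      at: [3,] the point the camera looks at
      up: [3,] the up direction, y
    Returns:
      transformed_vertices: [nver, 3]
    '''
    if at is None:
      at = np.array([0, 0, 0], np.float32)
    if up is None:
      up = np.array([0, 1, 0], np.float32)

    eye = np.array(eye).astype(np.float32)
    at = np.array(at).astype(np.float32)
    # normalize is a helper not shown in this excerpt; it is assumed to scale a vector to unit length
    z_aixs = -normalize(at - eye) # look forward
    # Interesting line: at - eye is the viewing direction, and the minus sign makes the
    # z axis point the opposite way, from at back toward eye. So, as in OpenGL, the camera
    # looks along -z, and points nearer the camera have larger z.
    x_aixs = normalize(np.cross(up, z_aixs)) # look right
    y_axis = np.cross(z_aixs, x_aixs) # look up

    R = np.stack((x_aixs, y_axis, z_aixs)) # rotation matrix, 3 x 3
    transformed_vertices = vertices - eye # translation; with the eye used above only z changes
    # Because the projection is orthographic, z does not affect the image position,
    # so the distance of the eye does not matter. It would matter for a perspective projection.
    transformed_vertices = transformed_vertices.dot(R.T) # rotation
    return transformed_vertices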
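
The claim that the eye distance is irrelevant under orthographic projection can be checked numerically. The sketch below is a self-contained re-implementation of the same look-at construction. The names unit and look_at are placeholders for this example, not the library's functions.

```python
import numpy as np

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

def look_at(vertices, eye, at=(0, 0, 0), up=(0, 1, 0)):
    """World -> camera coordinates, built the same way as in the listing above."""
    eye, at, up = (np.asarray(a, dtype=float) for a in (eye, at, up))
    z_axis = -unit(at - eye)             # points from the target back toward the eye
    x_axis = unit(np.cross(up, z_axis))  # camera right
    y_axis = np.cross(z_axis, x_axis)    # camera up
    R = np.stack((x_axis, y_axis, z_axis))
    return (vertices - eye).dot(R.T)

vertices = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
near = look_at(vertices, eye=[0, 0, 50])
far = look_at(vertices, eye=[0, 0, 200])
print(near)
print(far)
```

Moving the eye from z = 50 to z = 200 leaves x and y untouched and shifts every z by the same amount (150), so both the orthographic image positions and the depth ordering stay the same. In both cases the vertex with the larger world z also has the larger camera z, i.e. it is nearer the camera.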

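5. Rendering

Rendering is simply the process of getting the points onto the screen.

We now have the 2D coordinates, i.e. the orthographically projected coordinates in the camera coordinate system.
To display these discrete points as an image, we first move them into the image coordinate system, then interpolate to get the colors in between.
Without interpolation we would only get scattered colored points rather than continuous pixels.
Note that the camera z axis points toward the viewer, so the point with the larger z is nearer the viewer and its color is the one displayed.

This step converts to the image pixel coordinate system, whose origin is the top-left corner.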

h = w = 256
# change to image coords for rendering
image_vertices = mesh.transform.to_image(projected_vertices, h, w)
# render
rendering = mesh.render.render_colors(image_vertices, triangles, lit_colors, h, w) # without this step we would only have discrete colored points

Here is the transformation from the camera coordinate system to the image coordinate system.

def to_image(vertices, h, w, is_perspective = False):
    ''' change vertices to image coord system
    The origin of the image coordinate system is the top-left corner of the image, and y is flipped:
    down is positive, the opposite of the camera coordinate system.
    return:
        projected_vertices: [nver, 3]
    '''
    image_vertices = vertices.copy()
    if is_perspective:
        # if perspective, the projected vertices are normalized to [-1, 1]. so change it to image size first.
        image_vertices[:,0] = image_vertices[:,0]*w/2
        image_vertices[:,1] = image_vertices[:,1]*h/2
    # move to center of image
    image_vertices[:,0] = image_vertices[:,0] + w/2
    image_vertices[:,1] = image_vertices[:,1] + h/2
    # flip the y axis
    image_vertices[:,1] = h - image_vertices[:,1] - 1
    # I don't fully understand this -1; if you know, please explain
    return image_vertices
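
A few sample points make the mapping easier to see, and suggest one plausible reading of the -1: pixel rows are indexed 0 to h-1, so mirroring a row index y gives (h-1) - y. The helper to_pixel_coords below is a stripped-down sketch of the orthographic branch only, with made-up sample points. It is not the library function.

```python
import numpy as np

def to_pixel_coords(vertices, h, w):
    """Centered coordinates (y up) -> pixel coordinates (origin top-left, y down)."""
    out = np.array(vertices, dtype=float)
    out[:, 0] = out[:, 0] + w / 2            # shift x so the center moves to w/2
    out[:, 1] = h - (out[:, 1] + h / 2) - 1  # shift y, then mirror it within rows 0..h-1
    return out

h = w = 256
points = np.array([[-128.0, 127.0],   # top-left of the visible range
                   [0.0, 0.0],        # center
                   [127.0, -128.0]])  # bottom-right of the visible range
pixels = to_pixel_coords(points, h, w)
print(pixels)
```

The centered range x in [-128, 127], y in [-128, 127] maps exactly onto pixel indices 0 to 255. Without the -1, the point with y = -128 would land on row 256, one past the last valid row.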

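The most important part is how the colors are obtained.

It uses barycentric coordinates, which can interpolate many attributes: color, depth along z, reflectance, and so on.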

def render_colors(vertices, triangles, colors, h, w, c = 3):
    '''
    Returns:
        image: [h, w, c].
    '''
    assert vertices.shape[0] == colors.shape[0]

    # initialize the 2D image
    image = np.zeros((h, w, c))

    # initialize the depth buffer
    depth_buffer = np.zeros([h, w]) - 999999.

    for i in range(triangles.shape[0]):
        tri = triangles[i, :] # indices of the 3 vertices

        # Clamp the triangle's bounding box to the image.
        # Only pixels inside this box can be covered by the triangle, so only they are tested.
        # Pixels that no triangle covers stay black (zeros). This cuts down the loops.
        umin = max(int(np.ceil(np.min(vertices[tri, 0]))), 0)
        umax = min(int(np.floor(np.max(vertices[tri, 0]))), w-1)

        vmin = max(int(np.ceil(np.min(vertices[tri, 1]))), 0)
        vmax = min(int(np.floor(np.max(vertices[tri, 1]))), h-1)

        if umax<umin or vmax<vmin:
            continue

        for u in range(umin, umax+1):
            for v in range(vmin, vmax+1):
                # skip pixels that are not inside this triangle
                if not isPointInTri([u,v], vertices[tri, :2]):
                    continue
                # barycentric weights of pixel (u, v), which is known to be inside the triangle
                w0, w1, w2 = get_point_weight([u, v], vertices[tri, :2])
                # use the weights to interpolate the depth at this pixel
                point_depth = w0*vertices[tri[0], 2] + w1*vertices[tri[1], 2] + w2*vertices[tri[2], 2]

                if point_depth > depth_buffer[v, u]:
                    # update the z-buffer; a larger z is nearer the viewer
                    depth_buffer[v, u] = point_depth
                    # the color is interpolated the same way
                    image[v, u, :] = w0*colors[tri[0], :] + w1*colors[tri[1], :] + w2*colors[tri[2], :]

    return image
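
The z-buffer rule can be tried on a tiny scene: two triangles covering the same pixels at different depths. The sketch below is a compact re-implementation for illustration only. The names barycentric and rasterize are made up, and its inside test (all weights >= 0) also accepts points on the far edge, which differs slightly from the u + v < 1 test in isPointInTri.

```python
import numpy as np

def barycentric(p, tri):
    """Weights (w0, w1, w2) of 2D point p for the triangle tri [3, 2], or None if degenerate."""
    a, b, c = tri
    v0, v1, v2 = c - a, b - a, p - a
    d00, d01, d02 = v0 @ v0, v0 @ v1, v0 @ v2
    d11, d12 = v1 @ v1, v1 @ v2
    denom = d00 * d11 - d01 * d01
    if denom == 0:
        return None
    u = (d11 * d02 - d01 * d12) / denom
    v = (d00 * d12 - d01 * d02) / denom
    return 1 - u - v, v, u

def rasterize(vertices, triangles, colors, h, w):
    """Per-pixel color with a z-buffer where the larger z wins."""
    image = np.zeros((h, w, 3))
    depth = np.full((h, w), -np.inf)
    for tri in triangles:
        pts = vertices[tri]
        umin = max(int(np.ceil(pts[:, 0].min())), 0)
        umax = min(int(np.floor(pts[:, 0].max())), w - 1)
        vmin = max(int(np.ceil(pts[:, 1].min())), 0)
        vmax = min(int(np.floor(pts[:, 1].max())), h - 1)
        for u in range(umin, umax + 1):
            for v in range(vmin, vmax + 1):
                wts = barycentric(np.array([u, v], dtype=float), pts[:, :2])
                if wts is None or min(wts) < 0:
                    continue  # pixel is outside this triangle
                z = sum(wi * pts[i, 2] for i, wi in enumerate(wts))
                if z > depth[v, u]:
                    depth[v, u] = z
                    image[v, u] = sum(wi * colors[tri[i]] for i, wi in enumerate(wts))
    return image

vertices = np.array([
    [0, 0, 0], [4, 0, 0], [0, 4, 0],   # far triangle (z = 0), red
    [0, 0, 1], [4, 0, 1], [0, 4, 1],   # near triangle (z = 1), green
], dtype=float)
colors = np.array([[1, 0, 0]] * 3 + [[0, 1, 0]] * 3, dtype=float)
far_first = rasterize(vertices, np.array([[0, 1, 2], [3, 4, 5]]), colors, 5, 5)
near_first = rasterize(vertices, np.array([[3, 4, 5], [0, 1, 2]]), colors, 5, 5)
print(far_first[1, 1], near_first[1, 1], far_first[3, 3])
```

Whichever triangle is drawn first, the covered pixel (1, 1) ends up green because the near triangle has the larger z. Pixel (3, 3) lies outside both triangles and stays black.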

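The inside test works by computing barycentric coordinates from dot products: a point is inside the triangle when both coordinates are non-negative and their sum is less than 1. Pixels inside are rendered; pixels outside are skipped.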

def isPointInTri(point, tri_points):
    ''' Judge whether the point is in the triangle
    Returns:
        bool: true for in triangle
    '''
    tp = tri_points

    # vectors
    v0 = tp[2,:] - tp[0,:]
    v1 = tp[1,:] - tp[0,:]
    v2 = point - tp[0,:]

    # dot products
    dot00 = np.dot(v0.T, v0)
    dot01 = np.dot(v0.T, v1)
    dot02 = np.dot(v0.T, v2)
    dot11 = np.dot(v1.T, v1)
    dot12 = np.dot(v1.T, v2)

    # barycentric coordinates
    if dot00*dot11 - dot01*dot01 == 0:
        inverDeno = 0
    else:
        inverDeno = 1/(dot00*dot11 - dot01*dot01)

    u = (dot11*dot02 - dot01*dot12)*inverDeno
    v = (dot00*dot12 - dot01*dot02)*inverDeno

    # check if point in triangle
    return (u >= 0) & (v >= 0) & (u + v < 1)

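Computing barycentric coordinates

Suppose we have a line segment with endpoints A = (0,0) and B = (1,0). Any point P between them is easy to obtain by proportion:

$P = A t + B (1 - t)$

In the same way, a point P inside triangle ABC can be written as $P = (1 - m - n) A + m B + n C$.

How do we compute the parameters m and n? Here is the result directly.

Following the naming in the code, write $v_0 = C - A$, $v_1 = B - A$ and $v_2 = P - A$. Then $v_2 = n v_0 + m v_1$. Taking the dot product of both sides with $v_0$ and with $v_1$ gives two linear equations in m and n, whose solution is:

$n = \dfrac{(v_1 \cdot v_1)(v_2 \cdot v_0) - (v_0 \cdot v_1)(v_2 \cdot v_1)}{(v_0 \cdot v_0)(v_1 \cdot v_1) - (v_0 \cdot v_1)^2}$

$m = \dfrac{(v_0 \cdot v_0)(v_2 \cdot v_1) - (v_0 \cdot v_1)(v_2 \cdot v_0)}{(v_0 \cdot v_0)(v_1 \cdot v_1) - (v_0 \cdot v_1)^2}$

In the code, the variable u is n (the weight of C) and v is m (the weight of B).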

The code is as follows:

def get_point_weight(point, tri_points):
    ''' Get the barycentric weights of the point

    Returns:
        w0: weight of tri_points[0]
        w1: weight of tri_points[1]
        w2: weight of tri_points[2]
     '''
    tp = tri_points
    # vectors
    v0 = tp[2,:] - tp[0,:]
    v1 = tp[1,:] - tp[0,:]
    v2 = point - tp[0,:]

    # dot products
    dot00 = np.dot(v0.T, v0)
    dot01 = np.dot(v0.T, v1)
    dot02 = np.dot(v0.T, v2)
    dot11 = np.dot(v1.T, v1)
    dot12 = np.dot(v1.T, v2)

    # barycentric coordinates
    if dot00*dot11 - dot01*dot01 == 0:
        inverDeno = 0
    else:
        inverDeno = 1/(dot00*dot11 - dot01*dot01)

    u = (dot11*dot02 - dot01*dot12)*inverDeno
    v = (dot00*dot12 - dot01*dot02)*inverDeno

    w0 = 1 - u - v
    w1 = v
    w2 = u

    return w0, w1, w2
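
One way to gain confidence in the closed-form expressions is to compare them with a direct linear solve of P - A = m(B - A) + n(C - A). The sketch below does that for one sample point. Both function names and the sample triangle are made up for this check.

```python
import numpy as np

def weights_closed_form(p, a, b, c):
    """Barycentric weights from the dot-product formula used in the listing above."""
    v0, v1, v2 = c - a, b - a, p - a
    d00, d01, d02 = v0 @ v0, v0 @ v1, v0 @ v2
    d11, d12 = v1 @ v1, v1 @ v2
    inv = 1.0 / (d00 * d11 - d01 * d01)
    n = (d11 * d02 - d01 * d12) * inv   # weight of C (called u in the listing)
    m = (d00 * d12 - d01 * d02) * inv   # weight of B (called v in the listing)
    return np.array([1 - m - n, m, n])

def weights_linear_solve(p, a, b, c):
    """Same weights by solving the 2x2 system [B-A, C-A] @ [m, n] = P - A."""
    m, n = np.linalg.solve(np.column_stack((b - a, c - a)), p - a)
    return np.array([1 - m - n, m, n])

a, b, c = np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([0.0, 4.0])
p = np.array([1.0, 2.0])
w_closed = weights_closed_form(p, a, b, c)
w_solve = weights_linear_solve(p, a, b, c)
p_rebuilt = w_closed[0] * a + w_closed[1] * b + w_closed[2] * c
print(w_closed, w_solve, p_rebuilt)
```

Both routes give the weights (0.25, 0.25, 0.5), they sum to 1, and recombining the triangle corners with them reproduces P. The same weights can then interpolate any per-vertex attribute, such as color or depth.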

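6. Save the image

We already have the color array, so all that is left is to save it.

import os

save_folder = 'results/pipline'
if not os.path.exists(save_folder):
    os.makedirs(save_folder) # also creates the parent 'results' folder if needed
# io is assumed to be skimage.io (from skimage import io); the import is not shown in this excerpt
io.imsave('{}/rendering.jpg'.format(save_folder), rendering)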
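
The rendering is a float array with values in [0, 1], since lit_colors is clamped to that range and the interpolation weights sum to 1. Depending on the image writer, float data may be converted to 8-bit implicitly, so one option is to convert explicitly before saving. The helper to_uint8 is hypothetical and not part of the original script.

```python
import numpy as np

def to_uint8(image):
    """Clamp a float image to [0, 1] and convert it to 8-bit values 0..255."""
    return (np.clip(image, 0.0, 1.0) * 255).round().astype(np.uint8)

# A 1x1 "image" with one out-of-range value on each side.
sample = np.array([[[-0.1, 0.2, 1.2]]])
converted = to_uint8(sample)
print(converted, converted.dtype)
```

The result could then be passed to the image writer in place of the float array.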