[PyTorch3D] API Documentation Study Notes

Author: 该努力啦

Last modified: 2023-09-25 18:03:09

Tags: pytorch, study notes

First published: 2023-09-09 09:34:30

Article link: https://blog.csdn.net/weixin_59941945/article/details/132759868

Official documentation: Welcome to PyTorch3D's documentation! (PyTorch3D documentation)

PyTorch3D is a deep learning library for processing 3D data.

1.pytorch3d.structures

1.1 Meshes()

class pytorch3d.structures.Meshes(verts, faces, textures=None, *, verts_normals=None)

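This class provides functions for working with batches of triangulated meshes that have varying numbers of faces and vertices, and for converting between representations. It is used to define meshes. Parameters: verts (vertices), faces, textures, and verts_normals (vertex normals).

In Meshes, the face and vertex data have three different representations:

List: only used for input, as a starting point for converting to the other representations
Padded: has a specific batch dimension
Packed: no batch dimension; has auxiliary variables used to index into the padded representation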
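
To make the three layouts concrete, here is a minimal pure-Python sketch. It uses plain nested lists instead of tensors and does not call PyTorch3D; the helper names to_padded and to_packed are hypothetical and only illustrate the idea. In the library itself, Meshes exposes the real conversions through methods such as verts_list(), verts_padded() and verts_packed().

```python
# Two meshes with different vertex counts, given in "list" form.
verts_list = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],                   # mesh 0: 3 verts
    [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0]],  # mesh 1: 4 verts
]


def to_padded(items, pad_value=0.0):
    """Pad every mesh to the largest vertex count, giving shape (N, max_V, 3)."""
    max_len = max(len(verts) for verts in items)
    return [
        verts + [[pad_value] * 3 for _ in range(max_len - len(verts))]
        for verts in items
    ]


def to_packed(items):
    """Concatenate all meshes into shape (sum_V, 3), plus a mesh index per vertex."""
    packed = [vert for verts in items for vert in verts]
    mesh_idx = [i for i, verts in enumerate(items) for _ in verts]
    return packed, mesh_idx


verts_padded = to_padded(verts_list)
verts_packed, packed_to_mesh_idx = to_packed(verts_list)

print(len(verts_padded), len(verts_padded[0]))  # batch size, padded vertex count
print(len(verts_packed), packed_to_mesh_idx)    # total vertices, owning mesh per vertex
```

The padded form wastes some memory on filler rows but keeps a batch dimension; the packed form is compact and relies on the index list to say which mesh each row belongs to.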

2.pytorch3d.io

2.1 load_obj()

pytorch3d.io.load_obj(f, load_textures: bool = True,
                      create_texture_atlas: bool = False,
                      texture_atlas_size: int = 4,
                      texture_wrap: Optional[str] = 'repeat',
                      device: Union[str, torch.device] = 'cpu',
                      path_manager: Optional[iopath.common.file_io.PathManager] = None)

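Loads a mesh from a .obj file and, optionally, textures from a .mtl file. Currently it handles vertices, faces, vertex texture UV coordinates, normals, texture images, and material reflectivity values.

Note that .obj files are 1-indexed, while the tensors returned by this function are 0-indexed. See the OBJ specification for reference.

Example of the .obj file format: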

# this is a comment
v 1.000000 -1.000000 -1.000000
v 1.000000 -1.000000 1.000000
v -1.000000 -1.000000 1.000000
v -1.000000 -1.000000 -1.000000
v 1.000000 1.000000 -1.000000
vt 0.748573 0.750412
vt 0.749279 0.501284
vt 0.999110 0.501077
vt 0.999455 0.750380
vn 0.000000 0.000000 -1.000000
vn -1.000000 -0.000000 -0.000000
vn -0.000000 -0.000000 1.000000
f 5/2/1 1/2/1 4/3/1
f 5/1/1 4/3/1 2/4/1

The first token of each line indicates the type of data:

v is a vertex
vt is the texture coordinate of one vertex
vn is the normal of one vertex
f is a face

The faces are interpreted as follows:

5/2/1 describes the first vertex of the first triangle

5: index of vertex [1.000000 1.000000 -1.000000]
2: index of texture coordinate [0.749279 0.501284]
1: index of normal [0.000000 0.000000 -1.000000]
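
The 1-indexed to 0-indexed conversion can be sketched in a few lines of plain Python. This is only an illustration of the index arithmetic, not the parser used by load_obj; the helper name parse_face_vertex is hypothetical, and the sketch ignores negative (relative) indices, which the OBJ format also allows.

```python
def parse_face_vertex(token):
    """Split an OBJ face entry such as '5/2/1' into 0-based (vertex, texture, normal) indices."""
    parts = token.split("/")
    # Missing or empty fields (e.g. '5//1' or '5') become None.
    parts += [""] * (3 - len(parts))
    return tuple(int(p) - 1 if p else None for p in parts)


face_line = "f 5/2/1 1/2/1 4/3/1"
corners = [parse_face_vertex(tok) for tok in face_line.split()[1:]]

verts_idx = [c[0] for c in corners]
textures_idx = [c[1] for c in corners]
normals_idx = [c[2] for c in corners]
print(verts_idx, textures_idx, normals_idx)
```

Each index is simply shifted down by one, so the fifth v line in the file becomes row 4 of the vertex tensor.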

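If a face has more than three vertices, it is subdivided into triangles. Polygon vertices are assumed to be ordered counter-clockwise, so that the (right-handed) normal points out of the screen. For a proper rectangular face with corners labeled 0 to 3, the face is split into two triangles, (0, 2, 1) and (0, 3, 2), both of which are also counter-clockwise and have normals pointing out of the screen (right-hand rule).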
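
The orientation claim can be checked with a signed-area test. The sketch below assumes a specific layout for the rectangle (x to the right, y up, corner 0 at the top left, 1 at the top right, 2 at the bottom right, 3 at the bottom left); that layout is an assumption made for this illustration, and the coordinates are toy values.

```python
def signed_area(a, b, c):
    """Twice the signed area of triangle (a, b, c); positive means counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


# Assumed corner layout of a unit square (x right, y up).
corners = {0: (0.0, 1.0), 1: (1.0, 1.0), 2: (1.0, 0.0), 3: (0.0, 0.0)}
triangles = [(0, 2, 1), (0, 3, 2)]

areas = [signed_area(*(corners[i] for i in tri)) for tri in triangles]
is_ccw = [area > 0 for area in areas]
print(areas, is_ccw)
```

A positive value for both triangles means both wind counter-clockwise when viewed from the front, which is what makes their normals point toward the viewer under the right-hand rule.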

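Parameters:

f: the input file; may be a file-like object, a path, or a string containing a file name.
load_textures: Boolean indicating whether the material files (textures) are loaded.
create_texture_atlas: Boolean; if True, a per-face texture map is created, and the returned aux also contains a texture_atlas tensor.
texture_atlas_size: integer specifying the resolution of the per-face texture map when create_texture_atlas=True. A (texture_atlas_size, texture_atlas_size, 3) map is created for each face.
texture_wrap: string, one of ["repeat", "clamp"]. If texture_wrap="repeat", for uv values outside the range [0, 1] the integer part is ignored and a repeating pattern is formed. If texture_wrap="clamp", the values are clamped to the range [0, 1]. If None, no transformation is applied to the texture values.
device: Device (as str or torch.device) on which to return the new tensors.
path_manager: optionally, a PathManager object used to interpret paths.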
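
The two wrap modes are easy to mix up, so here is a small scalar sketch of what they do to a single uv value. It is plain Python written for this note, with a hypothetical helper name wrap_uv; the library applies the same idea to whole tensors.

```python
import math


def wrap_uv(value, texture_wrap="repeat"):
    """Map one uv coordinate according to the chosen wrap mode."""
    if texture_wrap == "repeat":
        # Drop the integer part so the texture tiles: 1.25 -> 0.25, -0.25 -> 0.75.
        return value - math.floor(value)
    if texture_wrap == "clamp":
        # Saturate at the borders of the [0, 1] range.
        return min(max(value, 0.0), 1.0)
    if texture_wrap is None:
        # Leave the value untouched.
        return value
    raise ValueError(f"unknown texture_wrap: {texture_wrap!r}")


for mode in ("repeat", "clamp", None):
    print(mode, [wrap_uv(u, mode) for u in (-0.25, 0.5, 1.25)])
```

With "repeat" an out-of-range coordinate lands somewhere inside the texture again, while "clamp" pins it to the nearest edge.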

Returns: a 3-tuple (verts, faces, aux)

verts: FloatTensor of shape (V, 3).

faces: NamedTuple with fields:

verts_idx: LongTensor of vertex indices, shape (F, 3).
normals_idx: (optional) LongTensor of normal indices, shape (F, 3).
textures_idx: (optional) LongTensor of texture indices, shape (F, 3). This can be used to index into verts_uvs.
materials_idx: (optional) List of indices indicating which material the texture is derived from for each face. If there is no material for a face, the index is -1. This can be used to retrieve the corresponding values in material_colors/texture_images after they have been converted to tensors or Materials/Textures data structures - see textures.py and materials.py for more info.
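
As a rough illustration of how these index fields are used, the sketch below gathers per-face uv coordinates from textures_idx and finds faces without a material. It uses toy Python lists in place of tensors; the values are made up, and verts_uvs stands in for the field of the same name returned in aux.

```python
# Toy stand-ins for the tensors returned by load_obj (values are made up).
verts_uvs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # shape (T, 2)
textures_idx = [(0, 1, 2), (2, 1, 3)]                          # shape (F, 3)
materials_idx = [0, -1]                                        # -1 means "no material"

# One uv pair per face corner, shape (F, 3, 2).
face_uvs = [[verts_uvs[i] for i in tri] for tri in textures_idx]

# Faces whose material index is -1 have no material assigned.
faces_without_material = [f for f, m in enumerate(materials_idx) if m == -1]

print(face_uvs)
print(faces_without_material)
```

With real tensors the same gather is usually written as a single indexing expression, verts_uvs[textures_idx].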

aux: NamedTuple with fields:

normals: FloatTensor of shape (N, 3)
verts_uvs: FloatTensor of shape (T, 2), giving the uv coordinate per vertex. If a vertex is shared between two faces, it can have a different uv value for each instance. Therefore the number of verts_uvs can be greater than the number of vertices, i.e. T > V.
material_colors: if load_textures=True and the material has associated properties, this will be a dict mapping material names to their properties.

3.pytorch3d.renderer

3.1 cameras

class pytorch3d.renderer.cameras.CamerasBase(dtype: torch.dtype = torch.float32, device: Union[str, torch.device] = 'cpu', **kwargs)

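CamerasBase is the base class for all cameras.

For cameras, there are four different coordinate systems:

World coordinate system: the system in which the object lives, i.e. the world.

Camera view coordinate system: the system that has its origin at the camera and its Z axis perpendicular to the image plane. In PyTorch3D, we assume that +X points left, +Y points up, and +Z points out from the image plane. World coordinates are transformed to camera view coordinates by applying a rotation (R) and a translation (T).

NDC coordinate system: the normalized coordinate system that confines points to a volume containing the rendered part of the object or scene, also known as the view volume. For square images, under the PyTorch3D convention, (+1, +1, znear) is the top-left near corner and (-1, -1, zfar) is the bottom-right far corner of the volume. The transformation from view coordinates to NDC happens after applying the camera projection matrix (P), if the camera is defined in NDC space. For non-square images, points are scaled so that the smaller side has range [-1, 1] and the larger side has range [-u, u], with u > 1.

Screen coordinate system: another representation of the view volume, with the XY coordinates defined in image space instead of a normalized space.

A description of the coordinate systems can be found in pytorch3d/docs/notes/cameras.md.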
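
For the non-square case, the value of u is just the aspect ratio of the image. The helper below is a small sketch written for this note (the name ndc_half_extent is hypothetical) that returns the half-extent of the NDC range along x and y for a given image size.

```python
def ndc_half_extent(image_height, image_width):
    """Return (x_half, y_half): NDC x spans [-x_half, x_half] and y spans [-y_half, y_half].

    The shorter image side always maps to [-1, 1]; the longer side maps to
    [-u, u], where u is the aspect ratio (long side / short side).
    """
    shorter = min(image_height, image_width)
    return image_width / shorter, image_height / shorter


for height, width in [(256, 256), (256, 512), (512, 256)]:
    print((height, width), ndc_half_extent(height, width))
```

A 256 x 512 (height x width) image therefore has x in [-2, 2] and y in [-1, 1], while a square image keeps both axes in [-1, 1].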

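CamerasBase defines methods that are common to all camera models:

get_camera_center: returns the coordinates of the camera's optical center in the world coordinate system

Setting R or T when calling this method updates the values set in __init__, since these values may be needed later in the rendering pipeline, e.g. for lighting calculations.

Returns: a batch of 3D locations of shape (N, 3), denoting the center of each camera in the batch.

get_world_to_view_transform: returns the 3D transform (R, T) from world coordinates to camera view coordinates

get_full_projection_transform: the full transform, which composes the projection transform (P) with the rigid transform (R, T)

get_ndc_camera_transform: defines the transform from screen/NDC space to PyTorch3D's NDC space

transform_points_ndc: takes a set of points in world coordinates and projects them to PyTorch3D's NDC coordinate system

transform_points_screen: takes a set of points in world coordinates and projects them to the screen (pixel) coordinate system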
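
To see how the world-to-view transform and the camera center relate, here is a pure-Python sketch for a single camera. It assumes PyTorch3D's row-vector convention, X_cam = X_world R + T, under which the camera center is the world point that maps to the view-space origin, C = -T R^T. The function names and the toy R and T values are made up for this note; the library works on batched tensors instead.

```python
def world_to_view(point, R, T):
    """Row-vector convention: X_cam = X_world @ R + T."""
    return tuple(
        sum(point[k] * R[k][j] for k in range(3)) + T[j] for j in range(3)
    )


def camera_center(R, T):
    """World-space point that maps to the view-space origin: C = -T @ R^T."""
    return tuple(-sum(T[k] * R[j][k] for k in range(3)) for j in range(3))


# Toy extrinsics: a 90 degree rotation about the Y axis and a shift along +Z.
R = [
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
]
T = (0.0, 0.0, 3.0)

center = camera_center(R, T)
center_in_view = world_to_view(center, R, T)
print(center, center_in_view)
```

Transforming the computed center back through world_to_view gives the origin, which is a quick sanity check that R and T are being applied in the intended order.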

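For each new camera, one should implement the get_projection_transform routine, which returns the mapping from camera view coordinates to camera coordinates (NDC or screen).

Another useful function that is specific to each camera model is unproject_points, which sends points from camera coordinates (NDC or screen) back to camera view coordinates or world coordinates, depending on the function's world_coordinates boolean argument.

get_projection_transform(**kwargs): calculates the projective transformation matrix

**kwargs: projection parameters can be passed in as keyword arguments to override the default values set in __init__.

Returns: a Transform3d object which represents a batch of projection matrices of shape (N, 3, 3)

unproject_points(xy_depth: torch.Tensor, **kwargs)
Parameters:
xy_depth: torch tensor of shape (…, 3).
world_coordinates: if True, unprojects the points back to world coordinates using the camera extrinsics R and T. If False, ignores R and T and unprojects to camera view coordinates.
from_ndc: if False (default), assumes the xy part of the input is in NDC space if self.in_ndc(), otherwise in screen space. If True, assumes xy is in NDC space even if the camera is defined in screen space.

Returns: new_points, the unprojected points, with the same shape as xy_depth.

Purpose: transforms input points from camera coordinates (NDC or screen) to world or camera view coordinates.

Each input point in xy_depth has shape (…, 3) and consists of an xy location and a depth.
For instance, for an input 2D tensor of shape (num_points, 3), xy_depth takes the following form:
xy_depth[i] = [x[i], y[i], depth[i]],
for each point at index i.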

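The following example demonstrates the relationship between transform_points and unproject_points:
import torch
from pytorch3d.renderer import FoVPerspectiveCameras, look_at_view_transform

# any camera object derived from CamerasBase works; FoVPerspectiveCameras is used here as an example
R, T = look_at_view_transform(dist=2.7, elev=0.0, azim=0.0)
cameras = FoVPerspectiveCameras(R=R, T=T)
# 3D points of shape (batch_size, num_points, 3), placed near the origin in front of the camera
xyz = torch.rand(1, 10, 3) - 0.5
# transform xyz to the camera view coordinates
xyz_cam = cameras.get_world_to_view_transform().transform_points(xyz)
# extract the depth of each point as the 3rd coord of xyz_cam
depth = xyz_cam[:, :, 2:]
# project the points xyz to the camera
xy = cameras.transform_points(xyz)[:, :, :2]
# append depth to xy
xy_depth = torch.cat((xy, depth), dim=2)
# unproject to the world coordinates
xyz_unproj_world = cameras.unproject_points(xy_depth, world_coordinates=True)
# a loose tolerance absorbs float32 round-off from the projection round trip
print(torch.allclose(xyz, xyz_unproj_world, atol=1e-4)) # True
# unproject to the camera coordinates
xyz_unproj = cameras.unproject_points(xy_depth, world_coordinates=False)
print(torch.allclose(xyz_cam, xyz_unproj, atol=1e-4)) # True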

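3.2 look_at_view_transform

pytorch3d.renderer.cameras.look_at_view_transform(
    dist=1.0, elev=0.0, azim=0.0,
    degrees: bool = True,
    eye=None,
    at=((0, 0, 0),),
    up=((0, 1, 0),),
    device='cpu'
)
This function returns a rotation and a translation (R, T) that apply the transformation from world coordinates to camera view coordinates.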
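
When eye is not given, the camera position is derived from dist, elev and azim. The sketch below reproduces that spherical-to-Cartesian step in plain Python as I understand the convention: elev is the angle above the horizontal plane y = 0, and azim is measured in that plane from the reference direction (0, 0, 1). It assumes the look-at point is the origin, and the function name camera_position is made up for this note; it is not the library routine.

```python
import math


def camera_position(dist, elev, azim, degrees=True):
    """Camera location on a sphere of radius dist around the origin."""
    if degrees:
        elev, azim = math.radians(elev), math.radians(azim)
    x = dist * math.cos(elev) * math.sin(azim)
    y = dist * math.sin(elev)
    z = dist * math.cos(elev) * math.cos(azim)
    return (x, y, z)


for elev, azim in [(0, 0), (90, 0), (0, 90)]:
    position = camera_position(2.0, elev, azim)
    print((elev, azim), tuple(round(c, 6) for c in position))
```

With elev = azim = 0 the camera sits on the +Z axis looking back at the origin; raising elev moves it toward +Y, and increasing azim swings it toward +X.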