The Hong Kong University of Science and Technology proposed a new framework called PSHuman. It exploits the prior knowledge of a multi-view diffusion model to reconstruct a person more faithfully from a single image. Applying multi-view diffusion directly to a single-view image tends to produce severe geometric distortion, especially in the generated facial details. To address this, the authors designed a cross-scale diffusion scheme that jointly models the overall body shape and local facial features, producing results that are detailed and realistic without geometric artifacts.
GitHub repo:
Model download:
https://huggingface.co/pengHTYX/PSHuman_Unclip_768_6views
Demo:
https://huggingface.co/spaces/fffiloni/PSHuman
Dependencies:
The torch version currently has to be pinned exactly; both newer and older versions run into incompatibilities.
Installing xformers or torch_scatter will automatically pull in torch 2.5.
torch version:
torch 2.5.1+cu124
torchvision 0.20.1+cu124
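A quick check that the pinned builds are actually what got installed (a minimal sketch):
import torch
import torchvision

# expect 2.5.1+cu124 / 0.20.1+cu124 and a visible GPU
print(torch.__version__, torchvision.__version__)
print(torch.version.cuda, torch.cuda.is_available())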
pip install rembg
pip install open3d
pip install yacs
pip install pymeshlab
pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.5.1_cu121.html
pip install icecream
pip install torch_scatter
pip install xatlas
pip install diffusers==0.27.2
Dependency test:
python -c "from kaolin.ops.mesh import check_sign"
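A slightly broader smoke test for the other pip dependencies (it only checks that the imports resolve and that the pinned versions are in place):
python -c "import rembg, open3d, yacs, pymeshlab, kaolin, icecream, torch_scatter, xatlas, diffusers; print(kaolin.__version__, diffusers.__version__)"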
xformers installation:
Without updating it, you get the error:
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
Fix: install xformers:
pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu121
After the install, torch is automatically bumped to 2.5.1.
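To verify the memory-efficient attention kernels actually work after the upgrade (a minimal sketch, assuming a CUDA GPU and fp16 inputs):
import torch
import xformers
import xformers.ops

print(xformers.__version__, torch.__version__)
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
# should run without the NotImplementedError shown above
out = xformers.ops.memory_efficient_attention(q, q, q)
print(out.shape)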
nvdiffrast installation:
https://github.com/NVlabs/nvdiffrast
Clone the repo, then install from source:
cd nvdiffrast
pip install .
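A quick check that nvdiffrast built against the CUDA toolchain correctly (a minimal sketch; the first call compiles the CUDA extension, so it can take a moment):
import nvdiffrast.torch as dr

glctx = dr.RasterizeCudaContext()  # raises if the CUDA build is broken
print(glctx)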
Loading the model:
import torch
from diffusers import StableUnCLIPImg2ImgPipeline  # pipeline class as named in the original snippet; PSHuman may use its own subclass

weight_dtype = torch.float16  # assumed default; not defined in the original snippet

def load_pshuman_pipeline(cfg, device="cuda"):
    pipeline = StableUnCLIPImg2ImgPipeline.from_pretrained(
        cfg.pretrained_model_name_or_path, torch_dtype=weight_dtype)
    pipeline.unet.enable_xformers_memory_efficient_attention()
    if torch.cuda.is_available():
        pipeline.to(device)
    return pipeline
pretrained_model_name_or_path (local model path):
'/shared_disk/comfyui/models/models--pengHTYX--PSHuman_Unclip_768_6views'
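A minimal usage sketch of the loader above; the config object only needs the one field the function reads (the field name is assumed to mirror the YAML key):
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "pretrained_model_name_or_path": "/shared_disk/comfyui/models/models--pengHTYX--PSHuman_Unclip_768_6views",
})
pipeline = load_pshuman_pipeline(cfg, device="cuda")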
u2net.onnx:
On first use, rembg downloads 'https://github.com/danielgatis/rembg/releases/download/v0.0.0/u2net.onnx' to '/root/.u2net/u2net.onnx'.
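If the machine cannot reach GitHub, download u2net.onnx manually and place it at ~/.u2net/u2net.onnx. A quick rembg sanity check (a minimal sketch; the input path is hypothetical, any RGB image works):
from rembg import remove
from PIL import Image

img = Image.open("examples/person.jpg")  # hypothetical input path
rgba = remove(img)                        # triggers the u2net.onnx download on first use
rgba.save("person_rgba.png")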
omegaconf.errors.ConfigKeyError: Key ' ' not in 'TestConfig'
Temporary workaround (the blank key is likely a stray ' ' token in the CLI overrides, e.g. a space after a trailing backslash); patch omegaconf's dictconfig.py to ignore it:
/mnt/pfs/users/lbg/envs/py310wj/lib/python3.10/site-packages/omegaconf/dictconfig.py
def _validate_get(self, key: Any, value: Any = None) -> None:
    is_typed = self._is_typed()
    if key == " ":   # added: silently skip the blank key instead of raising ConfigKeyError
        return
    # ... rest of the original method unchanged
Run inference:
CUDA_VISIBLE_DEVICES=0 python inference.py --config configs/inference-768-6view.yaml \
pretrained_model_name_or_path='/xxx/models/models--pengHTYX--PSHuman_Unclip_768_6views' \
validation_dataset.crop_size=740 \
with_smpl=false \
validation_dataset.root_dir=examples \
seed=600 \
num_views=7 \
save_mode='rgb'
smpl_data download:
huggingface-cli download --resume-download lilpotat/pytorch3d --local-dir lilpotat--pytorch3d
Copy it to:
smpl_related/smpl_data
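The same download can be scripted (a sketch using huggingface_hub, which huggingface-cli ships with):
import shutil
from huggingface_hub import snapshot_download

local = snapshot_download(repo_id="lilpotat/pytorch3d", local_dir="lilpotat--pytorch3d")
# copy the SMPL assets into the folder PSHuman expects
shutil.copytree(local, "smpl_related/smpl_data", dirs_exist_ok=True)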
Fixing the dino_feature error:
UNet2DConditionModel.forward() got an unexpected keyword argument 'dino_feature' (the actual fix is described in a separate CSDN post).
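The error means the loaded UNet's forward() does not accept dino_feature; a quick diagnostic, with the pipeline loaded as above, to see which UNet class actually ended up in the pipeline (hedged check only, not the fix itself):
import inspect

print(type(pipeline.unet))
print("dino_feature" in inspect.signature(pipeline.unet.forward).parameters)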
Model files:
diffusion_pytorch_model.safetensors:
model.safetensors:
Full inference config (a TestConfig, dumped via omegaconf) that inference.py ends up with:
{
'pretrained_model_name_or_path': 'stabilityai/stable-diffusion-2-1-unclip',
'revision': None,
'validation_dataset': {
'prompt_embeds_path': 'mvdiffusion/data/fixed_prompt_embeds_7view',
'root_dir': 'test_data/t_pose',
'num_views': 7,
'bg_color': 'white',
'img_wh': [768, 768],
'num_validation_samples': 1000,
'crop_size': 740,
'margin_size': 50,
'smpl_folder': 'smpl_image_pymaf'
},
'save_dir': 'mv_results',
'seed': 42,
'validation_batch_size': 1,
'dataloader_num_workers': 1,
'save_mode': 'rgba',
'local_rank': -1,
'pipe_kwargs': {
'num_views': 7
},
'pipe_validation_kwargs': {
'num_inference_steps': 40,
'eta': 1.0
},
'unet_from_pretrained_kwargs': {
'unclip': True,
'sdxl': False,
'num_views': 7,
'sample_size': 96,
'zero_init_conv_in': False,
'projection_camera_embeddings_input_dim': 2,
'zero_init_camera_projection': False,
'num_regress_blocks': 3,
'cd_attention_last': False,
'cd_attention_mid': False,
'multiview_attention': True,
'sparse_mv_attention': True,
'selfattn_block': 'self_rowwise',
'mvcd_attention': True
},
'validation_guidance_scales': 3.0,
'validation_grid_nrow': 7,
'num_views': 7,
'enable_xformers_memory_efficient_attention': True,
'with_smpl': False,
'recon_opt': {
'res_path': 'examples/out',
'save_glb': False,
'num_view': 6,
'scale': 4,
'mode': 'ortho',
'resolution': 1024,
'cam_path': 'mvdiffusion/data/six_human_pose',
'iters': 1000,
'clr_iters': 200,
'debug': False,
'snapshot_step': 100,
'lr_clr': 0.002,
'gpu_id': 0,
'replace_hand': True
}
}
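The CLI overrides passed to inference.py (crop_size, seed, etc.) are merged onto this base config as an omegaconf dot-list; a minimal sketch of that mechanism (not necessarily the exact code path in inference.py):
from omegaconf import OmegaConf

base = OmegaConf.load("configs/inference-768-6view.yaml")
cli = OmegaConf.from_dotlist([
    "validation_dataset.crop_size=740",
    "with_smpl=false",
    "seed=600",
    "num_views=7",
    "save_mode=rgb",
])
cfg = OmegaConf.merge(base, cli)
print(cfg.validation_dataset.crop_size, cfg.seed, cfg.save_mode)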
Inference steps:
- Given a human image, we use Clipdrop or rembg to remove the background. For the latter, a simple script is provided:
python utils/remove_bg.py --path $DATA_PATH$
Then put the RGBA images in $DATA_PATH$.
- By running inference.py, the textured mesh and rendered video are saved in out.
CUDA_VISIBLE_DEVICES=$GPU python inference.py --config configs/inference-768-6view.yaml \
pretrained_model_name_or_path='pengHTYX/PSHuman_Unclip_768_6views' \
validation_dataset.crop_size=740 \
with_smpl=false \
validation_dataset.root_dir=$DATA_PATH$ \
seed=600 \
num_views=7 \
save_mode='rgb'
SMPLDataset:
import os
import sys
import torch

# run from this script's directory and put the PSHuman repo roots on the import path
os.chdir(os.path.dirname(os.path.abspath(__file__)))
current_dir = os.path.dirname(os.path.abspath(__file__))
print('current_dir', current_dir)
paths = [os.path.abspath(__file__).split('scripts')[0]]
paths.append(os.path.abspath(os.path.join(current_dir, './')))
paths.append(os.path.abspath(os.path.join(current_dir, '../')))
for path in paths:
    sys.path.insert(0, path)
    os.environ['PYTHONPATH'] = (os.environ.get('PYTHONPATH', '') + ':' + path).strip(':')

from econdataset import SMPLDataset  # import after the repo paths are registered

class Params:
    pass

validation_dataset = Params()
validation_dataset.root_dir = 'examples'
dataset_param = {'image_dir': validation_dataset.root_dir, 'seg_dir': None, 'colab': False, 'has_det': True, 'hps_type': 'pixie'}
econdata = SMPLDataset(dataset_param, device='cuda')
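What a sample looks like depends on the econdataset implementation and the chosen hps_type; a hedged way to inspect one entry:
sample = econdata[0]
print(type(sample))
if isinstance(sample, dict):
    print(list(sample.keys()))  # field names vary with the hps_type / detector used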
Setting the render background color to black:
In PSHuman/lib/common/render.py, hard-code bg="black" at the top of init_renderer:
def init_renderer(self, camera, type="clean_mesh", bg="black"):
    bg = "black"  # force a black background regardless of the caller's argument
    if "mesh" in type:
        # rasterizer
        self.raster_settings_mesh = RasterizationSettings(
            image_size=self.size,
            blur_radius=np.log(1.0 / 1e-4) * 1e-7,
            faces_per_pixel=30,
        )
        self.meshRas = MeshRasterizer(
            cameras=camera, raster_settings=self.raster_settings_mesh)
        if bg == "black":
            # BlendParams(sigma, gamma, background_color)
            blendparam = BlendParams(1e-4, 1e-4, (0.0, 0.0, 0.0))
        elif bg == "white":
            blendparam = BlendParams(1e-4, 1e-8, (1.0, 1.0, 1.0))
        elif bg == "gray":
            blendparam = BlendParams(1e-4, 1e-8, (0.5, 0.5, 0.5))

    if type == "ori_mesh":
        lights = PointLights(
            device=self.device,
            ambient_color=((0.8, 0.8, 0.8), ),
            diffuse_color=((0.2, 0.2, 0.2), ),
            specular_color=((0.0, 0.0, 0.0), ),
            location=[[0.0, 200.0, 0.0]],
        )
        self.renderer = MeshRenderer(
            rasterizer=self.meshRas,
            shader=SoftPhongShader(
                device=self.device,
                cameras=camera,
                lights=None,
                blend_params=blendparam,
            ),
        )

    if type == "silhouette":
        self.raster_settings_silhouette = RasterizationSettings(
            image_size=self.size,
            blur_radius=np.log(1.0 / 1e-4 - 1.0) * 5e-5,
            faces_per_pixel=50,
            cull_backfaces=True,
        )
        self.silhouetteRas = MeshRasterizer(
            cameras=camera,
            raster_settings=self.raster_settings_silhouette)
        self.renderer = MeshRenderer(rasterizer=self.silhouetteRas,
                                     shader=SoftSilhouetteShader())

    if type == "pointcloud":
        self.raster_settings_pcd = PointsRasterizationSettings(
            image_size=self.size, radius=0.006, points_per_pixel=10)
        self.pcdRas = PointsRasterizer(
            cameras=camera, raster_settings=self.raster_settings_pcd)
        self.renderer = PointsRenderer(
            rasterizer=self.pcdRas,
            compositor=AlphaCompositor(background_color=(0, 0, 0)),
        )

    if type == "clean_mesh":
        self.renderer = MeshRenderer(
            rasterizer=self.meshRas,
            shader=cleanShader(device=self.device,
                               cameras=camera,
                               blend_params=blendparam),
        )