Code Notes
victordibia/handtracking
Handtrack.js Examples in the Browser
Application: cutting sign-language gesture clips from video
Contents
➜ tree -L 2
.
├── hand_inference_graph
│   ├── frozen_inference_graph.pb
│   ├── frozen_inference_graph_141\ 14-51-46-798.pb
│   └── hand_label_map.pbtxt
├── protos
│   ├── BUILD
│   ├── __init__.py
│   ├── anchor_generator.proto
│   ├── anchor_generator_pb2.py
│   ├── argmax_matcher.proto
│   ├── argmax_matcher_pb2.py
│   ├── bipartite_matcher.proto
│   ├── bipartite_matcher_pb2.py
│   ├── box_coder.proto
│   ├── box_coder_pb2.py
│   ├── box_predictor.proto
│   ├── box_predictor_pb2.py
│   ├── eval.proto
│   ├── eval_pb2.py
│   ├── faster_rcnn.proto
│   ├── faster_rcnn_box_coder.proto
│   ├── faster_rcnn_box_coder_pb2.py
│   ├── faster_rcnn_pb2.py
│   ├── grid_anchor_generator.proto
│   ├── grid_anchor_generator_pb2.py
│   ├── hyperparams.proto
│   ├── hyperparams_pb2.py
│   ├── image_resizer.proto
│   ├── image_resizer_pb2.py
│   ├── input_reader.proto
│   ├── input_reader_pb2.py
│   ├── losses.proto
│   ├── losses_pb2.py
│   ├── matcher.proto
│   ├── matcher_pb2.py
│   ├── mean_stddev_box_coder.proto
│   ├── mean_stddev_box_coder_pb2.py
│   ├── model.proto
│   ├── model_pb2.py
│   ├── optimizer.proto
│   ├── optimizer_pb2.py
│   ├── pipeline.proto
│   ├── pipeline_pb2.py
│   ├── post_processing.proto
│   ├── post_processing_pb2.py
│   ├── preprocessor.proto
│   ├── preprocessor_pb2.py
│   ├── region_similarity_calculator.proto
│   ├── region_similarity_calculator_pb2.py
│   ├── square_box_coder.proto
│   ├── square_box_coder_pb2.py
│   ├── ssd.proto
│   ├── ssd_anchor_generator.proto
│   ├── ssd_anchor_generator_pb2.py
│   ├── ssd_pb2.py
│   ├── string_int_label_map.proto
│   ├── string_int_label_map_pb2.py
│   ├── train.proto
│   └── train_pb2.py
└── utils
    ├── __init__.py
    ├── detector_utils.py
    └── label_map_util.py
Usage
from utils import detector_utils
import numpy as np
import cv2

detection_graph, sess = detector_utils.load_inference_graph()

def detect_video(video_url: str, score_thresh: float):
    vc = cv2.VideoCapture(video_url)
    while vc.isOpened():  # loop over frames
        ret, frame = vc.read()
        if not ret:  # no more frames
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        boxes, scores = detector_utils.detect_objects(frame, detection_graph, sess)
        # keep only detections at or above the confidence threshold;
        # np.where returns a tuple of index arrays, so test keep[0].size,
        # not truthiness of the tuple itself
        keep = np.where(scores >= score_thresh)
        if keep[0].size == 0:
            continue
        scores = scores[keep]
        boxes = boxes[keep]
    vc.release()  # release the capture
Details
Returns an array of classes and confidence scores that looks like:
[{
bbox: [x, y, width, height],
class: "hand",
score: 0.8380282521247864
}, {
bbox: [x, y, width, height],
class: "hand",
score: 0.74644153267145157
}]
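As a sketch of working with a prediction list of this shape, detections can be filtered by confidence (the `predictions` sample below is hypothetical, with made-up bbox values):

```python
# Hypothetical predictions in the shape shown above
predictions = [
    {"bbox": [50, 40, 120, 100], "class": "hand", "score": 0.8380282521247864},
    {"bbox": [200, 60, 110, 95], "class": "hand", "score": 0.34644153267145157},
]

# Keep only detections at or above a confidence threshold
confident = [p for p in predictions if p["score"] >= 0.7]
```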
Testing shows that the bbox values are all normalized to the frame's width and height, e.g. [0.12900555, 0.7695243, 0.24960786, 0.85627383],
and that their actual meaning is not [x, y, width, height] but [top, left, bottom, right]:
(left, right, top, bottom) = (boxes[1] * im_width, boxes[3] * im_width,
boxes[0] * im_height, boxes[2] * im_height)
box_center = ((left+right)//2, (top+bottom)//2) #(center_x, center_y)
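Putting the conversion above together, a minimal self-contained sketch that maps one normalized box to integer pixel corners (the box values are the sample quoted above; the 640×480 frame size is an assumption):

```python
im_width, im_height = 640, 480  # assumed frame size for illustration
# Sample normalized box in [top, left, bottom, right] order (from the note above)
box = [0.12900555, 0.7695243, 0.24960786, 0.85627383]

(left, right, top, bottom) = (box[1] * im_width, box[3] * im_width,
                              box[0] * im_height, box[2] * im_height)
p1 = (int(left), int(top))      # top-left corner in pixels
p2 = (int(right), int(bottom))  # bottom-right corner in pixels
# With OpenCV the box could then be drawn on the frame as:
# cv2.rectangle(frame, p1, p2, (77, 255, 9), 2)
```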
Others
FFMPEG
Video trimming
How to cut a video file with FFmpeg at a precise timestamp, rather than at frame boundaries: https://zhuanlan.zhihu.com/p/27366331
ffmpeg -ss <start> -t <duration> -accurate_seek -i <in.mp4> -codec copy -avoid_negative_ts 1 <out.mp4>
This copies the streams without re-encoding and exports the requested time range directly, but it sometimes extends the clip out to the nearest keyframe;
to avoid keyframe loss and cut at the exact time, it is better to re-encode the video:
ffmpeg -ss %s -i small.mp4 -t %s -c:v libx264 -c:a aac -strict experimental -b:a 96k %s.mp4
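The re-encoding command above can be driven from Python; a minimal sketch, assuming ffmpeg is on PATH (trim_cmd, the file names, and the timestamps are all made-up examples, not part of the original):

```python
import subprocess

def trim_cmd(src, start, duration, dst):
    """Build the re-encoding ffmpeg trim command shown above as an argument list."""
    return ["ffmpeg", "-ss", str(start), "-i", src,
            "-t", str(duration),
            "-c:v", "libx264", "-c:a", "aac",
            "-strict", "experimental", "-b:a", "96k",
            dst]

cmd = trim_cmd("small.mp4", "00:00:05", "00:00:03", "clip.mp4")
# subprocess.run(cmd, check=True)  # uncomment to actually invoke ffmpeg
```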
Video format conversion
The HTML5 video element only plays MP4 files encoded with H.264:
ffmpeg -i input.mp4 -vcodec h264 output.mp4
debugout.js
See http://inorganik.github.io/debugout.js/ — it saves console.log() output to debugout.txt
debugout has a buffer-size limit and cannot hold unbounded output;
adjust its autoTrim setting accordingly.
opencv-python
cv2.VideoWriter(save_path, fourcc, fps, size)
If size does not match the dimensions of the frames being written, the output file is corrupted (it cannot be opened).
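A minimal sketch of avoiding the mismatch by deriving size from the frames themselves (frame_size and open_writer are illustrative helper names, not OpenCV API):

```python
def frame_size(frame):
    """Return the (width, height) tuple that cv2.VideoWriter expects.
    Note the order: a frame's .shape is (height, width, channels)."""
    h, w = frame.shape[:2]
    return (w, h)

def open_writer(save_path, fps, first_frame):
    import cv2  # imported here so frame_size stays dependency-free
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    # Passing a size that differs from the frames later written corrupts the file
    return cv2.VideoWriter(save_path, fourcc, fps, frame_size(first_frame))
```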
Finally: if this post helped you, please give it a like ٩(๑•̀ω•́๑)۶