Computer Vision and Long Short-Term Memory: Learning to Predict Unsafe Behaviour in Construction

Robust Da

Last modified 2022-11-18 22:01:05

Category: Construction Machinery. Tags: learning, artificial intelligence

First published 2022-11-18 14:31:15

Article link: https://blog.csdn.net/dazheng121/article/details/127868871

Preface

In this paper, we combine computer vision with Long Short-Term Memory (LSTM) to predict unsafe behaviours from videos automatically.
Our proposed approach for predicting unsafe behaviour is based on: (1) tracking people using SiamMask; (2) predicting the trajectory of people using an improved Social-LSTM; and (3) predicting unsafe behaviour using Franklin’s point inclusion in polygon (PNPoly) algorithm.

Traditionally, root cause models juxtaposed with psychological theories have formed the basis for predicting people’s unsafe behaviour, as they help us understand how people’s behaviour changes under differing working conditions.
The models assume that people’s behaviour is planned; hence, they predict deliberate behaviour. However, they are “widely applied without sufficient attention paid to what makes [them] work in its contexts of origin, and without adequate customisation for the specifics”. Consequently, root cause models provide a “flawed reductionist view” of safety issues.
The data used for predictive modelling are usually derived from studies of unsafe behaviour, which makes their predictive value and their relevance to actual practice questionable.

Object Detection

Object Segmentation

Visual object tracking (VOT) generates an object’s trajectory over time by locating its position in the frames of a video.
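As a minimal sketch of what "generating a trajectory" means in practice, the snippet below turns per-frame bounding boxes into a sequence of centre points. It assumes the tracker already returns one box per frame in (x_min, y_min, x_max, y_max) pixel coordinates; the names `box_center` and `trajectory_from_boxes` are hypothetical and are not part of SiamMask or any other library.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]


def box_center(box: Box) -> Point:
    """Return the centre of an axis-aligned bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def trajectory_from_boxes(boxes: List[Box]) -> List[Point]:
    """Convert one box per frame into a trajectory of centre points."""
    return [box_center(b) for b in boxes]


# Hypothetical tracker output for three consecutive frames.
boxes = [(10, 20, 30, 60), (12, 20, 32, 60), (15, 21, 35, 61)]
trajectory = trajectory_from_boxes(boxes)
print(trajectory)
```

A real pipeline would more likely use a foot point or a mask centroid projected onto the ground plane, but the centre point is enough to illustrate the idea.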

VOT approaches:

deep learning-based
point tracking
kernel tracking

Tracking objects using a segmentation mask requires more computational power than a simple bounding box-based approach. SiamMask, a tracker built on a fully convolutional Siamese framework (SiamFC) and designed to track objects in real time, can be used to offset this extra computational cost.

This paper focuses on the use of SiamMask, Social-LSTM and the PNPoly algorithm.

The Social-LSTM considers the interactions between people when predicting their trajectories, which improves the robustness and accuracy of multi-person tracking. A Kalman filter is adopted to correct the Social-LSTM results and make the prediction method more robust. A Kalman filter is an optimal estimator that infers parameters of interest from indirect, inaccurate and uncertain observations; here it estimates a person’s walking motion from their historical trajectory.
Therefore, we correct the Social-LSTM results with the Kalman filter whenever a person’s predicted walking speed is inconsistent with the recorded speed.
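The trigger for this correction can be sketched as a simple speed comparison. The notes do not say how "inconsistent" is defined, so everything below is an assumption made for illustration: the relative `tolerance` of 0.5, the fixed sampling interval `dt`, and the function names `mean_speed` and `needs_correction`.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_speed(points: List[Point], dt: float) -> float:
    """Average speed along positions sampled every dt seconds."""
    if len(points) < 2:
        return 0.0
    distance = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return distance / (dt * (len(points) - 1))


def needs_correction(history: List[Point], predicted: List[Point],
                     dt: float, tolerance: float = 0.5) -> bool:
    """Flag a prediction whose speed deviates too much from the recorded speed.

    The relative threshold is an assumed criterion, not the paper's.
    """
    recorded = mean_speed(history, dt)
    # Measure the predicted speed starting from the last observed position.
    forecast = mean_speed([history[-1]] + predicted, dt)
    if recorded == 0.0:
        return forecast > 0.0
    return abs(forecast - recorded) / recorded > tolerance


history = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]       # about 1 unit per step
plausible = [(3.0, 0.0), (4.0, 0.0)]                  # same pace
implausible = [(6.0, 0.0), (10.0, 0.0)]               # four times faster
print(needs_correction(history, plausible, dt=1.0))    # False
print(needs_correction(history, implausible, dt=1.0))  # True
```

When the check returns True, the predicted points would be replaced or blended with the Kalman filter estimate; how exactly that blending is done is not specified in these notes.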

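Note:
Kalman filter: final value = k * observed value + (1 - k) * predicted value, where the weight k has to be tuned.
The Kalman filter can be seen as a better form of averaging: its weights are the optimal weights computed from the system state and the noise.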
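The weighted-average formula in the note can be made concrete with a one-dimensional Kalman filter. This is a minimal sketch under a constant-state model, not the filter used in the paper. In the standard filter the gain k is not set by hand: it is computed at every step from the estimate variance and the measurement noise variance, and what usually gets tuned are the two noise variances (`process_var` and `measurement_var` below, both assumed names).

```python
from typing import List, Tuple


def kalman_1d(observations: List[float], process_var: float,
              measurement_var: float, x0: float = 0.0,
              p0: float = 1.0) -> List[Tuple[float, float]]:
    """Scalar Kalman filter; returns (estimate, gain) for each observation."""
    x, p = x0, p0
    results = []
    for z in observations:
        # Predict: the state is assumed constant, its uncertainty grows.
        x_pred = x
        p_pred = p + process_var
        # Gain: how much to trust the observation over the prediction.
        k = p_pred / (p_pred + measurement_var)
        # Update: final value = k * observed + (1 - k) * predicted.
        x = k * z + (1.0 - k) * x_pred
        p = (1.0 - k) * p_pred
        results.append((x, k))
    return results


results = kalman_1d([1.0, 1.0], process_var=0.0, measurement_var=1.0)
for estimate, gain in results:
    print(f"estimate={estimate:.3f} gain={gain:.3f}")
```

With equal initial and measurement variance the first gain is 0.5, and it shrinks on later steps as the estimate becomes more certain, which is why the filter behaves like an average whose weights adapt over time.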

The PNPoly algorithm tests whether a point is inside a polygon (convex or concave) by counting how many times a ray cast from the test point crosses the polygon’s edges. If the count is odd, the point is inside the polygon; otherwise, it is outside.
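The ray-crossing test described above can be sketched in a few lines of Python, following the structure of the well-known C version of PNPoly. The hazard-zone usage below is an assumption about how the test might be applied to predicted trajectories; `first_unsafe_step` and `hazard_zone` are hypothetical names.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Return True if (x, y) lies inside the polygon (crossing-number test)."""
    inside = False
    j = len(polygon) - 1  # index of the previous vertex
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does a horizontal ray to the right of the point cross edge (j, i)?
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside  # each crossing flips the parity
        j = i
    return inside


def first_unsafe_step(trajectory: List[Point],
                      zone: List[Point]) -> Optional[int]:
    """Index of the first predicted point inside the zone, or None."""
    for step, (x, y) in enumerate(trajectory):
        if point_in_polygon(x, y, zone):
            return step
    return None


# Concave, L-shaped hazard zone (hypothetical coordinates).
hazard_zone = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
predicted = [(5.0, 3.0), (3.0, 3.0), (1.0, 3.0)]
print(first_unsafe_step(predicted, hazard_zone))
```

The point (3, 3) sits in the notch of the L and is correctly treated as outside, which is the case where a bounding-box check would give a false alarm. Points lying exactly on an edge may be classified either way by this test.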