Structural Attention-Based Recurrent Variational Autoencoder for Highway Vehicle Anomaly Detection

This post summarizes SABeR-VAE, an unsupervised framework for highway anomaly detection that exploits the structure of the environment: vehicle-vehicle and lane-vehicle attention networks, combined with a Koopman operator that propagates a continuous latent space, are used to predict vehicle trajectories and flag anomalous highway behaviors, with particular attention to how environmental factors shape and explain behavior.



Paper link

https://arxiv.org/abs/2301.03634

Code link

https://gitlab.engr.illinois.edu/hubris/highway-anomaly-detection

Keywords

Anomaly Detection; Autonomous Vehicles; Unsupervised Learning; Human Behavior Modeling

Abstract

We propose a novel unsupervised framework for highway anomaly detection named Structural Attention-Based Recurrent VAE (SABeR-VAE), ==which explicitly uses the structure of the environment to aid anomaly identification==. Specifically, a vehicle self-attention module learns the relations among vehicles on a road, and a separate lane-vehicle attention module models the importance of permissible lanes to aid in trajectory prediction. Conditioned on the attention modules' outputs, a recurrent encoder-decoder architecture with a stochastic Koopman operator-propagated latent space predicts the next states of vehicles.

Modeling environmental factors is essential to detecting a diverse set of anomalies in deployment: interaction-aware methods that ignore the effect of road structure on vehicle behavior can miss abnormal scenarios such as wrong-way driving, whose trajectories appear normal when environmental context is overlooked.

Introduction

Specifically, a neural network, which generally follows an ==encoder-decoder architecture== for trajectory reconstruction or prediction, learns an underlying distribution of normal vehicle trajectories in the latent space.
An anomaly is then detected whenever a trajectory is out of distribution and produces a large reconstruction or prediction error.
To ensure interpretability, we use a variational autoencoder (VAE) to cluster useful features from similar behaviors together in a continuous and stochastic latent space.
Specifically, we treat a highway scenario as a structured interaction graph whose nodes represent vehicles and lane positions, and whose edges connect nearby vehicles and link vehicles to their permissible lanes.
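As a rough illustration of this graph formulation, the sketch below builds the two boolean masks such a graph implies: a vehicle-vehicle mask from spatial proximity and a lane-vehicle mask from lane permissibility. The distance threshold, array shapes, and function name are assumptions for illustration, not taken from the paper's code.

```python
import numpy as np

def build_interaction_masks(vehicle_xy, lane_permissible, neighbor_radius=30.0):
    """Build the two boolean masks implied by the interaction graph (sketch).

    vehicle_xy:       (N, 2) vehicle positions at one timestep
    lane_permissible: (N, M) True if lane node m is permissible for vehicle n
    neighbor_radius:  assumed proximity threshold for vehicle-vehicle edges
    """
    # Vehicle-vehicle edges: connect vehicles within a distance threshold.
    dist = np.linalg.norm(vehicle_xy[:, None, :] - vehicle_xy[None, :, :], axis=-1)
    vv_mask = dist < neighbor_radius
    np.fill_diagonal(vv_mask, False)      # no self-edges

    # Lane-vehicle edges: attention may only use permissible lane nodes.
    lv_mask = lane_permissible.astype(bool)
    return vv_mask, lv_mask

# Toy example: 3 vehicles, 4 lane nodes (values are illustrative).
vehicles = np.array([[0.0, 0.0], [10.0, 0.0], [200.0, 3.5]])
permissible = np.array([[1, 1, 0, 0],
                        [1, 1, 0, 0],
                        [0, 0, 1, 1]])
vv, lv = build_interaction_masks(vehicles, permissible)
print(vv)   # vehicle 2 is too far from vehicles 0 and 1 to be connected
print(lv)
```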

Related Works

Exploiting Map Information

Variational Autoencoders for Sequences

Variational autoencoders (VAEs) have been applied to sequential data in combination with recurrent neural networks (RNNs).
To bridge the gap between complex human behaviors and the structured environment, and to overcome the hurdles of temporal propagation in simplistic RNNs, we propose the use of a lane-conditioned Koopman operator to model the temporal relations in the latent space. We were specifically inspired to use the Koopman operator to propagate the latent space due to its capability to model the dynamics of complex, non-linear data, including fluid dynamics, battery properties, and control tasks.

Anomaly Detection

In this work, we explicitly model both vehicle-to-vehicle interactions and lane-to-vehicle interactions to boost performance, and use an interpretable variational
architecture to learn a continuous distribution over behaviors.

Methodology

Problem Formulation

Architecture

Vehicle-Vehicle Self-Attention Network

Our goal is to learn a representation of spatial interactions among vehicles.
We encode the positions of vehicles on the road at each time step with scaled dot-product multi-head self-attention, which allows each head to learn different features of the data.
A mask is applied inside the attention computation.
Pipeline: each vehicle's displacement $X_t$ is passed through an MLP to obtain $Q_t$; the displacements $R_t$ of vehicle $i$'s neighboring vehicles are passed through two different MLPs to obtain $K$ and $V$; the self-attention layer then produces the final encoding, i.e., a learned representation of spatial interactions among vehicles.
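A minimal PyTorch sketch of this masked vehicle-vehicle attention step; the dimensions, module names, and the use of `nn.MultiheadAttention` are illustrative assumptions (see the linked repository for the authors' actual implementation).

```python
import torch
import torch.nn as nn

class VehicleSelfAttention(nn.Module):
    """Masked multi-head attention over neighboring vehicles (sketch)."""
    def __init__(self, d_in=2, d_model=64, n_heads=4):
        super().__init__()
        self.q_mlp = nn.Sequential(nn.Linear(d_in, d_model), nn.ReLU(),
                                   nn.Linear(d_model, d_model))
        self.k_mlp = nn.Sequential(nn.Linear(d_in, d_model), nn.ReLU(),
                                   nn.Linear(d_model, d_model))
        self.v_mlp = nn.Sequential(nn.Linear(d_in, d_model), nn.ReLU(),
                                   nn.Linear(d_model, d_model))
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x_t, r_t, pad_mask):
        # x_t: (B, 1, 2) ego displacement; r_t: (B, N, 2) neighbor displacements
        # pad_mask: (B, N) True where a neighbor slot is absent / should be ignored
        q = self.q_mlp(x_t)              # query from the ego vehicle
        k = self.k_mlp(r_t)              # keys from neighbors
        v = self.v_mlp(r_t)              # values from neighbors
        enc, _ = self.attn(q, k, v, key_padding_mask=pad_mask)
        return enc.squeeze(1)            # (B, d_model) spatial interaction encoding

# Toy usage: batch of 2 scenes with 5 neighbor slots each.
m = VehicleSelfAttention()
x = torch.randn(2, 1, 2)
r = torch.randn(2, 5, 2)
mask = torch.tensor([[False, False, True, True, True],
                     [False, True, True, True, True]])
print(m(x, r, mask).shape)  # torch.Size([2, 64])
```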

Lane-Vehicle Attention Network

Impermissible lane nodes are masked out, so each vehicle attends only over the lane nodes it is allowed to use.
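A hedged sketch of the lane-vehicle attention, where a boolean mask removes impermissible lane nodes from the attention; again, shapes and layer choices are assumptions for illustration.

```python
import torch
import torch.nn as nn

class LaneVehicleAttention(nn.Module):
    """Attention from each vehicle to the lane nodes it may use (sketch)."""
    def __init__(self, d_vehicle=64, d_lane=2, d_model=64, n_heads=4):
        super().__init__()
        self.q_proj = nn.Linear(d_vehicle, d_model)   # query from vehicle encoding
        self.k_proj = nn.Linear(d_lane, d_model)      # keys from lane-node features
        self.v_proj = nn.Linear(d_lane, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, vehicle_enc, lane_nodes, impermissible_mask):
        # vehicle_enc:        (B, 1, d_vehicle) output of the vehicle self-attention
        # lane_nodes:         (B, M, d_lane)    lane-node features
        # impermissible_mask: (B, M) True where the lane node may NOT be used
        q = self.q_proj(vehicle_enc)
        k = self.k_proj(lane_nodes)
        v = self.v_proj(lane_nodes)
        out, weights = self.attn(q, k, v, key_padding_mask=impermissible_mask)
        return out.squeeze(1), weights   # lane-aware context and attention weights

lane_attn = LaneVehicleAttention()
ctx, w = lane_attn(torch.randn(2, 1, 64), torch.randn(2, 6, 2),
                   torch.zeros(2, 6, dtype=torch.bool))
print(ctx.shape, w.shape)   # torch.Size([2, 64]) torch.Size([2, 1, 6])
```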

Recurrent Encoder

GRU reference article: https://zhuanlan.zhihu.com/p/32481747
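A minimal sketch of what a GRU-based recurrent encoder could look like here, assuming a standard reparameterized VAE encoder that maps the per-timestep attention contexts to the parameters of a Gaussian latent state; dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class RecurrentEncoder(nn.Module):
    """GRU encoder mapping attention contexts to a Gaussian latent (sketch)."""
    def __init__(self, d_ctx=128, d_hidden=128, d_latent=32):
        super().__init__()
        self.gru = nn.GRU(d_ctx, d_hidden, batch_first=True)
        self.to_mu = nn.Linear(d_hidden, d_latent)
        self.to_logvar = nn.Linear(d_hidden, d_latent)

    def forward(self, ctx_seq):
        # ctx_seq: (B, T, d_ctx) per-timestep vehicle + lane attention contexts
        h_seq, _ = self.gru(ctx_seq)
        mu, logvar = self.to_mu(h_seq), self.to_logvar(h_seq)
        # Reparameterization trick: sample z_t ~ N(mu_t, sigma_t^2).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar

enc = RecurrentEncoder()
z, mu, logvar = enc(torch.randn(2, 10, 128))
print(z.shape)   # torch.Size([2, 10, 32])
```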

Latent Propagation with Koopman Operator

Koopman operator: propagates the latent distributions in time to predict the future states of vehicles.
The Koopman operator is responsible for temporal reasoning (modeling vehicle state dynamics), while the preceding attention modules take charge of spatial reasoning.
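One way to realize such a propagation is a learned linear (Koopman) matrix acting on the latent Gaussian parameters. The sketch below uses a single unconditioned matrix and a diagonal-covariance approximation, which is a simplification of the paper's lane-conditioned stochastic Koopman operator, not a reproduction of it.

```python
import torch
import torch.nn as nn

class KoopmanPropagator(nn.Module):
    """Learned linear operator advancing the latent distribution one step (sketch)."""
    def __init__(self, d_latent=32):
        super().__init__()
        # A single learned Koopman matrix K (the paper's operator is
        # lane-conditioned, which would amount to selecting or mixing
        # several such matrices based on the lane-attention context).
        self.K = nn.Parameter(torch.eye(d_latent) + 0.01 * torch.randn(d_latent, d_latent))

    def forward(self, mu_t, logvar_t):
        # Linear dynamics in latent space: z_{t+1} = K z_t, so the Gaussian
        # parameters propagate as mu' = K mu and Sigma' = K Sigma K^T
        # (kept diagonal here as an approximation).
        mu_next = mu_t @ self.K.T
        var_next = torch.exp(logvar_t) @ (self.K.T ** 2)
        return mu_next, torch.log(var_next + 1e-8)

prop = KoopmanPropagator()
mu1, logvar1 = prop(torch.randn(2, 32), torch.zeros(2, 32))
print(mu1.shape, logvar1.shape)   # torch.Size([2, 32]) torch.Size([2, 32])
```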

The Decoder Network


Training and Evaluation

End-to-End Training

Training objective: minimize the current reconstruction loss and the one-step future prediction loss of the model.
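A sketch of this objective using mean-squared errors for both terms; the KL regularizer and its weight `beta` are the standard VAE addition and are assumptions here, not a quote of the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def training_loss(recon_xt, xt, pred_xt1, xt1, mu, logvar, beta=1.0):
    """Reconstruction of the current state plus one-step prediction of the
    next state, with an assumed KL regularizer on the latent Gaussian."""
    recon_loss = F.mse_loss(recon_xt, xt)     # reconstruct current states
    pred_loss = F.mse_loss(pred_xt1, xt1)     # predict next states
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + pred_loss + beta * kl
```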

Anomaly Detection Evaluation

EXPERIMENTAL SETUP AND RESULTS

MAAD Dataset and Augmentation

Baseline Methods

Latent Space Interpretation

SABeR-VAE is a variational model with a continuous latent space such that observations with similar learned characteristics are clustered closer together in the latent space.
