Multi-Scale Detection of Anomalous Spatio-Temporal Trajectories in Evolving Trajectory Datasets
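Yes, your understanding of the model is essentially correct. Let me help clarify the details and how the pieces relate to each other.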
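1. Overview of the embedding process
- Spatial embedding:
  - A graph convolutional network (GCN) embeds the spatial positions of the trajectory, producing spatial embeddings at three scales.
- Temporal embedding:
  - A Doc2Vec-like method embeds the time information, producing temporal embeddings at three scales.
- Cross-attention:
  - The spatial and temporal embeddings are combined so the model learns the interaction between space and time, yielding a joint spatio-temporal embedding.
- Multi-scale aggregation:
  - The embeddings from the different scales are aggregated into the final trajectory embedding, which preserves the trajectory's multi-scale features.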
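2. Anomaly detection process
- Modeling the distribution of trajectory embeddings (GMM):
  - A Gaussian mixture model (GMM) models the spatial and temporal distributions of the trajectory embeddings. Suppose the model learns 3 Gaussian components, each representing one normal trajectory pattern.
  - The trajectory embedding has its own set of Gaussian components in each of the spatial and temporal dimensions.
- Likelihood calculation:
  - For an input trajectory, compute its generation probability under each Gaussian component, separately for the spatial and temporal dimensions.
- Max likelihood selection:
  - In each of the spatial and temporal dimensions, select the Gaussian component with the highest generation probability; it indicates which normal pattern the trajectory matches best.
- Anomaly score calculation:
  - Multiply the spatial and temporal maximum generation probabilities to measure how well the trajectory as a whole matches a normal pattern.
  - The anomaly score is:
    \( \text{Score}(T) = 1 - \left( \max P_s \times \max P_t \right) \)
  - Here $ P_s $ and $ P_t $ are the per-cluster generation probabilities in the spatial and temporal dimensions, so $ \max P_s $ and $ \max P_t $ are the largest values across clusters.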
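3. Worked example
Suppose the input trajectory is:
- Spatial trajectory points: [1, 3, 5]
- Timestamps: [10, 20, 30]
The multi-scale embedding produces the following representations:
| Scale | Spatial embedding $ S $ | Temporal embedding $ T $ |
|---|---|---|
| 1 | [0.1, 0.2] | [0.05, 0.1] |
| 2 | [0.3, 0.4] | [0.1, 0.15] |
| 3 | [0.5, 0.6] | [0.15, 0.2] |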
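Generation probabilities of the trajectory under the 3 Gaussian components:
| Cluster | Spatial probability $ P_s $ | Temporal probability $ P_t $ |
|---|---|---|
| Cluster 1 | 0.75 | 0.80 |
| Cluster 2 | 0.60 | 0.70 |
| Cluster 3 | 0.90 | 0.85 |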
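Anomaly score calculation:
- Maximum probability in the spatial dimension:
  \( \max P_s = 0.90 \quad (\text{Cluster 3}) \)
- Maximum probability in the temporal dimension:
  \( \max P_t = 0.85 \quad (\text{Cluster 3}) \)
- Anomaly score of the trajectory:
  \( \text{Score}(T) = 1 - (0.90 \times 0.85) = 1 - 0.765 = 0.235 \)
The final anomaly score is 0.235, which indicates that the trajectory matches the Cluster 3 pattern well and would not be considered anomalous.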
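The arithmetic in this example can be reproduced with a few lines of plain Python. This is only a sketch of the simplified score formula used in this explanation; the function name `anomaly_score` is made up for illustration and is not part of the MST-OATD code.

```python
def anomaly_score(p_spatial, p_temporal):
    """Return 1 - max(P_s) * max(P_t) for per-cluster generation probabilities."""
    if not p_spatial or not p_temporal:
        raise ValueError("need at least one cluster probability per dimension")
    return 1.0 - max(p_spatial) * max(p_temporal)


# Per-cluster probabilities from the table above (clusters 1, 2, 3).
p_s = [0.75, 0.60, 0.90]
p_t = [0.80, 0.70, 0.85]

score = anomaly_score(p_s, p_t)
best_s = p_s.index(max(p_s)) + 1  # 1-based index of the best spatial cluster
best_t = p_t.index(max(p_t)) + 1  # 1-based index of the best temporal cluster
print(f"best spatial cluster={best_s}, best temporal cluster={best_t}, score={score:.3f}")
```

Note that the spatial and temporal maxima are picked independently, so in general they need not come from the same cluster; in this example both happen to be cluster 3.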
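4. Further interpretation:
- If the trajectory's generation probability is low under every cluster, the anomaly score approaches 1, meaning the trajectory is highly anomalous.
- If the trajectory matches the pattern of some cluster, its generation probability is high and the anomaly score approaches 0, meaning the trajectory follows a normal pattern.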
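5. Why take the maximum generation probability?
- A real trajectory may belong to any one of several normal patterns; the cluster with the highest generation probability represents the trajectory's most likely normal pattern.
- Using the mean or the minimum probability instead could flag a trajectory as anomalous even though it actually matches one of the patterns well.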
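To make the difference concrete, the sketch below scores one made-up trajectory with three different reductions over the clusters. The probabilities are hypothetical and chosen so that the trajectory fits cluster 1 very well and the other clusters poorly; `score_with` is an illustrative helper, not a function from the model's code.

```python
from statistics import mean


def score_with(aggregate, p_spatial, p_temporal):
    """Anomaly score when per-cluster probabilities are reduced by `aggregate`."""
    return 1.0 - aggregate(p_spatial) * aggregate(p_temporal)


# Hypothetical trajectory: an excellent fit to cluster 1, a poor fit to the others.
p_s = [0.95, 0.10, 0.05]
p_t = [0.90, 0.15, 0.10]

scores = {
    name: score_with(fn, p_s, p_t)
    for name, fn in [("max", max), ("mean", mean), ("min", min)]
}
for name, value in scores.items():
    print(f"{name:>4}: {value:.3f}")
```

Only the `max` reduction gives a low score for this clearly normal trajectory; the mean and minimum are dragged toward 1 by the clusters it does not belong to.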
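6. What you understood correctly:
- Multi-scale spatial and temporal embeddings: ✅
- Cross-attention to combine space and time: ✅
- Gaussian distributions to model trajectory patterns: ✅
- Multiplying the maximum probabilities and computing the anomaly score: ✅
If you have more detailed questions, or want to dig into the code logic and the actual implementation, we can go through each function step by step!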
Code implementation
From analyzing the code, here's how trajectories are processed and anomaly scores calculated:
- Trajectory embedding process in mst_oatd.py:

    # Initial spatial embedding through graph convolution
    H = D.mm(A).mm(self.V).mm(D)  # Normalize adjacency matrix
    nodes = H.mm(self.embedding(self.nodes))
    s_inputs = torch.index_select(nodes, 0, trajs.flatten())

    # Temporal embedding
    t_inputs = self.d2v(times)

    # Combine via cross-attention
    att_s, att_t = self.co_attention(s_inputs, t_inputs)
    st_inputs = torch.concat((att_s, att_t), dim=2)

    # Multi-scale processing: point level, then segments of size s1 and s2.
    # pack_padded_sequence takes the sequence lengths as its second argument;
    # batch_first/enforce_sorted are assumed here (batch-first, unsorted batches).
    encoder_inputs_s1 = pack_padded_sequence(self.attention_layer(st_inputs, lengths), lengths,
                                             batch_first=True, enforce_sorted=False)
    encoder_inputs_s2 = self.scale_process(st_inputs, self.s1_size, [int(i // self.s1_size) for i in lengths])
    encoder_inputs_s3 = self.scale_process(st_inputs, self.s2_size, [int(i // self.s2_size) for i in lengths])

    # Combine the final encoder states of the three scales with learned weights
    encoder_final_state = (self.W1 * encoder_final_state_s1 +
                           self.W2 * encoder_final_state_s2 +
                           self.W3 * encoder_final_state_s3)
- Anomaly score calculation in mst_oatd_trainer.py:

    def detection(self):
        # Compute likelihood across all clusters
        for batch in self.outliers_loader:
            trajs, times, seq_lengths = batch
            # batch_size, mask, and times_token are not defined in this excerpt;
            # they are presumably derived from the batch before this point.
            c_likelihood_s = []
            c_likelihood_t = []
            # Get likelihood for each cluster
            for c in range(self.n_cluster):
                output_s, _, _, _ = self.MST_OATD_S(trajs, times, seq_lengths, batch_size, "test", c)
                likelihood_s = -self.detec(output_s.reshape(-1, output_s.shape[-1]),
                                           trajs.to(self.device).reshape(-1))
                # Length-normalized likelihood: exp(mean log-likelihood over unmasked points)
                likelihood_s = torch.exp(
                    torch.sum(mask * likelihood_s.reshape(batch_size, -1), dim=-1) / torch.sum(mask, 1))
                output_t, _, _, _ = self.MST_OATD_T(trajs, times, seq_lengths, batch_size, "test", c)
                likelihood_t = -self.detec(output_t.reshape(-1, output_t.shape[-1]),
                                           times_token.to(self.device).reshape(-1))
                likelihood_t = torch.exp(
                    torch.sum(mask * likelihood_t.reshape(batch_size, -1), dim=-1) / torch.sum(mask, 1))
                c_likelihood_s.append(likelihood_s)
                c_likelihood_t.append(likelihood_t)
            # Stack to (n_cluster, batch_size) so that max(0) picks the best cluster per trajectory
            likelihood_s = torch.stack(c_likelihood_s).max(0)[0]
            likelihood_t = torch.stack(c_likelihood_t).max(0)[0]
            # Final anomaly score combines spatial and temporal likelihoods
            score = 1 - likelihood_s * likelihood_t
The code shows that anomaly scores are computed using:
- Maximum likelihood across all clusters
- Product of spatial and temporal likelihoods
- Normalized by trajectory length via the mask
- Negative cross-entropy loss as the base likelihood measure
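The mask-based normalization amounts to taking the exponential of the average per-point log-likelihood, which is the geometric mean of the per-point probabilities. The sketch below shows this for a single trajectory in plain Python, without PyTorch; the function name and the toy inputs are assumptions made for illustration.

```python
import math


def normalized_likelihood(token_log_probs, mask):
    """exp(mean log-probability over the unpadded positions of one trajectory)."""
    valid = sum(mask)
    if valid == 0:
        raise ValueError("trajectory has no valid points")
    total = sum(lp * m for lp, m in zip(token_log_probs, mask))
    return math.exp(total / valid)


# Two trajectories with the same per-point probability (0.8) but different lengths,
# both padded to length 5; mask is 1 for real points and 0 for padding.
lp = math.log(0.8)
short_traj = normalized_likelihood([lp, lp, 0.0, 0.0, 0.0], [1, 1, 0, 0, 0])
long_traj = normalized_likelihood([lp, lp, lp, lp, lp], [1, 1, 1, 1, 1])
print(short_traj, long_traj)
```

Both trajectories get approximately the same likelihood, whereas an unnormalized product of per-point probabilities would penalize the longer one simply for being longer.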
Formulas
Let me explain how MST-OATD handles trajectory embedding at different scales and anomaly score calculation:
- Trajectory Embedding at Different Scales:
The model embeds trajectories at multiple scales through these steps:
a) Initial Embedding:
- For spatial points: Maps locations to grid cells and generates spatial embeddings using graph convolution
- For temporal points: Uses a neural network with periodic activation to capture temporal patterns
- These are combined using cross-attention to get spatio-temporal embeddings \( e = \langle e_1, e_2, \ldots, e_n \rangle \)
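As an illustration of what a periodic activation buys for time features, here is a minimal Time2Vec-style sketch: one linear component plus sinusoidal components. This is a swapped-in, well-known technique used only for illustration, not the model's actual `d2v` module; the frequencies and phases below are hypothetical fixed values, whereas a trained model would learn them.

```python
import math


def periodic_time_embedding(t, freqs, phases):
    """Embed scalar time t: first component linear, the rest sinusoidal."""
    if len(freqs) != len(phases) or not freqs:
        raise ValueError("freqs and phases must be non-empty and equally long")
    linear = freqs[0] * t + phases[0]
    periodic = [math.sin(w * t + b) for w, b in zip(freqs[1:], phases[1:])]
    return [linear] + periodic


# Hypothetical fixed parameters: a daily and an hourly cycle, time in seconds.
DAY = 86400.0
freqs = [1.0 / DAY, 2 * math.pi / DAY, 2 * math.pi / 3600.0]
phases = [0.0, 0.0, 0.0]

noon = periodic_time_embedding(12 * 3600, freqs, phases)
next_noon = periodic_time_embedding(12 * 3600 + DAY, freqs, phases)
print(noon, next_noon)
```

The sinusoidal components are (numerically) identical at the same time on consecutive days, while the linear component still distinguishes the two days.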
b) Multi-scale Fusion:
- Takes the combined embedding sequence e and generates embeddings at different scales k using:
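\( e^k = \langle \hat{e}^k_1, \hat{e}^k_2, \ldots, \hat{e}^k_{\lceil n/s_k \rceil} \rangle \)
where:
- \( s_k \) is the segment size for scale \( k \)
- \( \hat{e}^k_i \) is the mean embedding of points in segment \( i \) at scale \( k \)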
- The model uses 3 scales with sizes 1, 2, and 4
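The segment averaging can be sketched in plain Python as follows. The helper name `segment_means` and the toy 2-dimensional embeddings are assumptions for illustration; the sketch follows the ceiling in the formula, so a trailing partial segment is kept and averaged over the points it has.

```python
def segment_means(embeddings, segment_size):
    """Average consecutive groups of `segment_size` vectors; the last group may be shorter."""
    if segment_size < 1:
        raise ValueError("segment_size must be >= 1")
    pooled = []
    for start in range(0, len(embeddings), segment_size):
        segment = embeddings[start:start + segment_size]
        dim = len(segment[0])
        pooled.append([sum(vec[d] for vec in segment) / len(segment) for d in range(dim)])
    return pooled


# Hypothetical 2-d embeddings for a trajectory of n = 5 points.
e = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0], [8.0, 9.0]]
multi_scale = {s: segment_means(e, s) for s in (1, 2, 4)}
for s, seq in multi_scale.items():
    print(s, len(seq), seq)
```

Note that the code excerpt earlier computes segment counts with floor division (`i // self.s1_size`), which would drop a trailing partial segment, so the real implementation may treat the tail differently from this sketch.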
c) Attention Between Segments:
- Applies attention between segments at each scale to capture relationships
- Uses the formula:
\( \tilde{e}^k_i = \sum_j \mathrm{Softmax}(\alpha_{i,j})\, \hat{e}^k_j \)
where \( \alpha_{i,j} \) captures the relationship between segments \( i \) and \( j \)
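A minimal sketch of this softmax-weighted sum over segments is given below. The text does not say how the relationship score between two segments is computed, so the sketch assumes a scaled dot product; that choice, and the function names, are illustrative assumptions rather than the model's actual attention layer.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [x / total for x in exps]


def attend(segments):
    """Replace each segment vector by a softmax-weighted sum of all segment vectors."""
    dim = len(segments[0])
    out = []
    for q in segments:
        # Assumption: alpha_{i,j} is a scaled dot product between segments i and j.
        alphas = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in segments]
        weights = softmax(alphas)
        out.append([sum(w * k[d] for w, k in zip(weights, segments)) for d in range(dim)])
    return out


segments = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = attend(segments)
for row in attended:
    print([round(v, 3) for v in row])
```

Because the weights for each segment sum to 1, every output vector is a convex combination of the input segment embeddings.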
- Anomaly Score Calculation:
The anomaly score is NOT just based on reconstruction loss. Instead, it uses a detect-by-generation strategy:
The score for a trajectory T is calculated as:
\( \text{Score}(T) = 1 - \max_{c^{(s)},\, c^{(t)}} \exp\left[ \frac{\log \left( p_\gamma\left(T^{(s)} \mid \mu^{(s)}_{c^{(s)}}\right) p_\gamma\left(T^{(t)} \mid \mu^{(t)}_{c^{(t)}}\right) \right)}{n} \right] \)
Where:
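- \( T^{(s)} \) and \( T^{(t)} \) are the spatial and temporal aspects of the trajectory
- \( c^{(s)} \) and \( c^{(t)} \) are the spatial and temporal route types
- \( \mu^{(s)}_{c^{(s)}} \) and \( \mu^{(t)}_{c^{(t)}} \) are the means of the Gaussian distributions for each type
- \( n \) is trajectory length for normalization
- \( p_\gamma \) represents the generation probability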
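The joint maximization over the two route types can be related to the code, which takes the spatial and temporal maxima separately and then multiplies them. The derivation below is a sketch under two assumptions: the logarithm applies to the product of the two probabilities, and the spatial term depends only on \( c^{(s)} \) while the temporal term depends only on \( c^{(t)} \). The shorthands \( p_s \) and \( p_t \) are introduced here for readability.

```latex
% Shorthand (introduced for this sketch):
%   p_s = p_\gamma(T^{(s)} \mid \mu^{(s)}_{c^{(s)}}), \quad
%   p_t = p_\gamma(T^{(t)} \mid \mu^{(t)}_{c^{(t)}})
\begin{aligned}
\max_{c^{(s)},\, c^{(t)}} \exp\left[ \frac{\log (p_s\, p_t)}{n} \right]
  &= \max_{c^{(s)},\, c^{(t)}} \exp\left[ \frac{\log p_s}{n} \right] \exp\left[ \frac{\log p_t}{n} \right] \\
  &= \left( \max_{c^{(s)}} p_s^{1/n} \right) \left( \max_{c^{(t)}} p_t^{1/n} \right)
\end{aligned}
```

The last step holds because both factors are non-negative and each depends on only one of the two indices, so the two length-normalized maxima can be computed independently and multiplied.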
The key points about the scoring:
- It considers both spatial and temporal aspects
- Uses Gaussian mixture models to represent different route types
- Calculates generation probability rather than just reconstruction loss
- Normalizes by trajectory length
- Takes the maximum likelihood across all route types
This is more sophisticated than simple reconstruction loss because it:
- Models multiple normal route patterns through Gaussian mixtures
- Considers both spatial and temporal anomalies
- Can handle partial trajectories for online detection
- Normalizes scores to be comparable across trajectories of different lengths
Would you like me to explain any particular aspect in more detail?
