Typical Models of RNN and TFF
RNN
LSTM (1997)
Recurrent Neural Networks
Hidden State: \(h_t\)
- \(h_t = \tanh(U h_{t-1} + W x_t + b)\)
- \(y_t = V h_t\)
- \(h\): hidden state, carrying the history of the sequence
- \(\tanh\): activation function; the logistic (sigmoid) function is sometimes used instead
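A minimal sketch of one vanilla-RNN step with NumPy (the dimensions below are illustrative assumptions, not from the source):

```python
import numpy as np

def rnn_step(x_t, h_prev, U, W, V, b):
    """One vanilla RNN step: h_t = tanh(U h_{t-1} + W x_t + b), y_t = V h_t."""
    h_t = np.tanh(U @ h_prev + W @ x_t + b)
    y_t = V @ h_t
    return h_t, y_t

# illustrative dimensions (assumptions): input dim 3, hidden dim 4, output dim 2
rng = np.random.default_rng(0)
U, W, V = rng.normal(size=(4, 4)), rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
b, h = np.zeros(4), np.zeros(4)
for x in rng.normal(size=(5, 3)):   # unroll over 5 time steps
    h, y = rnn_step(x, h, U, W, V, b)
```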
Long Short Term Memory networks
Cell State: \(C_t\)
Hidden State: \(h_t\)
4 gates / intermediate values (see the update equations after this list)
1. Forget gate: \(f_t\)
2. Input gate: \(i_t\)
3. Candidate Values: \(\widetilde{C_t}\)
4. Output gate: \(o_t\)
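For reference, the standard LSTM update equations (the widely used formulation with a forget gate) are:
- \(f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)\)
- \(i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)\)
- \(\widetilde{C_t} = \tanh(W_C [h_{t-1}, x_t] + b_C)\)
- \(C_t = f_t \odot C_{t-1} + i_t \odot \widetilde{C_t}\)
- \(o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)\)
- \(h_t = o_t \odot \tanh(C_t)\)
where \(\sigma\) is the logistic function and \(\odot\) is element-wise multiplication.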
GRU(2014)
Gated Recurrent Units
good at capturing short-term dependencies
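One common GRU parameterization (the convention used, e.g., in PyTorch; papers differ on which term \(z_t\) gates):
- reset gate: \(r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)\)
- update gate: \(z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)\)
- candidate: \(\widetilde{h_t} = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)\)
- \(h_t = (1 - z_t) \odot \widetilde{h_t} + z_t \odot h_{t-1}\)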
FC-LSTM (2014)
Conv-LSTM (2015)
GNN
Manifold
Attention
A self-attention layer does better at handling long-term dependencies
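A minimal scaled dot-product self-attention sketch in NumPy (single head; the projection matrices are illustrative assumptions):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape [T, d]."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # [T, T] pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # each step attends to all steps

# every output position can attend to any input position, so long-range
# dependencies are reachable in one layer (unlike an RNN's step-by-step path)
T, d = 12, 8
rng = np.random.default_rng(0)
X = rng.normal(size=(T, d))
out = self_attention(X, *(rng.normal(size=(d, d)) for _ in range(3)))
```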
New ST Model
DSTF
Decoupled Spatial-Temporal Framework (DSTF)
- separates the diffusion and inherent traffic information in a data-driven manner
- encompasses a unique estimation gate and a residual decomposition mechanism
Decoupled Dynamic Spatial-Temporal Graph Neural Network
\(D^2STGNN\)
- captures spatial-temporal correlations
- features a dynamic graph learning module
the complex spatial-temporal correlations
- each traffic signal (i.e., time series) naturally contains two different types of signals:
- diffusion signals: generated by vehicles that diffuse in from other sensors
- non-diffusion signals (also called inherent signals for simplicity): generated by vehicles that are independent of other sensors
THE DECOUPLED FRAMEWORK
two hidden signals
\(\mathcal{X} = \mathcal{X}^{dif} + \mathcal{X}^{inh}\)
the decouple block, consisting of
- a residual decomposition mechanism
- an estimation gate
to decompose the spatial-temporal signals in a data-driven manner
Residual Decomposition Mechanism
Estimation Gate
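A rough sketch of how such a decouple block could be wired up, assuming hypothetical `estimation_gate` and `diffusion_model` modules (the interfaces below are illustrative placeholders, not the paper's exact implementation):

```python
import torch

def decouple_block(X, estimation_gate, diffusion_model):
    """Hypothetical sketch: split a traffic signal X into diffusion and inherent parts.

    X: [batch, T, N, C] historical traffic signal.
    estimation_gate: module estimating, per entry, the proportion of X that is diffusion signal.
    diffusion_model: module whose backcast branch reconstructs the historical diffusion signal.
    """
    gate = torch.sigmoid(estimation_gate(X))      # estimated diffusion proportion in [0, 1]
    X_dif_est = gate * X                          # rough diffusion signal fed to the diffusion model
    forecast_dif, backcast_dif = diffusion_model(X_dif_est)
    X_inh = X - backcast_dif                      # residual decomposition: X = X^dif + X^inh
    return forecast_dif, X_inh                    # inherent part goes on to the inherent model
```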
5. DECOUPLED DYNAMIC ST-GNN
5.1 Diffusion Model: Spatial-Temporal Localized Convolutional Layer
- Forecast Branch
- auto-regression (predicts the future diffusion signal)
- Backcast Branch
- non-linear fully connected networks (reconstruct the historical diffusion signal, which is removed by the residual decomposition)
5.2 Inherent Model: Local and Global Dependency
We utilize GRU [7] and a multi-head self-attention layer [35] jointly to capture temporal patterns comprehensively.
- GRU: capturing short-term dependencies
- Multihead Self-Attention layer: handling long-term dependencies
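A hedged sketch of this temporal part using standard PyTorch modules (layer sizes and the per-node reshaping are assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class InherentTemporalBlock(nn.Module):
    """Sketch: GRU for short-term patterns + multi-head self-attention for long-term patterns."""
    def __init__(self, dim: int = 32, heads: int = 4):
        super().__init__()
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):            # x: [batch * num_nodes, T, dim], one series per sensor
        h, _ = self.gru(x)           # short-term, step-by-step dependencies
        out, _ = self.attn(h, h, h)  # long-term, all-pairs dependencies across time steps
        return out

# usage: 2 samples, 3 nodes, 12 time steps, hidden dim 32
x = torch.randn(2 * 3, 12, 32)
y = InherentTemporalBlock()(x)       # -> [6, 12, 32]
```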
5.3 Dynamic Graph Learning
TFF
INTRODUCTION
Traffic forecasting is a crucial service in Intelligent Transportation Systems (ITS) that predicts future traffic conditions (e.g., traffic flow) based on historical traffic conditions observed by sensors.
- Many early studies formulate the problem as a simple time-series forecasting task; they rely heavily on stationarity-related assumptions.
- Auto-Regressive Integrated Moving Average (ARIMA [38])
- Kalman filtering
- Recently, deep learning-based approaches capture the complex spatial-temporal correlations in traffic flow.
They construct an adjacency matrix to model the complex spatial topology of a road network and formulate the traffic data as a spatial-temporal graph.
- STGNN: models the dynamics of the traffic flow as a diffusion process, combining
- diffusion graph convolution
- sequential models
the spatial dependency
the temporal dependency
RELATED WORK
Temporal dependency
- Sequential models
- GRU
- LSTM
- TCN
- Attention Mechanism
Spatial dependency
- Convolution models
- Diffusion models
- Diffusion Convolution
- DCRNN
- Graph WaveNet
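For context, the diffusion convolution that DCRNN builds on models traffic as a K-step bidirectional random walk over the graph, roughly:
\(X_{:,p} \star_{\mathcal{G}} f_\theta = \sum_{k=0}^{K-1} \left( \theta_{k,1} (D_O^{-1} A)^k + \theta_{k,2} (D_I^{-1} A^\top)^k \right) X_{:,p}\)
where \(D_O\) and \(D_I\) are the out-degree and in-degree matrices, \(K\) is the number of diffusion steps, and \(\theta\) are the filter parameters.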
PRELIMINARIES
Traffic Network
Graph
\(G = (V,E)\)
- V: |V| = N nodes
- E: |E| = M edges
- A: \(A \in \R^{N\times N}\), the adjacency matrix
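One common way to build \(A\) from pairwise sensor distances is a thresholded Gaussian kernel (used, e.g., by DCRNN); the bandwidth and threshold below are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel_adjacency(dist, sigma, eps=0.1):
    """A_ij = exp(-dist_ij^2 / sigma^2) if above the threshold eps, else 0."""
    A = np.exp(-(dist ** 2) / (sigma ** 2))
    A[A < eps] = 0.0
    return A

# dist[i, j]: road-network distance between sensors i and j (illustrative values)
dist = np.array([[0.0, 1.2, 5.0],
                 [1.2, 0.0, 2.5],
                 [5.0, 2.5, 0.0]])
A = gaussian_kernel_adjacency(dist, sigma=dist.std())   # A in R^{N x N}
```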
Traffic Signal
\(X_t \in \R^{N\times C}\)
Traffic Forecasting
- historical traffic signals \(X = [X_{t-T_h+1}, \cdots, X_{t-1}, X_t] \in \R^{T_h \times N \times C}\)
- future traffic signals \(Y = [X_{t+1}, X_{t+2}, \cdots, X_{t+T_f}]\)
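The task is to learn a mapping \(f\) from the past \(T_h\) signals (together with the graph \(G\)) to the next \(T_f\) signals:
\([X_{t-T_h+1}, \cdots, X_t]; G \xrightarrow{f} [X_{t+1}, \cdots, X_{t+T_f}]\)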
EXPERIMENTS
Baselines
- HA: Historical Average model, which models traffic flows as a periodic process and uses weighted averages from previous periods as predictions for future periods.
- VAR: Vector Auto-Regression [22, 23] assumes that the past time series are stationary and estimates the relationships between the time series and their lagged values [37].
- SVR: Support Vector Regression (SVR) uses a linear support vector machine for the classical time-series regression task.
- FC-LSTM [32]: Long Short-Term Memory network with fully connected hidden units is a well-known network architecture that is powerful in capturing sequential dependency. (2014)
- DCRNN [21]: Diffusion Convolutional Recurrent Neural Network [21] models the traffic flow as a diffusion process. It replaces the fully connected layers in GRU [7] with diffusion convolutional layers to form a new Diffusion Convolutional Gated Recurrent Unit (DCGRU). (2018)
- Graph WaveNet [41]: Graph WaveNet stacks Gated TCN and GCN layer by layer to jointly capture the spatial and temporal dependencies. (2019)
- ASTGCN [11]: ASTGCN combines the spatial-temporal attention mechanism to capture the dynamic spatial-temporal characteristics of traffic data simultaneously. (2019)
- STSGCN [31]: STSGCN is proposed to effectively capture the localized spatial-temporal correlations and consider the heterogeneity in spatial-temporal data. (2020)
- GMAN [51]: GMAN is an attention-based model which stacks spatial, temporal and transform attentions. (2020)
- MTGNN [40]: MTGNN extends Graph WaveNet through the mix-hop propagation layer in the spatial module, the dilated inception layer in the temporal module, and a more delicate graph learning layer. (2020)
- DGCRN [20]: DGCRN models the dynamic graph and designs a novel Dynamic Graph Convolutional Recurrent Module (DGCRM) to capture the spatial-temporal patterns in a seq2seq architecture. (2021)