Pytorch中的RNN、RNNCell、LSTM、LSTMCell、GRU、GRUCell的用法

首先,当然,官方文档都有

RNN: https://pytorch.org/docs/stable/generated/torch.nn.RNN.html

RNNCell: https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html

LSTM: https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html

LSTMCell: https://pytorch.org/docs/stable/generated/torch.nn.LSTMCell.html

GRU: https://pytorch.org/docs/stable/generated/torch.nn.GRU.html

GRUCell: https://pytorch.org/docs/stable/generated/torch.nn.GRUCell.html

这里,只是自己做下笔记

以LSTM和LSTMCell为例

LSTM的结构

 

 

 

 

LSTM the dim of definition input output weights 

LSTM parameters:

  • input_size: input x 的 features
  • hidden_size: hidden state h 的 features
  • num_layers: 层数,默认为1
  • batch_first: if True,是(batch, seq, feature),否则是(seq, batch, feature),默认是False
  • bidirectional: 默认为False

input:

  • input: 当batch_first=False, tensor为(L, N, H_i) ,否则为 (N, L, H_i)
  • h_0: tensor of shape (D*num_layers, N, H_out),默认为zeros,如果(h_0, c_0) not provided
  • c_0: tensor of shape (D*num_layers, n, H_cell),默认为zeros,如果(h_0, c_0) not provided

where:

N = batch size

L = sequence length

D = 2 if bidirectional=True otherwise 1

H_in = input_size

H_cell = hidden_size

H_out = proj_size if proj_size>0 otherwise hidden_size,通常就是hidden_size咯

Output:

  • output: (L, N, D*H_out) when batch_first=False,是一个长度为L的序列,[h_1[-1], h_2[-1], ..., h_L[-1]],就是最后一层的hidden states
  • h_n: tensor of shape (D*num_layers, N, H_out)
  • c_n: tensor of shape (D*num_layers, N, H_cell)

Variables:

好像新版的有改动

  • all_weights

Examples:

>>> rnn = nn.LSTM(10, 20, 2)   # (input_size, hidden_size, num_layers)
>>> input = torch.randn(5, 3, 10) # (time_steps, batch, input_size) 
>>> h0 = torch.randn(2, 3, 20)  # (num_layers, batch_size, hidden_size)
>>> c0 = torch.randn(2, 3, 20)
>>> output, (hn, cn) = rnn(input, (h0, c0))  # (time_steps, batch, hidden_size)
# output[-1] = h0[-1]

 

LSTM Cell

就是LSTM的一个单元,许多个LSTM Cell组成一个LSTM

结构

 

相比LSTM,少了参数t

Parameters:

  • 只有input_size 和 hidden_size,没有了 num_layers

Inputs:

  • input: (batch, input_size)
  • h_0: (batch, hidden_size)
  • c_0: (batch, hidden_size)

Outputs:

  • h_1: (batch, hidden_size)
  • c_1: (batch, hidden_size)

Variables:

  • weight_ih: input-hidden weights, of shape (4*hidden_size, input_size),因为是左乘W*input,且有4个W,所以是4*hidden_size
  • weight_hh: hidden-hidden weights, of shape (4*hidden_size, hidden_size)
  • bias_ih: input-hidden bias, of shape (4*hidden_size)
  • bias_hh: hidden-hidden bias, of shape (4*hidden_size)

Example:

>>> rnn = nn.LSTMCell(10, 20) # (input_size, hidden_size)
>>> input = torch.randn(2, 3, 10) # (time_steps, batch, input_size)
>>> hx = torch.randn(3, 20) # (batch, hidden_size)
>>> cx = torch.randn(3, 20)
>>> output = []
>>> for i in range(2):
        hx, cx = rnn(input[i], (hx, cx))
        output.append(hx)
>>> output = torch.stack(output, dim=0)

 

posted @ 2021-07-23 15:11  Rogn  阅读(1094)  评论(0编辑  收藏  举报