Attention Mechanisms: Attention Cues (Exercises)

1. What can be the volitional cue when decoding a sequence token by token in machine translation? What are the nonvolitional cues and the sensory inputs?

To be added.

2. Randomly generate a 10×10 matrix and use the softmax operation to ensure each row is a valid probability distribution. Visualize the output attention weights.

import torch
from d2l import torch as d2l

# The question asks for a 10×10 matrix (not 10×11), and softmax must be
# taken over dim=1 so that each *row* sums to 1.
data = torch.rand(10, 10)
data = torch.nn.functional.softmax(data, dim=1)
# show_heatmaps expects a 4-D tensor of shape
# (number of rows for display, number of columns for display, queries, keys).
d2l.show_heatmaps(data.reshape(1, 1, data.shape[0], data.shape[1]),
                  xlabel='Keys', ylabel='Queries')
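As a quick sanity check (a minimal sketch, independent of the plotting helper), each row of the softmax output should sum to 1:

```python
import torch

# Apply softmax over dim=1 so every row of the 10x10 matrix
# is a valid probability distribution.
data = torch.nn.functional.softmax(torch.rand(10, 10), dim=1)
row_sums = data.sum(dim=1)
print(torch.allclose(row_sums, torch.ones(10)))  # → True
```

Had the softmax been taken over dim=0 instead, the *columns* would sum to 1, which is not what the question asks for.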

posted @ 2021-05-27 17:26  哈哈哈喽喽喽