Summary:
When training on GPU, the error "Model diverged with loss = NaN" is often caused by a softmax that receives a symbol ID larger than vocab_size.
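One way to catch this before training starts is to scan the input IDs for values that fall outside the vocabulary. The following is a minimal sketch, not the author's code: the helper name `find_out_of_range_ids` and the sample batch are hypothetical, and it assumes valid IDs run from 0 to vocab_size - 1.

```python
def find_out_of_range_ids(token_ids, vocab_size):
    """Return every ID that is not a valid index in [0, vocab_size).

    Assumption: the softmax has vocab_size outputs, indexed from 0.
    """
    return [t for t in token_ids if not 0 <= t < vocab_size]


# Hypothetical batch and vocabulary size, for illustration only.
vocab_size = 5
batch = [0, 3, 4, 5, 7]

bad_ids = find_out_of_range_ids(batch, vocab_size)
print(bad_ids)  # [5, 7]
```

If the returned list is not empty, the data pipeline or the configured vocab_size is a likely place to look before suspecting the learning rate.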
Summary:
>>> from collections import Counter
>>> Counter(['apple','red','apple','red','red','pear'])
Counter({'red': 3, 'apple': 2, 'pear': 1})
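As a small extension of the excerpt above, the same Counter can be asked for its most frequent items, and it returns 0 for items it has never seen. This sketch uses only the standard library; the variable names are illustrative.

```python
from collections import Counter

counts = Counter(['apple', 'red', 'apple', 'red', 'red', 'pear'])

# most_common(n) returns the n highest counts as (item, count) pairs.
top_two = counts.most_common(2)
print(top_two)  # [('red', 3), ('apple', 2)]

# A missing key yields 0 instead of raising KeyError.
print(counts['banana'])  # 0
```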