neural network - Post category (page 2) - chease

Model Distillation

Abstract: To do. Read more

posted @ 2019-08-05 15:52 chease Views(521) Comments(0) Recommendations(0)

Abstract: 1. Triplet https://www.jianshu.com/p/46c6f68264a1 http://lawlite.me/2018/10/16/Triplet-Loss%E5%8E%9F%E7%90%86%E5%8F%8A%E5%85%B6%E5%AE%9E%E7%8E%B0/ http Read more

posted @ 2019-05-06 17:28 chease Views(648) Comments(0) Recommendations(0)

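Generative Adversarial Networks (GAN)

Abstract: https://www.leiphone.com/news/201706/ty7H504cn7l6EVLd.html https://www.msra.cn/zh-cn/news/features/gan-20170511 1. Purpose of the discriminative model: to decide whether a given image comes from the real sample set or the fake sample set. Suppose Read more

posted @ 2019-04-08 17:46 chease Views(177) Comments(0) Recommendations(0)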

Graph and Session

This post is password-protected.

posted @ 2019-03-19 18:13 chease Views(2) Comments(0) Recommendations(0)

NLP Semantic Matching

Abstract: References: [Frontiers of Sogou's semantic matching technology] https://www.jiqizhixin.com/articles/2018-10-25-16?from=synced&keyword=%E6%90%9C%E7%8B%97%E8%AF%AD%E4%B9%89%E5%8C%B9%E9%85%8D Read more

posted @ 2019-03-11 10:17 chease Views(1060) Comments(0) Recommendations(0)

Transformer

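Abstract: References: [BERT is all the rage but you don't understand the Transformer? This one article is enough] https://zhuanlan.zhihu.com/p/54356280 (Chinese version) http://jalammar.github.io/illustrated-transformer/ (Google AI blog, English version) https: Read more

posted @ 2019-03-11 10:11 chease Views(267) Comments(0) Recommendations(0)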

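NLP Sentence Representations: On the Shoulders of NLP Giants (Part 2): From CoVe to BERT (repost)

Abstract: In-depth long read: On the Shoulders of NLP Giants (Part 1): https://www.jiqizhixin.com/articles/2018-12-10-17 On the Shoulders of NLP Giants (Part 2): From CoVe to BERT: https://www.jiqizhixin.com/articles/2018-12-17-17?from= Read more

posted @ 2019-03-11 10:09 chease Views(717) Comments(0) Recommendations(0)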

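Abandon Illusions and Fully Embrace the Transformer: Comparing the Three Major NLP Feature Extractors (CNN/RNN/Transformer)

Abstract: Reference: https://zhuanlan.zhihu.com/p/54743941 Read more

posted @ 2019-03-11 10:03 chease Views(482) Comments(0) Recommendations(0)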

Residual Networks (to do)

Abstract: References: https://zhuanlan.zhihu.com/p/42706477 https://zhuanlan.zhihu.com/p/22447440 Read more

posted @ 2019-02-13 17:50 chease Views(107) Comments(0) Recommendations(0)

TensorFlow High-Level Interfaces: TensorLayer and Keras

Abstract: https://www.zhihu.com/question/50030898 https://zhuanlan.zhihu.com/p/25296966 https://www.jiqizhixin.com/articles/2017-08-02 Read more

posted @ 2018-12-24 19:50 chease Views(321) Comments(0) Recommendations(0)

Word Vectors

Abstract: 1. word2vec 2. FastText https://blog.csdn.net/sinat_26917383/article/details/54850933 3. GloVe embedding 4. ELMo Read more

posted @ 2018-10-17 19:29 chease Views(150) Comments(0) Recommendations(0)

TensorFlow Distributed Training

Abstract: https://blog.csdn.net/hjimce/article/details/61197190 TensorFlow distributed training https://cloud.tencent.com/developer/article/1006345 Distributed TensorFlow: how distribution works and best practices ht Read more

posted @ 2018-08-24 15:35 chease Views(400) Comments(0) Recommendations(0)

has invalid type <class 'numpy.ndarray'>, must be a string or Tensor

Abstract: Reposted from: https://blog.csdn.net/jacke121/article/details/78833922 has invalid type <class 'numpy.ndarray'>, must be a string or Tensor. (Can not convert a n Read more

posted @ 2018-08-21 11:15 chease Views(4597) Comments(0) Recommendations(0)

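On the Zero-Centered Problem of Activation Functions

Abstract: Reposted from: https://liam0205.me/2018/04/17/zero-centered-active-function/ https://zhuanlan.zhihu.com/p/307901500 Today, while we were discussing activation functions in neural networks, a classmate named Lu pointed out that the output of the Sigmoid function is not zero-centered ( Read more

posted @ 2018-08-17 13:33 chease Views(1002) Comments(0) Recommendations(0)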
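
As a quick illustration of the point raised in this abstract, the sketch below evaluates Sigmoid and tanh on a small symmetric set of inputs. The input values and variable names are made up for illustration and do not come from the linked posts; the sketch only shows that Sigmoid outputs all lie in (0, 1) and average around 0.5, whereas tanh outputs average around 0.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical inputs, chosen to be symmetric around zero.
inputs = [-3.0, -1.0, 0.0, 1.0, 3.0]

sigmoid_out = [sigmoid(x) for x in inputs]
tanh_out = [math.tanh(x) for x in inputs]

# Every sigmoid output is positive, so the mean sits near 0.5 rather than 0.
sigmoid_mean = sum(sigmoid_out) / len(sigmoid_out)
# tanh is an odd function, so symmetric inputs give a mean near 0.
tanh_mean = sum(tanh_out) / len(tanh_out)

print("sigmoid mean:", sigmoid_mean)
print("tanh mean:", tanh_mean)
```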

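Recurrent Neural Networks (RNN) and LSTM

Abstract: Details: 01) Why the LSTM forget-gate bias is initialized to a relatively large value, see: https://zhuanlan.zhihu.com/p/113109644 I. Recurrent Neural Networks (RNN) RNN overview https://juejin.im/entry/5b97e36cf265da0aa81be239 Why RNNs use tan Read more

posted @ 2018-08-16 17:10 chease Views(663) Comments(0) Recommendations(0)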

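Causes of Vanishing and Exploding Gradients and Their Solutions (repost)

Abstract: Reposted from: https://blog.csdn.net/qq_25737169/article/details/78847691 Preface: This article takes an in-depth look at the vanishing and exploding gradient problems in deep learning and their solutions. It has three parts: the first gives an intuitive explanation of why deep learning uses gradient updates; the second mainly covers vanishing gradients in deep learning Read more

posted @ 2018-08-16 17:02 chease Views(889) Comments(0) Recommendations(0)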

Using a Specific GPU and GPU Memory in TensorFlow: CUDA_VISIBLE_DEVICES

Abstract: References: https://blog.csdn.net/jyli2_11/article/details/73331126 https://blog.csdn.net/cfarmerreally/article/details/80321276 http://www.cnblogs.com/darkkn Read more

posted @ 2018-08-14 12:55 chease Views(32572) Comments(2) Recommendations(3)
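
For readers who only need the gist of the CUDA_VISIBLE_DEVICES part of this topic, here is a minimal sketch. It assumes a machine with at least two GPUs; the device index "1" is a hypothetical choice, not something taken from the linked posts. The sketch covers device selection only, not GPU memory limits.

```python
import os

# Set the variable before the framework initializes CUDA; in practice this
# means before `import tensorflow`. "1" is a hypothetical physical GPU index.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# The value is a comma-separated list of physical device indices.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print("GPUs exposed to this process:", visible)

# Inside the process the exposed devices are renumbered from 0, so physical
# GPU 1 would appear to the framework as device 0.
```

The same effect can be had from the shell by prefixing the command, for example `CUDA_VISIBLE_DEVICES=1 python train.py`, where `train.py` is a placeholder script name.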

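Deep Learning: Weight Initialization

Abstract: Reposted from: https://www.leiphone.com/news/201703/3qMp45aQtbxTdzmK.htmla https://blog.csdn.net/shuzfan/article/details/51338178 [derivation of the principles] Background: Training a deep learning model is essentially a process performed on the weight Read more

posted @ 2018-08-13 17:51 chease Views(1129) Comments(0) Recommendations(0)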

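A Comprehensive Summary and Comparison of Deep Learning Optimization Methods (SGD, Adagrad, Adadelta, Adam, Adamax, Nadam) (repost)

Abstract: Reposted from: https://zhuanlan.zhihu.com/p/22252270 ycszen References: One framework for understanding the similarities and differences of optimization algorithms, SGD/AdaGrad/Adam: https://zhuanlan.zhihu.com/p/32230623 https://blog.csdn.net/llx1 Read more

posted @ 2018-06-07 19:30 chease Views(5088) Comments(0) Recommendations(0)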

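The Difference Between epoch, iteration, and batchsize

Abstract: Reposted from: https://blog.csdn.net/qq_27923041/article/details/74927398 The terms epoch, iteration, and batchsize come up constantly in deep learning; below I explain the differences among the three as I understand them: (1) batchsize: the batch size. In deep learning, training generally uses SGD Read more

posted @ 2018-05-23 12:41 chease Views(348) Comments(0) Recommendations(0)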
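
The relationship among the three terms can be sketched with a little arithmetic. The dataset size, batch size, and epoch count below are hypothetical numbers chosen for illustration, and the helper name `iterations_per_epoch` is not from the linked post. The sketch assumes the last, smaller batch of each epoch is still processed.

```python
import math


def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of parameter updates needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


# Hypothetical training setup.
num_samples = 1000  # size of the training set
batch_size = 32     # samples consumed per iteration
epochs = 10         # full passes over the training set

per_epoch = iterations_per_epoch(num_samples, batch_size)
total_iterations = per_epoch * epochs

print("iterations per epoch:", per_epoch)
print("total iterations:", total_iterations)
```

If the incomplete final batch were dropped instead, `math.ceil` would be replaced by integer division.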
