Post category - Deep Reinforcement Learning

This post is password-protected.
posted @ 2017-04-29 11:41 AHU-WangXiao Views (8) Comments (0) Recommendations (0)
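Summary: Policy gradient methods in reinforcement learning: the REINFORCE algorithm (from principles to code implementation) 2018-04-01 15:15:42 I have recently been reading about policy gradient algorithms. One of the classic ones is the REINFORCE algorithm, which has been widely applied to a variety of computer vision tasks. [The REINFORCE algorithm … Read more
posted @ 2017-03-26 16:04 AHU-WangXiao Views (19313) Comments (2) Recommendations (6)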
Summary: Evolution Strategies as a Scalable Alternative to Reinforcement Learning This blog is from: https://blog.openai.com/evolution-strategies/ MARCH 24, 2017 Read more
posted @ 2017-03-25 09:28 AHU-WangXiao Views (583) Comments (0) Recommendations (0)
Summary: Deep Deterministic Policy Gradients in TensorFlow AUG 21, 2016 This blog is from: http://pemami4911.git Read more
posted @ 2017-02-26 09:42 AHU-WangXiao Views (684) Comments (0) Recommendations (0)
Summary: Reposted from: https://jaromiru.com/2017/02/16/lets-make-an-a3c-theory/ Let’s make an A3C: Theory February 16, 2017 A3C This article is part of series Let’s make Read more
posted @ 2017-02-17 09:05 AHU-WangXiao Views (596) Comments (0) Recommendations (0)
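Summary: Frontier Algorithmic Ideas in Deep Reinforcement Learning CSDN Author: Flood Sung 2017-02-16 09:34:29 Views: 3361 Author: Flood Sung, CSDN blogger and graduate student in artificial intelligence, focusing on research in deep learning, reinforcement learning, and robotics. Editor: He Yongcan (何永灿). Technical submissions, commissioned articles, and corrections in the AI field are welcome; please send Read more
posted @ 2017-02-16 12:19 AHU-WangXiao Views (2047) Comments (0) Recommendations (0)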
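Summary: Reposted from: http://mp.weixin.qq.com/s/Xe3g2OSkE3BpIC2wdt5J-A Large-Scale Machine Learning at Google: Model Training, Feature Engineering, and Algorithm Selection (32-slide PPT download) 2017-01-26 新智元 Compiled by 新智元 Sources: ThingsExpo, Medium Author: Natalia Ponomar Read more
posted @ 2017-02-12 09:04 AHU-WangXiao Views (641) Comments (0) Recommendations (0)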
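Summary: Reposted from: http://mp.weixin.qq.com/s/aAHbybdbs_GtY8OyU6h5WA Feature | A Survey of Deep Reinforcement Learning: From the Power Behind AlphaGo to Shared Learning Resources (Paper Included) Original 2017-01-28 Yuxi Li 机器之心 Selected from arXiv Author: Yuxi Li Compiled by: Xavier Read more
posted @ 2017-02-12 09:02 AHU-WangXiao Views (2913) Comments (0) Recommendations (0)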
This post is password-protected.
posted @ 2017-01-21 20:24 AHU-WangXiao Views (6) Comments (0) Recommendations (0)
This post is password-protected.
posted @ 2016-12-17 16:30 AHU-WangXiao Views (3) Comments (0) Recommendations (0)
Summary: Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. The papers are organized based on manually-defined b Read more
posted @ 2016-12-04 13:04 AHU-WangXiao Views (725) Comments (0) Recommendations (0)
Summary: Deep Learning Research Review Week 2: Reinforcement Learning Reposted from: https://adeshpande3.github.io/adeshpande3.github.io/Deep-Learning-Research-Review-We Read more
posted @ 2016-11-17 12:42 AHU-WangXiao Views (608) Comments (0) Recommendations (0)
Summary: Hierarchical Object Detection with Deep Reinforcement Learning NIPS 2016 Workshop Paper: https://arxiv.org/pdf/1611.03718v1.pdf Project Page: https: Read more
posted @ 2016-11-15 15:48 AHU-WangXiao Views (1905) Comments (2) Recommendations (0)
Summary: The AlphaGo Replication Wiki Excerpted from: https://github.com/Rochester-NRT/RocAlphaGo/wiki/01.-Home Contents: Home 01. Home 02. Code 03. Data 04. Neural Networ Read more
posted @ 2016-11-06 19:36 AHU-WangXiao Views (874) Comments (0) Recommendations (0)
Summary: Progressive Neural Network Google DeepMind Abstract: For achieving human-level intelligence, learning to solve complex sequences of tasks while leveraging transfer and avoiding catastrophic forgetting remains a key Read more
posted @ 2016-10-26 22:40 AHU-WangXiao Views (5244) Comments (1) Recommendations (0)
Summary: The Let’s make a DQN series Let’s make a DQN: Theory September 27, 2016 DQN This article is part of series Let’s make a DQN. 1. Theory 2. Implementation 3. Debug Read more
posted @ 2016-10-25 08:28 AHU-WangXiao Views (1950) Comments (0) Recommendations (0)
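Summary: AlphaGo's Principles and Weaknesses Explained in One Diagram 2016-03-23 Zheng Yu (郑宇), Zhang Junbo (张钧波) CKDD About the authors: Zheng Yu, Ph.D., Editor-in-Chief of ACM Transactions on Intelligent Systems and Technology and Secretary-General of the ACM data mining China chapter. Zhang Junbo, Ph.D. Read more
posted @ 2016-10-21 22:37 AHU-WangXiao Views (684) Comments (0) Recommendations (0)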
Summary: Byte Tank Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playing Out Run, session 201609171218_175eps No time lim Read more
posted @ 2016-10-08 00:22 AHU-WangXiao Views (780) Comments (0) Recommendations (0)
Summary: Deep Recurrent Q-Learning for Partially Observable MDPs Abstract: DQN has two shortcomings: limited memory, and reliance on being able to perceive the complete game screen at e Read more
posted @ 2016-10-03 21:25 AHU-WangXiao Views (4486) Comments (0) Recommendations (0)