Archive: May 2018

Summary: https://en.wikipedia.org/wiki/Bayesian_inference Read more
posted @ 2018-05-31 23:09 ecoflex Views(138) Comments(0) Recommendations(0)
Summary: Read more
posted @ 2018-05-31 14:29 ecoflex Views(124) Comments(0) Recommendations(0)
Summary: skip this lecture Read more
posted @ 2018-05-29 17:21 ecoflex Views(131) Comments(0) Recommendations(0)
Summary: after the break, we'll extend our IRL to continuous spaces Read more
posted @ 2018-05-29 14:55 ecoflex Views(203) Comments(0) Recommendations(0)
Summary: the yellow region corresponds to β, the blue region to α Read more
posted @ 2018-05-28 20:46 ecoflex Views(141) Comments(0) Recommendations(0)
Summary: make a compromise between the learnt policy and minimal cost! π hat uses states; π theta uses observations Read more
posted @ 2018-05-27 23:01 ecoflex Views(201) Comments(0) Recommendations(0)
Summary: MPC means replanning at every step. Every N steps, rebuild the dynamics model Read more
posted @ 2018-05-27 18:15 ecoflex Views(249) Comments(0) Recommendations(0)
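The MPC note above (replan at every step, refit the dynamics model every N steps) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the lecture: the 1-D system `x' = x + b*u`, the quadratic `cost`, the random-shooting planner, and the names `TRUE_B`, `plan`, `fit_model`, and `run_mpc`.

```python
import random

TRUE_B = 0.1  # hypothetical ground-truth gain, unknown to the planner


def real_step(x, u):
    """The real system (assumed 1-D and linear for this sketch)."""
    return x + TRUE_B * u


def cost(x, u):
    """Assumed quadratic cost: stay near 0 with small controls."""
    return x * x + 0.01 * u * u


def plan(x0, b_hat, rng, horizon=10, n_candidates=200):
    """Random shooting: sample action sequences, roll them out through
    the learned model x' = x + b_hat * u, and keep the cheapest one."""
    best_seq, best_cost = None, float("inf")
    for _ in range(n_candidates):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        x, total = x0, 0.0
        for u in seq:
            total += cost(x, u)
            x = x + b_hat * u
        if total < best_cost:
            best_seq, best_cost = seq, total
    return best_seq


def fit_model(data, fallback):
    """Least-squares estimate of b from (x, u, x_next) transitions."""
    den = sum(u * u for _, u, _ in data)
    if den == 0.0:
        return fallback
    return sum(u * (x_next - x) for x, u, x_next in data) / den


def run_mpc(x0, steps=50, refit_every=5, seed=0):
    rng = random.Random(seed)
    b_hat = 0.05  # deliberately wrong initial model
    data, x, traj = [], x0, [x0]
    for t in range(steps):
        u = plan(x, b_hat, rng)[0]      # replan, execute only the first action
        x_next = real_step(x, u)
        data.append((x, u, x_next))
        x = x_next
        traj.append(x)
        if (t + 1) % refit_every == 0:  # every N steps, rebuild the model
            b_hat = fit_model(data, b_hat)
    return traj, b_hat


if __name__ == "__main__":
    traj, b_hat = run_mpc(2.0)
    print("final state:", traj[-1], "estimated gain:", b_hat)
```

Only the first action of each plan is executed; the rest of the plan is discarded and recomputed from the newly observed state, which is what lets MPC tolerate an imperfect model.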
Summary: the transition probability is unknown, and we don't even need to estimate it Read more
posted @ 2018-05-26 23:04 ecoflex Views(154) Comments(0) Recommendations(0)
Summary: understand that correlated samples cause problems, and how parallelism solves the problem. another solution is replay buffers, fully utilizing the advantag Read more
posted @ 2018-05-26 19:57 ecoflex Views(223) Comments(0) Recommendations(0)
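As a rough illustration of the replay-buffer idea mentioned above, the sketch below stores transitions in a fixed-size buffer and samples them uniformly at random, so a training batch is no longer a run of consecutive, correlated steps. The class name `ReplayBuffer`, its methods, and the transition tuple layout are assumptions for this example, not code from the course.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest item when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sampling without replacement breaks up temporal correlation.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=3, seed=0)
    for t in range(5):
        buf.add(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
    print(len(buf), buf.sample(2))
```

Because old transitions are reused, this only suits off-policy methods, which can learn from data generated by an earlier version of the policy.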
Summary: in most AC algorithms, we actually just fit the value function; it is less common to fit the Q function as well. batch: offline, Monte Carlo. online: bootstrap, TD in Read more
posted @ 2018-05-26 12:28 ecoflex Views(218) Comments(0) Recommendations(0)
Summary: the green bar is the reward function, the blue curve is the probability of different trajectories. if the green bars are equally increased to the yellow bars, the res Read more
posted @ 2018-05-24 23:13 ecoflex Views(145) Comments(0) Recommendations(0)
Summary: first-order Markov chain. on-policy algorithms are easier to parallelize. off-policy algorithms have to fit a transition net and a policy net. much more comp Read more
posted @ 2018-05-24 18:13 ecoflex Views(161) Comments(0) Recommendations(0)
Summary: I got it wrong earlier: I should have watched the Fall 2017 course, but ended up watching the spring one. a neural network controls a virtual robot by imitating human motion. Domain shift causes the failure of supervised learning in Read more
posted @ 2018-05-24 16:43 ecoflex Views(1090) Comments(0) Recommendations(0)
Summary: initialization dramatically influences the trajectory. the current state depends on all past decisions. the ones reflect the dimensions being counted. Read more
posted @ 2018-05-24 13:59 ecoflex Views(319) Comments(0) Recommendations(0)
Summary: There are some problems: mismatch between model and reality; gradient explosion. so, the dynamics can be quite messy, and backpropagating can be quite probl Read more
posted @ 2018-05-23 19:14 ecoflex Views(348) Comments(0) Recommendations(0)
Summary: normally solved by sequential quadratic programming algorithms. an example of a linear system Read more
posted @ 2018-05-21 20:33 ecoflex Views(221) Comments(0) Recommendations(0)
Summary: You have to force experts to handle some uncommon and extreme situations. a mechanical way to learn. However, we don't know rt. if you use sequence GAN, Read more
posted @ 2018-05-19 20:21 ecoflex Views(469) Comments(0) Recommendations(0)
Summary: not only JS divergence can be applied to GAN; other divergences are all applicable! f star is convex. several ACG icons become very similar, if trai Read more
posted @ 2018-05-15 18:43 ecoflex Views(664) Comments(0) Recommendations(0)
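Summary: The Gaussian model is too limited. The images are too blurry. So is there a more general model? But if PG(x;θ) is a neural network, it's impossible to calcula Read more
posted @ 2018-05-15 14:50 ecoflex Views(459) Comments(0) Recommendations(0)
Summary: HW2: input a sentence, output an ACG icon. 3 targets: trains from the front view and side views, so that the output would be the average of the three pictures. Read more
posted @ 2018-05-14 23:12 ecoflex Views(1661) Comments(0) Recommendations(0)
Summary: The more useful one is the conditioned generator, which can control the corresponding text, audio, and images by controlling the input vector. https://zhuanlan.zhihu.com/p/24767059 Simply generating faces is not very meaningful, because you could just photograph a random passerby. But being able to generate a frontal photo from left and right photos is quite amazing. must learn to discern the turn Varia Read more
posted @ 2018-05-13 13:12 ecoflex Views(3583) Comments(0) Recommendations(0)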
Summary: https://www.bilibili.com/video/av15997678/ My own deep reinforcement learning code: https://github.com/ysgclight/Reinforcement-Learning-with-Pytorch D Read more
posted @ 2018-05-06 14:54 ecoflex Views(183) Comments(0) Recommendations(0)
Summary: data augmentation Read more
posted @ 2018-05-05 19:46 ecoflex Views(187) Comments(0) Recommendations(0)
Summary: 10 free hours to run on AWS. click this one, click on new machine, pick a region, choose Linux Ubuntu 16, 250GB is preferred, ctrl shift v to paste your passwo Read more
posted @ 2018-05-05 18:49 ecoflex Views(572) Comments(0) Recommendations(0)
Summary: https://www.bilibili.com/video/av22940029 left-hand side: NN being constructed. right-hand side: NN being called. turn the NN code into GPU compatible m Read more
posted @ 2018-05-04 18:02 ecoflex Views(160) Comments(0) Recommendations(0)
Summary: high bias if the robot has learnt something (no changes appear with iterations). however, in real-world tasks, the task could change a little bit, Read more
posted @ 2018-05-04 17:14 ecoflex Views(265) Comments(0) Recommendations(0)
Summary: model-free: high variance. model-based: high bias. within 1h of human demonstration of each task, VR!!! Read more
posted @ 2018-05-04 15:34 ecoflex Views(255) Comments(0) Recommendations(0)
Summary: intrinsic ambiguity: move toward the purple triangle? move away from the red triangle? move along the grey arrow? or a combination of them? the right part of the ri Read more
posted @ 2018-05-04 13:58 ecoflex Views(365) Comments(0) Recommendations(0)
Summary: Read more
posted @ 2018-05-03 18:55 ecoflex Views(186) Comments(0) Recommendations(0)
Summary: So, the process is similar to a one-to-many RNN? learns much more efficiently than model-free methods. iteratively gets better. less than 300 trials ~ 25min Read more
posted @ 2018-05-02 23:02 ecoflex Views(230) Comments(0) Recommendations(0)
Summary: you wouldn't try to explore any problem structure in DFO. low-dimensional policy. 30 degrees of freedom. 120 parameters to tune. keep the positive results i Read more
posted @ 2018-05-02 13:08 ecoflex Views(200) Comments(0) Recommendations(0)
Summary: ^ is the square root of epsilon. a simplified version of hard version. a smoother way to find the correct solution. the first term is the REINFORCE term, Read more
posted @ 2018-05-01 22:38 ecoflex Views(285) Comments(0) Recommendations(0)
Summary: fast feedback to the robot with a better-shaped reward function, and learning could be much faster. OpenAI Baselines, rllab. multiple tasks and multiple seeds to te Read more
posted @ 2018-05-01 21:34 ecoflex Views(352) Comments(0) Recommendations(0)
Summary: https://statweb.stanford.edu/~owen/mc/Ch-var-is.pdf https://zhuanlan.zhihu.com/p/29934206 the blue curve is the lower bounded one. conjugate gradient to so Read more
posted @ 2018-05-01 17:38 ecoflex Views(368) Comments(0) Recommendations(0)
Summary: https://drive.google.com/file/d/0BxXI_RttTZAhTUpqUFdEZ3BXNFE/view the game of Pong is an MDP. Finally got to see what AK looks like; he has great ideas and a good sense of humor. http://karpathy.github.io/ Read more
posted @ 2018-05-01 12:52 ecoflex Views(178) Comments(0) Recommendations(0)
Summary: http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html https://zhuanlan.zhihu.com/p/22252270 Read more
posted @ 2018-05-01 12:45 ecoflex Views(105) Comments(0) Recommendations(0)