Koha9

Advice on Naming Things in Code

A brief summary of CodeAesthetic's video "Naming Things in Code", on how to choose better names in code.

ReinforcementLearning Research DeepLearning AI

A Brief Summary of What Matters In On-Policy Reinforcement Learning?

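I recently read the paper What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study. It examines the details of the PPO policy loss, network architecture, initialization, transformation strategies, and related design choices.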

DeepLearning ReinforcementLearning

Designing the Reward Function in Reinforcement Learning

In reinforcement learning, the design of the reward function is crucial. I recently put together a rough summary of some reward function design principles.

Unity ML-Agents

The (<number of agents>, <action size>) Error in ML-Agents

When using the Python API in ML-Agents, the following error appears: The behavior needs a discrete input of dimension (xx, xx) for (<number of agents>, <action size>) but received input of dimension (xx, xx)
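The error message suggests that the discrete actions must be a 2-D array of shape (<number of agents>, <action size>). A minimal sketch of that reshaping step is shown here, using only NumPy; the helper name `to_discrete_batch` and the example sizes are hypothetical and not taken from the post or from ML-Agents itself.

```python
import numpy as np


def to_discrete_batch(actions, n_agents, action_size):
    """Reshape discrete actions to (n_agents, action_size).

    Hypothetical helper for illustration only; it is not part of ML-Agents.
    """
    batch = np.asarray(actions, dtype=np.int32)
    expected = n_agents * action_size
    if batch.size != expected:
        # Fail early with a clear message instead of sending a bad shape.
        raise ValueError(
            f"expected {expected} action values, got {batch.size}"
        )
    return batch.reshape(n_agents, action_size)


# Assumed setup: three agents, one discrete branch each.
flat_actions = [2, 0, 1]
batch = to_discrete_batch(flat_actions, n_agents=3, action_size=1)
print(batch.shape)
```

In recent ML-Agents releases, an array like this would typically be wrapped as `ActionTuple(discrete=batch)` before being passed to the environment; check this against the version you have installed.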

Tensorflow Python DeepLearning ReinforcementLearning

Implementing Mofan's PPO Algorithm with TensorFlow 2.8

Mofan once used TensorFlow 1.x to solve the Pendulum-v0 problem with PPO. I reimplemented it in TensorFlow 2.8.