Generalizing Q-learning to work with a continuous *action* space

2023-09-11 03:51:14 Author: 岁月染过的梦

I'm trying to get an agent to learn the mouse movements necessary to best perform some task in a reinforcement learning setting (i.e. the reward signal is the only feedback for learning).

I'm hoping to use the Q-learning technique, but while I've found a way to extend this method to continuous state spaces, I can't seem to figure out how to accommodate a problem with a continuous action space.
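For context, one common way to extend Q-learning to a continuous state space is to replace the Q-table with a function approximator over state features while keeping a small discrete action set. The sketch below illustrates that idea with semi-gradient Q-learning on a linear approximator; the 2-D state, the feature map `phi()`, and all hyperparameters are illustrative assumptions, not something stated in the question.

```python
import numpy as np

# Minimal sketch: semi-gradient Q-learning over a continuous 2-D state
# (e.g. the cursor position) with a hypothetical discrete action set.
n_actions = 8                          # assumed discrete action set
n_features = 32                        # size of the feature vector phi(s)
rng = np.random.default_rng(0)
P = rng.normal(size=(n_features, 2))   # fixed random projection of the 2-D state
w = np.zeros((n_actions, n_features))  # one weight vector per action

def phi(state):
    """Fixed random-projection features for a continuous 2-D state."""
    return np.cos(P @ np.asarray(state, dtype=float))

def q_values(state):
    """Q(s, a) = w[a] . phi(s) for every discrete action a."""
    return w @ phi(state)

def q_update(state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """One semi-gradient Q-learning step on the linear weights."""
    target = reward + (0.0 if done else gamma * q_values(next_state).max())
    td_error = target - q_values(state)[action]
    w[action] += alpha * td_error * phi(state)
```

Note that this still assumes a discrete action set, which is exactly where the difficulty described next comes from.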

I could just force all mouse movement to be of a certain magnitude and in only a certain number of different directions, but any reasonable way of making the actions discrete would yield a huge action space. Since standard Q-learning requires the agent to evaluate all possible actions, such an approximation doesn't solve the problem in any practical sense.
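To make the size problem concrete, here is a back-of-the-envelope check; the particular magnitude and direction counts are assumptions chosen only to show the growth.

```python
# Rough arithmetic on discretizing mouse moves: even a coarse grid of
# magnitudes and directions gives hundreds of actions per step, and standard
# Q-learning must evaluate (take an argmax over) all of them at every step.
n_magnitudes = 10        # e.g. 1..10 pixels per move (assumption)
n_directions = 36        # e.g. one direction every 10 degrees (assumption)
actions_per_step = n_magnitudes * n_directions
print(actions_per_step)        # 360 discrete actions per step
print(actions_per_step ** 3)   # 46,656,000 possible three-step gestures
```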

Recommended Answer

The common way of dealing with this problem is with actor-critic methods. These naturally extend to continuous action spaces. Basic Q-learning can diverge when working with approximations; however, if you still want to use it, you can try combining it with a self-organizing map, as done in "Applications of the self-organising map to reinforcement learning". The paper also contains some further references you might find useful.
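For concreteness, below is a minimal sketch of what an actor-critic agent with a continuous 2-D action (a mouse displacement) might look like, assuming a PyTorch setup. The network sizes, the Gaussian policy, and the one-step TD update are illustrative choices, not the method from the cited paper.

```python
# Minimal one-step actor-critic sketch for a continuous 2-D action.
# Assumes states/actions arrive as torch tensors (e.g. from a Gym-style loop).
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Gaussian policy: maps a state to the mean of a 2-D action."""
    def __init__(self, state_dim, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                                 nn.Linear(64, action_dim))
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def dist(self, state):
        return torch.distributions.Normal(self.net(state), self.log_std.exp())

class Critic(nn.Module):
    """State-value function V(s)."""
    def __init__(self, state_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                                 nn.Linear(64, 1))

    def forward(self, state):
        return self.net(state).squeeze(-1)

def update(actor, critic, opt_a, opt_c,
           state, action, reward, next_state, done, gamma=0.99):
    """One-step actor-critic update using the TD error as the advantage."""
    value = critic(state)
    with torch.no_grad():
        target = reward + gamma * critic(next_state) * (1.0 - done)
    td_error = target - value

    # Critic: regress V(s) toward the bootstrapped target.
    critic_loss = td_error.pow(2).mean()
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    # Actor: raise the log-probability of actions with positive TD error.
    log_prob = actor.dist(state).log_prob(action).sum(-1)
    actor_loss = -(log_prob * td_error.detach()).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()
```

In practice you would add exploration noise, batching, and possibly target networks, but this captures the key structural point: the actor outputs a continuous action directly, so there is no maximization over a discretized action set.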