This is a machine learning agent I developed using Unity’s ML-Agents Toolkit. The agent’s task is to push a box toward a goal (the green tile). I selected PPO (Proximal Policy Optimization) as the trainer because of its robustness and proven performance in continuous control tasks.
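For readers unfamiliar with PPO, here is a minimal sketch of the clipped surrogate objective at its core. This is not the ML-Agents implementation; the function name `ppo_clipped_objective`, its parameters, and the default `epsilon=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximise).

    Hypothetical helper for illustration only, not part of ML-Agents.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is a large part of why PPO tends to train stably.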

TensorBoard was used to visualise performance metrics such as cumulative reward, episode length and policy entropy over time.
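To make these three metrics concrete, here is a rough sketch of how each could be computed by hand for a single episode. The helper names and the reward values are made up for illustration, and the entropy function assumes a discrete action distribution, which may not match this agent's action space. ML-Agents logs its own versions of these statistics.

```python
import math


def episode_summary(rewards):
    """Return (cumulative reward, episode length) for one episode's rewards."""
    return sum(rewards), len(rewards)


def categorical_entropy(action_probs):
    """Shannon entropy (in nats) of a discrete action distribution.

    Assumes a discrete action space; zero-probability actions contribute nothing.
    """
    return sum(-p * math.log(p) for p in action_probs if p > 0.0)


# Hypothetical episode: a small per-step penalty, then a reward on reaching the goal.
rewards = [-0.001, -0.001, 1.0]
cumulative_reward, episode_length = episode_summary(rewards)
```

During a healthy training run, cumulative reward typically rises while entropy falls gradually as the policy becomes more decisive.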

You can view the source code here
