Machine Learning Puzzle Agent
This is a machine learning agent I developed using Unity’s ML-Agents Toolkit. The agent’s task is to push a box onto a goal tile (shown in green). I chose Proximal Policy Optimization (PPO) as the trainer because of its robustness and proven performance in continuous control tasks.
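For readers unfamiliar with PPO, much of its robustness comes from the clipped surrogate objective, which limits how far one update can move the policy away from the one that collected the data. The snippet below is a minimal, illustrative sketch of that objective in plain Python. It is not the ML-Agents implementation; the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (a value to maximise).

    Illustrative only: names and the clip_eps default are assumptions,
    not values taken from this project's training configuration.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds `1 + clip_eps`, so a single batch cannot push the policy arbitrarily far.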

TensorBoard was used to visualise performance metrics such as cumulative reward, episode length and policy entropy over time.
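As a rough guide to what these three metrics mean, the sketch below computes them for a single episode. It is only an illustration: ML-Agents records these statistics itself during training, and the function name `episode_metrics`, the reward values, and the use of a discrete action distribution are all assumptions made for the example.

```python
import math


def episode_metrics(rewards, action_probs_per_step):
    """Summarise one episode (hypothetical helper, not part of ML-Agents).

    rewards: per-step rewards for the episode.
    action_probs_per_step: per-step action probabilities (assumed discrete).
    """
    # Shannon entropy of the policy's action distribution at each step.
    entropies = [
        -sum(p * math.log(p) for p in probs if p > 0.0)
        for probs in action_probs_per_step
    ]
    return {
        "cumulative_reward": sum(rewards),
        "episode_length": len(rewards),
        "mean_policy_entropy": sum(entropies) / len(entropies) if entropies else 0.0,
    }


# Made-up episode: small per-step penalties, then a reward for reaching the goal.
metrics = episode_metrics(
    rewards=[-0.01, -0.01, 1.0],
    action_probs_per_step=[[0.25, 0.25, 0.25, 0.25]] * 3,
)
```

In general, a rising cumulative reward together with a falling policy entropy suggests the agent is settling on a consistent strategy rather than acting randomly.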

You can view the source code here