This week I worked on fixing and documenting bugs in RoCo to prepare it for an upcoming handover to a new team. Since I will be moving on to a new project later this quarter, I also performed a literature search on the current state of the art in reinforcement learning, particularly as applied to robotics.
In [1], the authors generalize readings from 2D range sensors into a sensor-agnostic input representation for a neural network used in robot control. They combine several neural networks and other machine learning models into an overall controller that allows a robot to navigate while avoiding obstacles.
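As a rough illustration of this kind of controller (not the authors' actual architecture; the layer sizes, normalization, and weights below are all hypothetical stand-ins), a small feedforward net can map normalized range readings to wheel commands:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 16 range readings in, 2 wheel commands out.
N_RANGES, N_HIDDEN, N_OUT = 16, 8, 2

# Randomly initialized weights stand in for a trained controller.
W1 = rng.normal(scale=0.1, size=(N_HIDDEN, N_RANGES))
W2 = rng.normal(scale=0.1, size=(N_OUT, N_HIDDEN))

def normalize_ranges(ranges, max_range=5.0):
    """Clip and scale raw readings to [0, 1] so the same network can
    accept data from range sensors with different maximum ranges --
    one way to 'generalize' the sensor input."""
    return np.clip(ranges, 0.0, max_range) / max_range

def controller(ranges):
    """Map normalized range readings to left/right wheel commands."""
    h = np.tanh(W1 @ normalize_ranges(ranges))
    return np.tanh(W2 @ h)

cmd = controller(rng.uniform(0.0, 5.0, size=N_RANGES))
```

The normalization step is the key idea here: it decouples the network's input space from any one sensor's range and resolution.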
In [2], the authors took an LSTM that had been pre-trained with supervised learning for music generation and refined it with reinforcement learning, using a reward that imposes additional constraints drawn from music theory. The result is a refined LSTM that produces more musical melodies.
My initial plan was to create a simulation environment in which neural network-based robot controllers would be refined with reinforcement learning algorithms driven by user-defined reward functions. After discussing it with Professor Mehta and my colleagues, I am now looking into optimizing robot geometry parameters instead: simulating robot structures and searching for the geometry that maximizes a user-defined reward function. I hope to flesh out the details soon and write a project proposal.
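To make the proposal idea concrete, the loop would look roughly like the following sketch. Everything here is hypothetical: the "simulation" is a placeholder for a real physics simulator, and the reward (reach minus a material-budget penalty) is just an example of a user-defined objective, with random search standing in for a proper optimizer:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_reach(link_lengths):
    """Placeholder 'simulation': the reach of a 2-link arm is the sum
    of its link lengths. A real version would run a physics simulator
    on the generated robot structure."""
    return float(np.sum(link_lengths))

def user_reward(link_lengths):
    """Example user-defined reward: maximize reach, but penalize
    total material used beyond a budget of 2.0 units."""
    reach = simulate_reach(link_lengths)
    material = float(np.sum(link_lengths))
    return reach - 5.0 * max(0.0, material - 2.0)

# Simple random search over the geometry parameters.
best_params, best_r = None, -np.inf
for _ in range(2000):
    candidate = rng.uniform(0.1, 2.0, size=2)  # two link lengths
    r = user_reward(candidate)
    if r > best_r:
        best_params, best_r = candidate, r
```

Random search is only a baseline; the interesting design question for the proposal is which optimizer (evolutionary strategies, Bayesian optimization, or gradient-based methods through a differentiable simulator) suits expensive structural simulations best.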
[1] Dezfoulian, S. H., Wu, D., & Ahmad, I. S. (2013). A Generalized Neural Network Approach to Mobile Robot Navigation and Obstacle Avoidance.
[2] Jaques, N., Gu, S., Turner, R. E., & Eck, D. (2017). Tuning Recurrent Neural Networks With Reinforcement Learning.