This past week Prathyush and I worked on formulating an algorithm to co-optimize structure and control. Because geometric and control parameters can affect the robot's reward in different ways, past research has separated the two, alternating between optimizing the control parameters and optimizing the geometric parameters. In theory, however, this could make the optimization more susceptible to initial conditions, since the algorithm is never allowed to change the structure and the control concurrently. The main functions involved in the optimization can be seen below. We are planning to use a PPO-like update for the control parameters and a PGPE-like update for the geometry. By optimizing the controller on geometry samples, we hope to let the algorithm "explore" more without fixing the geometric parameters.
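To make the idea more concrete, here is a minimal sketch of what one such loop might look like. It is an illustration under several assumptions, not the actual implementation: `toy_reward` is a made-up stand-in for a robot rollout, the controller is a linear Gaussian policy with weights `w`, the geometry is drawn from a Gaussian with mean `mu` and a fixed standard deviation `sigma` (PGPE normally adapts `sigma` as well), and all function names and hyperparameters are hypothetical.

```python
import numpy as np

def toy_reward(geoms, actions):
    """Hypothetical stand-in for a rollout. The best action for a geometry is
    the sum of its entries, and the best geometry is all ones."""
    return -(actions - geoms.sum(axis=1)) ** 2 - np.sum((geoms - 1.0) ** 2, axis=1)

def ppo_active_mask(ratio, adv, clip_eps):
    """1.0 where the clipped surrogate still passes gradient, 0.0 where the
    clip has cut it off (ratio already moved too far in the helpful direction)."""
    clipped_high = (adv > 0) & (ratio > 1 + clip_eps)
    clipped_low = (adv < 0) & (ratio < 1 - clip_eps)
    return (~(clipped_high | clipped_low)).astype(float)

def co_optimize(iters=200, n_samples=64, dim=2, sigma=0.3, pol_std=0.5,
                lr_geom=0.05, lr_ctrl=0.02, clip_eps=0.2, ppo_epochs=4, seed=0):
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)  # mean of the geometry distribution (PGPE-like)
    w = np.zeros(dim)   # controller weights (PPO-like)
    for _ in range(iters):
        # Sample a batch of geometries and act on each one with the current policy.
        geoms = mu + sigma * rng.standard_normal((n_samples, dim))
        mean_old = geoms @ w
        actions = mean_old + pol_std * rng.standard_normal(n_samples)
        rewards = toy_reward(geoms, actions)

        # Shared mean baseline; normalized advantages for the control update.
        adv = rewards - rewards.mean()
        adv_n = adv / (adv.std() + 1e-8)

        # PPO-like control update: a few epochs of clipped-surrogate ascent
        # on the same batch, with gradients written out by hand.
        logp_old = -0.5 * ((actions - mean_old) / pol_std) ** 2
        for _ in range(ppo_epochs):
            mean_new = geoms @ w
            logp_new = -0.5 * ((actions - mean_new) / pol_std) ** 2
            ratio = np.exp(logp_new - logp_old)
            weight = ppo_active_mask(ratio, adv_n, clip_eps) * ratio * adv_n
            grad_logp = ((actions - mean_new) / pol_std ** 2)[:, None] * geoms
            w = w + lr_ctrl * (weight[:, None] * grad_logp).mean(axis=0)

        # PGPE-like geometry update: move the mean toward samples that
        # scored above the baseline (sigma is kept fixed in this sketch).
        mu = mu + lr_geom * (adv[:, None] * (geoms - mu)).mean(axis=0)
    return mu, w

if __name__ == "__main__":
    mu, w = co_optimize()
    print("geometry mean:", mu)
    print("controller weights:", w)
```

Both updates use the same batch of sampled geometries, so the controller is always trained across a spread of structures rather than a single fixed one. In this toy setup the reward is maximized when the geometry and `w` are both all ones, which gives something to compare the returned values against.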
