Methods to Incorporate Machine Learning for Control System Applications
Current methods in reinforcement learning typically strive to develop control strategies with no prior controller structure or domain knowledge. These approaches are attractive because they natively capture non-linearities and can discover non-intuitive solutions. However, similar improvements and efficiencies can be achieved by integrating machine learning with existing control system architectures, thereby providing a more robust controller solution. This work focuses on efficiently combining machine learning tools with existing control system solutions to retain the benefits of the existing approaches while improving overall stability and performance. Three main methods were proposed to address this problem.
The first method was to use neural networks to map existing conditions and performance goals to a set of controller parameters. The goal was to use the neural network to perform gain scheduling, ensuring that the desired response was maintained even in the presence of disturbances or component degradation. This approach was applied both to mitigate a wind disturbance on a quadrotor drone and to maintain performance when system components were not operating at expected conditions in a nuclear power plant simulation. This process resulted in a stable response with tighter control and improved disturbance rejection. This method is best suited to process control applications with well-defined transients or system disturbances.
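The gain-scheduling idea can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the trained neural network is stood in for by simple linear interpolation between two hypothetical gain sets, and all names, gains, and the wind-speed input are illustrative assumptions.

```python
# Sketch of neural-network gain scheduling for a PID loop.
# In the described method, a neural network maps operating conditions and
# performance goals to controller gains; here a placeholder interpolation
# between two illustrative gain sets plays that role.

def schedule_gains(wind_speed, max_wind=10.0):
    """Return (Kp, Ki, Kd) for the current disturbance level (illustrative)."""
    calm = (1.0, 0.1, 0.05)    # hypothetical gains tuned for no wind
    windy = (2.5, 0.4, 0.20)   # hypothetical gains tuned for strong wind
    w = min(max(wind_speed / max_wind, 0.0), 1.0)
    return tuple((1 - w) * c + w * g for c, g in zip(calm, windy))

class PID:
    """Standard PID controller whose gains are supplied externally."""
    def __init__(self):
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, gains, dt=0.01):
        kp, ki, kd = gains
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * deriv

pid = PID()
# The scheduler adapts the gains as the disturbance level changes.
u = pid.step(error=0.5, gains=schedule_gains(wind_speed=6.0))
```

The key structural point is that the existing controller is unchanged; only its parameters are updated by the learned (here, mocked) scheduler.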
The second method combined reinforcement learning with traditional control system architectures. The reinforcement learning agents were trained to perform gain scheduling or to provide parallel control signals to achieve the desired performance. These agents trained quickly and efficiently, reaching high rewards faster than a traditional reinforcement learning agent. This method was tested in several simulated environments and compared against both a traditional reinforcement learning agent and the existing controller. The resulting response was not only more stable and robust but also outperformed that of a traditional reinforcement learning agent. This method is best suited for reference tracking applications that have an existing controller design.
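The parallel-signal variant can be sketched as a residual control law: the plant receives the sum of the existing controller's output and the agent's action, so the agent only learns a correction on top of a stabilizing baseline. This is a toy sketch under stated assumptions: the "agent" is a fixed placeholder policy, and the first-order plant, gains, and function names are all illustrative.

```python
# Sketch of an RL agent acting in parallel with an existing controller.
# The combined control signal is u = u_baseline + u_agent, so the agent
# only needs to learn a residual correction, not the whole control law.

def baseline_control(error, kp=2.0):
    return kp * error  # existing proportional controller (illustrative gain)

def agent_policy(state):
    # Placeholder for a trained RL policy returning a small residual action.
    (error,) = state
    return 0.2 * error

def plant_step(x, u, dt=0.1):
    # Illustrative first-order plant: x' = -x + u, integrated with Euler steps.
    return x + dt * (-x + u)

setpoint, x = 1.0, 0.0
for _ in range(100):
    error = setpoint - x
    u = baseline_control(error) + agent_policy((error,))  # parallel signals
    x = plant_step(x, u)
```

Because the baseline controller already stabilizes the loop, exploration by the agent perturbs a working system rather than starting from scratch, which is one intuition for the faster and more stable training described above.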
In the final method, agent training was completed efficiently using actions generated by agents tasked with learning different skills. This method is not specific to control system applications, but is rather a means of accelerating agent training. In this method, a randomly selected agent provided the actions necessary to fulfill its task within each training episode. The other agents would store the states and actions in their own experience replay, with the reward computed by their own reward functions, thereby providing a vehicle for each agent to pursue a different skill or task. As each agent learned and its policy matured, the actions it chose would typically differ from those of the other agents while still maintaining a coherent strategy. This provided a more stable means of exploration as well as a richer experience base for training, resulting in a more well-rounded policy for the target agent. While the computational cost of maintaining additional agents was non-trivial, the training time was reduced to such a degree that the benefits outweighed the additional cost.
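The shared-experience scheme can be sketched as follows: one randomly chosen behavior agent supplies the actions for an episode, and every agent stores the resulting transitions in its own replay buffer, relabeled through its own reward function. This is an illustrative sketch only; the placeholder policy, toy dynamics, and reward functions are assumptions, not the dissertation's implementation.

```python
# Sketch of multi-agent shared-experience training: one agent acts,
# all agents record the transitions with their own rewards.
import random

class Agent:
    def __init__(self, reward_fn):
        self.reward_fn = reward_fn
        self.replay = []  # each agent keeps its own experience replay

    def act(self, state):
        return random.uniform(-1, 1)  # placeholder for a learned policy

def run_episode(agents, steps=5):
    behavior = random.choice(agents)   # one agent drives the whole episode
    state = 0.0
    for _ in range(steps):
        action = behavior.act(state)
        next_state = state + action    # toy transition dynamics
        for agent in agents:           # every agent relabels the reward
            r = agent.reward_fn(state, action, next_state)
            agent.replay.append((state, action, r, next_state))
        state = next_state

agents = [
    Agent(lambda s, a, ns: -abs(ns)),        # skill: stay near zero
    Agent(lambda s, a, ns: -abs(ns - 1.0)),  # skill: reach 1.0
]
run_episode(agents)
```

All agents see identical states and actions but different rewards, so a single stream of interaction fills every agent's replay buffer, which is the source of the training-efficiency gain described above.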
- Mechanical Engineering
- Doctor of Philosophy (PhD)