TY - JOUR
T1 - Deep reinforcement learning and randomized blending for control under novel disturbances
AU - Sohège, Yves
AU - Provan, Gregory
AU - Quiñones-Grueiro, Marcos
AU - Biswas, Gautam
N1 - Publisher Copyright:
© 2020 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0)
PY - 2020
Y1 - 2020
N2 - Enabling autonomous vehicles to maneuver in novel scenarios is a key unsolved problem. A well-known approach, Weighted Multiple Model Adaptive Control (WMMAC), uses a set of pre-tuned controllers and combines their control actions using a weight vector. Although WMMAC improves on traditional switched control by smoothing control oscillations, it depends on accurate fault isolation and cannot deal with unknown disturbances. A recent approach avoids state estimation by randomly assigning the controller weighting vector; however, it uses a uniform distribution for control-weight sampling, which is sub-optimal compared to state-estimation methods. In this article, we propose a framework that uses deep reinforcement learning (DRL) to learn weighted control distributions that optimize the performance of the randomized approach for both known and unknown disturbances. We show that RL-based randomized blending dominates pure randomized blending, a switched FDI-based architecture, and pre-tuned controllers on a quadcopter trajectory optimization task in which we penalize deviations in both position and attitude.
AB - Enabling autonomous vehicles to maneuver in novel scenarios is a key unsolved problem. A well-known approach, Weighted Multiple Model Adaptive Control (WMMAC), uses a set of pre-tuned controllers and combines their control actions using a weight vector. Although WMMAC improves on traditional switched control by smoothing control oscillations, it depends on accurate fault isolation and cannot deal with unknown disturbances. A recent approach avoids state estimation by randomly assigning the controller weighting vector; however, it uses a uniform distribution for control-weight sampling, which is sub-optimal compared to state-estimation methods. In this article, we propose a framework that uses deep reinforcement learning (DRL) to learn weighted control distributions that optimize the performance of the randomized approach for both known and unknown disturbances. We show that RL-based randomized blending dominates pure randomized blending, a switched FDI-based architecture, and pre-tuned controllers on a quadcopter trajectory optimization task in which we penalize deviations in both position and attitude.
KW - Design of fault tolerant/reliable systems
KW - Fault accommodation and reconfiguration strategies
KW - Methods based on neural networks and/or fuzzy logic for FDI
UR - https://www.scopus.com/pages/publications/85107757387
U2 - 10.1016/j.ifacol.2020.12.2313
DO - 10.1016/j.ifacol.2020.12.2313
M3 - Article
AN - SCOPUS:85107757387
SN - 2405-8971
VL - 53
SP - 8175
EP - 8180
JO - IFAC-PapersOnLine
JF - IFAC-PapersOnLine
T2 - 21st IFAC World Congress 2020
Y2 - 12 July 2020 through 17 July 2020
ER -