Impact of Exploration-to-exploitation Ratio on Energy Saving Potential of Plug-in Hybrid Vehicles Controlled by Reinforcement Learning

Volume 16: Low Carbon Cities and Urban Energy Systems: Part V

Bin Shuai, Quan Zhou, Huw Williams, Hongming Xu, Yanfei Li, Lun Hua

https://doi.org/10.46855/energy-proceedings-8428

Abstract

Electric vehicles, including hybrids, will dominate road transport in future cities. Reinforcement learning has shown its capacity for online optimization of energy management strategies for hybrid vehicles. Exploration of new control settings and exploitation of the existing control policy are two key procedures in reinforcement learning, but there is a lack of study on how the exploration-to-exploitation (E2E) ratio affects the energy efficiency improvement of hybrid vehicles. This paper introduces two decay functions, ‘Reciprocal function-based decay’ (RBD) and ‘Step-based decay’ (SBD), to generate E2E ratio trajectories for a reinforcement learning algorithm, which is conventionally based on an Exponential decay (EXD) function. By monitoring the improvement rate of vehicle energy efficiency during the learning process, the vehicle controlled by the Q-learning algorithm based on the SBD function is shown to perform best compared with the vehicles based on the RBD and EXD functions. The improvement rate can exceed 6.21%. In hardware-in-the-loop (HiL) testing, the SBD saves 1.52% energy compared with the EXD in real-time control under a predefined driving cycle.
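
The three decay families named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas or parameter values, so the function forms, the names (`exponential_decay`, `reciprocal_decay`, `step_decay`) and every default value below are assumptions chosen only to show the general shape of each schedule, here expressed as an exploration probability per learning episode.

```python
import math

# All parameter values are illustrative assumptions, not taken from the paper.

def exponential_decay(episode, eps_start=1.0, eps_min=0.05, rate=0.05):
    """EXD-style schedule: exploration shrinks exponentially with the episode."""
    return eps_min + (eps_start - eps_min) * math.exp(-rate * episode)

def reciprocal_decay(episode, eps_start=1.0, eps_min=0.05, rate=0.05):
    """RBD-style schedule: exploration shrinks as 1 / (1 + rate * episode)."""
    return eps_min + (eps_start - eps_min) / (1.0 + rate * episode)

def step_decay(episode, eps_start=1.0, eps_min=0.05, drop=0.5, step_size=20):
    """SBD-style schedule: exploration is held constant, then cut every step_size episodes."""
    eps = eps_start * drop ** (episode // step_size)
    return max(eps_min, eps)

if __name__ == "__main__":
    # Print the three trajectories side by side for a few episodes.
    for episode in (0, 10, 20, 40, 80):
        print(
            episode,
            round(exponential_decay(episode), 3),
            round(reciprocal_decay(episode), 3),
            round(step_decay(episode), 3),
        )
```

In an epsilon-greedy Q-learning loop, the value returned for the current episode would be used as the probability of choosing a random control action instead of the greedy one; this usage is likewise an assumption about how the E2E ratio enters the algorithm.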

Keywords: Vehicle energy efficiency, Reinforcement learning, Hybrid electric vehicle, Exploration-to-exploitation ratio

