Abstract
Electric vehicles, including hybrids, will dominate road transport in future cities. Reinforcement learning has shown its capability for online optimization of energy management strategies for hybrid vehicles. Exploration of new control settings and exploitation of the existing control policy are two key procedures in reinforcement learning, but there is a lack of study on how the exploration-to-exploitation (E2E) ratio affects the energy efficiency improvement of hybrid vehicles. This paper introduces two decay functions, ‘reciprocal function-based decay’ (RBD) and ‘step-based decay’ (SBD), to generate E2E ratio trajectories for a reinforcement learning algorithm, which is conventionally based on an exponential decay (EXD) function. Monitoring the rate of improvement in vehicle energy efficiency during the learning process shows that the vehicle controlled by the Q-learning algorithm with the SBD function performs best compared with the vehicles based on the RBD and EXD functions. The improvement rate can exceed 6.21%. In hardware-in-the-loop (HiL) testing, the SBD saves 1.52% energy compared with the EXD in real-time control under a predefined driving cycle.
Keywords: Vehicle energy efficiency, Reinforcement learning, Hybrid electric vehicle, Exploration-to-exploitation ratio
Copyright ©
Energy Proceedings
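
The three E2E ratio schedules named in the abstract (EXD, RBD, SBD) can be illustrated with a minimal sketch. The abstract does not give the functional forms or parameters, so everything below is an assumption chosen for illustration: the function names `exd`, `rbd`, and `sbd`, the initial ratio `eps0`, the decay `rate`, the step factor `drop`, and the step length `step_size` are all hypothetical and are not the authors' settings.

```python
import math


def exd(episode, eps0=1.0, rate=0.05):
    """Exponential decay (assumed form): eps0 * exp(-rate * episode)."""
    return eps0 * math.exp(-rate * episode)


def rbd(episode, eps0=1.0, rate=0.05):
    """Reciprocal function-based decay (assumed form): eps0 / (1 + rate * episode)."""
    return eps0 / (1.0 + rate * episode)


def sbd(episode, eps0=1.0, drop=0.5, step_size=20):
    """Step-based decay (assumed form): scale by `drop` once every `step_size` episodes."""
    return eps0 * drop ** (episode // step_size)


if __name__ == "__main__":
    # Print the exploration ratio at a few learning episodes for each schedule.
    schedules = {"EXD": exd, "RBD": rbd, "SBD": sbd}
    for name, schedule in schedules.items():
        print(name, [round(schedule(k), 3) for k in (0, 20, 40, 80)])
```

In an epsilon-greedy Q-learning agent, the value returned for the current episode would typically serve as the probability of choosing an exploratory control action instead of the greedy one. Under these assumed forms, the step schedule holds the ratio constant within each block of episodes and then drops it abruptly, while the other two decrease smoothly.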