Paper ID: 2403.06466
RL-MSA: a Reinforcement Learning-based Multi-line bus Scheduling Approach
Yingzhuo Liu
The Multiple Line Bus Scheduling Problem (MLBSP) is vital for reducing the operational cost of bus companies and guaranteeing service quality for passengers. Existing approaches typically generate a bus scheduling scheme offline and then dispatch buses according to it. In practice, uncertain events such as traffic congestion occur frequently and may render the pre-determined scheduling scheme infeasible. In this paper, MLBSP is modeled as a Markov Decision Process (MDP), and a Reinforcement Learning-based Multi-line bus Scheduling Approach (RL-MSA) is proposed for bus scheduling in both the offline and online phases. In the offline phase, the deadhead decision is integrated into the bus selection decision for the first time, which simplifies the learning problem. In the online phase, the deadhead decision is made through a time window mechanism based on the policy learned offline. We develop several new and useful state features, including features for control points, bus lines, and buses. A bus priority screening mechanism is devised to construct the bus-related features. Considering the interests of both the bus company and passengers, a reward function combining a final reward and a step-wise reward is designed. Experiments in the offline phase show that RL-MSA uses fewer buses than offline optimization approaches. In the online phase, RL-MSA covers all departure times in the timetable (i.e., maintains service quality) without increasing the number of buses used (i.e., operational cost).
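The abstract does not give the exact form of the combined reward. As an illustration only, the sketch below shows one plausible way to combine a step-wise reward (penalizing missed departures, reflecting passenger service quality) with a final reward (penalizing fleet size, reflecting operational cost); all function names, penalty terms, and weights here are assumptions for illustration, not the paper's actual design.

```python
# Illustrative sketch only: the paper's reward design is not specified in the
# abstract, so every term and weight below is an assumption.

def step_reward(departure_covered: bool, missed_penalty: float = 1.0) -> float:
    """Step-wise reward: penalize missing a scheduled departure (service quality)."""
    return 0.0 if departure_covered else -missed_penalty

def final_reward(buses_used: int, bus_cost: float = 5.0) -> float:
    """Final reward at episode end: penalize the number of buses used (operational cost)."""
    return -bus_cost * buses_used

def episode_return(covered_flags: list[bool], buses_used: int) -> float:
    """Combine step-wise rewards accumulated over the episode with the final reward."""
    return sum(step_reward(c) for c in covered_flags) + final_reward(buses_used)

# Hypothetical usage: 3 of 4 departures covered by a fleet of 2 buses.
print(episode_return([True, True, False, True], buses_used=2))  # -> -11.0
```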
Submitted: Mar 11, 2024