DRLMA: An Intelligent Move Acceptance for Combinatorial Optimization Problems based on Deep Reinforcement Learning
Master thesis
View/ Open
Date
2023-07-01Metadata
Show full item recordCollections
- Master theses [218]
Abstract
Numerous heuristic solution methods have been developed to tackle combinatorial opti- mization problems, often customized for specific problem domains and use-cases where they exhibit remarkable performance. However, their effectiveness diminishes significantly when applied to problem domains for which they were not originally designed, showcasing poor generalization capabilities. In contrast, metaheuristics are higher-level heuristics so- lution methods that aim to be applicable to a wide range of different problems. Perturba- tive metaheuristics operate by traversing the solution space through iterative application of modifications induced by low-level heuristics. This process continues until a specified stopping criteria is met, enabling the method to efficiently explore and refine solutions. A central aspect of these search-based methods is the move acceptance scheme, which determines whether or not the suggested modification is to be applied. The Simulated Annealing acceptance criteria, for instance, occasionally accepts uphill moves, or worse so- lutions, in order to explore the space of solutions and help the search escape local optima. In this thesis we propose Deep Reinforcement Learning Move Acceptance (DRLMA), a general move acceptance framework that leverages Deep Reinforcement Learning into the acceptance decision. A Deep RL agent is trained using problem-independent search information, enabling it to learn high-level acceptance strategies regardless of the specific combinatorial optimization problem at hand. We show that by replacing the Simulated Annealing acceptance criteria with DRLMA in two different heuristic selection frame- works, namely Adaptive Large Neighborhood Search (ALNS) and Deep Reinforcement Learning Hyperheuristic (DRLH), we are generally able to improve the performance of the respective search methods, the degree of improvement ranging from only slightly in the worst cases to considerably in the best cases.