Logically-Constrained Reinforcement Learning

Description

Logically-Constrained Reinforcement Learning (LCRL) is a model-free reinforcement learning framework that synthesises policies for unknown, continuous-state Markov Decision Processes (MDPs) under a given Linear Temporal Logic (LTL) property. LCRL automatically shapes a synchronous reward function on the fly, so that any off-the-shelf RL algorithm can synthesise policies whose traces satisfy the LTL property with some probability. When the MDP state space is finite, this satisfaction probability is calculated in parallel with the learning process. LCRL produces policies that are certified with respect to the LTL property.
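The idea of a reward that is shaped in step with the logical property can be illustrated with a small, self-contained sketch. It is not the API of the LCRL repository, and everything in it is an assumption made for illustration: the 4x4 grid, the cell labels, the slip probability, the hand-written three-state automaton standing in for one derived from the LTL property "avoid unsafe until goal", and the simplified rule of paying a reward of 1 when the automaton first enters its accepting state. Tabular Q-learning plays the role of the off-the-shelf RL algorithm.

```python
import random

# Hypothetical 4x4 grid-world; LABELS maps cells to atomic propositions.
SIZE = 4
LABELS = {(3, 3): "goal", (1, 1): "unsafe", (2, 1): "unsafe"}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

# Hand-written automaton for "(not unsafe) until goal":
# state 0 = initial, state 1 = accepting sink, state 2 = rejecting sink.
ACCEPTING = {1}


def automaton_step(q, label):
    """Advance the automaton on the label of the newly visited cell."""
    if q != 0:
        return q  # both sinks are absorbing
    if label == "goal":
        return 1
    if label == "unsafe":
        return 2
    return 0


def env_step(cell, action, slip=0.1, rng=random):
    """Move in the grid; with probability `slip` a random action replaces the chosen one."""
    if rng.random() < slip:
        action = rng.choice(ACTIONS)
    x = min(max(cell[0] + action[0], 0), SIZE - 1)
    y = min(max(cell[1] + action[1], 0), SIZE - 1)
    return (x, y)


def product_step(cell, q, action, slip=0.1, rng=random):
    """One synchronous step of the MDP and the automaton, plus the shaped reward."""
    next_cell = env_step(cell, action, slip, rng)
    next_q = automaton_step(q, LABELS.get(next_cell))
    # Simplified shaping rule (an assumption): reward on first entering an accepting state.
    reward = 1.0 if (next_q in ACCEPTING and q not in ACCEPTING) else 0.0
    return next_cell, next_q, reward


def q_learning(episodes=3000, alpha=0.1, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning over (cell, automaton state) pairs; the MDP model is never used."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    Q = {}
    for _ in range(episodes):
        cell, q = (0, 0), 0
        for _ in range(50):
            if rng.random() < epsilon:
                a = rng.randrange(n)
            else:
                a = max(range(n), key=lambda i: Q.get((cell, q, i), 0.0))
            next_cell, next_q, r = product_step(cell, q, ACTIONS[a], rng=rng)
            best_next = max(Q.get((next_cell, next_q, i), 0.0) for i in range(n))
            key = (cell, q, a)
            Q[key] = Q.get(key, 0.0) + alpha * (r + gamma * best_next - Q.get(key, 0.0))
            cell, q = next_cell, next_q
            if q != 0:  # the episode ends once the automaton reaches a sink
                break
    return Q


if __name__ == "__main__":
    Q = q_learning()
    print(len(Q), "state-action values learned")
```

The learner only ever sees the pair (cell, automaton state) and the scalar reward, so the logical property influences learning solely through the shaped reward. That separation is what lets a different RL algorithm be substituted for the Q-learning loop in this sketch.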

Code Repository

LCRL code can be found here: github.com/grockious/lcrl

Publications

• Hasanbeig, M., Kroening, D. and Abate, A., "Deep Reinforcement Learning with Temporal Logics", International Conference on Formal Modeling and Analysis of Timed Systems, 2020.

• Hasanbeig, M., Abate, A. and Kroening, D., "Cautious Reinforcement Learning with Logical Constraints", International Conference on Autonomous Agents and Multi-agent Systems, 2020.

• Hasanbeig, M., Jeppu, N. Y., Abate, A., Melham, T. and Kroening, D., "DeepSynth: Program Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning", CoRR abs/1911.10244, 2019.

• Hasanbeig, M., Kantaros, Y., Abate, A., Kroening, D., Pappas, G. J. and Lee, I., "Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees", IEEE Conference on Decision and Control, 2019.

• Lim Zun Yuan, Hasanbeig, M., Abate, A. and Kroening, D., "Modular Deep Reinforcement Learning with Temporal Logic Specifications", CoRR abs/1909.11591, 2019.

• Hasanbeig, M., Abate, A. and Kroening, D., "Certified Reinforcement Learning with Logic Guidance", CoRR abs/1902.00778, 2019.

• Hasanbeig, M., Abate, A. and Kroening, D., "Logically-Constrained Neural Fitted Q-Iteration", International Conference on Autonomous Agents and Multi-agent Systems, 2019.

• Hasanbeig, M., Abate, A. and Kroening, D., "Logically-Constrained Reinforcement Learning", CoRR abs/1801.08099, 2018.

Experiments

"Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees"

- Pacman with Probabilistic Labels

"Logically-Constrained Reinforcement Learning"

- Pacman with Deterministic Labels

LCQL converges

Vanilla RL fails to converge

- Slippery Grid-world