Hosein Hasanbeig
                
                
                    
                        
                        
                            
                                
                                    Short Bio:
                                    
                                    I am a member of the PROSE team at Microsoft. I completed my Ph.D. in 2020 in the Computer Science Department of the University of Oxford, under the supervision of Alessandro Abate and Daniel Kroening. I was also a lecturer at St Catherine's College, University of Oxford, teaching Computer-aided Formal Verification from 2018 to 2022. Prior to Oxford, I was a research assistant in the Systems Control Lab at the University of Toronto, where I received my M.Sc. in Electrical and Computer Engineering in 2016.
                                    
                                
                            
                         
                        
                            
                                
                                     
                                    
                                     My research focuses on the design and analysis of safe, responsible, interpretable, and explainable machine learning algorithms for decision-making problems. Most of my work is at the intersection of reinforcement learning, formal methods, language models, and cognitive science.
                                    
                                
                            
                        
                         
                     
                 
            
			
            
                
                    
                    
                        
                            
                            
                                
                                
                                    
                                        - • 10 Dec 2023   ¦   Our article "Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis" has been accepted to the AAAI 2024 Special Track on Safe, Robust and Responsible AI. The preprint will be posted to arXiv soon!
- • 21 Sep 2023   ¦   Our article "Evaluating Cognitive Maps and Planning in Large Language Models with CogEval" has been accepted to NeurIPS 2023. The preprint will be available soon!
- • 22 May 2023   ¦   "Certified Reinforcement Learning with Logic Guidance" has been accepted to the AIJ Special Issue on Risk-aware Autonomous Systems: Theory and Practice.
- • 01 Feb 2023   ¦   Our article "Symbolic Task Inference in Deep Reinforcement Learning" has been accepted to JAIR. The preprint will be available soon!
- • 24 Oct 2022   ¦   Two papers have been accepted to the NeurIPS ML Safety Workshop.
- • 20 Jun 2022   ¦   "LCRL: Logically-Constrained Reinforcement Learning" has been accepted to QEST'22 as a tool paper! The codebase is available here, and the preprint will be online soon.
- • 15 Jul 2021   ¦   Our work on deep RL for continuous motion planning with temporal logic has been accepted to IROS'21 and IEEE Robotics and Automation.
- • 17 Dec 2020   ¦   "Shielding Atari Games with Bounded Prescience" has been accepted to AAMAS'21! Preprint will be online soon.
- • 01 Dec 2020   ¦   "DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep RL" has been accepted to AAAI'21.
- • 01 Sep 2020   ¦   Our work on modular deep RL with temporal logic has been accepted to FORMATS'20.
- • 03 Mar 2020   ¦   Our invited submission to OVERLAY is now available here.
- • 15 Jan 2020   ¦   Our work on safe RL has been accepted to AAMAS'20, Auckland, New Zealand.
- • 05 Dec 2019   ¦   Alessandro will present our work at CDC'19, December 12, 2019, 18:10-18:30.
- • 19 Jul 2019   ¦   Our paper has been accepted to CDC'19, Nice Acropolis, France.
- • 15 May 2019   ¦   I'll be presenting our work on Logical Neural FQ at AAMAS'19, Montreal, Canada.
 
                             
                         
                        
                            
                                
                        
                     
                 
             
            
            
                
            
                    
            
            
            
					
						
							Publications in Reverse Chronological Order
						
						
                            
                                
                                 2025 
                                
				    - • Wang, J., Hasanbeig, H., Tan, K., Sun, Z., Kantaros, Y., "Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications", L4DC, 2025. [Bib ] [PDF ]
 2024 
                                
				    - • Hasanbeig, H., Jeppu, N. Y., Abate, A., Melham, T., and Kroening, D., "Symbolic Task Inference in Deep Reinforcement Learning", JAIR, 2024. [Bib ] [PDF ]
- • Mitta, R., Hasanbeig, H., Wang, J., Kroening, D., Kantaros, Y., Abate, A., "Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis", AAAI Special Track on Safe, Robust and Responsible AI, 2024. [Bib ] [PDF ]
- • Wang, J., Hasanbeig, H., Tan, K., Sun, Z., Kantaros, Y., "Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications", arXiv preprint, 2024. [Bib ] [PDF ]
 2023 
								
                                    
                                    - • Hasanbeig*, H., Momennejad*, I., Vieira Frujeri*, F., Sharma, H., Ness, R., Jojic, N., Palangi, H., Larson, J., "Evaluating Cognitive Maps and Planning in Large Language Models with CogEval", NeurIPS, 2023. [Bib ] [PDF ]
- • Yousefi, S., Betthauser, L., Hasanbeig, H., Saran, A., Momennejad, I., "In-Context Learning in Large Language Models: A Neuroscience-inspired Analysis of Representations", arXiv preprint, 2023. [Bib ] [PDF ]
- • Hasanbeig, H., Kroening, D., Abate, A., "Certified Reinforcement Learning with Logic Guidance", AIJ Special Issue on Risk-aware Autonomous Systems: Theory and Practice, 2023. [Bib ] [PDF ]
- • Hasanbeig, H., Sharma, H., Betthauser, L., Frujeri, F., Momennejad, I., "ALLURE: Auditing and Improving LLM-based Evaluation of Text using Iterative In-Context-Learning", arXiv preprint, 2023. [Bib ] [PDF ]
 2022 
								
                                    - • Mitta, R., Hasanbeig, H., Kroening, D., Abate, A., "Risk-aware Bayesian Reinforcement Learning for Cautious Exploration", NeurIPS, MLSW, 2022. [Bib ] [PDF ]
- • Barez, F., Hasanbeig, H., Abate, A., "System III: Learning with Domain Knowledge for Safety Constraints", NeurIPS, MLSW, 2022. [Bib ] [PDF ]
- • Hasanbeig, H., Kroening, D., Abate, A., "LCRL: Certified Policy Synthesis via Logically-Constrained Reinforcement Learning", CONFEST, 2022. [Bib ] [PDF ] [Code ]
 2021 
								
									- • Hasanbeig, H., Jeppu, N. Y., Abate, A., Melham, T., and Kroening, D., "DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning", AAAI, 2021. [Bib ] [PDF ] [Code ]
- • Cai, M., Hasanbeig, H., Xiao, S., Abate, A., and Kan, Z., "Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic", IROS, 2021. [Bib ] [PDF ]
- • Giacobbe, M., Hasanbeig, H., Kroening, D., and Wijk, H., "Shielding Atari Games with Bounded Prescience", AAMAS, 2021. [Bib ] [PDF ]
 2020 
								
									- • Hasanbeig, H., Kroening, D., and Abate, A., "Deep Reinforcement Learning with Temporal Logics", International Conference on Formal Modeling and Analysis of Timed Systems, 2020. [Bib ] [PDF ]
- • Hasanbeig, H., Abate, A., and Kroening, D., "Cautious Reinforcement Learning with Logical Constraints", International Conference on Autonomous Agents and Multi-agent Systems, 2020. [Bib ] [PDF ]
- • Hasanbeig, H., Kroening, D., and Abate, A., "Towards Verifiable and Safe Model-Free Reinforcement Learning", Workshop on Artificial Intelligence and Formal Verification, Logics, Automata and Synthesis (OVERLAY), 2020 [invited]. [Bib ] [PDF ]
- • Ringstrom, T.J., Hasanbeig, H., and Abate, A., "Jump Operator Planning: Goal-Conditioned Policy Ensembles and Zero-Shot Transfer", CoRR abs/2007.02527, 2020. [Bib ] [PDF ]
 2019 
								
									- • Hasanbeig, H., Kantaros, Y., Abate, A., Kroening, D., Pappas, G. J., and Lee, I., "Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees", IEEE Conference on Decision and Control, 2019. [Bib ] [PDF ]
- • Lim Zun Yuan, Hasanbeig, H., Abate, A., and Kroening, D., "Modular Deep Reinforcement Learning with Temporal Logic Specifications", CoRR abs/1909.11591, 2019. [Bib ] [PDF ]
- • Hasanbeig, H., Abate, A., and Kroening, D., "Certified Reinforcement Learning with Logic Guidance", CoRR abs/1902.00778, 2019. [Bib ] [PDF ]
- • Hasanbeig, H., Abate, A., and Kroening, D., "Logically-Constrained Neural Fitted Q-Iteration", International Conference on Autonomous Agents and Multi-agent Systems, 2019. [Bib ] [PDF ]
 2018 
								
									- • Hasanbeig, H., and Pavel, L., "From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning", Journal of Autonomous Agents and Multi-Agent Systems, 2018. [under review] [Bib ] [PDF ]
- • Hasanbeig, H., Abate, A., and Kroening, D., "Logically-Constrained Reinforcement Learning", CoRR abs/1801.08099, 2018. [Bib ] [PDF ]
 2017 
								
									- • Hasanbeig, H., and Pavel, L., "On Synchronous Binary Log-linear Learning and Second Order Q-learning", International Federation of Automatic Control, 2017. [Bib ] [PDF ]
- • Hasanbeig, H., and Pavel, L., "Distributed Coverage Control by Robot Networks in Unknown Environments Using a Modified EM Algorithm", International Journal of Computer, Electrical, Automation, Control and Information Engineering, 2017. [Bib ] [PDF ]
 2016 
								
									- • Hasanbeig, H., "Multi-agent Learning in Coverage Control Games", MSc Thesis, 2016. [Bib ] [PDF ]