Towards Causal Reinforcement Learning
Empowering Agents with Causality

Zhihong Deng, Jing Jiang, Chengqi Zhang


While autonomous agents driven by reinforcement learning techniques have made significant strides in decision-making under uncertainty, they still lack a nuanced understanding of the world. Recognizing the pivotal role of causal understanding in human cognition, a new wave of reinforcement learning research has emerged, integrating insights from causality research to improve agents' decision-making capabilities. This tutorial delves into this fascinating research area, offering participants a unique opportunity to explore the intersection of causality and reinforcement learning.

The tutorial is divided into two parts, crafted to give participants a comprehensive understanding of causal reinforcement learning. The first part guides participants through the fundamental concepts underpinning both causality and reinforcement learning, including basic definitions, mathematical representations, and the connections between the two fields. The second part explores the synergies between causal research and reinforcement learning, showing how causal insights enrich traditional reinforcement learning methodologies and address critical challenges such as sample efficiency, generalization ability, and reliability. The tutorial concludes with a discussion of future opportunities and challenges in causal reinforcement learning.
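As a taste of the fundamentals covered in the first part, the sketch below illustrates the key distinction between observing and intervening in a toy structural causal model (the variable names and probabilities are illustrative, not taken from the tutorial). Conditioning on X=1 picks up the hidden confounder U, while Pearl's do-operator severs the U→X edge and reveals that X has no causal effect on Y:

```python
import random

def sample(do_x=None):
    """Toy SCM with a hidden confounder: U -> X and U -> Y, but no X -> Y edge."""
    u = random.random() < 0.5           # hidden confounder U ~ Bernoulli(0.5)
    x = u if do_x is None else do_x     # X := U, unless we intervene with do(X=x)
    y = u                               # Y := U; X has no causal effect on Y
    return x, y

random.seed(0)
n = 100_000

# Observational: P(Y=1 | X=1). Conditioning lets the confounder leak in,
# because in this model X=1 can only happen when U=1.
obs = [y for x, y in (sample() for _ in range(n)) if x]
p_cond = sum(obs) / len(obs)

# Interventional: P(Y=1 | do(X=1)). Setting X breaks the U -> X edge,
# so Y just follows U regardless of the intervention.
intv = [y for _, y in (sample(do_x=True) for _ in range(n))]
p_do = sum(intv) / len(intv)

print(f"P(Y=1 | X=1)     = {p_cond:.2f}")   # 1.00: pure correlation via U
print(f"P(Y=1 | do(X=1)) = {p_do:.2f}")     # ~0.50: the true (null) causal effect
```

An agent that only fits observational correlations would wrongly conclude that setting X=1 guarantees Y=1; reasoning at the interventional level avoids exactly this kind of spurious-correlation trap.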

Whether you are an experienced reinforcement learning researcher, a causal research enthusiast, or a machine learning practitioner eager to expand your toolkit, join us in exploring the exciting world of causal reinforcement learning and advancing this emerging field together.

[Slides] [Recording] [Paper] [GitHub]


Familiarity with basic concepts in reinforcement learning and causality is beneficial but not required for participation in the tutorial. Participants should have a general understanding of machine learning; a basic grasp of probability theory and graphical models is helpful, though not essential. The tutorial caters to participants with varying levels of expertise, providing explanations and examples to accommodate diverse backgrounds and interests.


Zhihong Deng

Bio: [Homepage] [Google Scholar]

Zhihong Deng is a PhD student at the University of Technology Sydney. His research currently focuses on improving the reliability of autonomous agents from a causal perspective. This not only allows agents to gain a deeper understanding of the world and make more informed decisions, but also fosters trust and transparency by elucidating the causal relationships behind their decisions. Ultimately, the goal is to advance the progress of next-generation AI agents that are not only intelligent but also reliable, with enhanced robustness, interpretability, fairness, and safety. He has published papers in multiple international conferences and journals, such as AAAI, ICLR, IJCAI, IEEE TPAMI, IEEE TNNLS and IEEE TCYB. He is also the recipient of the best paper award from the Australian Artificial Intelligence Institute in 2023.

Jing Jiang

Bio: [Google Scholar]

Jing Jiang is an Associate Professor in the School of Computer Science and a core member of the Australian Artificial Intelligence Institute (AAII) at the University of Technology Sydney (UTS), Australia. Her research interests focus on machine learning and its applications. She is the recipient of a DECRA (Discovery Early Career Researcher Award) fellowship funded by the ARC (Australian Research Council). She has published over 70 papers in related areas of AI in top-tier conferences and journals, such as NeurIPS, ICML, ICLR, AAAI, IJCAI, KDD, TPAMI, TNNLS, and TKDE.

Chengqi Zhang

Bio: [Google Scholar]

Prof. Chengqi Zhang is a Distinguished Professor and a Pro Vice-Chancellor at the University of Technology Sydney (UTS). He is also the Chairman of the Australian Computer Society National Committee for Artificial Intelligence and the General Chair of IJCAI-2024. He holds a Doctor of Science degree in Artificial Intelligence from Deakin University and a PhD in Artificial Intelligence from The University of Queensland. He is a Fellow of the Australian Computer Society and a Senior Member of the IEEE Computer Society. He has published over 350 scholarly articles and has an H-index of 69. He has been a keynote speaker at 28 international conferences and has supervised over 30 doctoral graduates. His awards include the NSW Science and Engineering Award (2011) and the 2021 IEEE ICDM Outstanding Service Award. He was the founding Director of the UTS Priority Research Centre for Quantum Computation & Intelligent Systems (QCIS) from 2008 to 2016, which helped UTS Computer Science rank among the top 100 in the QS 2021 rankings. He has established five joint research centres with leading Chinese universities and institutions, and is a leading expert in the fields of Distributed Artificial Intelligence and Data Mining.


If you find this tutorial useful for your research, please consider citing our work. Thank you! 😎

    @article{deng2023causal,
        title={Causal Reinforcement Learning: A Survey},
        author={Deng, Zhihong and Jiang, Jing and Long, Guodong and Zhang, Chengqi},
        journal={Transactions on Machine Learning Research},
        year={2023}
    }


We do not cover all relevant fields, techniques, or papers in this tutorial. For further information, please refer to our [Awesome-Causal-RL] github repository. Please feel free to email us with pointers and suggestions, and we will update the repo.

Buesing, L., Weber, T., Zwols, Y., Heess, N., Racaniere, S., Guez, A., & Lespiau, J. B. (2018, September). Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search. In International Conference on Learning Representations.

Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., ... & Fiedel, N. (2023). PaLM: Scaling language modeling with pathways. Journal of Machine Learning Research, 24(240), 1-113.

Deng, Z., Fu, Z., Wang, L., Yang, Z., Bai, C., Zhou, T., ... & Jiang, J. (2023). False Correlation Reduction for Offline Reinforcement Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence.

Deng, Z., Jiang, J., Long, G., & Zhang, C. (2024, August). What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning. In IJCAI.

Deng, Z., Jiang, J., Long, G., & Zhang, C. (2023). Causal Reinforcement Learning: A Survey. Transactions on Machine Learning Research.

Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

Guo, J., Gong, M., & Tao, D. (2021, October). A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning. In International Conference on Learning Representations.

Huang, B., Feng, F., Lu, C., Magliacane, S., & Zhang, K. (2021, October). AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning. In International Conference on Learning Representations.

Kirk, R., Zhang, A., Grefenstette, E., & Rocktäschel, T. (2021). A survey of generalisation in deep reinforcement learning. arXiv preprint arXiv:2111.09794.

Pearl, J. (2009). Causality. Cambridge university press.

Pearl, J., & Mackenzie, D. (2018). The book of why: the new science of cause and effect. Basic books.

Schölkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., & Bengio, Y. (2021). Toward causal representation learning. Proceedings of the IEEE, 109(5), 612-634.

Seitzer, M., Schölkopf, B., & Martius, G. (2021). Causal influence detection for improving efficiency in reinforcement learning. Advances in Neural Information Processing Systems, 34, 22905-22918.

Sontakke, S. A., Mehrjou, A., Itti, L., & Schölkopf, B. (2021, July). Causal curiosity: RL agents discovering self-supervised experiments for causal representation learning. In International Conference on Machine Learning (pp. 9848-9858). PMLR.

Zhang, Y., Du, Y., Huang, B., Wang, Z., Wang, J., Fang, M., & Pechenizkiy, M. (2024). Interpretable reward redistribution in reinforcement learning: A causal approach. Advances in Neural Information Processing Systems, 36.