🎯 I'm on the 2026 Job Market — actively seeking research positions in both industry and academia. 🤝 I'm also always excited to explore new research collaborations and exchange ideas with fellow researchers. 📩 Please feel free to reach out via ruhwang@iu.edu or LinkedIn — I'd love to chat!

Ruhan Wang / 王汝涵

Ph.D. Candidate

Indiana University

Research Interests

Agent Reinforcement Learning

Reinforcement Learning for Post-training, Reasoning

Self-Improving LLMs

About

I am a Ph.D. candidate in Computer Engineering at Indiana University, advised by Prof. Dongruo Zhou. My research focuses on building self-improving large language models and autonomous AI agents that can continuously learn from interaction, feedback, memory, and experience. To support this goal, I study reinforcement learning for post-training, reasoning, and agentic behaviors in large-scale language models, with recent interests in Agent Reinforcement Learning, federated reasoning, and memory mechanisms for iterative self-improvement.

My recent work spans agentic LLM systems, uncertainty-aware reasoning, federated in-context learning, RL-enhanced retrieval-augmented generation, and offline reinforcement learning. I have also worked on multimodal agentic recommender systems and adaptive reasoning frameworks for large language models.

I am currently a Research Intern at Tencent AI Lab (Hunyuan Frontier Lab), where I work on Agentic Reinforcement Learning for large language model agents. Previously, I worked as a Ph.D. Research Intern at Mitsubishi Electric Research Laboratories, where I explored quantum machine learning and generative models.

Selected Publications

Instance-Dependent Continuous-Time Reinforcement Learning via Maximum Likelihood Estimation

Runze Zhao, Yue Yu, Ruhan Wang, Chunfeng Huang, Dongruo Zhou

Forty-Third International Conference on Machine Learning (ICML)

Instance-dependent analysis of continuous-time reinforcement learning using maximum likelihood estimation.

Federated In-Context Learning: Iterative Refinement for Improved Answer Quality

Ruhan Wang, Zhiyong Wang, Chengkai Huang, Rui Wang, Tong Yu, Lina Yao, John C.S. Lui, Dongruo Zhou

Forty-Second International Conference on Machine Learning (ICML)

Privacy-preserving framework (Fed-ICL) that combines federated learning with in-context learning to collaboratively train diverse LLM agents, with theoretical equivalence to established FL algorithms.

Safe Decision Transformer with Learning-based Constraints

Ruhan Wang, Dongruo Zhou

7th Annual Learning for Dynamics and Control Conference (L4DC)

Constrained Q-learning Decision Transformer (CQDT) for safe offline RL, addressing stitching limitations of CDT while strictly adhering to safety constraints.

Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning

Ruhan Wang, Yu Yang, Zhishuai Liu, Dongruo Zhou, Pan Xu

Transactions on Machine Learning Research (TMLR)

Return Augmented Decision Transformer (RADT) for offline off-dynamics RL with rigorous suboptimality analysis and D4RL evaluation across off-dynamics shifts.

LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models

Ahmad Faiz, Sotaro Kaneda, Ruhan Wang, Rita Osi, Parteek Sharma, Fan Chen, Lei Jiang

The Twelfth International Conference on Learning Representations (ICLR)

End-to-end carbon footprint projection model for LLMs across training, inference, experimentation, and storage phases, integrating LLM, hardware, and data center parameters.

News

2026-05

🏅 Recognized as a Gold Reviewer for ICML 2026, placing among the top reviewers for this year's conference (with complimentary registration).

2026-05

🚀 Started my Ph.D. Research Internship at Tencent AI Lab (Hunyuan Frontier Lab) in Bellevue, WA, hosted by Dr. Kishan Panaganti. Working on Agentic Reinforcement Learning for large language model agents.

2026-05

🎉 Our paper Instance-Dependent Continuous-Time RL via Maximum Likelihood Estimation — joint work with Runze Zhao, Yue Yu, Chunfeng Huang, and Prof. Dongruo Zhou — has been accepted to ICML 2026 (Seoul)!

2026-04

📝 Submitted FERA: Uncertainty-Aware Federated Reasoning with Large Language Models to COLM 2026 — a parameter-free federated reasoning framework that uses uncertainty quantification for cross-client knowledge integration. With Chengkai Huang, Zhiyong Wang, Rui Wang, Tong Yu, Lina Yao, and Prof. Dongruo Zhou.

2026-04

🏅 Recognized as a Top 25% Reviewer for ICLR 2026.

2025-07

🛫 Attended ICML 2025 in Vancouver to present Federated In-Context Learning: Iterative Refinement for Improved Answer Quality — a privacy-preserving Fed-ICL framework with theoretical equivalence to established FL algorithms. With Zhiyong Wang, Chengkai Huang, Rui Wang, Tong Yu, Lina Yao, John C.S. Lui, and Prof. Dongruo Zhou.

2025-06

🛫 Attended L4DC 2025 in Ann Arbor, MI to present Safe Decision Transformer with Learning-based Constraints — our Constrained Q-learning Decision Transformer (CQDT) for safe offline RL. Joint work with Prof. Dongruo Zhou; previously presented at the NeurIPS 2024 Safe Generative AI Workshop.

2025-04

🎓 Advanced to Ph.D. Candidacy in Computer Engineering at Indiana University after passing the qualifying examination.

2025-03

📚 Our work Towards Agentic Recommender Systems in the Era of Multimodal LLMs has been accepted to ACM TIST — a formal LLM-ARS framework spanning user profiling, memory, planning, and action selection, identifying seven open challenges. Collaboration led by Chengkai Huang with the team at UNSW, Adobe Research, UCSD, and Indiana University.

2024-12

🎓 Completed my M.S. in Computer Engineering at Indiana University Bloomington.

2024-08

🔬 Wrapped up my Ph.D. Research Internship at Mitsubishi Electric Research Laboratories (MERL) in Cambridge, MA, hosted by Dr. Toshiaki Koike-Akino. Worked on quantum machine learning and generative models; this work resulted in Quantum Diffusion Models for Few-Shot Learning, accepted to ICAD 2025 and the AAAI 2024 Quantum Computing & AI Workshop.

2024-01

⭐ LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models was selected for an Oral Presentation at ICLR 2024 (Vienna). The paper projects LLM carbon footprints across training, inference, experimentation, and storage. Joint work with Ahmad Faiz, Sotaro Kaneda, Rita Osi, Parteek Sharma, Fan Chen, and Lei Jiang.

2023-08

🤝 Joined the Machine Learning Lab at Indiana University, advised by Prof. Dongruo Zhou — shifting research focus toward reinforcement learning, foundation models, and agentic AI.

2022-08

🌟 Started my Ph.D. journey in Computer Engineering at Indiana University Bloomington, joining the Quantum Computing Lab under Prof. Fan Chen.