
<html>

<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
	<meta name="description" content="Hi, this is Huayu Chen (陈华玉), a PhD student in the Department of Computer Science and Technology at Tsinghua University.">
    <meta name="keywords" content="Huayu Chen, ChenHuayu, chenhuayu, 陈华玉, chenDRAG, reinforcement learning, generative models, Tsinghua, 清华, department of computer science, 计算机">
	<link rel="stylesheet" href="./jemdoc.css" type="text/css">
	<title>Huayu Chen's Homepage</title>
        <link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
        <link rel="icon" href="favicon.ico" type="image/x-icon">
    <meta name="baidu-site-verification" content="kZzZTRvzUQ" />
</head>

<body>

<div id="layout-content" style="margin-top:25px">

<table><tbody><tr>
    <td width="670">
        <div id="toptitle"><h1>Huayu Chen (陈华玉)&nbsp;</h1></div>
        <h3>PhD student</h3>  
        <p>
            Room 1-509, FIT Building <br>
            Dept. of Computer Science and Technology <br>
            Tsinghua University <br>
            Beijing, China, 100084. <br>
            Email:  chenhuay17[AT]gmail[DOT]com <br>
            <!-- <a href="https://github.com/ChenDRAG">[GitHub]</a> <br> -->
            <a href="https://scholar.google.com.au/citations?user=0FBCHc4AAAAJ&hl=en&oi=ao">[Google Scholar]</a> <br>
        </p>
    </td>

    <td><img src="./images/me.jpg" border="0" width="200"></td>
</tr></tbody></table>


<h2>Biography</h2>
<p> I am a fourth-year PhD student in the <a href="http://ml.cs.tsinghua.edu.cn/index.html">TSAIL Group</a> in the <a href="https://www.cs.tsinghua.edu.cn/csen/">Department of Computer Science and Technology</a>, <a href="https://www.tsinghua.edu.cn">Tsinghua University</a>, advised by <a href="http://ml.cs.tsinghua.edu.cn/~jun/">Prof. Jun Zhu</a> and <a href="https://www.suhangss.me/">Prof. Hang Su</a>.
    Currently, I am also a research intern in the Nvidia <a href="https://research.nvidia.com/labs/dir/">Deep Imagination Research group</a> in the San Francisco Bay Area. 
    Previously, I received my B.S. degree from the Department of Automation of Tsinghua University in July 2021.
    I spent a wonderful time at the Digital Media Lab at Tsinghua University, advised by <a href="http://www.liuyebin.com/">Prof. Yebin Liu</a> in the field of AIGC from Oct 2018 to May 2019. 
    I have also been a research intern at <a href="https://fuxi.163.com/">NetEase's Fuxi AI Lab</a> and <a href="https://ailab.bytedance.com/sdk">ByteDance's AI Lab</a>, in 2021 and 2020 respectively.
</p>
    <p>
        Currently, my research interests lie primarily in the area of <strong>deep reinforcement learning</strong> and <strong>deep generative models</strong>. 
        My lifelong goal is to build a scalable, impenetrable, and adaptable decision-making engine that can relieve humans of tedious tasks and improve their work efficiency.
        My progress so far includes authoring <strong>Tianshou</strong>, a highly modularized deep reinforcement learning library <a href="https://github.com/thu-ml/tianshou"><img alt="GitHub stars" src="https://img.shields.io/github/stars/thu-ml/tianshou?style=social" class="github"></a>, designing a large-scale online RL system for mastering MOBA games (see <a href="#competitions">Competitions</a>), and bridging the gap between RL theory and generative modeling methods such as LLMs and diffusion models.
    </p>

<h2>Selected Publications</h2>
<strong> RL Infra: </strong>
<ul>
<li>
    <a href='https://arxiv.org/abs/2107.14171'> Tianshou: A Highly Modularized Deep Reinforcement Learning Library </a> <br>
    <!-- <font color="#FF0000">Oral (Accept rate~1.7%)</font> <br> -->
    Jiayi Weng*, <strong>Huayu Chen*</strong>, Dong Yan, Kaichao You, Alexis Duburcq, Minghao Zhang, Yi Su, Hang Su, Jun Zhu <br>
    Journal of Machine Learning Research <strong>(JMLR)</strong> <br>
    <a href='https://github.com/thu-ml/tianshou'>[Code: <font color="#FF0000">8.2k Stars</font>]</a>
</li>
</ul>
<strong> RL for LLM: </strong>
<ul>
    <li>
        <a href='http://arxiv.org/abs/2402.05369'>Noise Contrastive Alignment of Language Models with Explicit Rewards </a> <br>
        <strong>Huayu Chen</strong>, Guande He, Lifan Yuan, Ganqu Cui, Hang Su, Jun Zhu <br>
        Annual Conference on Neural Information Processing Systems <strong>(NeurIPS 2024)</strong> <br>
        <a href='https://github.com/thu-ml/Noise-Contrastive-Alignment'>[code]</a>
    </li>
    <li>
        <a href='https://arxiv.org/abs/2503.15558'>Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning </a> <br>
        <strong>Nvidia Group</strong> (Contributing to VLM RL training) <br>
        <a href='https://research.nvidia.com/labs/dir/cosmos-reason1/'>[project page]</a>
        <a href='https://github.com/nvidia-cosmos/cosmos-reason1'>[code]</a>
    </li>
    <li>
        <a href='https://arxiv.org/abs/2502.01456'>Process Reinforcement through Implicit Rewards   </a> <br>
        Ganqu Cui, Lifan Yuan, Zefan Wang, Hanbin Wang, Wendi Li, Bingxiang He, Yuchen Fan, Tianyu Yu, Qixin Xu, Weize Chen, Jiarui Yuan, <strong>Huayu Chen</strong>, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding <br>
        <a href='https://arxiv.org/abs/2502.01456'>[Preprint]</a>
        <a href='https://github.com/PRIME-RL/PRIME'>[Code: <font color="#FF0000">1.4k Stars</font>]</a> 
    </li>
    <li>
        <a href='https://arxiv.org/abs/2412.01981'>Free Process Rewards without Process Labels   </a> <br>
        Lifan Yuan, Wendi Li, <strong>Huayu Chen</strong>, Ganqu Cui, Ning Ding, Kaiyan Zhang, Bowen Zhou, Zhiyuan Liu, Hao Peng <br>
        <a href='https://arxiv.org/abs/2412.01981'>[Preprint]</a>
        <a href='https://github.com/PRIME-RL/ImplicitPRM'>[code]</a>
    </li>
</ul>
<strong> RL for Vision (Diffusion & AR): </strong>
<ul>
    <li>
        <a href='http://arxiv.org/abs/2501.15420'>Visual Generation Without Guidance  </a> <br>
        <strong>Huayu Chen*</strong>, Kai Jiang*, Kaiwen Zheng, Jianfei Chen, Hang Su, Jun Zhu <br>
        <a href='http://arxiv.org/abs/2501.15420'>[Preprint]</a>
        <a href='https://github.com/thu-ml/GFT'>[code]</a>
    </li>

    <li>
        <a href='https://arxiv.org/abs/2410.09347'>Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment   </a> <br>
        <strong>Huayu Chen</strong>, Hang Su, Peize Sun, Jun Zhu <br>
        International Conference on Learning Representations <strong>(ICLR 2025)</strong> <br>
		<font color="#FF0000">Oral (Acceptance rate ~1.8%)</font> <br>
        <a href='https://github.com/thu-ml/CCA'>[code]</a>
    </li>

    <li>
        <a href='https://arxiv.org/abs/2304.12824'>Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning </a> <br>
        Cheng Lu*, <strong>Huayu Chen*</strong>, Jianfei Chen, Hang Su, Chongxuan Li, Jun Zhu <br>
        International Conference on Machine Learning <strong>(ICML 2023)</strong> <br>
        <a href='https://github.com/ChenDRAG/CEP-energy-guided-diffusion'>[code]</a>
        <a href='./images/icml2023_poster_CEP_fix2_20230715172121.pdf'>[poster]</a>
    </li>
</ul>

<strong> RL for Embodied AI (Diffusion Policy): </strong>
<ul>
    <li>
        <a href='https://arxiv.org/abs/2410.07864'>RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation   </a> <br>
        Songming Liu, Lingxuan Wu, Bangguo Li, Hengkai Tan, <strong>Huayu Chen</strong>, Zhengyi Wang, Ke Xu, Hang Su, Jun Zhu <br>
        International Conference on Learning Representations <strong>(ICLR 2025)</strong> <br>
        <a href='https://rdt-robotics.github.io/rdt-robotics/'>[project page]</a>
        <a href='https://github.com/thu-ml/RoboticsDiffusionTransformer'>[code]</a>
    </li>

    <li>
        <a href='https://arxiv.org/abs/2407.09024'>Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control </a> <br>
        <strong>Huayu Chen</strong>, Kaiwen Zheng, Hang Su, Jun Zhu <br>
        Annual Conference on Neural Information Processing Systems <strong>(NeurIPS 2024)</strong> <br>
        <!-- <a href='https://github.com/thu-ml/Noise-Contrastive-Alignment'>[code]</a> -->
    </li>

    <li>
        <a href='https://arxiv.org/abs/2310.07297'>Score Regularized Policy Optimization through Diffusion Behavior </a> <br>
        <strong>Huayu Chen</strong>, Cheng Lu, Zhengyi Wang, Hang Su, Jun Zhu <br>
        International Conference on Learning Representations <strong>(ICLR 2024)</strong> <br>
        <a href='https://github.com/thu-ml/SRPO'>[code]</a>
        <a href='./images/iclr2024-poster-v1mine.pdf'>[poster]</a>
    </li>

    <li>
        <a href='https://arxiv.org/abs/2209.14548'>Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling</a> <br>
        <strong>Huayu Chen</strong>, Cheng Lu, Chengyang Ying, Hang Su, Jun Zhu <br>
        International Conference on Learning Representations <strong>(ICLR 2023)</strong> <br>
        <a href='https://github.com/ChenDRAG/SfBC'>[code]</a>
        <a href='./images/iclr2023-poster-v1mine.pdf'>[poster]</a>
    </li>
</ul>

* indicates co-first authors.
<h2 id="competitions">Competitions</h2>
<ul>
    <li>
        <strong>First place (two years in a row) in Tencent's Honor of Kings (王者荣耀) multi-agent RL competition, <font color="#FF0000">final win rate: 99.2%</font></strong>, 2021-2023 <br>
        <a href='https://www.tencent.com/en-us/articles/2201392.html'>[news]</a>
        <a href='https://aiarena.tencent.com/aiarena/zh/match/aiarena-competition-3rd'>[webpage]</a>
    </li>
    <li>
        Second place in DJI's Robomaster Sim2Real Challenge, ICRA 2022
    </li>
    <li>
        First place in the 30th International Design Contest (IDC Robocon 2019, MIT), 2019
    </li>
    <li>
        First place in the 20th Electronic Design Competition at Tsinghua University, 2018
    </li>
    <li>
        First place in the 1st Artificial Intelligence Challenge at Tsinghua University, 2017
    </li>
</ul>

<h2>Honors & Awards</h2>
<ul>
    <li>
        HUAWEI-Tsinghua Scholarship, 2023
    </li>
    <li>
        '84' Future Innovation Scholarship, 2023
    </li>
    <li>
        <strong>Outstanding Undergraduate in Beijing, 2021</strong>
    </li>
    <li>
        BaoGang Scholarship (Awarded to ~500 students in China every year), 2021
    </li>
    <li>
        Student of the Year, Dept. of Automation, Tsinghua University, 2020
    </li>
    <li>
        <strong>China National Scholarship, 2019</strong>
    </li>
    <li>
        Excellence Award for Technological Innovation, Tsinghua University, 2019
    </li>
    <li>
        <strong>'129' Scholarship (Highest honor for 2nd year students in the Dept. of Automation, Tsinghua University), 2018</strong>
    </li>
    <li>
        1st Prize in the 35th China Regional College Students Physics Competition, 2018
    </li>
    <li>
        1st Prize in the 30th National Physics Olympiad, 2016
    </li>
</ul>


<h2>Services</h2>
<ul>
    <li>
        Reviewer for ICLR, NeurIPS, ICML, AISTATS, AAAI, etc.
    </li>
    <li>
        President of the Student Association of Science and Technology, Dept. of Automation, Tsinghua University, 2020-2021
    </li>
</ul>
 

<h2>Teaching</h2>
Spring 2023, TA for <strong>Statistical Learning Theory and Applications</strong>, taught by <a href="http://ml.cs.tsinghua.edu.cn/~jun/">Prof. Jun Zhu</a><br>

</div>

<div id="footer">
	<div id="footer-text"></div>
</div>
&copy; 2024 Huayu Chen

</body>

</html>