Experience
Between June 2022 and May 2023, I served as a Reinforcement Learning Algorithms Engineer at
InspirAI. I proposed and optimized a general AI modeling paradigm for card games,
which was successfully deployed in Hearthstone, Dou Dizhu (where it defeated professional players), and
Guan Dan. Notably, the Dou Dizhu AI has been launched on the Taptop platform.
In the summer of 2021, I interned as a Research Engineer at Baidu AI Cloud in
Beijing, where I developed a multi-agent cooperative adversarial algorithm that we named
Expert Data-Assisted Multi-Agent Proximal Policy Optimization (EDA-MAPPO).
We released a video showing the algorithm's performance and published the Source Code.
During the same period, our team, "superfly", took part in a machine learning for combinatorial optimization competition (9/23).
Understanding World Models through Multi-Step Pruning Policy via Reinforcement Learning
Zhiqiang He, Wen Qiu, Wei Zhao, Xun Shao, Zhi Liu
Information Sciences, 2024, Source Code, (IF=8.1)
Parallel multi-step pruning policies enhance sampling diversity. (Includes a convergence analysis for MSPP and its policy gradient (PG) theorem.)
Erlang planning network: An iterative model-based reinforcement learning with multi-perspective
Jiao Wang, Lemin Zhang, Zhiqiang He, Can Zhu, Zihui Zhao
Pattern Recognition, 2022, Source Code, (IF=8.5)
Bi-level reinforcement learning within model-based reinforcement learning.
Control Strategy of Speed Servo Systems Based on Deep Reinforcement Learning
Pengzhan Chen, Zhiqiang He, Chuanxi Chen, Jiahong Xu
Algorithms 11, no. 5: 65, 2018, Source Code, (Cited 50 times)
First paper to apply reinforcement learning to a jump speed servo system.