Zhiqiang He (何志强)

I hold a master's degree from Northeastern University, and my research focuses on reinforcement learning. My academic research began at the Jiangxi Province Advanced Control and Key Optimization Laboratory, where I worked under the guidance of Professor Pengzhan Chen from July 2017 to June 2019. From July 2019 to June 2022, I continued my research at the Deep Learning and Advanced Intelligent Decision-Making Research Institute, mentored by Professor Jiao Wang.

In my professional capacity, I interned as a Research Engineer at Baidu in Beijing, from June to September 2021. Subsequently, I served as a Reinforcement Learning Algorithms Engineer at InspirAI from June 2022 to May 2023.

Email  /  Scholar  /  Github  /  Zhihu  /  BiliBili


Experience

Between June 2022 and May 2023, I served as a Reinforcement Learning Algorithms Engineer at InspirAI. I proposed and optimized a general artificial intelligence modeling paradigm for card games, which was successfully deployed in Hearthstone, Dou Dizhu (where it defeated professional players), and Guan Dan. Notably, the Dou Dizhu AI has been launched on the Taptop platform.

In the summer of 2021, I interned as a Research Engineer at Baidu AI Cloud in Beijing. I developed a multi-agent cooperative adversarial algorithm, which we termed Expert Data-Assisted Multi-Agent Proximal Policy Optimization (EDA-MAPPO). We released a video demonstrating the algorithm's performance on Bilibili, along with the accompanying Source Code. During the same period, our team, named "superfly", completed a machine learning for combinatorial optimization competition (9/23).

Publication / Preprint

Control Strategy of Speed Servo Systems Based on Deep Reinforcement Learning
Pengzhan Chen, Zhiqiang He, Chuanxi Chen, Jiahong Xu
Algorithms 11, no. 5: 65, 2018, Source Code, (Cited 47 times)

Erlang planning network: An iterative model-based reinforcement learning with multi-perspective
Jiao Wang, Lemin Zhang, Zhiqiang He, Can Zhu, Zihui Zhao
Pattern Recognition, 128: 108668, 2022, Source Code, (IF=8.5)
