Understanding World Models through Multi-Step Pruning Policy via Reinforcement Learning

Zhiqiang He Wen Qiu Wei Zhao Xun Shao Zhi Liu

Information Sciences · 2024 Q1 · IF 8.1 · Published · world-models · policy-gradient
Abstract

Parallel multi-step pruning policies (MSPP) enhance sampling diversity for world-model-based RL, backed by a convergence analysis for MSPP and a corresponding policy gradient theorem.