Mesh-Pro: Asynchronous Advantage-guided Ranking Preference Optimization for Artist-style Quadrilateral Mesh Generation

arXiv:2603.00526v1 Announce Type: new Abstract: Reinforcement learning (RL) has demonstrated remarkable success in text and image generation, yet its potential in 3D generation remains largely unexplored. Existing attempts typically rely on offline direct preference optimization (DPO), which suffers from low training efficiency and limited generalization. In this work, we aim to improve both the training efficiency and the generation quality of RL for 3D mesh generation. Specifically, (1) we design the first asynchronous online RL framework tailored to post-training of 3D mesh generation models, which is 3.75$\times$ faster than synchronous RL. (2) We propose Advantage-guided Ranking Preference Optimization (ARPO), a novel RL algorithm that achieves a better trade-off between training efficiency and generalization than the RL algorithms currently used for 3D mesh generation, such as DPO and group relative policy optimization (GRPO). (3) Building on asynchronous ARPO, we propose Mesh-Pro, which additionally introduces a novel diagonal-aware mixed triangular-quadrilateral tokenization for mesh representation and a ray-based reward for geometric integrity. Mesh-Pro achieves state-of-the-art performance on artistic and dense meshes.
