WildActor: Unconstrained Identity-Preserving Video Generation

arXiv:2603.00586v1 Announce Type: new

Abstract: Production-ready human video generation requires digital actors to maintain strictly consistent full-body identities across dynamic shots, viewpoints and motions, a setting that remains challenging for existing methods. Prior methods often suffer from face-centric behavior that neglects body-level consistency, or produce copy-paste artifacts where subjects appear rigid due to pose locking. We present Actor-18M, a large-scale human video dataset designed to capture identity consistency under unconstrained viewpoints and environments. Actor-18M comprises 1.6M videos with 18M corresponding human images, covering both arbitrary views and canonical three-view representations. Leveraging Actor-18M, we propose WildActor, a framework for any-view conditioned human video generation. We introduce an Asymmetric Identity-Preserving Attention mechanism coupled with a Viewpoint-Adaptive Monte Carlo Sampling strategy that iteratively re-weights reference conditions by marginal utility for balanced manifold coverage. Evaluated on the proposed Actor-Bench, WildActor consistently preserves body identity under diverse shot compositions, large viewpoint transitions, and substantial motions, surpassing existing methods in these challenging settings.
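The abstract does not specify how the Viewpoint-Adaptive Monte Carlo Sampling is implemented, so the following is only an illustrative sketch of the general idea of iteratively re-weighting candidates by marginal utility, not the authors' method. It assumes each reference image is summarized by a single azimuth angle in degrees, and it defines utility as the angular distance to the nearest already-selected view. The names `angular_distance`, `marginal_utility`, and `sample_references` are hypothetical.

```python
import random


def angular_distance(a, b):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def marginal_utility(candidate, selected):
    """Assumed utility: distance from the candidate to the nearest selected view."""
    if not selected:
        return 1.0  # before anything is picked, all views are equally useful
    return min(angular_distance(candidate, s) for s in selected)


def sample_references(azimuths, k, seed=None):
    """Draw up to k reference views, re-weighting the pool after every draw."""
    rng = random.Random(seed)
    remaining = list(azimuths)
    selected = []
    while remaining and len(selected) < k:
        # Re-weight: views far from everything chosen so far get more mass.
        weights = [marginal_utility(a, selected) for a in remaining]
        if sum(weights) == 0:
            # Every remaining view duplicates a selected one; sample uniformly.
            weights = [1.0] * len(remaining)
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        selected.append(remaining.pop(pick))
    return selected
```

Under these assumptions, a pool dominated by near-duplicate frontal views still tends to yield a spread of viewpoints, because a view that coincides with an already-selected one receives zero weight on the next draw.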
