Bilinear representation mitigates reversal curse and enables consistent model editing
arXiv:2509.21993v3 Announce Type: replace
Abstract: The reversal curse--a language model's inability to infer an unseen fact "B is A" from a learned fact "A is B"--is widely considered a fundamental limitation. We show that it is not an inherent failure but an artifact of how models encode knowledge. Training from scratch on synthetic relational knowledge graphs leads to the emergence of a bilinear relational structure within the models' hidden representations. This structure alleviates the reversal curse and enables inference of unseen reverse facts. Crucially, this bilinear geometry is foundational for consistent model editing: updates to a single fact propagate correctly to its reverse and to logically dependent relations. In contrast, models lacking this representation suffer from the reversal curse and fail to generalize model edits, leading to logical inconsistencies. Taken together, our results establish that training on a relational knowledge dataset induces bilinear internal representations, which in turn allow language models to behave in a logically consistent manner after editing. This suggests that the efficacy of language model editing depends not only on the choice of algorithm but on the underlying representational geometry of the knowledge itself.
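To make the bilinear intuition concrete, here is a minimal sketch, assuming the bilinear form the abstract describes: a fact (A, r, B) is scored as a^T W_r b over entity embeddings. The names (score, W_r) and the NumPy setup are illustrative, not from the paper. The key property is that if the inverse relation is encoded as the transposed matrix W_r^T, the reverse fact "B is A" receives exactly the same score as the forward fact, so no additional training on reversed data is needed:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (illustrative)

# Entity embeddings for A and B (stand-ins for hidden representations).
a = rng.standard_normal(d)
b = rng.standard_normal(d)

# A relation r encoded as a bilinear form: score(A, r, B) = a^T W_r b.
W_r = rng.standard_normal((d, d))

def score(head: np.ndarray, W: np.ndarray, tail: np.ndarray) -> float:
    """Bilinear compatibility score of the fact (head, relation, tail)."""
    return float(head @ W @ tail)

# Forward fact "A is B" under relation r.
fwd = score(a, W_r, b)

# If the inverse relation r^{-1} is represented by the transpose W_r^T,
# the reverse fact "B is A" scores identically: b^T W_r^T a == a^T W_r b.
rev = score(b, W_r.T, a)

assert np.isclose(fwd, rev)
print(f"forward: {fwd:.4f}, reverse: {rev:.4f}")
```

Under this (assumed) geometry, an edit that modifies W_r automatically updates the reverse relation as well, which is one way to read the paper's claim that bilinear representations make edits propagate consistently.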