Generative AI & Fictionality: How Novels Power Large Language Models

arXiv:2603.01220v1

Abstract: Generative models, like the one in ChatGPT, are powered by their training data. The models are simply next-word predictors, based on patterns learned from vast amounts of pre-existing text. Since the first generation of GPT, it is striking that the most popular datasets have included substantial collections of novels. For the engineers and research scientists who build these models, there is a common belief that the language in fiction is rich enough to cover all manner of social and communicative phenomena, yet the belief has gone mostly unexamined. How does fiction shape the outputs of generative AI? Specifically, what are novels' effects relative to other forms of text, such as newspapers, Reddit, and Wikipedia? Since the 1970s, literature scholars such as Catherine Gallagher and James Phelan have developed robust and insightful accounts of how fiction operates as a form of discourse and language. Through our study of an influential open-source model (BERT), we find that LLMs leverage familiar attributes and affordances of fiction, while also fomenting new qualities and forms of social response. We argue that if contemporary culture is increasingly shaped by generative AI and machine learning, any analysis of today's various modes of cultural production must account for a relatively novel dimension: computational training data.
