With Generative Language, Nobody Knows You're a Dog

May 15, 2024

UKAI Projects · Local Disturbances - Shorts #32 - With Generative Language, Nobody Knows You're a Dog

The more time we spend with generative language tools, the more certain we are that they will have profound implications for the world we live in and for how we understand that world.

The printing press is often held up as an exemplar of how technological innovations can transform societies. Johannes Gutenberg’s innovation in the 15th century had significant and far-reaching impacts on language in the decades following its popularization. Our Poetics of Synthetic Language residency is planning a visit to Milan’s printing and publishing industry in Fall 2023 to explore these connections.

At a high level, the printing press reduced variation in language, regional dialects included, by establishing standardized spelling, grammar, and punctuation in many languages. However, this pattern was paralleled by the emergence of vernacular languages in print. Latin had been the primary language for religious, academic, and legal texts. As more people gained access to books, the demand for literature in local languages increased. This led to a surge in the production of books in vernacular languages and ultimately contributed to the growth and development of these languages.

Other demonstrated impacts of the printing press included improved literacy, a greater dissemination of ideas, phrases, and concepts, and a democratization of knowledge.

While the printing press had many positive impacts, it also faced several criticisms that echo concerns of the current moment. Centres of power railed against perceived threats to religious authority and concerns about the spread of controversial ideas and misinformation. Other anxieties involved fears that mass production of texts would lead to a loss of artistic quality and craftsmanship, to threats on intellectual property, and to a general decline in the oral tradition.

I am becoming increasingly curious about what other cultural transformations sit latent in the increased ubiquity of artificial language. The pace of transformation will far surpass the pace at which the printing press spread, and the nature of our connected world differs considerably. What follows is a (very) high-level look at some areas of interest and artistic research.

Linguistic Relativity: The Sapir-Whorf Hypothesis is a highly contested theory that holds that the structure of a language shapes its speakers' worldview and cognition. As such, speakers of different languages may perceive and categorize the world differently, leading to distinct social, economic, and political perspectives. My Japanese ability is sufficient to survive most day-to-day encounters. I have long been struck by the paucity of abstract nouns in Japanese. I also love how onomatopoeia is leveraged to animate complex internal emotional states. These qualities of language, according to this theory, alter how a Japanese language speaker experiences the world.

Most generative language is produced through predictive models trained on digitally available text. It therefore relies on a narrow ontology that it then proceeds to replicate. How will our perspectives and sense of meaning be altered when the language we encounter increasingly relies on a limited cross-section of potential structures and forms? While linguistic relativity generally has been applied to variations between languages, local dialects and vernacular patterns risk being smoothed out by models looking to optimize.

Bourdieu's Theory of Linguistic Capital: Pierre Bourdieu argued that language functions as a form of capital in society. Individuals who possess "linguistic capital" (i.e., the ability to speak and write in a prestigious language or dialect) are more likely to succeed in various social, economic, and political domains. Generative language makes possible the production of text that mimics the linguistic capital of other groups. The arts, for example, rely on particular ways of saying things, and access to opportunities is conditional on one’s fluency in these ever-evolving idioms. I’ve recently experimented with using ChatGPT to write grant applications based on my notes and bullet-pointed ideas, and with adequate instruction, it produces some splendid art-speak.

Legal documents and contracts are often experienced as opaque and impenetrable, yet by cutting and pasting a contract into ChatGPT I can get a summary of the salient points in language I can understand. I believe that AI will serve to de-professionalize, as the effort required to maintain one’s corpus of prestigious language will no longer be seen as a worthwhile investment. Of course, humans will find other ways to stratify, separate, and exclude. Just as a home full of handmade furniture signals wealth or handiness, perhaps those seeking a more elevated station will develop tricks to signal that their language is ‘natural’. One commentator remarked that business emails will increasingly be filled with profanity, as it is a clear signal that the text was not produced by a mass-market language model.

Social Identity Theory: This theory posits that people categorize themselves and others into social groups based on shared characteristics, such as language. Language can serve as a marker of social identity, helping to establish and maintain group boundaries, and potentially leading to in-group favoritism and out-group discrimination. This relates to the previous point, but with a more defensive positioning. The production and reinforcement of group boundaries through language has a long history. Slang and paralinguistic features of language serve to make groups less legible to others while affirming a shared identity. This pattern is unlikely to disappear, even if AI models decode these adaptations in real time. How will people wishing to restrict access to what they say in public go about camouflaging their intentions? My Please Don’t Understand This project was aimed at this very question. Of course, any exploration and publication of these adaptations limits the effectiveness of these tactics going forward.

Language Ideology: Language ideologies are the beliefs and attitudes people hold about language and its role in society. These ideologies can perpetuate and reinforce social, economic, and political hierarchies by privileging certain languages or dialects and stigmatizing others. This is one area where the liberatory potential of large language models seems closest at hand. I work extensively with folks who speak English as a second language. I see daily how small gaps in fluency are stigmatized. An organization I worked with had grant writers born and raised in Cameroon, and it struggled to get funding for programs. When we requested the notes from the juries, we found comments that their use of adjectives such as “amazing” or “incredible” in the application signalled a lack of professionalism and therefore reduced the application’s score. Tools like GPT-4 will allow second-language speakers to produce written text that is infallibly bland, inoffensive, and consistent with the language ideologies of those doing the evaluating. This does depressingly little to resolve the ideologies that make these counter-strategies necessary, however.

Critical Discourse Analysis: This analytical approach examines how power relations and social inequalities are constructed, maintained, and challenged through language use. A CDA lens has helped to draw attention to how large language models reproduce biases and inequalities. It also offers a means for exploring control over language and discourse. As LLMs become more widely adopted and integrated, private companies and governments will exert significant control over language production and dissemination. This will lead, whether intentionally or tacitly, to manipulation or censorship of information, impacting the freedom of expression and access to diverse viewpoints.

There is also concern that the proliferation of LLMs will erode critical thinking. As LLMs become more capable of generating coherent and persuasive content, people may rely more on these models for information and less on their own critical thinking and independent analysis.

This is a very high-level overview of some of the potential cultural implications of large language models. What are you noticing? What is missing? Do the opportunities surpass the threats? Let me know in the comments!
