Is ChatGPT a bit unsporting?

Human writer unimpressed. (Photo: Luis Bartolomé Marcos, CC BY-SA)

Novelist and essayist Walter Kirn doesn't like ChatGPT. He especially doesn't like ChatGPT screenshots:
Of all the posts I ignore on social media... the ones I ignore most thoroughly these days, almost with a vengeance, are the results of prompts to AI chatbots. The tell-tale fonts and formats of these posts allow me to spot them instantly...

It's unsurprising that a professional writer doesn't like ChatGPT. Some of the complaints are also not surprising – its default tendency towards bland, homogenized, diplomatic writing, for example.

One criticism is interestingly different (emphasis added):

Part of what bugs me about these documents, whether they’re generated in the form of college essays, poems, newspaper articles, or screenplays, is the implication that they’re ingenious, and that the people who ordered them are ingenious by association. But I am underwhelmed by the performances. When you consider that the human race has moved the ball of language down the field for millennia upon millennia using nothing but its throats and tongues and sticks with ink and graphite on their tips, the idea that advanced computer networks are able to kick the ball into the net repeatedly and with little effort, in all kinds of showy ways, isn’t as impressive as it’s made out to be.

That is, ChatGPT is unsporting.

For all the current drawbacks of large language models – in logical reasoning, basic arithmetic, confident-but-wrong facts – they are actually pretty decent at stylistic showboating on demand. You can have them rephrase arbitrary paragraphs into iambic pentameter, pirate-speak, whatever else you want. Ian Bogost has a nice article discussing this in The Atlantic.
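
To be concrete about what "on demand" means: the whole trick is a single instruction wrapped around the text. Here's a minimal sketch using the OpenAI Python SDK – the model name, prompt wording, and helper function are my own illustrative choices, not anything Kirn tested.

```python
# Minimal sketch: restyling arbitrary text by prompting a chat model.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment;
# the model name below is a placeholder, not a recommendation.
from openai import OpenAI

client = OpenAI()

def restyle(text: str, style: str) -> str:
    """Ask the model to rewrite `text` in an arbitrary `style`."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model works here
        messages=[
            {"role": "system",
             "content": f"Rewrite the user's text in this style: {style}."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(restyle("The meeting has moved to Thursday.", "iambic pentameter"))
print(restyle("The meeting has moved to Thursday.", "pirate-speak"))
```

That one swappable `style` string is the entire interface – which is part of what makes the showboating feel cheap.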

But it's done by a kind of machine brute force: ingest huge amounts of data, distill it into slightly less huge generative models. These models presumably hold a compressed representation of stylistic patterns (exactly what kind of representation is an active research question), and can synthesize new examples of those patterns on demand. To Kirn, that's about as interesting as watching a robot pitcher master every pitch and tirelessly throw shutouts (to mix sports metaphors).

Since my research area is AI in games, I'm reminded of perennial discussions about computer game-playing. Nowadays, people aren't impressed that computers can crush humans at chess. Even your smartphone can play a grandmaster-level game. So what? Flawed human-level chess is making a resurgence anyway.

For a long time, chess was interesting to AI researchers precisely because the AI wasn't good. As that situation persisted for decades, it became increasingly surprising, and at times a little embarrassing for the field. Chess is a game seemingly made for a computer – it's played on a discrete grid, with precise rules that are easy to formalize, well-defined win/loss conditions, and fast, accurate simulators – and yet for a few decades chess bots were just not that great. That was intriguing and in need of explanation. The continued lack of success also helped sustain critics' doubts about whether computers could even in principle do this kind of reasoning task.
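
The "easy to formalize" point is concrete. With an off-the-shelf library – here python-chess, my choice for illustration – the precise rules and a fast simulator are a few lines:

```python
# Sketch: formalized rules plus a fast, accurate simulator, using the
# python-chess library (pip install chess). Illustrative only.
import chess

board = chess.Board()               # standard starting position
print(board.legal_moves.count())    # 20: exact legal-move generation

# Play out the Fool's Mate to show that win/loss detection is built in.
for san in ["f3", "e5", "g4", "Qh4"]:
    board.push_san(san)

print(board.is_checkmate())         # True
print(board.outcome().winner)       # chess.BLACK – Black wins
```

The rules were never the hard part; the search was.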

Then the bots did become good. Once that happened, the computer chess scene stayed interesting for a time. Audiences tuned in to Kasparov v. Deep Blue for the 1996 match and the 1997 rematch. Computers had a significant impact on the chess "meta", the mix of strategies humans use. But as time passed, it became less clear exactly what to take away from the whole episode. Once computers got good enough, it felt a bit deflating. It turns out computers really are better than humans at choosing optimal actions in a well-defined problem space, just as we first thought. It took longer than the more optimistic 1950s AI researchers predicted, but it did happen.

Players now use chess AI in a utilitarian way, to practice games and analyze positions. You can even do that in your browser. But few are now impressed by a computer's chess play. Similar things could be said about the history of computer Go, though the current situation there isn't quite as clear.
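
The utilitarian workflow is equally mundane in code. The same python-chess library can drive any UCI engine – this sketch assumes a Stockfish binary installed locally, and the path is my assumption for the example:

```python
# Sketch: analyzing a position the way players actually do, by driving
# a UCI engine through python-chess. Assumes a local Stockfish binary;
# the path below is an assumption for this example.
import chess
import chess.engine

board = chess.Board()
board.push_san("e4")

with chess.engine.SimpleEngine.popen_uci("/usr/local/bin/stockfish") as engine:
    info = engine.analyse(board, chess.engine.Limit(depth=18))
    print(info["score"])           # evaluation from the side to move
    print(info.get("pv", [])[:3])  # start of the principal variation
```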

For me, language models are at the stage where they've started working, but it's still interesting – even a little surprising – that they do. I can see myself getting bored with the whole thing eventually, though. In retrospect, maybe it should've been obvious that computers could conjure up sequences of words in whatever style is demanded. The novelty of the ChatGPT screenshot has certainly worn off; I see far fewer of them now than six months ago, when Kirn published the essay quoted above.

And of course text generators mimicking styles aren't new. We've had the Postmodernism Generator (1996) and SCIgen (2005), which both use hand-crafted grammars to mock third-rate, jargon-filled academic writing. There's a long history of Markov chain generators too (a recent-ish one is Automatic Donald Trump, 2016), which capture short-range surface patterns. But it was hard to say that these really worked in general – their brittleness and lack of controllability become clear as soon as you try to push them beyond the specific joke they were making.
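
To make the contrast concrete, here's roughly what those Markov generators do – a minimal word-level sketch of my own, not the actual code behind any project named above:

```python
# Minimal word-level Markov chain text generator: the short-range
# surface mimicry behind generators like Automatic Donald Trump.
# An illustrative sketch, not any project's actual code.
import random
from collections import defaultdict

def train(text: str, order: int = 2) -> dict:
    """Map each `order`-word context to the words seen after it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model: dict, order: int = 2, length: int = 30) -> str:
    out = list(random.choice(list(model.keys())))
    for _ in range(length):
        choices = model.get(tuple(out[-order:]))
        if not choices:       # dead end: this context never continued
            break
        out.append(random.choice(choices))
    return " ".join(out)

corpus = open("speeches.txt").read()   # hypothetical source corpus
print(generate(train(corpus)))
```

The two-word context window and the dead-end check are the whole story: nothing beyond a few adjacent words is ever modeled, which is exactly the brittleness that shows up once you leave the original joke behind.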

Now language models can synthesize style on demand more generally, which is genuinely new. Is that impressive, or unsporting?