AI & The Science of Creativity

2023-04-10

In an effort to better understand how all this AI stuff works, I’ve been chipping away at Stephen Wolfram’s meticulous piece, “What Is ChatGPT Doing … and Why Does It Work?”.

As you likely know, ChatGPT works by guessing at the next word. Here’s Stephen:

when ChatGPT does something like write an essay what it’s essentially doing is just asking over and over again “given the text so far, what should the next word be?”—and each time adding a word

What strikes me in Stephen’s description is how it determines what word to guess next. Here’s Stephen again:

at each step it gets a list of words with probabilities. But which one should it actually pick…? One might think it should be the “highest-ranked” word (i.e. the one to which the highest “probability” was assigned). But this is where a bit of voodoo begins to creep in. Because for some reason—that maybe one day we’ll have a scientific-style understanding of—if we always pick the highest-ranked word, we’ll typically get a very “flat” essay, that never seems to “show any creativity” (and even sometimes repeats word for word). But if sometimes (at random) we pick lower-ranked words, we get a “more interesting” essay.
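
To make that concrete for myself, here is a minimal sketch of the two strategies Stephen describes: always taking the top word versus sometimes, at random, taking a lower-ranked one. The word list, the probabilities, and the function name `pick_next_word` are all made up for illustration. The `temperature` knob is my assumption about how the randomness gets dialed up or down (raise each probability to the power `1 / temperature`, then sample). It is not a claim about what ChatGPT actually runs.

```python
import random

def pick_next_word(probs, temperature=0.8, rng=random):
    """Pick one word from a {word: probability} mapping (hypothetical helper).

    temperature == 0 always returns the highest-ranked word.
    Higher temperatures give lower-ranked words more of a chance.
    """
    if temperature == 0:
        # The "flat" strategy: always take the top-ranked word.
        return max(probs, key=probs.get)
    words = list(probs)
    # Reshape the probabilities; random.choices normalizes the weights itself.
    weights = [probs[w] ** (1.0 / temperature) for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Made-up candidates for the next word, not real model output.
candidates = {"learn": 0.45, "predict": 0.35, "make": 0.15, "understand": 0.05}

always_top = pick_next_word(candidates, temperature=0)
sometimes_lower = pick_next_word(candidates, temperature=0.8)
```

Call it with `temperature=0` in a loop and you get the same word every time. Call it with a temperature above zero and the lower-ranked words show up now and then, which is the "more interesting" behavior described above.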

From a purely scientific perspective, one that can’t understand what it can’t measure, this might seem like “voodoo”. But uncertainty in creative endeavors is an age-old problem.

Why does one thing, which seems logically destined for greatness, fall flat while another seemingly obscure, no-potential-at-all thing finds resonating, exponential success?

Maybe we’ll never know. But it sounds like AI has encountered, and is now trying to tame and algorithmically encode, the mystery of the artist’s muse.

I’m very much a novice at how all this works, so it’s possible I’m way off the mark. But perhaps there’s a lesson here, one that deals with all those metrics we collect and scour trying to glean a modicum of inspiration for what to do in our products.

Maybe following metrics that rise to the top isn’t always the best idea. Maybe the highest-ranked, most popular thing will only lead you to something “flat” while a lower-ranked, less popular thing can lead you somewhere “more interesting”.

As Stephen mentions, we’re still trying to understand this in a scientific way:

It’s worth emphasizing that there’s no “theory” being used here; it’s just a matter of what’s been found to work in practice.

You don’t have to prove everything. Sometimes you gotta follow what’s interesting, follow your gut, even if you can’t immediately measure it and extrapolate it to a theory. If it proves to work, that’s its own metric, however opaque the mechanics.