Cite Your Sources, AI

2023-05-04

In a recent note of mine, I quoted Jaron Lanier on AI chatbots:

There are two ways this could go. One is that we pretend the bot is a real thing, a real entity like a person, then in order to keep that fantasy going we’re careful to forget whatever source texts were used to have the bot function…The other way is you do keep track of where the sources came from. And in that case a very different world could unfold…

In my brain, this overlapped with Chris’ recent post where he laments Google’s Bard and how it deviates from the search engine’s traditional model of pointing people to individual websites.

Google should be encouraging and fighting for the open web. But now they’re like, actually we’re just going to suck up your website, put it in a blender with all other websites, and spit out word smoothies for people instead of sending them to your website.

It’s intriguing to me that middle school students are seemingly held to a higher standard of “cite your sources” than today’s interfaces to large language models. What happened to the idea of authorship?

The individual website is deemphasized in order to emphasize the faceless, nameless crowd. We get “word smoothies” powered by LLMs that obscure attribution, resulting in, as Lanier puts it in his book You Are Not a Gadget, “a digital flattening of expression into a global mush”.

Which brings me to an intriguing tool I stumbled upon: Phind (shoutout to Sara for the link). While I don’t know much about the company behind it, the tool is touted as “The AI search engine for developers” and its design is a fresh, alternative perspective on what an LLM interface could be.

In contrast to an AI chatbot like Google’s Bard or OpenAI’s ChatGPT, which takes input and spits back an almost conman-like answer, laced with certainty to instill confidence but empty of any cited evidence for its claims, Phind preserves the idea of authorship by presenting you with a summarized answer and credit to its source material.

For example, ask a question like “How do you repeat a character n times in Python?” and its response often follows a template like: “Well, according to Stack Overflow [link] you can do this [code]. Or, as an alternative, example.com [link] suggests this [code]. One more consideration from Stack Overflow [link] is this [code].”

Screenshot of phind.com circa May 2023
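
For the curious, here is a minimal sketch of the kind of code such an answer tends to point to. It is not a transcript of Phind’s output or of any particular source; the function names are made up for illustration, and the two approaches shown (string multiplication and `str.join`) are simply the standard Python ways to do it.

```python
def repeat_char(ch: str, n: int) -> str:
    """Repeat ch n times using string multiplication (the idiomatic way)."""
    return ch * n


def repeat_char_join(ch: str, n: int) -> str:
    """Same result, built by joining n copies; handy if each piece might vary."""
    return "".join(ch for _ in range(n))


if __name__ == "__main__":
    print(repeat_char("a", 5))       # aaaaa
    print(repeat_char_join("-", 3))  # ---
```

Both return an empty string when n is zero or negative.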

It’s kind of like asking a question to an academic who will try to provide a response filled with nuance and sources, e.g. “Well, there are various schools of thought on this. You can do X, espoused by camp A, or you can do Y, espoused by camp B. There’s also Z, which is espoused by person C.”

In contrast to Chris’ assessment of Google’s direction with Bard (“that’s just fuckin’ rude”), this seems quite the opposite: that’s just fuckin’ polite.