AI and the News Business

AI seems to be everywhere these days. Predictions abound and range widely, but very few news outlets stop to analyze what AI actually is, or how ChatGPT (the innovation that brought the topic to the center of our attention) differs from what preceded it.

So before we dive into predictions, I'd like us to pause and define our terms.

A Brief History of AI
Artificial intelligence is a broad term that refers to all the methods used to give a computer the appearance of intelligent thought. In the old days, most AI systems were rule-based: they followed explicit instructions that specified how the system should behave in every situation.

Over the past two decades, the field went through a major upheaval as "machine learning" gradually supplanted rule-based systems. In this new paradigm, machines behave as miniature model brains (or simply "models"). Instead of following hand-written rules, a model learns to perform a task by ingesting data and finding patterns in it.

This approach is similar to the way children learn—for example, given enough photos of dogs and cats, a child (and a computer-vision model) can learn how to tell the difference between a dog and a cat.

Most recently, we have seen great advances in a subfield of machine learning called natural language processing: first with the advent of transformer models, and now with the creation of the large language model (LLM).

Large language models (like GPT-4) are trained on very large sets of data, drawn from a large share of the public internet. They are able to produce natural-looking text by following patterns that occurred in the (human-written) text they ingested during their learning phase.

Incorporating a GPT-series large language model into a chatbot gave us ChatGPT.

The Strengths and Weaknesses of LLMs
The way LLMs are trained makes them incredibly good at performing certain tasks. They can recite everything from Shakespeare to particle physics journals, and they can paraphrase, summarize, explain and illustrate much of the corpus of human knowledge as only a world-class expert could.

At the same time, these models cannot distinguish between real knowledge and complete nonsense, except through a feedback process that teaches them what people like (and not what is true). The result is that these models hallucinate a lot, confidently stating things that are false, and usually err in the direction of telling people what they want to hear.

LLMs are also unable to tell the difference between original writing and plagiarism, and often write text that is itself plagiarized (without knowing it).

Finally, these models are incapable of originality. Everything they output is some version of text they have encountered in the past, fine-tuned to make people like the output. 

How LLMs Will Be Used in the News Business
The effect of LLMs on the news business is already palpable, and will continue to grow with each day. 

Most often, we see them used in one of three ways:

  • Replacing background research (to save time)
  • Filling in specific paragraphs / sections of an article that require no originality (for instance, covering the history of a past event)
  • Writing low-quality content that requires no originality or truthfulness: clickbait, cheap entertainment, sensationalist gossip, etc.

It is this third use case that I'd like to home in on, because it is likely to change the course of the entire industry over the next five years. Since most content is monetized through ads, and most ads pay per click or per view, clickbait and other forms of junk content already make up the majority of the internet.

The introduction of LLMs into this process will reduce the cost of producing low-quality content by an order of magnitude. As a result, we will probably see even more junk content created and, at the same time, fewer jobs for people creating any kind of content.

What Part of the News Business Is Safe (for Now)
Real journalism requires investigation, thought, interaction with other humans, and original writing. ChatGPT can do none of these things.

Nevertheless, if low-quality content multiplies, it will likely pull some clicks away from higher-quality content—resulting in budget cuts and layoffs throughout the industry. 

Finally, while these models are great at manipulating text, they do nothing to solve the uncanny valley problem—whereby robots that appear almost (but not quite) human are actually less appealing than either real humans or robot-like robots. 

In other words, news anchors—the people we actually look at when we watch the news—are probably safe for a few more decades.

The greatest threat posed by the latest advances in AI is their ability to generate low-quality content at scale, which could make the entire internet look like the spam folder in your email account. We are likely to see more and more junk chasing fewer and fewer clicks, which will result in significant budget cuts at all levels of content creation (including journalism).

Low-quality content creators will suffer because their ability to generate much more content will mean that fewer of them are required; high-quality content creators will suffer because fewer people will be able to find their content in the vast ocean of junk that surrounds it.

It is my personal belief that the only way to ameliorate this risk is to develop AI-based filters that allow readers to filter out junk, clickbait, and other forms of low-quality content. Such filters would serve the interests of readers, and they would also serve the interests of journalists who want to create high-quality content that readers will actually find.

This is why I have personally devoted my life to developing such a system, and have founded a public benefit corporation that commercializes it. 

Alex Fink

Alex Fink is the Founder and CEO of the Otherweb, a Public Benefit Corporation that helps people read news and commentary, listen to podcasts and search the web without paywalls, clickbait, ads, autoplaying videos, affiliate links, or any other junk. The Otherweb is available as an app (iOS and Android), a website, a newsletter, or a standalone browser extension.