Whether it’s endless parliamentary debates, evening-long soccer games, or hours-long concert recordings, broadcasters record huge amounts of material every day. Until recently, finding a specific, meaningful statement within this material was tedious and time-consuming. Additionally, time is a crucial factor in radio. If something needs to be reported in the next news broadcast, it must happen quickly.

AI is a game-changer for broadcasters needing fast turnarounds, especially when it comes to automatically transcribing recordings. For example, in the audio editor, the generated text is displayed alongside the so-called envelope (the visualization of the audio level), and editors can simply search within the text. When they mark a passage in the text, playback jumps to the corresponding timestamp. Editors can then select a segment, cut it out, create new audio from it, insert it into the broadcast schedule, and play it, all in a very short time.
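The jump from a text search to a position in the audio can be sketched in a few lines of Python. This is only an illustration, assuming the transcription step delivers a start and end time for every word; the `Word` type, the `find_phrase` helper, and the sample data are hypothetical and not taken from any specific audio editor.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str     # the transcribed word (assumed field name)
    start: float  # start time in seconds
    end: float    # end time in seconds


def normalize(token: str) -> str:
    """Lowercase a token and strip surrounding punctuation for matching."""
    return token.lower().strip(".,;:!?\"'()")


def find_phrase(words: list[Word], phrase: str) -> list[tuple[float, float]]:
    """Return the (start, end) time of every occurrence of the phrase."""
    target = [normalize(t) for t in phrase.split()]
    if not target:
        return []
    tokens = [normalize(w.text) for w in words]
    hits = []
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            # The span runs from the first matched word to the last one.
            hits.append((words[i].start, words[i + len(target) - 1].end))
    return hits


# Made-up sample transcript with word-level timestamps.
transcript = [
    Word("The", 12.0, 12.2),
    Word("budget", 12.2, 12.7),
    Word("was", 12.7, 12.9),
    Word("approved", 12.9, 13.5),
    Word("today.", 13.5, 14.0),
]

print(find_phrase(transcript, "budget was approved"))
```

Each returned pair marks where an editor could start and end a cut; a real editor would add fuzzy matching and handle overlapping speech.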

Speech-to-text on a new level

Speech-to-text dates back several decades, but only recently, through machine learning (ML) and AI, has the technology evolved enough to offer significant value to broadcasters. Initially, only a few spoken words were transcribed correctly, but today a recognition rate of over 90% is realistic. The example of speech-to-text clearly shows that it’s never enough to simply introduce a new technology: it needs to deliver quality results before it can be adopted and used.

A particular challenge for speech-to-text arises from the constant stream of new topics, terms, and proper names. The blending of words from different languages, such as foreign technical terms used in the middle of a sentence, is also challenging for the technology. Speech recognition systems are therefore trained to independently learn new words and phrases, and they’re continuously getting better at it. Overall, the quality of transcription is remarkable today. AI even provides a percentage score indicating how likely it is that the result is accurate.
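One way such a score could be put to use is to flag words that an editor should double-check. The following is a minimal sketch, assuming the score arrives as a value between 0 and 1 per word; the data layout, the `flag_uncertain` helper, and the 0.8 threshold are illustrative assumptions, not the behavior of a particular product.

```python
def flag_uncertain(words: list[tuple[str, float]], threshold: float = 0.8) -> list[str]:
    """Return the words whose confidence score is below the threshold."""
    return [text for text, confidence in words if confidence < threshold]


# Made-up example: a proper name is recognized with lower confidence.
scored_words = [
    ("The", 0.99),
    ("Bundestag", 0.62),
    ("voted", 0.97),
    ("on", 0.98),
    ("Friday", 0.91),
]

print(flag_uncertain(scored_words))
```

Proper names and foreign terms are the kind of words that would be expected to show up in such a list most often.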

Archives become treasure troves

Additional functions can be built on transcribed text. It’s very helpful if AI identifies the most important keywords and automatically adds them to the metadata. This drastically increases the usability of the content. Broadcasting companies have extensive archives that may go back to the beginnings of government records. Over the years, such archival material has largely been digitized, but it only becomes usable if it can be found. With AI, complete archives can be searched in seconds. Moreover, it’s becoming increasingly reliable to automatically determine which person is speaking and when. By combining speech and speaker recognition, it becomes possible to find a specific statement from a known person with ease.
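Combining the two kinds of recognition amounts to filtering transcribed segments by speaker and by content. Here is a minimal sketch, assuming each archived segment already carries a speaker label and its transcript; the `Segment` type, the field names, and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    recording: str  # identifier of the archived recording (assumed)
    speaker: str    # label assigned by speaker recognition (assumed)
    start: float    # position in the recording, in seconds
    text: str       # transcript of the segment


def find_statements(segments: list[Segment], speaker: str, keyword: str) -> list[Segment]:
    """Return the segments in which the given speaker mentions the keyword."""
    needle = keyword.lower()
    return [
        s for s in segments
        if s.speaker == speaker and needle in s.text.lower()
    ]


# Made-up archive entries for illustration.
archive = [
    Segment("debate-01", "Speaker A", 95.0, "We will raise the education budget."),
    Segment("debate-01", "Speaker B", 130.5, "The budget is already too high."),
    Segment("interview-07", "Speaker A", 12.0, "Our focus is on public transport."),
]

for hit in find_statements(archive, "Speaker A", "budget"):
    print(hit.recording, hit.start, hit.text)
```

A production archive would use a search index rather than a linear scan, but the principle of intersecting speaker and text matches stays the same.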

What the future could bring

AI has already shown its value in radio journalism. Therefore, the industry is very open to what the near future may bring:

  • There is great potential, for example, in the automated post-processing of audio material, which can then be broadcast faster.
  • Filler words and hesitation sounds can be automatically edited out.
  • AI can suggest how best to cut audio material, and it can make edits automatically for podcasts, short formats, and teasers on various platforms.
  • AI can also create show profiles and help assemble and plan music, images, and videos by theme or for specific markets.
  • Despite AI’s much-discussed text-to-speech capability, whether we will one day see AI-generated newscasters remains unclear.

From the example of radio production, it’s clear that AI can be an ideal tool to support the work of creative people. It takes over repetitive, simple tasks so that editors can focus on more important activities. It helps editors quickly find the right information and generate more output for various platforms. People are—and remain—indispensable in this process.

Read more about CGI’s responsible use of AI and ethical considerations of AI in newsroom workflows.