Conversational AI

The science behind the new “Alexa, what should I watch?” Fire TV experience

The phrase launches a feature built to help customers navigate an increasingly complex and diverse world of content.

By Staff writer

October 06, 2022

"What should I watch?"

In an entertainment universe filled with a rapidly expanding catalog of shows across myriad channels and apps, this might be one of the most common questions to pop up in many households. And if you are among those who have trouble keeping up with all the latest shows and pinpointing which ones are worth your time, you are not alone.

In fact, more than half of respondents in a recent survey from the consulting firm Deloitte found it difficult to access content across multiple services, and 49% were frustrated if a service failed to provide them with good recommendations. Viewers find themselves surfing … and surfing. It takes the average smart TV owner 12 minutes to land on a show, according to a 2020 survey by TiVo, and for some viewers that can take up to half an hour.

"It's kind of shocking how much time customers have to spend on finding content instead of just sitting down on the couch and jumping into a TV show or a movie that they really enjoy," said Cosmin Laslau, a technical program manager who works on spoken language understanding as part of the Amazon Alexa Entertainment team. "We wanted to leverage new technology to help solve that problem for customers."

Image caption: The What Should I Watch experience works with many Fire TV devices, including the new Fire TV Cube, left, the Fire TV Omni QLED Series, middle, and the Alexa Voice Remote Pro announced at the 2022 Devices and Services event.

The team did that by launching What Should I Watch (WSIW). The new experience, released in mid-September, combines Alexa AI and Fire TV recommendations to turn Alexa into an entertainment expert who provides relevant suggestions with a conversational customer experience. The experience also works with the new Fire TV Cube, the Fire TV Omni QLED Series, and the Alexa Voice Remote Pro announced at the 2022 Devices and Services event.

“We built WSIW to rapidly experiment with new Alexa technologies and push the envelope on discovery experiences to address the core customer need to find something interesting to watch,” explained Parthasarathi Dutta Sharma, a product manager who helped bring WSIW to customers.

WSIW displays personalized recommendations when customers ask, “Alexa, what should I watch?” or a variant of that phrase. Customers can then customize the recommendations using voice prompts (for example, “just the ones that are free to me”) or by using their remote to select filters on the screen, watch trailers, view additional information (e.g., genre, ratings), and initiate playback.

Building AI for the entertainment space

The What Should I Watch experience builds upon existing Alexa natural language understanding and automatic speech recognition capabilities.

"But bringing natural conversation to the entertainment domain has its own set of unique challenges," Laslau explained. Maybe a show, like The Boys or The Expanse, is ambiguously named, or a movie starts to trend that wasn't in the catalog a week or two ago. Optimizing the feature required combining core advances in AI around natural, multi-turn conversations with a fast-changing catalog.

"We are making sure those natural conversations are intelligent enough to reflect the very latest of what's happening in entertainment," he said.

The team also worked to ensure a mix of personalization based on your preferences (those British detective series you always gravitate toward) and something new that you might not have seen otherwise.

They did this by customizing Fire TV's existing recommender technology, mixing personalization with popular titles and randomizing subsets of these lists so that viewers encounter fresh ideas each time they turn on the TV.
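The blending idea described above can be illustrated with a minimal sketch. This is not Fire TV's recommender: the function name `blend_recommendations`, the `personal_share` parameter, and the placeholder titles are all assumptions made for illustration. It only shows how drawing random subsets from a personalized list and a popular list yields a different mix on each call.

```python
import random

def blend_recommendations(personalized, popular, k=6, personal_share=0.5, rng=None):
    """Return up to k titles: a random subset of personalized picks
    plus a random subset of popular titles (hypothetical helper)."""
    rng = rng or random.Random()
    # Decide how many slots go to personalized titles.
    n_personal = min(len(personalized), round(k * personal_share))
    picks = rng.sample(personalized, n_personal)
    # Fill the remaining slots from popular titles not already chosen.
    remaining = [title for title in popular if title not in picks]
    picks += rng.sample(remaining, min(len(remaining), k - n_personal))
    # Shuffle so the two sources are interleaved on screen.
    rng.shuffle(picks)
    return picks

# Placeholder catalogs; real lists would come from the recommender.
personalized = [f"Detective Series {i}" for i in range(1, 11)]
popular = [f"Trending Title {i}" for i in range(1, 11)]
print(blend_recommendations(personalized, popular))
```

Because the subsets are sampled at random, repeated calls return different selections from the same underlying lists, which is the "fresh ideas each time" effect the team describes.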

A flywheel effect on innovation

The deep-learning-based Alexa Conversations makes it far simpler to develop the thousands of potential dialogue turns that a “What Should I Watch?” utterance might generate.

Alexa Conversations comprises three models: entity recognition (identifying Tom Cruise as an actor, for example), action prediction (utilizing the “movie searching” API to find movies), and argument filling (indicating the movies to be those with Tom Cruise).
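To make the division of labor among the three stages concrete, here is a toy sketch in which simple rules stand in for the deep-learning models. Everything in it is hypothetical: the `KNOWN_ENTITIES` table, the API names `search_movies` and `recommend_titles`, and the function names are invented for illustration and are not part of Alexa Conversations.

```python
from dataclasses import dataclass

# Hypothetical lookup table standing in for a learned entity recognizer.
KNOWN_ENTITIES = {"tom cruise": "actor", "sci-fi": "genre"}

@dataclass
class ApiCall:
    name: str
    arguments: dict

def recognize_entities(utterance):
    """Stage 1, entity recognition: find known mentions and their types."""
    text = utterance.lower()
    return [(mention, etype) for mention, etype in KNOWN_ENTITIES.items()
            if mention in text]

def predict_action(utterance, entities):
    """Stage 2, action prediction: choose an API (a keyword rule here)."""
    if "movie" in utterance.lower() or entities:
        return "search_movies"
    return "recommend_titles"

def fill_arguments(action, entities):
    """Stage 3, argument filling: bind entities to the API's parameters."""
    return ApiCall(name=action,
                   arguments={etype: mention for mention, etype in entities})

def interpret(utterance):
    entities = recognize_entities(utterance)
    action = predict_action(utterance, entities)
    return fill_arguments(action, entities)

print(interpret("Show me movies with Tom Cruise"))
```

For the Tom Cruise request, the sketch recognizes the actor, selects the movie-search API, and passes the actor as its argument, mirroring the example in the paragraph above.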

“Alexa Conversations is designed to reduce the burden on developers, generating variations of dialogue automatically. The team has added several new features recently,” said Jiun-Yu Kao, an applied scientist within the Alexa AI Natural Understanding organization.

Those include conversational Q&A, which allows customers to ask broad questions about the recommended titles, such as which movies won an Oscar; a context reset function that allows a user to “start over” with a blank slate; and visual context, which enhances Alexa’s ability to respond correctly when a viewer says something like, “play the one on the left,” referencing what’s on the screen instead of naming the movie title.
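Two of these features, visual context and context reset, can be pictured with a small sketch. It assumes, purely for illustration, that the dialogue state holds the titles currently on screen in left-to-right order plus any filters applied so far; the class `DialogueContext` and its methods are hypothetical and do not reflect how Alexa represents screen context.

```python
from dataclasses import dataclass, field

@dataclass
class DialogueContext:
    on_screen: list = field(default_factory=list)  # titles, left to right
    filters: dict = field(default_factory=dict)    # e.g. {"price": "free"}

    def reset(self):
        """'Start over': drop the filters accumulated in earlier turns."""
        self.filters.clear()

    def resolve_reference(self, utterance):
        """Map a positional phrase to a title on screen, or None."""
        if not self.on_screen:
            return None
        text = utterance.lower()
        if "left" in text:
            return self.on_screen[0]
        if "right" in text:
            return self.on_screen[-1]
        if "middle" in text:
            return self.on_screen[len(self.on_screen) // 2]
        return None

ctx = DialogueContext(on_screen=["Title A", "Title B", "Title C"],
                      filters={"price": "free"})
print(ctx.resolve_reference("play the one on the left"))
ctx.reset()
print(ctx.filters)
```

A production system would need to handle far more varied phrasing; the point here is only that resolving "the one on the left" requires knowing what is displayed, not just what was said.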

“The WSIW experience is the first to launch with enhanced understanding of screen context,” Kao said. “It is also the first to combine all above-listed features for improved customer experience.”

Alexa and Fire TV science, engineering, and product teams collaborated to build the different components of the new feature.

Looking ahead

Teams continue to work on making What Should I Watch faster and smarter.

One possibility is for users to explicitly guide Alexa by saying something like, "I'm a big sci-fi fan," or "I don't like horror movies." This type of interaction represents an opportunity for Alexa to adapt to how customers prefer to engage: some like to guide the service directly, while others want to lean back and take in recommendations.

As collaboration on the experience continues, both Alexa and Fire TV are becoming more capable. That could have a broader effect, particularly for the Alexa skill development community.

“We’re really trying to raise the bar,” Mattoso said, “and the capabilities we develop may eventually benefit third-party skill developers. Those might include improved long-term memory, better context resetting, and better visual context understanding.”
