Navigating the Unlimited World of Net TV

A consequence—not necessarily unintended—of the Internet-connected TV era is the inevitable avalanche of content and the concomitant challenge of finding what you want to see.

Now add in evolving navigation options—voice and gesture controls, facial recognition—and you have a completely new experience when you sit down to watch some TV.

For many years, we've recognized the limitations of the classic TV program grid, with its up-down-left-right navigation. It was barely OK in the 50-channel era and maybe even the 200-channel universe; but the current onslaught of video-on-demand content and now the integrated inclusion of Web-delivered video are making the program selection process even more complicated.


Whether it's YouTube, Netflix or any other online source, or multiplexed broadcast and cable channels, there's simply too much to see and no easy way to find it.

Many ventures are rushing in to help viewers seek whatever they want to watch, wherever the programs may reside. Companies such as Blinkx, which calls itself "the world's largest and most advanced video search engine;" Fanhattan, a digital entertainment discovery service; and Google's TV search capability offer some solutions. So do video recommendation engines such as Jinni, plus the venerable TV Guide and Rovi systems.

[Photo: The Samsung booth at CES in the Central Hall of the Las Vegas Convention Center]

Many of these approaches are integrating traditional linear program line-ups of broadcast and cable TV channels while they try to develop a structure that doesn't overly confuse viewers who might want to find a show that is available through an online library.

Since Net-connected TV sets can deliver shows from traditional TV sources as well as broadband video hosts, the viewing options on that flat-panel display screen are unlimited. And the opportunities for confusion and frustration are also endless.

At last month's Consumer Electronics Show, exhibits were awash with video guide options—and also with interface alternatives that will supposedly simplify and make finding shows more "natural." The frustrating 48-button uber-remote controls were definitely not front-and-center.

There were plenty of integrated multiscreen systems that enable viewers to browse for shows via a tablet or smartphone, then order them to be displayed on the big-screen monitor in the room, or to be stored on a DVR for later viewing.

But the most dramatic approaches eliminated buttons and touch-based (haptic) systems, relying instead on natural language voice controls or gestures to let viewers manage their selections. This was a dramatic leap in just the last few years, some of it spurred by the Apple iPhone Siri phenomenon.

Clearly, LG, Samsung, Sony, Toshiba and many other TV makers want to shore up their interface options before the arrival of Apple TV (presumably by next year), which is anticipated to have a voice-controlled option.

The recent, rapid advances in voice control triggered my own nostalgic memories. It was at an annual International Conference on Consumer Electronics (run by the Institute of Electrical and Electronics Engineers) in the early '80s where I first encountered a Texas Instruments voice-recognition prototype, the kind that had to be customized to an individual voice and accent. Appropriately, the ICCE conference was then, as now, piggybacked on CES.


Then it was an intriguing concept. Now it's part of the TV experience. The growing focus on immersive interfaces came through at a CES panel on alternative interfaces.

At CES, Nuance introduced its voice-recognition "Dragon TV" software, which it plans to license to set-top box and TV set makers. Richard H. Mack Jr., vice president of Nuance Communications, which makes some of the technology behind Siri, explained how his company has developed software to understand what speech means (in other words, the context in which a viewer is searching for a show).

Using Dragon TV technology, viewers could say to their TV sets: "Who's on Ellen today?" Or "Find comedies with Vince Vaughn;" or "What's on Bravo at 9 tonight?" Mack said that with increased processing power, "It's only going to get better."

Also on the panel, Adi Berenson, vice president of PrimeSense, the company that created the Kinect gesture control device for Microsoft's Xbox 360, demonstrated his company's next-generation interface, which can recognize movement in three dimensions. Berenson showed a prototype program guide that let him reach toward an on-screen movie list and "grab" a title. His motion triggered a preview clip to run.

Meanwhile on the CES show floor, LG showed an enhanced version of its "Magic Remote," the year-old gesture-controlled remote control. Now it includes voice recognition capability, allowing viewers to speak their control commands. Samsung's approach let you say "Hi TV" to turn on the set, then recognized terms like "Web browser" to take you to the online options of the Smart TV.

Computer makers, whose products are fundamentally digital TV sets with more processing power, are also stepping up their voice-response options. Lenovo unveiled its K91 smart TV, based on the Android 4.0 operating system. In addition to voice recognition, the K91 has a webcam that can be used for facial recognition to control viewing and enable advanced parental control. Expansion of such features will create greater opportunity for customization of personal TV viewing experiences.

Rovi Corp. approached the cross-platform content search option in several ways. TotalGuide G2, its second-generation content discovery system, is aimed at Net-connected TV sets. The Rovi Multi-Source Entertainment Guides enable access to content from a variety of sources, including over-the-top video content, electronic sell-through and VOD programs, as well as linear broadcast TV, cable, satellite or IPTV content.

Hybrid integrated navigation and control systems are also popping up. Arris Group Inc., the cable technology supplier, showed off voice controls for its Moxi interactive program guide, which would let viewers speak commands into their mobile devices to change channels and search for content on a TV.

ActiveVideo Networks also privately demonstrated an ersatz system using Apple's Siri interface. An iPhone user can log into his cable system account, then speak to Siri to order shows or recordings simply by using the voice-activated interface.

Alongside such multiplatform program search and navigation systems are the evolving Automatic Content Recognition (ACR) systems such as Civolution and Audible Magic, which bring social TV features to the program selection process.

Next comes the really hard part. Customers have to figure out which search and navigation systems fit their needs, and then they'll have to determine which interfaces are available via their equipment and how the systems fit their viewing process.

It may take a while.

Clarification: An updated version of my Jan. 18 column, "Cost is Key in Zuicker's Digital Agenda" about Anthony Zuiker and his made-for-Web video series is available at

Gary Arlen is president of Arlen Communications LLC, a media/telecom research firm. He can be reached at

Gary Arlen

Gary Arlen, a contributor to Broadcasting & Cable, NextTV and TV Tech, is known for his visionary insights into the convergence of media + telecom + content + technology. His perspectives on public/tech policy, marketing and audience measurement have added to the value of his research and analyses of emerging interactive and broadband services. Gary was founder/editor/publisher of Interactivity Report, TeleServices Report and other influential newsletters; he was the long-time “curmudgeon” columnist for Multichannel News as well as a regular contributor to AdMap, Washington Technology and Telecommunications Reports; Gary writes regularly about trends and media/marketing for the Consumer Technology Association's i3 magazine plus several blogs.