IBM technology to search Internet for audio and video

IBM is developing search technology that could allow Internet users to archive important news footage, or retrieve scenes from a favorite television show, CNET News reported.

Researchers at IBM are attempting to create a search engine, code-named “Marvel,” that will find video and audio clips that are currently hard to retrieve on the Internet.

A person using IBM's technology will be able to click on a sample shot of, say, a presidential debate, or describe a scene (“two guys, podiums”), and get back relevant clips from the thousands of hours of audio and video that are generated by broadcasters, film studios and, conceivably, individuals every year.

Though current search engines like Google and Yahoo can serve up video clips or images, they do not actually search the images contained in the files. Instead, they rely on metadata attached to the files, and thus find only the small number of files that have been properly tagged.

Marvel is designed to automatically categorize and subsequently retrieve clips using modifiers like “outdoor,” “indoor,” “cityscape” or “engine noise” that describe the action in the clip. The Marvel research team showed off the first prototype this summer. The prototype system can scan through a database of more than 200 hours of broadcast news video and use 100 different descriptive terms to classify and identify scenes. A query takes about two to three seconds.
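The difference between metadata search and concept-based retrieval can be illustrated with a minimal Python sketch. It is not IBM's implementation: the `Clip` class, both search functions, the 0.5 threshold, the clip titles and every score are made up for illustration, and the per-concept scores stand in for whatever an automatic classifier might assign to a clip.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    title: str
    # Tags attached by hand; many clips have none.
    tags: set = field(default_factory=set)
    # Hypothetical classifier confidence (0.0 to 1.0) per descriptive term.
    concept_scores: dict = field(default_factory=dict)


def search_by_metadata(clips, term):
    """Return titles of clips whose hand-attached tags contain the term."""
    return [c.title for c in clips if term in c.tags]


def search_by_concepts(clips, terms, threshold=0.5):
    """Return titles of clips that score at least `threshold` on every term,
    ordered by average score, highest first."""
    if not terms:
        return []
    ranked = []
    for c in clips:
        scores = [c.concept_scores.get(t, 0.0) for t in terms]
        if min(scores) >= threshold:
            ranked.append((sum(scores) / len(scores), c.title))
    ranked.sort(reverse=True)
    return [title for _, title in ranked]


# Made-up sample data: only the first clip was tagged by hand.
clips = [
    Clip("debate_night", tags={"debate"},
         concept_scores={"indoor": 0.9, "podium": 0.8}),
    Clip("untagged_debate",
         concept_scores={"indoor": 0.7, "podium": 0.6}),
    Clip("street_race",
         concept_scores={"outdoor": 0.9, "cityscape": 0.8, "engine noise": 0.95}),
]

metadata_hits = search_by_metadata(clips, "debate")
concept_hits = search_by_concepts(clips, ["indoor", "podium"])
print(metadata_hits)
print(concept_hits)
```

In this toy data the metadata search finds only the clip that someone tagged, while the concept query also returns the untagged debate clip, because it is matched on what the classifier scored in the footage rather than on attached labels.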

IBM has formed a committee with CNN, the BBC and organizations like the Getty library to gather a quantity of video files for testing. A full-fledged, functional Marvel-based search engine may not be available for another three to five years.

For more information, visit www.ibm.com.
