The Pilot Innovation Challenge, an initiative of the National Association of Broadcasters, “recognizes creative ideas that leverage technological advances in the production, distribution and display of engaging content.” More than 150 ideas were submitted to address the challenge question, “What is an unconventional way broadcasters and other local media could serve communities?”
TV Technology recently spoke with Karly Choi, marketing director for G’Audio, about the End-to-End Audio Solution for VR Live Streaming, which was named a finalist in the Challenge. Winners will be announced Nov. 13.
TV TECHNOLOGY: Please describe the End-to-End Audio Solution for VR Live Streaming and the technology behind it.
Karly Choi: G’Audio Lab’s End-to-End Audio Solution for VR Live Streaming is a spatial audio solution that brings a realistic sonic experience to VR livestreaming. The VR audio solutions currently on the market support playback of streamed and sideloaded VR content, but not livestreaming. G’Audio Lab tackles this issue through our innovative encoding and decoding system for VR livestreaming.
There are two different ways in which audio is recorded and delivered live, so we had to take both of these scenarios into account for the encoding system. The first is the all-in-one style, where the 360 camera already has a built-in Ambisonics microphone. The second is when a 360 camera and an Ambisonics microphone are used separately. G’Audio Lab has a prototype ready for the first scenario and is currently finishing the prototype for the second. We also came up with a third scenario, for which we are still developing the prototype. This third scenario is to record audio using the G’Mic solution, which, once completed, would let the user create directional audio without the costly Ambisonics microphone. This method would also be able to synchronize audio and video. Using multiple regular omnidirectional microphones, our solution can create one Ambisonic-like audio signal that has x, y, and z directional information.
Our VR Live Streaming solution will contain the most efficient and highest-quality binaural renderer in the decoding system. Binaural rendering is needed to deliver three-dimensional audio with just regular headphones. It will allow audio to be adjusted to the VR user’s view while watching livestreaming video. Using a good renderer is important because the rendering process determines the quality of the final audio. Our team members are experts in binaural rendering and have developed an algorithm that was adopted into the international standard MPEG-H 3D Audio. With our algorithm and renderer, 360 video will be played with the highest-quality audio while requiring minimal computation power, which avoids placing a heavy burden on a mobile VR device.
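Editor's note: the idea of an audio signal with x, y, and z directional information that is adjusted to the viewer's direction can be illustrated with textbook first-order Ambisonics. The sketch below is not G’Audio Lab's algorithm or renderer. It assumes the traditional B-format channel layout (W, X, Y, Z), angles in radians measured counterclockwise from straight ahead, and two hypothetical helper names, `encode_first_order` and `rotate_yaw`. It stops before the binaural step, which would additionally filter the rotated signal for headphone playback.

```python
import math

def encode_first_order(sample, azimuth, elevation):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    azimuth: radians, counterclockwise from straight ahead.
    elevation: radians, upward from the horizontal plane.
    """
    w = sample / math.sqrt(2)                             # omnidirectional part
    x = sample * math.cos(azimuth) * math.cos(elevation)  # front/back axis
    y = sample * math.sin(azimuth) * math.cos(elevation)  # left/right axis
    z = sample * math.sin(elevation)                      # up/down axis
    return w, x, y, z

def rotate_yaw(bformat, head_yaw):
    """Counter-rotate the sound field when the viewer turns by head_yaw.

    A source at azimuth a ends up at azimuth (a - head_yaw) relative
    to the viewer. W and Z are unaffected by a horizontal turn.
    """
    w, x, y, z = bformat
    c, s = math.cos(head_yaw), math.sin(head_yaw)
    return w, x * c + y * s, y * c - x * s, z

# A source directly to the left; the viewer then turns to face it.
field = encode_first_order(1.0, math.pi / 2, 0.0)
turned = rotate_yaw(field, math.pi / 2)
```

After the turn, nearly all of the directional energy sits on the front/back axis, meaning the source is now heard straight ahead. Only four channels are rotated no matter how many sources are in the scene, which is one reason this kind of representation suits view-dependent playback on mobile devices.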
TVT: What was the inspiration behind the solution?
KC: G’Audio Lab is dedicated to bringing the most immersive and interactive 3D sound to VR experiences. Our team has been serving both content creators and distributors through our suite of products for VR streaming and sideloading. We realized there was a growing demand for VR livestreaming, yet no existing VR livestreaming platform offered spatial audio. Existing distributors offered sound that was just static stereo, rather than delivering sounds from three-dimensional space to fully utilize VR’s characteristics.
At the 2017 NAB Show, we realized that we are the team capable of building the encoding and decoding systems for VR livestreaming and that we can definitely commercialize the solution. We displayed the concept at our booth and received positive feedback from broadcasting professionals interested in or already delivering 360 video. They were fascinated with how our spatial audio technology can elevate an experience, bringing it to the next level.
We were already integrating the binaural renderer into the decoding system efficiently. Dealing with the encoding system could have been much easier, but our ultimate goal is to replace the costly Ambisonics microphone, offering more realistic audio that is accessible to the public. We have invented an algorithm that will process multiple regular microphone signals as one combined Ambisonics signal. Furthermore, we believe our solution will lower the current barriers facing both creators and publishers, thus enlarging the VR industry as a whole.
TVT: How do you expect this technology will affect the broadcast industry?
KC: With this technology, the broadcast industry will be able to deliver what people hear in the real world, which is how it should have been all along.
Currently, one can either listen to a concert in real life or through traditional media. What one hears at a live concert depends on how close he or she is to the musicians; a person towards the back will hear more noise from the audience than one who is sitting in the front row. Through traditional media, a listener will hear what was recorded at the time with respect to where the speakers are positioned. In VR, we will be able to detect sounds where they are physically located and coming from, creating an audio experience even clearer than what is heard in a real-life situation.
TVT: The description says it’s a solution “that can be adopted into any 360-degree video rendering platform,” but could it be adapted for radio broadcasts or other nontraditional VR environments?
KC: This technology opens doors and creates an entirely new way to experience sound even in a traditional flat-screen environment. Instead of using an HMD, one can use either the remote control or keyboard/mouse to change the direction of one’s view. The audio will be interactive and immersive, with sound adjusted to what one looks at. The best part is this will all be heard through regular headphones, with no extra device or hardware. As long as 360 video is supported in a non-VR environment, our audio solutions can be integrated into it.
This is the same for radio broadcasts. True 3D audio includes height, so sounds from directions such as above and behind will be heard over regular headphones. Instead of a traditional stereo signal, one can now deliver 3D music or multi-directional podcast stories.
TVT: What should readers know about the solution and its practical implications for media consumption?
KC: Since this new medium is all about complete immersion, ‘being there’ is the key to the experience. Even if intricate graphics dictate one is in a giant church, if one hears dry and non-echoic sound, he or she immediately knows this scene is fake. This proves that audio is half the experience if not more; it drives your emotions, completes the scene and transports users into it. G’Audio Lab’s solution proves how far sound can go in VR, as well as possible future avenues. Spatial audio will be maximized for sports games, music concerts, virtual tour experiences and so on, all in livestreaming.