New Group Aims to Develop Standards Using AI to Improve Data Compression

GENEVA, Switzerland—A new group tasked with using artificial intelligence to improve the efficiency of data compression was announced here on Sept 30. Several forces were behind the establishment of the international non-profit “Moving Picture, Audio and Data Coding by Artificial Intelligence” (MPAI) organization.

One driving force is the need for an organization that is responsive to industry needs and whose mission is to develop data coding standards for a range of applications, with AI as the core enabling technology. In the past, the sheer reduction of the amount of data (i.e., compression) has been the success factor for a variety of businesses, ranging from broadcasting to telecommunications, IT and related industries.

In response to the demand for more compression, MPAI plans to develop AI-enabled standards that further improve the coding efficiency of data types that have already benefited from compression and bring the benefits of coding to new data types. An example of AI-enabled coding is to “bring out” aspects of the data semantics relevant to an application.

The second driving force is the need to overcome the limitations of Fair, Reasonable and Non-Discriminatory (FRAND) licensing declarations, a burning issue for many standards developing organizations and their industries. MPAI plans to address this problem by developing a “framework license” for each MPAI standard. The framework license is the business model, stated without values, dates and percentages, that holders of standard essential patents (SEPs) intend to use to monetize their patents that are eventually adopted in the standard.

USE CASES

Immediately after the announcement of MPAI’s creation last summer, a group of industry colleagues collaborated to create a set of use cases. One project that is quickly taking shape is Context-based Audio Enhancement (MPAI-CAE), designed to improve the user experience for a variety of uses such as entertainment, communication, teleconferencing, gaming, post-production, restoration etc., in a variety of contexts such as in the home, in the car, on-the-go, in the studio etc.

Two more projects are in the pipeline. The first is Integrative AI-based Analysis of Genomic/Sensor Experiments (MPAI-GSA) that defines a framework where AI-based or traditional processing components available for free or at a cost can be combined in application-specific “processing apps,” thus creating a horizontal market. The second project is AI-Enhanced Video Coding (MPAI-EVC), a video compression standard that substantially enhances the performance of a traditional video codec by improving or replacing traditional coding tools with AI-based tools.

Additional projects in development include:

Server-based Predictive Multiplayer Gaming (MPAI-SPG), designed to minimize the visual discontinuities experienced by players of multiplayer online games by collecting data from the clients involved in a particular game and feeding them to an AI-based system that predicts each individual participant’s moves.
Multi-Modal Conversation (MPAI-MMC), a standard that defines a framework of AI-based processing components, such as fusion of multimodal input, natural language understanding and generation, speech recognition and synthesis, emotion recognition, gesture recognition, intention understanding and knowledge fusion. Appropriate orchestration of these components enables forms of communication with machines that emulate human-to-human communication in completeness and intensity.
Compression and Understanding of Industrial data (MPAI-CUF), a standard to filter and extract key information from the flow of data that companies handle, whether received from outside the company (e.g. invoices, sales, raw materials), generated inside it (e.g. payrolls, product models) or issued for regulatory compliance.
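
To make the prediction idea behind MPAI-SPG in the list above more concrete, here is a minimal Python sketch. It is not MPAI's method: the article describes an AI-based predictor, whereas this example substitutes simple linear extrapolation from the last two client updates as a stand-in. The names PlayerState and predict_state are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlayerState:
    tick: int   # server tick at which this state was observed or predicted
    x: float    # player position, horizontal axis
    y: float    # player position, vertical axis


def predict_state(history: List[PlayerState], target_tick: int) -> PlayerState:
    """Estimate a player's position at target_tick from past client updates.

    Assumption: linear extrapolation stands in for the AI-based predictor
    mentioned in the article; a trained model would replace this function.
    """
    if not history:
        raise ValueError("at least one past state is required")
    last = history[-1]
    if len(history) < 2:
        # Only one observation: assume the player has not moved.
        return PlayerState(target_tick, last.x, last.y)
    prev = history[-2]
    dt = last.tick - prev.tick
    if dt <= 0:
        # Out-of-order or duplicate updates: fall back to the last position.
        return PlayerState(target_tick, last.x, last.y)
    # Velocity per tick, estimated from the two most recent updates.
    vx = (last.x - prev.x) / dt
    vy = (last.y - prev.y) / dt
    ahead = target_tick - last.tick
    return PlayerState(target_tick, last.x + vx * ahead, last.y + vy * ahead)


# Usage: the client's update for tick 3 is late, so the server fills the gap.
history = [PlayerState(0, 0.0, 0.0), PlayerState(2, 4.0, 2.0)]
guess = predict_state(history, target_tick=3)
```

When the real update arrives, a server could compare it with the guess and correct the displayed position, which is where a better predictor would reduce visible jumps.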

Work in MPAI is moving fast. Several of the projects mentioned above (e.g. MPAI-CAE, MPAI-GSA, MPAI-SPG and MPAI-MMC) share a common component that combines different AI-based processing modules to meet the needs of applications. This component has been named the AI Framework (MPAI-AIF).
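
The idea of a shared component that combines processing modules can be sketched in a few lines of Python. This is an illustrative assumption, not the MPAI-AIF design: the Workflow class and the two toy audio modules (peak_normalize and measure_energy) are hypothetical, and simple arithmetic stands in for AI-based processing.

```python
from typing import Any, Callable, Dict, List, Tuple

# A module reads the shared data dictionary and returns the keys it updates.
Module = Callable[[Dict[str, Any]], Dict[str, Any]]


class Workflow:
    """Runs named processing modules in order over a shared data dictionary."""

    def __init__(self) -> None:
        self._modules: List[Tuple[str, Module]] = []

    def add(self, name: str, module: Module) -> "Workflow":
        self._modules.append((name, module))
        return self  # returning self allows chained add() calls

    def run(self, data: Dict[str, Any]) -> Dict[str, Any]:
        for _name, module in self._modules:
            # Merge each module's output into the data seen by later modules.
            data = {**data, **module(data)}
        return data


def peak_normalize(data: Dict[str, Any]) -> Dict[str, Any]:
    """Toy module: scale samples so the largest magnitude becomes 1.0."""
    samples = data["samples"]
    peak = max(abs(s) for s in samples) or 1.0  # avoid dividing by zero
    return {"samples": [s / peak for s in samples]}


def measure_energy(data: Dict[str, Any]) -> Dict[str, Any]:
    """Toy module: mean squared amplitude of the (normalized) samples."""
    samples = data["samples"]
    return {"energy": sum(s * s for s in samples) / len(samples)}


# Usage: the same modules could be recombined differently per application.
workflow = Workflow().add("normalize", peak_normalize).add("energy", measure_energy)
result = workflow.run({"samples": [0.0, 0.5, -0.25]})
```

Keeping each module behind the same small interface is what lets one framework serve several projects: an application picks and orders the modules it needs without changing the modules themselves.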

ROBUST AI RESEARCH ALREADY UNDERWAY

MPAI's reliance on AI-based data coding is justified not only by the promise of higher performance, but also by the expected ability to tap into the results of thriving research in the AI community in such areas as:

Representation learning, the discovery of data coding effective in solving AI tasks;
Transfer learning, the adaptation of an AI model to work with different data;
“Edge AI,” the deployment of AI models to the edge;
Model integration, the creation of larger AI models by combining simpler models; and
The ability to reproduce performance, giving an AI model the same level of performance in different contexts.

AI is not vaporware, but rather a stream of technologies fed by global research. MPAI believes that, to ensure the success of its standards in the fast-evolving AI field, it must leverage its connection with academia and industrial research. Indeed, approximately 40% of the current MPAI members are academic and research institutions.

Any entity that supports the mission of MPAI, such as a corporation, individual firm, partnership, university, governmental body or international organization, may apply for membership, provided that it is able to contribute to the development of standards for the efficient use of data. Individuals representing technical departments of academic institutions may apply for Associate Membership, stating their qualifications in their application.

For further information, visit https://mpai.community/ and/or contact secretariat@mpai.community.

Leonardo Chiariglione has been at the forefront of several initiatives that have helped shape media technology and business as we know them today. Among these is the Moving Picture Experts Group (MPEG), which he founded and chaired for 32 years. He is the CEO of CEDEO.net, a company providing advanced media technologies and solutions and advising major multinationals on matters related to digital media.