Encrypted Media Extensions Provide a Common Ground

CableLabs

May 19, 2015

Background
Carriage agreements between content owners and Multichannel Video Programming Distributors (MVPDs) are likely to contain clauses that require the MVPD to provide ample protection against content theft. In the traditional, QAM-based delivery model of cable television networks, the desired level of protection was relatively easy to implement and manage because the operator controlled all parts of the ecosystem. Tight integration of the headend, network, and client set-top boxes with an off-the-shelf or homegrown Conditional Access System (CAS) provided the protection that was needed.

In the age of internet video, client playback devices are typically owned by consumers and run a variety of operating systems and hardware configurations. Initially, companies like Adobe and Microsoft developed native applications to support decryption and rendering of premium video content. These native apps were later converted to web-browser extensions, which enabled HTML-embedded encrypted content. With the advent of adaptive bitrate (ABR) streaming paradigms such as Apple’s HTTP Live Streaming, Microsoft’s Smooth Streaming, and MPEG-DASH, these “black-box” media players left many content distributors wanting even more control over the playback experience. In response, the World Wide Web Consortium (W3C) developed the Media Source Extensions (MSE) API to allow JavaScript applications to provide individual audio, video, and data media samples to the browser. With this tool, web applications could decide when and how to switch between bitrates. While this put adaptive bitrate control in the hands of the application, it did not provide a secure method of playing encrypted content.

In 2012, W3C began work on standardizing the Encrypted Media Extensions (EME). These new JavaScript APIs allow a web application to facilitate the exchange of decryption keys between a digital rights management (DRM) system embedded in the web browser (the Content Decryption Module or CDM) and a key source or license server located somewhere on the network. CableLabs has played an active role in the W3C working group to ensure the needs of the cable industry are met in the development of the EME specification. The EME APIs have undergone several significant transformations over their three-year history, but the architecture is now stabilizing and browser vendors are beginning to produce complete and robust implementations.

EME Workflow

The process by which a JavaScript web application utilizes the EME APIs goes something like this:

  1. (OPTIONAL) The browser media engine notifies the app that it has encountered encrypted media samples for which it has no appropriate decryption key.
  2. App requests access to a DRM system available in the browser that supports specific operational and technical requirements associated with the content.
  3. App assigns the selected DRM system to an HTMLMediaElement.
  4. App creates one or more key sessions associated with the selected DRM system, each of which will manage one or more decryption keys (licenses).
    1. The app instructs the key session to generate a license request message by providing it with initialization data. The browser may provide this data by means of the event in Step 1, or the app may acquire it through other means (e.g. in an ABR manifest file).
    2. The CDM for the selected DRM system will generate a data blob (license request) and deliver it to the app.
    3. The app sends the license request to a license server.
    4. Upon receiving a response to its license request, the app passes the response message back to the CDM. The CDM adds to the key session any decryption keys contained within the response.
  5. The CDM and/or browser media engine will use keys stored in the key session to decrypt media samples as they are encountered.
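
The steps above can be sketched in JavaScript roughly as follows. This is a minimal illustration, not a production player: `keySystem`, `config`, and `licenseUrl` are values the caller is assumed to supply, the license server is assumed to accept the raw CDM message in an HTTP POST body (real license services often differ), and the `env` parameter is a hypothetical hook that lets the sketch run against stand-in objects outside a browser.

```javascript
// Sketch of the EME workflow. `video` is an HTMLMediaElement; `keySystem`,
// `config`, and `licenseUrl` are assumed inputs. `env` is a hypothetical
// injection point that defaults to the browser globals.
async function setUpEme(video, keySystem, config, licenseUrl, env = globalThis) {
  // Step 2: ask the browser for a DRM system that satisfies the configuration.
  const access = await env.navigator.requestMediaKeySystemAccess(keySystem, [config]);
  const mediaKeys = await access.createMediaKeys();

  // Step 1 (optional): the media engine reports encrypted samples.
  video.addEventListener('encrypted', async (event) => {
    // Step 4: create a key session for this initialization data.
    const session = mediaKeys.createSession();

    // Steps 4.2 to 4.4: relay the CDM's license request to the license
    // server and pass the response back to the CDM.
    session.addEventListener('message', async (msg) => {
      const response = await env.fetch(licenseUrl, { method: 'POST', body: msg.message });
      if (!response.ok) throw new Error('License request failed');
      await session.update(await response.arrayBuffer());
    });

    // Step 4.1: hand the initialization data to the CDM.
    await session.generateRequest(event.initDataType, event.initData);
  });

  // Step 3: assign the selected DRM system to the media element.
  await video.setMediaKeys(mediaKeys);
  // Step 5 (decryption) happens inside the browser and CDM.
  return mediaKeys;
}
```

In a browser this might be called as `setUpEme(videoElement, 'org.w3.clearkey', config, licenseUrl)` before media is attached to the element, so that the `encrypted` event is not missed.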

Initialization Data

Protected content intended for playback in an EME-enabled web browser must be accompanied with data that instructs a particular DRM implementation how to fetch the licenses required to decrypt it. This may include information such as key IDs, license server URLs, and digital rights assigned to the content. The contents of the initialization data packet are, in most cases, not to be parsed by the application. However, it is necessary to specify the method of carrying initialization data in a variety of media containers so as to allow browser media engines to extract it from a stream for delivery to the application. The W3C maintains a registry of currently defined stream and initialization data formats.

Key System Attributes

The first step in the EME process is to find a key system that satisfies the requirements of the content and the application. The next sections describe the criteria available to the app to allow it to select from a set of multiple DRMs implemented in a browser via the Navigator.requestMediaKeySystemAccess() API.

DRM System

EME was designed with the understanding that a single browser may support one or more DRM systems. Additionally, with ISO Common Encryption (CENC), a single piece of content could be protected with multiple DRM systems. In EME, each DRM is associated with an identifying key system string (e.g. “com.microsoft.playready”, “org.w3.clearkey”) and a Universally Unique Identifier (UUID). While the key system string will be unique within a particular browser implementation, the UUID should be unique across all browser implementations. The DASH Industry Forum has created a registry of UUIDs to maintain this uniqueness across DRM vendors. The application must select a DRM system that is supported by both the content and the browser.
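
One way to picture this selection is a small lookup from the DRM UUIDs signaled in the content to key system strings, followed by a probe of the browser. The sketch below is illustrative only: the UUID table is a partial example that should be checked against the DASH Industry Forum registry, the Widevine entry is an addition not mentioned in this post, and `candidateKeySystems` and `selectKeySystem` are hypothetical helper names.

```javascript
// Partial, illustrative mapping from DRM system UUID to key system string.
// Verify entries against the DASH-IF registry before relying on them.
const KEY_SYSTEM_BY_UUID = new Map([
  ['9a04f079-9840-4286-ab92-e65be0885f95', 'com.microsoft.playready'],
  ['edef8ba9-79d6-4ace-a3c8-27dcd51d21ed', 'com.widevine.alpha'],
]);

// Key system strings for the DRM systems that protect the content,
// in the order the content lists them. Unknown UUIDs are skipped.
function candidateKeySystems(contentUuids) {
  return contentUuids
    .map((uuid) => KEY_SYSTEM_BY_UUID.get(uuid.toLowerCase()))
    .filter((keySystem) => keySystem !== undefined);
}

// Returns a MediaKeySystemAccess for the first candidate the browser
// supports, or null if none is supported. `nav` defaults to the browser's
// navigator object and can be replaced for testing.
async function selectKeySystem(contentUuids, config, nav = navigator) {
  for (const keySystem of candidateKeySystems(contentUuids)) {
    try {
      return await nav.requestMediaKeySystemAccess(keySystem, [config]);
    } catch (err) {
      // The browser rejected this key system; try the next candidate.
    }
  }
  return null;
}
```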

Content Types

Assessing content type support crosses the line between the CDM and the media engine in the browser. The content’s container type must certainly be supported by the browser since it will need to parse the container to learn about the content (e.g. Is it encrypted? How many tracks?). The audio and video codec information is also important and will require support by the browser and/or CDM. In certain DRM robustness models, decrypted media samples may not be allowed outside of the protected memory of the CDM or graphics drivers. In that case, it would be up to the CDM to coordinate decode and display of the media.

Key Session Persistence

When creating a key session, applications are able to indicate that licenses associated with that session are to be persisted across multiple loads of the application. In order to ensure that these types of sessions can be created, the application can request access only to CDMs that can support persistence.

Distinctive Identifiers

One of the big arguments against the inclusion of “black-box” CDMs in the world of open-source software is the possibility that the CDM would use unique or near-unique attributes of the user or device to “track” an individual or small groups of individuals. In an attempt to address this privacy-related concern, a portion of the EME specification is dedicated to defining these distinctive identifiers and indicating when and where they might be used by a CDM. When requesting access to a particular key system, the application may choose to select only from CDMs in the browser that do not use distinctive identifiers. CDMs that have an explicit dependence on the use of distinctive identifiers may not be available for selection by an application (and thus, may prevent playback of certain content) if the app indicates them as off-limits.
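
The attributes described in the sections above come together in the configuration object passed to Navigator.requestMediaKeySystemAccess(). The following sketch builds one such object. The member names follow the published W3C Recommendation and may not match earlier drafts exactly; the codec strings and the `buildKeySystemConfig` helper are illustrative assumptions.

```javascript
// Builds a MediaKeySystemConfiguration-style object. The helper name and
// the specific codec strings are illustrative, not taken from the post.
function buildKeySystemConfig({ persistent = false, allowDistinctiveIdentifier = true } = {}) {
  return {
    // Initialization data formats the app may supply to the CDM.
    initDataTypes: ['cenc'],
    // Content types: container plus codecs that must be playable.
    videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
    audioCapabilities: [{ contentType: 'audio/mp4; codecs="mp4a.40.2"' }],
    // Key session persistence: only match CDMs that can store licenses.
    persistentState: persistent ? 'required' : 'optional',
    sessionTypes: persistent ? ['temporary', 'persistent-license'] : ['temporary'],
    // Distinctive identifiers: 'not-allowed' excludes CDMs that need them.
    distinctiveIdentifier: allowDistinctiveIdentifier ? 'optional' : 'not-allowed',
  };
}
```

An application that wants persistent licenses but no distinctive identifiers might call `buildKeySystemConfig({ persistent: true, allowDistinctiveIdentifier: false })` and accept that some CDMs will no longer be offered.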

MediaKeySession

Key sessions provide the means for initiating the license retrieval process and for storing the keys upon receipt. The application begins the process by providing initialization data to the CDM (MediaKeySession.generateRequest()). The CDM parses the data and generates a license request in its own secure, proprietary format and notifies the application (MediaKeyMessageEvent). Upon receipt of the license request, the application forwards it on to a license server that it knows can handle the request. In a production environment, it is possible that the license request will be packaged with other business-specific data such as requests for user authentication and/or authorization. When successful, the DRM server will respond with a license message which the application will forward on to the CDM (MediaKeySession.update()).

During the normal course of media playback, it is possible that the CDM will need to make an unsolicited request to the DRM server (e.g. to verify that a given license is still valid). The application simply continues to function as a proxy, sending the message to the license server and updating the CDM with the response.
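
As an illustration of packaging a license request with business-specific data, an application might wrap the opaque CDM message in a JSON envelope before sending it. The envelope below is entirely hypothetical: the `token` and `licenseRequest` field names and the `wrapLicenseRequest` helper are assumptions, and a real license service defines its own format.

```javascript
// Wraps an opaque CDM license request (ArrayBuffer) in a hypothetical JSON
// envelope alongside an authorization token. The CDM message is base64
// encoded and otherwise left untouched, since the app should not parse it.
function wrapLicenseRequest(message, authToken) {
  const bytes = new Uint8Array(message);
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return JSON.stringify({ token: authToken, licenseRequest: btoa(binary) });
}
```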

Key Session Persistence

As described earlier, key sessions can be established as “persistent”. In this case, the CDM stores all keys (and other data) associated with the session to a private store on the device. The stored sessions are uniquely associated with the web application that created them. Each key session is assigned a unique identifier that the application can use to recall the session data at a later time. MediaKeySession provides several APIs to allow the application to manage the persistence of key sessions.

  • MediaKeySession.close() – Closes the key session and makes its keys unavailable for decrypting media, but leaves any persistent store of the session unaffected.
  • MediaKeySession.load() – Takes a session ID and loads the data associated with that ID into an empty MediaKeySession object. The keys that were persisted with that session are once again available to decrypt content.
  • MediaKeySession.remove() – Closes the key session AND removes any persistent storage of that key session from the CDM. The associated session ID is now no longer valid.
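
A rough sketch of how an application might use these APIs is shown below. It assumes the session ID is remembered in a simple key/value store such as window.localStorage. The `SESSION_KEY` name and both helper functions are hypothetical, and the 'persistent-license' session type follows the published W3C Recommendation, which may differ from earlier drafts.

```javascript
// Hypothetical storage key under which the app remembers its session ID.
const SESSION_KEY = 'eme-session-id';

// Restores a persisted key session if one exists, otherwise creates a new
// persistent session and requests a license. `storage` is any object with
// getItem/setItem/removeItem; `onMessage` relays license messages.
async function openPersistentSession(mediaKeys, storage, initDataType, initData, onMessage) {
  const savedId = storage.getItem(SESSION_KEY);
  if (savedId !== null) {
    const restored = mediaKeys.createSession('persistent-license');
    restored.addEventListener('message', onMessage);
    // load() resolves to false when no data is stored for this ID.
    if (await restored.load(savedId)) return restored;
    storage.removeItem(SESSION_KEY); // the saved ID is stale
  }
  // A session on which load() was attempted cannot be reused, so start fresh.
  const session = mediaKeys.createSession('persistent-license');
  session.addEventListener('message', onMessage);
  await session.generateRequest(initDataType, initData);
  storage.setItem(SESSION_KEY, session.sessionId);
  return session;
}

// Removes the persisted license data and forgets the session ID.
async function forgetPersistentSession(session, storage) {
  await session.remove();
  storage.removeItem(SESSION_KEY);
}
```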

MediaKeys

Once the application has found a key system that meets both its needs and the needs of the content, it can create MediaKeys. MediaKeys is a container for one or more key sessions. MediaKeys facilitates the association of decryption keys with the HTMLMediaElement that will be used to view the encrypted content. Even if keys have been fetched from a license server and stored in the CDM, the media will not be decrypted until those keys have been associated with the media element.

ClearKey

Also included in the EME specification are the details for a test DRM system known as ClearKey. ClearKey is exactly as its name implies: a system in which decryption keys are “in the clear” at some point during their journey to the CDM. Browser support for ClearKey is mandated by the EME spec. Its intended use is primarily as a means to evaluate an EME implementation in a browser when either content or a CDM for a “real” DRM system is not available. The formats for ClearKey license request and response messages are detailed in the spec. The mechanism by which an application obtains ClearKey keys for a given piece of content is left up to the developer.
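
To give a feel for the ClearKey message formats, the sketch below turns a license request into a license response. As defined in the spec, the request is UTF-8 JSON listing base64url-encoded key IDs, and the response is a JSON Web Key set. The `keyTable` lookup and both function names are assumptions made for illustration; where the keys actually come from is up to the application.

```javascript
// Encodes bytes as base64url without padding, as ClearKey messages expect.
function toBase64Url(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary).replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Builds a ClearKey license (JWK set) for a license request.
// `requestBytes` is the CDM message; `keyTable` is a hypothetical Map from
// base64url key ID to raw key bytes (Uint8Array) held by the application.
function buildClearKeyLicense(requestBytes, keyTable) {
  const request = JSON.parse(new TextDecoder().decode(requestBytes));
  const keys = request.kids.map((kid) => {
    const key = keyTable.get(kid);
    if (key === undefined) throw new Error(`No key for key ID ${kid}`);
    return { kty: 'oct', kid, k: toBase64Url(key) };
  });
  // The result can be passed to MediaKeySession.update().
  return new TextEncoder().encode(JSON.stringify({ keys, type: request.type }));
}
```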

Greg Rutz is a Lead Architect at CableLabs working on several projects related to digital video encoding/transcoding and digital rights management for online video.

This post is part of a technical blog series, "Standards-Based, Premium Content for the Modern Web".