W3C Recommendation: Plugin-Free Playback of Premium Content
The recent World Wide Web Consortium (W3C) Encrypted Media Extensions (EME) Recommendation describes the importance of browser-based content protection for a better user experience when viewing encrypted video on the web. The W3C Media and Entertainment Interest Group has been a key venue for worldwide definition of premium video content delivery requirements. According to W3C:
“EME is an Application Programming Interface (API) that allows plugin-free playback of protected (encrypted) content in Web browsers, which works seamlessly on all major platforms. W3C’s Media Source Extensions (MSE) provides the API for streaming video while its companion Encrypted Media Extensions (EME) provides the API for handling encrypted content.”
As hinted in the announcement, EME is but one piece in achieving a goal of premium content delivery without the use of plugins. Moving web interactions away from plugins into browsers enhances security, privacy and accessibility for consumers and simplifies the development process for web developers.
Consider the past, not so long ago, when no common solution existed for premium content and each company implemented its own browser-specific or plug-in solution. Each solution was different - no content portability across platforms, uneven support for critical viewer features, no common encryption, no common standard or even disclosure for security considerations. Every piece of premium content was wrapped in technology specific to a browser, system vendor and user device.
Premium video content is much more than video
The use of streaming services with encrypted video content has grown exponentially and viewers' expectations are high: multiple language audio tracks, subtitles and closed captions, flawless delivery over the best-effort internet, state-of-the-art user interfaces and content that can be viewed across any browser on any device. This is a challenge for content providers. How do they meet expectations without an explosion of complexity and cost in dealing with multiple devices, browsers and network technology?
A common, browser-based content encryption solution is necessary, but by no means sufficient, in addressing this challenge. Along with EME, CableLabs and multiple system operators (MSOs) have initiated and participated in W3C groups to piece together this puzzle.
It is useful to understand the set of browser features defined by W3C and others, why every feature in this set is required for premium content and why it’s all these features or nothing. These requirements are reflected in many W3C documents, in addition to the EME Recommendation:
- Browser support for multiple audio and video, subtitle and closed caption tracks is defined in the HTML5 Recommendation as a result of these requirements.
- Sourcing In-band Media Tracks defines how content in any format used on the web delivers these tracks in a common, interoperable manner.
- The Media Source Extensions Recommendation, MPEG DASH and DASH Industry Forum collectively define the efficient, smooth delivery of time-critical media across the best-effort internet to any browser.
Every one of these features is essential to the industry goal of making the web a first-class platform for media and entertainment.
Encrypted media extensions, along with HTML5 features, media tracks, media source extensions and DASH, define common implementations for essential components of premium content. All of these pieces rely on each other; take one away and it's back to the not-so-good-old-days. The W3C has provided an essential service to consumers, providers and technology partners by enabling create-once, use-anywhere premium content.
Encrypted Media Extensions Provide a Common Ground
Carriage agreements between content owners and Multichannel Video Programming Distributors (MVPDs) are likely to contain clauses that require the MVPD to provide ample protection against content theft. In the traditional, QAM-based delivery model of cable television networks, the desired level of protection was relatively easy to implement and manage due to the fact that the operator controlled all parts of the ecosystem. Tight integration of the headend, network, and client set-top-boxes with an off-the-shelf or homegrown Conditional Access System (CAS) provided the protection that was needed.
- (OPTIONAL) The browser media engine notifies the app that it has encountered encrypted media samples for which it has no appropriate decryption key.
- App requests access to a DRM system available in the browser that supports specific operational and technical requirements associated with the content.
- App assigns the selected DRM system to an
- App creates one or more key sessions associated with the selected DRM system, each of which will manage one or more decryption keys (licenses)
- The app instructs the key session to generate a license request message by providing it with initialization data. The browser may provide this data by means of the event in Step 1, or it may be acquired by the app through other means (e.g. in an ABR manifest file).
- The CDM for the selected DRM system will generate a data blob (license request) and deliver it to the app.
- The app sends the license request to a license server.
- Upon receiving a response to its license request, the app passes the response message back to the CDM. The CDM adds to the key session any decryption keys contained within the response.
- The CDM and/or browser media engine will use keys stored in the key session to decrypt media samples as they are encountered.
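The steps above can be sketched in TypeScript roughly as follows. This is a minimal illustration rather than a production player: the `...Like` interfaces, `PlaybackInputs`, and `setUpKeySession` are hypothetical names introduced here, shaped after the EME objects (`MediaKeySystemAccess`, `MediaKeys`, `MediaKeySession`) and the `setMediaKeys()` method on the media element. The sketch assumes the initialization data has already been obtained, either from the media element's `encrypted` event or from a manifest.

```typescript
// Structural stand-ins (assumed shapes) for the EME objects used below.
interface KeySessionLike {
  generateRequest(initDataType: string, initData: ArrayBuffer): Promise<void>;
  update(response: ArrayBuffer): Promise<void>;
  addEventListener(type: "message", listener: (ev: { message: ArrayBuffer }) => void): void;
}
interface MediaKeysLike { createSession(): KeySessionLike; }
interface KeySystemAccessLike { createMediaKeys(): Promise<MediaKeysLike>; }
interface MediaElementLike { setMediaKeys(keys: MediaKeysLike): Promise<void>; }

// Everything the app has to supply; all names here are hypothetical.
interface PlaybackInputs {
  keySystem: string;
  configs: object[];
  initDataType: string;
  initData: ArrayBuffer;
  requestAccess(keySystem: string, configs: object[]): Promise<KeySystemAccessLike>;
  sendToLicenseServer(request: ArrayBuffer): Promise<ArrayBuffer>;
  onError(error: unknown): void;
}

async function setUpKeySession(
  media: MediaElementLike,
  inputs: PlaybackInputs,
): Promise<KeySessionLike> {
  // Request access to a DRM system that meets the content's requirements.
  const access = await inputs.requestAccess(inputs.keySystem, inputs.configs);

  // Assign the selected DRM system to the media element.
  const mediaKeys = await access.createMediaKeys();
  await media.setMediaKeys(mediaKeys);

  // Create a key session that will manage the decryption keys.
  const session = mediaKeys.createSession();

  // The CDM hands its license request to the app as a "message" event.
  // The app relays it to the license server and passes the response back.
  session.addEventListener("message", (ev) => {
    inputs
      .sendToLicenseServer(ev.message)
      .then((response) => session.update(response))
      .catch((error) => inputs.onError(error));
  });

  // Ask the key session to generate the license request from the init data.
  await session.generateRequest(inputs.initDataType, inputs.initData);
  return session;
}
```

In a browser, `requestAccess` would typically wrap `navigator.requestMediaKeySystemAccess()` and `media` would be the video element. Passing them in as parameters keeps the sketch easy to exercise with fakes.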
Protected content intended for playback in an EME-enabled web browser must be accompanied by data that instructs a particular DRM implementation how to fetch the licenses required to decrypt it. This may include information such as key IDs, license server URLs, and digital rights assigned to the content. The contents of the initialization data packet are, in most cases, not to be parsed by the application. However, it is necessary to specify the method of carrying initialization data in a variety of media containers so as to allow browser media engines to extract it from a stream for delivery to the application. The W3C maintains a registry of currently defined stream and initialization data formats.
Key System Attributes
The first step in the EME process is to find a key system that satisfies the requirements of the content and the application. The next sections describe the criteria available to the app to allow it to select from a set of multiple DRMs implemented in a browser via the Navigator.requestMediaKeySystemAccess() API.
EME was designed with the understanding that a single browser may support one or more DRM systems. Additionally, with ISO Common Encryption, a single piece of content could be protected with multiple DRM systems. In EME, each DRM is associated with an identifying key system string (e.g. “com.microsoft.playready”, “org.w3.clearkey”) and a Universally Unique Identifier (UUID). While the key system string will be unique within a particular browser implementation, the UUID should be unique across all browser implementations. The DASH Industry Forum has created a registry of UUIDs to maintain this uniqueness across DRM vendors. The application must select a DRM system that is supported by both the content and the browser.
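As a rough illustration of that selection, the sketch below maps the DRM UUIDs signaled by the content to key system strings and picks the first one the browser also supports. The function name `selectKeySystem` and the lookup table are hypothetical. The two UUIDs listed are the commonly published values for PlayReady and Widevine, but they should be verified against the DASH Industry Forum registry before being relied upon.

```typescript
// Assumed mapping from DRM system UUID (lowercase) to EME key system string.
// Verify these values against the DASH-IF registry; they are illustrative.
const KEY_SYSTEM_BY_UUID: Record<string, string> = {
  "9a04f079-9840-4286-ab92-e65be0885f95": "com.microsoft.playready",
  "edef8ba9-79d6-4ace-a3c8-27dcd51d21ed": "com.widevine.alpha",
};

// Returns the first key system that both the content (by UUID) and the
// browser (by key system string) support, or null if there is no overlap.
function selectKeySystem(
  contentSystemIds: string[],
  browserKeySystems: string[],
): string | null {
  for (const id of contentSystemIds) {
    const keySystem = KEY_SYSTEM_BY_UUID[id.toLowerCase()];
    if (keySystem !== undefined && browserKeySystems.indexOf(keySystem) !== -1) {
      return keySystem;
    }
  }
  return null;
}
```

The list of browser-supported key systems is assumed to have been gathered beforehand, for example by probing each candidate with `navigator.requestMediaKeySystemAccess()` and noting which requests are rejected.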
Assessing content type support crosses the line between the CDM and the media engine in the browser. The content’s container type must certainly be supported by the browser since it will need to parse the container to learn about the content (e.g. is it encrypted? How many tracks?). The audio and video codec information is also important and will require support by the browser and/or CDM. In certain DRM robustness models, decrypted media samples may not be allowed outside of the protected memory of the CDM or graphics drivers. In that case, it would be up to the CDM to coordinate decode and display of the media.
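A minimal sketch of how an app might describe these content type requirements is shown below. The member names `initDataTypes`, `videoCapabilities`, `audioCapabilities`, `contentType` and `robustness` come from the EME configuration dictionaries. The helper names, the MP4 container assumption and the codec strings in the usage note are illustrative only, and robustness strings are specific to each key system.

```typescript
// Local types mirroring the relevant members of the EME configuration
// dictionaries (names of these local types are hypothetical).
interface CodecCapability {
  contentType: string;
  robustness?: string;
}
interface CapabilityConfig {
  initDataTypes: string[];
  videoCapabilities: CodecCapability[];
  audioCapabilities: CodecCapability[];
}

// Builds one capability entry per codec, e.g. video/mp4; codecs="avc1.42E01E".
function toCapabilities(
  mimeType: string,
  codecs: string[],
  robustness?: string,
): CodecCapability[] {
  return codecs.map((codec) => {
    const capability: CodecCapability = {
      contentType: mimeType + '; codecs="' + codec + '"',
    };
    // Robustness values are key-system specific, so only set one if given.
    if (robustness !== undefined) {
      capability.robustness = robustness;
    }
    return capability;
  });
}

// Assumes MP4 containers for both audio and video.
function buildCapabilityConfig(
  initDataType: string,
  videoCodecs: string[],
  audioCodecs: string[],
  videoRobustness?: string,
): CapabilityConfig {
  return {
    initDataTypes: [initDataType],
    videoCapabilities: toCapabilities("video/mp4", videoCodecs, videoRobustness),
    audioCapabilities: toCapabilities("audio/mp4", audioCodecs),
  };
}
```

For example, `buildCapabilityConfig("cenc", ["avc1.42E01E"], ["mp4a.40.2"])` yields a configuration that could be placed in the array passed to `Navigator.requestMediaKeySystemAccess()`.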
Key Session Persistence
When creating a key session, applications are able to indicate that licenses associated with that session are to be persisted across multiple loads of the application. In order to ensure that these types of sessions can be created, the application can request access only to CDMs that can support persistence.
One of the big arguments against the inclusion of “black-box” CDMs in the world of open-source software is the possibility that the CDM would use unique or near-unique attributes of the user or device to “track” an individual or small groups of individuals. In an attempt to address this privacy-related concern, a portion of the EME specification is dedicated to defining these distinctive identifiers and indicating when and where they might be used by a CDM. When requesting access to a particular key system, the application may choose to select only from CDMs in the browser that do not use distinctive identifiers. CDMs that have an explicit dependence on the use of distinctive identifiers may not be available for selection by an application (and thus, may prevent playback of certain content) if the app indicates them as off-limits.
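Both the persistence and the distinctive identifier requirements are expressed as members of the configuration passed when requesting key system access. The sketch below is one way an app might derive them from two simple flags. The member names and values (`persistentState`, `distinctiveIdentifier`, `sessionTypes`, `"required"`, `"optional"`, `"not-allowed"`, `"persistent-license"`, `"temporary"`) are from the EME specification, while `buildPolicyConfig` and its parameters are hypothetical.

```typescript
type Requirement = "required" | "optional" | "not-allowed";

// Local type mirroring the relevant EME configuration members.
interface PolicyConfig {
  persistentState: Requirement;
  distinctiveIdentifier: Requirement;
  sessionTypes: string[];
}

function buildPolicyConfig(
  needPersistence: boolean,
  allowDistinctiveId: boolean,
): PolicyConfig {
  return {
    // Only accept CDMs that can persist licenses when the app needs that.
    persistentState: needPersistence ? "required" : "optional",
    // "not-allowed" excludes CDMs that depend on distinctive identifiers,
    // which may prevent playback of some content.
    distinctiveIdentifier: allowDistinctiveId ? "optional" : "not-allowed",
    sessionTypes: needPersistence ? ["persistent-license"] : ["temporary"],
  };
}
```

The resulting members would be merged into the same configuration object that carries the content type capabilities.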
Key sessions provide the means for initiating the license retrieval process and for storing the keys upon receipt. The application begins the process by providing initialization data to the CDM (MediaKeySession.generateRequest()). The CDM parses the data and generates a license request in its own secure, proprietary format and notifies the application (MediaKeyMessageEvent). Upon receipt of the license request, the application forwards it on to a license server that it knows can handle the request. In a production environment, it is possible that the license request will be packaged with other business-specific data such as requests for user authentication and/or authorization. When successful, the DRM server will respond with a license message which the application will forward on to the CDM (
During the normal course of media playback, it is possible that the CDM will need to make an unsolicited request to the DRM server (e.g. to verify that a given license is still valid). The application simply continues to function as a proxy, sending the message to the license server and updating the CDM with the response.
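Because the application acts as a proxy for both solicited and unsolicited CDM messages, the same packaging step applies to each. The sketch below shows one hypothetical way to wrap the opaque CDM message with business-specific data before posting it. The `LicensePost` shape, the `packageLicenseRequest` name, the bearer token header and the content type are all assumptions, since real license servers define their own request formats.

```typescript
// Hypothetical description of the HTTP request sent to the license server.
interface LicensePost {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: ArrayBuffer;
}

function packageLicenseRequest(
  licenseUrl: string,
  cdmMessage: ArrayBuffer,
  authToken?: string,
): LicensePost {
  // The CDM message is opaque to the app and is forwarded unmodified.
  const headers: Record<string, string> = {
    "Content-Type": "application/octet-stream",
  };
  // Assumed authorization scheme; a real deployment defines its own.
  if (authToken !== undefined) {
    headers["Authorization"] = "Bearer " + authToken;
  }
  return { url: licenseUrl, method: "POST", headers, body: cdmMessage };
}
```

In a browser, the app would typically send this with `fetch()`, read the response body as an `ArrayBuffer`, and pass it to `MediaKeySession.update()` so the CDM can add the keys to the session.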
Key Session Persistence
As described earlier, key sessions can be established as “persistent”. In this case, the CDM stores all keys (and other data) associated with the session to a private store on the device. The stored sessions are uniquely associated with the web application that created them. Each key session is assigned a unique identifier that the application can use to recall the session data at a later time.
MediaKeySession provides several APIs to allow the application to manage the persistence of key sessions.
- MediaKeySession.close(): Closes the key session and makes its keys unavailable for decrypting media, but leaves any persistent store of the session unaffected.
- MediaKeySession.load(): Takes a sessionID and loads the data associated with that ID into an empty MediaKeySession object. The keys that were persisted with that session are once again available to decrypt content.
- MediaKeySession.remove(): Closes the key session AND removes any persistent storage of that key session from the CDM. The associated session ID is now no longer valid.
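A small sketch of how an app might use these APIs follows. The interfaces are structural stand-ins with hypothetical names, and the sketch assumes the app saved the session ID earlier (for example in local storage). It relies on `load()` resolving to `false` when no stored data exists for the given ID. The exact closing behavior of `remove()` has differed between drafts of the specification, so treat the second helper as an approximation of the description above.

```typescript
// Structural stand-ins (assumed shapes) for MediaKeySession and MediaKeys.
interface PersistentSessionLike {
  readonly sessionId: string;
  load(sessionId: string): Promise<boolean>;
  close(): Promise<void>;
  remove(): Promise<void>;
}
interface SessionFactoryLike {
  createSession(sessionType: string): PersistentSessionLike;
}

// Tries to restore a previously persisted session. Returns null when the
// CDM has no stored data for the ID, in which case the app would create a
// fresh session and request a new license.
async function reopenSession(
  mediaKeys: SessionFactoryLike,
  storedSessionId: string,
): Promise<PersistentSessionLike | null> {
  const session = mediaKeys.createSession("persistent-license");
  const found = await session.load(storedSessionId);
  return found ? session : null;
}

// Ends use of a session: keep the stored license (close) or discard it (remove).
async function releaseSession(
  session: PersistentSessionLike,
  forgetLicense: boolean,
): Promise<void> {
  if (forgetLicense) {
    await session.remove();
  } else {
    await session.close();
  }
}
```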
Once the application has found a key system that meets both its needs and the needs of the content, it can create MediaKeys, a container for one or more key sessions. MediaKeys facilitates the association of decryption keys with the HTMLMediaElement that will be used to view the encrypted content. Even if keys have been fetched from a license server and stored in the CDM, the media will not be decrypted until those keys have been associated with the media element.
Also included in the EME specification are the details for a test DRM system known as ClearKey. ClearKey is exactly as its name implies: a system in which decryption keys are “in the clear” at some point during their journey to the CDM. Browser support for ClearKey is mandated by the EME spec. Its intended use is primarily as a means to evaluate an EME implementation in a browser when either content or a CDM for a “real” DRM system is not available. The formats for ClearKey license request and response messages are detailed in the spec. The mechanism by which an application attains ClearKey keys for a given piece of content is left up to the developer.
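To make the ClearKey message formats concrete, here is a minimal sketch of a license "server" that could even run inside the app itself. It assumes the JSON formats described in the spec: a request carrying base64url-encoded key IDs in `kids`, and a response carrying a JWK set in `keys` with `kty` set to `"oct"`. The function names and the in-memory key table are hypothetical, and how the app obtains the keys is, as noted above, left to the developer.

```typescript
const B64URL_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Encodes bytes as unpadded base64url, the encoding ClearKey uses for
// key IDs and key values.
function toBase64Url(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64URL_ALPHABET.charAt(b0 >> 2);
    out += B64URL_ALPHABET.charAt(((b0 & 3) << 4) | (b1 >> 4));
    if (i + 1 < bytes.length) {
      out += B64URL_ALPHABET.charAt(((b1 & 15) << 2) | (b2 >> 6));
    }
    if (i + 2 < bytes.length) {
      out += B64URL_ALPHABET.charAt(b2 & 63);
    }
  }
  return out;
}

interface ClearKeyRequest {
  kids: string[];
  type?: string;
}

// Builds a ClearKey license (JSON text) for the key IDs in the request.
// keysByKid maps a base64url key ID to the raw key bytes (hypothetical store).
function buildClearKeyLicense(
  requestJson: string,
  keysByKid: Record<string, Uint8Array>,
): string {
  const request = JSON.parse(requestJson) as ClearKeyRequest;
  const keys = request.kids
    .filter((kid) => Object.prototype.hasOwnProperty.call(keysByKid, kid))
    .map((kid) => ({ kty: "oct", kid: kid, k: toBase64Url(keysByKid[kid]) }));
  return JSON.stringify({ keys: keys, type: request.type || "temporary" });
}
```

The app would decode the CDM's message bytes as UTF-8 to obtain `requestJson`, then encode the returned JSON text back to bytes before handing it to the key session.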
Greg Rutz is a Lead Architect at CableLabs working on several projects related to digital video encoding/transcoding and digital rights management for online video.
This post is part of a technical blog series, "Standards-Based, Premium Content for the Modern Web".