An Introduction to Encrypted Media Extensions (EME)
The cryptographically trusted agent of DRM systems inside browsers is the Content Decryption Module (CDM). The CDM receives the content keys from the remote DRM License Server and then uses them to decrypt encrypted videos for playback. Streaming media applications use the EME API to enable communication between the CDM and the DRM License Server.
Encrypted Media Extensions is an open, standardized API. In this blog we go into the details of how EME works, how DRM ensures revenue security for content creators, and the contentious debate about DRM and what it signifies for user security and privacy.
- Encrypted Media Extensions API and Digital Rights Management Systems
- EME – Jointly built by Browsers and Streaming Services
- Enables browsers to stream premium video
- The Role of EME API in DRM
- Use Case for DRMs
- Controversies around EME and CDMs
- Encrypted Media Extensions and the State of Streaming Media
Encrypted Media Extensions API and DRM Systems
Friction in a friction-less world
The online world operates differently from the offline. Purchasing a film online is different from purchasing a DVD. With online content you can create multiple copies and share them with billions of people, with little effort. The nature of the internet is to facilitate communication with minimal friction.
This poses a challenge for premium video. The business models and revenue security of video creators and streaming services depend on their content remaining scarce, so that users keep paying to watch the content. DRM is therefore a business requirement for content creators.
All DRMs encrypt video files using the AES-128 algorithm. This encryption happens once, at the time the video is made ready for online streaming. For DRM systems the challenge lies in the decryption of the video content. This is because the browser, in which the content is played, cannot be trusted as a cryptographic entity.
Most browsers are open-source, and are controlled by the end-user. On the internet every single point of origin, every single customer is equally capable of pirating content. Any DRM system has to consider the end-user as untrusted.
How DRM works in browsers – an Alice and Bob story
Cryptographic examples often use Alice and Bob, along with a whole host of accompanying characters, to explain cryptographic systems and scenarios. Alice wants to send a message securely to Bob, but suspects that eavesdropper Eve is listening in on the messages.
Alice is the content provider, and wants to make her content accessible to Bob. Alice has used the AES-128 encryption algorithm to encrypt the content. AES-128 is a symmetric algorithm, meaning that Bob must have the same content keys to be able to access the content.
While the encrypted content itself can be sent over an insecure network, care must be taken to ensure that the keys are transferred in a secure manner. If the keys are transferred securely then even if Eve were to intercept all messages she would still not be able to decrypt the content.
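The symmetric property described above can be illustrated with a minimal sketch. It assumes a Node.js runtime and its built-in crypto module; the function names and the choice of counter (CTR) mode are illustrative assumptions, not how any particular DRM system packages content.

```typescript
import { createCipheriv, createDecipheriv } from "node:crypto";

// AES-128 means a 16-byte (128-bit) key. CTR mode is an assumption made
// for this sketch; it also needs a 16-byte initial counter block (IV).
const ALGORITHM = "aes-128-ctr";

// Alice's side: encrypt a chunk of media with the content key.
function encryptSegment(plain: Buffer, key: Buffer, iv: Buffer): Buffer {
  const cipher = createCipheriv(ALGORITHM, key, iv);
  return Buffer.concat([cipher.update(plain), cipher.final()]);
}

// Bob's side: the same key and IV are required to recover the content.
function decryptSegment(encrypted: Buffer, key: Buffer, iv: Buffer): Buffer {
  const decipher = createDecipheriv(ALGORITHM, key, iv);
  return Buffer.concat([decipher.update(encrypted), decipher.final()]);
}
```

Decrypting with the key Alice used returns the original bytes, while any other key yields unusable output. This is why only the key, and not the encrypted content, needs a secure channel.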
The DRM remote license server plays the part of Alice, whereas the Content Decryption Module plays Bob in our example. The request for license that Bob sends to Alice, and the license itself that Alice sends to Bob are both scrambled with proprietary algorithms, so that only the two end-receivers can unscramble the messages. These algorithms are essentially obfuscation tricks that still make it incredibly hard for hackers to reverse-engineer the algorithms used.
The components of Streaming Video DRM
I have already mentioned the DRM license server and the Content Decryption Module many times. It's time for a more detailed look at the essential components of streaming video DRM:
- Encrypted content hosted on Content Delivery Network (CDN)
- Content decryption keys hosted on License Server
- Content Decryption Module sandbox
- Streaming application which uses EME API to enable communication between license server and content decryption module
The streaming application receives content from the Content Delivery Network. The media header contains encryption information, which the streaming application then uses to initiate communication between the remote DRM license server and the device's Content Decryption Module.
CDM code is closed-source, and is available only to the browser platform. For example, Widevine code is proprietary to Google. PlayReady is the proprietary DRM implementation for Microsoft Edge and IE, while FairPlay is the DRM for Safari and iOS. Mozilla Firefox does not have its own CDM, and instead incorporates the Google Widevine CDM inside a secure sandbox.
EME was jointly implemented by Browser vendors and streaming services
The major stakeholders
As I explained previously, DRM is a business requirement for content providers – whether it is independent video creators, Hollywood studios, or streaming applications such as Netflix and VdoCipher. If somehow a play about how EME was implemented and adopted were to be produced, the dramatis personae would be:
- Content Providers, represented by Hollywood
- Streaming Services, represented by Netflix
- Browser Vendors, represented by Google (Chrome), Microsoft (Edge) and Mozilla (Firefox)
- World Wide Web Consortium (W3C) – Responsible for standardizing HTML specs
- Security and Privacy researchers – represented by Electronic Frontier Foundation
Given that much of the contention around DRM is linked to the Digital Millennium Copyright Act (DMCA), there is a possibility of a courtroom scene in our hypothetical play.
Content Providers – Hollywood Studios
The business requirement for a DRM system for Hollywood studios is clear. For example, take the Digital Entertainment Group’s Year-End 2017 Home Entertainment Report. The report revealed the following revenue figures:
- Revenue from Subscription Streaming Services – $9.55bn
- Revenue from Video on Demand – $1.96bn
- Revenue from Electronic Sell-Through – $2.15bn
The total revenue from online media comes out to $13.66bn
- Revenue from DVD – $4.72bn
While the revenue from subscription streaming is split between streaming services (Netflix, Prime Video) and content providers (Hollywood studios), the studios receive a greater percentage of the revenue from selling DVDs and Blu-rays. Although significantly lower than online streaming revenue, and showing consistent year-on-year decline, DVD remains a highly profitable revenue stream for studios.
Given the money involved, safeguarding from piracy is a major priority for Hollywood studios. For this reason they mandate a DRM when licensing content to a third-party streaming service. Any streaming service that wants to license content from any global movie studio therefore has to have a DRM feature. VdoCipher serves this requirement by integrating all major DRMs as part of our service.
Streaming services have the option to implement a custom proprietary desktop app. However, the browser remains the most popular way in which users access the internet on PC, and installing a custom application can be a major friction point in the user experience. DRM built into the browser is therefore in the best interest of streaming services.
The most important metric that browser vendors care about is the number of users using their software. Browser vendors want users to view all online content on desktop through their browsers.
Browser vendors also want to be competitive in the browser market. If even one browser supports DRM, that browser possesses a major advantage over other browsers. This was the problem faced by Mozilla in 2014, when both Chrome and Internet Explorer had implemented EME and DRM in their browsers. Faced with the prospect of losing users who want to watch Netflix to competing browsers, Firefox chose to implement the EME API and bundled Adobe’s CDM.
World Wide Web Consortium – W3C
W3C is the standards-recommending body for the Web. Essentially, W3C is tasked with creating and driving consensus for web standards. The standardization efforts of W3C are one major reason why websites and web applications are interoperable across different browsers. All major stakeholders are represented in the W3C – major corporations as well as user advocacy groups.
The W3C recommends the EME spec as an extension to the HTML5 media specification that modern browsers support. In short, EME is now a standardized API that all browsers can implement.
Security and Privacy Researchers
The code for the Content Decryption Module required by DRM systems is proprietary and closed-source. The CDM is installed on the user’s device against the user’s interest – it serves the purpose of the content provider. For this reason, security and privacy researchers consider the CDM adversarial software.
Furthermore, the Digital Millennium Copyright Act makes it illegal for bona-fide security researchers to publish security flaws in CDMs. Invariably every piece of code has bugs that persistent hackers can exploit to compromise end-users. The DMCA makes it illegal for security researchers to report on these bugs.
The closed-source nature of the Content Decryption Module, and the fact that the DMCA precludes research into security flaws, are major points of contention around DRM in browsers. The strongest voice in W3C for the interests of end-users has been the Electronic Frontier Foundation (EFF), represented by Cory Doctorow.
An implementation history of Encrypted Media Extensions API
In February 2012, David Dorwin from Google, Inc., Adrian Bateman from Microsoft Corporation and Mark Watson from Netflix, Inc. proposed the Encrypted Media Extensions (EME) API as an extension to the HTML5 HTMLMediaElement spec. The First Public Draft was released in May 2013, and in September 2013 W3C director Tim Berners-Lee declared work on the EME spec as being “in scope” of the HTML Working Group.
Work on Encrypted Media Extensions was strongly opposed by open-source advocates and security and privacy researchers. In their view EME API essentially sanctioned the use of closed-source Content Decryption Module in the browser.
To prevent work on the controversial Encrypted Media Extensions from slowing down the processes of the HTML5 working group, an HTML Media Extensions working group was chartered. This group routinely released Working Drafts between 2013 and 2016. Because EME was already in widespread use, the progression from Working Draft to Candidate Recommendation to Proposed Recommendation to the final EME Recommendation was completed in a little over a year, in September 2017.
By 2014, Google Chrome, Mozilla Firefox and Microsoft Internet Explorer had implemented the EME spec. With the native HTML5 Media Source Extensions and Encrypted Media Extensions APIs, streaming services could stream to compatible browsers without requiring users to install the Silverlight or Flash plugins.
Encrypted Media Extensions improves User Experience
EME enables HTML5 video for Premium Content
HTML5 enables video streaming applications to have huge amounts of control over the quality of the video experience. Features of video streaming via HTML5 include:
- Adaptive streaming via MPEG-DASH
- Improved user-control features – precise seeking and playback of content
MPEG-DASH has emerged as the most popular video streaming protocol in the streaming industry. Adaptive streaming enabled by MPEG-DASH protocol ensures that the user is delivered the highest quality video streaming while continuing uninterrupted playback. The video streaming application uses DASH protocols to switch across different bitrate streams according to the user network conditions.
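The bitrate switching described above can be sketched as a simple selection rule. This is a minimal illustration, not the algorithm of any particular DASH player: the Rendition type, the pickRendition function and the 0.8 safety factor are all assumptions made for the example.

```typescript
// Hypothetical description of one encoded variant of the same video.
interface Rendition {
  height: number;      // vertical resolution, e.g. 720
  bitrateKbps: number; // average bitrate needed to play this variant
}

// Pick the highest-bitrate rendition that fits within the measured
// bandwidth, keeping some headroom so playback is not interrupted.
// Falls back to the lowest rendition when nothing fits.
function pickRendition(
  ladder: Rendition[],
  measuredKbps: number,
  safetyFactor = 0.8,
): Rendition {
  if (ladder.length === 0) {
    throw new Error("ladder must contain at least one rendition");
  }
  const sorted = [...ladder].sort((a, b) => a.bitrateKbps - b.bitrateKbps);
  const budget = measuredKbps * safetyFactor;
  let choice = sorted[0];
  for (const rendition of sorted) {
    if (rendition.bitrateKbps <= budget) {
      choice = rendition;
    }
  }
  return choice;
}
```

A player would re-run a rule like this after each downloaded segment, so that the stream steps down when the network degrades and back up when it recovers.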
HTML5 Video enabled by EME provides a better user experience over Flash and Silverlight plugins
HTML5 now provides a better experience using native APIs for every functionality for which Flash was previously required. By the time that EME was introduced in browsers in 2014, DRM was the last use case where Flash provided a functionality not available in HTML5.
Flash also presents major security challenges. The risk arises from the fact that the Flash runtime environment has greater access to the user’s system. The standardization and implementation of EME was the final nail in the coffin of Flash and its security vulnerabilities. All major browsers have discontinued automatic support for Flash, and require user authorization to play Flash scripts.
Having said that, for streaming services the DRM in Flash still offers a reliable fallback in case your videos are not playing in your browser. VdoCipher uses the Flash player as a fallback in case videos are not streaming using Widevine or FairPlay on your target device.
Interoperable APIs across browsers
Interoperable APIs across different browsers mean that users can use streaming applications in their browser of choice. They also mean that website developers have reliable APIs they can use to build fantastic video streaming applications without having to worry about browser compatibility.
Streaming via Browser is more user-friendly than using custom applications
Streaming services can create separate desktop applications, like they have for Android and iOS devices. However, all major browsers run inside a sandbox which separates browser processes from the host machine. This is a major safeguard that prevents malicious code from accessing and infecting your device.
All browser vendors invest heavily in ensuring maximum security. Google launched the Chrome Rewards Program in 2010, offering rewards up to $100,000 for reporting security vulnerabilities that could compromise a user’s host system.
Proprietary apps on desktops have much greater access to the user device, including access to the file system and user-private data. As proprietary applications communicate directly with the website, they may collect more extensive data about the end-user, becoming a major privacy threat.
The role of Encrypted Media Extensions API in DRM systems
EME API mediates information flow between Trusted Systems
In the DRM system, only trusted agents should handle the key in any usable form. The CDM is native to the browser, and does not by itself make any network request to the remote license server which holds the content keys. It is the responsibility of the streaming application to enable communication between these trusted systems.
The key has to pass through the streaming application and the browser, and should only be accessible to them in an unrecognizable form. The messages encoded with the CDM-specific Key System are deliberately obfuscated, and anti-debugging tricks are applied. Any intercepting application can only read the information as byte buffers that are completely opaque to it. The cryptographic principles of confidentiality and integrity require:
- Confidentiality – The messages should be accessible to only the sender and the recipient (CDM and License Server)
- Integrity – The messages should not be altered on the way between the sender and the recipient
With the EME API, the web page and the browser have control over communication between the CDM and the license server. There is no outside communication between the closed source module and the license server.
EME API Details
- The streaming application has to figure out which key system to use when it finds that the content is encrypted (call navigator.requestMediaKeySystemAccess() to find the available Key System). Key System refers to the specific Content Decryption Module schema
- The keys are then initialized (call createMediaKeys to initiate a MediaKeys object)
- A session is created during which the keys are valid (create a MediaKeySession object on the MediaKeys object)
- A request is sent to the CDM to generate the license request (call the generateRequest() method on the MediaKeySession object, with the initialization data sent in the metadata of the encrypted media file)
- The application then receives the license from the license server.
- The streaming application forwards the license to the CDM with the update method on the MediaKeySession object
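The steps above can be sketched in code. The following is a simplified illustration rather than a production player: the small interfaces are modeled on the parts of the EME API that the sketch touches, so that it stays self-contained, and fetchLicense is a hypothetical helper standing in for the application's own request to its license server. The setMediaKeys call, which attaches the keys to the video element, is part of the EME API but is not listed in the steps above. The codec string is only an example.

```typescript
// Minimal structural types modeled on the parts of the EME API used below.
interface KeySession {
  addEventListener(type: "message", listener: (event: { message: ArrayBuffer }) => void): void;
  generateRequest(initDataType: string, initData: ArrayBuffer): Promise<void>;
  update(response: ArrayBuffer): Promise<void>;
}
interface Keys {
  createSession(): KeySession;
}
interface KeySystemAccess {
  createMediaKeys(): Promise<Keys>;
}
interface EmeNavigator {
  requestMediaKeySystemAccess(keySystem: string, configs: object[]): Promise<KeySystemAccess>;
}
interface EmeVideo {
  setMediaKeys(keys: Keys): Promise<void>;
}

// Hypothetical helper supplied by the application: sends the opaque license
// request to the license server and resolves with the opaque license.
type LicenseFetcher = (licenseRequest: ArrayBuffer) => Promise<ArrayBuffer>;

async function setUpEncryptedPlayback(
  nav: EmeNavigator,
  video: EmeVideo,
  keySystem: string,    // e.g. "com.widevine.alpha"
  initDataType: string, // e.g. "cenc"
  initData: ArrayBuffer,
  fetchLicense: LicenseFetcher,
): Promise<void> {
  // Find out whether the requested Key System is available.
  const access = await nav.requestMediaKeySystemAccess(keySystem, [
    {
      initDataTypes: [initDataType],
      videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
    },
  ]);
  // Initialize the keys object and attach it to the media element.
  const mediaKeys = await access.createMediaKeys();
  await video.setMediaKeys(mediaKeys);
  // Create a session during which the keys are valid.
  const session = mediaKeys.createSession();
  // When the CDM emits the license request, relay it to the license server
  // and hand the returned license back to the CDM via update().
  const licenseApplied = new Promise<void>((resolve, reject) => {
    session.addEventListener("message", (event) => {
      fetchLicense(event.message)
        .then((license) => session.update(license))
        .then(resolve, reject);
    });
  });
  // Ask the CDM to generate the license request from the initialization data.
  await session.generateRequest(initDataType, initData);
  await licenseApplied;
}
```

Note that the application only ever relays ArrayBuffer payloads in both directions. It never parses the license request or the license, which matches the opaque byte buffers described earlier. A real player would also handle repeated message events (for example license renewal) and errors from the CDM.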
From a cryptographic standpoint, the two steps in which confidential messages pass through the JS application are:
- When the CDM generates the license request, which the application forwards to the license server
- When the license server returns the license containing the content keys
EME allows browsers to sandbox Content Decryption Module
The EME spec recommends that browsers create a sandbox inside which the closed-source CDM code runs. A sandbox works by creating a virtual environment in which an application operates. The key idea of a sandbox is to isolate system-specific information from the sandboxed application.
A sandbox implementation can curb privacy-invading user-identifiable information that CDMs may leak to the server. Mozilla, when implementing EME and Adobe CDM in May 2014, explained their sandbox implementation which should mitigate some of the risks associated with unverified closed-source code. For example the sandbox that Firefox has implemented gives the device a unique ID for different streaming applications, which means that different content provider services cannot get data about the user behaviour across sites other than their own.
Use Case for DRMs
DRM is a prerequisite for streaming Hollywood content
DRM is a requirement for Hollywood studios for online streaming. Any movie streaming service looking to broadcast Hollywood films via OTT has to deploy a DRM system to protect premium content. For example, Netflix has been streaming DRM-protected videos in HTML5 since 2014, by which time the major browsers had implemented the EME spec.
UltraViolet and Movies Anywhere are two initiatives by movie studios that let users purchase films on different platforms and yet watch them on the platform of their choice. One major issue with DRM is that it creates silos of platforms that users have to navigate. Besides, such silos also enhance the market power of the streaming platforms themselves – if you have bought one film from Amazon Instant Video, you are much more likely to make your second purchase from Amazon Instant Video. With digital lockers you can purchase a film on iTunes and still be able to watch it on Amazon Video or Google Video. Digital lockers are the movie studios’ attempt to create a centralized personal library for users.
DRM enables SVOD and TVOD models
Subscription Video on Demand (SVOD) business model involves giving users access to a large bundle of streaming videos for a monthly fee. The feasibility of the SVOD business model is based on the validity of bundle economics.
In the Transactional Video on Demand (TVOD) business model, on the other hand, users directly purchase the content. Because no physical goods actually change hands (and I am not going into the bytes and bits here), users actually purchase a license to the content. It is the DRM system that enforces the terms of the license.
The viability of the TVOD and the SVOD model requires that it is hard to create copies of premium content online.
DRM Content and licenses can be implemented on-premise
Content can be hosted on private cloud servers just as well as on shared cloud servers. The DRM system can easily be adapted to the requirements of businesses that require their content to be hosted in-house only.
VdoCipher offers on-premise content and license hosting for corporations. Many businesses use our service specifically for the encryption and license server management functionalities.
Controversies around EME and CDM
Closed nature of Content Decryption Modules
Because DRM systems are designed to limit user freedom, they are cause for considerable controversy. The EME API, which itself does not define DRM but is nevertheless exclusively used by DRM systems, has been shrouded in controversy since the first time it was proposed in W3C and implemented in browsers.
The glass-half-full argument that W3C Director Tim Berners-Lee put forth in explaining W3C’s decision to recommend the EME API is that EME enables DRM streaming through browsers. Browsers have better security records than the alternative of proprietary desktop applications, and are therefore much safer to use for the end-user.
Restrictions on security research on CDMs
Closely related to the closed-source nature of the CDM is the fact that the Digital Millennium Copyright Act (DMCA) prevents security researchers from publishing flaws in DRM systems. Because a DRM system is used to protect content creators’ Intellectual Property rights, publishing details about flaws in DRM systems is treated as equivalent to enabling copyright infringement.
Cory Doctorow of Electronic Frontier Foundation (EFF) proposed a covenant in the W3C recommendation process, according to which security researchers will not be liable to prosecution under DMCA on researching the Content Decryption Module. The covenant however was rejected, meaning that security researchers risk imprisonment for carrying out research on the CDM.
Security experts argue that flaws in security systems are inevitable. An important case cited frequently is the Sony BMG rootkit scandal. In 2005, the copy protection code that Sony installed on 22 million CDs exposed PCs to unrelated trojans and malware.
The essential contention of security experts is that independent security research is necessary to make sure that the black-box content decryption modules do not threaten the security of users of the device.
Encrypted Media Extensions API and the State of Streaming Media
While EME standardizes playback in the browser, the state of streaming media in 2018 is still quite fragmented. When premium content is prepared for streaming, it goes through 3 distinct processes:
- Encoding – reduces the file size of the video. Codecs used may be H.264 (AVC), H.265 (HEVC), VP9
- Encryption – Common Encryption standard is designed for different DRM systems to be used with a single set of content keys
- Packaging – Two major streaming protocols are used:
- HLS, used on Apple devices, which uses MPEG-2 TS file format
- MPEG-DASH, which uses ISOBMFF file formats such as fragmented MP4
Widevine DRM supports the ISO Base Multimedia File Format as the video streaming container. Most DASH packagers use a variant of the ISO BMFF standard called fragmented MP4. Apple’s HLS has historically used the MPEG-2 transport stream as container, in which the video files are broken down into chunks of 10 seconds, in the .ts file format. Because Apple packages files for HLS differently, video publishers have to keep multiple copies in different file formats in their Content Delivery Network edge locations.
DASH fragmented MP4s, on the other hand, are in chunks of 2-4 seconds. This allows for faster bitrate switching, the primary adaptive streaming advantage of DASH over HLS. The following diagram, from Bitmovin, captures the many different layers of content encoding, encryption, packaging and streaming that video streaming providers such as VdoCipher need to serve.
At its WWDC 2016 event Apple announced that it would be supporting the fragmented MP4 – fMP4 – container format for HLS streaming on iOS devices and the Safari browser. With this, video services can now package files in a single container format to serve users across all platforms. Fragmented MP4 in HLS is a great step towards a common video streaming workflow across different device platforms. However, there are not yet enough developer tools for streaming services to package fMP4 in HLS.
However, for studio-approved DRM content the Grand Unification Standard does not quite exist. Apple’s DRM currently uses Sample-AES encryption, in which content is encrypted with AES-128 in Cipher Block Chaining (CBC) mode, whereas Widevine and PlayReady support AES-128 in Counter (CTR) mode. For this reason streaming services still have to encrypt and package content twice for different devices and platforms, and have to keep multiple copies on edge CDN locations.
Until all 3 major stages of content preparation – encoding, encryption and packaging – are mutually compatible, streaming services and ultimately customers will have to keep paying the higher costs for storage space across servers and edge locations.
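The storage cost of this fragmentation can be made concrete with a small sketch. The profile table below is an illustrative assumption, using the Common Encryption scheme identifiers "cenc" and "cbcs" to label the two encryption variants; the function names and the figures in the usage note are hypothetical.

```typescript
// One way a stream can be prepared for a DRM: container plus encryption scheme.
interface PackagingProfile {
  drm: string;
  protocol: "DASH" | "HLS";
  container: "fMP4" | "MPEG-2 TS";
  scheme: "cenc" | "cbcs"; // Common Encryption scheme identifiers
}

// Illustrative profiles (assumptions for this sketch).
const profiles: PackagingProfile[] = [
  { drm: "Widevine", protocol: "DASH", container: "fMP4", scheme: "cenc" },
  { drm: "PlayReady", protocol: "DASH", container: "fMP4", scheme: "cenc" },
  { drm: "FairPlay", protocol: "HLS", container: "fMP4", scheme: "cbcs" },
];

// Media files can be shared between DRMs only when both the container and
// the encryption scheme match, so count the distinct combinations.
function distinctMediaCopies(list: PackagingProfile[]): number {
  const signatures = list.map((p) => `${p.container}/${p.scheme}`);
  return signatures.filter((s, i) => signatures.indexOf(s) === i).length;
}

// Total storage across every location that holds a full set of the files.
function storageGb(videoGb: number, copies: number, locations: number): number {
  return videoGb * copies * locations;
}
```

With these profiles, distinctMediaCopies(profiles) is 2 even though all three use fMP4, because the encryption schemes differ. For a hypothetical 10 GB title held at 5 locations, that means 100 GB of storage rather than 50 GB.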