Web video is becoming an attractive way for publishers to win new users, increase engagement, and expand monetization opportunities; for some, it also enables “TV on the Go” products.
With web applications embracing open standards over proprietary plugins, the reported security vulnerabilities in Adobe’s Flash, and efforts by browsers (e.g., Chrome, Firefox) to curtail Flash usage, it behooves web application publishers to start exploring HTML5 for web video.
However, moving from Flash to HTML5 video is not a simple web application code change. The move involves a broad collection of evolving video-related specifications and implementation decisions that span multiple layers of the technology stack. Depending on your requirements, support across different browsers (including some of their older versions) may also be a factor.
This discussion is presented from the perspective of a web application developer who has to navigate the technology landscape to develop a solution. For those who have to deliver a viable solution today, the preference is to use web standards, but, if this is not feasible, practical alternatives may need to be pursued. The current maturity of the specifications and browser support will be presented within the context of typical web video use cases for digital publishers.
Given the number of components and technologies involved, a conceptual schematic is provided below. Note that it should not be interpreted as a strict architectural diagram. For example, VAST, which defines an XML schema for serving ads to video players, appears in the diagram only to illustrate its conceptual association with ad servers; it does not denote a transport protocol. The rest of this article elaborates on the elements of this schematic.

Figure 1. Conceptual schematic of web video technology landscape.
Depending on your architecture and implementation, the technical functions discussed could be assumed by different application components (video player, web application, server application, etc.). Note that the descriptions of the various functions are intentionally devoid of specific commercial product references and are not aligned with any vendor recommendations.
<video> element
The core web page element for playing web video (without requiring a proprietary plugin) is the HTML5 <video> element. The <video> element allows for different sources for video and the use of different video container formats (MP4, WebM) and video codecs (H.264, VP8).
If a browser does not support a specific container or codec, you can offer alternatives by placing multiple <source> elements within the <video> element; the browser uses the first one it can play.
If closed caption support is required, the <track> element provides a simple, standardized way to add subtitles and captions to video.
Videos display closed captions by referring to an accompanying caption file, which contains text of speech and sounds with corresponding timestamps. WebVTT (.vtt) is the format of choice for HTML5 video.
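To make the caption file format concrete, here is a minimal TypeScript sketch that serializes timed cues into WebVTT text. The `CaptionCue` shape and the function names are hypothetical and exist only for illustration; the sketch covers plain cues with timestamps and text, not styling, positioning, or cue identifiers.

```typescript
// Hypothetical cue shape for illustration; times are in seconds.
interface CaptionCue {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format seconds as a WebVTT timestamp of the form hh:mm:ss.ttt.
function toVttTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return zeroPad(h, 2) + ":" + zeroPad(m, 2) + ":" + zeroPad(s, 2) + "." + zeroPad(ms, 3);
}

// Serialize cues into the text of a .vtt file: a "WEBVTT" header line,
// then one block per cue, with blocks separated by blank lines.
function toWebVtt(cues: CaptionCue[]): string {
  const blocks = cues.map(function (cue) {
    return toVttTimestamp(cue.start) + " --> " + toVttTimestamp(cue.end) + "\n" + cue.text;
  });
  return "WEBVTT\n\n" + blocks.join("\n\n") + "\n";
}
```

Calling `toWebVtt` with a list of cues yields text that could be saved as a .vtt file and referenced from a <track> element.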
Similar to the <source> element, a <track> element is specified within the <video> element and has a src attribute that references the related WebVTT file.
Closed captioning support is mandatory for some publishers as the Federal Communications Commission (FCC) requires that captioned programs on TV also be captioned when played on the web. Closed captioning support is fairly stable and broadly supported across browsers and versions. Together, the <source> and <track> elements provide a standard way to play back web video in web applications (responsive web design, progressive web apps).
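The relationship between the <video>, <source>, and <track> elements described above can be sketched as a small TypeScript helper that assembles the markup. This is an illustrative sketch, not a recommended templating approach; the `SourceSpec` and `TrackSpec` types, the function names, and any file names passed in are hypothetical.

```typescript
// Hypothetical descriptors for illustration.
interface SourceSpec {
  src: string;  // URL of the media file
  type: string; // MIME type, e.g. "video/mp4" or "video/webm"
}

interface TrackSpec {
  src: string;     // URL of the WebVTT file
  kind: "captions" | "subtitles";
  srclang: string; // language tag, e.g. "en"
  label: string;   // label shown in the player's caption menu
}

// Escape characters that would break a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Build a <video> element with alternate sources listed in order of
// preference, followed by caption tracks and a plain-text fallback.
function buildVideoMarkup(sources: SourceSpec[], tracks: TrackSpec[]): string {
  const lines: string[] = ["<video controls>"];
  for (const s of sources) {
    lines.push('  <source src="' + escapeAttr(s.src) + '" type="' + escapeAttr(s.type) + '">');
  }
  for (const t of tracks) {
    lines.push(
      '  <track kind="' + t.kind + '" src="' + escapeAttr(t.src) +
      '" srclang="' + escapeAttr(t.srclang) + '" label="' + escapeAttr(t.label) + '">'
    );
  }
  lines.push("  Your browser does not support HTML5 video.");
  lines.push("</video>");
  return lines.join("\n");
}
```

For example, passing an MP4 source, a WebM source, and one English captions track produces a <video> element containing two <source> children and one <track> child.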
Playing protected content
Publishers need to play back on-demand and live Digital Rights Management (DRM)-protected audio and video in browsers without the use of proprietary plug-ins. While Encrypted Media Extensions (EME) is not a DRM solution, it is a specification that enables a web application to use JavaScript for playing protected media. Note that license key exchange to support a DRM solution is controlled by the web application.
The HTML5 HTMLMediaElement adds properties and methods to HTMLElement to support basic media-related functions. EME extends the HTMLMediaElement (e.g., defining properties and methods related to keys used for media decryption) and provides a standard way to interact with Content Decryption Modules (CDM). A CDM is a client-based component integrated with a browser that assists with decryption of media. As shown below, current browsers each leverage different CDMs.

Table 1. Browsers and supported Content Decryption Modules.
Note that EME support for older versions of browsers is not available.
Another related specification for playing protected content via adaptive streaming of live and VOD content is Media Source Extensions (MSE). HTTP streaming generally encodes a source video into discrete file segments, and clients play the stream by requesting those segments from the server. Adaptive streaming enables video to be streamed from a server to a client browser by adapting the segments delivered to the client’s capacity and network conditions. The Media Source API is an extension to the HTMLMediaElement that enables an HTML5 application to use JavaScript to control the source of media and build playback streams from video chunks.
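The adaptation step can be illustrated with a deliberately simplified TypeScript sketch: given the renditions a server offers and a measured throughput, pick the highest bitrate that fits within a safety margin. This is an assumption-laden illustration rather than the algorithm of any particular player or specification; the `Rendition` type, the `pickRendition` function, and the safety factor are hypothetical.

```typescript
// Hypothetical description of one encoded rendition of a video.
interface Rendition {
  bitrateKbps: number;
  url: string;
}

// Pick the highest-bitrate rendition that fits within a fraction
// (safetyFactor, e.g. 0.8) of the measured throughput. If none fits,
// fall back to the lowest-bitrate rendition so playback can continue.
function pickRendition(
  renditions: Rendition[],
  throughputKbps: number,
  safetyFactor: number
): Rendition {
  if (renditions.length === 0) {
    throw new Error("at least one rendition is required");
  }
  // Sort a copy by ascending bitrate so the input is not mutated.
  const sorted = renditions.slice().sort(function (a, b) {
    return a.bitrateKbps - b.bitrateKbps;
  });
  const budget = throughputKbps * safetyFactor;
  let choice = sorted[0];
  for (const r of sorted) {
    if (r.bitrateKbps <= budget) {
      choice = r;
    }
  }
  return choice;
}
```

A real player would typically re-run a decision like this for each segment request and would also weigh buffer level and screen size; an application using the Media Source API would then fetch the chosen segment and append it to the playback buffer.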
Multiple adaptive streaming protocols are in use today. Currently, desktop browsers and some set-top boxes rely on a Flash (10.1 and later) plugin for adaptive streaming, typically via Adobe HTTP Dynamic Streaming (HDS). iOS devices and Safari (Mac) use Apple’s HTTP Live Streaming (HLS). Dynamic Adaptive Streaming over HTTP (DASH) is an emerging industry standard for streaming video and is codec-agnostic (e.g., it can work with HEVC or H.264), but adoption is fragmented across browsers at this time (it is supported only by Chrome and Android).
Advertisement Support
Publisher web sites often play a preroll ad before playing a video clip. The Video Ad Serving Template (VAST) and Video Player-Ad Interface Definition (VPAID) specifications define a standard way to render video ads. HTML5 support for preroll video ads will require advertisers to support VPAID 2.0.
While VAST provides a common ad response format across video players, VPAID defines a standard interface between players and ad units to enable a dynamic, interactive ad experience. VPAID 2.0 adds support for HTML5 and adds JavaScript events and properties related to skipped ads, resized ads, and ad interactions. Currently, many advertisers do not yet support VPAID 2.0, and VPAID 1.0 ads will not function in an HTML5 video implementation. For preroll ads to be used with HTML5 video clip implementations, advertisers need to be engaged to support VPAID 2.0. There are also tools available to help them create HTML5 interactives and migrate Flash ads to HTML5 versions.
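One practical consequence for an HTML5 player is that it has to discard the creatives in a VAST response that it cannot run. The TypeScript sketch below filters already-parsed media file entries, assuming (as is common but not stated in this article) that Flash creatives are labeled with a Flash MIME type and that VPAID 2.0 JavaScript units are labeled with apiFramework "VPAID" and type "application/javascript". The `MediaFileSpec` type and the function name are hypothetical.

```typescript
// Hypothetical, already-parsed view of a VAST <MediaFile> entry.
interface MediaFileSpec {
  url: string;
  type: string;          // MIME type declared in the ad response
  apiFramework?: string; // e.g. "VPAID" for interactive ad units
}

// Keep only the entries an HTML5 player could plausibly use:
//  - Flash creatives are dropped outright
//  - VPAID units are kept only when delivered as JavaScript
//  - plain video files are kept when MP4 or WebM
function html5PlayableMediaFiles(files: MediaFileSpec[]): MediaFileSpec[] {
  return files.filter(function (f) {
    const type = f.type.toLowerCase();
    if (type === "application/x-shockwave-flash" || type === "video/x-flv") {
      return false;
    }
    if (f.apiFramework === "VPAID") {
      return type === "application/javascript";
    }
    return type === "video/mp4" || type === "video/webm";
  });
}
```

If the filtered list is empty, the player has no usable creative for that ad and would typically skip straight to the content.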
If in-stream ad insertion is an important use case, note that it is supported by both HLS and DASH. While SCTE-35 was originally designed for ad insertion in a linear broadcast stream, it has also gained interest for web delivery. For example, there are HLS extensions to accept SCTE-35 markers in an MPEG stream.
Given the significant use of ad blockers by web clients, efforts to dynamically insert ads into the stream on the server side are being pursued. For example, for DASH, one such approach leverages DASH's Media Presentation Description (MPD), which is an XML document describing video segments, their relationships and other metadata used by DASH clients. With this server-side ad insertion, encrypted MPD URLs serve as a segment's server location thereby allowing for real-time decision-making on whether a URL points to content or an ad. Thus, to a DASH client, there is one video stream spliced with both ads and content with no inherent delays from switching between separate invocation contexts for ads and content.
More importantly, it shields publishers from web client ad blocker actions.
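The server-side stitching idea can be sketched in TypeScript as follows. This is a simplified illustration under stated assumptions: it splices ad segments into a content segment list and then hands the client uniform, opaque URLs, using a plain index where the approach described above uses encrypted URLs. The `AdBreak` type, the function names, and the example.com host in the usage note are hypothetical.

```typescript
// Hypothetical ad break: ad segments to play after a given content segment.
interface AdBreak {
  afterSegment: number; // zero-based index into the content segment list
  adSegments: string[];
}

// Splice ad segments into the content timeline so the client sees
// a single continuous sequence of segments.
function stitchTimeline(content: string[], breaks: AdBreak[]): string[] {
  const timeline: string[] = [];
  for (let i = 0; i < content.length; i++) {
    timeline.push(content[i]);
    for (const b of breaks) {
      if (b.afterSegment === i) {
        for (const ad of b.adSegments) {
          timeline.push(ad);
        }
      }
    }
  }
  return timeline;
}

// Replace each real location with a uniform URL so that content and ad
// segments are indistinguishable to the client. The server keeps the
// mapping from position to the real segment. A production system would
// use an encrypted or signed token rather than a bare index.
function toOpaqueUrls(timeline: string[], baseUrl: string): string[] {
  return timeline.map(function (_segment, index) {
    return baseUrl + "/seg/" + index;
  });
}
```

For instance, stitching three content segments with a two-segment ad break after the first yields a five-entry timeline, and `toOpaqueUrls(timeline, "https://video.example.com")` produces five URLs of identical shape that could be listed in the manifest delivered to the client.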
Discussion
As highlighted above, publishers who deliver web applications with premium video content must evaluate their core use cases, gain an appreciation of the technology landscape, and carefully plan for an implementation rollout. It would also be valuable to engage with browser vendors and industry forums to share the challenges faced by solution developers.
While there is no current consensus on streaming protocol adoption among all major browsers, there may be paths for those who want to pursue HLS until DASH becomes more viable. For example, you could evaluate and select a licensed third-party player that lacks DASH and MSE support but can play encrypted HLS streams across all recent major browser versions. Another option is to augment your client application components with emerging open source JavaScript libraries that support HLS across recent versions of browsers (Chrome Android, Chrome desktop, IE11, Firefox and Safari 8). If required, you may need to add fallback support to Flash for older versions of browsers.
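The fallback ordering described above can be expressed as a small decision function. The TypeScript sketch below is illustrative only: the `Capabilities` flags and the path names are hypothetical, and it assumes that a JavaScript HLS library relies on Media Source Extensions, which this article does not state. The comments note one plausible way the flags could be derived in a browser.

```typescript
// Hypothetical capability flags a page might compute at startup.
// In a browser these could plausibly be derived as:
//   nativeHls:   videoEl.canPlayType("application/vnd.apple.mpegurl") !== ""
//   mediaSource: "MediaSource" in window
//   flash:       result of the application's own plugin detection
interface Capabilities {
  nativeHls: boolean;
  mediaSource: boolean;
  flash: boolean;
}

type PlaybackPath = "native-hls" | "hls-via-javascript" | "flash-fallback" | "unsupported";

// Prefer the browser's built-in HLS support, then a JavaScript HLS
// library (assumed here to need Media Source Extensions), then Flash
// for older browsers.
function choosePlaybackPath(caps: Capabilities): PlaybackPath {
  if (caps.nativeHls) {
    return "native-hls";
  }
  if (caps.mediaSource) {
    return "hls-via-javascript";
  }
  if (caps.flash) {
    return "flash-fallback";
  }
  return "unsupported";
}
```

Keeping the decision in a pure function like this makes the fallback policy easy to test independently of any particular browser or player.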
For those who want to pursue the standards-based route (DASH, MSE and EME), there are industry success stories from companies that have invested in their players, worked with industry partners and helped influence the standards. For example, in early 2015, YouTube announced that it had stopped using Adobe Flash. YouTube now uses its HTML5 video player by default in Google’s Chrome, Microsoft’s IE11, Apple’s Safari 8, and in beta versions of Mozilla’s Firefox browser. Netflix is another example of successful use of HTML5.
Netflix uses EME, MSE and WebCrypto (for encryption and decryption of user data) to deliver protected video content via its HTML5 player to the latest versions of Firefox, IE, Edge, Safari, and Chrome. The content is delivered at multiple bitrates (ranging from 100 kbps to 16 Mbps) without the use of plugins such as Silverlight or Flash. Both Netflix and YouTube also leverage DASH for streaming on supported platforms.
In closing, it behooves publishers to build an awareness of the technology landscape related to HTML5 video, participate in industry forums, and partner with technology vendors, both to influence the standards and to learn how best to deliver premium video content via HTML5.
Author Bio
Ian Moraes, Ph.D., is a Senior Director, Technology at a leading media company. Ian has led the development of large-scale web applications including web video solutions and has presented on a number of web topics.
Edited by
Kyle Piscioniere