Media Source Extensions inside WebKit


WebKit [1] is a well-known open-source web layout/rendering engine, actively developed by Apple (the Safari browser is based on it). Google Chrome was historically based on WebKit as well, but since Google created its own fork of WebKit, Blink, Chrome has been based on Blink.

The HTML5 video [2] (and audio) element provides a standard HTML API for playing video (and audio) without a plugin such as Flash. Among web video content providers, YouTube and Netflix support HTML5 video. Among web browsers, Chrome, Safari and Firefox support it.

Media Source Extensions [3] (MSE) is a W3C specification (work in progress) that allows JavaScript to dynamically construct media streams for the HTML5 video and audio elements.

In this blog post I want to analyze and show the architecture of MSE inside the WebKit source code. First comes an introduction to the HTML5 video implementation itself (without MSE), then the big picture of MSE, and finally the key interface classes. This post is easier to follow with the WebKit source code at hand, so if you don't have it yet, please check the WebKit [1] site to get a copy. For impatient readers, I suggest using

git clone git://

to take advantage of git (compared to svn).

As WebKit and MSE are still changing, this post refers to WebKit revision r159335.

Brief introduction to HTML5 video inside WebKit

The HTML5 video implementation lives mainly inside WebCore. There are two layers: the HTML Media Element layer (upper) and the MediaPlayer layer (lower).

digraph html5vw {
"HTML Media Element Layer" -> "Media Player Layer"
}

“HTML5 video inside webkit”

For the HTML Media Element layer, refer to the code in HTMLMediaElement and HTMLVideoElement (Web IDL definition and C++ implementation):

<WebkitSrc>/Source/WebCore/html/HTMLMediaElement.idl
<WebkitSrc>/Source/WebCore/html/HTMLMediaElement.h
<WebkitSrc>/Source/WebCore/html/HTMLMediaElement.cpp
<WebkitSrc>/Source/WebCore/html/HTMLVideoElement.idl
<WebkitSrc>/Source/WebCore/html/HTMLVideoElement.h
<WebkitSrc>/Source/WebCore/html/HTMLVideoElement.cpp

For the MediaPlayer layer, there are two parts: the general (platform-independent) part and the implementation (platform-dependent) part. Here is the graph:

digraph mpl {
"Platform independent Part" -> "Platform dependent Part"
}

Or, with better names:

digraph mpln {
"MediaPlayer Interface" -> "MediaPlayer Implementation"
}

Using WebKit's own class names, this simplifies to:

digraph mplw {
"MediaPlayer" -> "MediaPlayerPrivate"
}

For MediaPlayer, refer to the code in the MediaPlayer class:

<WebkitSrc>/Source/WebCore/platform/graphics/MediaPlayer.h
<WebkitSrc>/Source/WebCore/platform/graphics/MediaPlayer.cpp

For MediaPlayerPrivate, the first thing to know is that there is an interface definition for it:


Each platform then provides a concrete implementation of it. For example, the Mac port has three files:

<WebkitSrc>/Source/WebCore/platform/graphics/mac/MediaPlayerPrivateQTKit.h
<WebkitSrc>/Source/WebCore/platform/graphics/mac/
<WebkitSrc>/Source/WebCore/platform/graphics/mac/MediaPlayerProxy.h

For the GStreamer port, there are four files:

<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamerBase.cpp
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamerBase.h
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.cpp
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/MediaPlayerPrivateGStreamer.h

OK, here is a very brief summary of the HTML5 video architecture inside WebKit: the HTML element layer provides the JS APIs and is what the user interacts with; the MediaPlayer layer plays the media. Inside the MediaPlayer layer, the platform-independent part (MediaPlayer) acts as a bridge between the HTML element layer and the platform-dependent part (MediaPlayerPrivate). The MediaPlayerPrivate implementation is where the media is actually played, and it varies from platform to platform.
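To make the bridge idea concrete, here is a minimal C++ sketch of the layering described above. It is not WebKit code: MediaPlayerPrivateInterface, FakeBackendPlayer, playThroughBridge and the tiny load/play/paused method set are hypothetical simplifications chosen for illustration, and the MediaPlayer class here only borrows the name of the real one.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for the platform-dependent interface.
// The real interface in WebKit has many more member functions.
class MediaPlayerPrivateInterface {
public:
    virtual ~MediaPlayerPrivateInterface() = default;
    virtual void load(const std::string& url) = 0;
    virtual void play() = 0;
    virtual bool paused() const = 0;
};

// Hypothetical backend. A real port would wrap QTKit, GStreamer, and so on.
class FakeBackendPlayer final : public MediaPlayerPrivateInterface {
public:
    void load(const std::string& url) override { m_url = url; }
    // This fake only starts playing if something was loaded.
    void play() override { m_paused = m_url.empty(); }
    bool paused() const override { return m_paused; }

private:
    std::string m_url;
    bool m_paused = true;
};

// Platform-independent bridge: the element layer talks only to this class,
// which forwards every call to whichever backend it was given.
class MediaPlayer {
public:
    explicit MediaPlayer(std::unique_ptr<MediaPlayerPrivateInterface> backend)
        : m_private(std::move(backend))
    {
        assert(m_private);
    }

    void load(const std::string& url) { m_private->load(url); }
    void play() { m_private->play(); }
    bool paused() const { return m_private->paused(); }

private:
    std::unique_ptr<MediaPlayerPrivateInterface> m_private;
};

// Stand-in for the HTML element layer driving the player.
// Returns true if the player ends up playing.
bool playThroughBridge(const std::string& url)
{
    MediaPlayer player(
        std::unique_ptr<MediaPlayerPrivateInterface>(new FakeBackendPlayer));
    player.load(url);
    player.play();
    return !player.paused();
}
```

The point of the split is that the element layer never needs to know which backend is in use; swapping FakeBackendPlayer for another implementation changes nothing in playThroughBridge except the object passed to the constructor.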

MSE inside WebKit, the big picture

MSE provides the media element with another source of media streams. Without MSE, the video element (MediaPlayerPrivate) handles http:// and https:// streams by itself; with MSE, it receives the media stream from JavaScript. Here I refer to an image copied from the MSE spec page:


Again, from the MSE point of view, there are two layers: one is the MediaSource API layer, the other is the HTMLMediaElement layer that implements MSE.

digraph mse {
"MediaSource API Layer" -> "Media Source Implementation Layer inside HTML5Video"
}

The MediaSource API layer is located at the following path:


The other layer is more closely related to MediaPlayerPrivate, and it again splits into two parts:

digraph msei {
"MediaSource Private Interface Definition" -> "MediaSource Private Implementation"
}

Inside WebKit, the first part consists of three files:

<WebkitSrc>/Source/WebCore/platform/graphics/MediaSourcePrivate.h
<WebkitSrc>/Source/WebCore/platform/graphics/SourceBufferPrivate.h
<WebkitSrc>/Source/WebCore/platform/graphics/SourceBufferPrivateClient.h

The first two are for MediaPlayerPrivate to implement; the last one is for MediaPlayerPrivate to call, and it is implemented in the MediaSource API layer. The Interface Classes section gives more detail on these three files.
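The direction of each interface (who implements it, who calls it) can be sketched in C++ as follows. This is only an illustration under stated assumptions: the three interface classes borrow the names of the WebKit headers, but every member function shown (didReceiveData, setClient, append, addSourceBuffer, markEndOfStream, ended) is a simplified guess for this sketch and not WebKit's actual signature, and the Fake* classes, SourceBufferSketch and appendTwoChunks are entirely hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Implemented by the MediaSource API layer, called by the platform layer.
class SourceBufferPrivateClient {
public:
    virtual ~SourceBufferPrivateClient() = default;
    virtual void didReceiveData(std::size_t byteCount) = 0; // hypothetical callback
};

// Implemented by the platform layer: receives the appended media data.
class SourceBufferPrivate {
public:
    virtual ~SourceBufferPrivate() = default;
    virtual void setClient(SourceBufferPrivateClient*) = 0;
    virtual void append(const std::vector<std::uint8_t>& data) = 0;
};

// Implemented by the platform layer: owns buffers and stream-level state.
class MediaSourcePrivate {
public:
    virtual ~MediaSourcePrivate() = default;
    virtual SourceBufferPrivate& addSourceBuffer() = 0;
    virtual void markEndOfStream() = 0;
    virtual bool ended() const = 0;
};

// Hypothetical platform implementations.
class FakeSourceBufferPrivate final : public SourceBufferPrivate {
public:
    void setClient(SourceBufferPrivateClient* client) override { m_client = client; }
    void append(const std::vector<std::uint8_t>& data) override
    {
        // A real backend would parse and queue the data; this one only reports it.
        if (m_client)
            m_client->didReceiveData(data.size());
    }

private:
    SourceBufferPrivateClient* m_client = nullptr;
};

class FakeMediaSourcePrivate final : public MediaSourcePrivate {
public:
    SourceBufferPrivate& addSourceBuffer() override
    {
        m_buffers.push_back(
            std::unique_ptr<FakeSourceBufferPrivate>(new FakeSourceBufferPrivate));
        return *m_buffers.back();
    }
    void markEndOfStream() override { m_ended = true; }
    bool ended() const override { return m_ended; }

private:
    std::vector<std::unique_ptr<FakeSourceBufferPrivate>> m_buffers;
    bool m_ended = false;
};

// Hypothetical API-layer object: pushes data down, gets called back up.
class SourceBufferSketch final : public SourceBufferPrivateClient {
public:
    explicit SourceBufferSketch(SourceBufferPrivate& priv)
        : m_private(priv)
    {
        m_private.setClient(this);
    }
    ~SourceBufferSketch() override { m_private.setClient(nullptr); }

    void appendBuffer(const std::vector<std::uint8_t>& data) { m_private.append(data); }
    std::size_t bytesAcknowledged() const { return m_bytes; }

private:
    void didReceiveData(std::size_t byteCount) override { m_bytes += byteCount; }

    SourceBufferPrivate& m_private;
    std::size_t m_bytes = 0;
};

// Appends a 3-byte and a 5-byte chunk, then returns the bytes acknowledged.
std::size_t appendTwoChunks()
{
    FakeMediaSourcePrivate mediaSource;
    SourceBufferSketch buffer(mediaSource.addSourceBuffer());
    buffer.appendBuffer(std::vector<std::uint8_t>(3, 0));
    buffer.appendBuffer(std::vector<std::uint8_t>(5, 0));
    mediaSource.markEndOfStream();
    assert(mediaSource.ended());
    return buffer.bytesAcknowledged();
}
```

Data flows downward through SourceBufferPrivate::append, while notifications flow upward through SourceBufferPrivateClient, which is why the client interface is the only one of the three implemented outside the platform layer.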

The MediaSource private implementation part lives inside the MediaPlayerPrivate implementation, and thus again varies from platform to platform. For GStreamer, the related files are:

<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/MediaSourceGStreamer.h
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/MediaSourceGStreamer.cpp
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/SourceBufferPrivateGStreamer.h
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/SourceBufferPrivateGStreamer.cpp
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/WebKitMediaSourceGStreamer.h
<WebkitSrc>/Source/WebCore/platform/graphics/gstreamer/WebKitMediaSourceGStreamer.cpp

Interface Classes

These classes are the keys to understanding the architecture.



An interface class that a MediaPlayerPrivate implementation should also implement if it supports MSE.

Function and Duty:

Controls some state transitions of MediaPlayerPrivate when related MediaSource events occur.

Related File(s):


Known subclass:




An interface class that a MediaPlayerPrivate implementation should also implement if it supports MSE.

Function and Duty:

Sends buffers to MediaPlayerPrivate, and handles some other related state transitions.



Known subclass:




An interface class that the MediaSource layer should implement to interact with SourceBufferPrivate.

Function and Duty:

Controls some state transitions of the MediaSource layer when events from MediaPlayerPrivate (SourceBufferPrivate) occur.



Known subclass:


The end

Hope it helps.