The Secret of Getting Video on Demand Right First Time
by Maria Ingold (CTO of the Disney/Sony joint venture FilmFlex Movies at the time of publication)
IBC 2010 Technical Paper (Peer-reviewed by the Technical Papers Committee). As it is no longer in publication, it is now available here.
ABSTRACT
FilmFlex Movies, the UK’s leading Video on Demand (VOD) film provider, delivers content to cable and now to our white-label service for broadband. We had to work out how to create top-quality VOD services on a tight budget and with fifteen staff. We had to ingest content once and get it right at the start, so that we could distribute to multiple platforms in the future.
Given the complexity of such a project, the paper breaks down the steps needed to achieve a flexible VOD workflow: transitioning from an existing model, receiving content from multiple suppliers, selecting SD and HD high-resolution master formats, asset storage and content management, reviewing encode and transcode quality, and delivering to multiple channels. It also provides tips on what to consider at each stage in order to avoid the “gotchas”.
We deliver content to both cable and broadband platforms following this process.
INTRODUCTION
Aggregating films from over thirty-five of the leading movie studios and content suppliers is hard enough. Creating custom promos and reams of bespoke metadata for hundreds of films a month is no mean feat either. Tie in encoding thousands of films. And transcoding tens of thousands of standard definition (SD) and high definition (HD) variants. Package all that up and simultaneously deliver it to several of the UK’s largest VOD platforms on both television and broadband. It was never going to be easy. Try doing it with fifteen staff. Imagine making it happen and becoming the UK’s leading VOD film provider.
From this experience, this paper discusses all the steps necessary to construct a full VOD workflow. While it can’t detail every case, it flags the pitfalls to be aware of during planning and provides the foundational information required. The following diagram represents the areas that will be discussed.
RECEIVING CONTENT FROM MULTIPLE SUPPLIERS
How the content needs to be received is driven in part by the way it will be used. However, suppliers deliver content in many different formats. The quality, video format, audio format, delivery method and packaging may all vary.
Content Formats
Content is typically provided as Standard Definition (SD) or High Definition (HD), but could be provided in a 3D-capable format or, in the future, as Super Hi-Vision (SHV) as seen at IBC 2008. SD quality is similar to a DVD, while HD is like a Blu-ray Disc (BD). Most VOD content is SD, although some HD content is now also available.
Tape-based Delivery
As there are fewer delivery options with tape it can be a simpler format to consider. Digital Betacam (Digibeta) is the standard for SD content, while HDCAM or HDCAM SR is the standard for HD. Even so, there are variations.
Long pieces of content can arrive on multiple tapes. The encoding system must be able to seamlessly join multiple tapes into one piece of content with correctly aligned time-codes.
The way audio is stored on tape depends on the type of audio and its intended use. Stereo audio takes up two channels. Music and effects (M & E) or a second stereo language version can also be stored as two channels. Dolby 5.1 surround sound takes up two channels as Dolby-E 5.1 or eight channels as uncompressed 16-bit audio. Dolby-E is a professional audio format. While it can’t be listened to directly, it can be the most efficient way of transferring and encoding 5.1 audio. Both Digibeta and HDCAM only store four audio channels, while HDCAM SR can store up to twelve. As an added complication, 5.1 audio is sometimes provided separately on a DA88 tape and must be muxed in and synchronised to the video.
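As a sketch, the tape channel capacities described above can be captured as a simple lookup table. The function and track names are illustrative assumptions, not a supplier specification:

```python
# Channel capacities of the tape formats described above.
TAPE_AUDIO_CHANNELS = {
    "Digibeta": 4,   # SD master: e.g. stereo plus M&E or Dolby-E 5.1
    "HDCAM": 4,      # HD master: same four-channel constraint
    "HDCAM SR": 12,  # can carry stereo plus uncompressed 16-bit 5.1
}

def fits_on_tape(tape: str, layout: dict) -> bool:
    """Check whether a requested audio layout fits the tape's channel count.

    `layout` maps track names to channel counts, e.g.
    {"stereo": 2, "dolby_e_5_1": 2} or {"stereo": 2, "pcm_5_1": 8}.
    """
    needed = sum(layout.values())
    return needed <= TAPE_AUDIO_CHANNELS[tape]

# Stereo plus Dolby-E 5.1 fits on a four-channel Digibeta...
print(fits_on_tape("Digibeta", {"stereo": 2, "dolby_e_5_1": 2}))  # True
# ...but stereo plus eight channels of uncompressed 5.1 needs HDCAM SR.
print(fits_on_tape("Digibeta", {"stereo": 2, "pcm_5_1": 8}))      # False
print(fits_on_tape("HDCAM SR", {"stereo": 2, "pcm_5_1": 8}))      # True
```

This is why a 5.1 delivery that arrives as uncompressed audio rather than Dolby-E may force a different tape format, or a separate DA88 tape.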
File-based Delivery
File-based delivery is entirely different. The benefit of file-based delivery is that it is volcano- and transportation-strike-proof. However, there is no industry standard for either SD or HD. There are certainly a number of standards available to use, but most suppliers have interpreted the standards differently.
So even if someone uses the Material eXchange Format (MXF), it can be used with different operational patterns (OP-Atom, OP-1a, OP-1b, etc) and different wrappers (e.g. QuickTime referenced, QuickTime self-contained, etc).
High-resolution master files, equivalent in quality to Digibeta or HDCAM, can be provided. Alternatively, it may be more appropriate to receive content that has already been transcoded to the required derivative format.
Typically each supplier is individually “onboarded”, a method that allows testing receipt of one or more delivery formats until the process and format is agreed.
Content Security
Tape-based content is logged and stored in a library. Suppliers may want tapes returned or destroyed. Depending on the process, multiple groups may need to access the tapes – testing, encoding, editing – so tapes should be checked in and out of the library, in much the same way as books.
Similarly, access to master file-based content needs to be logged.
Facilities with access to movie content usually need to be accredited by FACT (the Federation Against Copyright Theft) and proven to be secure, especially at points where the content is stored unencrypted.
Planning
With any service, no matter how many content suppliers, the best way to determine the content format and delivery model required is to understand how and where the content will be used throughout the workflow and in what timescales.
Then put together a standard content source specification to be provided to the suppliers with a plan for how to deal with variations. There will always be variations.
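One way to make the standard content source specification actionable is to check each incoming delivery against it, so variations are flagged on receipt. A minimal sketch, in which the spec values and field names are illustrative assumptions:

```python
# Illustrative standard content source specification, based on the
# tape formats discussed above; not a real supplier agreement.
SOURCE_SPEC = {
    "sd": {"format": "Digibeta", "aspect_ratios": {"4:3", "16:9"}},
    "hd": {"format": "HDCAM SR", "aspect_ratios": {"16:9"}},
}

def check_delivery(delivery: dict) -> list:
    """Return a list of variations from the spec (empty list = compliant)."""
    issues = []
    spec = SOURCE_SPEC.get(delivery.get("definition"))
    if spec is None:
        return [f"unknown definition: {delivery.get('definition')}"]
    if delivery.get("format") != spec["format"]:
        issues.append(f"non-standard format: {delivery.get('format')}")
    if delivery.get("aspect_ratio") not in spec["aspect_ratios"]:
        issues.append(f"non-standard aspect ratio: {delivery.get('aspect_ratio')}")
    return issues

# A compliant SD delivery produces no issues...
print(check_delivery({"definition": "sd", "format": "Digibeta", "aspect_ratio": "16:9"}))  # []
# ...while an HD delivery on the wrong tape format is flagged for the onboarding plan.
print(check_delivery({"definition": "hd", "format": "HDCAM", "aspect_ratio": "16:9"}))     # ['non-standard format: HDCAM']
```

The point is not the specific rules but that every variation is detected and routed through the agreed plan rather than discovered downstream.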
GETTING CONTENT INTO A USABLE FORMAT
The key to a cost-effective, simple process without variations is to get the content as quickly as possible into your mezzanine, or working format. Once content is in a standard mezzanine format, the workflow becomes the same.
Encoding
Encoding is the process of getting content from a tape into a file-based format.
Once the tape has passed testing, it is encoded into the file-based format. Typically this is the mezzanine format, but it could be an intermediate format or even a final format.
Ingesting
Once content is in its file-based mezzanine or final file format, it is normally ingested into a Media Asset Management (MAM) system. Ingesting into a MAM makes it easier to find the file, record details that relate to the file and to use that file in the future.
The file is usually ingested along with a related set of technical metadata. This can include information such as format, resolution, aspect ratio, video compressor-decompressor (codec) and audio codec.
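The technical metadata ingested alongside the file can be modelled as a simple record. This is a sketch only; the field names and example values are assumptions, not any particular MAM’s schema:

```python
from dataclasses import dataclass

# Illustrative record of the technical metadata typically recorded
# alongside an ingested file, as described above.
@dataclass
class TechnicalMetadata:
    format: str        # e.g. "MXF OP-1a"
    resolution: str    # e.g. "720x576"
    aspect_ratio: str  # e.g. "16:9"
    video_codec: str   # video compressor-decompressor
    audio_codec: str

master = TechnicalMetadata(
    format="MXF OP-1a",
    resolution="720x576",
    aspect_ratio="16:9",
    video_codec="IMX 50",
    audio_codec="PCM 16-bit",
)
print(master.video_codec)  # IMX 50
```

Recording these details at ingest is what later lets automated workflows select the right source for each transcode without re-inspecting the file.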
Once the file is in the MAM it can be connected to automated or manual workflows to allow transcoding and onward delivery.
Transcoding
Transcoding is the process of taking one file-based format and transforming it into another. Typically this involves taking a high-resolution master mezzanine file and transcoding it down to the specific derivative format or codec required for a delivery platform. A master file can be transcoded simultaneously to different derivative formats, or it can be transcoded at different times depending on new requirements. How this works in practice depends on how the transcoding system is designed.
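The fan-out from one master to several derivatives can be sketched as follows. The profile names, bit rates and `run_transcode` stand-in are hypothetical, not a real transcoder’s API:

```python
# Illustrative platform profiles; the codecs and bit rates are
# assumptions for the sketch, not platform requirements.
TRANSCODE_PROFILES = {
    "cable_sd": {"codec": "MPEG-2", "bitrate_kbps": 3750},
    "broadband_sd": {"codec": "H.264", "bitrate_kbps": 1500},
    "broadband_hd": {"codec": "H.264", "bitrate_kbps": 8000},
}

def run_transcode(master_path: str, profile: dict) -> str:
    """Stand-in for a real transcoder call; returns the derivative path."""
    return f"{master_path}.{profile['codec'].lower()}_{profile['bitrate_kbps']}"

def transcode_all(master_path: str, platforms: list) -> dict:
    """Produce one derivative per requested platform from a single master."""
    return {p: run_transcode(master_path, TRANSCODE_PROFILES[p]) for p in platforms}

derivatives = transcode_all("masters/film_1234", ["cable_sd", "broadband_sd"])
print(derivatives["cable_sd"])  # masters/film_1234.mpeg-2_3750
```

Because the master is retained, a new platform later simply means adding a profile and re-running against the existing mezzanine file, rather than re-encoding from tape.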
Selecting Encoding and Transcoding Formats
If the content is only ever delivered to one delivery channel, then it may be encoded to its final derivative format. However, if the content is to be delivered to multiple delivery channels then it is encoded to a high-resolution master file.
The type of high-resolution master file depends on its use. If the file is simply used for transcoding, then the choice of master file can be more flexible. If, however, it will be used for file-based editing of trailers or promotional material, in addition to transcoding derivatives, then it must be an Intra-coded frame (I-frame) based master file format.
An I-frame is essentially a still image. A picture. In this case, the entire master file is composed of still images. It makes the file size larger, but allows edit tools such as Final Cut Pro (FCP) and Avid to edit at a frame-accurate level. As a note, Final Cut and Avid each load different types of high-resolution master formats more efficiently, so which edit tool and version you select will influence the master file format you choose.
The quality of the high-resolution master file will affect the quality of the derivatives. There may be a trade-off between quality and storage size of the file. The higher the bit rate and the more I-frames, the bigger the file, but generally you will lose less in the encode and it will be better quality.
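The storage side of this trade-off can be estimated directly: file size is roughly bit rate multiplied by duration. A quick calculation, using illustrative bit rates rather than recommended ones:

```python
def file_size_gb(bitrate_mbps: float, duration_minutes: float) -> float:
    """Approximate file size in gigabytes from bit rate and running time."""
    bits = bitrate_mbps * 1_000_000 * duration_minutes * 60
    return bits / 8 / 1_000_000_000  # bits -> bytes -> gigabytes

# A two-hour feature at an illustrative 50 Mbps I-frame mezzanine rate...
print(round(file_size_gb(50, 120), 1))  # 45.0 (GB)
# ...versus a 4 Mbps SD derivative of the same film.
print(round(file_size_gb(4, 120), 1))   # 3.6 (GB)
```

Multiplied across thousands of titles, the gap between these two figures is what drives the storage-policy decisions discussed later.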
There are a number of hardware and software encoders available. Test a range of short and full-length pieces of content in the environments you intend to use them. Content should include all possible formats (e.g. SD 4:3, SD 16:9, HD 720p, HD 1080i) and a range of types of content (e.g. high-budget action, faces and skin-tones, arthouse, fast scene changes, pans and scans).
Also test how the format created by the encoder works with the proposed transcoders. Different combinations produce different levels of quality. And different codecs respond better to different types of movement and colours. Some codecs give darker blacks and brighter colours while others respond better to fast scene changes than pans and scans.
Testing
Typically the content is given a Quality Check (QC) prior to encoding. A Quality Check is part of a Quality Assurance (QA) process. QA should include a manual check in addition to any automated process.
A manual or automated check may be used to ensure there are no audio or video glitches. But a manual check of the source content is required to ascertain that it is indeed the right piece of content and the correct version. If content comes from the USA, for example, it may not be the classification version for release elsewhere. What is accepted in terms of depicted sex or violence in one country is not always the same in another.
Encoded and transcoded variants may also go through a QC. At this stage the check is typically automated, but may include a three-point manual spot check. The person doing the three-point spot check looks at the beginning, middle and end and ensures the audio and video are present, correct, and in-sync. The key automated checks are that the audio and video levels are compliant, no audio or video glitches have been made during encoding or transcoding, and that any header or transport stream information is correct for the platform it is being delivered to.
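The derivative QC described above can be sketched as two pieces: the three spot-check positions for the manual review, and the key automated checks. The check functions are illustrative stand-ins, not a real QC tool’s API:

```python
def spot_check_points(duration_s: float) -> list:
    """Beginning, middle and end timecodes for the manual spot check."""
    return [0.0, duration_s / 2, duration_s]

def automated_qc(asset: dict) -> list:
    """Return a list of failed automated checks (empty list = pass)."""
    failures = []
    if not asset.get("levels_compliant"):
        failures.append("audio/video levels out of spec")
    if asset.get("glitches", 0) > 0:
        failures.append("encode/transcode glitches detected")
    if not asset.get("header_valid"):
        failures.append("header/transport stream invalid for target platform")
    return failures

# A 90-minute derivative is spot-checked at the start, 45-minute mark and end.
print(spot_check_points(5400.0))  # [0.0, 2700.0, 5400.0]
print(automated_qc({"levels_compliant": True, "glitches": 0, "header_valid": True}))  # []
```

At each spot-check point the reviewer confirms audio and video are present, correct and in sync.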
Note that content can fail QA at any point. Depending on when it fails, the content will need to be re-ordered from the supplier, fixed, re-encoded or re-transcoded.
MANAGING STORED CONTENT
A MAM generally abstracts from its user how the file is stored, although it may provide the file’s current location. The method for accessing the underlying file will vary.
Depending on the solution, the cost of ingesting the content into the MAM may include some of the storage cost. Additionally, content suppliers will mandate secure storage and the ability to log and report digital and physical access to content over a period of time.
The type of storage required depends on how and when the content will be used. The volume of historical content, projected growth, and amount of content that needs to be immediately accessible will determine storage policies.
Immediately Accessible Storage
Content that is currently being used or is required regularly in the short-term is stored online, on spinning disk. Access to it is immediate. The amount of disk space required is a factor of how much content is required quickly over what period of time. This is the most expensive, but most accessible form of storage.
Medium-term Storage
After content is moved from online spinning disk, but before it is archived, it is often nearchived. A nearchive can be a robot-driven tape store. The purpose is to allow reasonably quick file access but at a reduced cost.
Long-term Storage
Online storage should automatically be archived on ingest for disaster recovery purposes. Similarly, storage policies will determine when nearchive storage eventually moves to archive. Archive often takes the form of tapes that are stored offsite in a secure, environment-controlled vault as a permanent backup.
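A tiering policy across the three storage levels described above might look like the following sketch. The day thresholds are assumptions for illustration, not recommendations; the right values depend on your content volumes and access patterns:

```python
# Illustrative policy mapping content usage to the three tiers discussed:
# online (spinning disk), nearchive (robot-driven tape), archive (offsite vault).
# Note that an archive copy is also made at ingest for disaster recovery.
def storage_tier(days_since_last_use: int) -> str:
    if days_since_last_use <= 30:    # in active or short-term regular use
        return "online"
    if days_since_last_use <= 365:   # still needed reasonably quickly
        return "nearchive"
    return "archive"                 # long-term offsite backup only

print(storage_tier(7))     # online
print(storage_tier(90))    # nearchive
print(storage_tier(1000))  # archive
```

Codifying the policy makes the online/nearchive/archive movement automatable rather than a manual librarian task.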
CONTENT MANAGING RELATED CONTENT
A Content Management System (CMS) allows disparate types of content to be stored and related to each other. A CMS stores text-based metadata about a product, e.g. a film or TV episode, in relation to the images, promotional material, high-resolution files and transcoded derivatives for its various platforms, which may be stored in the MAM or in the CMS.
More than just entering metadata into a database, the CMS generally also provides a workflow for reviewing the metadata and content, and publishing and updating related subsets to one or more platforms.
Certain users may only have access to certain features and workflows. The CMS may also need to log movement of data.
DELIVERING CONTENT TO MULTIPLE CHANNELS
To relate data, the concept of a package is required. A package is everything that is bundled together to be published to a specific platform. It usually includes metadata in an XML format plus referenced file-based content. All of which needs to be in the format expected by the delivery location.
There are many different products available to guarantee file-based delivery, or one can be built in-house using Secure File Transfer Protocol (SFTP). Exact requirements will depend on the security specifications of the suppliers, combined with where the content is coming from and going to, how fast it needs to arrive and how its arrival is confirmed. Unless the package is a physical bundle, the order in which items will be published may also influence a package’s delivery.
A handshake is essential. This should include a successful package arrival notification or a set of useful error messages should delivery fail. In a failure, all or part of the package may need to be republished.
In delivering to multiple channels the format, package, and delivery conduit may all vary, but the process is the same. The CMS publication process will need to export the correct metadata XML, select the appropriate transcoded derivatives and package them onto the relevant delivery route.
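A package as described above, metadata XML plus referenced file-based content, can be sketched like this. The element names and fields are hypothetical, not any platform’s actual schema:

```python
import xml.etree.ElementTree as ET

# Illustrative package build: exported metadata XML referencing the
# platform's transcoded derivatives, ready for the delivery route.
def build_package(product: dict, derivatives: dict, platform: str) -> bytes:
    pkg = ET.Element("package", platform=platform)
    meta = ET.SubElement(pkg, "metadata")
    ET.SubElement(meta, "title").text = product["title"]
    ET.SubElement(meta, "price").text = product["price"]
    files = ET.SubElement(pkg, "files")
    for role, path in derivatives.items():
        ET.SubElement(files, "file", role=role).text = path
    return ET.tostring(pkg)

xml_bytes = build_package(
    {"title": "Example Film", "price": "3.49"},
    {"movie": "derivatives/film_1234_cable_sd.mpg"},
    platform="cable",
)
print(xml_bytes.decode())
```

The same publication step, with a different metadata export, derivative selection and delivery route per platform, is what keeps multi-channel delivery a single process.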
LAUNCH CONSIDERATIONS
Testing
Testing includes Functional Verification Testing (FVT), load and stress testing, User Acceptance Testing (UAT), regression testing for bug fixes and integration testing. Ongoing post-launch regression testing usually takes place in a staging environment to reduce the risks of introducing errors onto a live system.
UAT is key to ensuring the design has met the users’ requirements and that the system actually delivers what is expected. To minimise surprises in UAT, talk through fully developed designs and flows with the end-users of the system prior to development. Rather than taking a big-bang approach, demonstrate pieces of the system to the end-users as they are developed and gather feedback as the project evolves.
Training
No system is complete without training the people who will use it. Training precedes UAT.
Beta
Launch in a public beta without heavy advertising. That will help identify further problems in a wider test environment prior to a large-scale commercial launch.
ONGOING CONSIDERATIONS
Reporting
Reporting is only possible on items that are both trackable and logged. Ongoing reports track trends and indicate how the service should adapt over time.
Monitoring
Monitoring checks to see that the overall service is running and that the content is available in all the locations in which it should be available. It can also check that all the details, e.g. price, surrounding that content are accurate. Monitoring may be a mix of automated and manual checks.
Fault Resolution
Things will go wrong. Service Level Agreements (SLAs) will need to be in place with all parties. A fault resolution process typically includes priority levels, a reporting procedure, time to an initial response, time to workaround and time to fix.
TYING IT TOGETHER
Plan. Plan. Plan. Think through all the workflows from receiving content to delivering it. Consider what can go wrong and what flow will be required to right it. To determine if there is enough time at all junctures, work timelines forward from receiving content, and backwards from delivering it. Start at an overview level and then work into the detail.
TRANSITIONING FROM AN EXISTING MODEL
Speak to everyone about what does and doesn’t work in the current process. The people who perform the roles are best placed to know, and contributing to the changes helps them buy into them. Document in detail how the process works now and how it will work. Demonstrate at a high level so everyone understands what in the workflow is changing and what is staying the same.
CONCLUSIONS
Following the preceding descriptions provides the foundation on which to build a VOD system.
The key elements in putting together a project of this scope are understanding the requirements, identifying how the process works now and will work in the future, documenting and designing the process, and then iterating testing and user feedback.
ACKNOWLEDGEMENTS
The author would like to thank her colleagues at FilmFlex Movies and partners Ascent Media and Ioko for their contributions to these projects.