What until recently was a fraught quest to achieve low-latency, TV-caliber delivery of premium video content over the Internet has morphed into the less frustrating but still daunting challenge of choosing from a basket of compelling options that have emerged to address the issue.
The need to improve performance amid surging real-time consumption of sports and other broadcast TV content over the Internet has prompted development of multiple open-source solutions competing alongside a bevy of proprietary systems. Everyone with a vested interest in which paths are taken has a compelling case to make for their preferred approach.
For the folks running streaming operations, tracking the constantly changing nuances and relative merits of an alphabet soup’s worth of open-source innovations like SRT, QUIC, WebRTC and CMAF is an especially irritating distraction from day-to-day execution of tasks at hand. But getting it right is fast becoming an existential challenge.
At a time when consumers have more service options than ever (over 200 OTT services in the U.S. alone, according to Parks Associates), they can easily go elsewhere if they don’t like what they see, and often they don’t. In a survey of over 10,000 U.S. households in Q3 2017, Parks found that the cumulative churn rate for OTT video services, not counting the relatively low-churn Netflix and Amazon services, topped 50 percent.
Start-up delays and buffering interruptions are no longer accepted by consumers as a given of the online experience when download speeds average better than 55 Mbps, as tabulated by the FCC in its latest annual broadband report. A study of 23 million video viewing sessions sponsored by CDN operator Akamai found that viewers start abandoning videos in significant numbers once startup time stretches beyond two seconds. For every second of delay past that threshold, the study found, six percent of the audience stops watching, which means a video with a five-second startup delay loses nearly a fifth of its audience.
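Taken at face value, the Akamai figures imply a simple linear abandonment model. The sketch below assumes only the numbers reported above; the function name and parameters are illustrative:

```python
def audience_lost(startup_delay_s, tolerance_s=2.0, loss_per_s=0.06):
    """Fraction of viewers who abandon, per the linear rule reported in
    the Akamai-sponsored study: ~6% per second beyond two seconds."""
    return max(0.0, startup_delay_s - tolerance_s) * loss_per_s

print(audience_lost(2.0))  # 0.0 -- within viewers' tolerance
print(audience_lost(5.0))  # ~0.18 -- nearly a fifth of the audience gone
```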
Excessive buffering produces similar results. Researchers presenting at the IEEE’s 2017 International Conference on Network Protocols said their review of the behavior of 625,626 users in nine U.S. cities found that average daily viewing time decreased from more than 210 minutes for users who experience little or no buffering to less than 30 minutes for users experiencing 0.5 buffering events per minute. Separately, statistics compiled by Conviva from close to two billion viewing sessions in North America showed that buffering caused at least one stall during playback in eight percent of viewing sessions.
Intolerance for poor performance is exacerbated by expectations that come with viewing OTT video on big TV screens, especially when live sports or other linear content is involved. eMarketer says 55 percent of the U.S. population currently views Internet video on connected TVs at least once a month. According to the latest figures produced by Conviva, TV viewing actually accounted for 51 percent of all the time people spent watching streamed video in Q2.
The share of OTT consumption going to live viewing is rivaling and will soon surpass traditional TV viewing. Over 50 percent of 500 executives worldwide who responded to a 2017 survey conducted by Unisphere Research and Level 3 Communications said they expected the tipping point would occur by 2020, and 70 percent saw it happening by 2022.
As these trend lines suggest, there’s a big opportunity for distributors who get it right. When it comes to live event streaming, online users want to view what’s happening in sync with the action displayed on TV screens over legacy outlets, especially when engaging in group viewing through social media or when the viewer is on a second screen in proximity to a TV set showing the same event.
This is doable. But getting to where end-to-end latency from camera capture to playback is in the few-second range of a broadcast feed across every viewing session requires the ability to work in an environment where ecosystem partners, whether they be content producers, cloud service platforms or CDNs, are applying a disparate range of low-latency solutions to transport content over first, middle and last miles.
The UDP Juggernaut
The latest open-source approach to latency reduction is called Secure Reliable Transport (SRT), a UDP (User Datagram Protocol)-based platform that has rapidly gained market traction over the past year since its inventor, Haivision, made it available license-free through a supporting organization called the SRT Alliance. With Haivision and Wowza as co-founders, the alliance has drawn 127 companies to its ranks, including a handful of prominent players like Bitmovin, Brightcove, Canal Cable, Comcast Technology Services, Cinegy, Deluxe, Ericsson, Harmonic and Limelight as well as dozens of smaller vendors.
At a plug fest in May that utilized test facilities in Chicago, Montreal, Denver and Germany, 15 alliance members successfully completed over 50 tests validating SRT streams between cameras, encoders, decoders, gateways, multi-viewers and players, according to Sylvio Jelovcich, vice president of global alliances at Haivision. “It’s exciting to see the breadth of new SRT-ready solutions that are continuously being launched in the streaming and broadcast community,” Jelovcich says.
The SRT initiative encroaches on turf already occupied by Quick UDP Internet Connections (QUIC), the Google-invented protocol now in final draft stages as an IETF standard targeted for completion by year’s end. SRT and QUIC are designed to overcome the packet-loss and sequencing issues of UDP while eliminating the buffering delays common to TCP (Transmission Control Protocol). Both also provide for secure transport, QUIC by way of TLS 1.3, the latest version of the Transport Layer Security protocol, and SRT through AES encryption.
Through algorithmic adjustments at the sending and receiving ends, QUIC implements UDP as a superior transport alternative that encapsulates HTTP 1.1- or HTTP/2-formatted streams in the flow. This enables seamless conversion of QUIC-delivered streams to HTTP on receiving devices, which means QUIC can transport the multiple streams used with ABR. QUIC, in fact, reverts to HTTP over TCP as a fallback to mitigate the rare instances where streams destined for multiple users run into blocked UDP traffic.
QUIC employs a number of techniques to minimize blocking, such as pacing packet generation based on persistent bandwidth estimates for the path taken by each stream and proactively retransmitting the most important packets, such as those supporting error correction or initiation of encryption.
QUIC also lowers latency by reducing the number of roundtrips required to set up a connection and by avoiding the need to set up separate connections with secondary sources on a Web page once the primary connection has been made. Multiple steps associated with handshake, encryption setup and initial data requests are consolidated into the initial setup, while compression and multiplexing processes like those adopted with HTTP/2 avoid separate setup for accessing the sub-sources on a page.
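The savings from consolidating the handshake come down to simple round-trip arithmetic. The counts below reflect the general design described above; the 50 ms round-trip time is a hypothetical figure chosen purely for illustration:

```python
# Back-of-envelope comparison of connection-setup cost before the first
# request can be sent, assuming a hypothetical 50 ms network round trip.
RTT_MS = 50

setups = {
    "TCP + TLS 1.2 (separate handshakes)": 3,  # SYN/ACK plus two TLS round trips
    "TCP + TLS 1.3":                       2,  # SYN/ACK plus one TLS round trip
    "QUIC, first connection":              1,  # transport and crypto combined
    "QUIC, resumed connection (0-RTT)":    0,  # data rides in the first packet
}

for name, round_trips in setups.items():
    print(f"{name}: {round_trips * RTT_MS} ms of setup delay")
```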
SRT employs variations on many of these techniques, including quick session setup, bandwidth estimation and handling of packet loss recovery through low-latency retransmission techniques, which it tempers by dropping packets when congestion is high. But rather than relying on HTTP and ABR to alter bitrates to accommodate variations in bandwidth availability, SRT analyzes network conditions in real time and filters out the impact of jitter, noise and congestion.
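SRT’s “retransmit when there’s time, drop when there isn’t” behavior can be sketched in a few lines. This is a toy model, not SRT’s actual implementation; the function name, the 120 ms latency window and the timing values are all illustrative:

```python
def handle_loss(send_time_ms, now_ms, rtt_ms, latency_window_ms=120):
    """Toy decision rule for an SRT-style sender: retransmit a lost
    packet only if the retransmission can still arrive within the
    configured latency window; otherwise drop it so that playback
    is not stalled waiting for data that would arrive too late."""
    deadline = send_time_ms + latency_window_ms
    arrival_if_resent = now_ms + rtt_ms
    return "retransmit" if arrival_if_resent <= deadline else "drop"

print(handle_loss(send_time_ms=0, now_ms=40, rtt_ms=30))   # retransmit: arrives at 70 ms, inside the window
print(handle_loss(send_time_ms=0, now_ms=100, rtt_ms=30))  # drop: would arrive at 130 ms, past the deadline
```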
As is typically the case with battling protocols that accomplish more or less the same thing, one can get lost in the nuances informing the debate over which is best and which has the most support. But one thing seems certain: enhanced UDP is destined to supplant TCP as the mechanism for streaming latency-sensitive video content.
Current thinking about UDP brings the evolution of streaming transport full circle. The first widely deployed media streaming platform, Real Time Streaming Protocol (RTSP), which RealNetworks helped develop, was UDP-based, followed by the first HTTP-based streaming modes, which were also initially based on UDP. TCP supplanted UDP in both environments and was the dominant streaming transport foundation when HTTP-based ABR took hold.
Gains for SRT & QUIC
Because SRT represents a radical departure from the norms of streaming over HTTP Live Streaming (HLS), MPEG-DASH and other ABR modes, it faces an uphill battle in mid- and last-mile applications. But as an open-source option SRT is rapidly building on the successes of its proprietary precursor as a first-mile transport solution.
Haivision CMO Peter Maag made this point last year in an interview with StreamingMedia Magazine when he commented, “SRT is currently ideal for contribution and distribution of performance streams.” The goal, he added, is “to extend SRT to address broad-scale OTT delivery challenges.”
Meanwhile, QUIC is gaining ever wider traction in end-to-end commercial deployments worldwide. Google got the ball rolling by putting the technology in play on the server side across all its Web properties and as a default mode in the Chrome browser. Anyone accessing a YouTube video on the Chrome browser will receive the stream over QUIC.
Use of the technology is spreading rapidly beyond Google’s own domains, with one percent of all websites worldwide now supporting it, according to Internet stat compiler W3Techs, up from one-half percent at the start of 2018.
Last year Akamai became the first CDN operator to announce support for QUIC, which is now part of the Media Acceleration and Efficiency platform. Akamai offers QUIC on access routes from the CDN edge to a load-dependent subset of end user devices running compatible client software, such as Chrome browsers.
“This can be applied to any video stream utilizing Akamai’s Media Acceleration and Efficiency platform, whether or not QUIC is used for ingress,” notes Will Law, chief architect for Akamai’s media division. ABR-streamed video on the Akamai CDN can be delivered from the edge locations over TCP or UDP using QUIC, depending on client support and edge server load, Law says.
Verizon Digital Media Services, the turnkey support platform now positioned in the carrier’s Oath portfolio, recently announced it, too, is supporting QUIC on its CDN, which operates 125 points of presence on six continents. VDMS customers can enable QUIC through a simple rules engine change that takes effect within minutes, at no additional cost, says VDMS CTO Frank Orozco. “Whether it’s streaming a widely watched sports event or accelerating a shopping cart transaction, every millisecond counts,” Orozco says.
VDMS customer Bluekiri, which operates Logitravel, a leading online travel agency in Europe, along with other websites, has seen outstanding results with QUIC, says Bluekiri CEO Iñaki Fuentes. “Since implementing QUIC, we have experienced a significant improvement in Web performance, with visitors now able to access information faster than ever,” Fuentes says.
One development pointing to a potential major expansion in the use of QUIC involves 3GPP, developer of the 5G mobile standard. As noted in a recent blog from Zaheduzzaman Sarker, a senior researcher at Ericsson Research, 3GPP, which created a service-based architecture (SBA) for the 5G packet core with HTTP/2 over TCP as the transmission mode, is now looking at QUIC over UDP.
With features like “no Head of Line (HoL) blocking, multiplexing, flow control, security, better congestion control,” QUIC is a potentially better and faster protocol than TCP for SBA, Sarker says. “[3GPP] are now studying the potential of replacing TCP with QUIC. The decision has yet to be made, but this illustrates a potential path for QUIC to become the transport of choice not only for the users’ traffic but for the control plane traffic as well.”
Upending the Contribution Status Quo
In the contribution (first-mile) domain that is SRT’s sweet spot, the question is whether the technology can measure up to the proprietary systems from the likes of IBM’s Aspera, Signiant, Adobe Send & Track and many others, which companies with deep enough pockets have relied on as an alternative to satellite or dedicated links for transmitting broadcast-quality video over the Internet. If so, the impact could be profound, bringing that level of UDP-based performance on a license-free platform to professional video producers of every description.
The recent experience of Fox Sports with use of Aspera’s FASPStream platform in live production for the FIFA World Cup from Moscow offers a dramatic look at what can be done, in this case in a long-distance production scenario. As described by Aspera vice president Richard Heitmann, Fox used the UDP-based Aspera transport platform in conjunction with Telestream’s Lightspeed Live capture and Vantage transcoding solutions to enable live broadcast production in Los Angeles studios 6,000 air miles from the event.
“Traditionally,” Heitmann notes, “producing a major event such as this one required a massive deployment of production staff and equipment across the globe, including either expensive live satellite feeds or dedicated terrestrial networks with heavy quality of service. Rather than following this conventional model, FOX Sports chose to take a more innovative approach that would allow it to fully make use of its LA production facility and staff.”
Delivering raw feeds in under ten seconds from every camera at 12 venues in Russia, Fox was able to edit the content at its L.A. studio for live broadcast coverage in the U.S. “The organization plans to use the joint Telestream-Aspera solution for the FIFA Women’s World Cup next year and for other major sporting events including NFL games,” he says.
It may well be that SRT has answered this challenge, judging from how the NFL is using the technology. As described at this year’s NAB Show by John Cave, vice president of information technology for the NFL, the application has to do with the league’s need to generate instant replays for referee reviews of plays in games outside the U.S.
The NFL sent SRT-transported feeds over the Internet from London to its review headquarters in New York with just 280 ms of latency between the live action in London and reception in New York. “They just took an SDI feed from a broadcast truck and put it in the SRT-enabled Makito X encoder from Haivision, and off they went,” says Mark John Hiemstra, a writer for Haivision, in a recent blog post.
Combining QUIC & CMAF
As distributors sort through the options they might use to reduce latency, one new development that lends strength to QUIC as the open-source choice for last-mile distribution is Common Media Application Format (CMAF), the ISO MPEG standard designed as a common container for encrypting and streaming via the two leading streaming protocols, Apple’s HLS (HTTP Live Streaming) and MPEG-DASH. As previously reported, the standard supports an optional approach to lowering latency that involves breaking up ABR fragments into smaller chunks that can be delivered sequentially to clients for playback without waiting for the whole fragment to load in the buffer.
Chunked transfer encoding is a streaming data transfer mechanism available in HTTP 1.1 and higher. While the CDN must support chaining chunks and storing CMAF fragments, any client equipped to support ABR at the processing speeds common to recent-vintage devices can work with the CMAF chunking process without additional software. By playing back chunks as they arrive, the player avoids the delay that would result from waiting for the full fragment to arrive.
Fragments are bounded by key frames, while chunks of equal length within the fragment each include what are known in ISO-BMFF parlance as a Movie Fragment Box (moof) and a Media Data Box (mdat). The player doesn’t request individual chunks. Instead, chunks are units of intermediate transfer of requested fragments, sent sequentially across all points in the delivery chain, relying on well-timed players to render them in proper sequence.
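The moof/mdat structure can be illustrated with a minimal ISO-BMFF box walker: each top-level box starts with a four-byte big-endian size and a four-byte type code, and a CMAF chunk is a (moof, mdat) pair. The sketch below builds two synthetic chunks with dummy payloads invented purely for illustration:

```python
import struct

def top_level_boxes(data: bytes):
    """Walk top-level ISO-BMFF boxes: a 4-byte big-endian size followed
    by a 4-byte type code, then the payload. Yields (type, size) pairs."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8:  # malformed box; stop rather than loop forever
            break
        yield box_type.decode("ascii"), size
        offset += size

def box(box_type: bytes, payload: bytes) -> bytes:
    """Assemble one box: total size (header + payload), type, payload."""
    return struct.pack(">I", 8 + len(payload)) + box_type + payload

# Two synthetic chunks; a real moof carries track-run metadata and a
# real mdat carries media samples, but dummy bytes suffice here.
fragment = (box(b"moof", b"\x00" * 8) + box(b"mdat", b"\x00" * 32)
            + box(b"moof", b"\x00" * 8) + box(b"mdat", b"\x00" * 16))

print([t for t, _ in top_level_boxes(fragment)])  # ['moof', 'mdat', 'moof', 'mdat']
```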
Users of QUIC as an HTTP-compatible protocol will be able to work in the emerging CMAF environment. While SRT, which as noted earlier is not HTTP compatible, supports segmenting, the mechanisms enabling the extremely low-latency capabilities of the approach taken with CMAF chunking require a level of CDN support for chaining chunks and storing fragments that isn’t part of the SRT roadmap.
CMAF, development of which was spearheaded in a rare collaboration between Apple and Microsoft, is in the early stages of adoption but appears destined for wide use. Notably, it is the anchor framework for the Consumer Technology Association’s Web Application Video Ecosystem (WAVE) Project, which seeks to make it easier for content owners and distributors to launch interoperable services across disparate streaming systems and multiple devices using MPEG’s Common Encryption (CENC), the World Wide Web Consortium’s Media Source and Encrypted Media Extensions (MSE/EME) and HTML5 in conjunction with CMAF.
“One goal of WAVE is to promote convergence around CMAF,” Law says. “When CMAF becomes a common container, it reduces the pool of content that a content distributor must prepare in order to achieve broad reach. WAVE is trying to make real-world use of CMAF as interoperable as possible.”
The use of chunk-encoded CMAF has produced end-to-end transmission metrics in the four-second range over commercially operating networks, including Akamai’s. While adoption of combined QUIC and CMAF delivery will be piecemeal, momentum in this direction appears likely to limit how far SRT goes beyond the contribution stage.