
ADVANCED CODING: TECHNOLOGY, SYSTEMS AND APPLICATIONS

Charles Cartwright, Jeremy Bennett, Giles Wilson
TANDBERG Television, UK

ABSTRACT

The success of MPEG-2 has been remarkable since the first commercial launches of direct to home satellite systems in the mid-1990s. Globally, the 800 satellite TV channels of 1994 have grown to over 9,000 today. A significant driver of this growth has been improved compression efficiency, and the need to reduce bit rate continues today. However, MPEG-2 can no longer maintain its year-on-year improvements, and a new range of audio/video compression systems is coming to the fore to pursue bit rate reductions even further. This paper describes how these new Advanced Video Coding (AVC) techniques will improve efficiency and points out some of the practical pitfalls. It briefly considers the system aspects of AVC solutions, including a model of a typical head-end. Finally, it considers a new range of applications enabled by video and audio bit rates far lower than those of current MPEG-2.

INTRODUCTION

MPEG-2 brought a revolution to broadcasting nearly ten years ago and it is almost time for another revolution. In the past ten years MPEG-2 encoding performance has improved significantly. Market leading MPEG-2 encoding companies have improved the efficiency of their encoders by between 50 and 75%, all within MPEG-2 limitations. Typical bit rates in a first generation MPEG-2 encoder were six to eight Mbit/s. Now two to three Mbit/s for each video component is typical in direct to home satellite systems (see figure 1). These improvements are beginning to tail off asymptotically as MPEG-2 reaches its limitations. The opportunity for significant improvements in MPEG-2 performance is now very limited. MPEG-2 solutions are likely to become denser, lower in power and cheaper, but they are not going to produce much lower bit rates.

It is fortunate that during the same period Moore's Law, which states that semiconductor/computer processing power increases by a factor of two every eighteen months, has progressed relentlessly. More processing power at a lower price has enabled a whole new generation of compression technology. Both set top boxes and head-end equipment are now capable of much more complex calculations at commercially acceptable prices than were possible at the time of MPEG-2 standardisation.

The two most significant algorithms/systems currently proposed are Windows Media 9 Series from Microsoft and MPEG-4 part 10, an open standards algorithm from the ITU/MPEG Joint Video Team. Both these systems draw heavily on the experience of MPEG-2 but add extra tools to improve the coding efficiency, particularly at low bit rates. It is not the intention of this paper to differentiate between the AVC solutions; other papers present some of the differentiating factors [1].

The next section looks at some of the issues surrounding AVC: why and how it improves on MPEG-2, for both audio and video. The following section considers some of the key system design issues, specifically the relevance of Digital Rights Management (DRM) in AVC systems. The final section reviews the applications that have driven the need for AVC and will eventually become the markets for AVC solutions.

[Figure 1 chart: "MPEG-2 Bitrate for Broadcast Quality Television". Vertical axis: bit rate (Mbit/s). Horizontal axis: year, from the mid-1990s to 2003. Annotations along the curve: First Broadcast MPEG-2 Encoder; Reflex(TM) Statistical Multiplexer Enhancements; Enhanced Motion Estimation; TTV Noise Reduction; Advanced Preprocessing.]

ADVANCED CODING TECHNIQUES

Improvements on MPEG-2

The MPEG-2 video standard [2] was limited by two factors in its original definition:

• The targeted bit rate for compressed video was around 2-15 Mbit/s (for main profile at main level). The specification does not contain any lower limit on bit rate, as this was not required in the definition of a compliant encoder. However, MPEG-2 does not scale down to lower bit rates very efficiently.

• Silicon for implementing MPEG-2 was limited to the technology of the day. In 1994 this meant an ASIC (used in decoder design) with a density of 120,000 gates per chip and a gate size of 0.5 to 1 µm. This compares with today's leading technology of 25,000,000 gates on an ASIC with a gate size of less than 0.1 µm. The techniques specified in MPEG-2 had to be limited to realistic implementations on the technology then available.
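The gate counts quoted above can be set against the eighteen-month doubling rule mentioned in the introduction. The following is a rough arithmetic sketch only: it assumes that "today" means 2003 (an assumption, based on the time axis of figure 1), and the function names are illustrative.

```python
import math

# Gate counts quoted in the text.
GATES_1994 = 120_000
GATES_TODAY = 25_000_000


def doublings(start: float, end: float) -> float:
    """Number of doublings needed to grow from start to end."""
    return math.log2(end / start)


def doubling_rule_factor(years: float, period_years: float = 1.5) -> float:
    """Growth factor if capacity doubles once every period_years."""
    return 2.0 ** (years / period_years)


growth = GATES_TODAY / GATES_1994            # roughly 208 times more gates
n_doublings = doublings(GATES_1994, GATES_TODAY)
# Assumption: "today" is 2003, nine years after 1994.
rule_factor = doubling_rule_factor(2003 - 1994)
```

Under these assumptions the quoted gate counts correspond to between seven and eight doublings, somewhat more than the six doublings that a strict eighteen-month rule would give over nine years.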

Figure 1: Bit rate over time for broadcast quality pictures

The major success of MPEG-2 was that it was a standard that spawned an entire industry. Most standards do not achieve that accolade. This was possible because the standard's authors ensured that it could be implemented in the products and silicon of the day. The authors were aware of many techniques but saw that it would not be possible to implement them at that time. Likewise, every technique in the standard had to be evaluated, agreed and documented to a degree of detail that would ensure design engineers were capable of producing viable products. It follows that the second success of the authors of MPEG-2 was in ensuring the standard did not endlessly evolve: instead they drew a line and produced a working standard.

Now, almost ten years later, some of the tools left out of MPEG-2 for reasons of practicality are possible in AVC. It is these new tools that are producing the next level of compression efficiency:

• AC/DC co-efficient prediction, where the macro-block co-efficients are predicted from those in adjacent macro-blocks and from some data within the macro-block itself. For smooth areas such as green grass in the background, where adjacent macro-blocks differ only slightly, this has a dramatic effect on the number of bits required to encode the macro-blocks. However, macro-block decoding has changed from the previously simple bit stream operation to include some computation for each co-efficient.

• Different variable length coding (VLC) tables for different material and bit rates, so the VLC module has become more efficient at the expense of the decoder having to hold more tables. This is commonly referred to as Context-based Adaptive VLC (CAVLC).

• A major new tool is Context-based Adaptive Binary Arithmetic Coding (CABAC), which replaces VLC (a form of first-order entropy coding) with arithmetic coding (second-order entropy coding). This improves the entropy coding and, because of its contextual adaptation, remains optimal whichever other tools are used.

• While MPEG-2 Part 2 has motion compensation, it was restricted to ½-pel bi-linear interpolation. MPEG-4 Part 10 allows motion vectors to an accuracy of ¼-pel and then uses bi-cubic interpolation. Bi-cubic interpolation produces a better match for the macro-block, which in turn reduces the energy in the error image and so lowers the number of bits required to encode it. However, bi-cubic interpolation requires many more operations and accumulators, leading to a step increase in the complexity of the motion compensation.
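The link between motion vector accuracy and the energy left in the error image can be illustrated with a one-dimensional toy example. This is a minimal sketch under stated assumptions: it uses only bi-linear half-sample interpolation (the MPEG-2 case described above), not the bi-cubic filter, and the sample values and function names are invented for illustration.

```python
def half_pel_row(ref):
    """Bi-linear half-sample positions: the average of each adjacent pair."""
    return [(a + b) / 2 for a, b in zip(ref, ref[1:])]


def residual_energy(cur, pred):
    """Sum of squared differences between the current row and its prediction."""
    return sum((c - p) ** 2 for c, p in zip(cur, pred))


# Hypothetical data: a smooth ramp in the reference picture, and the same
# ramp displaced by half a sample in the current picture.
ref = [10, 20, 30, 40, 50, 60]
cur = [15, 25, 35, 45, 55]

full_pel_pred = ref[:5]            # a full-sample candidate, off by half a sample
half_pel_pred = half_pel_row(ref)  # interpolated half-sample candidate

e_full = residual_energy(cur, full_pel_pred)
e_half = residual_energy(cur, half_pel_pred)
```

In this contrived case the half-sample prediction matches the current row exactly, so the residual energy falls from 125 to zero. Real pictures will not match perfectly, but the same mechanism is what reduces the bits needed for the error image.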

Another factor is that much was learnt from the weaknesses of MPEG-2, and these have been addressed in AVC systems. The best examples are:

• Blockiness is the most obvious visual effect of bit starvation in an MPEG-2 video encoder. An in-loop de-blocking filter, which is present in both the encoder and decoder, reduces the discontinuities at the block boundaries caused by different quality factors (Qp) being used for adjacent blocks. This reduces the blocking artefact that is present in low bit rate MPEG-2 encoding and, because the filter is inside the motion compensation loop, the encoder and decoder remain in synchronisation. This tool does have a significant effect on encoder and decoder complexity because of the number of block boundaries and the fact that the filter is in-loop, so it cannot be implemented as a separate module.

• The block size has been changed from 16x16 down to 4x4. It has been shown empirically that the reduced block size provides coding gain without a significant increase in complexity.

• Relatively large (sequence, picture, slice) headers in MPEG-2 represent a fixed overhead in the data stream. This overhead was not significant at 6 Mbit/s, but at a few hundred kbit/s it is noticeable. AVC implements headers more efficiently.
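The idea behind a de-blocking filter can be sketched in one dimension. The code below is an illustrative toy, not the filter defined by any of the standards discussed: the threshold, the filter weights and the function name are all assumptions chosen to keep the example short.

```python
def deblock_boundary(row, boundary, threshold=16):
    """Soften the step across one block boundary in a 1-D row of samples.

    row[boundary - 1] and row[boundary] are the samples on either side of
    the block edge. Steps at or above the (hypothetical) threshold are
    treated as real picture edges and left untouched.
    """
    p0, q0 = row[boundary - 1], row[boundary]
    step = q0 - p0
    out = list(row)
    if abs(step) >= threshold:
        return out
    # Move each edge sample a quarter of the step towards its neighbour.
    out[boundary - 1] = p0 + step / 4
    out[boundary] = q0 - step / 4
    return out


# Two flat blocks whose levels differ slightly, as coarse quantisation
# with different Qp values might produce.
blocky = [100] * 4 + [108] * 4
smoothed = deblock_boundary(blocky, boundary=4)

# A genuine edge in the picture is left alone.
edge = [100] * 4 + [200] * 4
kept = deblock_boundary(edge, boundary=4)
```

Because the same operation runs in both the encoder and the decoder before a picture is used as a reference, both sides predict from identical filtered pictures, which is what keeps them in synchronisation.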

Theoretical and Practical Results

It is widely suggested that improvements of 66%, and some have suggested 75%, are possible using AVC techniques compared with MPEG-2. The interpretation of this claim is that AVC requires only a third, or even a quarter, of the bandwidth to carry the same picture quality as MPEG-2. TANDBERG Television's own research on Windows Media 9 Video and MPEG-4 part 10 shows that such dramatic improvements are possible (see figure 2). However, when considering picture quality improvements against MPEG-2 it is necessary to remember that there is no fixed relationship; the efficiency gains depend upon certain factors:

• Not all MPEG-2 encoders perform the same. It is quite possible to choose circumstances where a 50% efficiency improvement can be achieved between two MPEG-2 encoders.

• MPEG-2 performance is particularly poor at very low bit rates, and the biggest differentials in performance are at those very low bit rates. It might be that AVC outperforms MPEG-2 at these very low data rates but, from the broadcaster's view, neither is acceptable for transmission.

• The improvements in efficiency are sensitive to video content. The two AVC techniques shown in figure 2 improve on MPEG-2, but the improvement varies between one type of content and another.
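The relationship between a quoted percentage improvement and the bandwidth needed can be worked through directly. This is a small arithmetic sketch: it reads the 66% and 75% figures as savings of two thirds and three quarters, and takes three Mbit/s as the MPEG-2 starting point because the introduction quotes two to three Mbit/s as typical. The function names are illustrative.

```python
from fractions import Fraction


def remaining_fraction(saving: Fraction) -> Fraction:
    """Fraction of the original bandwidth still needed after a saving."""
    return 1 - saving


def avc_rate(mpeg2_rate_mbps, saving: Fraction):
    """Bit rate needed for the same quality if the saving is fully achieved."""
    return mpeg2_rate_mbps * remaining_fraction(saving)


third = remaining_fraction(Fraction(2, 3))    # a 66% saving leaves one third
quarter = remaining_fraction(Fraction(3, 4))  # a 75% saving leaves one quarter

rate_66 = avc_rate(3, Fraction(2, 3))  # 1 Mbit/s
rate_75 = avc_rate(3, Fraction(3, 4))  # 3/4 Mbit/s
```

As the surrounding text stresses, these are best-case figures; the saving actually achieved depends on the encoders being compared, the bit rate and the content.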

[Figure 2 chart: "Picture Quality by Content of Various Coding Algorithms". Vertical axis: PQR-YC. Horizontal axis: content source (Auto, Kiel, Mobile and Soccer at 3 Mbit/s; Susie and Sailboat at 0.8 Mbit/s). Series: MPEG-2, MPEG-4 part 2, Windows Media 9, MPEG-4 part 10.]

Figure 2: Picture Quality by Material Source

There are two other important issues in the practical implementation of AVC techniques and their improvement on MPEG-2. The first is the gap between the theoretical results from simulation models (such as shown for MPEG-4 part 10 in figure 2) and real implementations in the market place. The simulation tools, often used as the basis of quality comparisons, have no limit on how long they take to compute the compressed stream and can therefore apply all the new tools exhaustively. A real encoding product is almost certain to implement a sub-set of the full AVC tools, or to implement them in an optimised manner. The result is an area, or window, of performance points for AVC encoding products (see figure 3). This is similar to the window that defines the performance of current MPEG-2 encoders: not all MPEG-2 encoders perform equally.

[Figure 3 chart: "Subjective Picture Quality for Bitrate". Vertical axis: ITU picture quality grading (Excellent, Good, Fair, Poor, Bad). Annotations: "GOOD MPEG-2"; "WINDOW OF IMPLEMENTATION"; "possibly much better"; "but could be worse".]

The second factor is the likely improvement in AVC encoding technology over time. As both design experience and silicon complexity increase during the life of these AVC systems, it is reasonable to assume that coding efficiency will improve just as it did for MPEG-2. So the performance of AVC compression will be implementation dependent: it will depend both on who produced the implementation and on when it was produced.

Audio Coding Changes

The current DVB standard for broadcast audio, and that used in DAB, is based around the MPEG-1 Layer I/II algorithm. This algorithm can be considered a first iteration of a perceptual audio encoder, in which not only redundant information is removed from the audio signal but also energy that is not perceived by the listener. Owing to its intrinsic deficiencies, the Layer II algorithm cannot be guaranteed to produce a decoded signal that is indistinguishable from the original, which is the ultimate target of audio compression.

Modern audio algorithms such as AAC (Advanced Audio Coding) and WMA (Windows Media Audio) have taken the structure of the Layer II algorithm and added tools that remove excess energy more efficiently to achieve improved performance. AAC is used for the discussion of new audio tools that follows:

• Where MPEG-2 Layer II uses no redundancy (entropy) coding, AAC allows the choice of multiple Huffman coding tables, including some which encode multiple co-efficients into a single codeword.

Figure 3: Window of Implementation in AVC (horizontal axis: bitrate, 1 to 7 Mbit/s)

• The forward and inverse transforms within Layer II are not matched, so the decoded signal will always differ from the original. AAC uses a formulation that guarantees perfect reconstruction.

• With perceptual encoding the compression ratio varies according to the signal complexity, so quality improvements can be made by encoding frames at a more constant quality rather than with a constant number of bits. To allow this, AAC defines a small buffer in the decoder.

• Research has shown that the optimal frequency bin structure gives 1024 samples at a 48 kHz sampling frequency (as used for most broadcast applications). This was chosen for AAC, whereas Layer II uses 384 samples. However, the longer frame means that the time resolution becomes sub-optimal, leading to pre-echo artefacts. AAC therefore has the option to switch to 128-sample frames at points where time resolution is important. AAC also includes a pre-whitening filter, called Temporal Noise Shaping, that reduces the number of points where time resolution is important.

• MPEG-2 Layer II allows higher frequency bands to be encoded as a mono signal; otherwise it has no special stereo encoding tools. However, the human brain performs some combined computation on the signals from the two ears to generate a stereo image. AAC includes two stereo tools, either of which can be applied to any band: intensity stereo, where the signals are encoded as mono but a relative intensity between the left and right channels is also transmitted; and mid-side stereo, where the channels are encoded as average (mid) and difference (side) rather than as left and right.
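The mid-side idea and the frame lengths quoted above can be illustrated with a few lines of arithmetic. This is a minimal sketch, not AAC itself: it works on plain sample values rather than frequency bands, the half-difference scaling of the side channel is an assumption, and the sample values and function names are invented.

```python
def ms_encode(left, right):
    """Convert left/right samples to mid (average) and side (half-difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(samples):
    return sum(x * x for x in samples)


def frame_duration_ms(samples_per_frame, sample_rate_hz=48_000):
    """Time spanned by one frame at the given sampling frequency."""
    return 1000 * samples_per_frame / sample_rate_hz


# Hypothetical, highly correlated stereo pair.
left = [100, 102, 98, 101]
right = [99, 101, 99, 100]

mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)

long_ms = frame_duration_ms(1024)   # about 21.3 ms
short_ms = frame_duration_ms(128)   # about 2.7 ms
```

For a strongly correlated pair the side channel carries very little energy compared with either original channel, which is where the saving comes from. The frame durations show why switching from 1024-sample to 128-sample frames improves time resolution eightfold at 48 kHz.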

Thus the modern algorithms can produce excellent results at around 96 kbit/s for a stereo pair, which Layer II could not replicate at 256 kbit/s. These improvements have already been demonstrated in laboratory trials of AAC and other audio compression techniques [3, 4].

SYSTEM ISSUES

Many of the system design issues for AVC are the same as those already confronted and overcome in MPEG-2. One issue, however, is of overwhelming interest to broadcasters: protecting the content.

A significant change since MPEG-2 has been the arrival of consumer devices with both storage and digital media interfaces. Digital media interfaces allow consumer devices to be far more flexible and attractive to the consumer. The PC has been the foremost product with digital interfaces. The problem is that digital interfaces allow sharing of content, including content that the owner of the consumer device has no right to share.

Traditional on-the-fly conditional access (CA), used in most MPEG-2 systems today, decodes the content as it is received. With storage and a digital interface this means that unprotected content, stored on the consumer's devices, could be shared without the consent of the copyright owner. Digital Rights Management (DRM), a well known technology from IT systems, ensures that the content remains protected up to the point of consumption. At the time of consumption a licence is requested, and the user can accept the terms of the licence and consume the content. The licences managed via DRM can be far more complex than those of CA, with a variety of different usage rights being conferred or withheld.

Clearly there are some additional practical issues in implementing DRM. The system must be completely secure from the content owner's point of view but also flexible and simple from the consumer's point of view. An example of an AVC system head-end, based on TANDBERG Television EN5920 Windows Media 9 Series encoders, is shown in figure 4.
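The difference between conditional access and DRM described above is essentially a difference in when protection is removed and what a licence may grant. The following toy model is a sketch only and does not represent any real DRM product or API: the class names, the right names ("play", "copy") and the `consume` function are all hypothetical, and no real encryption is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Licence:
    """Usage rights granted for one piece of content (hypothetical model)."""
    content_id: str
    rights: set = field(default_factory=set)  # e.g. {"play"}


@dataclass
class ProtectedContent:
    """Content that stays protected while stored on the consumer device."""
    content_id: str
    payload: bytes  # stands in for the encrypted media


def consume(content, licence, action):
    """Release the content for one action only if the licence grants it."""
    if licence is None or licence.content_id != content.content_id:
        raise PermissionError("no licence for this content")
    if action not in licence.rights:
        raise PermissionError(f"licence does not grant '{action}'")
    return content.payload  # a real system would decrypt at this point


movie = ProtectedContent("movie-001", b"protected media")
play_only = Licence("movie-001", {"play"})

played = consume(movie, play_only, "play")
```

With this licence, a request to "play" succeeds while a request to "copy" raises `PermissionError`. The stored file itself never needs to exist in unprotected form, which is the property that on-the-fly conditional access lacks.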

[Figure 4 diagram. Blocks: broadcast component (Windows Media 9 compression using EN5920 encoders, each with DRM; MPEG-2 turnaround sub-system); managed IP switch (service routing; multicast services); system management (TANDBERG device controller; digital rights management server); IP network; service portal; content preparation (tape/server sources feeding EN5920 encoders and a media server; on-demand services). Legend: TANDBERG equipment; 3rd party equipment; content flow; control data flow.]

Figure 4: Example AVC Head-end for IP delivery

In figure 4 the Broadcast Services component provides the acquisition, encoding, encryption and delivery of traditional broadcast type services. Content acquisition may be either directly from baseband video sources or from an existing MPEG-2 broadcast medium such as digital satellite or terrestrial broadcasts. In the case of MPEG-2 derived services, the MPEG-2 turnaround sub-system is deployed, which may comprise a range of MPEG-2 interfacing equipment such as multiple-channel receiver/decoders and multiple-channel descramblers.

On-demand Services are delivered to each individual, under control of their client application, using unicast streaming. On-demand content is pre-encoded and stored on the server disk arrays for retrieval upon a client request.

The solution detailed in figure 4 is primarily designed for broadcast over IP networks such as xDSL or FTTH systems. The system can be varied to allow the delivery of all services over MPEG-2 transmission paths, including digital satellite, terrestrial and cable networks. The system is adapted for transmission over MPEG-2 networks through the addition of an MPEG-2 IP encapsulator (IPE) at the output of the system. This unit is configured to capture all of the IP traffic related to the Windows Media 9 services and encapsulate this traffic into MPEG-2 transport stream format. The output of the IPE can then be directly modulated on to the MPEG-2 infrastructure, or forwarded to an MPEG-2 multiplexer for transmission within the same output stream as traditional MPEG-2 services. Additionally, work is now in progress to develop and standardise direct mapping of AVC coded streams into MPEG-2 transport streams, thus removing any IP middle layer.
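The cost of carrying IP traffic inside an MPEG-2 transport stream can be estimated from the fixed packet structure: each transport stream packet is 188 bytes, of which 4 bytes are header. The sketch below is a deliberately simplified illustration and not a description of the IPE: it omits the additional framing that a real encapsulator places around each datagram, pads the last packet with stuffing bytes, and uses an arbitrary PID and datagram size chosen for the example.

```python
TS_PACKET_SIZE = 188
TS_HEADER_SIZE = 4
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes
SYNC_BYTE = 0x47


def packetise(data: bytes, pid: int) -> list:
    """Split data into fixed-size transport stream packets (simplified)."""
    if not 0 <= pid <= 0x1FFF:
        raise ValueError("PID must fit in 13 bits")
    packets = []
    for index, offset in enumerate(range(0, len(data), TS_PAYLOAD_SIZE)):
        chunk = data[offset:offset + TS_PAYLOAD_SIZE]
        start_flag = 0x40 if offset == 0 else 0x00  # payload unit start
        header = bytes([
            SYNC_BYTE,
            start_flag | (pid >> 8),   # top 5 bits of the PID
            pid & 0xFF,                # low 8 bits of the PID
            0x10 | (index & 0x0F),     # payload only, 4-bit continuity counter
        ])
        # Simplification: pad the final packet with 0xFF stuffing bytes.
        packets.append(header + chunk.ljust(TS_PAYLOAD_SIZE, b"\xff"))
    return packets


# Hypothetical example: one 1500-byte IP datagram on an arbitrary PID.
datagram = bytes(1500)
packets = packetise(datagram, pid=0x100)

total_bytes = len(packets) * TS_PACKET_SIZE
overhead = (total_bytes - len(datagram)) / len(datagram)
```

Under these simplifications a 1500-byte datagram needs nine transport stream packets (1692 bytes), an overhead of about 13%. Overhead of this kind is one motivation for the direct mapping of AVC streams into transport streams mentioned above.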

APPLICATIONS

Direct to Home HDTV

The reduction in bit rate for high quality video and audio is one clear advantage of AVC over MPEG-2. Given the drive for improved MPEG-2 efficiency over the past decade, one might expect the improved performance of AVC to be attractive in core MPEG-2 markets. However, in heavily established MPEG-2 markets such as direct to home satellite, the legacy of MPEG-2 decoders will be a strong source of inertia against AVC deployment. Set top boxes represent a significant part of service rollout costs for an operator, and there would need to be powerful and believable ROI calculations to justify an investment as large as replacing a set top box population. It is unclear at this stage whether the bit rate savings alone will justify replacing MPEG-2.

However, in traditional consumer MPEG-2 markets, new services enabled by AVC may well stimulate adoption of the technology. For example, Windows Media 9 has been shown capable of delivering high definition (HD) video at bit rates as low as 5 Mbit/s. With consumers adopting increasingly complex home cinema solutions, cost effective HD delivery could justify set top box investment.

TV over xDSL networks

Even more interesting are the new media applications for AVC, such as streaming traditional TV content over broadband connections to the home. With leading broadband countries such as South Korea reaching 56% xDSL penetration [5], the prospect of delivering high quality video services to those broadband customers is attractive to broadband operators. Multi-media content targeted at both PCs and set top boxes and delivered via AVC gives broadband operators a compelling service beyond the commodity increases in bandwidth offered by the basic xDSL package.

Operators of xDSL services require a broadcast quality service at one to one-and-a-half Mbit/s. This is either to offer three different channels simultaneously, supporting complex modern family viewing habits alongside telephony and internet/data service, all inside a five Mbit/s pipe, or to offer much better penetration and coverage with a universal one Mbit/s service at acceptable picture quality. This broadband service would also include Video on Demand (VoD) services, for which AVC systems like Windows Media 9 are well designed. Finally, this technology will give the major telephony operators a triple-play (video, telephony, data) package with which to take the battle for customers to the cable companies with their IP modems.

Mobile Networks

At the lowest possible bit rates AVC also enables mobile media applications. Various wireless LAN technologies, Digital Audio Broadcasting (DAB) and Universal Mobile Telecommunications System (UMTS) are networks that could carry multi-media content for mobile devices such as phones, PDAs or tablet PCs. The very low bandwidths available on those networks are well matched to the strengths of AVC. AVC also offers content owners the opportunity to develop new revenue streams based on the same content, or to increase consumer pull toward their core offering. Examples include streamed sport highlights or promotional trailers for premium movie packages. Wireless access to multi-media will be a major growth market over the next few years as content owners and networks try to create unique services through this technology.
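The xDSL budget described in the TV over xDSL section can be checked with simple arithmetic. This is a sketch using the figures quoted in the paper (a five Mbit/s pipe, three simultaneous channels, one to one-and-a-half Mbit/s per AVC channel, and two Mbit/s as the lower end of typical MPEG-2 rates); the function name is illustrative.

```python
def remaining_capacity(pipe_mbps, channels, rate_per_channel_mbps):
    """Capacity left for telephony and data after carrying the video channels."""
    return pipe_mbps - channels * rate_per_channel_mbps


PIPE_MBPS = 5
CHANNELS = 3

avc_high = remaining_capacity(PIPE_MBPS, CHANNELS, 1.5)  # 0.5 Mbit/s left
avc_low = remaining_capacity(PIPE_MBPS, CHANNELS, 1.0)   # 2.0 Mbit/s left
mpeg2 = remaining_capacity(PIPE_MBPS, CHANNELS, 2.0)     # negative: does not fit
```

Three channels at AVC rates fit inside the pipe with room left for telephony and data, whereas three channels at even the lowest typical MPEG-2 rate already exceed it.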

CONCLUSIONS

MPEG-2 has proved an excellent standard, and the digital broadcasting industry was born and matured because of it. However, as with all technology, it was a standard of its time, and what was state of the art ten years ago is increasingly a commodity today. New video and audio coding techniques have shown clear improvements by design and in simulation. During 2003 the market will be able to see these improvements in live products and systems.

Real solutions will need far more than a more efficient coding algorithm or even a more efficient coding product. As with MPEG-2, system level solutions will be required and, particularly with increasingly complex consumer equipment where content sharing is possible, content protection will be of paramount importance.

Finally, we have seen that advanced audio and video coding will develop in some clear market areas, but in markets where there is a big installed base of MPEG-2 decoders AVC is less likely to develop in the short term without a significant new revenue opportunity.

REFERENCES

1. Cartwright, C., 2003. Choices in Video Compression for IP Delivery. VidTrans, February 23-25 2003.
2. ISO/IEC 13818-2, Generic Coding of Moving Pictures and Associated Audio: Video, sixth amendment, 1999.
3. Meltzer, S., 2002. Coding Technologies, aacPLUS and mp3Pro. EBU Specialised Meeting on Audio/Video Coding, September 5 & 6 2002.
4. Stoll, G., 2002. EBU B/AIM Studies: Results of Internet Audio Coding Studies. EBU Specialised Meeting on Audio/Video Coding, September 5 & 6 2002.
5. Bosnell, J. et al., 2002. DSL Worldwide Retail Directory, Edition 6, October 2002. pp235.

ACKNOWLEDGEMENTS

The authors would like to thank their colleagues for their contributions to this work. They would also like to thank TANDBERG Television for its support in preparing and presenting this paper.
