HDR standards in depth
This article will be useful for QA engineers, application developers, OEM manufacturers and SoC designers who want to implement or identify HDR content. We will examine the main HDR standards and their identification and validation for H.264/AVC, H.265/HEVC, VP9 and AV1.
It should be noted that HDR is an umbrella term, since several HDR implementation standards from different vendors coexist on the market. The four most widely used are HDR10, HLG, HDR10+ and Dolby Vision. Figures 1.1 and 1.2 show the brands of TV manufacturers that support HDR, and Figure 2 shows the streaming services that currently support it.
To play HDR content, you need properly prepared content that conforms to the standard, as well as an HDR-supporting decoder and display.
HDR10
The standard was adopted in 2014. HDR10 has gained wide acceptance due to its ease of use and absence of license fees. The standard describes video content that complies with the recommendations in UHDTV Rec. ITU-R BT.2020.
HDR10 is based on the PQ EOTF transfer function, which is why such video content is not compatible with SDR displays. HDR10 also carries a single layer of video content.
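The PQ curve is defined in SMPTE ST 2084 and maps a normalized 0–1 code value to an absolute luminance of up to 10,000 cd/m2. A minimal Python sketch of the PQ EOTF, using the constants from ST 2084:

```python
# SMPTE ST 2084 (PQ) constants
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875

def pq_eotf(signal: float) -> float:
    """Map a non-linear PQ signal value in [0, 1] to absolute
    display luminance in cd/m2, per SMPTE ST 2084."""
    e = signal ** (1 / M2)
    y = max(e - C1, 0.0) / (C2 - C3 * e)
    return 10000.0 * y ** (1 / M1)
```

A signal value of 1.0 maps to exactly 10,000 cd/m2, the ceiling of the PQ system; because the mapping is absolute rather than relative, an SDR display cannot reproduce it correctly, as noted above.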
The standard employs static metadata that applies to the entire video sequence. On the one hand, static metadata simplifies implementation. On the other hand, it cannot account for the different tone-mapping needs of static and dynamic, bright and dark scenes, so global compensation methods must be applied. As a result, HDR10 is not able to fully convey the author's ideas and vision.
HDR10 metadata includes mastering display colour volume and content light level information.
Mastering display colour volume describes the parameters of the display that was used to create the video content, which are treated as reference parameters. During playback, the display is adjusted relative to this reference.
Mastering display colour volume describes:
- Display_primaries, X and Y coordinates of the three primary chrominance components;
- White_point, X and Y coordinates of the white point;
- Max_display_mastering_luminance, the nominally maximum luminance of the mastering display in units of 0.0001 cd/m2;
- Min_display_mastering_luminance, the nominally minimum luminance of the mastering display in units of 0.0001 cd/m2.
Content light level information — the value of the upper limit of the nominal target luminance level of the images. It includes:
- Max_content_light_level (MaxCLL), indicates the upper limit of the maximum pixel luminance level in cd/m2;
- Max_pic_average_light_level (MaxFALL), specifies the upper limit of the maximum average luminance level of the whole frame in cd/m2.
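For illustration, MaxCLL and MaxFALL can be computed from decoded frames as follows. This is a hypothetical sketch, not a real mastering-tool API; it assumes each frame is available as a flat sequence of per-pixel luminance values in cd/m2:

```python
def compute_maxcll_maxfall(frames):
    """Derive MaxCLL and MaxFALL from per-pixel luminance values.

    MaxCLL is the brightest single pixel anywhere in the sequence;
    MaxFALL is the highest per-frame average luminance.
    """
    max_cll = 0.0
    max_fall = 0.0
    for frame in frames:
        max_cll = max(max_cll, max(frame))            # brightest pixel
        max_fall = max(max_fall, sum(frame) / len(frame))  # frame average
    return max_cll, max_fall
```

Note the asymmetry: a single specular highlight raises MaxCLL but barely moves MaxFALL, which is why both values are signalled.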
In H.264/AVC and H.265/HEVC video formats, HDR10 metadata can be specified at two levels.
- At the elementary video stream level in the corresponding SEI headers of the IDR access unit. Figure 3 shows an example of the SEI Mastering display colour volume and Content light level information for an HEVC video sequence: maximum nominal luminance 1,000 cd/m2, minimum nominal luminance 0.05 cd/m2, MaxCLL 1,000 cd/m2, MaxFALL 400 cd/m2, as well as the coordinates of the chromaticity components and the white point.
- At the MP4 media container level: mdcv (Mastering display colour volume) and clli (Content light level) boxes;
- At the MKV/WebM media container level: SmDm and CoLL boxes.
VP9 carries HDR10 metadata at the media container level:
- MKV/WebM: SmDm (Mastering Display Metadata) and CoLL (Content Light Level) (Figure 4);
- MP4: mdcv and clli boxes.
AV1 carries metadata:
- at the elementary video stream level and signals it using the OBU syntax (metadata_hdr_mdcv and metadata_hdr_cll);
- at the MP4 media container level: mdcv and clli boxes;
- at the MKV/WebM media container level: SmDm and CoLL boxes.
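To locate such boxes in an MP4 file, a recursive box walker can be sketched with only the Python standard library. This is a simplification: in real files, mdcv and clli sit inside the sample entry under stsd, which carries extra header bytes before its children, and 64-bit box sizes are not handled here:

```python
import struct

# Container boxes whose payload is itself a sequence of boxes.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}

def find_boxes(data, wanted, offset=0, end=None):
    """Walk MP4 boxes in `data`, yielding (type, payload) for every
    box whose 4CC is in `wanted`, e.g. {b'mdcv', b'clli'}."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8:        # size 0 (to end) / 1 (64-bit) not handled
            break
        if box_type in wanted:
            yield box_type, data[offset + 8 : offset + size]
        if box_type in CONTAINERS:
            yield from find_boxes(data, wanted, offset + 8, offset + size)
        offset += size
```

The same idea (length-prefixed elements walked recursively) applies to the MKV/WebM SmDm and CoLL elements, though EBML uses variable-length size fields.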
HLG
The HLG standard appeared in 2015 and has also been widely adopted. The standard describes video content that conforms to BT.2020.
HLG, like HDR10, carries a single layer of video content. Unlike HDR10, HLG has no metadata, since it uses the hybrid log-gamma HLG EOTF, whose curve partly follows the SDR transfer function and partly the HDR one (Figure 5). In theory, this allows HLG to be played both on PQ EOTF displays (HDR10, HDR10+, Dolby Vision) and on SDR displays with colorimetric parameters conforming to BT.2020. In terms of realism, HLG, like HDR10, cannot fully convey the author's ideas and vision. Moreover, due to the peculiarities of the HLG EOTF, hue shifts may be noticeable on an SDR display when images contain bright areas of saturated color; as a rule, such distortion appears in scenes with specular highlights.
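The hybrid nature of the curve is easy to see in code. HLG is defined in BT.2100 by its OETF (the EOTF is derived from it): below 1/12 of peak scene light it is a plain square-root, gamma-like segment, and above it a logarithmic one. A sketch with the BT.2100 constants:

```python
import math

# Rec. ITU-R BT.2100 HLG constants
A = 0.17883277
B = 1 - 4 * A                   # 0.28466892
C = 0.5 - A * math.log(4 * A)   # 0.55991073

def hlg_oetf(e: float) -> float:
    """Map normalized scene-linear light E in [0, 1] to the HLG
    signal: square-root (SDR-like) below 1/12, logarithmic above."""
    if e <= 1 / 12:
        return math.sqrt(3 * e)
    return A * math.log(12 * e - B) + C
```

The two branches meet continuously at E = 1/12 (signal value 0.5), which is exactly what lets the lower part of the HLG signal range behave like an SDR gamma curve.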
An HLG video stream can be identified by the Transfer_characteristics parameter, which will have the value 14 or 18.
For H.264/AVC and H.265/HEVC, the parameter can be specified:
- at the MP4 media container level: in avcc, hvcc or colr boxes (Figure 6);
- at the MKV/WebM media container level in the corresponding TrackEntry video and colour box;
- at the elementary stream level in SPS headers → VUI → video_signal_type_present_flag →colour_description_present_flag → Transfer_characteristics (Figure 6);
- in the SEI message Alternative transfer characteristics, located in the IDR access unit at the elementary stream level. The message contains the parameter preferred_transfer_characteristics = 18 (Figure 7). If the values in the SEI, VUI or media container disagree, the SEI value takes priority.
For VP9, the parameter can be specified at the media container level:
- MP4: in vpcc and colr boxes;
- MKV/WebM: colour box.
For AV1, the parameter can be specified:
- at the elementary stream level in the OBU Sequence Header → color_config → if (color_description_present_flag) → Transfer_characteristics;
- at the MP4 media container level in av1c and colr boxes;
- at the MKV/WebM media container level in the corresponding TrackEntry video and colour box.
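In code, identification reduces to reading the transfer_characteristics code point (defined in ITU-T H.273 / ISO/IEC 23091-2) from whichever of the locations above is present. A small helper:

```python
# Selected transfer_characteristics code points (ITU-T H.273)
TRANSFER_NAMES = {
    1:  "BT.709",
    14: "BT.2020 10-bit",
    16: "PQ (SMPTE ST 2084)",
    18: "HLG (ARIB STD-B67 / BT.2100)",
}

def is_hlg(transfer_characteristics: int) -> bool:
    """HLG streams are signalled with transfer_characteristics 18,
    or 14 in combination with the Alternative transfer
    characteristics SEI message, as described above."""
    return transfer_characteristics in (14, 18)
```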
HDR10+
The standard also describes video content that complies with UHDTV BT.2020.
HDR10+ uses PQ EOTF and is therefore incompatible with SDR displays.
Unlike HDR10, HDR10+ uses dynamic metadata, which allows each scene to be graded more precisely during mastering and thus conveys the author's intent more faithfully. During playback, the display is readjusted from scene to scene, just as the author created it.
HDR10+ offers backward compatibility with HDR10: if the display does not support HDR10+ dynamic metadata but does support HDR10 static metadata, and such data is present in the stream or media container, the display can play back the video sequence as HDR10.
For H.264/AVC and H.265/HEVC, the dynamic metadata is located at the elementary stream level in the SEI message user_data_registered_itu_t_t35 (Figure 8). In VP9, it is specified in BlockAddID (ITU-T T.35 metadata) of the WebM container. In AV1, it is specified in the metadata_itut_t35() OBU syntax.
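The T.35 payload can be recognized by its leading identifier bytes. A sketch, assuming the commonly documented SMPTE ST 2094-40 signalling (country code 0xB5, terminal provider code 0x003C, provider oriented code 0x0001); verify these values against the specification before relying on them:

```python
def is_hdr10plus_t35(payload: bytes) -> bool:
    """Check whether an ITU-T T.35 payload carries HDR10+
    (SMPTE ST 2094-40) dynamic metadata, judging by the
    country / provider / provider-oriented code prefix."""
    return (len(payload) >= 5
            and payload[0] == 0xB5            # itu_t_t35_country_code: US
            and payload[1:3] == b"\x00\x3c"   # terminal_provider_code
            and payload[3:5] == b"\x00\x01")  # provider_oriented_code
```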
Dolby Vision
The most complex proprietary HDR standard, developed and licensed by Dolby. It allows two layers to be used simultaneously in one video file: a base layer (BL) and an enhancement layer (EL). In practice, two video layers are rare because of the large file sizes and the difficulty of preparing and playing back such content.
Dolby Vision has 5 predefined profiles: 4, 5, 7, 8 (8.1 and 8.4) and 9.
Note 1: Profile 4 is not supported for new applications and service providers.
Note 2: Profile 8.4 is still at the standardization stage; its maximum luminance level is 1,000 cd/m2.
BL for profiles 5, 8 and 9 and EL for profiles 4 and 7 use PQ EOTF, so they are not compatible with SDR displays. These profiles use dynamic metadata similar to HDR10+ metadata, which allows each scene to be edited efficiently during mastering and the author's ideas to be conveyed accurately. When content is played back, the display is readjusted from scene to scene based on the dynamic metadata.
In H.264/AVC and H.265/HEVC video formats, Dolby Vision dynamic metadata is located at the elementary video level:
- In SEI user_data_registered_itu_t_t35 ST2094–10_data ();
- in NAL units of type 42/62 at the elementary stream level, in the corresponding NAL units and SEI messages.
Dolby has standardized Dolby Vision identification for the MPEG-2 transport stream and MP4 media container. In MPEG-2 TS, information is provided using DOVI Video Stream Descriptor in a PMT table, from the content of which the profile, level, presence of layers and compatibility are determined.
For this purpose, the MP4 container uses configuration boxes: dvcc (for profiles lower than or equal to 7), dvvc (for profiles higher than 7 but lower than 10) and dvwc (for profiles equal to or higher than 10, reserved for future use).
One of the following boxes is also used:
- dvav, dva1, dvhe, dvh1, avc1, avc2, avc3, avc4, hev1, hvc1 for decoder initialization;
- avcc, hvcc to provide the decoder configuration record;
- avce, hvce to describe the EL when a second video layer is present (profiles 4 and 7).
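The profile-to-box mapping above can be expressed as a small helper; box names follow the lowercase spelling used in this article:

```python
def dolby_vision_config_box(profile: int) -> str:
    """Pick the Dolby Vision configuration box 4CC from the
    profile number, per the mapping described above."""
    if profile <= 7:
        return "dvcc"
    if profile < 10:
        return "dvvc"
    return "dvwc"  # reserved for future use
```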
General overview of HDR standards
Note 3: Dynamic metadata is used in profiles 4 (EL), 5, 7, 8.1 and 9, and is absent in 8.4.
HDR content playback is performed as follows (Fig. 10):
- The application extracts the elementary video and HDR metadata (if present) from the MP4, MKV/WebM or TS media container and passes the data to the decoder;
- The decoder decodes the video sequence and extracts static or dynamic HDR metadata from it, or obtains static HDR metadata from the media container;
- The decoder transmits the decoded frames and HDR metadata to the display;
- The display outputs the image.
HLG has no metadata. If there are two video layers (BL/EL, profiles 4 or 7 in Dolby Vision), the extractor extracts both, and the application decides which layer (and the corresponding decoder) to use, depending on the platform's capabilities.
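For the dual-layer case, the layer choice mentioned above can be sketched as a simple decision; the function and parameter names are illustrative, not a real player API:

```python
def select_layer(layers, platform_supports_el: bool) -> str:
    """For dual-layer Dolby Vision (profiles 4/7), the application
    decides which extracted layer to send to a decoder: the
    enhancement layer if the platform can handle it, otherwise
    the base layer."""
    if "EL" in layers and platform_supports_el:
        return "EL"
    return "BL"
```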
Checking the video sequence for compliance with colorimetric parameters
1. Checking the BT.2020/2100 colorimetric parameters and EOTF video signal transfer functions. For all four video codecs (H.264/AVC, H.265/HEVC, VP9, AV1), this is a standardized, identical set of parameters:
- colour_primaries, indicates the chromaticity coordinates of the source primaries in terms of the CIE 1931 definition;
- transfer_characteristics, indicates the reference opto-electronic transfer characteristic function of the source picture;
- matrix_coeffs, describes the matrix coefficients used in deriving luma and chroma signals from the green, blue and red, or Y, Z and X primaries.
These parameters are located:
For H.264/AVC and H.265/HEVC:
- at the elementary stream level in the VUI headers: Sequence Parameter Set → VUI → video_signal_type_present_flag → colour_description_present_flag (Figure 6);
- at the MP4 media container level in avcc, hvcc or colr boxes (Figure 6);
- at the MKV/WebM media container level in the corresponding TrackEntry video and colour box.
For VP9:
- MP4: in vpcc and colr boxes;
- MKV/WebM: colour box.
For AV1:
- at the elementary stream level in the OBU Sequence Header → color_config → if (color_description_present_flag);
- at the MP4 media container level in av1c and colr boxes;
- at the MKV/WebM media container level in the corresponding TrackEntry video and colour box.
2. Checking compliance of resolution, aspect ratio, frame rate, color depth, video codec.
3. HDR metadata check.
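The first check can be sketched as a small validator over the three code points, using the ITU-T H.273 values for BT.2020/BT.2100 content (colour_primaries 9, transfer_characteristics 16 for PQ or 14/18 for HLG, matrix_coeffs 9 for non-constant luminance):

```python
def check_hdr_colorimetry(colour_primaries: int,
                          transfer_characteristics: int,
                          matrix_coeffs: int) -> list:
    """Validate the three VUI/container code points against
    BT.2020/BT.2100 HDR expectations; returns a list of issues,
    empty if the signalling is consistent."""
    issues = []
    if colour_primaries != 9:      # 9 = BT.2020/BT.2100 primaries
        issues.append(f"colour_primaries={colour_primaries}, expected 9")
    if transfer_characteristics not in (14, 16, 18):
        issues.append(f"transfer_characteristics={transfer_characteristics}, "
                      "expected 16 (PQ) or 14/18 (HLG)")
    if matrix_coeffs != 9:         # 9 = BT.2020 non-constant luminance
        issues.append(f"matrix_coeffs={matrix_coeffs}, expected 9")
    return issues
```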
In this article, we have gathered the most relevant information on each HDR format in one place. The markers described above allow you to quickly dive into the HDR subject area, identify and integrate HDR content, and troubleshoot possible problems.
With the StreamEye professional analysis tool, you can check parameters at the elementary stream level, while Stream Analyzer helps to verify parameters at both the elementary stream and media container levels.
The full list of references and standards used you may find in the full version of the article.
Author: Alexander Kruglov is a leading engineer at Elecard. He has been working in video analysis since 2018. Alexander is in charge of supporting Elecard's largest clients, such as Netflix, Cisco and Walt Disney Studios.