Context-adaptive binary arithmetic coding. Part 3

How the binary nature of the coded information will change the encoding and decoding procedures.

The name CABAC (Context Adaptive Binary Arithmetic Coding) itself implies that HEVC uses a binary version of arithmetic coding where the input message alphabet consists of only two characters, 0 and 1. In order to distinguish between the words used to denote the output stream bits that represent the coding result and the binary characters that represent the message being coded, the term “bins” is used to refer to these characters. Let us see what will change if we take into account the binary nature of the message being coded in the flow diagrams shown in Figures 2 to 4 of Chapter 7.

The first thing to note is that the EOF character that used to denote the end of the message is no longer part of the alphabet. The point where the message ends needs to be known at the time of decoding, otherwise the decoding procedure will continue even after the entire message has been decoded correctly. We will explore later how the end-of-message information is transmitted in HEVC and will just note for now that a procedure for transmitting this information needs to be implemented.

What else will change if there are only two different characters in the message to be coded? The array P where cumulative character probabilities are stored will only contain three values: P={0,PMPS,1}. Here, PMPS is the probability of the more probable character occurring in the message (if we exclude the EOF character from our 20-character message {b, a, b, b, b, b, b, b, b, b, a, b, b, b, b, b, b, b, b, EOF}, the message will become 19 characters long, and the probability of the more probable character “b” will be equal to 1819). This means that the current interval [L,H) will now be split into two parts at each iteration during coding. The length of the larger part is determined by the probability PMPS and the length of the smaller part by the probability PLPS=1−PMPS. The array P has thus basically degenerated into a single number. This number could be PMPS or it could just as well be PLPS.

Until now, the position on the number axis of the interval to be split during coding was characterized by the position of the interval’s endpoints L and H. Obviously, we can also describe the current state of the arithmetic coding procedure (the position of the interval on the number axis) using the numbers L and R, where R is the interval length. This is the description used in HEVC.

Now let us denote the number that characterizes the ratio between the interval parts when split iteratively as PLPS. Using the HEVC standard notation, let valMPS be the value (0 or 1) of the most probable bin in the sequence to be coded, binVal the value of the bin currently being coded, L the position of the left interval endpoint (as before), and R the current interval length. If the value of the currently coded bin is equal to valMPS, the new values of L and R can be calculated from their current values and PLPS as (Figure 1):

R=R(1−PLPS)=R−R⋅PLPS, and

L=L.

Figure 1. Calculating L and R when binVal≡valMPS

If the value of the currently coded bin is not equal to valMPS, the new values of L and R will be determined by the following expressions (Figure 2):

R=R⋅PLPS,

L=L+R(1−PLPS).

Figure 2. Calculating L and R when binVal!=valMPS

The result is a somewhat updated flow diagram (Figure 3), now describing binary arithmetic coding.

Figure 3. Flow diagram for the bin coding procedure

Although the renormalization procedure could be left unmodified for binary coding, some small modifications to it can simplify the algorithm somewhat and save computational effort. Using the new notation of L and R, the renormalization procedure is executed in a loop while R<12. If at the same time L+R<12, the value of the coded bit is defined and equal to 0. If L>12, the coded bit has a value of 1 (we just state it here for simplicity; the procedure is described in detail in Figure 2 of and the associated explanatory text). Clearly, the values of the coded bits will also be fully defined when R<14. If at the same time L<14, the current interval is fully contained inside the range [0,12], and thus the value of the coded bit is 0. If R<14 but L⩾12, the current interval is fully contained inside the range [12,1], and the coded bit has a value of 1. The flow diagram for the modified renormalization procedure for the coding is shown in Figure 4.

Figure 4. Flow diagram for the renormalization algorithm for the coding

Let us now see how the binary nature of the coded information will change the decoding procedure. The procedure of arithmetic decoding in this case consists of splitting the current interval iteratively into two parts with lengths proportional to 1−PLPS and PLPS. At each such iteration, a check is performed to see which of the two resulting intervals includes the binary number representing the encoded message. This becomes the current interval.

Let ivlOffset be a variable that contains several bits of the encoded message (we use the HEVC notation for this variable here). As before, let us characterize the current interval by its left endpoint L and the length R. The decoding procedure is illustrated in Figure 5 where two successive iterations are shown. At the first iteration, the number ivlOffset (with its position inside the current interval shown by a circle) is located inside the interval
[L,L+R(1−PLPS)]. As a result, the decoded bin is equal to valMPS. At the second iteration, the number ivlOffset is located inside the smaller interval, and the decoded bin is equal to !valMPS.

Figure 5. Interval splitting during decoding

It can be seen from the figure above that the choice of the interval at each iteration is determined by comparing the value of ivlOffset with the right-hand endpoint of the interval L+R⋅(1−PLPS). When ivlOffset falls inside the larger interval that corresponds to the probability PMPS, R gets a new value of R⋅(1−PLPS), and L is left unchanged. When ivl0ffset falls inside the smaller interval that corresponds to the probability PLPS, the left interval endpoint is redefined as L+R⋅(1−PLPS), and R gets a new value of R⋅PLPS. Note that in the second case, instead of modifying L we can modify ivlOffset by assigning it a value of ivlOffset−R⋅(1−PLPS). What the value of ivlOffset now characterizes is not the binary number itself that represents the encoded message but instead the position of this number relative to the left endpoint of the current interval. In that case, L as such is no longer used in the decoding procedure. To decide which of the two intervals shall become the current one at each iteration, we just need to compare ivlOffset with R⋅(1−PLPS). The flow diagram for the decoding algorithm is shown in Figure 6.

Figure 6. Flow diagram for the decoding algorithm

Redefining the meaning of ivlOffset for binary coding also leads to a significant change (simplification) in the renormalization procedure. The renormalization procedure is invoked when the current interval length becomes less than 14. Just as before, renormalization doubles the value of R. It is, however, no longer necessary to compute L. Since ivlOffset is now the difference between the number representing the encoded message and the left endpoint of the current interval, the renormalization procedure can simply double the value of ivlOffset unconditionally and then add a bit from the encoded message to the vacant bit position, thereby increasing the precision of the value ivlOffset. The flow diagram for the modified renormalization algorithm during binary decoding is shown in Figure 7.

Figure 7. Flow diagram for the renormalization algorithm during decoding

This flow diagram calls a function called read_bit(), the purpose of which is evident from its name: the function reads the next bit from the bit stream representing the encoded message. It also uses the symbol “|” for bitwise OR operation from the C language.

Author

Oleg Ponomarev, 16 years in video encoding and signal digital processing, expert in Statistical Radiophysics, Radio waves propagation. Assistant Professor, PhD at Tomsk State University, Radiophysics department. Head of Research Lab.

Leading provider of components and software products for analysis, monitoring, encoding, decoding and streaming digital video and audio data.