Optimal Integer Linear Prediction Coefficients
Most lossless audio codecs make use of linear prediction as a means to decorrelate signals. Since encoder and decoder need to work with the same set of prediction coefficients they either have to be chosen by the encoder and included in the compressed data stream before they are used (forward adaptive) or there must be a deterministic algorithm to compute these coefficients based on previous data (backward adaptive). FLAC and TAK are two examples of codecs that use forward adaptive linear prediction. This is the case I’m dealing with in this article.
Computing good real-valued prediction coefficients isn’t that difficult, actually. But these coefficients also need to be coded somehow which involves quantization. One option is to simply quantize the optimal real-valued coefficients to rational numbers that share a certain denominator. FLAC’s format specification for example allows transmission of rational coefficients that have a power-of-two as denominator. However, simple per-coefficient rounding to the closest rational number doesn’t necessarily lead to the best set of quantized coefficients. What does “best” mean anyway? Suppose we fix the prediction order and the prediction coefficients’ accuracy (denominator). We’d like to choose a set of quantized prediction coefficients that reduce the number of bits spent on the residual samples. Since this is a really complicated minimization problem we need to simplify our objective function. The natural choice is to minimize the sum of squared (not-yet-quantized) prediction errors instead. This objective function turns the problem into an “integer least squares” problem.
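To make the baseline concrete, here is a minimal sketch of the simple per-coefficient rounding described above, together with the sum-of-squared-errors objective. The function names, the sample block and the real-valued coefficients are made up for illustration; the only assumption taken from the text is a power-of-two denominator.

```python
def quantize_coefficients(coeffs, shift):
    """Round each real coefficient to the nearest multiple of 1 / 2**shift."""
    denom = 1 << shift
    return [round(c * denom) for c in coeffs], denom


def squared_error(samples, int_coeffs, denom):
    """Sum of squared prediction errors; the first n samples are warm-up."""
    n = len(int_coeffs)
    total = 0.0
    for t in range(n, len(samples)):
        # int_coeffs[k] weights the sample that lies k+1 steps in the past
        pred = sum(int_coeffs[k] * samples[t - 1 - k] for k in range(n)) / denom
        total += (samples[t] - pred) ** 2
    return total


# Hypothetical example data: a short block and two real-valued coefficients.
samples = [0, 3, 5, 6, 5, 3, 0, -3, -5, -6, -5, -3, 0, 3]
ints, denom = quantize_coefficients([1.7321, -1.0], shift=4)
print(ints, denom, squared_error(samples, ints, denom))
```

Rounding each coefficient on its own ignores how the rounding errors interact, which is why the result need not be the best integer vector for this objective.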
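Suppose you have a block of samples ($s_1, \dots, s_m$), you want to compute prediction coefficients ($c_1, \dots, c_n$) for the remaining m-n predictions (the first n samples are the “warm-up” samples) and the coefficients’ denominator is ‘p’ for “precision”. This leads to the following integer least squares problem ILS(A,b):

find $x \in \mathbb{Z}^n$ that minimizes $\|Ax - b\|_2^2$, where

$$A = \begin{pmatrix} s_n & s_{n-1} & \cdots & s_1 \\ s_{n+1} & s_n & \cdots & s_2 \\ \vdots & \vdots & & \vdots \\ s_{m-1} & s_{m-2} & \cdots & s_{m-n} \end{pmatrix}, \qquad b = p \begin{pmatrix} s_{n+1} \\ s_{n+2} \\ \vdots \\ s_m \end{pmatrix}$$

and the quantized coefficients are $c_k = x_k / p$.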
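As a sketch of how such a problem instance could be set up in code: the helper names below are hypothetical, and the layout is one possible convention, assumed here, in which each row of A holds the n samples preceding a target sample (most recent first) and b holds the target samples scaled by p. With that layout an integer vector x stands for the coefficients x/p.

```python
def build_ils(samples, order, p):
    """Build (A, b) for the assumed layout: rows of past samples, targets times p."""
    rows = range(order, len(samples))
    A = [[samples[t - 1 - k] for k in range(order)] for t in rows]
    b = [p * samples[t] for t in rows]
    return A, b


def objective(A, b, x):
    """Squared Euclidean norm of A*x - b for an integer vector x."""
    return sum(
        (sum(a * xi for a, xi in zip(row, x)) - bi) ** 2
        for row, bi in zip(A, b)
    )


A, b = build_ils([1, 2, 3, 4, 5], order=2, p=4)
print(A, b)
print(objective(A, b, [8, -4]))  # x/p = (2, -1) predicts a linear ramp exactly
print(objective(A, b, [8, -3]))
```

Under this convention the objective equals p² times the sum of squared prediction errors, so minimizing one minimizes the other.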
ILS (integer least squares) is known to be NP-hard, which in practice means that computing the optimal solution for a large problem (a matrix with many columns) takes quite some time. However, in the case of typical linear prediction problems the complexity of computing the optimal set of coefficients is rather moderate.
How do we solve an ILS problem? There are a couple of approaches. Many include a popular tool known as LLL lattice reduction. With the help of orthogonal transforms and the LLL lattice reduction algorithm the above problem can be reduced to another instance ILS(R,y) where R is a square upper triangular matrix of order ‘n’ with near-orthogonal columns of low norm. A by-product of the reduction process is an integer matrix M which connects the two problem instances: Assuming w is an optimal solution for ILS(R,y), x=M*w will be an optimal solution for ILS(A,b). Since M contains integers only, x will also be an integer vector. This reduction process is a kind of preconditioning which means the problem ILS(R,y) is “easier” to solve than ILS(A,b).
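The role of M can be illustrated without implementing LLL itself. The sketch below leaves out the orthogonal transform and the reduction and only shows the change of variables: for an integer matrix M with determinant ±1 (picked by hand here, not computed by LLL), evaluating the objective on A*M at w gives the same value as evaluating it on A at x=M*w. The matrices and helper names are made up for illustration.

```python
def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    """Matrix-matrix product for lists of lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def sq_residual(A, x, b):
    """Squared Euclidean norm of A*x - b."""
    return sum((r - bi) ** 2 for r, bi in zip(matvec(A, x), b))


A = [[2, 1], [3, 2], [4, 3]]
b = [12, 16, 20]
M = [[1, -1], [0, 1]]  # hand-picked integer matrix with determinant 1
AM = matmul(A, M)      # same lattice as A, described by a different basis
w = [5, 1]
x = matvec(M, w)       # integer vector in the original coordinates
print(x, sq_residual(AM, w, b), sq_residual(A, x, b))
```

Because M and its inverse are both integer matrices, every integer x corresponds to exactly one integer w, so the two problems share their minimum.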
Still, we need to solve ILS(R,y). A simple approximation of the optimal solution is called Nulling and Cancelling:

$$w'_i = \left\lfloor \frac{y_i - \sum_{j=i+1}^{n} R_{ij}\, w'_j}{R_{ii}} \right\rceil \qquad \text{for } i = n, n-1, \dots, 1$$

The approximation’s coefficients are computed one at a time. Except for the rounding part ($\lfloor\cdot\rceil$ denotes the closest integer) this is exactly what you do to solve a triangular system. Note that previously quantized coefficients are reused to solve for the next coefficient. If we’re lucky we already found the optimal solution (w’=w). But there’s no easy way to tell.
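A minimal sketch of this back substitution with rounding, assuming R is given as a list of rows with a nonzero diagonal. The function name and the example numbers are made up for illustration.

```python
def nulling_and_cancelling(R, y):
    """Back substitution on an upper triangular R, rounding each unknown."""
    n = len(y)
    w = [0] * n
    for i in range(n - 1, -1, -1):
        # Cancel the contribution of the coefficients that are already fixed.
        resid = y[i] - sum(R[i][j] * w[j] for j in range(i + 1, n))
        # Solve for coefficient i and round to the closest integer.
        w[i] = round(resid / R[i][i])
    return w


R = [[2.0, 1.0],
     [0.0, 3.0]]
y = [4.2, 6.3]
print(nulling_and_cancelling(R, y))
```

The cost is the same as for an ordinary triangular solve, which is why this makes a cheap starting point even when it misses the optimum.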
The optimal solution can be computed with an informed tree search where each branch corresponds to a certain quantization decision. At the first level $w_n$ is quantized. Next in line is $w_{n-1}$, and so on, until we reach a leaf node where all coefficients have been determined. Coupling a depth-first search with branch and bound that picks the most promising path first is what I did for my C implementation.
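The following is a rough Python sketch of such a search, not a transcription of the C implementation. It assumes an upper triangular R with nonzero diagonal, tries the candidates for each coefficient in order of increasing distance from the unrounded value, and stops at a level as soon as the partial cost can no longer beat the best leaf found so far. All names are hypothetical.

```python
import math


def solve_triangular_ils(R, y):
    """Depth-first branch and bound for min ||R*w - y||^2 over integer w."""
    n = len(y)
    best = {"cost": math.inf, "w": None}
    w = [0] * n

    def search(i, cost):
        if i < 0:  # leaf: only reached when cost improves on the best so far
            best["cost"], best["w"] = cost, w[:]
            return
        resid = y[i] - sum(R[i][j] * w[j] for j in range(i + 1, n))
        center = resid / R[i][i]
        nearest = round(center)
        step = 1 if center >= nearest else -1
        k = 0
        while True:
            # Zigzag: nearest, nearest+step, nearest-step, nearest+2*step, ...
            offset = (k + 1) // 2 * (step if k % 2 else -step)
            cand = nearest + offset
            new_cost = cost + (resid - R[i][i] * cand) ** 2
            if new_cost >= best["cost"]:
                break  # later candidates at this level only cost more
            w[i] = cand
            search(i - 1, new_cost)
            k += 1

    search(n - 1, 0.0)
    return best["w"], best["cost"]


R = [[1.0, 0.6],
     [0.0, 0.5]]
y = [0.9, 0.3]
print(solve_triangular_ils(R, y))
```

The first leaf reached is the Nulling and Cancelling solution, so the search starts with a reasonable bound. In this small example that first leaf is w=(0,1) and the search then moves on to the better w=(1,0).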
You can download my C implementation here. It’s released under a two-clause BSD license. Have fun!