This update is short and sweet.
Two weeks ago saw completion of the Thusnelda encoder's long-term code refactoring to allow motion vector analysis, mode selection, transformation, quantization and tokenization to be done block-by-block in a single pass. The point of this refactoring was to allow rate-distortion optimization of each individual token as a block is coded (each token relies on preceding tokens) without requiring multiple passes through cache-busting amounts of auxiliary data.
As of this past week, token optimization is operational.
The idea behind token optimization is simple. As with all other aspects of rate-distortion optimization, we seek to minimize the expression:
distortion + lambda * bit_cost
With respect to token optimization, this expression is evaluated for each individual token to be written to the bitstream. The challenge is to evaluate it efficiently on the fly in the context of subsequent tokens. Because any change to the current token affects the coding of subsequent tokens, each optimization must evaluate the provisional bit costs of several possible choices and outcomes.
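The per-token decision reduces to picking whichever coding choice minimizes the expression above. A minimal sketch, assuming a hypothetical candidate structure that bundles a choice's distortion with its provisional bit cost (including the estimated effect on subsequent tokens); none of these names come from the actual Thusnelda code:

```c
#include <limits.h>

/* Hypothetical candidate: one possible way to code the current token,
   with its distortion and a provisional bit cost that accounts for the
   impact on subsequent tokens. Illustrative only. */
typedef struct {
  int distortion; /* e.g. squared error introduced by this choice */
  int bits;       /* provisional bit cost */
} candidate;

/* Return the index of the candidate minimizing
   distortion + lambda * bit_cost. */
static int best_candidate(const candidate *c, int n, int lambda){
  int best = 0;
  long best_cost = (long)c[0].distortion + (long)lambda * c[0].bits;
  for(int i = 1; i < n; i++){
    long cost = (long)c[i].distortion + (long)lambda * c[i].bits;
    if(cost < best_cost){
      best_cost = cost;
      best = i;
    }
  }
  return best;
}
```

Note that lambda is the only tuning knob: lambda = 0 degenerates to pure distortion minimization, while a large lambda trades quality aggressively for bits.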
Theora's token structure as constructed can gain cost benefits only from 'demoting' tokens, that is, reducing the coefficient values being coded either to a smaller value (potentially allowing use of a shorter or combined token) or to zero (possibly extending zero or EOB runs). Token demotion could be evaluated recursively; however, analysis shows that recursive optimization yields a negligible additional cost gain. Therefore, despite the complicated token and coding structure, a fast constrained greedy optimization process yields nearly optimal results. The computational overhead of optimization is under 10%.
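The greedy pass above can be sketched as follows. This is an illustration of the technique, not the real tokenizer: each coefficient is tried at every smaller magnitude down to zero, the single best demotion by rate-distortion cost is kept, and earlier decisions are never revisited. The bits_for() cost model here is a toy stand-in for the real token bit costs:

```c
/* Toy bit-cost model: zero is cheap (it extends a run), larger
   magnitudes cost progressively more. Not Theora's real cost model. */
static int bits_for(int v){
  int b = 1, m = v < 0 ? -v : v;
  while(m){ b += 2; m >>= 1; }
  return b;
}

/* Constrained greedy demotion: for each quantized coefficient, try
   every smaller magnitude (down to zero) and keep the choice that
   minimizes distortion + lambda * bits. Single pass, no recursion. */
static void demote_greedy(int *coef, const int *orig, int n, int lambda){
  for(int i = 0; i < n; i++){
    int best = coef[i];
    long d0 = (long)(orig[i] - coef[i]) * (orig[i] - coef[i]);
    long best_cost = d0 + (long)lambda * bits_for(coef[i]);
    int m = coef[i] < 0 ? -coef[i] : coef[i];
    int s = coef[i] < 0 ? -1 : 1;
    for(int v = m - 1; v >= 0; v--){
      int cand = s * v;
      long d = (long)(orig[i] - cand) * (orig[i] - cand);
      long cost = d + (long)lambda * bits_for(cand);
      if(cost < best_cost){ best_cost = cost; best = cand; }
    }
    coef[i] = best;
  }
}
```

Because each coefficient's scan is bounded by its own magnitude and nothing is revisited, the extra work stays small, which is consistent with the under-10% overhead measured for the real optimizer.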
We turn once again to our test clip from The Matrix. The stills are close-ups from frames 49, 2119 and 2209. The link leads to the actual encoded clip.
Encoded with internal qi=50, skip lambda=20, token lambda=16
(Demonstrating the potential usefulness of adaptive SKIP weighting)
Encoded with internal qi=50, skip lambda=5, token lambda=128
(Due to nonrecursive optimization, the demotion process quickly saturates)