Exploring Optimized Hadamard Methods to Design Energy-Efficient SATD Architectures

Authors

  • Luiz Henrique Cancellier
  • André Beims Brascher
  • Ismael Seidel
  • José Luis Güntzel
  • Luciano Agostini

DOI:

https://doi.org/10.29292/jics.v10i2.412

Keywords:

Video coding, VLSI design, Sum of Absolute Transformed Differences, Hadamard, Energy efficiency

Abstract

State-of-the-art video coding tools are subject to severe performance and energy consumption requirements, which result from the high complexity of video coding standards and the limited energy budgets of portable mobile devices. Although inter-frame and intra-frame prediction techniques provide most of the compression gains, they are also the most demanding steps, since they compare a huge number of blocks. In this process, the similarity metric employed affects both the compression quality and the computational effort. In this paper, we propose using the Hadamard-based Sum of Absolute Transformed Differences (SATD) in place of the traditionally used Sum of Absolute Differences (SAD) as a means of improving video coding efficiency. To that end, we explore two Hadamard Transform methods for designing efficient SATD architectures: one using the Fast Hadamard Transform (FHT) butterfly and another using the so-called Transform-Exempted (TE) SATD algorithm. These methods were combined with architectural decisions (full parallelism, full parallelism with pipelining, or multi-cycling) to build a total of six Hadamard-based SATD architectures, which were synthesized with a commercial 45 nm standard cell library for two operating frequencies. The architectures were simulated with pixel block data to obtain realistic dynamic power and energy estimates. The TE-SATD architectures achieved the lowest energy results: down to 13.13 pJ/SATD for the parallel architecture with pipelining. However, when area is also taken into account alongside energy, the best results are obtained by both methods using multi-cycling (transpose buffer): nearly 20.75 pJ/SATD with up to 63.54% smaller area than the fully parallel architectures.
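
To make the two similarity metrics concrete, the following is a minimal software sketch, not the hardware architectures evaluated in the paper. It assumes square blocks whose side is a power of two (4x4 in the usage note), applies an unnormalized Hadamard transform via the FHT butterfly to the rows and then the columns of the difference block, and omits any scaling factor a particular encoder might apply. The function names `sad`, `fht`, and `satd` are illustrative only, and the TE-SATD algorithm is not shown.

```python
def sad(orig, cand):
    """Sum of Absolute Differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_o, row_c in zip(orig, cand)
               for a, b in zip(row_o, row_c))


def fht(vec):
    """Unnormalized fast Hadamard transform of a power-of-two-length vector.

    Each pass is one butterfly stage: pairs (a, b) become (a + b, a - b).
    """
    out = list(vec)
    half = 1
    while half < len(out):
        for start in range(0, len(out), 2 * half):
            for j in range(start, start + half):
                a, b = out[j], out[j + half]
                out[j], out[j + half] = a + b, a - b
        half *= 2
    return out


def satd(orig, cand):
    """Sum of Absolute Transformed Differences (assumed: no normalization)."""
    # Difference block between the original and the candidate block.
    diff = [[a - b for a, b in zip(row_o, row_c)]
            for row_o, row_c in zip(orig, cand)]
    # 2-D Hadamard transform: 1-D FHT on every row, then on every column.
    rows = [fht(row) for row in diff]
    cols = [fht(list(col)) for col in zip(*rows)]
    return sum(abs(v) for col in cols for v in col)
```

As a usage illustration under the same assumptions, take two 4x4 blocks that differ by 1 in a single pixel: `sad` sees only that one difference, while `satd` accumulates its contribution across all 16 transform coefficients, because the transform spreads a spatially isolated difference over every coefficient.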

Published

2020-12-28