Tradeoff of FPGA Design of a Floating-point Library for Arithmetic Operators


  • Daniel M. Muñoz
  • Diego F. Sanchez
  • Carlos H. Llanos
  • Mauricio Ayala-Rincón



Floating-point arithmetic, FPGAs


Many scientific and engineering applications perform a large number of arithmetic operations that must be computed efficiently with high precision and a large dynamic range. Commonly, these applications are implemented on personal computers, which offer floating-point arithmetic and high operating frequencies. However, most common software architectures execute instructions sequentially, following the von Neumann model, and consequently introduce delays in the data transfer between the program memory and the Arithmetic Logic Unit (ALU). Several mobile applications also require high performance in terms of computational accuracy and execution time, together with low power consumption. Modern Field Programmable Gate Arrays (FPGAs) are a suitable solution for high-performance embedded applications given the flexibility of their architectures and their parallel capabilities, which allow the implementation of complex algorithms and performance improvements. This paper describes a parameterizable floating-point library of arithmetic operators for FPGAs. A general architecture was implemented for addition/subtraction and multiplication, and two different architectures, based on the Goldschmidt and Newton-Raphson algorithms, were implemented for division and square root. Additionally, a tradeoff analysis of the hardware implementation was performed, which enables the designer to choose, for general-purpose applications, a suitable bit-width representation and its associated error, as well as the area cost, elapsed time, and power consumption of each arithmetic operator. Synthesis results demonstrate the effectiveness of the implemented cores on commercial FPGAs and show that the most critical parameter is the consumption of dedicated Digital Signal Processing (DSP) slices.
Simulation results were used to compute the mean square error (MSE) and the maximum absolute error, demonstrating the correctness of the implemented floating-point library and providing an experimental error analysis. The Newton-Raphson algorithm achieves MSE results similar to those of the Goldschmidt algorithm while operating at similar frequencies; however, the former uses less logic area and fewer dedicated DSP blocks.
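As a rough software illustration of the two division schemes named above, the following sketch applies the Newton-Raphson and Goldschmidt iterations to double-precision values and reports the MSE and maximum absolute error against native division. It is not the hardware architecture of the library: the function names, the linear seed, the iteration count, the assumption that the divisor is already normalized to [1, 2), and the test samples are all illustrative choices made here.

```python
def seed_reciprocal(d):
    """Linear first guess for 1/d (illustrative; assumes 1 <= d < 2)."""
    return 1.5 - 0.5 * d


def newton_raphson_divide(n, d, iterations=5):
    """Approximate n/d by refining x ~ 1/d with x <- x * (2 - d * x)."""
    x = seed_reciprocal(d)
    for _ in range(iterations):
        x = x * (2.0 - d * x)
    return n * x


def goldschmidt_divide(n, d, iterations=5):
    """Approximate n/d by scaling numerator and denominator together
    until the denominator converges to 1."""
    x = seed_reciprocal(d)
    num, den = n * x, d * x
    for _ in range(iterations):
        f = 2.0 - den      # correction factor
        num *= f           # the two products are independent of each other
        den *= f
    return num


def error_metrics(divide, samples):
    """Return (MSE, maximum absolute error) against native division."""
    sq_sum = 0.0
    max_abs = 0.0
    for n, d in samples:
        err = divide(n, d) - n / d
        sq_sum += err * err
        max_abs = max(max_abs, abs(err))
    return sq_sum / len(samples), max_abs


# Hypothetical test set: both operands swept over [1, 2).
samples = [(1.0 + j / 7.0, 1.0 + k / 1000.0)
           for j in range(7) for k in range(1000)]

nr_mse, nr_max = error_metrics(newton_raphson_divide, samples)
gs_mse, gs_max = error_metrics(goldschmidt_divide, samples)
print("Newton-Raphson: MSE =", nr_mse, "max abs error =", nr_max)
print("Goldschmidt:    MSE =", gs_mse, "max abs error =", gs_max)
```

Both iterations converge quadratically from the same seed, which is consistent with the similar MSE reported for the two algorithms. In each Newton-Raphson step the two multiplications depend on one another, whereas the two Goldschmidt products can be evaluated side by side.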