Design and Study of Enhanced Parallel FIR Filter Using Various Adders for 16 Bit Length

D.Ashok Kumar, P.Samundiswary

Abstract— Nowadays, parallel Finite Impulse Response (FIR) filters play a very important role in Digital Signal Processing (DSP) based applications. FIR filters are among the most widely used fundamental filters in DSP systems, and parallel FIR filters are derived from the FIR digital filter. In this paper, enhanced parallel FIR filters with various adders are designed and studied using the Fast FIR Algorithm (FFA) based structure and the symmetric convolution based structure, considering 2-parallel and 3-parallel filters. All of these filter structures are designed with the Ripple Carry Adder (RCA), Carry Save Adder (CSA) and Carry Increment Adder (CIA) in place of the existing adders, with an input bit length and coefficient length of 16 bits. The performance of the two structures is then evaluated by designing them in Verilog HDL. Further, they are simulated and synthesized using Xilinx ISE 13.2 for a Virtex-4 family device with a speed grade of -12.

Index Terms— Parallel FIR filter, FFA, symmetric convolution, Ripple Carry Adder (RCA), Carry Save Adder (CSA), Carry Increment Adder (CIA).

I. INTRODUCTION

Digital filters are of two primary types: Finite Impulse Response (FIR) and Infinite Impulse Response (IIR). Due to the explosive growth of multimedia applications, the demand for high-performance and low-power DSP systems is increasing day by day. FIR digital filters are among the most widely used fundamental blocks in DSP applications. Some applications, such as error correction and detection, video processing and data compression, require the FIR filter to operate at high frequencies, whereas other applications, such as Multiple-Input Multiple-Output (MIMO) systems used in wireless communication, require a low-power circuit. The MIMO system also requires a high-throughput FIR filter. The parallel filter structure is a well-known technique for FIR digital filters in terms of area and performance. Recently, a lot of research has been done on the study and analysis of parallel FIR filters based on FFA and symmetric convolution structures to achieve better filters. Many researchers are focusing on the design of different parallel FIR filter structures to meet the needs of the current Very Large Scale Integration (VLSI) industry. In a parallel FIR filter, a Single-Input Single-Output (SISO) system is converted into a Multiple-Input Multiple-Output (MIMO) system. There are two types of parallel FIR filter structures.

The first type is the FFA structure and the second is the symmetric convolution based FIR filter structure. FFA uses a divide-and-conquer technique. The symmetric convolution structures use reusability and polyphase decomposition, which reduce the number of multiplications in the sub-filter section, compared with the FFA based parallel FIR filter, by exploiting the inherent nature of symmetric coefficients.

S. Balasubramaniam et al. [1, 14] discussed the design of 2-parallel FIR filter structures. Yu-Chi Tsao et al. [2, 3, 4] discussed parallel linear FIR filters based on odd and even length. In this paper, FFA and symmetric convolution based FIR filter structures are designed and analyzed for 2-parallel and 3-parallel filters, with an input bit length and coefficient length of 16 bits, by incorporating various adder topologies.

This paper is organized as follows: Section I deals with the introduction and related work. In Section II, parallel FIR filter structures using FFA and symmetric convolution techniques are explained. In Section III, the design of the parallel FIR filter with various adders is discussed. Section IV deals with the simulation and synthesis results of the modified parallel FIR filter structures with adders such as RCA, CSA and CIA for a 16-bit length. The conclusion and future work are given in Section V.

II. PARALLEL FIR FILTER

Parallel FIR filters [5] are designed using the FFA and symmetric convolution structures for 2*2 parallel and 3*3 parallel filters. The 2*2 parallel FIR filter contains two filter inputs (X0, X1), two filter coefficients (H0, H1) and two filter outputs (Y0, Y1). The 3*3 parallel FIR filter contains three filter inputs (X0, X1, X2), three filter coefficients (H0, H1, H2) and three filter outputs (Y0, Y1, Y2).

A. Fast FIR Algorithm (FFA)

In general, the time-domain equation of an N-tap FIR filter is given in (1):

\[ y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k) \]

(1)

Here h(n) and x(n) are finite-duration sequences.
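The time-domain sum can be evaluated directly in software as a reference model. The following is a minimal sketch, assuming samples before the start of the input are zero; the function name `fir_direct` and the example coefficient and input values are illustrative and are not taken from the paper.

```python
def fir_direct(h, x):
    """Direct-form FIR: y(n) = sum over k of h(k) * x(n - k), with x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(len(h)):
            if n - k >= 0:          # samples before the start of x are taken as zero
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


h = [1, 2, 3, 4]            # hypothetical 4-tap coefficient set
x = [1, 0, 0, 0, 2, 1]      # hypothetical input samples
print(fir_direct(h, x))     # prints [1, 2, 3, 4, 2, 5]
```

Each output sample costs N multiplications here, which is the baseline that the parallel structures reorganize.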

The elements of the two-parallel FIR filter structure have already been described. The L-parallel FIR filter is derived from the polyphase decomposition [6]:

\[ \sum_{p=0}^{L-1} Y_p(z^L)\,z^{-p} = \sum_{q=0}^{L-1} X_q(z^L)\,z^{-q} \sum_{r=0}^{L-1} H_r(z^L)\,z^{-r} \]

(2)

where

\[ X_q = \sum_{k=0}^{\infty} z^{-k}\,x(Lk+q) \]

\[ H_r = \sum_{k=0}^{N/L-1} z^{-k}\,h(Lk+r) \]

\[ Y_p = \sum_{k=0}^{\infty} z^{-k}\,y(Lk+p) \]

for p, q, r = 0, 1, ..., L-1.
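The polyphase idea can be checked numerically: splitting the coefficients and the input into L interleaved sub-sequences, filtering every pair, and routing each product to the correct output phase gives the same result as ordinary convolution. The sketch below assumes that the lengths of `h` and `x` are multiples of L; the helper names and test data are hypothetical.

```python
def convolve(a, b):
    """Polynomial product of two coefficient lists (full linear convolution)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def parallel_fir(h, x, L):
    """L-parallel FIR via polyphase components; len(h) and len(x) must be multiples of L."""
    H = [h[r::L] for r in range(L)]      # H_r holds h(Lk + r)
    X = [x[q::L] for q in range(L)]      # X_q holds x(Lk + q)
    n_blocks = len(H[0]) + len(X[0])     # block-rate length, with room for one delay
    Y = [[0] * n_blocks for _ in range(L)]
    for q in range(L):
        for r in range(L):
            # z^-(q+r) is split into an output phase p and a block delay
            p, delay = (q + r) % L, (q + r) // L
            for k, v in enumerate(convolve(H[r], X[q])):
                Y[p][k + delay] += v
    return Y


def interleave(Y):
    """Rebuild y(n) from its polyphase components: y(Lk + p) = Y[p][k]."""
    return [Y[p][k] for k in range(len(Y[0])) for p in range(len(Y))]


h = [1, 2, 3, 4, 5, 6]                          # hypothetical coefficients
x = [3, -1, 4, 1, -5, 9, 2, 6, -5, 3, 5, 8]     # hypothetical input samples
for L in (2, 3):
    y = interleave(parallel_fir(h, x, L))
    assert y[:len(h) + len(x) - 1] == convolve(h, x)
```

This straightforward form uses L*L sub-filter products per block; the FFA structures described next reduce that count.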

B. 2*2 FFA Based Parallel FIR filter structure (L=2)

The two-parallel FIR filter structure is shown in Fig.1. The output equations for this filter structure [7] are

\[ Y_0 = H_0 X_0 + z^{-2} H_1 X_1 \]

(3)

\[ Y_1 = (H_0+H_1)(X_0+X_1) - H_0 X_0 - H_1 X_1 \]

(4)

This structure has two outputs, which are obtained using three FIR sub-filters of length N/2 and four pre/post-processing adders. Both outputs \( Y_0 \) and \( Y_1 \) are determined through the terms \( H_0 X_0 \) and \( H_1 X_1 \).
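The three-sub-filter arrangement can be modelled in a few lines and compared with direct convolution. This is a behavioural sketch only, not the paper's Verilog design: it assumes even-length `h` and `x`, treats the delay z^{-2} as one sample at the block rate, and all function names and data are hypothetical.

```python
def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b, sign=1):
    """Element-wise a + sign*b, zero-padding the shorter list."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + sign * v for u, v in zip(a, b)]


def ffa2(h, x):
    """Two-parallel FFA: three sub-filter products instead of four."""
    H0, H1 = h[0::2], h[1::2]
    X0, X1 = x[0::2], x[1::2]
    p0 = convolve(H0, X0)                        # H0 X0
    p1 = convolve(H1, X1)                        # H1 X1
    ps = convolve(add(H0, H1), add(X0, X1))      # (H0 + H1)(X0 + X1)
    y0 = add(p0, [0] + p1)                       # H0 X0 + z^-2 H1 X1
    y1 = add(add(ps, p0, -1), p1, -1)            # ps - H0 X0 - H1 X1
    return y0, y1


h = [1, 2, 3, 4]                  # hypothetical coefficients
x = [5, -1, 2, 7, 0, 3]           # hypothetical input samples
y0, y1 = ffa2(h, x)
y1 = y1 + [0] * (len(y0) - len(y1))
y = [v for pair in zip(y0, y1) for v in pair]    # interleave even/odd outputs
assert y[:len(h) + len(x) - 1] == convolve(h, x)
```

The saving comes from reusing `p0` and `p1` in both outputs, at the cost of one pre-addition on each of the coefficient and input sides and three post-additions.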

C. 3*3 FFA based parallel FIR filter structure (L=3)

The three-parallel FFA filter structure is shown in Fig.2. The output equations for this filter structure are

\[ Y_0 = H_0 X_0 - z^{-3} H_2 X_2 + z^{-3} [(H_1+H_2)(X_1+X_2) - H_1 X_1] \]

(5)

\[ Y_1 = [(H_0+H_1)(X_0+X_1) - H_1 X_1] - (H_0 X_0 - z^{-3} H_2 X_2) \]

(6)

\[ Y_2 = [(H_0+H_1+H_2)(X_0+X_1+X_2)] - [(H_0+H_1)(X_0+X_1) - H_1 X_1] \\
\quad - [(H_1+H_2)(X_1+X_2) - H_1 X_1] \]

(7)

These structures contain three outputs, 3N multiply units and (3N-1) adders and, like the simple two-parallel FIR filter, include pre/post-processing adders. In the three-parallel FIR filter structure, the outputs \( Y_0, Y_1 \) and \( Y_2 \) are obtained from the inputs and coefficients through the terms \( H_0 X_0 \), \( H_1 X_1 \) and \( H_2 X_2 \). These three terms need to be computed only once, so the structure requires six filtering operations.
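A behavioural check of a three-parallel FFA with six sub-filter products is sketched below. The output expressions are written out in the comments in one standard algebraic form and verified against direct convolution; the sketch assumes that the lengths of `h` and `x` are multiples of 3, and the function names and data are hypothetical.

```python
def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b, sign=1):
    """Element-wise a + sign*b, zero-padding the shorter list."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + sign * v for u, v in zip(a, b)]


def ffa3(h, x):
    """Three-parallel FFA built from six sub-filter products."""
    H0, H1, H2 = h[0::3], h[1::3], h[2::3]
    X0, X1, X2 = x[0::3], x[1::3], x[2::3]
    p0, p1, p2 = convolve(H0, X0), convolve(H1, X1), convolve(H2, X2)
    p01 = convolve(add(H0, H1), add(X0, X1))
    p12 = convolve(add(H1, H2), add(X1, X2))
    p012 = convolve(add(add(H0, H1), H2), add(add(X0, X1), X2))
    t01 = add(p01, p1, -1)            # (H0+H1)(X0+X1) - H1 X1
    t12 = add(p12, p1, -1)            # (H1+H2)(X1+X2) - H1 X1
    u = add(p0, [0] + p2, -1)         # H0 X0 - z^-3 H2 X2
    y0 = add(u, [0] + t12)            # u + z^-3 t12
    y1 = add(t01, u, -1)              # t01 - u
    y2 = add(add(p012, t01, -1), t12, -1)
    return y0, y1, y2


h = [1, 2, 3, 4, 5, 6]                    # hypothetical coefficients
x = [2, 0, -1, 3, 5, 1, -2, 4, 6]         # hypothetical input samples
phases = ffa3(h, x)
n = max(len(p) for p in phases)
phases = [p + [0] * (n - len(p)) for p in phases]
y = [phases[p][k] for k in range(n) for p in range(3)]
assert y[:len(h) + len(x) - 1] == convolve(h, x)
```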

D. 2*2 Symmetric convolution based parallel FIR filter structure (L=2)

The two-parallel symmetric convolution based FIR filter structure is shown in Fig.3. The output equations for this filter structure [8, 9] are given below:

\[ Y_0 = \{1/2[(H_0+H_1)(X_0+X_1)+(H_0-H_1)(X_0-X_1)]-H_1 X_1\} + z^{-2} H_1 X_1 \]

(8)

\[ Y_1 = 1/2[(H_0+H_1)(X_0+X_1)-(H_0-H_1)(X_0-X_1)] \]

(9)
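The benefit of the sum and difference sub-filters shows up when the coefficient set is symmetric: for an even-length symmetric filter, H1 is the reverse of H0, so H0+H1 is symmetric and H0-H1 is antisymmetric, and each needs only about half the multipliers. The sketch below checks both this property and the two-parallel output expressions; it assumes integer data (so the division by two is exact), even lengths, and hypothetical names and values.

```python
def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b, sign=1):
    """Element-wise a + sign*b, zero-padding the shorter list."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + sign * v for u, v in zip(a, b)]


def sym2(h, x):
    """Two-parallel symmetric-convolution form using (H0+H1), (H0-H1) and H1."""
    H0, H1 = h[0::2], h[1::2]
    X0, X1 = x[0::2], x[1::2]
    ps = convolve(add(H0, H1), add(X0, X1))           # (H0+H1)(X0+X1)
    pd = convolve(add(H0, H1, -1), add(X0, X1, -1))   # (H0-H1)(X0-X1)
    p1 = convolve(H1, X1)                             # H1 X1
    half_sum = [v // 2 for v in add(ps, pd)]          # equals H0 X0 + H1 X1 (always even)
    half_diff = [v // 2 for v in add(ps, pd, -1)]     # equals H0 X1 + H1 X0
    y0 = add(add(half_sum, p1, -1), [0] + p1)         # (half_sum - H1 X1) + z^-2 H1 X1
    return y0, half_diff


h = [1, 3, 5, 7, 7, 5, 3, 1]          # hypothetical symmetric coefficients
x = [4, -2, 0, 1, 3, 3, -1, 2]        # hypothetical input samples
H0, H1 = h[0::2], h[1::2]
assert add(H0, H1) == add(H0, H1)[::-1]                        # symmetric sub-filter
assert add(H0, H1, -1) == [-v for v in add(H0, H1, -1)[::-1]]  # antisymmetric sub-filter

y0, y1 = sym2(h, x)
y1 = y1 + [0] * (len(y0) - len(y1))
y = [v for pair in zip(y0, y1) for v in pair]
assert y[:len(h) + len(x) - 1] == convolve(h, x)
```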

E. 3*3 Symmetric convolution based parallel FIR filter structure (L=3)

The three-parallel symmetric convolution based FIR filter structure is shown in Fig.4. The output equations for this FIR filter structure are given in (10), (11) and (12).

\[ Y_0 = 1/2[(H_0+H_1)(X_0+X_1)+(H_0-H_1)(X_0-X_1)] - H_1 X_1 \\
\quad + z^{-3}\{(H_0+H_1+H_2)(X_0+X_1+X_2) - (H_0+H_2)(X_0+X_2) - 1/2[(H_0+H_1)(X_0+X_1)-(H_0-H_1)(X_0-X_1)] - H_1 X_1\} \]

(10)

\[ Y_1 = 1/2[(H_0+H_1)(X_0+X_1)-(H_0-H_1)(X_0-X_1)] \\
\quad + z^{-3}\{1/2[(H_0+H_2)(X_0+X_2)+(H_0-H_2)(X_0-X_2)] - 1/2[(H_0+H_1)(X_0+X_1)+(H_0-H_1)(X_0-X_1)] + H_1 X_1\} \]

(11)

\[ Y_2 = 1/2[(H_0+H_2)(X_0+X_2)-(H_0-H_2)(X_0-X_2)] + H_1 X_1 \]

(12)
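A three-parallel decomposition built from the sub-filters (H0+H1), (H0-H1), (H0+H2), (H0-H2), H1 and (H0+H1+H2) can be checked in the same way. The expressions used in the code are stated in the comments; they are one algebraically consistent form and are verified here only against direct convolution, not against the authors' hardware. Integer data, lengths that are multiples of 3, and all names and values are assumptions of this sketch.

```python
def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b, sign=1):
    """Element-wise a + sign*b, zero-padding the shorter list."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + sign * v for u, v in zip(a, b)]


def half(a):
    """Exact division by two (the inputs here are always even for integer data)."""
    return [v // 2 for v in a]


def sym3(h, x):
    H0, H1, H2 = h[0::3], h[1::3], h[2::3]
    X0, X1, X2 = x[0::3], x[1::3], x[2::3]
    s01 = convolve(add(H0, H1), add(X0, X1))             # (H0+H1)(X0+X1)
    d01 = convolve(add(H0, H1, -1), add(X0, X1, -1))     # (H0-H1)(X0-X1)
    s02 = convolve(add(H0, H2), add(X0, X2))             # (H0+H2)(X0+X2)
    d02 = convolve(add(H0, H2, -1), add(X0, X2, -1))     # (H0-H2)(X0-X2)
    p1 = convolve(H1, X1)                                # H1 X1
    full = convolve(add(add(H0, H1), H2), add(add(X0, X1), X2))
    a01, b01 = half(add(s01, d01)), half(add(s01, d01, -1))  # H0X0+H1X1, H0X1+H1X0
    a02, b02 = half(add(s02, d02)), half(add(s02, d02, -1))  # H0X0+H2X2, H0X2+H2X0
    # Y0 = a01 - H1X1 + z^-3 (full - s02 - b01 - H1X1)
    inner0 = add(add(add(full, s02, -1), b01, -1), p1, -1)
    y0 = add(add(a01, p1, -1), [0] + inner0)
    # Y1 = b01 + z^-3 (a02 - a01 + H1X1)
    y1 = add(b01, [0] + add(add(a02, a01, -1), p1))
    # Y2 = b02 + H1X1
    y2 = add(b02, p1)
    return y0, y1, y2


h = [1, 2, 3, 3, 2, 1]                    # hypothetical symmetric coefficients
x = [2, 0, -1, 3, 5, 1, -2, 4, 6]         # hypothetical input samples
H0, H2 = h[0::3], h[2::3]
assert H2 == H0[::-1]                     # so H0+H2 is symmetric, H0-H2 antisymmetric

phases = sym3(h, x)
n = max(len(p) for p in phases)
phases = [p + [0] * (n - len(p)) for p in phases]
y = [phases[p][k] for k in range(n) for p in range(3)]
assert y[:len(h) + len(x) - 1] == convolve(h, x)
```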

III. PARALLEL FIR FILTER WITH VARIOUS ADDERS

Adders are categorized according to how the carry generated in the addition of two \( n \)-bit numbers is handled.

A. Ripple Carry Adder

An \( n \)-bit ripple carry adder consists of \( n \) full adders, with the carry signal propagating from one full adder stage to the next, from the LSB to the MSB. The critical path of the ripple carry adder is the carry chain from the first full adder stage to the last. An 8-bit ripple carry adder structure is shown in Fig.5.
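The carry chain can be illustrated with a bit-level software model. This is a minimal behavioural sketch, assuming unsigned operands and a default width of 16 bits; the function names are hypothetical and the model says nothing about hardware delay.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, n=16, cin=0):
    """Add two n-bit unsigned values one bit at a time, LSB first."""
    carry = cin
    total = 0
    for i in range(n):
        # the carry of stage i feeds stage i + 1, which forms the critical path
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


print(ripple_carry_add(1234, 4321))     # prints (5555, 0)
print(ripple_carry_add(0xFFFF, 1))      # prints (0, 1)
```

The second call shows the worst case: a carry generated at bit 0 has to pass through every stage before the result is settled.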

B. Carry Save Adder

A carry-save adder, illustrated in Fig.6, is a type of digital adder used to compute the sum of three or more n-bit binary numbers. It differs from other digital adders in that it outputs two numbers of the same dimensions as the inputs: one is a sequence of partial sum bits and the other is a sequence of carry bits.
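The two-output behaviour can be sketched as a 3:2 compression step: each bit position is summed independently, so no carry propagates inside the stage. The sketch below assumes non-negative integer operands, and the function names are hypothetical; a final carry-propagate addition is still needed to obtain a single result.

```python
def carry_save_add(a, b, c):
    """3:2 compression: a + b + c == partial_sum + (carry << 1)."""
    partial_sum = a ^ b ^ c                   # per-bit sum without carries
    carry = (a & b) | (b & c) | (a & c)       # per-bit majority gives the carries
    return partial_sum, carry


def sum_operands(values):
    """Reduce several operands with carry-save stages, then add once."""
    s, c = 0, 0
    for v in values:
        # invariant: running total == s + (c << 1)
        s, c = carry_save_add(s, c << 1, v)
    return s + (c << 1)                       # single carry-propagate addition


print(carry_save_add(5, 3, 6))                # prints (0, 7)
print(sum_operands([1, 2, 3, 4, 65535]))      # prints 65545
```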

C. Carry Increment Adder

The Carry Increment Adder (CIA) consists of RCAs and incremental circuitry. The incremental circuit can be designed using half adders connected sequentially in a ripple carry chain. The operands are divided into groups of 4 bits, and the addition is performed using several 4-bit RCAs. The architecture of the CIA is shown in Fig.7.
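A behavioural model of this grouping is sketched below. It assumes one common arrangement, which the text does not spell out: each 4-bit group is added by an RCA with zero carry-in, a half-adder chain then increments the group sum by the incoming group carry, and the outgoing group carry is the OR of the RCA carry and the incrementer carry. The function names are hypothetical.

```python
def rca(a, b, width):
    """Ripple carry addition of two width-bit values with zero carry-in."""
    carry, total = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def increment(value, cin, width):
    """Half-adder chain: adds the single bit cin to a width-bit value."""
    carry, total = cin, 0
    for i in range(width):
        bit = (value >> i) & 1
        total |= (bit ^ carry) << i       # half-adder sum
        carry = bit & carry               # half-adder carry
    return total, carry


def carry_increment_add(a, b, n=16, group=4):
    """Carry increment addition of two n-bit unsigned values in groups of `group` bits."""
    mask = (1 << group) - 1
    result, carry = 0, 0
    for g in range(0, n, group):
        s, c_rca = rca((a >> g) & mask, (b >> g) & mask, group)
        s, c_inc = increment(s, carry, group)
        carry = c_rca | c_inc             # the two carries are never both 1
        result |= s << g
    return result, carry


print(carry_increment_add(0x0FFF, 0x0001))    # prints (4096, 0)
```

In hardware, all group RCAs can operate at the same time because none of them waits for a carry-in; only the short incrementer stages depend on the previous group.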

IV. SIMULATION RESULTS

Parallel FIR filters using the FFA and symmetric convolution structures are designed in Verilog HDL. Simulation and synthesis [10] of these structures are then carried out using Xilinx ISE 13.2 for a Virtex-4 family device with a speed grade of -12. In the simulation results, the Technology View [11] describes the top block, which shows the set of inputs and outputs. The Register Transfer Level (RTL) view shows the internal architectural blocks along with the connections between the input and output pins. The timing waveform [12, 13] is generated by writing a test bench program that contains the set of input test vectors applied to the design.

A. Simulation results of two-parallel 16-bits FFA based FIR Filter using RCA

The RTL view and technology view are shown in Fig.8 and Fig.9. The timing waveform of the two-parallel FFA based FIR filter with RCA, shown in Fig.10, represents the output obtained for the various input vectors provided in the test bench program during simulation. The output also depends on the clock and reset values.

B. Simulation results of two-parallel 16-bits FFA based FIR Filter using CSA

The RTL view and technology view are shown in Fig.11 and Fig.12. The timing waveform of the two-parallel FFA based FIR filter with CSA, shown in Fig.13, represents the output obtained for the various input vectors provided in the test bench program during simulation. The output also depends on the clock and reset values.

Fig.11. RTL view of two parallel 16 bits FFA based FIR filter using CSA

Fig.12. Technology view of two parallel 16 bits FFA based FIR Filter using CSA

Fig.13. Timing wave form of two parallel 16 bits FFA based FIR filter using CSA

C. Simulation results of two-parallel 16-bits FFA based FIR Filter using CIA

Fig.14. RTL view of two parallel 16 bits FFA based FIR filter using CIA

Fig.15. Technology view of two parallel 16 bits FFA based FIR filter using CIA

Fig.16. Timing wave form of two parallel 16 bits FFA based FIR filter using CIA

The RTL view and technology view are shown in Fig.14 and Fig.15. The timing waveform of the two-parallel FFA based FIR filter with CIA, shown in Fig.16, represents the output obtained for the various input vectors provided in the test bench program during simulation. The output also depends on the clock and reset values.

D. Simulation results of three-parallel 16-bits symmetric convolution FIR Filter using CSA

The technology view is shown in Fig.17 and the RTL view is shown in Fig.18. The timing waveform of the three-parallel symmetric convolution based FIR filter with CSA, shown in Fig.19, represents the output obtained for the various input vectors provided in the test bench program during simulation. The output also depends on the clock and reset values.

E. Simulation results of three-parallel 16-bits symmetric convolution FIR Filter using CIA

The RTL view is shown in Fig.20 and the technology view is shown in Fig.21. The timing waveform of the three-parallel symmetric convolution based FIR filter with CIA, shown in Fig.22, represents the output obtained for the various input vectors provided in the test bench program during simulation. The output also depends on the clock and reset values.

F. Simulation results of three-parallel 16-bits symmetric convolution based FIR Filter using RCA

Fig.23. RTL view of three parallel 16 bits symmetric convolution based FIR filter using RCA

Fig.24. Technology view of three parallel 16 bits symmetric convolution based FIR filter using RCA

Fig.25. Timing wave form of three parallel 16 bits symmetric convolution based FIR filter using RCA

G. Performance analysis of FFA and symmetric convolution based FIR filters for 16-bit length with various adders

The performance of the FFA and symmetric convolution based FIR filters for 16-bit length with various adders (RCA, CSA and CIA) is compared in tabular form.

<table>
<thead>
<tr>
<th>Structure / Type of adder</th>
<th>Delay (ns) L=2</th>
<th>Area (slices) L=3</th>
</tr>
</thead>
<tbody>
<tr>
<td>Symmetric convolution</td>
<td></td>
<td></td>
</tr>
<tr>
<td>RCA</td>
<td>42.123</td>
<td>132</td>
</tr>
<tr>
<td>CSA</td>
<td>37.030</td>
<td>142</td>
</tr>
<tr>
<td>CIA</td>
<td>27.144</td>
<td>583</td>
</tr>
</tbody>
</table>

From Table I, it is observed that the FFA structure based FIR filter for 16-bit input data and coefficients has a smaller area with RCA than with CSA or CIA, while CIA gives a lower delay than RCA or CSA.

<table>
<thead>
<tr>
<th>Structure / Type of adder</th>
<th>Delay (ns) L=2</th>
<th>Area (slices) L=3</th>
</tr>
</thead>
<tbody>
<tr>
<td>Fast FIR Algorithm</td>
<td></td>
<td></td>
</tr>
<tr>
<td>RCA</td>
<td>20.629</td>
<td>148</td>
</tr>
<tr>
<td>CSA</td>
<td>14.642</td>
<td>165</td>
</tr>
<tr>
<td>CIA</td>
<td>12.448</td>
<td>516</td>
</tr>
</tbody>
</table>

From Table II, it is observed that the symmetric convolution based FIR filter for 16-bit input data and coefficients has a smaller area with RCA than with CSA or CIA, while CIA gives a lower delay than RCA or CSA.

V. CONCLUSION AND FUTURE WORK

In this paper, the FFA based FIR filter structure and the symmetric convolution based FIR filter structure using RCA, CSA and CIA with an input bit length of 16 bits are designed in Verilog HDL. These structures are simulated in Xilinx ISE 13.2 for a Virtex-4 family device with a speed grade of -12. The simulation results show that, for both the symmetric convolution and FFA based FIR filter structures, CIA performs better than RCA and CSA in terms of delay. In terms of area, RCA shows better results than CSA and CIA for both structures. In future, the work can be extended from 3-parallel structures to 6-parallel structures by using a cascading algorithm.

REFERENCES


Xilinx, “7 Series FPGAs Configurable Logic Block”, UG 474 (v 1.5), August 6, 2013.


Ashok Kumar obtained his B.Tech degree (2011) in Electronics and Communication Engineering from Intell College of Engineering and Technology, Ananthapur, affiliated to JNTU Ananthapur. He is currently pursuing his M.Tech degree (2014) in the Department of Electronics Engineering, School of Engineering and Technology, Pondicherry University, Pondicherry, India. His research interests include Digital Design using Verilog HDL, FPGA and Low Power Design.

Dr. P. Samundiswary received the B.Tech degree (1997), M.Tech degree (2003) and Ph.D. (2011) in the Department of Electronics and Communication Engineering from Pondicherry Engineering College, affiliated to Pondicherry University, India. She has 15 years of teaching experience. She is currently working as Assistant Professor in the Department of Electronics Engineering, School of Engineering and Technology, Pondicherry University, Pondicherry, India. Her research interests include Wireless Communication, Wireless Networks and Digital Design using Verilog HDL.