Print ISSN: 1813-0526

Online ISSN: 2220-1270

Keywords : FPGA

Pipelined Parallel Implementation of Cryptosystems Based on Advanced Encryption Standard

Israa G. Mohammed; Shifaa A. Dawood

Al-Rafidain Engineering Journal (AREJ), 2015, Volume 23, Issue 2, Pages 44-55
DOI: 10.33899/rengj.2015.101074

A hardware implementation of the Advanced Encryption Standard (AES) is presented. AES is globally adopted to encrypt data in a variety of communication systems because it is reliable, secure and resistant to attacks. A single cryptosystem is suggested to encrypt and/or decrypt different types of data, all of which are treated as text data. The image is considered as a case study for the type of data to be encrypted in real time. The proposed architectures are then used to encrypt video within a time of ≤ 33 ms. Two architectures are proposed. The first is a hybrid of stream and block ciphering; it increases encryption security by reducing the correlation among image pixels. The resulting encryption time for an image of (32x64) pixels is 16.76 µs. The second architecture is proposed for the CTR mode of the AES algorithm and achieves the same encryption time as the first. However, the second architecture needs only half the hardware resources of the first, provided it is used for either encryption or decryption and not for both simultaneously. Real-time operation is achieved through parallel computation based on the pipelining technique. The architectures are synthesized on a Spartan-6 LX (XC6SLX16) using ISE 14.2.
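
The CTR mode named in this abstract can be sketched in software. This is only an illustration of the mode's structure, not the authors' architecture: the Python standard library has no AES, so a SHA-256-based stand-in (`block_function`, a hypothetical name) replaces the AES block encryption, and the key, nonce and counter layout are assumptions. It shows why one datapath can serve either encryption or decryption: both XOR the data with the same keystream.

```python
import hashlib

BLOCK_SIZE = 16  # AES block size in bytes


def block_function(key: bytes, counter_block: bytes) -> bytes:
    """Stand-in for AES block encryption (assumption: this is NOT real AES)."""
    return hashlib.sha256(key + counter_block).digest()[:BLOCK_SIZE]


def ctr_process(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data in CTR mode; the operation is its own inverse."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        # Each block gets its own counter value, so blocks are independent
        # and could be processed in parallel or in a pipeline.
        counter = (offset // BLOCK_SIZE).to_bytes(8, "big")
        keystream = block_function(key, nonce + counter)
        chunk = data[offset:offset + BLOCK_SIZE]
        out.extend(b ^ k for b, k in zip(chunk, keystream))
    return bytes(out)
```

Because every counter block is independent of the others, the keystream blocks can be computed concurrently, which is the property a pipelined hardware design would exploit.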

Efficient Hardware Implementation of the Pipelined DES Encryption Algorithm Using FPGA

Noor Najeeb Qaqos

Al-Rafidain Engineering Journal (AREJ), 2014, Volume 22, Issue 5, Pages 212-223
DOI: 10.33899/rengj.2014.101018

This paper presents a high-throughput reconfigurable hardware implementation of the DES encryption algorithm. This is achieved by a newly proposed implementation of the DES algorithm using the superpipelined concept. DES is simulated using Xilinx 9.2i software, with VHDL as the hardware description language, and implemented on a Spartan-3E FPGA kit. The DES encryption algorithm achieves a high throughput of 18.327 Gbps using 3235 Configurable Logic Blocks (CLBs), obtaining the fastest hardware implementation with better area utilization. A comparison is made between the proposed implementation and other recent implementations. The comparison results indicate that high throughput with optimized resource utilization can be achieved using the superpipelined concept in the proposed design on a single FPGA chip.

Systolic Video Stream Object Detector Using FPGA-E

Dr.Shefa A. Dawwd; Ula T. Salim

Al-Rafidain Engineering Journal (AREJ), 2014, Volume 22, Issue 4, Pages 33-43
DOI: 10.33899/rengj.2014.89977

Object detection is an important operation used in many applications such as computer vision, image and video processing, security, artificial intelligence, and several other areas. However, in these applications it is not easy for software implementations to achieve real-time frame rates and fast, invariant detection under changing object states such as position and size. To solve these problems and speed up the highly intensive calculations required, this paper proposes a simple and efficient template matching architecture for object detection in a video streaming application. It is based on the Sum of Absolute Differences (SAD) together with the Pyramid Sum of Absolute Differences (PSAD) as similarity measures, and on a systolic array design with a sliding window operation, where each video frame is divided into slices that are fed through the window using suitable first-in first-out (FIFO) buffers, instead of sliding the window across the video frame. The implementation is a hardware/software co-design based on pipelining, data recirculation, and single instruction multiple data (SIMD) operations. The results for both the SAD and PSAD algorithms show that the best match is found with a template (window) size of 19×19 bits/pixel, with a detection accuracy rate of 100%.

Keywords: FIFO, FPGA, Object detection, Pipeline, PSAD, SAD, Sliding window, Systolic array, Template matching, Video stream.
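
The SAD similarity measure used for template matching in this abstract can be illustrated with a short software sketch. It is a plain sequential search, not the systolic FIFO-based architecture of the paper, and it does not cover the pyramid variant (PSAD); the function names are illustrative.

```python
def sad(window, template):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(w - t)
        for window_row, template_row in zip(window, template)
        for w, t in zip(window_row, template_row)
    )


def best_match(frame, template):
    """Return (score, row, col) of the window with the smallest SAD."""
    t_rows, t_cols = len(template), len(template[0])
    best = None
    for r in range(len(frame) - t_rows + 1):
        for c in range(len(frame[0]) - t_cols + 1):
            window = [row[c:c + t_cols] for row in frame[r:r + t_rows]]
            score = sad(window, template)
            if best is None or score < best[0]:
                best = (score, r, c)
    return best
```

A SAD of zero means the window and the template are identical; larger values mean a poorer match.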

Rapid Design and Test of Embedded Control Systems Using LABVIEW-FPGA Tool

Maher Algreer; Mohammad Tarik Mohammad

Al-Rafidain Engineering Journal (AREJ), 2014, Volume 22, Issue 4, Pages 65-73
DOI: 10.33899/rengj.2014.100885

A hardware description language (HDL) is typically used to synthesise the digital hardware of control systems. Importantly, this requires deep knowledge of digital hardware design; however, such knowledge is not essential for designing real-time control systems. From this perspective, there is great interest in employing a modern tool environment to simulate, design, validate and rapidly implement the hardware on the target of the application. For this reason, this paper presents the methodology and effectiveness of using the LABVIEW-FPGA tool in the embedded system design of digital control algorithms. Because the model of the control system has already been simulated in the LABVIEW environment, the time of hardware implementation is shortened: the designed control algorithm is translated directly into hardware resources by the LABVIEW-FPGA module. The methodology of hardware digital controller design is explained using the LABVIEW-FPGA module with a SPARTAN-3E FPGA from Xilinx. A prototyped temperature control system using the (CI-53003) is included as one example to demonstrate the embedded hardware design of a digital control system. Experimental results show the successful hardware implementation of the designed algorithm.

Enhanced Hardware Implementation of Hybrid Stochastic Neural Network using FPGA

Rafid Ahmed Khalil; Mustafa Salim

Al-Rafidain Engineering Journal (AREJ), 2014, Volume 22, Issue 2, Pages 142-154
DOI: 10.33899/rengj.2014.87338

Most traditional digital systems use fixed-point or floating-point formats to represent and process data. An alternative approach is to represent data as random bits distributed along a sequence. Stochastic logic can be considered a solution to the hardware size problem for applications that consume a large physical area, such as neural networks, because it uses simple logic gates to implement complex operations and is inherently resistant to bit-flip noise. To avoid some of the problems that this type of processing suffers from, a combination of stochastic logic and classical (fixed-point) logic is used to implement a fully connected feed-forward neural network, a type of network characterized by large FPGA area consumption. Stochastic logic is used to implement part of the multiplication operations in the hidden layers of the network, and an LFSR is used as a random generator for converting the weights and activation function outputs. The hardware utilization results on a Spartan 3E-500K FPGA are compared with another network of the same size. A discussion of some of the issues that this methodology faces is also presented.
Keywords: Artificial neural networks, LFSR, Probabilistic computation, Stochastic arithmetic, FPGA, Stochastic logic.
Arabic abstract (translated): Implementation of an Enhanced Hybrid Stochastic Neural Network Using FPGA
Rafid Ahmed Khalil; Mustafa Salim
Most digital systems carry out arithmetic operations using a fixed-point or floating-point number representation. The alternative is to encode the numbers to be processed as a random sequence of zeros and ones, that is, stochastic logic. It has been considered an alternative to conventional logic in applications that consume a large silicon area, because it needs only simple circuits to carry out complex arithmetic operations and is resistant to noise (bit flips). To avoid some of the problems that this type of processing suffers from, this logic was combined with conventional logic to implement a fully connected feed-forward neural network, a type characterized by its large consumption of the FPGA chip. Stochastic logic was used to perform part of the multiplication operations in the hidden layers of the neural network, with an LFSR random number generator used to convert the weights and the activation function outputs. The results were compared with another network of the same size, and some of the results are discussed.
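
Two ideas from this abstract can be sketched in software: a linear feedback shift register (LFSR) as a pseudo-random source, and multiplication of two values encoded as random bit streams with a single AND gate. The sketch is illustrative only. The 4-bit LFSR width and its taps are assumptions, and for brevity the bit streams are produced with Python's `random` module in place of the LFSR-based converters the paper describes.

```python
import random


def lfsr4_states(seed=0b0001):
    """Yield the states of a 4-bit Fibonacci LFSR until it returns to the seed."""
    state = seed
    while True:
        yield state
        feedback = ((state >> 3) ^ (state >> 2)) & 1  # XOR of the two top bits
        state = ((state << 1) | feedback) & 0xF
        if state == seed:
            return


def to_stream(value, length, rng):
    """Encode a value in [0, 1] as a bit stream whose density of ones is value."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def stochastic_multiply(a, b, length=100_000, seed=0):
    """Estimate a * b by ANDing two independent stochastic bit streams."""
    stream_a = to_stream(a, length, random.Random(seed))
    stream_b = to_stream(b, length, random.Random(seed + 1))
    ones = sum(x & y for x, y in zip(stream_a, stream_b))
    return ones / length
```

The AND gate yields a one only when both streams carry a one, so for independent streams the output density approximates the product of the two input values; the estimate improves with longer streams.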

Architectural Design of Random Number Generators and Their Hardware Implementations

Dr. Basil Shukr Mahmood; Sarmad Fakhrulddin Ismael

Al-Rafidain Engineering Journal (AREJ), 2014, Volume 22, Issue 2, Pages 50-59
DOI: 10.33899/rengj.2014.87322

Affiliations: Sarmad Fakhrulddin Ismael (University of Mosul, Computer Engineering Department); Dr. Basil Shukr Mahmood (University of Ninevah)
The architectural design of random number generators for the uniform, normal, exponential and Rayleigh distributions, using the Box-Muller and inverse transformation methods, has been implemented in hardware on an FPGA. Each of the random number generators can generate one sample every clock cycle. The generators have been implemented on a Xilinx Spartan 3E XC3S500E FPGA and work properly up to a maximum frequency of 418.41 MHz. The outputs of the generators have been tested with the chi-square test at a 5% level of significance, which confirmed the required distributions.
Keywords: Box-Muller, Chi-square, Inverse transformation, FPGA.
Arabic abstract (translated): Architectural Design for Random Number Generation and Its Hardware Implementation
Sarmad Fakhrulddin Ismael (Computer Engineering Department, University of Mosul); Dr. Basil Shukr Mahmood (College of Electronics Engineering)
The architecture designed to generate random numbers with uniform, normal, exponential and Rayleigh distributions, using the Box-Muller method and the inverse transformation method, was built in hardware using an FPGA. Each of the random number generators can generate one number per cycle. The generators were built on a Xilinx Spartan 3E XC3S500E FPGA. The designed generators are suitable for operation at a frequency of 418.41 MHz. The results obtained from the generators were tested with the chi-square test at a 5% significance level and achieved the required distribution.
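
The two methods named in this abstract map uniform samples to other distributions. A minimal floating-point sketch follows; the hardware design presumably uses fixed-point arithmetic, and the function names and parameters here are illustrative assumptions.

```python
import math


def box_muller(u1, u2):
    """Map two uniform samples (u1 in (0, 1], u2 in [0, 1)) to two standard normals."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def exponential_from_uniform(u, rate=1.0):
    """Inverse transformation: solve u = 1 - exp(-rate * x) for x, with u in [0, 1)."""
    return -math.log(1.0 - u) / rate


def rayleigh_from_uniform(u, sigma=1.0):
    """Inverse transformation: solve u = 1 - exp(-x**2 / (2 * sigma**2)) for x."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))
```

In each inverse transformation the uniform sample is treated as a value of the cumulative distribution function, which is then inverted in closed form.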

Design and Implementation of a Network on Chip Using FPGA (English)

Dr. A. I. A. Jabbar; Noor .Th. AL Malah

Al-Rafidain Engineering Journal (AREJ), 2013, Volume 21, Issue 1, Pages 91-100
DOI: 10.33899/rengj.2013.67356

The fundamental building unit of a Network on Chip is the router, which directs packets to the desired host according to a routing algorithm. In this paper, a router is designed in VHDL and implemented on a Spartan 3E FPGA with the help of the Integrated Software Environment (ISE 10.1). The design uses the Spartan 3E resources efficiently (for example, the number of slices required does not exceed 3%). A (4×4) mesh topology network is then designed and implemented on the FPGA (using 43% of the available slices). An example applied to the designed Network on Chip (NoC) validates the design.
Keywords: Router, SoC, NoC, VHDL, FPGA, VGA, MESH

Transformation Matrix for 3D Computer Graphics Based on FPGA (English)

Dr.Fakhrulddin H. Ali

Al-Rafidain Engineering Journal (AREJ), 2012, Volume 20, Issue 5, Pages 1-15
DOI: 10.33899/rengj.2012.61024


Real-time computer graphics is among the fastest-growing computing applications. The 3D (three-dimensional) geometric transformations are among the most important principles of interactive computer graphics and are essential for modeling, viewing and animation. This paper constructs a general form of a single matrix representation for multiple geometric transformations of three-dimensional objects. In this way, a speedup factor of 1 to 5 can be gained. An architecture for the single matrix transformation is designed, implemented as a hardware unit, and then tested using a Field Programmable Gate Array (FPGA).

Keywords: 3D graphics, lookup table, FPGA, concatenation.
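
The idea of concatenating several geometric transformations into one matrix can be sketched with 4×4 homogeneous matrices. The column-vector convention and the helper names are assumptions made for illustration; the abstract does not state the paper's convention.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


def apply(matrix, point):
    """Apply a 4x4 homogeneous transform to a 3-D point (column-vector convention)."""
    vector = (point[0], point[1], point[2], 1)
    return tuple(sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(3))


# Translate first, then scale: with column vectors the later transform goes on the left.
combined = matmul(scaling(2, 2, 2), translation(1, 1, 1))
```

Applying `combined` to a point costs one matrix-vector product regardless of how many transformations were concatenated, which is where a speedup over applying them one by one would come from.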

FPGA Implementation of Reversible Medical Image Watermarking

Dr. Ahlam Fadhil Mahmood

Al-Rafidain Engineering Journal (AREJ), 2012, Volume 20, Issue 4, Pages 21-31
DOI: 10.33899/rengj.2012.54152

Medical image protection and authentication are becoming increasingly important in an e-Health environment where images are readily distributed over electronic networks.
This paper presents an FPGA implementation of reversible watermarking encoder and decoder system. The system is based on the Discrete Cosine Transform (DCT/IDCT) to embed and extract the copyright protection mark and least significant bit (LSB) technique to hide the patients’ information and then extract back the information by the owner using.
The proposed structure employs a single multiplierless 1D-DCT/IDCT block instead of the three used in many existing DCT watermarking systems, which reduces the DCT hardware to 16.6% compared with previous proposals. The parallel hardware implementation of the DCT and LSB is done on a Xilinx XSC3S500 FPGA. The proposed scheme allows multiple insertions by many doctors in order to give the patient an exact diagnosis.

portable system of the watermarking with small size and low power dissipation
Keywords: Information hiding; watermarking ; FPGA; DCT/IDCT; Least Significant Bit method.

FPGA Implementation of a Fuzzy Control Surface

Fakhrulddin H. Ali; Mohammed M. Hussein

Al-Rafidain Engineering Journal (AREJ), 2012, Volume 20, Issue 3, Pages 103-116
DOI: 10.33899/rengj.2012.50483

This paper presents a design methodology for a dual-input, single-output fuzzy logic controller in which the synthesis of the three classical stages (fuzzification, inference engine, and defuzzification) is replaced by the control surface these stages produce, treated as a two-dimensional table called the fuzzy control table (FCT). With this approach, (8x8), (16x16), (32x32), and (64x64) FCTs were investigated, having 64, 256, 1024, and 2048 values respectively. To make the system adaptable to different operating states, a supervisor fuzzy controller is designed to continuously adjust, online, the output factor of the basic fuzzy controller based on the error and change-in-error signals. The proposed architecture is implemented on an XC3S200 FPGA (Spartan-3 starter kit) to control the position of a D.C. servo motor with unknown parameters in real time.
Keywords: control table, FPGA, fuzzy control surface, supervisor controller.

Design and FPGA Implementations of Four Orthogonal DWT Filter Banks Using Lattice Structures

Zainab R. S. Al-Omary; Dr. Jassim M. Abdul-Jabbar

Al-Rafidain Engineering Journal (AREJ), 2011, Volume 19, Issue 6, Pages 124-137
DOI: 10.33899/rengj.2011.26616

In this paper, lattice structures for the DWT are introduced through the design and FPGA implementation of the orthogonal Daubechies filter banks of orders 2, 4, 6 and 8. Both multipliers and shift-add methods are used to perform the multiplication operations for these filter banks. Two implementation techniques are introduced: the pipelining technique, which is efficient in terms of throughput, and the area-efficient bit-serial technique. The results show that pipelining achieves high throughput (2 output samples per clock) at the expense of area, while the bit-serial technique minimizes the allocated area at the expense of throughput, which may decrease with increasing filter order. Compared with other recent implementations, the designed filter banks implemented on the SPARTAN-3E FPGA kit reduce implementation complexity to 0.584 - 0.712 of the corresponding values for different structures in recent hardware implementations. The resulting structures can also operate at high frequencies (up to 47.09 MHz).

FPGA Implementation of Multiplierless DCT/IDCT Chip

Dr. Ahlam Fadhil Mahmood; Abdulkreem Mohameed Salih

Al-Rafidain Engineering Journal (AREJ), 2011, Volume 19, Issue 4, Pages 55-67
DOI: 10.33899/rengj.2011.26797

The advance of mobile electronics technology has produced handheld appliances that support both wireless voice and data communications. One of the most important operations in digital signal and image processing is the 2-D Discrete Cosine Transform. This paper presents a multiplierless two-dimensional Discrete Cosine Transform/Inverse Discrete Cosine Transform (DCT/IDCT) based on the transpose method, in which the 2-D DCT is obtained by taking two 1-D DCTs in series. The input data is first divided into NxN blocks and the row-wise 1-D DCT of each block is taken; the intermediate result is then transposed and a column-wise 1-D DCT is computed, which gives the 2-D DCT of the data. The hardware implementation is parallel and pipelined, and decomposes the coefficient matrix into four power-of-two terms (i.e., 16) so that shift-and-add operations replace the multipliers (i.e., 16). It costs only 1,443 slices and runs at a maximum frequency of 82.8 MHz with a process throughput of 991.2 Megabits/sec when synthesized onto a Spartan3-E XC3S500 FPGA device. The proposed 2-D DCT/IDCT design meets the most demanding real-time requirements of CODEC standardized frame resolutions and rates.
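
The transpose (row-column) method described in this abstract can be sketched in floating point. The sketch uses ordinary multiplications and the orthonormal DCT-II scaling, both of which are assumptions for illustration; it does not reproduce the paper's multiplierless shift-and-add decomposition.

```python
import math


def dct_1d(samples):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(samples)
    result = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            samples[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        result.append(scale * total)
    return result


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def dct_2d(block):
    """2-D DCT: row-wise 1-D DCT, transpose, 1-D DCT again, transpose back."""
    row_pass = [dct_1d(row) for row in block]
    column_pass = [dct_1d(column) for column in transpose(row_pass)]
    return transpose(column_pass)
```

The same 1-D routine is used for both passes, which is why a hardware design can reuse a single 1-D DCT unit together with a transposition memory.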

Implementation of Multiplier less Architectures for Color Space Conversions on FPGA

Dr. Ahlam Fadhil Mahmood; Abdulkreem Mohammad Salih

Al-Rafidain Engineering Journal (AREJ), 2011, Volume 19, Issue 3, Pages 89-103
DOI: 10.33899/rengj.2011.27029

The variety of computers, internet applications, and interactive video devices used in multimedia applications, all with different color representations, forces today's digital designer to convert between them. The objective is a converter that is useful for a number of applications, with the basic function of converting from one color space to another and back on the same architecture. This paper presents an efficient parallel multiplierless implementation of two color space converters (RGB to YCbCr and YCbCr to RGB). The proposed architecture is based on distributed arithmetic (DA) principles and has been implemented on the Xilinx Spartan-3E XC3S500 FPGA using fewer resources. The implementation performs better than existing implementations; modifications have been made to the DA to reduce hardware complexity and improve area, latency and throughput.
Keywords: Color Space Conversion; Distributed Arithmetic; FPGA; Video Processing; Image Processing.

FPGA Implementation Of Elementary Function Evaluation Unit Using CORDIC and Lookup tables

Basil Sh. Mahmood; Ehsan A. Ali

Al-Rafidain Engineering Journal (AREJ), 2011, Volume 19, Issue 2, Pages 50-70
DOI: 10.33899/rengj.2011.27044

In this paper, a hardware computing unit has been designed and implemented. The unit computes many elementary functions (such as sine, cosine, tan⁻¹, sinh, cosh, and square root) whose computation in software requires thousands of clock cycles. The architecture has been designed in VHDL and placed on the XC3S500E FPGA chip of a Spartan 3E as the target technology. Two algorithms suitable for FPGA implementation have been used to compute the mathematical functions. The first is the Coordinate Rotation Digital Computer (CORDIC) algorithm, which was introduced in 1959. It is a single unified algorithm for calculating many elementary functions, including trigonometric, hyperbolic, logarithmic and exponential functions, multiplication, division and square root. The second uses a lookup table. By exploiting the self-similarity of the trigonometric functions and applying parallel pipelining to the CORDIC algorithm, a speedup of (24.7 - 30.3)×100% is obtained compared with other parallel architectures. The throughput is one operation per clock pulse, except for the first operation, whose latency is 32 clock pulses.

Keywords: CORDIC, lookup table, Elementary Function, FPGA
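
The rotation mode of the CORDIC algorithm mentioned in this abstract can be sketched for sine and cosine. The sketch is in floating point, with the iteration count and input range chosen as assumptions; a hardware version would use fixed-point values, a small arctangent table, and shifts in place of the multiplications by powers of two.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Return (sin(theta), cos(theta)) for |theta| <= pi/2 using CORDIC rotations."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Pre-compensate the constant gain introduced by the micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        direction = 1.0 if z >= 0.0 else -1.0
        # In hardware the factor 2**-i is an arithmetic right shift by i bits.
        x, y = x - direction * y * 2.0 ** -i, y + direction * x * 2.0 ** -i
        z -= direction * angles[i]
    return y, x
```

Each iteration is independent in structure from the others, so the loop unrolls naturally into a pipeline with one stage per iteration.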

Design and FPGA Implementation of Takagi- Sugeno Fuzzy Controller Based on LUTs

Rasha Ilham Majeed

Al-Rafidain Engineering Journal (AREJ), 2010, Volume 18, Issue 6, Pages 81-94
DOI: 10.33899/rengj.2010.34908

In this paper, an approach for designing a fuzzy controller based on the Takagi-Sugeno inference engine with a high-speed computational architecture is proposed. The work examines the advantages and disadvantages of Takagi-Sugeno control compared with Mamdani's, and focuses on how the computational complexity of the inference engine can be reduced and the speed of computation increased. The fuzzy controller is implemented on an FPGA using Look-Up Tables (LUTs); each LUT is held in the FPGA's Block RAMs, and a number of arithmetic units are also used in the design. To interface the design with users, a GUI program is designed using Visual Basic. The GUI's data can be stored in the Block RAMs through the JTAG port. Finally, an air conditioning application is implemented, and the practical results (on the FPGA), theoretical results (computed by hand) and Matlab results are compared.
Keywords: LUT, FPGA, JTAG port, Takagi-Sugeno Fuzzy Controller, GUI.

FPGA Based Implementation of Concatenation Matrix

Fakhraldeen H. Ali; Amar I. Dawod

Al-Rafidain Engineering Journal (AREJ), 2010, Volume 18, Issue 2, Pages 15-31
DOI: 10.33899/rengj.2010.28255


Computer graphics system performance is increasing faster than that of any other computing application. Geometric transformations and animation are among the most important principles of interactive computer graphics and are essential for modeling and viewing. This paper constructs a general form of matrix representation for the geometric transformations and implements it using a Field Programmable Gate Array (FPGA). In addition, the sine and cosine functions are evaluated using two techniques: the lookup table method and the CORDIC algorithm.

Keywords: lookup table, FPGA, geometric transformations, CORDIC.

An FPGA-based Fault Tolerance Hypercube Multiprocessor DSP System

Al-Rafidain Engineering Journal (AREJ), 2010, Volume 18, Issue 1, Pages 69-82
DOI: 10.33899/rengj.2010.27993

This paper describes a newly proposed architecture for tolerating faults in a hypercube multiprocessor DSP system. The architecture employs TMS320C40 DSP processors as processing nodes. The system has a single spare DSP processor assigned to each cluster (a group of four nodes). Each pair of clusters shares one FPGA unit connected to every node in the two clusters plus the two spare processors. The FPGA units in the system are devoted to data routing, data distribution (in real-time processing), diagnosis, system reconfiguration and expansion. Every 3D hypercube has additional spare processors connected to the FPGA device of that cube. The spare nodes are used in two stages to tolerate more than one faulty node in each cluster with low overhead and minimum performance degradation. The system makes use of 50% hardware redundancy in the form of spare nodes to achieve fault tolerance. The effectiveness of the interprocessor communications and the fault detection mechanism (for one and two faults) has been successfully simulated using the Xilinx Foundation F2.1i simulator.

Keywords: Fault Tolerance, Hypercube multiprocessor, TMS320C40, FPGA, DSP processor

FPGA Implementation of Adaptive Noise Canceller

Aws Hazim saber; Rafid Ahmed Khalil

Al-Rafidain Engineering Journal (AREJ), 2009, Volume 17, Issue 4, Pages 63-72
DOI: 10.33899/rengj.2009.43287

This paper presents a hardware implementation of a least mean square (LMS) adaptive filter based Adaptive Noise Canceller (ANC) structure on an FPGA using the VHDL hardware description language. First, the adaptive parameters are obtained by simulating the ANC in MATLAB. Second, the data processed by the FPGA, such as the step size, input and output signals, desired signal, and ANC coefficients, are expressed in a fixed-point data representation. Finally, the functions of the FPGA-based system structure for the LMS algorithm are synthesized, simulated, and implemented on a Xilinx XC3S500E FPGA using the Xilinx ISE 9.2i development tool. The results show that it is feasible to implement, train on chip, and use an adaptive LMS filter based ANC in a single FPGA chip.
Keywords: Adaptive noise canceller, least mean square, FPGA, Adaptive FIR filter
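
The LMS weight update at the core of the adaptive noise canceller in this abstract can be sketched as follows. The sketch is in floating point with an assumed tap count and step size, whereas the paper uses a fixed-point representation. In a noise canceller the input `x` is the reference noise, `d` is the primary (signal plus noise) input, and the error sequence is the cleaned output.

```python
def lms_filter(x, d, num_taps, mu):
    """Run an LMS adaptive FIR filter; return the final weights and the error signal."""
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent input first; samples before the start are taken as zero.
        taps = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(w * t for w, t in zip(weights, taps))
        e = d[n] - y
        # LMS update: move each weight along its input, scaled by the error.
        weights = [w + mu * e * t for w, t in zip(weights, taps)]
        errors.append(e)
    return weights, errors
```

The step size `mu` trades convergence speed against stability; too large a value makes the weights diverge.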

Developing the design of the Etherchannel switch for the enhancement of the Quality of Service (QoS) performance

Basil Sh. Mahmood

Al-Rafidain Engineering Journal (AREJ), 2009, Volume 17, Issue 3, Pages 60-71
DOI: 10.33899/rengj.2009.42912

Quality of Service (QoS) mechanisms provide the necessary level of service (bandwidth and delay) to an application in order to maintain an expected quality level. This paper studies the effect of adopting QoS on the performance of a real-time system such as video conferencing. A simulation model of the real-time network is built using the OPNET package. The various parameters affecting the system performance are determined, and different solutions to enhance the system performance are suggested. A modified switch architecture is proposed to enhance the real-time performance of the system and to improve its quality of service capability. The modification adds an Etherchannel unit that classifies data as real-time or non-real-time and directs each data packet to the appropriate channel. The architecture of the Etherchannel unit is described in VHDL and built on an FPGA chip. The modified switch is found to need only seven extra clock pulses to classify each data packet.

Keywords: Quality of Service (QoS), Switched Ethernet, UDP, TCP, FPGA, VHDL

Anti-Aliased DDA

Fakhraldeen Hamid Ali

Al-Rafidain Engineering Journal (AREJ), 2009, Volume 17, Issue 2, Pages 25-34
DOI: 10.33899/rengj.2009.38769


Bit-mapped images are prone to the jaggies (stair-step effect along edges) because the computer uses small dots to build images. This effect is called aliasing and the technique used to reduce it is called antialiasing. This paper investigates aliasing along straight line segments or edges, its origin, and how it is affected by the orientation or slope of the segment. A method for antialiasing or smoothing the straight line segments by modifying the intensity of the pixels is presented. Hardware implementation of this method is finally formulated and tested using Field Programmable Gate Arrays (FPGA).

Keywords: pixel, jaggies, antialiasing, raster, FPGA.
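
One way to smooth a line by modifying pixel intensities, as this abstract describes, is to run a DDA and split each sample's intensity between the two pixels nearest the ideal line. The abstract does not give the paper's exact intensity rule, so the distance-weighted split below is an assumption, and the sketch handles only lines with a slope magnitude of at most 1 drawn from left to right.

```python
import math


def antialiased_dda(x0, y0, x1, y1):
    """Return {(x, y): intensity} for a line with |slope| <= 1 and x0 < x1."""
    slope = (y1 - y0) / (x1 - x0)
    pixels = {}
    y = float(y0)
    for x in range(x0, x1 + 1):
        base = math.floor(y)
        fraction = y - base
        # The closer pixel gets more intensity; the two shares sum to 1.
        pixels[(x, base)] = 1.0 - fraction
        if fraction > 0.0:
            pixels[(x, base + 1)] = fraction
        y += slope  # DDA step: advance y by the slope for each unit step in x
    return pixels
```

For a horizontal line every pixel receives full intensity, which matches the observation that aliasing depends on the slope of the segment.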

Digital Hardware Implementation of Artificial Neurons Models Using FPGA

ad Ahmed Al-Kazzaz; Rafid Ahmed Khalil

Al-Rafidain Engineering Journal (AREJ), 2009, Volume 17, Issue 2, Pages 12-24
DOI: 10.33899/rengj.2009.38764


This paper presents the digital implementation of the multiply-accumulate (MAC) circuit of an artificial neuron using an FPGA (Field Programmable Gate Array), including three types of nonlinear activation functions: hardlims, satlins and tansig. VHDL code is used to implement the neuron on an XC3S500E-FG320 Xilinx FPGA device. The simulation results obtained with Xilinx Foundation 8.2i software are presented. The results are analyzed in terms of the percentage of chip resources used and the maximum working frequency.

Keywords: Artificial Neuron, FPGA, Neural Network
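
The neuron model in this abstract, a multiply-accumulate stage followed by an activation function, can be sketched in software. The three activation functions follow their usual definitions (symmetric hard limit, symmetric saturating linear, and hyperbolic tangent sigmoid); the floating-point arithmetic and the `neuron` helper are illustrative assumptions, not the paper's VHDL design.

```python
import math


def hardlims(n):
    """Symmetric hard limit: +1 for n >= 0, otherwise -1."""
    return 1.0 if n >= 0 else -1.0


def satlins(n):
    """Symmetric saturating linear: n clipped to the range [-1, 1]."""
    return max(-1.0, min(1.0, n))


def tansig(n):
    """Hyperbolic tangent sigmoid."""
    return math.tanh(n)


def neuron(inputs, weights, bias, activation):
    """Multiply-accumulate the inputs with the weights, then apply the activation."""
    accumulator = bias
    for x, w in zip(inputs, weights):
        accumulator += x * w  # one MAC operation per input
    return activation(accumulator)
```

In hardware, hardlims and satlins reduce to comparisons, while tansig typically needs an approximation such as a lookup table.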

FPGA Implementation of a Multilayer Perceptron (MLP) Network

Nour talal gadawe; Rafid Ahmed Khalil

Al-Rafidain Engineering Journal (AREJ), 2009, Volume 17, Issue 1, Pages 1-13
DOI: 10.33899/rengj.2009.38557

In this paper, we suggest a method for designing and implementing a multilayer perceptron (MLP) neural network based on the backpropagation (BP) learning algorithm. The method is described using the very high speed integrated circuit hardware description language (VHDL), which is used in developing very large scale integration (VLSI) designs. First, an artificial neuron with a sigmoid activation function, considered the basic unit of the MLP, is designed and implemented. The MLP network is trained by the BP algorithm in the Matlab environment in order to obtain the ideal parameters of the network. The MLP is then implemented in hardware on Spartan 3E and Virtex4 FPGAs using integer format and floating-point format respectively. A comparison is made between the two arithmetic formats of the MLP implementations on FPGAs.

Keywords: MLP neural networks, floating-point (FLP) arithmetic, FPGA, VHDL.

Adaptive Filter Application in Echo Cancellation System and Implementation using FPGA

Rafid Ahmed Khalil

Al-Rafidain Engineering Journal (AREJ), 2008, Volume 16, Issue 5, Pages 20-32
DOI: 10.33899/rengj.2008.44853


In a telephony system, the signal received by the loudspeaker reverberates through the environment and is picked up by the microphone. This is called an echo signal. It takes the form of a time-delayed and attenuated image of the original speech signal and reduces the quality of the communication. Adaptive filters are a class of filters that iteratively alter their parameters in order to minimize the difference between a desired output and their own output. In the case of acoustic echo, the optimal output is an echoed signal that accurately emulates the unwanted echo signal. This is then used to negate the echo in the return signal. The better the adaptive filter simulates this echo, the more successful the cancellation will be. This paper examines the LMS algorithm of adaptive filtering and its application in an acoustic echo cancellation system, using discrete signal processing in Matlab for simulation with real acoustic signals. A hardware implementation of an adaptive filter has also been developed using an XC3S500E Xilinx FPGA chip and VHDL at the RTL abstraction level.

Keywords: Acoustic echo cancellation, Adaptive Filter, FPGA, VHDL.

FPGA Design and Implementation of a Scan Conversion Graphical Sub-System

Amar I. Dawod; Fakhraldeen H. Ali

Al-Rafidain Engineering Journal (AREJ), 2008, Volume 16, Issue 4, Pages 80-92
DOI: 10.33899/rengj.2008.44735

One major modeling primitive in the field of computer graphics is the planar polygon. This polygon can have an arbitrary number of vertices and different shapes. In this paper, a graphic sub-system is designed and implemented using a Field Programmable Gate Array (FPGA). One of the main tasks of the designed hardware is scan-converting the convex planar polygons required to update an image in the image memory or video RAM, which is used as a frame buffer. A facility to read the pixels (picture elements) from the frame buffer for display on the computer monitor is also included in the design.
Keywords: frame buffer, scan-conversion, polygons, pixels, FPGA

Hardware Implementation of Backpropagation Neural Networks on Field programmable Gate Array (FPGA)

Rafid Ahmed Khalil

Al-Rafidain Engineering Journal (AREJ), 2008, Volume 16, Issue 3, Pages 62-70
DOI: 10.33899/rengj.2008.44649

In this paper, a design method for neural networks based on the VHDL hardware description language and FPGA implementation is proposed. The design of a general neuron for topologies using the backpropagation algorithm is described. The sigmoid nonlinear activation function is also implemented. The neuron is then used in the design and implementation of a neural network on a Xilinx Spartan-3e FPGA. The simulation results obtained with Xilinx ISE 8.2i software are presented. The results are analyzed in terms of operating frequency and chip utilization.
Keywords: Artificial Neural Network, Backpropagation, FPGA, VHDL.

Reconfigurable Hardware Based Programmable Digital Circuit Design for a Rotational Stepper Motor

Rabee Mouffag Hajim; Yahya Taher Qassim

Al-Rafidain Engineering Journal (AREJ), 2008, Volume 16, Issue 1, Pages 15-24
DOI: 10.33899/rengj.2008.43968

In this research, a hardware digital circuit was designed for a programmable rotational stepper motor using VHDL as the design tool and an FPGA as the target technology. The design is implemented on a Spartan 3 starter kit (fitted with an XC3S200 field programmable gate array). The 50 MHz clock provided by the starter kit is divided to obtain the necessary delay time between the motor phases, which ranges from 2 to 10 ms. Through output selections, the direction of rotation of the stepper motor, the magnitude of the angle of movement and the rotation speed can be controlled. The advantage of using reconfigurable hardware (FPGA) instead of discrete digital components is that the designer can modify the design easily and quickly, and the complete design forms an embedded system (which works without a computer). The complete programmable hardware design that controls the stepper motor movement occupies no more than 12% of the chip resources.
Keywords: Stepper Motor, Motion control, FPGA.
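
The clock division described in this abstract is simple arithmetic: the number of 50 MHz clock cycles to count equals the desired phase delay multiplied by the clock frequency. The sketch below computes that count and, as an assumption not stated in the abstract, generates a four-phase wave-drive sequence to show how a direction input could reverse the phase order.

```python
CLOCK_HZ = 50_000_000  # starter-kit clock frequency given in the abstract

PHASES = [0b0001, 0b0010, 0b0100, 0b1000]  # assumed 4-phase wave-drive pattern


def divider_count(delay_seconds, clock_hz=CLOCK_HZ):
    """Number of clock cycles a counter must count to produce the given delay."""
    return round(delay_seconds * clock_hz)


def phase_sequence(num_steps, clockwise=True):
    """Phase patterns for num_steps steps; reversing the order reverses rotation."""
    order = PHASES if clockwise else PHASES[::-1]
    return [order[i % len(order)] for i in range(num_steps)]
```

For the 2 to 10 ms range in the abstract, the counter limit runs from 100,000 to 500,000 cycles, so a 19-bit counter is wide enough.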

Single Chip DWT-IDWT Processor Design with VHDL

Ahmed khorsheed Al-Sulaifanie; Yahya Taher Al-Dabbagh

Al-Rafidain Engineering Journal (AREJ), 2006, Volume 14, Issue 1, Pages 58-72
DOI: 10.33899/rengj.2006.47424

The applications of the Discrete Wavelet Transform (DWT) necessitate fast computation. Full-custom VLSI devices (ASIC) have been used for fast though expensive implementations of the DWT. Field-Programmable Gate Array (FPGA) architectures offer economical but area-constrained implementations of the DWT. The present paper addresses important issues in the design and simulation of ASIC and FPGA architectures for the 1-D DWT as well as the inverse DWT (IDWT) on a single chip using VHDL simulation tools. The design of a programmable chip that can be used as a 1-D DWT or IDWT is introduced, based on two quadrature mirror filters (QMF), one used with the DWT (decomposition) and the other with the IDWT. The design is modular; the chip can easily operate as a DWT or IDWT, with the ability to select one of the four corresponding types of QMF wavelet filters (Daubechies 1, 2, 3 and 4).
The first chip is implemented and simulated on an FPGA for two word lengths, 8-bit and 12-bit. The results show a clock speed of 66.2 MHz for 8-bit and 55 MHz for 12-bit, while the ASIC design achieves clock speeds of 85.5 MHz and 59.2 MHz for 8-bit and 12-bit respectively. Simulation results establish that a higher word length increases accuracy, but at the expense of a larger design and longer combinational logic between two storage elements. This lengthens the critical path as a result of the added complexity, which decreases the maximum clock speed.
Keywords: VHDL, Wavelet, FPGA, Architecture.