Khalil: Digital Hardware Implementation of Artificial Neurons Models Using FPGA

This paper present the digital implementation of multiply-accumulate (MAC) circuit of artificial neuron using FPGA (Field Programmable Gate Array) including three types of nonlinear activation functions: hardlims, satlins and tansig. A VHDL hardware description Language codes are used to implement the neuron using XC3S500E-FG320 Xilinx FPGA device. The simulation results obtained with Xilinx Foundation 8.2i software are presented. The results are analyzed in terms of usage percentage of chip resources and maximum working frequency.


Introduction
Artificial neural networks (ANNs) have been used successfully in solving pattern classification and recognition problems, function approximation and predictions. Their processing capabilities are based on their highly, parallel and interconnected architecture. Such characteristics make their implementation enormous challenging, and also very costly, due to the large amount of hardware required [ 1].
Digital implementation of ANNs may be performed using different tools such as custom design, digital signal processor (DSP), programmable logic …etc. Among them, programmable logic offers low cost, powerful software development tools and true parallel implementation [ 2].
Field Programmable Gate Array (FPGA) are a family of programmable device based on an array of configurable logic blocks (CLBs), which gives a great flexibility in prototyping, designing and development of complex hardware real time systems . The structure of a FPGA can be described as an "array of blocks" connected togerther via programmable interconnections. The main advantage of FPGA is the flexibility that they afford [ 3]. Xilinx Inc. introduced the world's first FPGA, the XC2064 in 1985. The XC2064 contained approximately 1000 logic gate. Since then, the gate density of Xilinx FPGAs has increased thousands times [4]. Recently there is a lot of interest in the FPGA realization of neural networks which is reported by many researchers [ 1,[5][6][7][8].
In the present work, we introduced the design of an artificial neuron models based on a XC3S500E Xilinx FPGA device. The XC3S500E Xilinx FPGA device has high gate density i.e. 500,000 logic gate and many features, as illustrated below [ 9], which are necessary for neural implementation: -Fast logic enable the design of compact and fast arithmetic functions (i.e., multiplication and addition ). -Look up tables can be used as RAMs and ROMs. -Combinational functions have up to ten inputs within configurable logic blocks (CLBs), and delays are very small and almost independent on the number of variable.
-Very high routing capabilities allows successful implementation of critical path delays, even for complex neural network [1]. The Very high speed integrated circuit Hardware Description Language (VHDL) is heavily used by large corporations, majority of companies as well as universities for FPGA programming.VHDL was first adopted as language standard in 1987, with a major revision occurring in 1993, 2001 [10 ]. VHDL is very powerful (HDL) but very complex syntax language. VHDL simplifies the development of complex system such as ANNS, because it is possible to model and simulate a digital system from a high level of abstraction with important facilities for modular design [2 ].
The purpose of this work is to design an artificial neuron model. The model consist of two stages .The first is the MAC stage for multiplication and accumulation of parallel inputs and weights values . And the second is the nonlinear activation function for the output signal .Three types of activation functions were considered : symmetrical hard limiter, symmetric saturating linear, and hyperbolic tangent sigmoid (referred to as hardlims, satlins, and tansig in Matlab software package receptivity ).

Mathematical model of an artificial neuron
The common Mathematical model of an artificial neuron is shown in Fig.(1) [ 11] . The neuron output can be written as : where j p is the input value and j w is the corresponding weight value , a is the output of the neuron, and f is a nonlinear activation function. Typically the activation function is chosen by the designer for specific training algorithm , and then the weights will be adjusted by some learning rule so that the neuron input / output relationship meet some specific goal.

VHDL design of the neuron
It is important to design the neuron without activation function as common part in designing a complete neuron with any activation function based on FPGA The design affect the utilization ratio of the chips area and the processing speed directly. The structure of the neuron can be realized in many ways , mainly considering the degree of the parallel computation needed .
The proposed VHDL structural diagram for hardware implementation of neuron is shown in Fig (2 ). The structure contains two shift registers , one shifters hold the weights , while the other holds the inputs ( shift register with data load capability ) .This approach is appropriate for general purpose neuron ( i.e., with programmable weights ) .It employs only one i nput to l oad al l wei ghts ( thus savi ng on chi p pi ns ) . The wei ghts are shi f ted i n sequentially until the register is loaded with its weight . The weights are then multiplied by the input and accumulated to produce the desired output . The VHDL code used for the implementation of a neuron without activation function is presented in table (1). Simulation results is shown in Fig.( 3a ) .The neuron has three 4-bit input each .since a SIGNED data representation was employed , the range of the input values and weights runs from (-8 to 7 ) and the range of the 8-bit output runs from ( -128 to 127 ) . The first input vector applied to the neuron has the values 1 23 3 , 4 ,5 p p a n dp , since there are three wights , three clock cycles are needed to shift them in ,as shown in Fig ( 3 a) .The values chosen for the weights were 3 21 7 , 8 ,9 www . Note that 9 is in indeed -7 , and 8 is -8 because data type used here is SIGNED . Consequently, the weight have been all loaded , the system immediately gives its output, i.e., 11 22 33 a pw pw pw (3) ( 7) (4)( 8) (5)(7) 18 represent as 256 18 238. The neuron output for the second input vector [6 8 2]is 36. Fig.(3b) shows the RTL (register transfer level) hardware schematic circuit for implementing linear neuron.

VHDL design of activation functions
The activation function in Fig.( 1 ) may be linear or nonlinear function of n. Aparticular activation function of neuron is chosen to satisfy specification of the training algorithm that the neural network is attempted to run. In this work, three of the most commonly used activation functions are hardware implemented on FPGA using VHDL language.

The symmetrical hard limit activation function
The symmetric hard limit transfer function referred to as " hardlims " in matlab . It is used to classify input into two distinct categories , and can be defined as follows [11]: 10 10 n a n …………………….

.(2)
This function is shown in Fig (4 ) . it is used with MLP, and Hopfield neural networks :-

Fig.( 4 ) Symmetrical hard limit activation function
The VHDL code used to implement this function is shown in table ( 2 ), as VHDL package. Table ( 2) VHDL code for implementing hardlims function as a package .
For simulating purposes the neuron inputs i p and weight i w were represented as signed 4-bit, the output of MAC n as signed 8-bit, and neuron output a represented as stdlogic 1-bit . The time diagram for artificial neuron with hardlims activation function is shown in Fig.(5a). Fig.(5b) shows the RTL hardware schematic circuit for implementing hardlims neuron.

The saturating linear activation function
The output of saturating linear activation function " satlins ", can be defined as follows [11] :  Table 3 shows the VHDL codes ( as a package) used for implementing the " saltins " activation function. Table ( 3 ) VHDL code for implementing satlins activation function as a package .
The resulthng simulation timing diagram of the neuron with satlins activation function is shown in Fig.( 7a) . where 4-bit are assigned to the input j p , 4-bit to the weight j w , 8bit to the input of the activation function, and 8-bit to the output a . Note that all data are of singed type. It can be seen that the output of the neuron a equal the output of MAC n as long as it is lower to the saturation level .Once the sum equals or becomes greater than the saturation level the output remains constant and equal to the saturation level . Fig.(7b) shows the RTL hardware schematic circuit for implementing satlins neuron.

Hyperbolic Tangent Sigmoid activation function
The Hyperbolic tangent sigmoid ( tansing ) activation function is shown in Fig ( 8 ) . This function takes the input ( which may have any value between plus and minus infinity ) and the output value into the range -1 to 1 , according to the expression [11]: The tansig activation function is commonly used in multilayer neural networks that are trained by the backpropagation algorithm , since this function is differentiable [11 ] . The tansig function is not easily implemented in digital hardware because it is consists of an infinite exponential series [2 ] .Many researchers use a lookup table to implement the tansig function. The draw back of using lookup table is the great amount of hardware resources needed [1,5]. A simple second order nonlinear function presented by kwan [12 ] , can be used as an approximation to a sigmoid function . This nonlinear function can be implemented where B and g represent the slope . and the gain of the nonlinear function () fn between the saturation regions -Ll and L. The block diagram of the sigmoid activation function implementation using this process is shown in Fig.(9). The VHDL code for an approximated tansig function as a package is given in Table (4). The input parameters have been set for an integer range from 0 to 255.

Table(4) VHDL code for implementation tansig activation function as a package.
The resulting simulation timing diagram of the neuron with tansig function is shown in Fig ( 10 a) . The inputs and weights of the neuron is defined as 4-bit and the output is defined as 8-bit and all of singed data type .The approximated parameter of tansig function are of integer data type. Fig.(10b) shows the RTL hardware schematic circuit for implementing tansig neuron. 23 In order to make easier visualization of the result of hardware Implementation of tansig function , a comparison between tansing defined by kwan ( equation ( 4 ) ) and the Implemented result is presented in Fig.( 11 ) . This clearly shows that the hardware realization of the tansig function provide a reasonable approximation of this function .

Synthesis and Implementation Results
As a result of Synthesis and Implementation of artificial neuron with three different activation function ( hardlims , satlins , tansig ) on a Xilinx XC3S5OOE FPGA device . Table ( 5 ) give performance and resource use summary for all implemented 8-bit neurons . As it can be seen , the hardlims function require a very few hardware resource in comparison with the tansig function. The operation speed in all cases gives a good results and shows the advantages of using FPGAs in neural realization.

Conclusions
The results of this work successfully demonstrate the hardware implementation of artificial neuron with three different activation functions using Xilinx FPGAS . This allows comparisons to be made between the hardware realization of these neurons , which are regarded as basic building block of artificial neural networks .
The tansig function approximation problem has been employed and demonstrate the validity of hardware implementation. The timing diagrams show the accuracy of the results and to enable comparisons with actual result . The operation frequency in all cases is very good and it gives a clear idea of the advantages of using FPGAs, since multiple modules can be working in parallel with a minimum reduction in performance due to the increased number of interconnections.
Finally , it can be say that FPGAs technology and their low cost , and reprogrammability make this approach a very powerful option for implementing ANNS as an alternative to the development of custom hardware.