Detection and Diagnosis of Inter-Turn Short Circuit Faults of PMSM for Electric Vehicles Based on Deep Reinforcement Learning

Fault diagnosis of electric vehicle motors is an active research topic, and machine learning has proved its worth in this field. Classical machine learning methods, such as the support vector machine (SVM) and the artificial neural network (ANN), suffer from the feature extraction problem: the efficiency of the system depends on the quality of the extracted features. Deep learning and deep neural networks solved this problem. However, despite their efficient performance, deep neural networks require considerable experience in selecting parameters and building the network structure. The emergence of deep reinforcement learning, capable of handling raw data directly and constructing end-to-end systems that link raw fault data with its corresponding mode, represents a significant advance in the field of machine learning. Furthermore, deep reinforcement learning exhibits greater intelligence than previous methods. In this research, deep reinforcement learning is applied to diagnose inter-turn short circuit faults and to determine the fault level in the built-in wheel permanent magnet synchronous motor of electric vehicles. The proposed method proved highly effective for detecting faults, with an efficiency of 99.9%, and has a promising future in building a general system capable of predicting faults in the early stages.


Vol. 28, No. 2, September 2023, pp. 75-85

INTRODUCTION
Diagnosing faults in the electric motors of electric cars is a topic of growing interest because of the increasing demand for these cars [1], [2]. With governments worldwide moving to address global warming, most manufacturers have increased production. For example, sales increased by 54% between 2016 and 2017, and a significant change in the share of electric cars in use is expected over the next fifteen years [3]. There is therefore a need to increase the safety of electric cars and protect their occupants through early detection of the malfunctions that can occur in their motors.

The permanent magnet synchronous motor (PMSM) is one of the most common motor types used in electric cars because of the characteristics that distinguish it from other motors: high efficiency, small size, high torque-to-current ratio, durability, and high speed [4], [5]. Despite these advantages, PMSMs are prone to several faults. The higher the power and torque profile, the higher the mechanical and thermal stress, which makes the motor vulnerable to faults [6]. The best-known and most frequent of these faults is the inter-turn short circuit in the stator windings; in general, stator faults account for about 38% of all faults that affect the motor [7]. Several causes lead to inter-turn short circuits, including breakdown of the stator winding insulation, overheating, running for long periods, and aging [8]. An inter-turn short circuit (ITSC) fault starts as a minor fault and develops quickly into a severe one. The motor then draws a higher current, which significantly raises the motor temperature and can lead to breakdown of the machine. Because even a minor inter-turn short circuit may develop into a severe one, it is necessary to detect it in the early stages [9].

Fault detection in electrical machines has been, and remains, a subject of extensive discussion among researchers. The research on machine learning for detecting and diagnosing faults is reviewed here. Yaw D. Nyanteh used an artificial neural network (ANN) to detect stator winding faults in the PMSM by taking the current signal for different fault cases, finding the zero component of the current signal, and then training the neural network on these data to detect faults and their severity level [10]. Young-Woo Young used a support vector machine (SVM) and the fast Fourier transform (FFT) to detect faults in the internal windings of the stator, as well as the demagnetization fault in the PMSM [11]. Siyuan Liang used the sparse representation method to extract features from sensor data and then used an SVM to detect the inter-turn short circuit fault [12]. These traditional methods based on simple machine learning, such as SVM and ANN, suffer from the feature extraction problem, which requires expert experience. They also need a long training time to achieve good results and have a low level of intelligence, so the efficiency of the model depends on the quality of the extracted features.

The development of artificial intelligence and the breakthrough achieved by deep learning made it a focus of researchers' interest, especially in the field of fault detection. Deep learning solved the feature extraction problem, as the multiple layers that make up the neural network can learn better and deal with complex data. Wang Chio Sheng used a one-dimensional convolutional neural network (1D CNN) to detect faults in a PMSM [13]. I-Hsi Kao used the wavelet packet transformation with a convolutional neural network (CNN) to detect bearing faults as well as demagnetization faults [14]. Although deep learning algorithms solved the feature extraction problem and achieved good results, they also require experience in building the structure of the neural network and in choosing the hyperparameters that achieve the desired result. Additionally, fault diagnosis using deep learning is often perceived as less sophisticated, since it primarily relies on supervised learning, which simply fits input data to corresponding outputs. There is therefore a need to explore more intelligent systems capable of detecting faults based solely on raw data, without requiring specialized expertise in extracting features or selecting optimal hyperparameters.

The emergence of deep reinforcement learning (DRL), resulting from the integration of deep learning and reinforcement learning, has marked a significant breakthrough in artificial intelligence. This approach combines the perceptual capabilities of deep learning with the decision-making abilities of reinforcement learning, bringing it closer to human behavior [15]. It has shown high classification efficiency, as it can deal directly with raw, complex data and link the fault data with the associated fault mode [16], [17]. DRL has achieved many successes, the most prominent of which was in games, where a DRL-based program defeated the world champion at the game of Go [18]. This opened the door to applying DRL algorithms in many fields. DRL has achieved great success in controlling robots, as well as in computer games and recommendation systems [19], [20]. It has also had successes in electrical engineering, where it was used to solve problems of residential smart grids and power system management, and to build an optimal control system for HVAC systems [15]. Recently, Ding et al. [21] used DRL to detect mechanical faults in induction machines from the vibration signal. The authors used a sparse auto-encoder (SAE) as the deep learning technique and achieved good performance and high efficiency.

Based on our literature survey, no researchers have used DRL to detect electrical faults such as inter-turn short circuits in the permanent magnet synchronous motors used in electric vehicles. This paper proposes a novel method that uses deep reinforcement learning with a deep convolutional neural network (DCNN) as the agent policy, so that the DCNN can extract features from the fault data more effectively. The proposed method has been compared with previous machine learning-based fault detection methods, and its advantages can be summarized as follows.
1. The system based on deep reinforcement learning achieves excellent performance in detecting short circuit faults under different working conditions, such as changes in speed and load and the presence of noise in the signal.
2. It detects and classifies fault severity levels with high accuracy; the results were analyzed using the classification report and the confusion matrix.
3. It senses malfunctions in the early stages with high efficiency. The perception provided by deep learning lets it extract features from the fault data, while the decision-making provided by reinforcement learning, which depends on the principle of reward and punishment, brings the system closer to human behavior.
The rest of the paper is organized as follows. The second section briefly explains the convolutional neural network (CNN) and gives the theoretical background of deep reinforcement learning (DRL). The third section describes the data acquisition system and the dataset used. The fourth section explains the proposed method and analyzes the results, and the fifth section presents the main conclusions.

METHODOLOGY
2.1 Convolutional Neural Network (CNN)
The CNN is one of the best-known types of neural networks. It contains several two-dimensional hidden layers and mainly consists of five parts: the input layer, the convolutional layer, the pooling layer, the fully connected layer, and the output layer. The CNN is mainly used with images, and its main advantage over an ordinary neural network is that it extracts features by itself through the convolutional layers. The following is an explanation of how the layers of a convolutional neural network work [22], [23], [24].
A CNN usually contains two convolutional layers and two pooling layers. The number of these layers and the filter size are chosen to suit the problem so as to achieve the highest efficiency. Adding layers increases the training time and the computation performed by the computer, and overfitting may occur, which negatively affects the performance of the algorithm in practice [25], [26]. The CNN architecture used for building the DRL agent is shown in Fig. 1.
Fig. 1 CNN architecture used for building the DRL agent
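To make the roles of the convolutional and pooling layers concrete, the following is a minimal pure-Python sketch of one convolution, activation, and pooling pass. It is illustrative only and is not the architecture of Fig. 1: the 5 x 5 input, the hand-written edge filter, and the function names are assumptions, and a trained CNN learns its filters rather than using fixed ones.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as used in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size):
    """Non-overlapping max pooling; partial windows at the border are dropped."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in cols]
            for r in rows]


# Hypothetical 5 x 5 input with a bright vertical stripe in columns 2 and 3.
image = [[0.0, 0.0, 1.0, 1.0, 0.0] for _ in range(5)]
# Hypothetical filter that responds to dark-to-bright vertical edges.
kernel = [[-1.0, 1.0],
          [-1.0, 1.0]]

features = max_pool2d(relu(conv2d(image, kernel)), size=2)
print(features)  # [[2.0, 0.0], [2.0, 0.0]]
```

The pooled map keeps a strong response only where the stripe begins, which is the sense in which the convolutional layer extracts a feature and the pooling layer condenses it.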

Reinforcement learning (RL) and Deep reinforcement learning (DRL)
Reinforcement learning algorithms are based on the principle of reward and punishment. They differ in the way they work from simple machine learning algorithms and deep learning algorithms, and are closer to human behavior. Reinforcement learning (RL) is based on four basic concepts: action, environment, reward, and state. As shown in Fig. 2, the agent takes a certain action based on the observations coming from the environment, and based on this action it receives a reward or punishment. Through trial and error, the agent learns the best policy for taking an action in each given state in order to maximize the accumulated reward. Although reinforcement learning algorithms have achieved success, they remain confined to a specific framework, as they are unable to deal with complex problems that require extracting features from raw data.

Deep learning, which has emerged in recent years, can deal with raw data directly and extract features from it. With deep learning it is possible to build an end-to-end model and generalize it, so that the programmer does not need deep experience in the problem. Combining deep learning with reinforcement learning gives what is known as deep reinforcement learning (DRL), which relies on the principle of reward and punishment provided by reinforcement learning and on the ability to extract features from raw data provided by deep learning. The result is an intelligent agent capable of dealing with complex problems (high-dimensional data) and learning better. The success of DRL in beating the world champion in the game of Go and achieving great success in 49 other games [18], [27] sparked the enthusiasm of researchers to explore its ability in other fields such as robotics [28], self-driving cars [29], smart grids [30], and finance [31]. The DQN algorithm is one of the most powerful DRL algorithms that can be used in fault detection, as the agent takes discrete actions for each environment state and can learn a successful policy based on a DNN that deals with the raw data directly.
Fig. 2 Reinforcement learning architecture

Q-learning
The Q-learning algorithm is a model-free reinforcement learning algorithm. It builds a table of the maximum expected future rewards, in which the columns represent the actions and the rows represent the states. This table is called the Q-table, and the value of each cell, $Q(a, s)$, is the greatest expected reward that the agent can obtain for a specific action and state. The agent chooses the action that has the largest value in the Q-table for the current state. When this action is executed, the agent receives a reward or punishment accordingly, and the algorithm then updates the table using the Bellman equation as follows [21]:

$$Q^{new}(a_t, s_t) = Q(a_t, s_t) + \alpha \left[ R_{t+1} + \gamma \max_{a} Q(a, s_{t+1}) - Q(a_t, s_t) \right]$$

where $Q^{new}(a_t, s_t)$ is the updated value for a given action and state, $Q(a_t, s_t)$ is the current value, $\alpha$ is the learning rate, $R_{t+1}$ is the reward for a given action $a_t$ and state $s_t$, $\gamma$ is the discount factor, and $\max_{a} Q(a, s_{t+1})$ is the greatest expected future reward. The Bellman equation updates the table after every action taken by the agent. The action taken is the one with the largest expected future reward, where the agent calculates the expected reward for each action in the table using the Q function:

$$Q(a_t, s_t) = \mathbb{E}\left[ R_{t+1} + \gamma R_{t+2} + \gamma^{2} R_{t+3} + \cdots \mid a_t, s_t \right]$$

where $Q(a_t, s_t)$ is the Q value for the given state $s_t$ and action $a_t$, $R_t$ is the reward at each time step, and $\gamma$ is the discount factor.
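The table update described above can be sketched in a few lines of Python. This is a minimal illustration under assumed values: the table size, the learning rate of 0.5, the discount factor of 0.9, and the function names are hypothetical and are not taken from the experiments. The table is indexed as `q_table[state][action]`, matching the description of rows as states and columns as actions.

```python
def greedy_action(q_table, state):
    """Return the index of the action with the largest Q value in this state."""
    row = q_table[state]
    return max(range(len(row)), key=row.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Apply one Q-learning update to the cell for (state, action)."""
    best_next = max(q_table[next_state])        # greatest expected future reward
    td_target = reward + gamma * best_next      # R_{t+1} + gamma * max_a Q(a, s_{t+1})
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical table with 3 states (rows) and 2 actions (columns).
q_table = [[0.0, 0.0],
           [0.0, 1.0],
           [0.0, 0.0]]

new_value = q_update(q_table, state=0, action=1, reward=1.0,
                     next_state=1, alpha=0.5, gamma=0.9)
# 0 + 0.5 * (1 + 0.9 * 1 - 0), i.e. 0.95 up to floating-point rounding.
print(new_value, greedy_action(q_table, 0))
```

After the update, the greedy choice in state 0 becomes action 1, because its cell now holds the largest value in that row.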

Deep Q-network (DQN)
The DQN algorithm combines reinforcement learning with deep learning and can be regarded as a more comprehensive form of reinforcement learning that is able to deal with higher-dimensional data: deep learning extracts features from complex data and finds an approximate function that links states with actions. A deep learning model such as a CNN, SAE, or RNN can be used to build the agent, and the reinforcement learning part then makes the decision based on the estimated Q values. DQN was first demonstrated using a CNN to extract features from the image frames taken from games [20]. The DQN algorithm solves a key problem of reinforcement learning with the Q-learning algorithm: for complex problems, the table needed to hold a Q value for each state in the environment becomes very large, which means a very long training time in addition to the need for high computational capacity. In DQN, the neural network instead finds an approximation function for the Q values. The state, action, and reward for each time step are stored in memory, and this information is used to train the neural network and adjust its weights continuously until the optimal values are reached. Fig. 3 shows the basic structure of deep reinforcement learning used for fault detection.
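The memory of stored transitions and the targets used to train the network can be sketched as follows. This is a minimal sketch, not the implementation used in this work: `ReplayMemory`, `td_targets`, and `stub_q_function` are hypothetical names, the capacity, batch size, and discount factor are assumed values, and the stub stands in for the neural network that would supply the Q values in practice.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)    # oldest entries are evicted first

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw a random minibatch without replacement."""
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, q_function, gamma):
    """Regression targets r + gamma * max_a Q(s', a) for each transition."""
    return [reward + gamma * max(q_function(next_state))
            for _, _, reward, next_state in batch]


def stub_q_function(state):
    # Hypothetical stand-in for the network: one Q value per action.
    return [0.0, 0.5, 1.0, 0.2]


memory = ReplayMemory(capacity=3)
for step in range(4):                           # the 4th push evicts the oldest transition
    memory.push(state=step, action=step % 4, reward=step % 2, next_state=step + 1)

batch = memory.sample(batch_size=2, rng=random.Random(0))
targets = td_targets(batch, stub_q_function, gamma=0.9)
print(len(memory), targets)
```

Sampling random minibatches from the memory, rather than training on consecutive steps, is the usual design choice in DQN because it reduces the correlation between successive training examples.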

Data Acquisition
A built-in wheel permanent magnet synchronous motor for electric vehicles was used. The motor is rated at 1500 W and 60 V and has 48 poles. It was installed on a base, and brakes were added in order to simulate changes of load on the motor, as shown in Fig. 4. Current transformers with a conversion ratio of 100 A to 50 mA were used to take the sinusoidal current signal from the motor feed lines to the microcontroller. An Arduino DUE microcontroller was used because of its high accuracy and speed in data collection.

According to the Nyquist theorem, the sampling frequency must be at least twice the highest frequency of interest, here the fundamental frequency of the motor current. The motor speed is controlled by changing the frequency and voltage, and the frequency at full-load speed reaches 200 Hz, so the sampling frequency must not be less than 400 Hz in order to get a good signal. The higher the sampling frequency, the higher the signal quality, which helps the system discriminate between fault signals and normal signals. A sampling frequency of about 4000 Hz was chosen in order to obtain high signal quality within the capability of the Arduino device used. The CoolTerm program was used to convert the data taken from the Arduino into Excel format, so that the system designed using DRL can process it and train on it later.
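The figures quoted in this section can be checked with a short calculation. The sketch below uses only the values stated in the text (200 Hz at full-load speed, a sampling frequency of 4000 Hz, and a current transformer ratio of 100 A to 50 mA); the variable and function names are illustrative, and the 25 mA reading is a made-up example value.

```python
FUNDAMENTAL_HZ = 200.0     # frequency at full-load speed, from the text
SAMPLING_HZ = 4000.0       # chosen sampling frequency, from the text
CT_PRIMARY_A = 100.0       # current transformer primary rating
CT_SECONDARY_MA = 50.0     # current transformer secondary rating

# Minimum sampling frequency: twice the highest frequency of interest.
nyquist_min_hz = 2.0 * FUNDAMENTAL_HZ

# Number of samples captured per electrical cycle at full-load speed.
samples_per_cycle = SAMPLING_HZ / FUNDAMENTAL_HZ

# Transformer ratio: primary amperes per secondary ampere.
ct_ratio = CT_PRIMARY_A * 1000.0 / CT_SECONDARY_MA


def primary_current(secondary_amps):
    """Scale a measured secondary current back to the motor line current."""
    return secondary_amps * ct_ratio


print(nyquist_min_hz)          # 400.0
print(samples_per_cycle)       # 20.0
print(ct_ratio)                # 2000.0
print(primary_current(0.025))  # hypothetical 25 mA reading on the secondary
```

At 4000 Hz the acquisition runs ten times above the 400 Hz minimum, giving 20 samples per cycle at full-load speed and more at lower speeds.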
The data has been divided into four types: normal data with no faults (H), data with a level 1 fault (F1), a level 2 fault (F2), and a level 3 fault (F3). The data was taken under different speeds and loads on the motor, as shown in Table 1. The signal was divided into segments of 400 samples each. After its dimensions are adjusted, each segment is entered into the system in the form of an image and is treated as one sample. F1, F2, and F3 describe three levels of inter-turn short-circuit fault between the stator windings of the motor, where the severity of the fault varies according to the value of the fault resistance (Rf) used, as shown in Table 1 and Fig. 4. The three-phase current signals of all these cases are shown in Fig. 6, Fig. 7, Fig. 8, and Fig. 9.
Fig. 5 The experimental setup of the entire fault diagnosis system.
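The segmentation step can be sketched as follows. The segment length of 400 samples comes from the text; the 20 x 20 image shape and the min-max scaling to the range 0 to 1 are assumptions made for illustration, since the adjusted image dimensions are not stated, and the synthetic sine wave stands in for a measured phase current.

```python
import math

SEGMENT_LENGTH = 400   # samples per segment, from the text
IMAGE_SIDE = 20        # assumption: 400 samples arranged as a 20 x 20 image


def segment_signal(samples, segment_length=SEGMENT_LENGTH):
    """Split a signal into consecutive full-length segments; the remainder is dropped."""
    n_segments = len(samples) // segment_length
    return [samples[i * segment_length:(i + 1) * segment_length]
            for i in range(n_segments)]


def to_image(segment, side=IMAGE_SIDE):
    """Scale a segment to [0, 1] (assumed preprocessing) and reshape it row by row."""
    if len(segment) != side * side:
        raise ValueError("segment length must equal side * side")
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0                     # avoid division by zero on flat signals
    scaled = [(v - lo) / span for v in segment]
    return [scaled[r * side:(r + 1) * side] for r in range(side)]


# Synthetic stand-in for one phase current: 200 Hz sine sampled at 4000 Hz.
signal = [math.sin(2.0 * math.pi * 200.0 * n / 4000.0) for n in range(1000)]

segments = segment_signal(signal)
images = [to_image(s) for s in segments]
print(len(segments), len(images[0]), len(images[0][0]))  # 2 20 20
```

At a sampling frequency of 4000 Hz, one 400-sample segment spans 0.1 s, which corresponds to 20 electrical cycles at 200 Hz.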

THE PROPOSED DRL
Detecting and diagnosing faults is a classification problem, which needs a classifier capable of linking the data of each fault with its corresponding mode. In deep reinforcement learning, a classification problem can be represented as a guessing game, because DRL was designed to deal with games and is suitable for problems that need sequential decision-making. To build the system according to the concept of reinforcement learning, we need an environment, an agent, a reward and punishment scheme, and a definition of the possible actions. The environment consists of four types of data (H, F1, F2, F3) in the form of images, where each image is built from 400 samples taken from the current sensor. The agent is an intelligent entity based on deep reinforcement learning. It takes a specific action at every step and learns from the reward it gets for that action. If the action taken is correct, meaning the agent guessed the type of fault, the agent gets a reward of +1; if the guess is wrong, the agent gets a reward of 0. The agent can take four types of actions, one for each available type of data (H, F1, F2, F3). In each time step, the tuple (state, action, reward, next state) is stored in the memory, and the stored data is then used to train and update the deep Q-network. The agent tries to learn an optimal policy that enables it to correctly guess the type of fault.
Fig. 10 The flowchart of the proposed DRL method.
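The guessing game described above can be sketched as a small environment class. This is a minimal sketch rather than the implementation used in this work: `FaultGuessingEnv` and its methods are hypothetical names, the string samples are placeholders for the current images, drawing the next sample at random is an assumption, and the oracle policy is used only to show the reward rule (in the proposed method the deep Q-network chooses the action).

```python
import random

LABELS = ["H", "F1", "F2", "F3"]   # one action per data type


class FaultGuessingEnv:
    """Classification posed as a guessing game with a 0/1 reward."""

    def __init__(self, samples, labels, seed=0):
        self.samples = samples
        self.labels = labels
        self.rng = random.Random(seed)
        self.index = None

    def reset(self):
        """Draw a random sample (assumed sampling scheme) and return it as the state."""
        self.index = self.rng.randrange(len(self.samples))
        return self.samples[self.index]

    def step(self, action):
        """Score the guess: +1 if it matches the true label, otherwise 0."""
        reward = 1 if LABELS[action] == self.labels[self.index] else 0
        next_state = self.reset()
        return next_state, reward


# Placeholder samples standing in for the current images.
env = FaultGuessingEnv(samples=["img_h", "img_f1", "img_f2", "img_f3"],
                       labels=["H", "F1", "F2", "F3"])

memory = []
state = env.reset()
for _ in range(10):
    action = LABELS.index(env.labels[env.index])   # oracle policy, for demonstration only
    next_state, reward = env.step(action)
    memory.append((state, action, reward, next_state))
    state = next_state

total_reward = sum(transition[2] for transition in memory)
print(total_reward)  # 10, since the oracle always guesses correctly
```

Replacing the oracle with a learned policy turns the accumulated reward into a direct count of correct diagnoses, which is what the agent is trained to maximize.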

RESULTS AND DISCUSSIONS
Fault detection goes through two stages: the proposed system is first trained on the available sensor data and is then tested to assess its effectiveness. The data has been divided into training data and test data, with 95% used for training and 5% for testing the system. Since fault detection is a classification problem, the confusion matrix and the classification report were used to monitor the performance of the system. The classification report includes four indicators that are used to determine the performance of the system, namely recall, precision, accuracy, and F1 score. The equations governing these indicators are:

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

where TP (true positive) is the number of positive samples that were classified as positive, FP (false positive) is the number of negative samples that were misclassified as positive, FN (false negative) is the number of positive samples that were misclassified as negative, and TN (true negative) is the number of samples that were correctly classified as negative. Support is the number of samples of each class in the dataset. For a given class, positive samples are those that belong to that class, while negative samples are those that belong to any other class [32]. The classification report that shows these four indicators is given in Fig. 11. As the classification report shows, the overall efficiency of the proposed system that uses the deep reinforcement learning algorithm is close to 100%.

To assess the performance of the system more precisely and to find the detection efficiency for each fault level, the confusion matrix was used, as shown in Fig. 12, where the vertical axis refers to the real types of faults and the horizontal axis refers to the types of faults predicted by the system. The diagonal represents the percentage of correct predictions, while the rest of the cells represent the percentage of incorrect predictions.
Fig. 12 The confusion matrix of the proposed DRL algorithm.
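The indicators of the classification report can be computed directly from a confusion matrix whose rows are the real classes and whose columns are the predicted classes. The sketch below is illustrative only: the counts are made up and are not the results of this work, and the function name is hypothetical.

```python
def classification_report(confusion, labels):
    """Per-class precision, recall, F1, and support, plus the overall accuracy."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    report = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                   # real class i, predicted as another class
        fp = sum(row[i] for row in confusion) - tp    # another class, predicted as class i
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": tp + fn}
    return report, correct / total


labels = ["H", "F1", "F2", "F3"]
# Made-up counts for illustration; rows are real classes, columns are predictions.
confusion = [
    [50, 0, 0, 0],
    [0, 45, 5, 0],
    [0, 5, 45, 0],
    [0, 0, 0, 50],
]

report, accuracy = classification_report(confusion, labels)
print(accuracy)        # 0.95
print(report["F1"])
```

In this made-up example the only confusion is between F1 and F2, so their precision and recall drop to 0.9 while H and F3 remain at 1.0.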
The proposed system, which uses the deep reinforcement learning algorithm, has also been compared with traditional machine learning systems and with deep learning systems, as shown in Table 2. The system was also tested on the data of the researcher Wondo. The confusion matrix in Fig. 13 shows that the proposed system retains high accuracy in detecting the faults and determining their severity on this data, with a total accuracy of 96%. The confusion matrix shows that the system distinguished the normal state from the inter-turn short-circuit fault levels with 100% accuracy, but its accuracy in distinguishing between the first and second fault levels (fault level 1 and fault level 2) was lower than for the rest of the fault levels. This can be improved if the system is trained for longer or the number of training samples is increased.
Fig. 13 The confusion matrix of the proposed deep reinforcement learning algorithm when tested on the data of the researcher Wondo.

CONCLUSION
Deep reinforcement learning has proven highly capable of detecting faults in electrical machines in the early stages of their occurrence and of determining the severity of the fault. The environment the agent deals with was represented in the form of a guessing game. After playing the game multiple times, the agent gains the capability to accurately infer the fault thanks to its structure, which combines automatic feature extraction with decision-making. This enhanced intelligence sets the agent apart from traditional machine learning fault detection methods, which demand extensive experience in fault analysis and the construction of tailored systems. The system designed through deep reinforcement learning can be generalized, and it can deal directly with the raw data. The accuracy of the proposed system is close to 100% in detecting inter-turn short-circuit faults for the data obtained experimentally, as the system proved its ability to distinguish between three fault levels. The system was also tested on the data of another researcher, where the overall efficiency reached 96% when distinguishing between seven different levels of inter-turn short circuit fault. The performance of the system was also compared with traditional machine learning and deep learning algorithms, and the results demonstrated the effectiveness of the proposed DRL.

Fig. 3 DRL structure used for fault detection

Fig. 6 Three-phase current signals of the motor during normal operation.

Fig. 8 Three-phase current signals of the motor in the presence of an inter-turn short circuit fault of the second level (F2).

Fig. 9 Three-phase current signals of the motor in the presence of an inter-turn short circuit fault of the third level (F3).

Fig. 11 Classification report showing the results of the proposed DRL algorithm.

Table 1: Fault data description.

Table 2: Comparison with other machine learning algorithms.

Table 3: Fault data description for the researcher Wondo.