Modeling Multiple Sclerosis at Different Levels Using Reinforcement Learning

Background: Multiple sclerosis (MS) is one of the most common disorders of the central nervous system; it leads to the dysfunction of different body systems and creates a myriad of problems for affected individuals. Given its progressive nature, the disease can be divided into several levels. The progression rate of the disease at each stage is essential for specialists, as it helps them adopt appropriate therapeutic measures. Methods: One of the methods used in many MS neurological treatments is the Expanded Disability Status Scale (EDSS), which allows physicians to estimate the severity of the disease, determine the stage of the patient's disease, and prescribe appropriate medicines accordingly. Given the importance and impact of this disease on the quality of life of patients, researchers look for inexpensive and simple models with minimum side effects for examining different levels of MS and providing treatment solutions. Results: In this study, patients were asked to stand on a force plate. The time series of the center of pressure and body oscillations of patients at various levels were then recorded using a motion analyzer device, and a closed loop control system was proposed using the reverse (inverted) pendulum, representing the human body, and reinforcement learning. Conclusion: Based on the feedback received from the environment, the rules necessary for maintaining the balance of the pendulum were obtained, and, by observing the ankle torque at the output, a model was presented that could examine different levels of MS.


Introduction
In multiple sclerosis (MS), the central nervous system is attacked by the immune system, which leads to the destruction of myelin. The disease usually appears at a young age and is more often seen in females. It has been shown that the disease affects 2 to 120 of every 100 000 people. MS was first described by Jean-Martin Charcot in 1868. The central nervous system consists of the brain and spinal cord, and the disease affects its white matter. White matter consists of the axons of neuronal cells, surrounded by a lipid layer called myelin. This lipid layer wraps around the nerve fiber, 1 accelerating the propagation of action potentials along it. When the myelin layer is destroyed, axons lose their ability to transmit action potentials properly. 1 MS is often associated with plaques or lesions in the white matter of the brain. Although sufficient data on the functional mechanism of this disease are available, its cause remains unknown. After the patient's referral to the clinic, MRI images of the central nervous system, cerebrospinal fluid (CSF) sampling, and patient history can help diagnose the disease. Since it is a progressive disease divided into various levels, the extent of disease progression at each stage is essential for specialists, as it enables them to adopt an appropriate therapeutic procedure. Schizophrenia and MS [2][3][4][5][6] both impose substantial costs on society.
A method employed in many specialized MS neurology clinics is the Expanded Disability Status Scale (EDSS), which is used as the standard in many articles and reports. 7,8 In this scale, by examining the functional systems and a person's abilities, a score from 0 to 10 is assigned, which indicates the level of disability. This numerical scale has several shortcomings, such as dependence on the specialist's opinion; moreover, when obscure symptoms not detected so far are observed, the patient receives the highest disability score.
Krishnan et al 9 revealed that examining posture control disorders associated with voluntary movements in MS patients could help physicians diagnose the disease and improve balance in MS patients. Bonne et al 10 proposed a dynamic model of the standing position using a closed loop control system, applicable to a range of complex human behaviors and to the balance control of humanoid robots. Patton 11 suggested a pendulum model to predict balance, using the stipulation that the center of pressure is placed on the sole in the standing position. In the present paper, using a reverse pendulum (representing the human body) and a reinforcement learning algorithm, a model in the form of a closed loop control system is presented that can examine different levels of the disease and contribute to the treatment process through analysis of MS patients.
journals.sbmu.ac.ir/Neuroscience

Data Recording
Measurement Instrument
In this study, the center of pressure and the oscillations of the body were recorded with a force plate, using a motion analyzer device called AMTI Accu Gait. The data were then accessed using the Vicon-Nexus software. The data on all participants were recorded by this device and six cameras in a dark room. Subjects stood in the vertical position. Experiments were undertaken at the laboratory of Azad University, Science and Research Branch.

Experimental Protocol
During the measurement session, subjects were examined with Romberg's test 12 in two separate tests under the following conditions.
Open eyes (OE): The person stands upright with legs side by side, eyes open, and hands next to the body.
Closed eyes (CE): The person stands upright with legs side by side and eyes closed.

Participants
Eight MS patients at three levels of the disease (good, moderate, weak), aged 23-45 years, weighing 50-73 kg, and 1.55-1.76 m tall, were included in the study. All patients read the experiment procedure and signed the written consent form. They were asked to stand on a force plate twice, first with open eyes and then with closed eyes, and, using a motion analyzer device, the time series of the center of pressure and body oscillations of patients at the different levels were recorded with 10-min intervals.
Table 1 reports the specification data of the patients in the open-eye and closed-eye conditions.

Closed Loop Model of Patients and Force Plate
In the process of identifying levels, patients were asked to stand on a force plate so that the time series of the center of pressure and body oscillations could be recorded using the motion analyzer device. In normal standing, the body has slight oscillations in the anterior-posterior direction, which allow it to maintain its balance against disturbances; hence, the human body acts as a reverse pendulum. For this reason, the reverse pendulum control system, 13 as shown in Figure 1, was adopted to model the body in the vertical standing position. In this system, the patient is modeled as a balance controller who seeks to maintain balance by applying an appropriate control signal. Given the progression of the disease and the classification of MS into different levels, patients at each level display varying degrees of ability to maintain balance. The reverse pendulum model takes the body oscillations as its input and produces, as its output, the ankle torque needed to maintain balance. In the model, two points, the toe and the heel, are assumed as support points. The continuous dynamical function of the reverse pendulum movement on the force plate is given in equation 1.
Here, s is the Laplace variable, M is the person's weight, g is the gravitational acceleration, h is the person's height, and J is the moment of inertia of the ankle joint movement.
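As a hedged illustration of the dynamics described above, the following Python sketch assumes the common linearized single-link form J·θ'' = M·g·h·θ + τ, that is, θ(s)/τ(s) = 1/(J·s² − M·g·h), with M taken as body mass (kg) and h as the height of the center of mass above the ankle (m). This form is an assumption chosen to match the listed symbols and is not a reproduction of equation 1. The numerical values, the function name simulate_pendulum, and the proportional-derivative gains are all hypothetical.

```python
def simulate_pendulum(theta0, torque_fn, M=60.0, h=0.9, J=60.0, g=9.81,
                      dt=0.001, t_end=2.0):
    """Integrate J*theta'' = M*g*h*theta + tau with semi-implicit Euler.

    theta is the sway angle from vertical (rad); tau is the ankle torque (N*m).
    All default parameter values are illustrative assumptions.
    Returns the list of angles, one per time step (including the initial one).
    """
    theta, omega = theta0, 0.0
    angles = [theta]
    for _ in range(int(round(t_end / dt))):
        tau = torque_fn(theta, omega)
        alpha = (M * g * h * theta + tau) / J   # angular acceleration
        omega += alpha * dt                     # update velocity first
        theta += omega * dt                     # then position
        angles.append(theta)
    return angles

# Open loop: no ankle torque, so a small initial lean keeps growing.
open_loop = simulate_pendulum(0.01, lambda th, om: 0.0)

# Closed loop: hypothetical PD ankle torque; Kp must exceed M*g*h to stabilize.
Kp, Kd = 900.0, 150.0
closed_loop = simulate_pendulum(0.01, lambda th, om: -Kp * th - Kd * om)

print(f"open-loop final angle:   {open_loop[-1]:.4f} rad")
print(f"closed-loop final angle: {closed_loop[-1]:.6f} rad")
```

Under these assumed values, the uncontrolled pendulum diverges while the controlled one returns toward vertical, which is the qualitative distinction the model uses between unstable and stabilized standing.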

Stability Criterion
The body is balanced when it is either in the stationary mode (static stability) or in the state of continuous and uniform motion (dynamic stability). 15 In the static state, it can be said that if the center of body pressure is placed in the area between the supports of both feet, the person remains stable. According to the criterion for the vertical force of supports, if the vertical force drops to zero at the toe or the heel, the person loses balance. That is, to maintain stability against a disturbance, a person must change the position of the body members in such a way that the ratio of toe and heel forces remains unchanged. The objective function for the human model according to the criterion for the vertical force of supports is defined in equation 3.
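The vertical-force criterion can be illustrated with a short Python sketch. It assumes two point supports (heel and toe) and uses the static moment balance to split the total vertical load according to the position of the center of pressure; the person is treated as stable while both supports carry a positive force. This is a minimal sketch under those assumptions, not the objective function of equation 3, and the function names, foot length, and body mass are hypothetical.

```python
def support_forces(cop_x, total_force, heel_x=0.0, toe_x=0.25):
    """Split the vertical load between heel and toe from the COP position.

    Static moment balance for two point supports:
        F_toe  = F * (x_cop - x_heel) / (x_toe - x_heel)
        F_heel = F - F_toe
    Positions are in meters, forces in newtons.
    """
    span = toe_x - heel_x
    f_toe = total_force * (cop_x - heel_x) / span
    f_heel = total_force - f_toe
    return f_heel, f_toe

def is_statically_stable(cop_x, total_force, heel_x=0.0, toe_x=0.25):
    """Stable while both supports still carry a positive vertical force."""
    f_heel, f_toe = support_forces(cop_x, total_force, heel_x, toe_x)
    return f_heel > 0.0 and f_toe > 0.0

weight = 60.0 * 9.81  # hypothetical 60 kg subject
for cop in (0.10, 0.25, 0.30):
    f_heel, f_toe = support_forces(cop, weight)
    print(f"COP={cop:.2f} m  heel={f_heel:7.1f} N  toe={f_toe:7.1f} N  "
          f"stable={is_statically_stable(cop, weight)}")
```

When the center of pressure reaches the toe (0.25 m here), the heel force drops to zero and the check reports a loss of balance, matching the criterion stated in the text.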

Proposed Model with the Reverse Pendulum and Reinforcement Learning
Using a reverse pendulum model and reinforcement learning, 16 a model was developed, as shown in Figure 2, that can examine different levels of MS. The time series of the center of pressure and body oscillations of patients were recorded at various levels by the force plate device. Given that MS acts as a disturbance factor, the body's ability to maintain stability varies at each level of the disease; hence, the body movements and oscillations differ at each level. Therefore, once the level of the disease is determined, the reverse pendulum generates a different moment for each level at its output. The stability of the model is ensured when the center of pressure lies between the toe and heel points. After the reverse pendulum output is evaluated with the target function, it is delivered to the reinforcement learning block. Using the SARSA algorithm, the reinforcement learning, based on different patterns, ensures that the value of the target function is zero at each level and that the pendulum is fixed. In other words, it maintains the stability of the model.

SARSA Algorithm
In reinforcement learning, the agent, which is a model of a patient in this study, is allowed to interact with the environment and gain experience, favoring actions that lead to the desired outcome and limiting actions that induce undesirable consequences, in order to obtain an optimal strategy and policy. The aim is to find a mapping from the space of states to the space of possible actions in each state, with the mapping providing the best course of action in each state. Learning takes place when the agent behaves differently in light of newly acquired experience, and this different behavior usually translates into different outcomes and improved performance. This kind of learning is highly dependent on feedback from the environment or other factors. In this study, the SARSA algorithm, which was developed in 1994, was used. As the name of the algorithm implies (State, Action, Reward, State, Action), an agent selects action a_t in state s_t and, on receiving the reward r, proceeds to the new state s_(t+1) and opts for action a_(t+1), as shown in Figure 3. In this project, the hypothesized model in the simulation, based on the environmental feedback or reverse pendulum movement, obtains the rules required for keeping the pendulum balanced through experience.
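The SARSA loop described above can be sketched in Python as follows. The sketch uses the standard tabular update Q(s_t, a_t) ← Q(s_t, a_t) + α[r + γ·Q(s_(t+1), a_(t+1)) − Q(s_t, a_t)] on an assumed linearized pendulum, J·θ'' = M·g·h·θ + τ. The state discretization, the three torque levels, the reward (zero only when the pendulum is upright), and every parameter value are hypothetical choices made for illustration; this is not the authors' implementation.

```python
import random

ACTIONS = (-120.0, 0.0, 120.0)         # hypothetical ankle torques (N*m)
M, H, J, G, DT = 60.0, 0.9, 60.0, 9.81, 0.01
THETA_MAX = 0.2                         # episode fails beyond this lean (rad)
N_BINS = 9

def to_bin(value, limit):
    """Map a value in [-limit, limit] to an integer bin 0..N_BINS-1."""
    clipped = max(-limit, min(limit, value))
    return min(N_BINS - 1, int((clipped + limit) / (2 * limit) * N_BINS))

def discretize(theta, omega):
    return to_bin(theta, THETA_MAX), to_bin(omega, 1.0)

def step(theta, omega, torque):
    """One semi-implicit Euler step of J*theta'' = M*G*H*theta + torque."""
    omega += (M * G * H * theta + torque) / J * DT
    theta += omega * DT
    failed = abs(theta) > THETA_MAX
    reward = -1.0 if failed else -abs(theta)   # zero only when upright
    return theta, omega, reward, failed

def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = Q[state]
    return values.index(max(values))

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95, done=False):
    """Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])

def train(episodes=200, max_steps=300, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(i, j): [0.0] * len(ACTIONS)
         for i in range(N_BINS) for j in range(N_BINS)}
    lengths = []
    for _ in range(episodes):
        theta, omega = rng.uniform(-0.05, 0.05), 0.0
        s = discretize(theta, omega)
        a = epsilon_greedy(Q, s, epsilon, rng)
        for t in range(max_steps):
            theta, omega, r, failed = step(theta, omega, ACTIONS[a])
            s_next = discretize(theta, omega)
            a_next = epsilon_greedy(Q, s_next, epsilon, rng)  # on-policy choice
            sarsa_update(Q, s, a, r, s_next, a_next, done=failed)
            s, a = s_next, a_next
            if failed:
                break
        lengths.append(t + 1)
    return Q, lengths

Q, lengths = train()
print("mean steps, first 20 episodes:", sum(lengths[:20]) / 20)
print("mean steps, last 20 episodes: ", sum(lengths[-20:]) / 20)
```

Because the next action a_(t+1) is chosen by the same epsilon-greedy policy that is being learned and is then actually executed, the update is on-policy, which is what distinguishes SARSA from Q-learning.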

Results
The following figures show the curves of the torque generated by the reverse pendulum versus time at the three patient levels, before and after the reinforcement learning. Prior to the application of reinforcement learning, the results at the good level (Figure 4) indicated that these individuals performed well in balancing, because the system was able to maintain its stability after several minor oscillations. At the moderate level (Figure 5), the motor system was able to maintain balance for a while, after which it became unstable; that is, the curve oscillated initially but then moved towards infinity. At the weak level (Figure 6), the motor system moved towards infinity from the start and remained unstable, indicating that the patient had trouble maintaining balance due to the progression of the disease. After the reinforcement learning was applied, the model managed to control the pendulum movement and maintain its stability. As shown in the figures, the behavior of the stabilized system was in the range of 0 to 1.

Discussion
Since one of the common problems of MS patients is maintaining balance during voluntary movements and routine chores, valuable information about changes in posture control can be obtained by examining the center of pressure and the oscillations of the body, which can assist the treatment and improve balance in patients.
Chagdes et al 17 presented a dynamic model for assessing the posture of patients with neuromuscular disease, which is crucial for improving balance and for the early diagnosis of musculoskeletal disorders. Boes et al, 18 using the inverse pendulum as the representative of the body muscles and a PID controller, provided a mathematical model of the posture control system for assessing oscillations in central [...]
Given the large body of literature on tests derived for MS, many studies are still in progress. The proposed suggestions can be divided into two categories. In the first category, these tests can be used for the early diagnosis of patients suspected of having MS and for starting the treatment procedure to prevent further damage to the central nervous system. The second category of suggestions is concerned with the tests and the improvement of outcomes. Accordingly, a few suggestions are made. First, the number of participants should be increased to obtain larger training samples at different levels. Second, a more complex model of the reverse pendulum should be used that takes into account other moments, such as knee joint moments, to simulate real-life conditions. Third, if a pressure-sensitive carpet with a camera system is available, a walking test can be designed to examine dynamic balance.

Figure 1. Model of Patient Standing on a Force Plate. 14

Figure 3. The Procedure of the SARSA Algorithm According to State, Action, Reward, State, and Action. 10

Table 1. Data Specification of Patients in Open-Eye and Closed-Eye Modes
Abbreviations: OE, open eyes; CE, closed eyes; EDSS, Expanded Disability Status Scale.

Figure 4. Generated Torques by Reverse Pendulum in MS Patients in Good Level. A. Prior to the application of reinforcement learning. B. After applying the reinforcement learning.
Figure 5. Generated Torques by Reverse Pendulum in MS Patients in Moderate Level. A. Prior to the application of reinforcement learning. B. After applying the reinforcement learning.
Figure 6. Generated Torques by Reverse Pendulum in MS Patients in Poor Level. A. Prior to the application of reinforcement learning. B. After applying the reinforcement learning.
Their results were helpful for controlling posture and understanding the role of spasm in MS patients, as well as in rehabilitation to regain control and normal posture. Corradini et al, 14 using a reverse pendulum and an ARMA controller, provided a control model for the diagnosis of posture disorders, which could be utilized to maintain balance and to differentiate MS from other similar diseases. Using pitchfork and Hopf bifurcation analysis, the reverse pendulum model (representing body muscles), and an amplifier in the feedback loop of the human control model, Chagdes et al 19 developed a mathematical model for assessing the regions of stability and instability in human posture. They suggested that learning about the limit cycles of posture oscillations could help the diagnosis and treatment of musculoskeletal disorders.