Case Study: The Power of Combining Simulation and Reinforcement Learning

What Is a Simulation Model?

Simulation modeling is a technique for finding solutions to real-world problems. In many cases we cannot experiment on the real system, so we build a computer model of it and experiment with the model instead. This model represents a “risk-free world”. It will always be less complex than the real world, because we leave out the details that we think do not matter for the effect we want to study. Risk-free testing is not the only reason to build simulation models, however. Real-world problems are complex, and often there is no other practical way to solve them. A simulation can capture a complex system, and by experimenting with it we can observe the trajectory of system states over time. In this way we develop a better intuition for the cause-and-effect relationships and can work toward a good solution.
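As a rough illustration of experimenting in a risk-free world, the sketch below simulates a single-item warehouse and records its stock level day by day. Everything in it is an assumption made for the example: the function name simulate_inventory, the starting stock, the uniform random demand, and the instant replenishment rule are hypothetical and deliberately leave out real-world details such as lead times.

```python
import random

def simulate_inventory(days, reorder_point, order_quantity, seed=0):
    """Simulate a single-item warehouse and return the daily stock levels."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    stock = 20                  # hypothetical starting inventory
    trajectory = []
    for _ in range(days):
        demand = rng.randint(0, 5)          # assumed random daily demand
        stock = max(stock - demand, 0)      # unmet demand is simply lost
        if stock <= reorder_point:
            stock += order_quantity         # assumed instant replenishment
        trajectory.append(stock)
    return trajectory

# Compare two candidate reorder rules without touching a real warehouse.
for reorder_point in (2, 8):
    levels = simulate_inventory(days=30, reorder_point=reorder_point, order_quantity=15)
    print(reorder_point, min(levels), levels[:10])
```

Changing reorder_point or order_quantity and rerunning shows how the trajectory of system states responds, which is the kind of cause-and-effect exploration described above.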

Reinforcement Learning

Reinforcement learning is a machine learning technique in which an agent learns which actions to take in order to maximize a numerical reward signal.

It is based on framing the problem as a Markov decision process, in which an AI agent (a specialized algorithm) learns a control policy that picks the best available action for a given state of the system. The approach is most useful when the system is somewhat random and dynamic, where reward-based learning can have an advantage over traditional control methods. The successful use of deep neural networks to represent the policy (hence the name “Deep Reinforcement Learning”, or DRL) has made it possible to handle more complex scenarios that were previously considered out of reach.
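To make the idea of learning a policy from rewards concrete, here is a minimal sketch of tabular Q-learning, one common reinforcement learning algorithm, on a toy Markov decision process. The five-state line world, its reward of 1 at the goal, and the hyperparameters are all assumptions chosen for illustration. The environment is kept deterministic for brevity, and a plain table stands in for the neural network that a deep RL method would use.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GOAL = N_STATES - 1

def step(state, action):
    """Toy deterministic transition: reward 1 only on reaching the goal."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table: q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Epsilon-greedy: explore sometimes (and on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, action)
            # Q-learning update toward reward + discounted best next value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Greedy policy for each non-goal state: 0 = left, 1 = right.
policy = [0 if row[0] > row[1] else 1 for row in q_table[:GOAL]]
print(policy)
```

After training, the greedy policy should choose "right" in every non-goal state, since that is the shortest path to the only reward.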

Simulation as a Training Environment for Reinforcement Learning

Reinforcement learning (RL) requires a large number of “trial and error” episodes, or interactions with an environment, to learn a good policy. Running these on a real system is usually too slow, costly, or risky, so a simulator is needed to get results in a timely and cost-effective way. The simulation model essentially becomes the environment in which the RL agent learns.
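One way to picture the simulation becoming the environment is to wrap it in a small reset/step interface, similar in spirit to the convention used by common RL toolkits. The sketch below is hypothetical throughout: the class InventoryEnv, its demand model, and its reward formula are assumptions for illustration, and the random policy is only a placeholder for a learning agent.

```python
import random

class InventoryEnv:
    """Hypothetical simulation wrapped in a reset/step interface for an RL agent."""

    MAX_STOCK = 20

    def __init__(self, horizon=30, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)

    def reset(self):
        self.day = 0
        self.stock = 10
        return self.stock                      # observation: current stock level

    def step(self, order):
        """Advance the simulation one day; `order` is the units the agent buys."""
        self.stock = min(self.stock + order, self.MAX_STOCK)
        demand = self.rng.randint(0, 5)        # assumed random daily demand
        sold = min(demand, self.stock)
        self.stock -= sold
        # Assumed reward: revenue per sale minus holding and ordering costs.
        reward = 2.0 * sold - 0.1 * self.stock - 0.5 * order
        self.day += 1
        done = self.day >= self.horizon
        return self.stock, reward, done

def run_episode(env, policy):
    """One trial-and-error episode: the agent acts, the simulation responds."""
    state, total, done = env.reset(), 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total

agent_rng = random.Random(1)

def random_policy(state):
    """Placeholder policy; a learning agent would replace this function."""
    return agent_rng.randint(0, 5)

print(run_episode(InventoryEnv(), random_policy))
```

Because each episode is just a loop over simulated days, thousands of them can be run in the time a single real-world trial would take.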

Because an RL agent learns continuously from the rewards it receives for its actions, it can learn to respond to situations that were not anticipated in advance. Industrial systems, including supply chain management and industrial processes, are good examples of large, complex problems that are well suited to reinforcement learning.

In supply chain management, it would be extremely difficult, if not impossible, to write a program that effectively handles every combination of circumstances that can occur day to day. Trucks can break down, food can spoil, bad weather can force road closures; the list of potential situations is virtually endless. As a result, the system must be highly adaptive. This is where reinforcement learning is especially useful: trained against a simulation, it does not need large amounts of pre-existing knowledge or historical data to provide useful recommendations and solutions.

A simulation and RL-based approach is recommended in situations such as:

Extreme Robustness Requirements: An existing control system or optimization engine requires many different settings to work well under all conditions.
Multiple or Changing Optimization Goals: An existing control system struggles to optimize toward multiple goals, or the optimization goals change with environmental conditions.
Operator Overload: Operators cannot process the many variables in real time while controlling the system, or engineers need help making decisions across scenarios.
Tasks Where a Person Can Outperform Traditional Optimization Methods: Such tasks could be made autonomous using an RL-based approach.

If you think your organization could benefit from reinforcement learning powered by simulation, contact us to find out where to start.