
Machine Learning in a Full-Physics Analysis


Investigator: Abdullah Alakeely

1. Background

When it comes to managing and forecasting the performance of subsurface fields, numerical reservoir simulation models are considered the standard method. The power of simulation models resides in their dynamic representation of spatio-temporal reservoir properties and behavior, which allows well performance to be predicted at any location within the modeled space and at any time during the life cycle of a field.

The downside of these full-physics simulation models is their inherent cost: they are expensive in both computation and time, and they do not always include details of the uncertainty associated with the geological models they are based on. The models typically involve gridding the modeling space and discretizing the physical laws governing fluid movement within that space. Additionally, the evolution of fluid and rock properties is recalculated using various models at every time step.
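The gridding and time-stepping described above can be illustrated on a toy problem. The following is a minimal sketch, assuming single-phase pressure diffusion in one dimension solved with an explicit finite-difference scheme; it is not the black-oil formulation a full-physics simulator solves, and the function name and all parameter values are hypothetical.

```python
def simulate_pressure_1d(n_cells=20, n_steps=200, alpha=0.2,
                         p_init=3000.0, p_well=1000.0):
    """Explicit finite-difference solution of dp/dt = eta * d2p/dx2.

    alpha stands for eta * dt / dx**2 and must not exceed 0.5, otherwise
    the explicit scheme is unstable. Cell 0 is held at the well pressure
    and the last cell is a no-flow boundary. All values are illustrative.
    """
    if not 0.0 < alpha <= 0.5:
        raise ValueError("alpha must lie in (0, 0.5] for stability")
    pressure = [p_init] * n_cells
    pressure[0] = p_well
    for _ in range(n_steps):                 # loop over time steps
        updated = pressure[:]
        for i in range(1, n_cells):          # loop over grid cells
            left = pressure[i - 1]
            # No-flow boundary: mirror the last cell as its own neighbor.
            right = pressure[i + 1] if i + 1 < n_cells else pressure[i]
            updated[i] = pressure[i] + alpha * (left - 2.0 * pressure[i] + right)
        pressure = updated
    return pressure


profile = simulate_pressure_1d()
print([round(p) for p in profile])
```

Every time step visits every grid cell, so the work grows with the product of the number of cells and the number of steps. A real simulator additionally solves coupled nonlinear multiphase equations at each step, which is where most of the cost discussed here comes from.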

The ability to represent all or part of the reservoir behavior accurately using quick, cost-effective alternative methods is of interest, especially in domains where decisions are time-sensitive and resource-constrained, which limits the use of numerical simulation models that are either physically complex or geologically uncertain.

Machine learning has been used to improve the efficiency of numerical simulation models. Gaganis et al. (2012) applied machine learning to speed up compositional reservoir simulation models.

By representing the phase stability problem as a data classification task and using neural networks to model phase split behavior, they improved the speed of the computations.

Shale plays have also been a target of active machine learning applications. For example, Dahaghi et al. (2009) coupled data mining and reservoir engineering tools to model and represent well behavior in the New Albany Shale. With the recent fast-paced developments in machine learning techniques and their widespread adoption for solving a wide range of problems, reservoir and well modeling could potentially benefit from these methods.

2. Research Objective

The preliminary objective of this research is to investigate the possibility to represent and/or replace the numerical reservoir simulation model using a proxy based on machine learning. The approach is envisioned to take advantage of the recent progress in machine learning and data mining approaches to help complement or replace parts of the functionality that numerical reservoir simulation models provide. Additionally, the modeling path will treat the reservoir as a system where production and pressure data, for example, provide us with signals carrying information about the reservoir in question, and are used as an indicator for the physical properties and behavior of the reservoir.

The research aims to investigate a method that uses these machine-learning-based alternative models to predict the behavior and performance of existing wells and to guide the choice of the best locations for future wells.

The first step of this research has been investigating the possibility to treat production and pressure data as signals representing the reservoir at the location of the signal (Kalantri-Dahaghi et al. 2010), and using a machine learning model to replicate the behavior under different operational scenarios that the model has not seen before. The point of this exercise has been to build understanding of what may be the most effective machine learning methods that can replicate the reservoir dynamics and produce accurate representation.

Once a machine learning model is determined to replicate the reservoir behavior accurately at a local level, the effort will focus on how this representation can be reproduced at different locations where no signal, and hence no data, has been collected or made available.

The investigation of the ability of machine learning tools to represent the reservoir spatially through time while honoring the reservoir physics is the main objective of this work.

3. Preliminary Results

3.1 Capturing Reservoir Dynamics Locally

The starting point of this work has been to represent reservoir behavior locally at a specific location using some form of machine learning method. As shown in Figure 1, the observed data obtained from the reservoir are used to develop a predictive model. Once a good match between the observed and predicted responses has been achieved, the model is used to make future predictions.

To achieve this, a synthetic two-dimensional black-oil numerical reservoir model has been developed to be used as the source of “observed” reservoir performance information as shown in Figure 2.

The model is used to generate oil production rate and pressure information from two producing wells. In addition, an injector well was placed to maintain the pressure, but it is not used in the analysis at this stage.

Figure 3 shows an example of production and pressure histories generated for the two producers in the model. Note that the well control is switched from rate control to pressure control at around 1500 days and back to rate control at around 2000 days.

Because the data collected represent a sequence of information through time, the first candidate for machine learning modeling has been a Recurrent Neural Network (RNN).

Figure 4 shows a representation of the features and the RNN used for this modeling task. RNNs are suitable because their hidden state lets them carry dependencies between input and output through the whole sequence.
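The role of the hidden state can be made concrete with a forward pass of a simple (Elman-style) RNN. This is a minimal sketch, assuming a single recurrent layer with a tanh activation and a linear output for pressure; the source does not specify the architecture in text, and the function name, the weights, and the feature layout are all hypothetical.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, b_h, w_hy, b_y):
    """Run a single-layer RNN over a sequence of feature vectors.

    inputs: list of T feature vectors (for example, [rate] per time step)
    w_xh:   hidden-by-input weight matrix (list of rows)
    w_hh:   hidden-by-hidden weight matrix (list of rows)
    b_h:    hidden bias vector
    w_hy:   output weight vector, b_y: output bias (scalar)
    Returns the T scalar outputs and the final hidden state.
    """
    n_hidden = len(b_h)
    hidden = [0.0] * n_hidden            # hidden state starts at zero
    outputs = []
    for x in inputs:
        # The new hidden state mixes the current input with the previous
        # hidden state, which is how earlier time steps influence later ones.
        hidden = [
            math.tanh(
                sum(w_xh[j][k] * x[k] for k in range(len(x)))
                + sum(w_hh[j][k] * hidden[k] for k in range(n_hidden))
                + b_h[j]
            )
            for j in range(n_hidden)
        ]
        outputs.append(sum(w_hy[j] * hidden[j] for j in range(n_hidden)) + b_y)
    return outputs, hidden


# Hypothetical, untrained weights: one input feature, two hidden units.
w_xh = [[0.5], [-0.3]]
w_hh = [[0.1, 0.2], [0.0, 0.4]]
b_h = [0.0, 0.0]
w_hy = [1.0, 1.0]
b_y = 0.0

pulse, _ = rnn_forward([[1.0], [0.0], [0.0]], w_xh, w_hh, b_h, w_hy, b_y)
print(pulse)  # the first input still affects the later outputs
```

In this sketch an input applied only at the first time step still changes the outputs at later steps, because it persists in the hidden state. Training the weights (for example, by backpropagation through time) is not shown.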

Initial results show that the RNN is able to capture and reproduce the relationship between the rate and pressure locally in the reservoir, as demonstrated by the qualitative comparison of pressure predictions in Figure 5. Note that the last 300 days of history were not seen by the neural network during training and were reproduced with good accuracy.
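Holding out the end of the history, as described above, amounts to a chronological split rather than a random one. The following is a minimal sketch, assuming daily samples indexed by day number; the helper names `split_by_time` and `rmse` are hypothetical, and the source reports only a qualitative comparison, so the error metric is an illustrative addition.

```python
import math


def split_by_time(days, values, holdout_days=300):
    """Split a time-ordered series so the final `holdout_days` are held out."""
    cutoff = days[-1] - holdout_days
    train = [(d, v) for d, v in zip(days, values) if d <= cutoff]
    test = [(d, v) for d, v in zip(days, values) if d > cutoff]
    return train, test


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Illustrative series only; real data would come from the simulator output.
days = list(range(1, 2501))
pressure = [3000.0 - 0.2 * d for d in days]
train, test = split_by_time(days, pressure)
print(len(train), len(test))
```

Keeping the split chronological ensures the model is evaluated only on days that come after everything it was trained on.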

Figure 1: Model development strategy.

Figure 2: Synthetic Black Oil Model Representation showing location of wells.

Figure 3: Production and Pressure data used for modeling.

 Figure 4: Modeling Strategy and RNN Representation (picture from

Figure 5: Prediction Results for Pressure Data.

3.2 Sensitivity to New Operational Parameters

The second task was to focus on how well the trained RNN model can generalize and capture the dynamics between production rate and pressure data. To achieve this, multiple synthetic scenarios were produced using the numerical reservoir simulator to represent different production rates and pressure responses. In addition, the timing for the onset of pressure control period was changed for different cases. This sensitivity study revealed mixed results. In some cases, the RNN was able to reproduce the target, pressure data, with acceptable accuracy as shown in Figure 6, while in others, the results followed the trend, but not as accurately. An example is shown in Figure 7.

The exercise demonstrates the capability of the RNN to capture the relationship, but there is room for improvement. Currently, the study is investigating better data preprocessing, more features, and more advanced algorithms. For example, both a better division of data between training and validation and regularization of the RNN weights have been shown to improve the generalization capability of the RNN model (Cao et al. 2016), resulting in closer predictions of unseen scenarios.
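The two measures mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the source does not say which form of regularization or which data division was used, so an L2 weight penalty and a split by whole simulation scenario are assumed here, and the names `split_scenarios` and `regularized_loss` are hypothetical.

```python
import random


def split_scenarios(scenario_ids, validation_fraction=0.2, seed=0):
    """Assign whole scenarios to training or validation.

    Splitting by scenario (an assumption, not the source's stated method)
    keeps every time step of a validation scenario unseen during training.
    """
    ids = list(scenario_ids)
    random.Random(seed).shuffle(ids)
    n_val = max(1, round(len(ids) * validation_fraction))
    return ids[n_val:], ids[:n_val]


def regularized_loss(observed, predicted, weights, l2_strength=1e-3):
    """Mean squared error plus an L2 penalty on a flat list of weights."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    penalty = l2_strength * sum(w * w for w in weights)
    return mse + penalty


train_ids, val_ids = split_scenarios(range(10))
print(sorted(train_ids), sorted(val_ids))
print(regularized_loss([1.0, 2.0], [1.1, 1.9], [0.5, -0.3]))
```

The penalty discourages large weights, which tends to reduce overfitting to the training scenarios; the strength `l2_strength` would need to be tuned on the validation scenarios.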

Figure 6: Shows a Good Pressure Prediction on Unseen Production Profile where Data in the black box is used for training, and Rate Data in the Red Box is used as input for Prediction. The Actual Pressure (observed) and Predicted Pressure are Shown Separately for Producer1 and Producer2 on the right.

Figure 7: Shows a Decent Pressure Prediction on Unseen Production Profile where Data in the black box is used for training, and Rate Data in the Red Box is used as input for Prediction. The Actual Pressure (observed) and Predicted Pressure (from the RNN) are Shown Separately for Producer1 and Producer2 on the right.

4. The Use of Long Short-Term Memory Architecture in Recurrent Neural Networks

One of the limitations of a standard RNN is the vanishing gradient problem, in which the sensitivity of the hidden state to early information in the sequence is reduced (Graves 2012). A theoretically better approach is to deploy a Long Short-Term Memory (LSTM) cell in the RNN. The LSTM cell can remember dependencies between the input and output sequences over long time spans, as shown in Figure 8, without the issues encountered in a standard RNN, which makes it an attractive candidate for experimentation and possible improvement in prediction. This approach is currently under investigation.
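The mechanism by which an LSTM cell retains information can be sketched with a single time step of the cell. This is a minimal sketch, assuming the common LSTM formulation with input, forget, and output gates and no peephole connections; the source does not state which variant is under investigation, and the function name, the parameter layout, and the weights below are hypothetical.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def _affine(w_x, w_h, b, x, h):
    """Return W_x x + W_h h + b as a list, one entry per hidden unit."""
    return [
        sum(wx * xi for wx, xi in zip(w_x[j], x))
        + sum(wh * hi for wh, hi in zip(w_h[j], h))
        + b[j]
        for j in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, params):
    """Advance an LSTM cell by one time step.

    params maps each of "i" (input gate), "f" (forget gate), "o" (output
    gate), and "g" (candidate values) to a tuple (w_x, w_h, b).
    Returns the new hidden state and the new cell state.
    """
    pre = {k: _affine(*params[k], x, h_prev) for k in ("i", "f", "o", "g")}
    i = [_sigmoid(z) for z in pre["i"]]
    f = [_sigmoid(z) for z in pre["f"]]
    o = [_sigmoid(z) for z in pre["o"]]
    g = [math.tanh(z) for z in pre["g"]]
    # The cell state is updated additively: keep a gated share of the old
    # state and add a gated share of the new candidate values.
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


# Hypothetical parameters: forget gate nearly open, input gate nearly closed.
params = {
    "i": ([[0.0]], [[0.0]], [-20.0]),
    "f": ([[0.0]], [[0.0]], [20.0]),
    "o": ([[0.0]], [[0.0]], [0.0]),
    "g": ([[0.0]], [[0.0]], [0.0]),
}
h, c = [0.0], [1.0]
for _ in range(100):
    h, c = lstm_step([5.0], h, c, params)
print(c)  # the cell state stays close to its initial value
```

With the forget gate close to one and the input gate close to zero, the cell state is carried almost unchanged across many steps, which is the property that makes the LSTM less prone to losing early information than a standard RNN.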

5. Future Work

The results so far indicate that standard RNN is promising in forecasting well performance, but could be improved. The plan for future work can be summarized as:

  1. Investigate the robustness and capability of LSTM RNN in modeling.
  2. Build more complex and realistic reservoir models with a richer set of features (water cut, fractures, multiphase flow properties, etc.).
  3. Investigate fuzzy logic usefulness in spatial representation.
  4. Investigate the possibility of using other machine learning models (SVM, Kernel, etc.).
  5. Incorporate more wells into the model.
  6. Investigate the ability to infer performance in undrilled locations.
  7. Understand the relationship of these machine learning tools and physics of the reservoir.


Cao, Q., Banerjee, R., and Gupta, J.: “Data Driven Production Forecasting Using Machine Learning”. SPE Argentina Exploration and Production of Unconventional Resources Symposium, Buenos Aires, Argentina. SPE 180984, (2016).

Dahaghi, A., and Mohaghegh, S.: “Top-Down Intelligent Reservoir Modeling of New Albany Shale”. SPE Eastern Regional Meeting, Charleston, West Virginia. SPE 125859, (2009).

Dahaghi, A., Mohaghegh, S., and Khazaeni, Y.: “New Insight into Integrated Reservoir Management using Top-Down, Intelligent Reservoir Modeling Techniques; Application to a Giant and Complex Oil Field in the Middle East”. SPE Western Regional Meeting, Anaheim, California. SPE 132621, (2010).

Gaganis, V., and Varotsis, N.: “Machine Learning Methods to Speed up Compositional Reservoir Simulation”. EAGE Annual Conference & Exhibition, Copenhagen, Denmark. SPE 154505, (2012).

Graves, A.: Supervised Sequence Labelling with Recurrent Neural Networks. Textbook, Studies in Computational Intelligence, Springer, (2012).

Mohaghegh, S., Reeves, S., and Hill, D.: “Development of Intelligent System Approach for Restimulation Candidate”. SPE/CERI Gas Technology Symposium, Calgary, Alberta, Canada. SPE 59767, (2000).