
Machine Learning Applied to Multiwell Data


Investigator: Tita Ristanto

1.1 Overview

In reservoir engineering, it is important to understand the behavior of the reservoir, often by examining the dynamics of bottom hole pressure (pwf) and flow rate (q) in each well. Using machine learning methods, we can build a reservoir model by learning the historical pattern of bottom hole pressure and flow rate (the training set) without the physics of the flow being programmed explicitly. Using this model, we can predict flow rate given bottom hole pressure (or vice versa). We can also use this model as a diagnostic tool. For example, if liquid loading, condensate banking, or wax/asphaltene deposition starts to occur, the actual pressure and flow rate response will differ from that suggested by the model, flagging that the reservoir performance has changed in character.

This research focuses on solving multiwell problems using machine learning approaches with Python. So far, the study has identified important features in a single-phase multiwell problem. The study will explore several machine learning algorithms and compare them in terms of accuracy and performance. Python is well suited to this purpose because it offers many machine learning libraries to choose from. The study will also examine the impact of multiphase flow, reservoir heterogeneity, and data noise. As more complexities are introduced, we may need to involve temperature and/or other parameters.

1.2 Objectives

This research focuses on achieving three major objectives:

  • To understand reservoir behavior using well bottom hole pressure and rate data on complex problems: multiple wells, multiphase flow, and heterogeneous reservoirs.
  • To find effective and efficient machine learning algorithms to solve such complex problems.
  • To understand specific conditions under which an algorithm works well or not.

1.3 Approach

The first step is building the bottom hole pressure and flow rate dataset using analytical equations. This dataset will be fed into the machine learning model, part of it as training data and the rest as test data. The example dataset shown here was generated using the following assumptions:

  • Three producing wells.
  • Single-phase flow (oil).
  • Homogeneous reservoir.
  • 6000 data points (for each variable) over 60 hours.

Figure 1.1: Illustration of the example problem

The first half of the dataset is used as the training set and the second half as the test set, as shown in Figure 1.2. The machine learning algorithm learns the relationship between the flow rates and the bottom hole pressure of well A from the training set. Given the flow rates, the model can then predict the bottom hole pressure of well A.
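The split described here is chronological rather than shuffled. A minimal sketch, assuming uniformly spaced samples (the sampling interval is not stated in this summary) and using a hypothetical helper name, chronological_split:

```python
def chronological_split(samples, train_fraction=0.5):
    """Split a time-ordered sequence into an early part and a late part.

    No shuffling is done, so every training sample precedes every
    test sample in time.
    """
    split_index = int(len(samples) * train_fraction)
    return samples[:split_index], samples[split_index:]


# 6000 points over 60 hours; uniform spacing is an assumption.
n_points = 6000
duration_hours = 60.0
times = [duration_hours * (i + 1) / n_points for i in range(n_points)]

train_times, test_times = chronological_split(times)
print(len(train_times), len(test_times))
```

Keeping the time order means the test set checks how well the model extrapolates to later rate changes, rather than how well it interpolates between training points.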

Figure 1.2: Visualization of the dataset, which includes the flow rates of wells A, B, and C (input) and the bottom hole pressure of well A (output)

In this problem, the algorithm does not take the raw flow rate data directly. Instead, we transform the raw input data into features and then pass them to the algorithm. Choosing the right features is very important. Solving the multiwell problem requires an understanding of the Ei-function, an approximation to the full solution of the diffusivity equation in the case of interference, which has both logarithmic and exponential ‘flavors’. The features of Liu and Horne (2013) capture the logarithmic characteristics in the third feature, but the exponential ‘flavor’ needs to be taken into account as well. Three new features with exponential ‘flavors’ are therefore added. The resulting features are given in Equation (1.1), where x(i) is the input data and y(i) is the output data.
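The two ‘flavors’ can be seen numerically. The sketch below assumes the standard line-source form, in which the pressure drop at a distance r from a producing well is proportional to E1(x) = -Ei(-x), with x = r^2 / (4 η t) and η the hydraulic diffusivity. It evaluates E1 from its power series and compares it with the logarithmic approximation (small x, late time) and the leading exponential term (large x, early time). The function name and the sample arguments are illustrative choices, not taken from this study.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp_integral_e1(x, terms=60):
    """E1(x) for x > 0 from its power series.

    E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
    Adequate for moderate arguments (roughly x <= 10); for larger x
    the alternating series loses accuracy in floating point.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    total = -EULER_GAMMA - math.log(x)
    term = 1.0  # after k updates this holds (-x)^k / k!
    for k in range(1, terms + 1):
        term *= -x / k
        total -= term / k
    return total


for x in (0.001, 0.01, 5.0, 8.0):
    log_approx = -EULER_GAMMA - math.log(x)  # logarithmic region
    exp_leading = math.exp(-x) / x           # exponential region
    print(x, exp_integral_e1(x), log_approx, exp_leading)
```

For the small arguments the series value is close to the logarithmic approximation, while for the large arguments it is of the same order as exp(-x)/x and far from the logarithmic one, which is the behavior a purely logarithmic feature cannot represent.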

1.4 Preliminary Results

Results are presented for two cases. The first case used the four features previously used in the single-well problem, and the second case used the seven features in Equation (1.1).

Figure 1.3: Bottom hole pressure prediction using 4 features in Equation (1.3). Red: training data. Green: prediction. Black: actual data.

In the training set, the model generally matches the actual data, except immediately after the flow rate of well B or C changes. The test-set R-squared is poor, at around 0.69.
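For reference, a score of this kind could be computed as follows. This is a sketch assuming the usual coefficient of determination; the exact definition used in this study is not stated here.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# A perfect prediction scores 1; always predicting the mean scores 0.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```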

Figure 1.4: Logarithmic and exponential region.

In the second case, with seven features, the results improve in both the training and test sets (see Figure 1.5). The R-squared values for the training and test sets are 0.9972 and 0.9689, respectively. These results demonstrate that the added exponential term is a strong feature in multiwell problems, as it contains a component that captures the exponential-region characteristics in Figure 1.4.

Figure 1.5: Bottom hole pressure prediction using 7 features in Equation (1.1). Red: training data. Green: prediction. Black: actual data.

The research is still ongoing and much remains to be done, but so far, we have come up with several key messages:

  • Methodologies used in previous research are transferred to Python.
  • Multiwell pressure and flow rate can be formulated as a linear problem.
  • Adding exponential terms to the features improves the learning performance in multiwell problems.
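To illustrate the linear formulation, the sketch below builds superposition features from the rate changes of two wells and fits the pressure as a weighted sum of those features by least squares. The feature forms (a sum of rate changes, a logarithmic term, and an exponential term), the rate schedules, and the weights are all hypothetical placeholders, not the features of Equation (1.1) or data from this study.

```python
import math


def superposition_features(t, events):
    """Hypothetical features for one well at time t.

    events is a list of (t_j, dq_j) rate-change events; only events
    with t_j < t contribute. The feature forms are assumptions.
    """
    total, log_term, exp_term = 0.0, 0.0, 0.0
    for t_j, dq_j in events:
        dt = t - t_j
        if dt <= 0.0:
            continue
        total += dq_j
        log_term += dq_j * math.log(dt)
        exp_term += dq_j * math.exp(-1.0 / dt)
    return [total, log_term, exp_term]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(X, y):
    """Weights minimizing ||X w - y||^2 via the normal equations."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical rate-change schedules (time in hours, rate change).
events_a = [(0.0, 100.0), (10.0, -40.0), (25.0, 60.0)]
events_b = [(0.0, 80.0), (15.0, 50.0), (35.0, -70.0)]
times = [0.5 * (i + 1) for i in range(120)]

# Per-well feature lists are concatenated into one row per time step.
X = [superposition_features(t, events_a) + superposition_features(t, events_b)
     for t in times]

# Synthetic pressure change built from made-up weights, then refitted.
true_w = [-2.0, -0.5, -1.5, -0.3, -0.2, -0.8]
y = [sum(w * f for w, f in zip(true_w, row)) for row in X]
fitted_w = fit_least_squares(X, y)
print(fitted_w)
```

Because the pressure is linear in the weights once the features are fixed, any linear regression solver could replace the hand-written normal-equation solve used here.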


  • Liu, Y. and Horne, R.N., 2013: Interpreting Pressure and Flow Rate Data from Permanent Downhole Gauges Using Data Mining Approaches, PhD thesis, Stanford University.
  • Tian, C. and Horne, R.N., 2014: Applying Machine Learning and Data Mining Techniques to Interpret Flow Rate, Pressure and Temperature Data From Permanent Downhole Gauges, MS thesis, Stanford University.