# Large Volume Data Processing for Permanent Downhole Gauges

**Investigator:** Yang Liu

## Problem Description

The Permanent Downhole Gauge (PDG) is a relatively new tool for well testing in the petroleum industry. In traditional well testing, pressure and flow rate transient data are collected over a short period, which leads to large uncertainty. Because it acquires data continuously over a long time, a PDG may provide measurements spanning several years or longer. At the same time, this brings a new problem: a large volume of noise arrives together with the large volume of measurements.

In Figure 2, at Point B, the pressure and flow rate increase or decrease together in the same direction, which can be recognized as noise because it violates Darcy's law. At Point A, the pressure and flow rate change in opposite directions, so the changes can be recognized as a real event, consistent with Darcy's law. When the PDG provides a dataset covering a whole year at a sampling rate of one measurement per second, it is not possible to process the data manually. This raises a meaningful scientific topic: denoising, i.e. using the relationship between the two signals to filter the data.

Use of long-time data from permanent downhole gauges will necessarily be based on deconvolution. In SUPRI-D, both deconvolution and data filtering are under investigation. My current research topic is the denoising direction. To form a systematic approach, four questions are being addressed:

- How do we define a "low noise" data set? (Definition)
- How to validate that a data set is a good data set with low noise? (Validation)
- How to achieve such a low-noise data set? (Manipulation)
- What is the physical process behind the noise, and thus how do we honor the reservoir and tool physics in the denoising process? (Physical Essence)

In Questions 1, 2, and 3, I am actually defining a good data set rather than a bad one. I avoid defining noise because I do not know what the real noise is; what I am trying to define is what a good data set looks like. A clear answer to the first three questions is sufficient for industry practice. Nevertheless, Question 4 will help solve the problem at a fundamental level.

## What is a low-noise data set?

The so-called denoising process assumes that a mathematical model is correct but that some of the physical data are wrong. Thus a data set with less noise should honor the known well testing models, including both the diffusion equation and Darcy's law. In such a well-posed data set, the flow rate curve should change in the opposite direction from the pressure curve: when the flow rate increases, the pressure should decrease, and vice versa. Expressed mathematically, Δp·Δq < 0. Figure 3 shows a good example.
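This sign rule is straightforward to check numerically. The sketch below is a minimal illustration; the function name and the sample values are hypothetical, not taken from the original study:

```python
import numpy as np

def violates_darcy_sign_rule(p, q):
    """Flag samples where the pressure and flow rate increments share a
    sign, i.e. dp * dq > 0 (candidate noise points under the rule)."""
    dp = np.diff(p)
    dq = np.diff(q)
    return dp * dq > 0

# Hypothetical example: flow rate steps up while pressure draws down,
# so every increment pair satisfies dp * dq <= 0 (no violations).
q = np.array([100.0, 100.0, 200.0, 200.0])     # flow rate
p = np.array([3000.0, 3000.0, 2900.0, 2850.0]) # pressure
print(violates_darcy_sign_rule(p, q))  # → [False False False]
```

Points flagged `True` by such a check are only *candidates* for noise, for the reasons discussed below.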

Still, we cannot conclude that data which do not honor Δp·Δq < 0 are necessarily noise. Even when we generate an ideal pressure data set from a given real flow rate data set, violations of the rule still occur occasionally. Since it is not possible to generate both an ideal pressure data set and an ideal flow rate data set, it is meaningless to require a data set to obey the rule absolutely, without any exceptions. Therefore, what we emphasize is that a reduced-noise data set will obey this rule better than a noisy data set. To validate this "better practice", we need help from a mathematical transformation, which is discussed in the next section.

## How to validate that a data set is a good data set with low noise?

The Haar wavelet transformation defines the pairwise differences and averages of a given data set. Repeating the same processing on the average subset generates a new difference subset and a new average subset. Thus, if we perform this transformation repeatedly until only one difference and one average remain, we obtain a full-level decomposition of the original data set.
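The repeated averages-and-differences procedure above can be sketched as follows (assuming the common unnormalized form of the Haar transform and a signal length that is a power of two; the function name is my own):

```python
import numpy as np

def haar_full_decomposition(x):
    """Repeatedly split a length-2^k signal into pairwise averages and
    differences; return the differences at every level plus the final
    overall average."""
    x = np.asarray(x, dtype=float)
    diffs = []
    while len(x) > 1:
        avg = (x[0::2] + x[1::2]) / 2.0   # pairwise averages
        diff = (x[0::2] - x[1::2]) / 2.0  # pairwise differences
        diffs.append(diff)
        x = avg                            # recurse on the averages
    return diffs, x[0]

diffs, avg = haar_full_decomposition([4.0, 2.0, 5.0, 7.0])
# level-1 differences: [1.0, -1.0]; level-2 difference: [-1.5]; average: 4.5
```

The original signal can be reconstructed exactly from `diffs` and `avg`, so the decomposition loses no information.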

When we apply this full-level Haar wavelet transformation to both the pressure and flow rate data sets, we obtain the differences of pressure and flow rate at every level, Δp and Δq. According to the definition of a low-noise data set, if we plot Δp vs. Δq across all levels, the points should lie in Quadrants #2 and #4, as in Figure 4:

A noisy data set with more bad points will have more values in Quadrants #1 and #3 rather than Quadrants #2 and #4, as in Figure 5.
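One way to reduce such a quadrant plot to a single score is to count the fraction of nonzero (Δq, Δp) pairs that land in Quadrants #1 and #3. This scoring function is my own illustrative addition, not part of the original workflow:

```python
import numpy as np

def quadrant_13_fraction(dp, dq):
    """Fraction of nonzero (dq, dp) points in Quadrants #1 and #3,
    i.e. where dp * dq > 0; larger values suggest a noisier data set."""
    prod = np.asarray(dp, dtype=float) * np.asarray(dq, dtype=float)
    nonzero = prod != 0
    if not nonzero.any():
        return 0.0
    return float((prod > 0).sum()) / nonzero.sum()

# One of three nonzero pairs has dp and dq with the same sign:
print(quadrant_13_fraction([1.0, -1.0, 2.0], [-1.0, 1.0, 3.0]))  # → 1/3
```

Comparing this fraction before and after filtering gives a quick numerical counterpart to comparing Figures 4 and 5 by eye.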

When we take a closer look at the plot at different levels, we obtain the plots shown in Figure 6:

## How to achieve such a low-noise data set?

Given a set of data, we can reduce the noise by applying the Fast Fourier Transform (FFT).

When we first use the FFT to decompose both the pressure and flow rate signals, we obtain the amplitude distribution shown in Figure 8:

The figure shows that the signal has larger amplitudes at both ends of the spectrum, which represent the lower frequencies. If we set the amplitudes of the middle (higher) frequencies to zero and then reconstruct the signal, we obtain the new signal shown in Figure 9. The red curve is the reconstruction after the FFT, and the blue curve is the original data. The FFT smooths the curve very efficiently while keeping the trend, although smoothing the curve is not our target.
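A minimal sketch of this zero-the-middle-frequencies reconstruction, using NumPy's real FFT (in which the low frequencies at "both ends" of the full spectrum collapse into the lowest bins, so zeroing the upper bins removes the middle, high-frequency band). The cutoff `keep` is a hypothetical tuning parameter:

```python
import numpy as np

def fft_lowpass(signal, keep):
    """Zero every real-FFT coefficient above the `keep`-th frequency
    bin, then reconstruct the signal (a crude low-pass filter)."""
    spec = np.fft.rfft(signal)
    spec[keep:] = 0.0                      # kill the high-frequency band
    return np.fft.irfft(spec, n=len(signal))

# Hypothetical demo: a slow sine corrupted by a fast sinusoidal "noise";
# keeping only the lowest bins recovers the slow component.
t = np.arange(64)
clean = np.sin(2 * np.pi * t / 64)
noisy = clean + 0.3 * np.sin(2 * np.pi * 20 * t / 64)
smoothed = fft_lowpass(noisy, keep=4)
```

In practice `keep` must be chosen by inspecting the amplitude distribution (as in Figure 8); too small a value distorts real transients, too large a value leaves the noise in place.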

Next, we can use the Haar wavelet transformation to validate the denoised data set. When we apply the full-level Haar wavelet transformation and replot Δp vs. Δq, we obtain Figure 10, which shows a marked improvement over the original data set.

## Current and future tasks

As described here, the FFT method is very efficient. However, it treats denoising purely as a signal-processing problem. Without honoring the physics behind the signal, we lose a lot of useful information. Thus the current and next step of this research is to bring the essential physics into the denoising process, an attempt to answer Question #4 above.