Manipulating Data

This section covers arithmetic operations, setting limits, creating averages, and some other analyses.

Note that when specific time periods or locations are not selected, these operations are applied to the time and spatial grids in their entirety by default. It is also important to note that when multiple data variables are to be compared, as is common in the examples in this section, the time and spatial grids of those variables must be identical. You will see this issue addressed frequently.

Basic Arithmetic Operations

Adding a number to a field

Example: Add 2.5°C to the minimum temperature data from a gridbox (31.5°E, 10°S) in Zambia.

Start at the ZMD ENACTS ALL daily temperature* dataset main page.

Select a gridbox (31.5°E, 10°S) in Zambia. CHECK
Select the minimum temperature data variable. CHECK EXPERT
When adding a number to the field, the units of that data variable are automatically used. Note the units of the minimum temperature data variable under the Other Info heading.

START

While in expert mode, enter the following line below the text already there.

2.5 add
Click "OK". CHECK

To see the results of this operation:
Select Tables link > Agree button> columnar table link. CHECK Compare with ORIGINAL DATA

Adding fields

Example: Recreate the observed monthly precipitation data by adding the climatological precipitation data and the precipitation anomaly data.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.

Because we are combining an observational data stream with a climatological data stream, we do not need to worry about the time grids matching. We must only make sure that the two streams have the same time scale (temporal resolution). In this example, the time grids of each of the data variables look like this:

Climatological precipitation
Time grid: /T (months since 01-Jan) periodic Jan to Dec by 1. N= 12 pts

Anomalies
Time grid: /T (months since 1960-01-01) ordered (Jan 1983) to (Dec 2019) by 1.0 N= 444 pts

Note that both of the data variables have the same time scale and that the time grid for the climatological data is periodic. Ingrid will automatically match that periodic data properly with the anomalies time grid.
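The periodic matching described above can be pictured with a short Python sketch. This is only an analogy, not Ingrid's implementation: the function name `add_periodic`, the sample values, and the assumption that the anomaly series starts in January are all hypothetical.

```python
def add_periodic(anomalies, climatology, first_month_index=0):
    """Add a periodic climatology (e.g., 12 monthly values) to a longer series.

    Each time step is paired with the climatological value for its calendar
    month by wrapping the index around the period.
    """
    period = len(climatology)
    return [
        a + climatology[(first_month_index + i) % period]
        for i, a in enumerate(anomalies)
    ]


# Hypothetical data: a 12-point climatology (Jan..Dec) and two years of anomalies.
climatology = [float(m) for m in range(1, 13)]
anomalies = [0.5] * 24

observed = add_periodic(anomalies, climatology)
print(observed[0], observed[12], observed[23])
```

Both Januaries (index 0 and index 12) receive the same climatological value, which is the behavior the periodic time grid provides.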

Select (and compute anomalies if necessary) the Monthly Rainfall and climatological precipitation data variables. EXPERT

START

While in expert mode, enter the following line below the text already there.

add
Click "OK". CHECK

To see the results of this operation:
Select a grid point (31.5°E, 10°S) to make the size of the data file more manageable. CHECK START
Select Tables link > Agree button > columnar table link. CHECK Compare with CLIMO DATA, ANOMALY DATA and OBSERVED DATA.

Subtracting a number from a field

Example: Subtract 2.5°C from the minimum temperature data from a gridbox (31.5°E, 10°S) in Zambia.
Refer to the example in Section Adding a Number to a Field.

Substitute the following Ingrid command for 2.5 add.

2.5 sub
Click "OK". CHECK
Note that Ingrid subtracts the second number/field listed (e.g., 2.5) from the first number/field listed (e.g., min. temperature).
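One way to picture this operand order is as a postfix (stack-style) evaluation, where the field is already on the stack and the number is pushed after it. The Python sketch below is a hypothetical illustration of that ordering only; `eval_postfix` is not an Ingrid function, and it handles just the field-then-number case.

```python
def eval_postfix(tokens, field):
    """Apply a tiny postfix expression to `field`, a list of data values.

    The operator always combines the first item listed (the field) with the
    second item listed (the number), in that order.
    """
    ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,  # first listed minus second listed
        "mul": lambda a, b: a * b,
        "div": lambda a, b: a / b,  # first listed divided by second listed
    }
    stack = [field]
    for tok in tokens:
        if tok in ops:
            second = stack.pop()  # the number, listed last
            first = stack.pop()   # the field, listed first
            stack.append([ops[tok](v, second) for v in first])
        else:
            stack.append(float(tok))
    return stack.pop()


tmin = [12.0, 9.5, 15.25]  # hypothetical minimum temperatures in Celsius
print(eval_postfix(["2.5", "sub"], tmin))
```

The same ordering applies to division: the field listed first is divided by the number listed second.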

Subtracting fields

Example: Create the monthly precipitation anomalies data by subtracting the monthly climatological precipitation data from the observed monthly precipitation data.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.

Note the time grids of the two variables to be compared.

precipitation
Time grid: /T (months since 1960-01-01) ordered (Jan 1983) to (Dec 2019) by 1.0 N= 444 pts

Climatology
Time grid: /T (months since 01-Jan) periodic Jan to Dec by 1. N= 12 pts

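As is common when comparing an observed variable with its climatology from the same dataset, the two variables share the same monthly time scale, and the periodic climatological time grid can be matched against the observed time grid. However, you should always make a point of checking this.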

Make sure that the spatial grids of these two variables match.
Now that we are sure that the grids match properly, this example is very much like that in the Adding fields section, where we added the two fields. The primary difference here is the order in which the variables are listed. As noted in the Subtracting a number from a field section, Ingrid subtracts the second field listed from the first field listed.

Select the monthly precipitation and its climatology data variables. EXPERT

START

While in expert mode, enter the following line below the text already there.

sub
Click "OK". CHECK

Multiplying a field by a number

Example: Convert the units of the monthly precipitation data from mm to cm by multiplying the field by 0.1.
Note: there is an Ingrid command that converts units themselves instead of just the data values. This is just an example and the units of the data will still appear as mm after the arithmetic operation.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the monthly precipitation data variable.
CHECK EXPERT
While in expert mode, enter the following line below the text already there.

0.1 mul
Click "OK". CHECK

Multiplying fields

This operation works just like those covered in the Adding fields and Subtracting fields sections.

Ensure that the grids of the variables match, select both of them, and use mul as the operator in expert mode. The mul command can also be used to find common entries in two different data streams.

Dividing a field by a number

Example: Convert the units of the precipitation data from mm to m by dividing the field by 1000.
Note: there is an Ingrid command that converts units themselves instead of just the data values. This is just an example and the units of the data will still appear as mm after the arithmetic operation.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the precipitation data variable.
CHECK EXPERT
While in expert mode, enter the following line below the text already there.

1000. div
Click "OK". CHECK

Note that Ingrid divides the first number/field listed (e.g., precipitation) by the second number/field listed (e.g., 1000.).

Dividing fields

This operation works just like those covered in the Adding fields and Subtracting fields sections.

Ensure that the grids of the variables match, select both of them, and use div as the operator in expert mode. Again, note that Ingrid divides the first field listed by the second field listed.

Setting Limits

It is often useful to limit data values. You may want to set minimum and maximum limits on data as a means of quality control or only use data that meets particular criteria. Ingrid makes these types of operations very easy. Below are some common examples.

Setting a minimum/maximum

Example: Create a data stream where all minimum temperature data values less than 10°C are given a value of 10°C.

Start at the ZMD ENACTS ALL daily temperature* dataset main page.
Select the minimum temperature data variable.
CHECK EXPERT

As previously described, it is a good idea to note the units of the data in question as all values in Ingrid are automatically referenced to the units of the data variable.

Note the units of temperature by looking at the information under the Other Info heading.
In this case, the units are in Celsius and we must therefore give our desired minimum temperature in Celsius.

While in expert mode, enter the following line below the text already there.

10. max
Click "OK".

To see the results of this operation:
Select a single gridbox (31.5°E, 10°S) in Zambia and short time period (e.g., 1996) to make the size of the data file more manageable. CHECK

START

Select Tables link > Agree button > columnar table link. CHECK Compare with the ORIGINAL DATA.

An analogous operation, setting a maximum value of 10°C, can be done by replacing the command 10. max with 10. min.
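The naming can feel backwards at first: setting a minimum limit uses max, and setting a maximum limit uses min. A minimal Python sketch of the element-wise effect, using hypothetical temperature values, is:

```python
# Hypothetical minimum temperatures in Celsius.
tmin = [7.2, 10.0, 13.4, 9.9]

# Like `10. max`: each value becomes the larger of itself and 10,
# so nothing ends up below 10.
floor_at_10 = [max(v, 10.0) for v in tmin]

# Like `10. min`: each value becomes the smaller of itself and 10,
# so nothing ends up above 10.
cap_at_10 = [min(v, 10.0) for v in tmin]

print(floor_at_10)
print(cap_at_10)
```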

Finding a minimum/maximum

There are two common uses of this feature. You may want to find a minimum/maximum value in a particular region or time period. Let us look at examples of these operations.

Example: Find the largest Monthly Rainfall anomalies for the entire time grid.
This example finds the largest Monthly Rainfall anomalies from the entire time grid for each grid point. The result is the largest Monthly Rainfall anomalies as a function of X (longitude) and Y (latitude). Of course, you can limit the time grid to find the largest Monthly Rainfall anomalies in a more specific time period.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the Monthly Rainfall data variable (and compute anomalies if necessary).
CHECK EXPERT

To find the largest Monthly Rainfall anomalies:
While in expert mode, enter the following line below the text already there.

[T] maxover
Click "OK". CHECK

To find the largest negative Monthly Rainfall anomalies:
While in expert mode, enter the following line below the text already there.

[T] minover
Click "OK". CHECK

To see the results of this operation:
Select the options colors with coasts in the Views tab. CHECK

Example: Find the largest Monthly Rainfall anomalies for the entire spatial grid.
This example finds the largest Monthly Rainfall anomalies from the entire spatial grid for each time step. The result is the largest Monthly Rainfall anomaly over the whole spatial grid as a function of T (time). Of course, you can limit the spatial grid to find the largest Monthly Rainfall anomalies in a specific region.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the Monthly Rainfall data variable (and compute anomalies if necessary).
CHECK EXPERT
To find the largest Monthly Rainfall anomalies:
While in expert mode, enter the following line below the text already there.

[X Y] maxover
Click "OK". CHECK

To find the largest negative Monthly Rainfall anomalies:
While in expert mode, enter the following line below the text already there.

[X Y] minover
Click "OK". CHECK

To see the results of this operation:
Select Tables tab > columnar table link. CHECK
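The difference between reducing over [T] and reducing over [X Y] is which dimensions disappear from the result. The following Python sketch shows this on a small hypothetical field indexed as `field[t][y][x]`; it is an illustration of the idea, not of how Ingrid computes it.

```python
# Hypothetical anomalies: 3 time steps on a 2 x 2 grid, indexed field[t][y][x].
field = [
    [[1.0, -2.0], [0.5, 3.0]],
    [[4.0, -5.0], [-1.0, 2.0]],
    [[0.0, 6.0], [-3.0, 1.0]],
]
nt, ny, nx = len(field), len(field[0]), len(field[0][0])

# Like `[T] maxover`: one value per grid point, T is removed.
max_over_t = [
    [max(field[t][y][x] for t in range(nt)) for x in range(nx)]
    for y in range(ny)
]

# Like `[X Y] maxover` and `[X Y] minover`: one value per time step.
max_over_xy = [
    max(field[t][y][x] for y in range(ny) for x in range(nx)) for t in range(nt)
]
min_over_xy = [
    min(field[t][y][x] for y in range(ny) for x in range(nx)) for t in range(nt)
]

print(max_over_t)
print(max_over_xy, min_over_xy)
```

The first result is a map (suited to a colors view), while the other two are time series (suited to a table or line plot).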

Creating a numerical mask

Masks make data values that meet a particular threshold equal to NaN.

Example: Mask out the maximum temperature values greater than 30.°C.

Start at the ZMD ENACTS ALL daily temperature* dataset main page.
Select the maximum temperature data variable.
CHECK
Note that the temperature unit is Celsius.
This is good because our mask threshold is also in Celsius. If the units had not agreed, then we would have had to convert the mask threshold to the units of the data variable.

While in expert mode, enter the following line below the text already there.

30. maskgt
Click "OK". CHECK

An analogous operation, masking out the maximum temperature values less than 30.°C, can be done by replacing maskgt with masklt.

To see the results of this operation:
Select a single gridbox (31.5°E, 10°S) in Zambia and a short time period (1994) to make the size of the data file more manageable. CHECK

START

Select Tables tab > Agree button > columnar table link. CHECK Compare with the ORIGINAL DATA. Note that the data value from January 1st is missing. (Tables exclude NaN values.)
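As a rough picture of what the mask does to the values, the Python sketch below replaces every value above the threshold with NaN and then drops the NaN entries, much as the table view does. The temperatures are hypothetical, and a value exactly equal to the threshold is kept here on the assumption that "greater than" is strict.

```python
import math

# Hypothetical maximum temperatures in Celsius.
tmax = [28.5, 31.2, 30.0, 33.0]

# Like `30. maskgt`: values greater than 30 become NaN, the rest are unchanged.
masked = [float("nan") if v > 30.0 else v for v in tmax]

# A table that excludes NaN values would list only these entries.
kept = [v for v in masked if not math.isnan(v)]

print(masked)
print(kept)
```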

Flagging Data

Flags create a binary version of any variable based on a particular threshold. Those data that meet the threshold are given a value of 1 and those that do not receive a value of 0.

Example: Flag minimum temperature values greater than 10 ˚C.

Start at the ZMD ENACTS ALL daily temperature* dataset main page.
Select the minimum temperature data variable.
CHECK
Note that the minimum temperature unit is ˚C.
Our flag threshold is also in ˚C, so no conversion is needed. If the units had not agreed, we would have had to convert the threshold to the units of the data variable.

While in expert mode, enter the following line below the text already there.

10. flaggt
Click "OK". CHECK

To see the results of this operation:
Select a single gridbox (31.5°E, 10°S) in Zambia and a short time period (1996) to make the size of the data file more manageable. CHECK

START

Select Tables tab > Agree button > columnar table link. CHECK Compare with the ORIGINAL DATA.

An analogous operation, flagging minimum temperature less than 10 ˚C, can be done by replacing flaggt with flaglt.
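A minimal Python sketch of flagging, using hypothetical temperatures, is shown below. Because the flags are 1s and 0s, summing them counts how many values meet the threshold, which is one common reason to create a flag.

```python
# Hypothetical minimum temperatures in Celsius.
tmin = [8.0, 10.0, 12.5, 11.0]

# Like `10. flaggt`: 1 where the value is greater than 10, otherwise 0.
flag_gt = [1.0 if v > 10.0 else 0.0 for v in tmin]

# Like `10. flaglt`: 1 where the value is less than 10, otherwise 0.
flag_lt = [1.0 if v < 10.0 else 0.0 for v in tmin]

# Summing a flag counts the entries that met the threshold.
days_above_10 = sum(flag_gt)

print(flag_gt, flag_lt, days_above_10)
```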

Creating Averages

Spatial averages

When creating a spatial average of station data, one typically wants to take into account the location of each station (e.g., weighted average). That operation is beyond the scope of this tutorial. However, creating a spatial average of gridded data is much more straightforward and an example is given here.

Example: Find the spatial average of monthly precipitation data in a region in Zambia for Jan-Dec 1998.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the monthly precipitation data variable.
CHECK
Select the 1998 time period and the lat/lon defined region. CHECK EXPERT

START

Enter expert mode and enter the following line below the text already there.

[X Y] average
Click "OK". CHECK

To see the results of this operation:
Select one of the options in the views tab.

This procedure can be easily applied to other types of spatial averaging. For example, if you wanted to create a zonal average, then you would use the following line of Ingrid instead.

[X] average
This creates a zonal average as a function of T (time) and Y (latitude).
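The two kinds of spatial average differ only in which grids are averaged away. The Python sketch below uses a small hypothetical field indexed as `field[t][y][x]` and a plain unweighted mean; that choice, and the helper name `mean`, are assumptions made for illustration.

```python
def mean(values):
    """Plain unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical precipitation: 2 time steps on a 2 x 2 grid, field[t][y][x].
field = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]

# Like `[X Y] average`: one value per time step.
area_mean = [mean(v for row in grid for v in row) for grid in field]

# Like `[X] average`: one value per time step and latitude (zonal average).
zonal_mean = [[mean(row) for row in grid] for grid in field]

print(area_mean)
print(zonal_mean)
```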

Seasonal/chunk averages

Example: Create seasonal averages (DJF, MAM, JJA, SON) of monthly precipitation data from 1990-1999.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the monthly precipitation data variable.
CHECK
Select the Dec 1989-Nov 1999 time period. CHECK EXPERT

START

While in expert mode, enter the following line below the text already there.

T 3 boxAverage
Click "OK". CHECK

To see an animation of the seasonal averages you just created:
Select one of the options of the views tab.
Enter "Jan 1990 - Oct 1999" in the time text box at the top of the data viewer.
Click "Redraw".
CHECK

Note that if you had wanted JFM, AMJ, JAS, OND seasonal averages, then the selected time period would have been Jan 1990 to Dec 1999. Another important point here is that the step over which the average is created is always in the units of the time grid of the data variable in question. For example, had the data been at a daily time scale, the above Ingrid command would have created a 3-day average instead of a 3-month average. Therefore, it is an excellent idea to get in the habit of making sure the units of the time grid and the step agree with each other. The technique used in this example is particularly useful when 12 is evenly divisible by the step over which you want to average. The next example addresses the case when this is not true.
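To make the chunking concrete, here is a minimal Python sketch of a box average over consecutive, non-overlapping groups of time steps. The function name `box_average` and the sample data are hypothetical, and dropping an incomplete final chunk is a simplifying assumption of this sketch rather than a statement about Ingrid.

```python
def box_average(series, step):
    """Average consecutive, non-overlapping chunks of `step` values.

    The step is counted in time steps of the series, so the same step of 3
    means 3 months for monthly data and 3 days for daily data.
    """
    n_boxes = len(series) // step  # an incomplete final chunk is ignored here
    return [
        sum(series[i * step:(i + 1) * step]) / step for i in range(n_boxes)
    ]


# Hypothetical monthly values for one year, starting at the first month selected.
monthly = [float(v) for v in range(1, 13)]

seasonal = box_average(monthly, 3)
print(seasonal)
```

Which three months land in each chunk depends entirely on the first month of the selected time period, which is why the example starts the selection in December to obtain DJF, MAM, JJA, and SON.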

Example: Create May-Sept averages of Monthly Rainfall anomalies data for the time period 1985-1994.
This example creates an average over 5 months. Twelve is not evenly divisible by this step (i.e., 5 months), so we must use a different technique than the one above.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the Monthly Rainfall data variable (and compute anomalies if necessary).
CHECK
Select the Jan 1985- Dec 1994 time period. CHECK EXPERT

START

While in expert mode, enter the following line below the text already there.

T 12 splitstreamgrid
Click "OK". CHECK

This Ingrid command splits the time grid with a period of 12. That is, in this example, it creates a dataset of Jan data, a dataset of Feb data, etc. This is an important step, but we are not quite finished.

Select May-Sept grids and average over them with the following Ingrid commands.

T (May) (Jun) (Jul) (Aug) (Sep) VALUES

[T] average
Click "OK". CHECK
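The split-select-average sequence above can be sketched in Python as follows. This is an analogy under stated assumptions: the series starts in January and covers whole years, and the names `split_by_period`, `MAY`, and `SEP` are hypothetical. The result is one May-Sept mean per year.

```python
def split_by_period(series, period):
    """Split a series into consecutive blocks of `period` values (one per year)."""
    n_years = len(series) // period
    return [series[y * period:(y + 1) * period] for y in range(n_years)]


MAY, SEP = 4, 8  # zero-based month positions within a January-first year

# Hypothetical two years of monthly data: month number plus a yearly offset.
series = [float(i % 12 + 1) + (i // 12) for i in range(24)]

# Like `T 12 splitstreamgrid`: regroup the data by month within each year.
years = split_by_period(series, 12)

# Like selecting May..Sep and then averaging over those months.
may_sep_means = [sum(year[MAY:SEP + 1]) / 5 for year in years]

print(may_sep_means)
```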

There is also a convenient option if you want to create averages/climatologies of single months.

Example: Create a monthly climatology of precipitation data for the time period 1982-2001.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the monthly precipitation data variable.
CHECK
Select the Jan 1982-Dec 2001 time period. CHECK EXPERT

START

Select the Filters tab.
Select the monthly climatology link.

CHECK EXPERT

Note that this command can only be applied to monthly data.

Running averages

This operation offers a fast and easy way to smooth data temporally. Let us look at an example.

Example: Create a 15-day running average of minimum temperature data.

Start at the ZMD ENACTS ALL daily temperature* dataset main page.
Select the minimum temperature data variable.
CHECK

At this point, it is a good habit to check the temporal unit to make sure it agrees with how you want to define your average step. In this example, we want to create a 15-day running mean. Therefore, the unit over which we want to average is a day.

Make sure that the temporal unit of the minimum temperature data is the same as the unit over which you want to average.
While in expert mode, enter the following line below the text already there.

T 15 runningAverage
Click "OK". CHECK

Note that this operation will truncate the data to fit the step. In this example, we have a step of 15 days and are using the full time grid. Therefore, after the running mean is created, the resulting time grid is shorter than the original, because the time steps at the ends that lack a full 15-day window are dropped.
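The truncation is easy to see in a small Python sketch of a running mean. The function name `running_average` and the sample data are hypothetical; the sketch only keeps positions where a full window is available, so a series of N values yields N - window + 1 averages.

```python
def running_average(series, window):
    """Mean of every full run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical 30 days of daily values.
daily = [float(d) for d in range(1, 31)]

smoothed = running_average(daily, 15)
print(len(daily), len(smoothed))
print(smoothed[0], smoothed[-1])
```

Unlike a box average, consecutive windows overlap, so the output keeps the daily time scale while smoothing out short-lived variations.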

To see the results of this operation:
Select a gridbox (31.5°E, 10°S) in Zambia.

START

Select one of the views links in the function bar. Compare with those from the line, bar, and scatter plots of the original data.

Statistical and Other Mathematical Operations

Anomalies

Earth science data is commonly viewed in terms of anomalies (i.e., difference between observations and climatology) rather than as raw values. Anomalies can be produced with Ingrid by first calculating a climatology and then calculating the difference between it and the observed data. However, Ingrid also has a single command that does all of these calculations. Let us look at an example.

Example: Recreate the Monthly Rainfall anomalies data for the time period 1982-2001.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the monthly precipitation data variable.
CHECK
Select the Jan 1982-Dec 2001 time period. CHECK EXPERT

START

While in expert mode, enter the following line below the text already there.

yearly-anomalies
Click "OK". CHECK

You have just created the Monthly Rainfall anomalies for the time period 1982-2001 based on a 1982-2001 climatology. While convenient, this operation is a bit limited in that it can only be applied to monthly data. And like the yearly-climatology command, you can find this option via the "Filters" tab.
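The two-step calculation that this single command replaces (a monthly climatology followed by a subtraction) can be sketched in Python as below. The function name `monthly_anomalies` and the data are hypothetical, and the sketch assumes a monthly series that starts in January and covers whole years.

```python
def monthly_anomalies(series):
    """Return (anomalies, climatology) for a monthly series starting in January.

    The climatology is the mean of each calendar month over all years, and each
    anomaly is the observed value minus the climatology for its month.
    """
    n_years = len(series) // 12
    climatology = [
        sum(series[y * 12 + m] for y in range(n_years)) / n_years
        for m in range(12)
    ]
    anomalies = [v - climatology[i % 12] for i, v in enumerate(series)]
    return anomalies, climatology


# Hypothetical two years of monthly rainfall: a dry year, then a wet year.
series = [10.0] * 12 + [14.0] * 12

anoms, clim = monthly_anomalies(series)
print(clim[0], anoms[0], anoms[12])
```

Because the climatology is computed from the selected period itself, changing the selected time period changes the anomalies.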

Correlation

This is an excellent example that combines many of the techniques covered to this point.

Example: Correlate precipitation amount observations in a gridbox (31.5°E, 10°S) in Zambia with SST anomalies in the region defined by 130°-90°W, 5°-15°S for the time period 1997-1998.
Note: in order to correlate two sets of data, they must have the exact same temporal unit.

Let us use ZMD ENACTS rainfall ALL monthly, which has monthly data.

Select the ZMD ENACTS rainfall ALL monthly dataset by either searching for it or through the SOURCES option. CHECK
Select the precipitation variable. CHECK
Select the gridbox (31.5°E, 10°S) in Zambia. CHECK EXPERT
Select the Jan 1997-Dec 1998 time period. CHECK EXPERT

At this point, when the first dataset selections have been made, it is typically easiest to make the second dataset selections in expert mode.

While in expert mode, enter the following lines below the text already there. All of these commands should look familiar to you from previous examples.

SOURCES .NOAA .NCDC .ERSST .version3b .anom
T (Jan 1997) (Dec 1998) RANGE
X (130W) (90W) RANGE
Y (15S) (5S) RANGE
Click "OK". CHECK

START

You now have two data fields with identical time grids. Let us correlate these fields.

While in expert mode, enter the following lines below the text already there.

[T] correlate
Click "OK". CHECK

To view the correlation data you just produced:
Select one of the options in the views tab.
OR
Select Tables tab > Agree button > columnar table link. CHECK
You can correlate over the spatial grids as well by replacing [T] with [X], [Y], [X Y], etc.
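As a sketch of what correlating over T means here, the Python example below computes a Pearson correlation between one rainfall time series and the time series at each of several SST grid points, giving one coefficient per grid point. The data, the point names, and the helper `pearson` are hypothetical, and the use of the Pearson coefficient is an assumption made for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical rainfall at one gridbox, on the same time grid as the SST data.
rain = [1.0, 2.0, 3.0, 4.0]

# Hypothetical SST anomaly time series at two grid points.
sst = {
    "point_a": [2.0, 4.0, 6.0, 8.0],
    "point_b": [4.0, 3.0, 2.0, 1.0],
}

# Correlating over time leaves one coefficient per SST grid point.
corr_map = {name: pearson(rain, series) for name, series in sst.items()}
print(corr_map)
```

The pairing with `zip` is why the two time grids must be identical: each rainfall value has to line up with the SST value for the same month.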

Trigonometric functions

Basic trig functions are typically used with the spatial grids. The results of this function can then be used as part of a broader technique, such as spatial weighting.

Example: Find the cosine of a latitudinal grid of ZMD ENACTS rainfall ALL monthly data.

Start at the ZMD ENACTS rainfall ALL monthly* dataset main page.
Select the ZMD ENACTS rainfall ALL monthly data variable.
CHECK
While in expert mode, enter the following line after the text already there.

Y cosd
Click "OK". CHECK
To find the sine of data, replace cosd with sind.

To view the data you just produced:
Select one of the options in the views tab.
OR
Select Tables tab > columnar table link. CHECK
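To illustrate how the cosine of latitude can feed into spatial weighting, here is a minimal Python sketch. The helper `cosd` mirrors the degree-based cosine used above; the latitudes, the data values, and the weighted-mean step are hypothetical and are shown only as one possible use.

```python
import math


def cosd(degrees):
    """Cosine of an angle given in degrees."""
    return math.cos(math.radians(degrees))


# Hypothetical latitudes (degrees north, so southern latitudes are negative)
# and one data value per latitude.
lats = [-18.0, -14.0, -10.0]
values = [50.0, 70.0, 90.0]

# Weights proportional to the cosine of latitude.
weights = [cosd(lat) for lat in lats]

# A latitude-weighted mean gives slightly more influence to values
# nearer the equator.
weighted_mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
plain_mean = sum(values) / len(values)

print(weights)
print(weighted_mean, plain_mean)
```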