Exploratory Data Analysis

GeoEAS provides a program called Stat1 for univariate exploratory data analysis. The main goals at this stage are to get an idea of the histogram of the data and to look for outliers that might signal an unusual site or a data processing error. It is also important to display a point plot of the data to see the spatial pattern of the sample sites and to look for unusual patterns in the point data. This can be done either with the GeoEAS program Postplot (see below) or with ArcView; other programs can provide the same information. Because variogram analysis and kriging are sensitive to extreme values, it is important to examine the histogram and to keep it in mind during the remaining steps of the analysis.

To execute the Stat1 program, open a DOS window and move to the GeoEAS sub-directory (in our case, type cd\data\geoeas). Then at the DOS prompt type: stat1
The following screen will appear:

Step 1.  Be sure the File Prefix is correct.  If not, highlight Prefix in the bottom menu bar, press enter, and type the correct path to the data file.  Be sure to include a "\" at the end of the path.

Step 2.  Highlight Data in the bottom menu bar, press enter, and type the data file name.

Step 3.  Highlight Variable in the bottom menu bar, press enter, and tab to the variable name to be analyzed.

Step 4.  Highlight Execute in the bottom menu bar and press enter. 

A summary of univariate statistics for the variable will appear:

Select Histogram from the bottom menu and press enter:

The histogram is skewed, with many values below 10% incidence and relatively few values above 50% incidence. Such histograms are common in plant disease data sets. Most geostatistical operations do not assume normally distributed data, but for skewed data sets such as this one we frequently use indicator transforms of the data with various cutoffs as part of the geostatistical analysis. This procedure is covered in more detail later, following the discussion of the basic methodology.
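As a rough illustration of what an indicator transform does, the sketch below codes each value as 1 if it is at or below a cutoff and 0 otherwise. This is a common convention, assumed here rather than taken from this tutorial. The incidence values and the two cutoffs are hypothetical and are not from the data set analyzed on this page.

```python
def indicator_transform(values, cutoff):
    """Code each value as 1 if it is at or below the cutoff, else 0."""
    return [1 if v <= cutoff else 0 for v in values]


# Hypothetical incidence values in percent (not the tutorial's data).
incidence = [2.0, 5.0, 8.0, 12.0, 30.0, 55.0, 85.0]

# Illustrative cutoffs only; suitable cutoffs depend on the data set.
cutoffs = [10.0, 50.0]

indicators = {c: indicator_transform(incidence, c) for c in cutoffs}

for c, ind in indicators.items():
    # The mean of an indicator variable is the proportion of
    # samples at or below the cutoff.
    proportion = sum(ind) / len(ind)
    print(f"cutoff {c}%: {ind}  proportion at or below = {proportion:.2f}")
```

Each cutoff yields a separate 0/1 variable, which can then be analyzed in place of the skewed original values.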

Data points with incidence over 80% were rechecked, and their locations were noted on the point plots.
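For readers working outside GeoEAS, the same kind of check can be sketched in a few lines of Python. The example below computes basic univariate statistics and lists the sample sites whose incidence exceeds 80%. The records, the (x, y, incidence) layout, and the function names are hypothetical, and the output does not reproduce the Stat1 summary screen.

```python
import statistics


def summarize(values):
    """Return basic univariate statistics, including moment skewness."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    # Third standardized moment; positive for a right-skewed histogram.
    skew = sum((v - mean) ** 3 for v in values) / (n * sd ** 3) if sd > 0 else 0.0
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "skewness": skew,
    }


def flag_high(records, threshold=80.0):
    """Return the (x, y, incidence) records with incidence above threshold."""
    return [r for r in records if r[2] > threshold]


# Hypothetical sample sites: (x, y, incidence in percent).
records = [
    (1.0, 1.0, 2.0),
    (2.0, 1.5, 5.0),
    (3.0, 2.0, 8.0),
    (4.0, 1.0, 12.0),
    (5.0, 3.0, 30.0),
    (6.0, 2.5, 85.0),
]

summary = summarize([r[2] for r in records])
flagged = flag_high(records)

print(summary)
print("Sites to recheck:", flagged)
```

A mean well above the median, together with positive skewness, is consistent with the kind of skewed histogram described above.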
 

Using Postplot to explore data point patterns.

At the DOS prompt in the GeoEAS sub-directory, type Postplot.

Set the prefix, file name, and variable name as explained for the Stat1 program (see above), then highlight Execute and press enter.

The postplot shows the overall sampling pattern as well as possible trends in the data.  There appears to be a trend in S strain incidence from east to west.  This possibility, noted first in the postplot, will be examined further as the analysis proceeds.
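One quick numerical companion to the visual impression of a trend is the least-squares slope of incidence against the easting coordinate. The sketch below is only an illustration and is not the method used later in this analysis. The coordinates and incidence values are hypothetical.

```python
def trend_slope(xs, zs):
    """Ordinary least-squares slope and intercept of zs regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return slope, intercept


# Hypothetical easting coordinates and incidence values (percent).
easting = [0.0, 1.0, 2.0, 3.0, 4.0]
incidence = [50.0, 40.0, 30.0, 20.0, 10.0]

slope, intercept = trend_slope(easting, incidence)
print(f"slope = {slope:.2f} percent per unit easting, intercept = {intercept:.2f}")
```

A slope clearly different from zero suggests that incidence changes systematically along the east-west axis, which is worth keeping in mind when the variogram is interpreted.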



 
Contact:  Tom Orum at torum@ag.arizona.edu
  Merritt Nelson at mrnelson@ag.arizona.edu
11/12/99 http://ag.arizona.edu/PLP/GIS