Sunday, September 24, 2017

Assignment 1
Goals of Assignment 1
Differentiate Between Levels of Measurement
Differentiate Between Classification Methods
Retrieving Data from the US Census and Joining Data
Enhance Cartographic Knowledge

Part I

Nominal Data: Nominal data is data that is categorized (into one of two or more categories) by membership label and has no inherent value associated with it; the unit assignment is categorical only. In Figure 1, tree climate zones are portrayed; the climate zones have no quantity associated with them and are simply regional labels.



Figure 1: Nominal Data
 (printable-maps.blogspot.com)

Ordinal Data: Ordinal data is data that is placed in a rank order, such as responses on a Likert scale (for example, a scale of 1-10 running from very dissatisfied to highly satisfied) or the Mohs Scale of Hardness, which orders minerals by scratch resistance from 1 (softest) to 10 (hardest). The ordering of the data is what matters here; the sizes of the differences between values are not meaningful or even knowable. This idea is exemplified in Figure 2. The differences in happiness levels between states are not known, but the rank order of the data is.



Figure 2: Ordinal Data
(http://www.huffingtonpost.com/2013/08/02/happiest-states-_n_3696160.html)

Interval Data: Interval data is data that has a known order and known differences between data points but no true zero origin, which makes it impossible to calculate meaningful ratios. We know exact differences between the data points; for example, 71 minus 64 is 7 degrees. Zero on the Fahrenheit scale does not mean an absence of temperature; it is simply a reference point. Figure 3 is a map of temperatures across the United States, which is a common way to portray interval data.



Figure 3: Interval Data
(https://weather.com/maps/ustemperaturemap)

Ratio Data: Ratio data is data that has an order, measurable differences between data points, and a true zero origin, which allows ratios between values to be interpreted as true quantities. Figure 4 represents the temperature across the Earth's surface in kelvins. The Kelvin scale has a true zero point, absolute zero, at which molecular motion is at its theoretical minimum.



Figure 4: Ratio Data
(climate.nasa.gov)
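
The difference between interval and ratio data can be checked with a little arithmetic. The following is a minimal sketch, not part of the assignment itself; the function name and the 80 and 40 degree readings are made up for illustration. It shows that differences are meaningful in Fahrenheit, while a ratio only becomes meaningful after converting to the Kelvin scale.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert a Fahrenheit reading to kelvins."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


# Differences are meaningful on an interval scale.
difference_f = 71 - 64  # 7 degrees Fahrenheit

# Ratios are not: 80 F is not "twice as warm" as 40 F (illustrative readings).
naive_ratio = 80 / 40
true_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(f"Difference: {difference_f} degrees")
print(f"Naive Fahrenheit ratio: {naive_ratio:.2f}")
print(f"Ratio on the Kelvin scale: {true_ratio:.2f}")
```

The naive ratio suggests a doubling, while the ratio computed on the Kelvin scale is only slightly above one, which is why ratios should be reserved for scales with a true zero.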

Part II

Methods: An Excel sheet with all Wisconsin county Geo_IDs was provided by Dr. Ryan Weichelt, and the number of certified organic farms in each county was then entered using information located at https://www.agcensus.usda.gov/Publications/2012/Full_Report/Volume_1,_Chapter_2_County_Level/Wisconsin/st55_2_042_042.pdf. This document contains the relevant data (number of certified organic farms) gathered by the 2012 Census of Agriculture. A Wisconsin counties shapefile was then downloaded from the United States Census Bureau at http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml after making the following selections from the home page: advanced search, data selection (2010 SF1 100% Data), geographies, county as geographic type, Wisconsin as state, then shapefile download in the map mode. The shapefile was then loaded into ArcGIS and added to a blank map. The Excel table was then joined to the shapefile based on the Geo_ID codes. Three maps were made presenting identical data with three different data classification methods (explained in detail below); each classification method used four classes.
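
The join itself was done in ArcGIS, but the idea behind it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Geo_ID strings, the field names GEO_ID, NAME and ORGANIC_FARMS, and the function join_counts are hypothetical, and only the three farm counts (5, 49 and 233) come from the discussion in this post.

```python
# Spreadsheet rows: Geo_ID -> number of certified organic farms.
# The Geo_ID strings are illustrative placeholders.
farm_counts = {"55031": 5, "55019": 49, "55123": 233}

# County features, as they might appear in the shapefile's attribute table.
counties = [
    {"GEO_ID": "55019", "NAME": "Clark"},
    {"GEO_ID": "55031", "NAME": "Douglas"},
    {"GEO_ID": "55123", "NAME": "Vernon"},
    {"GEO_ID": "55001", "NAME": "Adams"},  # deliberately has no spreadsheet row
]


def join_counts(features, counts, key="GEO_ID", field="ORGANIC_FARMS"):
    """Left join: keep every feature and attach a count where the key matches."""
    joined = []
    for feature in features:
        row = dict(feature)  # copy so the original feature is not modified
        row[field] = counts.get(feature[key])  # None marks an unmatched county
        joined.append(row)
    return joined


joined = join_counts(counties, farm_counts)
unmatched = [row["NAME"] for row in joined if row["ORGANIC_FARMS"] is None]
print(joined)
print("Counties without a match:", unmatched)
```

Listing the unmatched counties after a join is a quick way to catch Geo_ID codes that were mistyped or stored in different formats in the two tables.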


Equal Interval Classification Method: Figure 1 represents the data displayed on a map using the equal interval classification method. The equal interval classification method divides the range of observed values (maximum observation minus minimum observation) into a predetermined number of classes of equal width. The problem in this case is that 70 counties fall into the first class (0.0-58.3), with only one each in the second (58.4-116.5) and fourth (174.9-233) classes and none in the third (116.6-174.8) class. The equal interval method is therefore not an effective way to portray this data; you cannot propose concentrating on any of the 70 counties with any confidence when the first class holds values as disparate as those in the following example. Douglas County has 5 organic farms while Clark County has 49. Both counties appear in the first class, which is where efforts should be concentrated, but Douglas County seems to be the better choice because it has far fewer organic farms. That distinction is impossible to see with the data mapped in this manner.


Figure 1: Mapped data using the equal interval classification method.
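
The equal interval breaks can be reproduced by hand. Below is a minimal sketch, assuming a minimum of 0 and a maximum of 233 as implied by the class ranges above; the function names and the four-value sample are hypothetical, and ArcGIS may round the displayed class limits differently.

```python
from bisect import bisect_left


def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class when the range is split evenly."""
    low, high = min(values), max(values)
    width = (high - low) / n_classes
    return [low + width * i for i in range(1, n_classes + 1)]


def classify(value, breaks):
    """Return the 0-based class index; each break is an inclusive upper bound."""
    return min(bisect_left(breaks, value), len(breaks) - 1)


# Small sample: 5, 49 and 233 are counts mentioned in the text; 0 is assumed.
sample = [0, 5, 49, 233]
breaks = equal_interval_breaks(sample, 4)
classes = [classify(v, breaks) for v in sample]
print(breaks)
print(classes)
```

Counts of 5 and 49 land in the same class because the single value of 233 stretches every class to a width of more than 58 farms.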


Natural Breaks Classification Method: Figure 2 represents the data displayed on a map using the Natural Breaks classification method. The Natural Breaks method seeks to minimize variance within classes and maximize variance between classes and assigns the data accordingly; i.e., it groups the values that are closest to one another into classes (four in this case) based on breaks (gaps) in the data. This can lead to classes that span widely varying ranges of values. This classification resulted in a smaller range of values in the first three classes but still does not present the data in a manner that will allow for a good business decision to be made. Maximum effort should be made in those counties that have the fewest certified organic farms, but a majority of the counties in Wisconsin still fall into the first class, and the Natural Breaks method isn't sensitive enough to portray differences at the lower end of the data range. There is an interesting spatial relationship beginning to become apparent here; more on this in the next method description.



Figure 2: Mapped data using the Natural Breaks classification method.
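
The idea of minimizing within-class variance can be sketched with a small dynamic program. This is an illustrative exact search over a one-dimensional list, not necessarily the routine ArcGIS runs internally; the function name natural_breaks and the sample values are hypothetical (only 233 comes from the data discussed here).

```python
def natural_breaks(values, n_classes):
    """Return upper class bounds minimizing total within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in constant time.
    prefix = [0.0]
    prefix_sq = [0.0]
    for x in data:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best total for the first j values split into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    start = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = cost[k - 1][i] + ssd(i, j)
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    start[k][j] = i  # the last class covers data[i:j]

    # Walk back through the table to recover the largest value in each class.
    bounds = []
    j = n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = start[k][j]
    return bounds[::-1]


# Hypothetical counts with obvious gaps, plus one large outlier.
sample = [1, 2, 3, 20, 21, 22, 100, 101, 233]
bounds = natural_breaks(sample, 4)
print(bounds)
```

In this sample the outlier takes a class of its own, which mirrors how a single extreme value can use up one of only four classes and leave less resolution at the low end of the range.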


Quantile Classification Method: Figure 3 represents the data displayed on a map using the quantile classification method. In the quantile classification method an equal number of features is placed in each class, independent of the number of farms. In this case, the 72 counties (features) of Wisconsin were divided by four (the number of classes used for this project), which resulted in the placement of 18 counties into each class. This method is much more sensitive to differences at the lower end of the values than the two classification methods used previously, making it extremely valuable in making an informed decision as to where efforts to promote the startup of organic farms should be made. The first two classes comprise roughly half of the counties (about 36), those with fewer than 10 organic farms; these are ideal areas in which to promote the message of increasing organic farms. The spatial pattern that began to emerge in Figure 2, and became readily obvious by Figure 3, is that the southwestern, southern, and central regions of Wisconsin have many more organic farms than the surrounding areas, generally speaking. Efforts should be focused on the areas surrounding southwestern Wisconsin.



Figure 3: Mapped data using the quantile classification method.
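
The quantile arithmetic is easy to verify. The sketch below is a simplified illustration, assuming that ties are ignored (GIS software may keep tied values together, which can make real class sizes uneven); the function name and the 72 stand-in values are hypothetical, with only the maximum of 233 taken from the data.

```python
def quantile_classes(values, n_classes):
    """Split the sorted values into n_classes groups of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    groups = []
    for k in range(n_classes):
        lo = k * n // n_classes          # first index of class k
        hi = (k + 1) * n // n_classes    # one past the last index of class k
        groups.append(ordered[lo:hi])
    return groups


# Hypothetical stand-in for the 72 county counts, ending in the 233 outlier.
counts = list(range(71)) + [233]
groups = quantile_classes(counts, 4)
sizes = [len(group) for group in groups]
print(sizes)
print("Top class spans", groups[-1][0], "to", groups[-1][-1])
```

Because class membership depends only on rank, the outlier simply joins the top class and has no effect on where the lower class boundaries fall.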

Results: Based on the explanations accompanying each figure, the quantile classification method works best for this project. Both the equal interval and Natural Breaks methods were sensitive to the Vernon County outlier (value of 233), which affected the ability of these classification methods to portray the data in an easily discernible manner. This is, however, only an initial step; further study would be needed of factors such as demand for organic foods, population densities, soil quality, aquifer stability, and the available labor (organic farming being labor-intensive) in each county. Although the same data was used for each map, the results varied widely, and that variation was entirely dependent upon the data classification method used. Using a larger number of classes could have changed which method offered the best presentation of the data.