Sunday, August 21, 2011

Data: NOAA Freeze/Frost and Growing Season Data

The National Oceanic and Atmospheric Administration (NOAA) is a US government agency focused on oceanic and atmospheric data. They keep rather large data sets from several other agencies beneath them, including the National Weather Service, the National Ocean Service, the National Marine Fisheries Service, and more. The National Climatic Data Center is the storage facility for 99% of the NOAA data, which includes 1.2 petabytes of digitized data (1).

There is a great deal of useful data in the above links to plow through, but at least one good data set is here:
In short, the data in the second link above is in an ASCII text format.
The statistics in the data set were computed from data collected by 4,346 stations between the years 1971 and 2000. Each of the 3,578,228 lines has 15 space-separated columns of data; however, the first column packs 3 different values together without any spacing between them. Those values are a state code number, a station code number, and a division code number. The explanation document mentions one essential companion document, which is used to get station metadata (e.g. name, location, elevation, etc.). It is linked from the Freeze/Frost Data page above in PDF form.
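
Because the first column has no internal separators, it has to be sliced by position. Here is a minimal Python sketch of that idea. The field widths (2, 4, and 2 characters) and the sample key are my own assumptions for illustration, not values taken from the explanation document, so check them against the real format description before relying on them.

```python
from typing import NamedTuple


class StationKey(NamedTuple):
    state: str
    station: str
    division: str


# ASSUMPTION: these widths are placeholders; the real widths must be taken
# from the NOAA explanation document.
ASSUMED_WIDTHS = (2, 4, 2)


def split_station_key(key: str, widths=ASSUMED_WIDTHS) -> StationKey:
    """Slice the packed first column into state, station, and division codes."""
    if len(key) != sum(widths):
        raise ValueError(f"expected {sum(widths)} characters, got {len(key)}")
    parts = []
    pos = 0
    for width in widths:
        parts.append(key[pos:pos + width])
        pos += width
    return StationKey(*parts)


# Hypothetical key, used only to show the slicing.
key = split_station_key("01000801")
print(key.state, key.station, key.division)
```

The codes are kept as strings so that leading zeros survive, which matters if they are later matched against the station metadata document.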

Column 2 represents the temperature threshold for which the freeze date in that line was calculated. The dataset uses 6 different temperatures to compute the freeze dates, all in Fahrenheit measurements: 36, 32, 28, 24, 20, and 16.

Column 3 is the freeze season character, where '1' is for a last Spring freeze date, '2' is for a first Fall freeze date, and '3' is the growing season length.

Columns 4 through 12 are, for freeze season characters 1 and 2, the freeze dates for the temperature threshold given in column 2. Individually, the columns correspond to freeze probabilities of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, and 90%. If the line is for freeze season character 3, then these columns are the growing season lengths computed with the given temperature threshold.
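
To make the column layout concrete, here is a rough Python sketch that splits one line on whitespace and keys columns 4 through 12 by their probability level. The names `FreezeRecord`, `parse_line`, and `SEASON_KINDS` are mine, and the sample line is made up with placeholder tokens (`v10` through `v90`) because I am not showing the real value format here. The remaining columns, 13 through 15, are kept as raw strings.

```python
from dataclasses import dataclass

# Probability levels for columns 4 through 12, in order.
PROBABILITIES = (10, 20, 30, 40, 50, 60, 70, 80, 90)

# Labels for the freeze season character in column 3 (label names are mine).
SEASON_KINDS = {"1": "spring", "2": "fall", "3": "growing_season"}


@dataclass
class FreezeRecord:
    station_key: str      # packed state/station/division codes (column 1)
    threshold_f: int      # temperature threshold in Fahrenheit (column 2)
    kind: str             # decoded freeze season character (column 3)
    by_probability: dict  # probability level -> raw value (columns 4-12)
    trailing: list        # raw values of columns 13-15


def parse_line(line: str) -> FreezeRecord:
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 columns, got {len(fields)}")
    return FreezeRecord(
        station_key=fields[0],
        threshold_f=int(fields[1]),
        kind=SEASON_KINDS[fields[2]],
        by_probability=dict(zip(PROBABILITIES, fields[3:12])),
        trailing=fields[12:],
    )


# Hypothetical line: every value below is a placeholder, not real NOAA data.
sample = "01000801 32 1 v10 v20 v30 v40 v50 v60 v70 v80 v90 30 123 45"
record = parse_line(sample)
print(record.kind, record.threshold_f, record.by_probability[50])
```

Leaving the values as strings is deliberate: how a date or a season length is encoded should come from the explanation document rather than from a guess.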

Column 13 represents the number of years in which the specified freeze temperature threshold was reached or exceeded in the 30 years of data.

Column 14 contains the mean number of days of occurrences associated with the given freeze threshold. I'm not sure if that means that the exact specified temperature occurred or not. Nevertheless, the data there actually has an implied decimal point that would need to be accounted for when reading in the data.
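
An implied decimal point means the stored digits have to be shifted right by some fixed number of places when they are read. A small Python sketch of that step follows. The number of places is a parameter because I have not stated it above; the values 1 and 2 used below are only examples, and the function name is my own.

```python
from decimal import Decimal


def apply_implied_decimal(raw: str, places: int) -> Decimal:
    """Interpret a digit string as a number with `places` implied decimals."""
    # scaleb(-places) divides by 10**places without any float rounding.
    return Decimal(raw.strip()).scaleb(-places)


# ASSUMPTION: the inputs and the number of places are illustrative only.
print(apply_implied_decimal("123", 1))  # 12.3
print(apply_implied_decimal("5", 2))    # 0.05
```

Using `Decimal` instead of dividing a float keeps the value exactly as it was written in the file.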

Column 15 is the standard deviation of the mean found in column 14.

I believe this data was used to create the Freeze/Frost map visualizations linked above. The explanation document mentions derived maps, but it does not give a link to them, so I assume those are the ones it means. I might write a visualizations article with those maps.

Finally, the explanation document mentions the obvious, that these freeze/frost data are useful for the agricultural industry.