Tuesday, November 14, 2006

Automated Processing of Yield Monitor Data in Python.

George W. Mueller-Warrant and Gerald Whittaker. USDA-ARS, National Forage Seed Production Research Center, 3450 SW Campus Way, Corvallis, OR 97330

As combine yield monitor data proliferate, so does the need for robust procedures to correct artifacts and improve the quality of data summaries. Initial data processing usually includes setting the time phase delay value and removing zeros from the data stream. These steps are often followed by removal of outliers based on criteria such as a 3-sigma deviation from average yield. Data files transferred to researchers for subsequent analysis and interpretation may or may not include full metadata on the processing procedures already employed, and even if that metadata is present, the processing steps may have been inadequate to achieve the best data quality. We present Python scripts to (1) verify the correctness of the time phase delay using the Beal-Tian 3-D to 2-D minimization procedure and (2) remove outlying data points based on non-uniformity in travel speed rather than the more arbitrary deviation of yield from the average. Application of these procedures to 219 data files from the harvest of grass seed crops allowed us to recognize field-specific relationships between elevation from DEMs and seed yield, as well as relationships between soil types and seed yield.
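
To make the contrast between the two outlier criteria concrete, the following is a minimal sketch and not the authors' scripts. The function names, the rolling-median window, and the 20% speed tolerance are all assumptions chosen for illustration; the abstract does not specify how non-uniformity in travel speed is measured.

```python
from statistics import mean, median, pstdev


def three_sigma_filter(yields, k=3.0):
    """Return indices of points within k standard deviations of the mean yield.

    This is the conventional criterion mentioned in the abstract.
    """
    mu = mean(yields)
    sigma = pstdev(yields)
    if sigma == 0:
        return list(range(len(yields)))
    return [i for i, y in enumerate(yields) if abs(y - mu) <= k * sigma]


def speed_uniformity_filter(speeds, window=5, tolerance=0.2):
    """Return indices where travel speed stays close to the local median speed.

    Hypothetical criterion: a point is kept when its speed differs from the
    median of the surrounding `window` points by at most `tolerance` (as a
    fraction of that median). Both parameters are illustrative assumptions.
    """
    half = window // 2
    keep = []
    for i, s in enumerate(speeds):
        lo = max(0, i - half)
        hi = min(len(speeds), i + half + 1)
        local = median(speeds[lo:hi])
        if local > 0 and abs(s - local) <= tolerance * local:
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Made-up speeds (m/s): the combine slows sharply at index 3.
    speeds = [1.5, 1.5, 1.6, 0.4, 1.5, 1.5, 1.4, 1.5]
    print(speed_uniformity_filter(speeds))
```

A speed-based filter of this kind flags points by how the combine was moving, independent of the yield values themselves, which is the motivation the abstract gives for preferring it over a fixed deviation from average yield.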

Handout (.pdf format, 14996.0 kb)