139b Comparison of watershed disturbance predictive models for stream benthic macroinvertebrates: Multiple linear regression vs. alternative models

Tuesday, May 19, 2009: 10:30 AM
Pantlind Ballroom
Ian R. Waite, Oregon Water Science Center, U.S. Geological Survey, Portland, OR
Larry R. Brown, U.S. Geological Survey, Sacramento, CA
Jonathan G. Kennen, New Jersey Water Science Center, U.S. Geological Survey, West Trenton, NJ
Thomas F. Cuffney, U.S. Geological Survey, Raleigh, NC
Jason T. May, California Water Science Center, U.S. Geological Survey, Sacramento, CA
James L. Orlando, U.S. Geological Survey, Sacramento, CA
Kimberly A. Jones, Utah Water Science Center, U.S. Geological Survey, West Valley City, UT
If biological responses to human disturbance are linear, they should be easily modeled using standard regression techniques. If responses are more complex and nonlinear, however, other techniques such as classification and regression trees, machine learning, multilevel hierarchical modeling, or structural equation modeling may be necessary. We aggregated macroinvertebrate data from various sources to assemble data sets for modeling in three distinct ecoregions in Oregon and California. We used land use/land cover as explanatory variables to predict macroinvertebrate metrics. Multiple linear regression models from each region required only two or three explanatory variables to explain 41 to 74 percent of the variation in macroinvertebrate metrics. The responses were generally linear, yet improvements to the regression models by alternative models (random forest and boosted trees) will be discussed. Models like these can be used to better understand causal linkages between environmental drivers and stream biological attributes or condition. Further, such models not only represent the foundation of more complex mechanistic models but may also be highly useful tools for researchers or managers predicting biological indicators of stream condition at unsampled sites.
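As a rough illustration of what "two or three explanatory variables explaining a given percent of the variation" means, the sketch below fits a multiple linear regression with two predictors and reports the coefficient of determination (R²). It is a minimal sketch, not the authors' analysis: the data are synthetic, and the variable names (`pct_urban`, `pct_agriculture`, `metric`), the coefficients, and the noise level are all assumptions chosen for illustration. It uses only the Python standard library; the tree-based alternatives named in the abstract (random forest, boosted trees) would need a separate library and are not shown.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y.

    Returns [intercept, slope_1, slope_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]  # prepend intercept column
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: 100 sites with two land-cover predictors.
rng = random.Random(42)
sites = [(rng.uniform(0, 60), rng.uniform(0, 40)) for _ in range(100)]
# Assumed linear response plus noise; these coefficients are invented.
metric = [80.0 - 0.5 * pct_urban - 0.2 * pct_agriculture + rng.gauss(0, 5)
          for pct_urban, pct_agriculture in sites]

beta = fit_mlr(sites, metric)
r2 = r_squared(metric, [predict(beta, s) for s in sites])
print("intercept, slopes:", [round(b, 3) for b in beta])
print("R^2:", round(r2, 3))
```

An R² of 0.41 to 0.74, as reported in the abstract, would correspond to the fitted model accounting for 41 to 74 percent of the variance in the response. Note that this sketch reports in-sample R²; comparing a linear model with a more flexible one would normally use held-out sites or cross-validation.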