In search of causal watershed variables for watershed classification and daily streamflow prediction in ungauged watersheds
Hydrological predictions at the watershed scale are generally made by extrapolating and upscaling hydrological behavior observed at point and hillslope scales. However, some dominant hydrological drivers at the hillslope scale may be less relevant at the watershed scale because of watershed heterogeneities. Quantifiable watershed data, in the form of watershed descriptors and streamflow indices, are becoming readily available, so appropriate variable selection can provide new insight into the watershed descriptors that dominate different streamflow regimes at the watershed scale. Stepwise regression and principal component analysis are commonly used to select descriptive variables for relating runoff to climate and watershed descriptors, but these methods do not derive causal associations between response and explanatory variables. This study therefore compares the accuracy, stability, and predictive power of variables selected by stepwise regression and principal component analysis with those selected by causal selection methods (e.g., HITON Markov Blanket), and assesses their relevance in watershed hydrologic modeling. The results demonstrate that causal variable selection methods (especially HITON Markov Blanket) have a higher probability of selecting true variables than stepwise regression and principal component analysis. Variables selected by causal methods also give high classification accuracy for hydrologically similar watersheds and improve the predictive power for regionalized flow duration curves. Classification of hydrologically similar watersheds was evaluated in three Mid-Atlantic regions, the Appalachian Plateau (28 basins; 98–1779 km²), Piedmont (19 basins; 34.8–620 km²), and Ridge and Valley (25 basins; 48–1857 km²), using a similarity index (SI) that quantifies agreement between hydrological similarity (based on streamflow indices) and physical similarity (based on selected variables); classification performance is highest for variables selected by causal algorithms.
For the HITON-MB method, SI = 0.71 for Appalachian Plateau, 0.90 for Piedmont, and 0.72 for Ridge and Valley, compared with variables selected by stepwise regression (SI = 0.72, 0.87, and 0.64, respectively) and principal component analysis (SI = 0.71, 0.76, and 0.57, respectively).
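The abstract does not give the formula for SI. As a rough illustration only of what "agreement between hydrological similarity and physical similarity" could mean, the sketch below assumes each watershed has been assigned to one group based on streamflow indices and to another based on the selected variables, and scores the fraction of watershed pairs on which the two groupings agree (the Rand index). The function name `pair_agreement`, the group labels, and the use of the Rand index are all assumptions made for this example, not the study's definition of SI.

```python
from itertools import combinations


def pair_agreement(labels_a, labels_b):
    """Fraction of watershed pairs on which two groupings agree (Rand index).

    A pair "agrees" when both groupings place the two watersheds together,
    or both place them apart. This is an assumed stand-in for SI.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both groupings must cover the same watersheds")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two watersheds")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical group labels for six watersheds (not data from the study).
hydro_groups = [0, 0, 1, 1, 2, 2]      # grouped by streamflow indices
physical_groups = [0, 0, 1, 2, 2, 2]   # grouped by selected descriptors
score = pair_agreement(hydro_groups, physical_groups)
```

A score of 1 would mean the two groupings are identical up to relabeling; lower values indicate that physically similar watersheds are less often hydrologically similar.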