THE SCIENCE OF NOAA’S OPERATIONAL HYDROLOGIC ENSEMBLE FORECAST SERVICE
By Zhu, Yuejian
HEFS extends hydrologic ensemble services from 6-hour to year-ahead forecasts and includes additional weather and climate information as well as improved quantification of major uncertainties.
As no forecast is complete without a description of its uncertainty, ensemble approaches hold great potential for operational hydrologic forecasting. As demonstrated with atmospheric ensemble forecasts, estimates of predictive uncertainty provide forecasters and users with objective guidance on the level of confidence that they may place in the forecasts, and end users can decide to take action based on their risk tolerance. Furthermore, by modeling uncertainty, hydrologic forecasters can maximize the utility of weather and climate forecasts, which are generally highly uncertain and noisy (Buizza et al. 2005). With the major uncertainties quantified and their relative importance analyzed, ensemble forecasting helps identify areas where investments in forecast systems and processes will have the greatest benefit.
Development and implementation of hydrologic ensemble prediction systems is still ongoing, and hence only limited operational experience exists. A number of case studies using experimental and (pre)operational systems, however, have demonstrated their potential benefits (see, e.g., Cloke and Pappenberger 2009 and Zappa et al. 2010 for references). Recent verification studies of hydrologic ensemble forecasts or hindcasts (i.e., forecasts that are retroactively generated using a fixed forecasting system) over long time periods include Bartholmes et al. (2009), Jaun and Ahrens (2009), and Renner et al. (2009), among others.
The next section presents an overview of the HEFS and its various components. In the subsequent four sections, the individual components are described in more detail, and selected illustrative verification results are presented to demonstrate the potential benefits of HEFS. Finally, future scientific and operational challenges for improving hydrologic ensemble forecasting services are discussed.
OVERVIEW OF THE HEFS. Uncertainty in hydrologic predictions comes from many different sources: atmospheric forcing observations and predictions; initial conditions of the hydrologic model, its parameters, and structure; and streamflow regulations, among other anthropogenic uncertainties (Gupta et al. 2005). The uncertainties in the atmospheric forcing inputs are typically referred to as input uncertainty and those in all other sources as hydrologic uncertainty (Krzysztofowicz 1999). A hydrologic ensemble prediction system can either model the total uncertainty in the hydrologic output forecasts (e.g., Montanari and Grossi 2008; Coccia and Todini 2011; Weerts et al. 2011; Smith et al. 2012; Regonda et al. 2013) or explicitly account for the major sources of uncertainty, which is the primary approach of HEFS (Seo et al. 2006). As noted by Velázquez et al. (2011), hydrologic ensemble prediction systems presented in the literature often account for the input uncertainty only. Recently, however, a few systems have included techniques to address specific hydrologic uncertainties, such as hydrologic data assimilation to reduce and model the initial condition uncertainty.
A schematic view of the HEFS is given in Fig. 1 along with the information flow. For input uncertainty modeling, the Meteorological Ensemble Forecast Processor (MEFP; Schaake et al. 2007a; Wu et al. 2011) combines weather and climate forecasts from various sources to produce bias-corrected forcing (precipitation and temperature) ensembles at the space-time scales of the hydrologic models. These ensembles have coherent space-time variability among the different forcing variables and across all forecast locations. The Hydrologic Processor ingests the forcing ensembles and runs a suite of hydrologic, hydraulic, and reservoir models to produce streamflow ensembles. The data assimilation (DA) process currently consists of manual modifications of model states and parameters by the forecasters based on their expertise; it will therefore be included in HEFS only in the future, when automated DA techniques are implemented. For hydrologic uncertainty modeling, the hydrologic Ensemble Postprocessor (EnsPost; Seo et al. 2006) adjusts the streamflow ensembles to reflect the total hydrologic uncertainty in a lumped manner and produce bias-corrected streamflow ensembles. Along with the above uncertainty components, the Graphics Generator and the Ensemble Verification Service (EVS; Brown et al. 2010) enable forecasters to produce uncertainty-quantified forecast and verification information that can be tailored to user needs.
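The component chain just described (MEFP, then the Hydrologic Processor, then EnsPost) can be illustrated with a runnable toy. Every function below is a hypothetical stand-in chosen for brevity (multiplicative noise dressing, a single linear reservoir, a fixed bias factor), not the operational algorithms:

```python
import numpy as np

def mefp_stub(single_valued_precip, n_members, rng):
    """Stand-in for MEFP: dress a single-valued forcing forecast with
    multiplicative noise to produce a forcing ensemble."""
    noise = rng.lognormal(0.0, 0.3, size=(n_members, len(single_valued_precip)))
    return single_valued_precip * noise

def hydrologic_processor_stub(precip_trace, storage=10.0, k=0.2):
    """Stand-in hydrologic model: a single linear reservoir turning one
    precipitation trace into one streamflow trace."""
    flows = []
    for p in precip_trace:
        storage += p
        q = k * storage          # outflow proportional to storage
        storage -= q
        flows.append(q)
    return np.array(flows)

def enspost_stub(flow_ensemble, bias=0.9):
    """Stand-in for EnsPost: remove a known multiplicative bias."""
    return np.asarray(flow_ensemble) / bias

rng = np.random.default_rng(42)
forcing_ens = mefp_stub(np.array([5.0, 0.0, 12.0, 3.0]), 20, rng)         # input uncertainty
raw_flow_ens = np.stack([hydrologic_processor_stub(t) for t in forcing_ens])
flow_ens = enspost_stub(raw_flow_ens)                                      # hydrologic uncertainty
print(flow_ens.shape)  # members x lead times
```

The point of the sketch is only the information flow: forcing uncertainty is introduced first, propagated through the hydrologic model trace by trace, and the collective hydrologic uncertainty is then handled on the streamflow output.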
Diagnostic verification of hydrologic forecasts needs to be routinely performed by scientists and operational forecasters to improve forecast quality (Welles et al. 2007) and to provide up-to-date verification information in real time to users. Such activity requires the capability of running the hydrologic ensemble forecast system in hindcasting mode to retroactively generate ensemble forecasts for multiple years using the newly developed ensemble forecasting approaches. Verification of hindcasts may be used to evaluate the benefits of new or improved ensemble forecasting approaches, analyze the various sources of uncertainty and error in the forecasting system, and guide targeted improvements of the forecasting system (Demargne et al. 2009; Renner et al. 2009; Brown et al. 2010). Hindcast datasets may also be required by operational forecasters to identify historical analog forecasts to make informed decisions in real time, and by sophisticated users to calibrate decision support systems.
In the context of operational hydrologic forecasting in the NWS, the HEFS has been developed to improve upon operational single-valued forecasting and seasonal ESP forecasting while capturing user requirements, which include 1) supporting both real-time ensemble forecasting and hindcasting for large-sample verification and systematic evaluation, 2) maintaining interoperability with the single-valued forecasting system for the short range (given that single-valued forecasting is only a special case of ensemble forecasting), and 3) producing ensemble forecast information that is statistically consistent over a wide range of spatiotemporal scales. The operational hydrologic and water resources models used for both single-valued and probabilistic forecasting are simple conceptual models applied in a lumped fashion, with relatively few parameters estimated by manual calibration (a unique set of parameters being defined for all flow regimes, from low flow to flooding conditions). As expected, hydrologic predictability can be limited in poorly monitored areas, when river gauges malfunction (e.g., during flood events), and during rapidly changing hydrometeorological conditions. Moreover, modeling reservoir regulations and diversions is challenging because of the lack of reliable information available to the RFC forecasters and changes of reservoir operations to adjust to the current and forecast flow situation. Also, the estimation of past forcings for model calibration and hindcasting may not be consistent with real-time meteorological model inputs, owing to changes in tools (e.g., gauges versus radar for precipitation estimation) and models, as well as estimation errors. To address these data and model challenges, the RFCs have longstanding practices of applying subjective manual modifications of model states and parameters for single-valued forecasting (see Raff et al. 2013 for details on RFC practices); these modifications are not currently included in HEFS.
The initial HEFS prototype system, referred to as the Experimental Ensemble Forecast System (XEFS; www.nws.noaa.gov/oh/XEFS/), began testing at selected RFCs in 2009. The ongoing HEFS implementation is based on three software development releases to five test RFCs, from spring 2012 to fall 2013. The development phase is targeted to be completed by the end of 2013, with HEFS implementation at all 13 RFCs in 2014. The project has been accelerated by an agreement with the
As in NWS operational single-valued hydrologic forecasting, HEFS uses CHPS, an open service-oriented architecture built on the Delft-FEWS framework (Werner et al. 2004). It facilitates incorporation of new models and tools, establishes interoperability with partners, and accelerates research to operations. CHPS is critical in supporting the NOAA Integrated Water Resources Science and Services in partnership with federal agencies.
METEOROLOGICAL ENSEMBLE FORECAST PROCESSOR. Reliable and skillful atmospheric ensemble forecasts are necessary for hydrologic ensemble forecasting. Ensemble forecasts from NWP models are widely available from several atmospheric prediction centers. However, these ensembles are generally biased in the mean, spread, and higher moments (Buizza et al. 2005), both unconditionally and conditionally on magnitude, season, storm type, and other attributes. The conditional biases may be particularly large for heavy precipitation events that are crucial in flood forecasting (Hamill et al. 2006, 2008; Brown et al. 2012). There are several statistical techniques for estimating the conditional probability distribution of an (assumed unbiased) observed variable given a potentially biased forecast (see, e.g., references in Brown et al. 2012). These techniques vary in their assumptions about the conditional (or joint) probability distribution, the predictors used (e.g., single-valued forecast, attributes of an ensemble forecast), and the estimation of the statistical parameters (e.g., full period, seasonal, moving window, threshold dependent). Several techniques have been compared for specific variables and modeling systems (e.g., Gneiting et al. 2005; Wilks and Hamill 2007; Hamill et al. 2008). Bias correction of precipitation ensemble forecasts is particularly challenging because precipitation is intermittent, depends strongly on space-time scale, and is relatively unpredictable in many cases (e.g., convective events). For hydrologic forecasting with lumped models, the gridded NWP ensembles need to be processed at the basin scale, which requires "downscaling" (described as a change of support in geostatistics) and bias correction. This downscaling includes corrections to match the climatology of the forcings used to calibrate the hydrologic model.
The MEFP aims to generate unbiased ensembles that capture the skill of the forecasts from multiple sources for individual basins while preserving the space-time properties of hydrometeorological variables (e.g., precipitation and temperature) across all basins (Schaake et al. 2007a; Wu et al. 2011). For short-range forecasts, human forecasters generally add significant value to single-valued hydrometeorological forecasts derived from raw NWP forecasts (Charba et al. 2003). Also, postprocessing studies have repeatedly demonstrated that most information from NWP medium-term ensembles comes from the ensemble mean (e.g., Hamill et al. 2004; Wilks and Hamill 2007). Therefore, the MEFP uses the single-valued forecasts modified by human forecasters for the short-range forecast horizon (up to 7 days) and the ensemble mean forecasts from multiple NWP models for the mid- to long range to generate seamless and calibrated hydrometeorological ensembles up to a 1-yr forecast horizon. Precipitation and temperature are processed slightly differently since precipitation is intermittent and highly skewed, whereas the temperature distribution is nearly Gaussian. MEFP uses the normal quantile transform (NQT) to transform observed and forecast precipitation variables into normal variates. The precipitation part of MEFP also includes an explicit treatment of precipitation intermittency using the mixed-type bivariate meta-Gaussian model (Herr and Krzysztofowicz 2005), parametric and nonparametric modeling of the marginal probability distributions, and a parameter optimization under the continuous ranked probability score (CRPS; Hersbach 2000) and other criteria (see Wu et al. 2011 for details). For temperature, the MEFP procedure first generates ensembles of daily maximum and minimum temperatures, and then generates ensembles at subdaily time steps from the daily ensembles through a diurnal variation model.
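The NQT step can be sketched as follows. The empirical Weibull plotting positions, the clipping of extreme probabilities, and the linear interpolation on the back-transform are illustrative assumptions here, not the exact MEFP implementation:

```python
import numpy as np
from statistics import NormalDist

def nqt_forward(x, sample):
    """Normal quantile transform: map x to a standard-normal variate via
    its empirical (Weibull) plotting position in a climatological sample."""
    sample = np.sort(np.asarray(sample, float))
    n = len(sample)
    rank = np.searchsorted(sample, x, side="right")
    # clip away from 0 and 1 so the inverse normal CDF stays finite
    p = np.clip(rank / (n + 1), 1 / (n + 1), n / (n + 1))
    return NormalDist().inv_cdf(float(p))

def nqt_inverse(z, sample):
    """Back-transform a standard-normal variate to the data scale by
    interpolating the empirical quantile function of the sample."""
    sample = np.sort(np.asarray(sample, float))
    n = len(sample)
    p = NormalDist().cdf(float(z))
    plotting_positions = np.arange(1, n + 1) / (n + 1)
    return float(np.interp(p, plotting_positions, sample))
```

For example, the median of a 99-value climatological sample maps to z = 0, and z = 0 maps back to the sample median; statistical modeling (e.g., the meta-Gaussian dependence model) is then done on the transformed variates.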
This scheme is based on the same interpolation procedures used to calculate subdaily historical temperature time series, and it accounts for the diurnal cycle assumed in the operational calibration process (Anderson 1973).
For each hydrometeorological variable for a given basin (i.e., precipitation, maximum temperature, and minimum temperature), a specific forecast source, and a given forecast lead time, MEFP estimates the joint probability distribution of observations and single-valued forecasts based on a multiyear archive of the observations and forecasts. This calibration is performed for each day of the year by pooling historical observed-forecast pairs from a time window centered on that day in order to account for seasonality. In real time, given the current single-valued forecast, MEFP derives the conditional probability distribution of the observations, from which ensemble members are sampled. The ensemble members are generated for each individual time step, and then the Schaake shuffle (Clark et al. 2004) is applied to arrange the ensemble values according to the ranks of the historical observations. In this way, the produced ensemble time series preserve rank correlations across multiple lead times, basins, and hydrometeorological variables (e.g., precipitation and temperature). The ensemble copula coupling approach (Schefzik et al. 2013) also aims to recover the space-time multivariate dependence structure, but from the raw ensembles instead of the historical observations. Both approaches are very attractive computationally, requiring only the computation of marginal ranks, and could be applied for any dimensionality. However, both limit the number of postprocessed ensemble members: to the number of observed historical years for the Schaake shuffle, and to the number of raw ensemble members for ensemble copula coupling (making it difficult to use multiple forecast sources with different ensemble sizes). For extreme events, if the NWP ensembles are skillful, the multivariate dependence structure should be contained in the raw ensembles (and therefore should be realistically described by the ensemble copula coupling approach).
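A minimal sketch of the Schaake shuffle as described above: independently sampled ensemble values at each lead time are reordered to follow the ranks of a set of historical observed traces (the array shapes and the function name are ours; the same template would be reused across variables and basins to link their rank structures):

```python
import numpy as np

def schaake_shuffle(ensemble, historical):
    """Reorder ensemble members at each lead time so their ranks match
    the ranks of historical observed traces, restoring the space-time
    rank dependence lost by independent sampling.

    ensemble:   (n_members, n_leadtimes) independently sampled values
    historical: (n_members, n_leadtimes) observed traces from n_members
                distinct historical years (rank-structure template)
    """
    ensemble = np.asarray(ensemble, float)
    historical = np.asarray(historical, float)
    out = np.empty_like(ensemble)
    for t in range(ensemble.shape[1]):
        # rank of each historical trace at this lead time
        order = np.argsort(np.argsort(historical[:, t]))
        # place sorted ensemble values at the template's ranks
        out[:, t] = np.sort(ensemble[:, t])[order]
    return out
```

After the shuffle, each output member follows one historical year's rank trajectory, so temporal (and, applied jointly, spatial and cross-variable) rank correlations mimic the observed climatology while the marginal distributions at each lead time are untouched.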
However, it may be lacking in the observation record for the Schaake shuffle approach, owing to the likely lack of occurrences of similar historical events over the forecast horizon. If the NWP model output is strongly structured, parametric copula approaches might be used (as in Möller et al. 2013) to correct for any systematic errors in the ensemble's representation of the conditional dependence structure. However, such parametric procedures are very expensive computationally and could be limited in practice by the output dimensionality. We therefore suggest examining, in the future, alternative approaches that use raw forecasts, observations, or some combination of the two (e.g., "analogs") to improve the space-time rank structure.
In general, the forecast uncertainty and skill are time-scale dependent. Even though the forecast skill at the individual time steps may be limited, especially for long lead times, the skill of forecasts aggregated over multiple time steps is likely to be useful and needs to be exploited for hydrologic and water resources applications. Therefore, the MEFP calibration and ensemble forecasting procedures are also applied to a set of precipitation accumulations and temperature averages defined by the user across different forecast periods from the individual time steps (e.g., n-day events and x-month events up to the maximum available forecast horizon for each forecast source). The final ensemble members at the individual time steps are sequentially produced by the Schaake shuffle for the original and aggregated temporal scales according to increasing forecast skill at the individual scales and for the different forecast sources, with the highest skill having the greatest influence on the final values (see Schaake et al. 2007a for details).
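Forming the aggregated series used in this multiscale calibration is straightforward; a sketch for non-overlapping accumulations of a 6-h precipitation trace (the window length and the truncation of a partial trailing window are illustrative choices):

```python
import numpy as np

def accumulations(precip_6h, steps_per_window):
    """Non-overlapping accumulations of a 6-h precipitation trace,
    e.g. steps_per_window=4 gives daily totals; a trailing partial
    window is dropped."""
    n = (len(precip_6h) // steps_per_window) * steps_per_window
    return np.asarray(precip_6h[:n]).reshape(-1, steps_per_window).sum(axis=1)
```

For instance, eight 6-h amounts aggregate to two daily totals; calibration and skill assessment can then be repeated on each aggregated series.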
MEFP has been experimentally implemented and evaluated at several RFCs using single-valued forecasts from various sources for a number of different forecast horizons. For the short-range forecast horizon, MEFP uses RFC operational single-valued forecasts as modified by the human forecasters. Depending on forecast locations, these forecasts are available from 1 to 5 forecast lead days for precipitation and up to 7 forecast lead days for temperature. Validation results were reported by 1) Schaake et al. (2007a) for precipitation and temperature for one basin in
Figure 2 (from Wu et al. 2011) shows the CRPS values of MEFP-generated 6-h precipitation ensembles for the first lead day for the
For medium range (up to 14 forecast lead days), the single-valued forecasts are obtained as the ensemble means from the frozen version (circa 1998) of the Global Forecast System (GFS; Hamill et al. 2006) of the
Figure 4 shows the verification results from dependent validation for the 14-day GFS-based precipitation ensembles compared to the climatology-based precipitation ensembles. All ensembles were produced at a 6-h time step from 1979 to 2005, with 45 ensemble members, using method 3, and were verified with the EVS as daily totals. The mean error is reported for the ensemble mean (commonly used as a single-valued representation of the ensemble in operational forecasting), along with the continuous ranked probability skill score (CRPSS), which describes the overall quality of the probabilistic forecast in reference to the climatology-based ensembles. At short lead times, the precipitation ensembles are relatively unbiased in the unconditional sense, as evidenced by the mean error for the precipitation intermittency threshold; however, they underforecast for larger observed events. These Type-II conditional biases are common in ensemble forecasting systems, since model calibration typically favors average conditions, and such conditional biases are more difficult to remove with postprocessing (Brown et al. 2012). When compared to the climatology-based ensembles for all 14 lead days, the MEFP-generated ensembles expectedly show reduced conditional biases. The quality of GFS-based precipitation decreases rapidly with increasing forecast lead time, as evidenced by the increased mean error and the reduction in CRPSS. However, because of the relatively large predictability of orographic precipitation in the
As part of the ongoing comprehensive evaluation of HEFS ensembles, Brown (2013) analyzed verification results of GFS-based precipitation and temperature ensemble hindcasts for a 14-day forecast horizon for four pairs of headwater-downstream test basins located in
MEFP has recently been enhanced to ingest forecasts from the NCEP's latest Global Ensemble Forecast System (GEFS), which was implemented in
In the future, MEFP should include forecasts from other NWP models [e.g., the Short-Range Ensemble Forecast (SREF) system produced by the NCEP (Du et al. 2009)], techniques to estimate precipitation from the combination of different NWP model output variables (e.g., total column precipitable water), and additional and/or alternative postprocessing techniques, for example, to incorporate information from the ensemble spread and higher moments (Brown and Seo 2010). In the experimental Meteorological Model-Based Ensemble Forecast System (Philpott et al. 2012), three Eastern Region RFCs and a Southern Region RFC are also investigating the use of SREF and GEFS ensembles, as well as North American Ensemble Forecast System (NAEFS) ensembles, all produced and bias corrected (at the grid scale) by the NCEP (Cui et al. 2012) (experimental products available at www.erh.noaa.gov/mmefs/). Grand-ensemble datasets such as
HYDROLOGIC ENSEMBLE POSTPROCESSOR. Sources of hydrologic bias and uncertainty may be unknown or poorly specified in hydrologic ensemble prediction systems. Therefore, a range of statistical postprocessing techniques have been developed to account for the collective hydrologic uncertainty (Krzysztofowicz 1999; Seo et al. 2006; Coccia and Todini 2011; Brown and Seo 2013; and references therein). They aim to produce reliable (i.e., conditionally unbiased) hydrologic ensemble forecasts from single-valued forecasts or "raw" ensemble forecasts, sometimes with the aid of covariates, accounting only for the hydrologic uncertainty in the forecasts. The resulting probability distribution is described by a complete density function (e.g., Krzysztofowicz 1999; Seo et al. 2006; Montanari and Grossi 2008; Todini 2008; Bogner and Pappenberger 2011) or by several thresholds of the distribution (e.g., Solomatine and Shrestha 2009; Brown and Seo 2013). Examples of postprocessing techniques for hydrologic ensemble prediction systems include error correction based on the last known forecast error (Velázquez et al. 2009) and an autoregressive error correction using the most recent modeled error (Renner et al. 2009).
In the HEFS, the EnsPost (Seo et al. 2006) accounts for the collective hydrologic uncertainty in a lumped form. Since MEFP generates bias-corrected hydrometeorological ensembles that reflect the input uncertainty, EnsPost is calibrated with simulated streamflow (i.e., generated from perfect future meteorological forcings) without any manual modifications of model states and parameters. The hydrologic uncertainty is, therefore, modeled independently of forecast lead time. The postprocessed streamflow ensembles result from the integration of the input and hydrologic uncertainties and hence reflect the total uncertainty. The current version of the EnsPost employs a parsimonious statistical model that combines probability matching and time series modeling. Parsimony is important to reduce data requirements and, therefore, the sampling uncertainty of the estimated parameter values. The procedure adjusts each ensemble trace via recursive linear regression in normal space (see Seo et al. 2006 for details). The regression is a first-order autoregressive model with an exogenous variable, or ARX(1,1), and uses the normal-quantile-transformed historical simulations and verifying observations. The regression parameters are optimized for different seasons and flow categories, taking into account that the correlation depends greatly on flow magnitude and season. Recently, this model has been modified to better simulate temporal variability in the postprocessed streamflow ensembles by accounting for the dependence in normal space between the residual error of the model fit and the observed streamflow, as well as the serial correlation in the residual error.
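The ARX(1,1)-style recursion in normal space can be sketched as below. The function and parameter names are ours, and the coefficients a, b, and sigma are assumed to have already been estimated from the historical observed-simulated regression for the relevant season and flow category; the operational EnsPost also includes the probability-matching and back-transform steps omitted here:

```python
import numpy as np

def enspost_arx(z_sim, z_obs_last, a, b, sigma, rng, n_members=20):
    """One-trace sketch of an ARX(1,1)-style adjustment in normal space:
        z_hat(t) = a * z_hat(t-1) + b * z_sim(t) + eps,  eps ~ N(0, sigma^2)
    z_sim:      normal-space model simulation for future lead times
    z_obs_last: last normal-space observation (initializes the recursion)
    Returns an (n_members, n_leadtimes) ensemble of adjusted traces."""
    n = len(z_sim)
    ens = np.empty((n_members, n))
    for m in range(n_members):
        prev = z_obs_last
        for t in range(n):
            prev = a * prev + b * z_sim[t] + rng.normal(0.0, sigma)
            ens[m, t] = prev
    return ens
```

The recursion makes the last observation most influential at short lead times and lets its weight decay with lead time, which is consistent with the fast-decaying "basin memory" discussed later; each trace would then be mapped back to flow space through the inverse NQT.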
EnsPost is currently applied to daily observed and forecast streamflows; after statistical postprocessing, the adjusted ensemble values are disaggregated to subdaily flows. In Seo et al. (2006) and subsequent studies for other locations, the EnsPost shows satisfactory results for short forecast horizons and for all ranges of flow. However, independent validation shows slightly degraded results in comparison to dependent validation when EnsPost parameters were estimated from a 20-yr record, mainly owing to uncertainties in the empirical cumulative distribution functions of observed and simulated flows. Seo et al. (2006) underlined that in real-time applications, when the postprocessor parameters may be regularly (e.g., annually) updated using more than 20 years of data, the performance of EnsPost would be similar to or better than the obtained independent validation results. Examples of cross-validation results are shown in Fig. 5 for postprocessed flow ensemble hindcasts produced with perfectly known future forcing. The daily flow ensemble hindcasts were generated for the NFDC1 basin using 38 years of observed-simulated flow records. In Fig. 5, the reliability diagram and the ROC curve relative to a threshold of 95th-percentile flow indicate good reliability (left plot) and discriminatory skill similar to the single-valued model predictions (right plot) for the first and fifth lead days. However, the current version of EnsPost is of limited utility for complex flow regulations and does not explicitly account for timing errors in the streamflow simulations (see Liu et al. 2011).
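A ROC curve of the kind shown in Fig. 5 can be computed from ensemble exceedance probabilities as sketched below (the helper name is ours, and the 11 probability levels swept are an arbitrary illustrative choice):

```python
import numpy as np

def roc_points(ensembles, obs, flow_threshold):
    """Hit rate vs. false-alarm rate for exceedance of a flow threshold,
    sweeping the probability level used to trigger a 'yes' forecast.
    ensembles: (n_forecasts, n_members); obs: (n_forecasts,)"""
    prob = (np.asarray(ensembles) > flow_threshold).mean(axis=1)
    event = np.asarray(obs) > flow_threshold
    pts = []
    for p in np.linspace(0.0, 1.0, 11):
        warn = prob >= p
        hits = (warn & event).sum()
        false_alarms = (warn & ~event).sum()
        pod = hits / max(event.sum(), 1)            # probability of detection
        pofd = false_alarms / max((~event).sum(), 1)  # prob. of false detection
        pts.append((pofd, pod))
    return pts
```

Points lying well above the diagonal (high detection at low false-alarm rate) indicate the discriminatory skill discussed for the 95th-percentile flow threshold.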
Regarding the quality of HEFS flow ensembles, examples of dependent verification results are given for raw and postprocessed flow ensemble hindcasts produced by the Hydrologic Processor and EnsPost using the GFS-based precipitation and temperature ensembles generated by MEFP. For flow hindcasting, the Hydrologic Processor is first run in simulation mode with the observed precipitation and temperature time series to generate the historical initial conditions for all hindcast dates. Based on these historical initial conditions, flow ensemble hindcasts are produced by the Hydrologic Processor for each hindcast date using the MEFP precipitation and temperature ensemble hindcasts, and are then postprocessed by EnsPost. To evaluate the performance gain from using MEFP and EnsPost, flow ensembles produced by the Hydrologic Processor (using the same retrospective initial conditions) from climatological forcing ensembles are used as reference forecasts.
The example verification results are given for the NFDC1 basin, for which 6-h ensemble hindcasts were produced from 1979 to 2005 and verified with EVS as daily average flows. Comparisons of dependent and independent validation results for MEFP and EnsPost in previous studies (e.g., Wu et al. 2011; Seo et al. 2006) have shown their robustness. Thus, the following dependent validation results for HEFS-generated flow ensembles give a reasonable indication of the expected performance of HEFS in real-time applications, when both MEFP and EnsPost are calibrated with more than 25 years of data, even if some degradation is expected for rare events. As illustrated in Figs. 2-4 for the NFDC1 basin, MEFP precipitation ensembles perform well, particularly when compared with climatological ensembles. The marginal value of EnsPost depends largely on the magnitude of the systematic bias in the model-simulated streamflow. For the NFDC1 basin, the model simulation is of very high quality, with a volume bias of only about 1%. As such, one may expect the contribution from the EnsPost to be modest, coming mostly from improved reliability by adding spread to the streamflow ensembles.
Figure 6 shows the mean error for the ensemble means and the CRPSS for the postprocessed flow ensembles and raw flow ensembles in reference to the climatology-based flow ensembles. The GFS-based flow ensembles exhibit a conditional bias consistent with the conditional bias of the precipitation ensembles: overforecasting of small events and underforecasting of large events. However, owing to hydrologic persistence or "basin memory," the quality of the flow ensembles declines more slowly than that of the precipitation ensembles. Regarding the CRPSS results, the sharp increase in skill between the first and second forecast days is due to the fact that, for the first lead day, the climatology-based flow ensembles also have good skill owing to persistence, which reduces the skill score of the GFS-based flow ensembles. The comparison of the CRPSS values for the raw flow ensembles and the postprocessed flow ensembles shows that most of the flow forecast skill comes from the MEFP component, with limited impact from EnsPost. The additional improvement by EnsPost is marginal because of the small hydrologic biases and uncertainties in this basin. It decreases very quickly within the first few days as a reflection of the fast-decaying memory in the initial conditions, noting that the prior observation is a predictor in the EnsPost.
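The CRPSS reported above compares the mean CRPS of a forecast system against that of a reference such as the climatology-based ensembles. A sketch using the sample (energy) form of the ensemble CRPS (the function names are ours; operational EVS computations may differ in estimator details):

```python
import numpy as np

def crps_ensemble(members, obs):
    """Sample CRPS for one forecast-observation pair via the energy form:
    E|X - y| - 0.5 * E|X - X'|, with X, X' drawn from the ensemble."""
    m = np.asarray(members, float)
    return np.abs(m - obs).mean() - 0.5 * np.abs(m[:, None] - m[None, :]).mean()

def crpss(mean_crps_system, mean_crps_reference):
    """Skill score relative to a reference: 1 is perfect, 0 matches the
    reference, and negative values are worse than the reference."""
    return 1.0 - mean_crps_system / mean_crps_reference
```

For a single-member "ensemble" the CRPS reduces to the absolute error, which is why the score provides a seamless comparison between probabilistic and single-valued forecasts.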
However, as pointed out by Brown (2013), the overall skill of GFS-based postprocessed flow ensembles in reference to climatology-based flows, as well as the relative contributions of the MEFP (with GFS forecasts) and EnsPost components, depend on the basin location (as illustrated in Fig. 7 with basins located in four different RFCs), flow amount, and season. In Fig. 8, examples of ensemble traces for two different basins illustrate how MEFP and EnsPost may help to predict a large flow event 5 days in advance but with a peak timing error (top plots), and may produce ensembles with a much reduced spread compared to climatology-based flows but with a low bias tendency (bottom plots).
The performance of EnsPost depends largely on the availability of long-term observed and model-simulated flows and on the assumption that the streamflow climatology is stationary over a multidecadal period. If additional stratification of the observed-simulated flow dataset is necessary for parameter estimation to improve the model fit for specific conditions (e.g., snowmelt), the EnsPost will require an even larger dataset for its calibration. In the future, for areas where observed and simulated flow data are available at subdaily scales (6-hourly or hourly), direct modeling of the subdaily flow will be necessary for improved performance. The use of multiple temporal scales of aggregation to improve bias correction at longer ranges is under investigation, and evaluation of other bias-correction techniques (including those used for atmospheric forcings) is also ongoing.
Moreover, EnsPost currently needs to be applied without any manual modifications of model states and parameters to maintain consistency between the real-time ensemble flows and the simulated flows used for its calibration, as well as with the EnsPost-generated streamflow hindcasts and verification results. Therefore, for real-time ensemble prediction, the set of model states used in HEFS is generated with a simulation time window long enough to minimize the impact of any modifications previously applied in single-valued forecasting. Obviously, EnsPost needs to evolve along with the data assimilator component to utilize automated DA procedures. Meanwhile, given that the current manual modifications address significant limitations in the operational models and datasets, we recommend analyzing the potential impact of these modifications on the performance of HEFS flow ensembles. Such a comprehensive evaluation could offer objective guidance on best operational practices for applying manual modifications and on cost-effective transitioning of experimental automated DA capabilities into operational ensemble forecasting.
ENSEMBLE VERIFICATION. To evaluate the performance of HEFS for both research and operational forecasting purposes, ensemble verification is required. Key attributes of forecast quality include the degree of bias of the forecast probabilities, whether unconditionally or conditionally upon the forecasts (reliability, or Type-I conditional bias) or the observations (Type-II conditional bias); the ability to discriminate between different observed events (i.e., to issue distinct probability statements); and skill relative to a baseline forecasting system (Jolliffe and Stephenson 2003; Wilks 2006). Ensemble forecasting systems, such as HEFS, are intended for a wide range of practical applications, such as flood forecasting, river navigation, and water supply forecasting. Therefore, forecast quality needs to be evaluated for a range of observed and forecast conditions in terms of forecast horizon, space-time scale, seasonality, and magnitude of event. The EVS, built on the Ensemble Verification System software (Brown et al. 2010; freely available from www.nws.noaa.gov/oh/evs.html), was designed to support conditional verification of forcing and hydrologic ensembles generated by HEFS, as well as by external ensemble forecasting systems. EVS is a flexible, modular, and open-source software tool programmed in Java to allow cost-effective collaborative research and development with academic and private institutions and rapid research-to-operations transition of scientific advances.
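Reliability, the Type-I conditional bias attribute above, is commonly assessed with a reliability diagram; a minimal sketch of the underlying binning (the bin count is an arbitrary choice, and the function name is ours):

```python
import numpy as np

def reliability_points(forecast_probs, occurred, bins=5):
    """Points for a reliability diagram: within each forecast-probability
    bin, compare the mean forecast probability with the observed relative
    frequency of the event. Points on the 1:1 line indicate reliable
    (Type-I conditionally unbiased) probabilities."""
    p = np.asarray(forecast_probs, float)
    y = np.asarray(occurred, float)
    edges = np.linspace(0.0, 1.0, bins + 1)
    pts = []
    for i in range(bins):
        last = (i == bins - 1)
        sel = (p >= edges[i]) & ((p <= edges[i + 1]) if last else (p < edges[i + 1]))
        if sel.any():
            pts.append((p[sel].mean(), y[sel].mean()))
    return pts
```

For instance, if an event is forecast with probability 0.1 on ten occasions and occurs once, the corresponding point falls on the diagonal, i.e., the probabilities are reliable for that bin.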
Key features of EVS include the following (see Brown et al. 2010 for details):
* the ability to evaluate forecast quality for any continuous numerical variable (e.g., precipitation, temperature, streamflow, river stage) at specific forecast locations (points or areas) and for any temporal scale or forecast lead time;
* the ability to evaluate the quality of an ensemble forecasting system conditional upon many factors, such as forecast lead time, seasonality, temporal aggregation, magnitude of event (defined in various ways, such as exceedance of a real-valued threshold or climatological probability), and values of auxiliary variables (e.g., quality of flow ensembles conditional upon the amount of observed precipitation);
* the ability to evaluate key attributes of forecast quality, such as reliability, discrimination, and skill, at varying levels of detail, ranging from highly summarized (e.g., skill scores such as CRPSS) to highly detailed (e.g., box plots of conditional errors);
* the ability to aggregate the forecasts in time (e.g., hourly to daily) and to evaluate aggregate performance over a range of forecast locations, either by pooling pairs or computing a weighted average of the verification metrics from several locations;
* the ability to generate graphical and numerical outputs in a range of file formats (R scripts are also provided for further analysis and generation of custom graphics);
* the ability to implement a verification study via the graphical user interface (GUI) or to batch process a large number of forecast locations on the command line, using a project file in an XML format (the EVS can also be run within CHPS, e.g., to produce diagnostic verification results for one or multiple hindcast scenarios); and
* the ability to estimate the sampling uncertainty in the verification metrics using the stationary block bootstrap: synthetic realizations of the original paired data are repetitively generated and the verification metrics are computed for each sample to estimate a bootstrap distribution of the verification metrics, from which the percentile confidence intervals are then derived.
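The last item can be sketched in a few lines. The code below is a generic illustration of the stationary block bootstrap (Politis and Romano 1994), not the EVS implementation; the mean block length, the metric, and the data layout are all assumptions for the sketch:

```python
import numpy as np

def stationary_bootstrap_ci(pairs, metric, mean_block=10, n_boot=1000,
                            alpha=0.10, rng=None):
    """Percentile confidence interval for a verification metric via the
    stationary block bootstrap: resampled series are assembled from blocks
    with geometrically distributed lengths (mean `mean_block`), which
    preserves short-range temporal dependence in the paired data."""
    rng = np.random.default_rng(rng)
    n = len(pairs)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = np.empty(n, dtype=int)
        i = 0
        while i < n:
            start = rng.integers(n)                  # random block start
            length = min(rng.geometric(1.0 / mean_block), n - i)
            idx[i:i + length] = (start + np.arange(length)) % n  # wrap around
            i += length
        stats[b] = metric(pairs[idx])                # metric on resampled pairs
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi
```

For example, with `pairs` as an array of (forecast, observation) columns and `metric` the mean absolute error, the returned interval brackets the MAE at the (1 - alpha) level.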
EVS is regularly enhanced to address the needs of modelers and forecasters as HEFS is being implemented and evaluated across all RFCs, and as the Ensemble Verification System is being used in other projects such as HEPEX.
GRAPHICS GENERATOR. Communicating uncertainty information to a wide range of end users represents a challenge. As hydrologic ensemble forecasting is relatively new, much research is needed to define the most effective methods of presenting such information and to design decision support systems that maximize their utility (Cloke and Pappenberger 2009). Challenges in communicating hydrologic ensembles include how to understand the ensemble forecast information (e.g., the value of the ensemble mean, the relation between spread and skill), how to use such information (e.g., in coordination with deterministic forecasts), and how to communicate it (e.g., spaghetti plots versus plume charts), even to nonexperts (Demeritt et al. 2010). A variety of practical approaches and products have been presented by Bruen et al. (2010) for seven European ensemble forecasting platforms and by Ramos et al. (2007) and Demeritt et al. (2013) for the European Flood Alert System. Pappenberger et al. (2013) formulated recommendations for effective visualization and communication of probabilistic flood forecasts among experts, acknowledging that there is no overarching agreement or one-size-fits-all solution.
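As a concrete illustration of the plume charts mentioned above, an ensemble of streamflow traces can be reduced to the percentile envelopes that such a chart displays. The percentile choices below are illustrative, not a specification of any HEFS or GraphGen product:

```python
import numpy as np

def plume_bands(ensemble, percentiles=(5, 25, 50, 75, 95)):
    """Reduce an (n_members, n_leadtimes) streamflow ensemble to the
    percentile traces of a plume chart: e.g., 5-95% and 25-75% bands
    around the median, one output row per requested percentile."""
    return np.percentile(np.asarray(ensemble, dtype=float),
                         percentiles, axis=0)
```

Plotting these rows against lead time with any charting library (shaded fills between the 5th/95th and 25th/75th traces, median drawn on top) yields the familiar plume display, whereas drawing every member individually yields the spaghetti plot.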
In HEFS, the Graphics Generator (GraphGen), a generic software tool for CHPS, enables forecasters to generate and visualize information for internal decision support during operations, as well as to disseminate the final products to end users. This tool is expected to be accessed externally through a web service interface, which will allow the uncertainty-quantified forecast and verification information to be tailored to the needs of specific external users. GraphGen includes the functionality of the NWSRFS Ensemble Streamflow Prediction Analysis and Display Program.
Furthermore, verification information needs to be provided along with forecast information to support decision making (Demargne et al. 2009). Similar approaches have been reported by Bartholmes et al. (2009), Renner et al. (2009), and
Through customer evaluations of the AHPS website and the NWS Hydrology Program, the NWS has recognized the need to better communicate hydrologic forecast uncertainty information, so that end users can better understand and more effectively use such information in their decision making. The NWS, USACE, and
CONCLUSIONS AND FUTURE CHALLENGES. The end-to-end HEFS provides, for short to long range, uncertainty-quantified forecast and verification products that are generated by 1) the MEFP, which ingests weather and climate forecasts from multiple Numerical Weather Prediction models to produce seamless and bias-corrected precipitation and temperature ensembles at the hydrologic basin scales; 2) the Hydrologic Processor, which inputs the forcing ensembles into a suite of hydrologic, hydraulic, and reservoir models; 3) the EnsPost, which models the collective hydrologic uncertainty and corrects for biases in the streamflow ensembles; 4) the EVS, which verifies the forcing and streamflow ensembles to help identify the main sources of skill and bias in the forecasts; and 5) the Graphics Generator, which enables forecasters to derive and visualize products and information from the ensembles. Evaluation of the HEFS through multiyear hindcasting and large-sample verification is currently underway, and results obtained so far show positive skill and reduced bias in the short to medium term when compared to climatology-based ensembles and single-valued forecasts. However, the performance varies significantly with, for example, forecast horizons, basin locations, seasons, and event magnitudes, which underlines the need for a systematic and comprehensive evaluation of HEFS ensembles across the different RFCs.
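The data flow through this five-component chain can be caricatured in a few lines. Everything below (function names, the runoff coefficient, the multiplicative bias) is an illustrative stand-in and bears no relation to the actual HEFS implementations; the point is only that the member dimension is preserved end to end:

```python
import numpy as np

def mefp_stub(single_valued_precip, n_members=40, spread=0.25, rng=None):
    """Illustrative MEFP stand-in: dress a single-valued forcing forecast
    with multiplicative noise to form an ensemble (the real MEFP conditions
    on multiple NWP models and climatology)."""
    rng = np.random.default_rng(rng)
    noise = 1.0 + spread * rng.standard_normal(n_members)
    return np.maximum(0.0, single_valued_precip * noise)

def hydrology_stub(forcing_members, runoff_coeff=0.6):
    """Illustrative Hydrologic Processor stand-in: propagate each forcing
    member through the (here trivial) basin model."""
    return runoff_coeff * forcing_members

def enspost_stub(flow_members, multiplicative_bias=0.9):
    """Illustrative EnsPost stand-in: remove a known multiplicative bias
    from the raw streamflow ensemble."""
    return flow_members / multiplicative_bias

# Forcing ensemble -> flow ensemble -> post-processed flow ensemble;
# downstream components (EVS, Graphics Generator) then see a full ensemble.
flows = enspost_stub(hydrology_stub(mefp_stub(10.0, rng=1)))
```

Because each stage maps member to member, verification can attribute skill and bias to the forcing or the hydrologic stage separately, which is exactly the diagnostic role the EVS plays in the chain described above.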
Increased skill in forcing forecasts generally translates into increased skill in ensemble streamflow forecasts. As such, the HEFS should utilize the most skillful forcing forecasts at all ranges of lead time. To translate this skill into ensemble streamflow forecasts to the maximum extent, hydrologic uncertainty must be reduced as much as possible. For example, assimilation of all available measurements of streamflow, soil moisture, snow depth, and others would reduce the initial condition uncertainty. Although not implemented in the first version of the HEFS, a number of DA techniques have been developed and/or tested for the
Obviously, the different uncertainty modeling approaches available in the HEFS and in other research and operational systems will need to be rigorously compared via ensemble verification to define optimized systems for operational hydrologic ensemble predictions. Close collaborations between scientists, forecasters, and end users from the atmospheric and hydrologic communities, through projects such as the HEPEX, help support such intercomparison, as well as address the following ensemble challenges:
* seamlessly combine probabilistic forecasts from short to long ranges and from multiple models while maintaining consistent spatial and temporal relationships across different scales and variables;
* include forecaster guidance on forcing input forecasts and hydrologic model operations, especially in the short term;
* improve the accuracy of both meteorological and hydrologic models and reduce the cone of uncertainty for effective decision support;
* improve the uncertainty modeling of rare events (e.g., record flooding or drought) when availability of analogous historical events is very limited;
* integrate and leverage conditional uncertainty associated with NWP and human-adjusted forecasts of atmospheric forcings;
* improve computing power, database, and data storage, with forecasts becoming available at higher resolution and from an increasing number of models, to produce long hindcast datasets for all forcing inputs and hydrologic outputs for research and operational purposes;
* improve the understanding of how uncertainty and verification information is interpreted and used in practice by different groups (including forecasters and end users) to provide this information in a form and context that is easily understandable and useful to customers; and
* develop innovative training and education activities to fully embrace and practice the ensemble paradigm in hydrology and water resources services and increase the effectiveness of probabilistic forecasts in risk-based decision making.
ACKNOWLEDGMENTS. This work has been supported by the
REFERENCES
Addor, N., S. Jaun, F. Fundel, and
Anderson, E. A., 1973: National Weather Service River Forecast System-Snow accumulation and ablation model. NOAA Tech. Memo. NWS HYDRO-17, 217 pp.
Bartholmes, J. C.,
Bogner, K., and
-, and F. Pappenberger, 2011: Multiscale error analy- sis, correction, and predictive uncertainty estimation in a flood forecasting system. Water Resour. Res., 47, W07524, doi:10.1029/2010WR009137.
Brown, J. D., 2013: Verification of temperature, precipitation and streamflow forecasts from the NWS Hydrologic Ensemble Forecast Service (HEFS): Medium-range forecasts with forcing inputs from the frozen version of NCEP's Global Forecast System. HSL Tech. Rep. to the NWS, 133 pp. [Available online at www.nws.noaa.gov/oh/hrl/hsmb/docs/hep/publications_presentations/Contract_2012-04-HEFS_Deliverable_02_Phase_I_report_FINAL.pdf.]
-, and D.-
-, and -, 2013: Evaluation of a nonparametric post-processor for bias-correction and uncertainty estimation of hydrologic predictions. Hydrol. Processes, 27, 83-105, doi:10.1002/hyp.9263.
-, J. Demargne, D.-
-, D.-J. Seo, and
Bruen, M.,
Buizza, R.,
Burnash, R. J. C., 1995: The NWS river forecast system: Catchment modeling. Computer Models of Watershed Hydrology,
Charba, J. P.,
Clark, M., S. Gangopadhyay,
Cloke, H. L., and F. Pappenberger, 2009: Ensemble flood forecasting: A review. J. Hydrol., 375, 613-626.
Coccia, G., and
Cui, B., Z. Toth, Y. Zhu, and
Day, G. N., 1985: Extended streamflow forecasting using NWSRFS. J. Water Resour. Plann. Manage., 111, 157-170.
Demargne, J.,
-,
-, J. D. Brown, Y. Liu, D.-
Demeritt, D., S. Nobert,
-, -, -, and -, 2013: The European Flood Alert System and the communication, perception, and use of ensemble predictions for operational flood risk management. Hydrol. Processes, 27, 147-157, doi:10.1002/hyp.9419.
Du, J., G. Dimego, Z. Toth,
Fundel, F., S. Jörg-Hess, and
Georgakakos, K., D.-
Gneiting, T.,
Gupta, H. V.,
Hamill, T. M.,
-, -, and
-,
-,
He, M.,
He, Y., F. Wetterhall,
-, and Coauthors, 2010: Ensemble forecasting using TIGGE for the July-
Herr, H. D., and R. Krzysztofowicz, 2005: Generic probability distribution of rainfall in space: The bivariate model. J. Hydrol., 306, 234-263.
Hersbach, H., 2000: Decomposition of the continuous ranked probability score for ensemble prediction systems. Wea. Forecasting, 15, 559-570.
Jaun, S., and
Jolliffe, I. T., and
Kang, T.-H., Y.-O. Kim, and I.-
Krzysztofowicz, R., 1999: Bayesian theory of probabilistic forecasting via deterministic hydrologic model. Water Resour. Res., 35, 2739-2750.
Lee, H., D.-
Liu, Y., J. D. Brown, J. Demargne, and D.-
-, and Coauthors, 2012: Advancing data assimilation in operational hydrologic forecasting: Progresses, challenges, and emerging opportunities. Hydrol. Earth Syst. Sci., 16, 3863-3887.
McEnery, J.,
Möller, A., A. Lenkoski, and T. L. Thorarinsdottir, 2013: Multivariate probabilistic forecasting using ensemble Bayesian model averaging and copulas. Quart.
Montanari, A., and G. Grossi, 2008: Estimating the uncertainty of hydrological forecasts: A statistical approach. Water Resour. Res., 44, W00B08, doi:10.1029/2008WR006897.
Pappenberger, F., J. Bartholomes,
-,
Park, Y.-Y.,
Philpott, A. W.,
Pyke, G., and
Raff, D.,
Ramos, M.-H., J. Bartholmes, and
-,
Reggiani, P.,
Regonda, S., D.-
Renner, M., M. G. F. Werner, S. Rademacher, and
Saha, S., and Coauthors, 2013: The NCEP Climate Forecast System Version 2. J. Climate, in press.
Schaake, J., and Coauthors, 2007a: Precipitation and temperature ensemble forecasts from single-value forecasts. Hydrol. Earth Syst. Sci. Discuss., 4, 655-717.
-,
-, and Coauthors, 2010: Summary of recommendations of the first workshop on Postprocessing and Downscaling Atmospheric Forecasts for Hydrologic Applications held at Météo-
Schefzik, R., T. L. Thorarinsdottir, and T. Gneiting, 2013: Uncertainty quantification in complex simulation models using ensemble copula coupling.
Schellekens, J.,
Seo, D.-J., V.
-,
-,
Singla, S., J.-P. Céron,
Smith, P. J.,
Solomatine, D. P., and
Thielen, J.,
Thirel, G., F. Regimbeau,
Todini, E., 2008: A model conditional processor to assess predictive uncertainty in flood forecasting. Int. J. River Basin Manage., 6, 123-137.
- , A. Weerts,
Van den Bergh, J., and
Velázquez, A., T. Petit, A. Lavoie, M.-A. Boucher,
-, F. Anctil, M.-
Weerts, A. H.,
Wei, M., Z. Toth,
Welles, E., S. Sorooshian, G. Carter, and
Werner, K.,
Werner, M.,
- ,
Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed.
-, and
Wood, A. W., and
Wood, E. F., X. Yuan, and
Wu, L., D.-
Yuan, X.,
Zappa, M., and Coauthors, 2010: Propagation of uncertainty from observing systems and NWP into hydrological models:
-, F. Fundel, and S. Jaun, 2012: A 'Peak-Box' approach for supporting interpretation and verification of operational ensemble peak-flow forecasts. Hydrol. Processes, 27, 117-131, doi:10.1002/hyp.9521.
Zhao, L., Q. Duan,
AFFILIATIONS: Demargne-Office of
CORRESPONDING AUTHOR: Dr.
E-mail: [email protected]
The abstract for this article can be found in this issue, following the table of contents.
DOI:10.1175/BAMS-D-12-00081.1
In final form
(c) 2014 American Meteorological Society