of the Fall Meeting of the
American Statistical Association (ASA)
Committee on Energy Statistics
October 28 and 29, 2004
Energy Information Administration
1. EIA Program to Evaluate Form EIA-920, Combined Heat and Power Plant Report, Robert Rutchik, SMG, EIA
In early 2004, EIA disseminated the new survey, Form EIA-920, Combined Heat and Power Plant (CHP) Report, to CHP facilities to collect data on their total fuel used, fuel used to generate electricity, generation, and fossil fuel stocks. The survey’s primary purpose was to collect data on fuel used to generate electricity, which should be less than a facility’s total fuel used.
The EIA-920 evaluation was part of EIA’s new survey quality self-assessment program and it had three major purposes:
EIA wanted the Committee’s feedback on the EIA-920 evaluation plan, and how to conduct survey evaluations in general.
ASA Committee Suggestions: The Committee advice to EIA focused more on the former than the latter (in great part because EIA’s presentation focused on the EIA-920). The major points of the Committee’s advice were:
Intended EIA Response:
EIA has interviewed EIA-920 survey staff to get their evaluation of the survey’s strengths and problems. EIA has also evaluated the survey’s first twelve months of data to determine whether survey respondents understood the concept of the survey: that total fuel used should be greater than fuel used to generate electricity. Finally, EIA has not yet done an evaluation of data accuracy. It is trying to determine which measure or measures to use for this.
2. A Customer Evaluation of the Short-Term Energy Outlook (STEO), Howard Bradsher-Fredrick, SMG, EIA
The authors developed and administered a short survey questionnaire to query customers of the Short Term Energy Outlook (STEO). The purpose of this survey was to determine how the STEO was perceived by its regular customers. The results of this survey are to be used for strategic planning purposes (performance evaluation) and to aid in the improvement of the STEO. The survey questions were based as closely as possible upon the instrument employed in the customer evaluation of the Annual Energy Outlook (AEO) and the International Energy Outlook (IEO) during the spring of 2004. A total of 500 customers who are not employed by EIA were randomly selected from the email list maintained for regular STEO mailings as potential respondents. In order to ensure a reasonably high response rate, three iterations of non-response follow-up were administered.
The ASA Committee recommended that EIA not conduct a customer survey on an individual product (e.g., the STEO) every year. These surveys should be done less frequently than annually. Also, the Committee suggested a rotating panel could be used to track changes over time. The core of the panel could be chosen from those who expressed interest in being re-interviewed. The 25.9% response rate isn’t bad, but it could be better if alternative modes were used in addition to e-mail. Sub-sampling of non-respondents could be used in conjunction with the multi-mode approach. Also, incentives might be useful in obtaining a higher response rate.
Intended EIA Response:
I think I can safely say that we will not conduct the STEO survey in 2005. However, I’m not sure when we do plan to conduct the survey again. In terms of the other suggestions (i.e., using a rotating panel, employing a multi-modal approach in conjunction with sub-sampling and/or providing incentives), these will be discussed with our SMG committee when the STEO and/or AEO/IEO customer surveys are again conducted. In my opinion, all of these further suggestions require a more intensive use of resources with probably little benefit. I think that my analysis of early versus late respondents’ responses provides evidence that providing more resources to increase the response rate will have little or no impact on the evaluation scores.
3.A. Assessing EIA Frames: An Update on EIA-Wide Project, Grace Sutherland, Howard Bradsher-Fredrick and Shawna Waugh, SMG, and Tom Lorenz, EIA’s Office of Energy Markets and End Use
In spring 2004, EIA presented to the Committee its activities to evaluate its frames. The activities included checking respondent lists, comparing data at an aggregate level, examining supply/disposition balances, and comparing price data volumes.
In response, the Committee recommended that EIA: 1) ask known establishments to identify others within the same market; 2) calculate “propensity” scores by post-stratifying the sample using Census data; 3) apply the principles of dual system estimation to available frame data; 4) obtain as much information as possible from Census, without disclosing sensitive data, in order to identify missing respondents; and 5) not use a “balancing item” for measuring coverage. The consensus of the Committee was that balancing items were the least favorable method of assessing frames.
EIA’s Intended Response:
1) EIA has asked respondents to provide names of their customers, but respondents have been reluctant. However, we have had success in obtaining names of suppliers or competitors. Some of our electric power surveys, for example, have used this technique. EIA does not plan to pursue the adaptive sampling approach at the present time, but may consider it again in the future. 2) EIA is in favor of pursuing dual system estimation and has done this with the EIA electricity renewable frame and the National Renewable Energy Laboratory frames. 3) Because of the difficulty in coming up with a quantitative assessment of frame sufficiency for all surveys, EIA will pursue a qualitative approach. EIA has formed a new inter-office team to evaluate frame sufficiency. The team is using the information gathered previously on its surveys and will look at frame stability over a longer period of time than what we currently have (the past year and ongoing). The team will be researching whether or not surveys are using comparable lists and, if they are not, what the reasons are (e.g., budget constraints or legal issues). You will hear the results of this effort at the Spring 2005 meeting of the ASA Energy Committee.
3.B. Dual System Frame Estimation:
The ASA Committee recommended an alternative computational procedure to be used in the dual system estimation process. This amounted to changes in the denominator of the probability equations. The Committee also suggested that a conference paper be developed in order to provide for a wider dissemination of this methodology.
Intended EIA Response:
I will revise the denominator in the probability equation and re-run for the present application. Moreover, I will present the methodology and findings of this work at the Federal Committee on Statistical Methodology (FCSM) Research Conference in November 2005.
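For readers unfamiliar with dual system estimation, the following is a minimal sketch of the textbook two-list (Lincoln-Petersen) form of the estimator. It is an illustration only: the summary does not give the equations EIA used or the denominator change the Committee recommended, so the function name, argument names, and counts below are hypothetical, and the usual assumption that the two frames are independent is taken for granted.

```python
def dual_system_estimate(n_a: int, n_b: int, n_both: int) -> dict:
    """Textbook two-list (Lincoln-Petersen) dual system estimate.

    n_a    -- number of units on frame A (hypothetical input)
    n_b    -- number of units on frame B (hypothetical input)
    n_both -- number of units matched on both frames
    Assumes the two frames are independent and matching is error-free.
    """
    if n_both <= 0:
        raise ValueError("at least one matched unit is required")
    if n_both > min(n_a, n_b):
        raise ValueError("matches cannot exceed the size of either frame")
    # Estimated population size: product of list sizes over the overlap.
    population = n_a * n_b / n_both
    return {
        "population": population,
        # Share of frame B found on frame A estimates frame A's coverage.
        "coverage_a": n_both / n_b,
        # Share of frame A found on frame B estimates frame B's coverage.
        "coverage_b": n_both / n_a,
    }


if __name__ == "__main__":
    # Illustrative counts only, not EIA data.
    print(dual_system_estimate(n_a=400, n_b=300, n_both=240))
```

In practice, matching error between frames (for example, differing naming conventions) affects the matched count and therefore the estimate, which is one reason the independence and matching assumptions are flagged above.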
3.C. Presentation on Evaluation of Five EIA Frames conducted by Census
Shawna Waugh of EIA described the collaboration between the Census Bureau and EIA on the evaluation of several EIA frames in the manufacturing sector. Tom Lorenz of EIA described what he learned by using data from EIA’s petroleum surveys (Monthly and Annual) to edit the Manufacturing Energy Consumption Survey (MECS) data.
EIA was interested in the committee’s opinion as to the direction EIA is going with the frames activities.
The committee provided the following advice to EIA: apply the Dual System Estimation procedure to the EIA frames being evaluated by the Census Bureau; stratify the results for the larger surveys (EIA-3 and EIA-860) in order to identify which specific industries have insufficient coverage, since the composition of the establishments missing from the frame will affect the accuracy of estimates; and determine whether a frame is sufficient or not based on the accuracy of the estimation. The committee also acknowledged the difficulty in matching, since naming conventions differ between one survey and another.
Intended EIA Response:
EIA intends to act upon the Committee’s three recommendations: we will apply the Dual System Estimation to the two large surveys (EIA-3 and EIA-860); we will stratify (to the extent possible) by NAICS code for the two large surveys; and we will determine the sufficiency of the frames for the two large surveys based on the accuracy of the estimation.
4. The EIA Short-Term Regional Electricity Model: Capabilities and Data Requirements, Phillip Tseng, SMG, and Dave Costello, EMEU, EIA
EIA is developing a thirteen-region electricity demand and supply model in response to questions on regional energy issues from high-level decision makers. One of the important features of the new modeling system is transparency; it must provide tractable results and insights that stakeholders can easily understand. The electricity module of the Regional Short Term Energy Model serves that function and provides a vital link to the new integrated regional energy system in answering several key region specific questions on:
The objective of this paper is to document the structure of the electricity model, identify data requirements, and demonstrate the model’s capability in providing users an understanding of issues facing the current electricity market. The model will be used for generating routine short-term forecasts. However, the model can also be used as an analytical tool to provide useful insights into the electric power market itself and into the principal interactions between electricity supply and fossil fuel markets. The interactions and linkages between the electricity market, the natural gas market, and heating fuel market will be described explicitly. In addition, this paper will illustrate potential modeling applications that can help policy makers in their decisions on deployment of technologies that may be cost effective socially but not privately.
ASA Committee Suggestions:
The committee raised two questions/issues related to the regional short term model:
The first is about the measurement and implementation of the transmission constraints.
The second is about the interaction between prices and demand when the monthly demand is transformed into daily load curves. An additional question in this area is about model validation when baseline data are an issue. Part of the problem is that some of the data necessary to make really good assessments are proprietary and not immediately available. The question was raised whether, given the potential usefulness of this model, it might be worth expending EIA resources to collect those data directly.
Intended EIA Response:
The EIA modeling team performed an extensive search of transmission data and found that the North American Electric Reliability Council (NERC) publishes a report on NERC regional transmission capacity. However, NERC regions do not have all the needed detailed information on power flows. In response to the committee comments and as part of the model calibration effort, EIA will document the method used to estimate transmission constraints at the April 2005 ASA Energy Committee meeting.
To address the second question regarding the quality of the hourly load curve, EIA compared the proprietary data provided by the consulting firm with daily 24-hour load data from published sources such as the Midwest Independent System Operator (MISO), the New York ISO, and the California ISO. The proprietary data compare well with the available published data. EIA plans to discuss the transformation of monthly data to hourly load and the effects of prices on demand at the April ASA Energy Committee meeting as well. The presentation on model calibration will address data issues and the assessment of model performance.
5. Natural Gas Production, Frames, Samples and Estimation Session in two parts: (1) Preston McDowney, SMG, and (2) John Wood, OOG, EIA
At our spring 2004 meeting, you may recall that you heard about EIA’s new Natural Gas Production survey, the EIA-914. At that time, Inderjit Kundra described our proposal to select a probability proportional to size (pps) sample, using the EIA-23 frame as the frame for the new EIA-914. The Committee supported our efforts to design the sample. This session provides an update to that effort.
The Statistics and Methods Group (SMG) prepared a matched file of respondents to the EIA-23 in 2000 and in 2002, and used it as the basis for an assessment of the sampling and estimation methodology. We conducted a simulation using these data sets to select a pps sample according to the methodology we described in the spring. We prepared a variety of estimates, including the traditional estimate using sampling weights and several regression-based estimators (weighted and unweighted, with and without deleting outliers and influential observations). The results of this assessment made us reconsider our sampling strategy. We are now planning to use a cut-off sample with a goal of approximately 90% coverage.
The Reserves and Production Division (RPD) of the Office of Oil and Gas has compiled a long history of operator-level data, evaluated a variety of estimation methods, and evaluated the changes that occur in the dynamic natural gas production industry.
This session will provide two presentations: first, a summary of the results of SMG’s simulation study by Preston McDowney and Kara Norman, and second, the information on the dynamic natural gas production industry, a description of the sample, and evaluations of estimation methods by John Wood and Gary Long of RPD, OOG.
5.A. A Summary of the Results of SMG’s Simulation Study, Preston McDowney, SMG, EIA
The main comment from this portion of the session was that the estimators were biased. The committee recommended using the Hajek estimator as an unbiased alternative to the variations of the Horvitz-Thompson estimator that EIA was utilizing. The committee also recommended that EIA stop using any estimators that combine certainty and non-certainty operators in the estimation procedure.
Intended EIA Response:
EIA has incorporated the Hajek estimator into the simulation program. Estimation procedures that use both certainty and non-certainty operators are no longer being considered.
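To make the distinction between the two estimators concrete, the sketch below computes a population total both ways from a probability sample. It is a generic illustration, not the code in EIA’s simulation program: the function names, the inclusion probabilities, and the population size are all hypothetical. The Horvitz-Thompson form weights each sampled value by the inverse of its inclusion probability, while the Hajek form additionally normalizes by the sum of those weights and scales by a known population size.

```python
from typing import Sequence


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Sum of sampled values, each divided by its inclusion probability."""
    return sum(value / prob for value, prob in zip(y, pi))


def hajek_total(y: Sequence[float], pi: Sequence[float], population_size: int) -> float:
    """Weighted sample mean (weights 1/pi) scaled up by a known population size."""
    weights = [1.0 / prob for prob in pi]
    weighted_mean = sum(w * value for w, value in zip(weights, y)) / sum(weights)
    return population_size * weighted_mean


if __name__ == "__main__":
    # Illustrative values only: three sampled operators from a population of 8.
    production = [10.0, 20.0, 30.0]
    inclusion_prob = [0.5, 0.25, 0.25]
    print(horvitz_thompson_total(production, inclusion_prob))
    print(hajek_total(production, inclusion_prob, population_size=8))
```

The two totals differ whenever the sum of the sample weights does not equal the population size, which is typical in a pps sample of fixed size.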
(5.B. SUMMARY OF THIS (WOOD) SESSION IS OUTSTANDING)
Summary of the ASA Energy Committee Suggestions
The main comments from this portion of the session revolved around the 90% cut-off sample. The first concern was how EIA was going to estimate variance. One proposed method was to look at a window of time and see how the estimation actually did perform when the data comes in. The second concern was the random perturbation of the 90% sample.
EIA’s Response to Committee Suggestions
The committee recognized that EIA’s estimation procedure attempts to calibrate for the random perturbation. EIA will closely monitor the issue and calibrate as needed.
6. Methods for Assessing NEMS Solution Data for Interpretive and Diagnostic Purposes, George M. Lady, Ph.D., SMG Contractor, EIA
The Committee agreed that the methods presented were valuable techniques for assessing changes in NEMS model versions and assisting in validating the NEMS projection methodology. The Committee specifically recommended:
(1) Undertake to document changes in NEMS model versions using regression analyses and the other diagnostics presented to the Committee.
(2) Add confidence intervals, based on the regression results, to the diagnostics used to assess changes in NEMS solutions.
(3) Consider specifying NEMS scenarios to support the regression analyses.
(4) Select sufficient solution series for the regression results to “stand alone,” i.e., provide balanced and intuitive portions of the NEMS solution.
(5) Assemble historical data that mirror the portions of solutions assessed via the regression analyses to: (5a) partition forecasting errors between general uncertainty and errors in forecasting exogenous variables; and, (5b) conduct regression analyses using historical data that correspond to those based on NEMS solution data to validate the variable/parameter sensitivities within NEMS.
Intended EIA Response:
A project has been initiated to develop and provide PC-based software that accepts data extracted by graf2000 and processes the data for interpretive and diagnostic purposes. The software will enable linear and kernel regression and the processing and comparison of NEMS solutions across the model changes initiated in support of the Annual Energy Outlook. Features of the data and diagnostics utilized will be developed via interactions with NEMS support staff and beta versions of the software. As currently proposed, this project will respond to recommendations (1)-(4) above made by the
7. Introduction to Program Assessment Rating Tool (PART) Program Evaluation, Nancy Kirkendall, SMG, EIA
The Office of Management and Budget has long been interested in quantifying Federal program results to balance against their dollar costs. That information could then be used to focus Federal resources on those programs and projects that produce results.
The purpose of this two-part session was to discuss how external program and product reviews could be structured to demonstrate (or not) results. After a brief introduction to PART, we broke into two discussion groups to focus on and discuss (1) families of surveys, and (2) on models and forecasts. We then re-convened as one group for pooled discussion.
7. A. External Evaluations of Survey Programs, Brenda Cox, Contractor to SMG, EIA
Survey Evaluation is a broad construct, and can have both internal and external aspects. Over the past several meetings EIA has presented the Committee with information about EIA’s internal evaluation programs.
This survey evaluation project involves External Evaluations of families of EIA surveys. Brenda Cox will present her initial work on a template for an external survey evaluation. Dr. Cox is working under contract to SMG to help develop templates for an external evaluation of a family of related surveys and its component surveys. The intention is for this work to feed directly into the annual OMB Program Assessment Rating Tool (PART) program, which was instituted to encourage rigorous performance assessment to boost the quality of Federal programs. The templates will be proof-of-concept tested on the Petroleum Marketing Surveys – a relatively stable, relatively well-documented family of surveys.
The Committee recommendations focused on the Survey Evaluation Template, which was reviewed in the meeting. Specific examples of recommended inclusions were QA procedures for ongoing verification of data, questionnaire testing procedures, customers for the data sets, the confidentiality assurance process, legislation under which the data are collected, and interviewer training and supervision. Some of the Committee’s comments were quite insightful about the pitfalls we would face. The Committee mentioned that methodology reports would be needed for each survey, that repetition may occur when looking across surveys, and that data consistency across surveys might be an issue.
Intended EIA Response:
Committee recommendations were incorporated into the Survey Evaluation Template so they could be included in the concept testing process. Other recommendations were used as input in preparing the Program Evaluation Template. This meeting will hear the results of the concept testing process.
7.B. External Evaluations of Forecasting and Models, Douglas Hale, SMG, EIA
The result of forecasting and analysis programs and products is to “inform” decision makers and the public. Though dollar metrics are not possible, EIA could employ external evaluations to show if its programs and products are meeting important public policy needs with high quality analysis. We discussed past and current practices at EIA. Many years ago, all new models were reviewed prior to their use. However, comprehensive reviews are resource-intensive and were the cause of some intense internal disagreements. We now do fewer reviews, and primarily use external reviewers.
These comments focus on the models portion of this session.
The committee discussed some of the reviews that they have been involved in or are aware of, and suggested the following:
o Document who requested special studies and model runs. Having this and other materials well organized will allow EIA to present a more organized package.
o Rotate through your products on some type of schedule.
o Keep OMB informed of your activities.
o Any review should be at both the macro and micro level. It looks like EIA is OK for the micro, but needs to develop macro-level reviews.
o Discussed at length some of the reviews that committee members have been involved with in academia. The demand for a review originates with the Regents, and the Dean coordinates it within the university. The department suggests a list of individuals, who need to be independent from but knowledgeable about the department. They will spend a few days on site examining documents and interviewing staff. The extent of the reviews depends on the purpose. One committee member described the very extensive reviews needed when they recently formed a separate statistics department. Reviews conducted on an ongoing basis are less involved, but all still require work on the part of the professors.
o Several committee members mentioned that their departments use the review to request more resources.
o The ASA Energy Committee doesn’t see itself as independent. They suggested that we form a separate High-Powered Review Team patterned after an academic review. This team would examine all products on a rotating basis.
Intended EIA Response:
EIA will research what is likely to be required for a review similar to an academic department, and will explore the options of forming an independent High-Powered Review Team.
Update as of 4/11:
8. Data Analysis on the EIA-826-906
EIA Form 826 collects information, monthly, from regulated and unregulated companies that sell or deliver electric power to end users. It collects state-level sales volumes, sales revenues, and number of customers by end-use sector (residential, commercial, industrial, and total). The existing sample and methodology to estimate population totals will be described together with the results of an ongoing evaluation of this methodology. Statistical issues that remain, including formation of homogeneous subpopulations for estimation, making greater use of historical time series data and ways to increase precision by pooling similar data, will be presented.
EIA Form 906 collects monthly data from a sample of plants (of regulated companies and unregulated independent power producers) on total fuel used (by type) for power generation, total electricity generated by prime mover type, total fuel (by type) used to generate electricity by prime mover type, and fuel stocks at the end of the month. EIA Form 920 collects analogous data for combined heat and power plants. Plans to extend the analyses described above for the EIA 826 will be outlined.
The Committee posed the following questions and made the following comments:
1. Is the cutoff sample maintained?
2. The Committee encouraged the use of both cross-sectional and time series analysis of the “beta plots.” The “beta plots” are plots of the estimated regression coefficients by region and month (over a two year period).
3. Regarding point two, the Committee suggested (for the cross-sectional analysis) that it might be fruitful to search for covariates to explain the differences among the estimated betas across regions.
4. There was discussion about empirical methods to define post-strata based on estimated company/state level regression coefficients.
5. The Committee suggested that we try to take advantage of the (apparent) spatial structure.
Intended EIA Response:
The EIA-826 cutoff sample is not maintained (in the sense of ensuring current coverage at the level originally attained). As part of our evaluation we plan to draw a new cutoff sample (actually several cutoff samples corresponding to different levels of coverage). We will then be able to ascertain the overlap of this sample with the current one. (We expect the overlap to be large.) Second, we will evaluate the performance of this new sample.
EIA plans to do the analyses suggested in comments 2, 3 and 4 (above).
For the variables in the EIA-826, i.e., sales and revenue, use of spatial structure is not a promising alternative to post-stratification. However, for electricity generation (EIA-906, 920) this is an excellent suggestion.
9. Post-Stratification Methodology for the 2002 Manufacturing Energy Consumption Survey (MECS), Rick Hough and Stacey
The goal of the Manufacturing Energy Consumption Survey (MECS) is to provide a comprehensive set of detailed energy statistics for the Manufacturing sector for government and non-government policy-makers. The survey is sponsored by the Energy Information Administration and is conducted every four years by the U. S. Census Bureau.
The 2002 MECS is a probability sample of approximately 15,000 establishments selected from the manufacturing sector of the 2002 Economic Census (EC) mail-file. For a large portion of the 2002 EC mail-file, the individual establishment-level industry classifications have not been updated since the 1997 Economic Census. Approximately 5% of the establishments are expected to change industry classifications based on their reported 2002 census data. Given that the MECS sample design incorporates both industry and geographic stratification, there are concerns regarding the representativeness of both the sample frame and the resulting MECS sample. Consequently, we developed a post-stratification procedure to address these concerns.
The 2002 MECS is the first MECS where the reference year coincides with the EC. This provides us with the opportunity to improve the survey estimates by using the results from the 2002 Census as a benchmark. Using the complete set of in-scope establishments in the EC and their associated energy data, we will derive an “energy consumption” control total for each sampled cell. The sample weights of MECS establishments within each cell will then be ratio adjusted such that their weighted “cost of energy” is equal to the corresponding control total. The focus of the paper is to describe this methodology and the presentation will summarize the impact of the procedure on the survey results.
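The ratio adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Census Bureau’s production procedure: the record layout (`cell`, `weight`, `energy_cost`), the function name, and the numbers are hypothetical, and no collapsing of sparse cells or capping of adjustment factors is attempted.

```python
def ratio_adjust_weights(sample: list, control_totals: dict) -> list:
    """Scale weights within each cell so the weighted energy cost hits the control.

    sample         -- list of dicts with hypothetical keys 'cell', 'weight', 'energy_cost'
    control_totals -- dict mapping each cell to its census-based control total
    Returns new records; the input list is left unchanged.
    """
    # Weighted sample total of the benchmark variable in each cell.
    weighted_totals = {}
    for record in sample:
        cell = record["cell"]
        weighted_totals[cell] = (
            weighted_totals.get(cell, 0.0) + record["weight"] * record["energy_cost"]
        )

    adjusted = []
    for record in sample:
        cell = record["cell"]
        if weighted_totals[cell] == 0:
            raise ValueError(f"cell {cell!r} has a zero weighted total")
        # Ratio of the control total to the weighted sample total.
        factor = control_totals[cell] / weighted_totals[cell]
        adjusted.append({**record, "weight": record["weight"] * factor})
    return adjusted


if __name__ == "__main__":
    # Illustrative records only, not MECS data.
    sample = [
        {"cell": "A", "weight": 10.0, "energy_cost": 5.0},
        {"cell": "A", "weight": 20.0, "energy_cost": 2.5},
        {"cell": "B", "weight": 4.0, "energy_cost": 25.0},
    ]
    for row in ratio_adjust_weights(sample, {"A": 120.0, "B": 100.0}):
        print(row)
```

After adjustment, the weighted energy cost in each cell reproduces its control total, while the relative weights of establishments within a cell are unchanged.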
Dr. Sitter asked Stacey Cole whether he had considered using calibration methods instead of post-stratification. Mr. Cole said he had not, and Dr. Sitter said that thinking of the problem in that way might lead to some insights and that he had some references to give him. John Slanta of the Census Bureau spoke from the audience in response to a question about sensitivity of the variables and the bias introduced into the procedure. Discussion among the committee members centered around performing a simulation study to examine all the issues of robustness of the post-stratification. Of special concern was whether the method would still be appropriate with an increasing or decreasing sample size, or with different strata. The topic then switched to whether and how to post-stratify in years when the MECS and the Economic Census do not fall in the same year. The discussion mentioned going back to the previous Economic Census, which would introduce a time lag problem, or using the Annual Survey of Manufacturing (ASM), which is itself a sample with sampling error. No resolution was suggested. Another question came up about what metrics to show along with the variances to get at the quality of the estimation procedure. Agreement was that it was complicated, but no resolution was suggested. After a wide-ranging discussion, the topic discussed by committee, speakers, and audience was whether to keep the MECS and Census “in sync” as they were in 2002. Problems were brought up such as resource sharing (implied) and expanding the MECS interval to 5 years when 4 was already considered too long. Dr. Feder suggested that design issues such as this and a rolling sample be put on the agenda for April.
Intended EIA (and Census) Response:
Census (John Slanta) is still going to prepare a paper examining the quality and robustness of the post-stratification method, employing some type of simulation experiment as referred to above. The work has not yet begun but I was assured that it’s still very much part of their agenda. Also, Stacey Cole will be looking at the calibration references that Dr. Sitter suggested. Speaking for EIA, April does not give us enough time to talk about 2006 and future MECS design issues since we have experienced delays in processing the 2002 data. (The aforementioned resource sharing.) We have already had some informal preliminary discussions about an interim mini-MECS, but nothing has been explored beyond that. It is possible that October may be a better time to bring up MECS design issues.
10. Time Series Edits for the Electric Power EIA-906, Tom Broene, SMG, EIA
Simple time series models have provided useful edits for EIA’s weekly petroleum surveys and for monthly petroleum marketing surveys. This project examines the use of simple time series models for editing the EIA-906 Electric Power Plant Report, which collects monthly data from a sample of regulated and unregulated generators (excluding combined heat and power plants). The data vary in complexity among the 1600 responding plants, with some consuming only one type of fuel and others having several types of equipment utilizing a variety of fuels. We will describe our progress to date on developing simple exponential smoothing models for generation, fuel consumption, stocks, and the ratio of generation to consumption. Our initial efforts have focused on only the regulated respondents with complete data. We will have examples of cases where the simple non-seasonal model fits well and examples where it does not.
The committee discussed the data collected on the form and some of its complexities, and the existing edits in the Internet Data Collection. We discussed what types of errors would be flagged by an exponential smoothing model. The committee suggested several approaches to try on this project:
o Continue the basic exploratory data analysis to determine where the model fits and look at plant characteristics. Try a transformation of the data.
o Look into setting plant-specific flags, so that only the first month is flagged if the plant has an equipment change or is a peaking plant.
o Could try to incorporate weather, although that can be complicated and may depend on weather in other regions also.
o Try fitting a range of alpha values, and try a seasonal exponential smoothing model such as Holt-Winters or an adaptive model minimizing the squared error with each period.
o Spend the time to build or select an agreed-upon test dataset to use for this.
Intended EIA Response:
EIA will conduct the exploratory data analysis, starting with an investigation of the fitted alpha values by plant characteristics. EIA will pursue the other suggestions as time or contracting funds are available.
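As an illustration of the kind of edit discussed in this session, the sketch below flags monthly values that fall far from a simple (non-seasonal) exponential smoothing forecast and picks a smoothing constant from a small grid by one-step-ahead squared error. It is a generic example, not EIA’s edit: the function names, the relative tolerance rule, the default alpha, and the sample series are all hypothetical.

```python
from typing import List, Sequence


def ses_edit_flags(series: Sequence[float], alpha: float = 0.3,
                   tolerance: float = 0.5) -> List[bool]:
    """Flag values whose distance from the smoothed forecast exceeds
    `tolerance` times the forecast (a hypothetical relative rule)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    level = series[0]   # initialize the level at the first observation
    flags = [False]     # the first value has no forecast to compare against
    for value in series[1:]:
        forecast = level
        flags.append(abs(value - forecast) > tolerance * abs(forecast))
        # Assumption: flagged values still update the level.
        level = alpha * value + (1 - alpha) * level
    return flags


def one_step_sse(series: Sequence[float], alpha: float) -> float:
    """Sum of squared one-step-ahead forecast errors for a given alpha."""
    level = series[0]
    sse = 0.0
    for value in series[1:]:
        sse += (value - level) ** 2
        level = alpha * value + (1 - alpha) * level
    return sse


def best_alpha(series: Sequence[float], candidates: Sequence[float]) -> float:
    """Candidate alpha with the smallest one-step-ahead squared error."""
    return min(candidates, key=lambda a: one_step_sse(series, a))


if __name__ == "__main__":
    # Illustrative monthly generation values, not EIA-906 data.
    generation = [100.0, 102.0, 98.0, 300.0, 101.0]
    print(ses_edit_flags(generation))
    print(best_alpha(generation, (0.1, 0.3, 0.5, 0.9)))
```

A seasonal model such as Holt-Winters, as the committee suggested, would replace the single level with level, trend, and seasonal components; that extension is not shown here.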
If You Were King? Howard Gruenspecht, Deputy Administrator, EIA, Discussion Leader
Each year as the agency confronts new budgets, continuing resolutions and funding issues related to EIA programs and products, management is called upon to deal with program choices due to these restraints. If EIA’s budget really looked grim and if we needed to spend at “continuing resolution” levels,
In this session we are seeking advice from the
Intended EIA Response: