of the Fall Meeting of the

American Statistical Association (ASA)

Committee on Energy Statistics

October 20 and 21, 2005

with the

Energy Information Administration

1000 Independence Ave., SW.

Washington, D.C.  20585

Introductory Sessions:

Opening Remarks:  Guy Caruso, EIA Administrator to include:

1. Update on the External Review Team

2. New EIA Homepages

3. EIA-914 Data

4. Hurricane Katrina


5.  Since the Spring Meeting, Nancy Kirkendall, Director, SMG

6.  Results of the Simulation Study for the EIA-914, Preston McDowney, SMG

7.  Update:  EIA’s Regional Short-Term Energy Outlook, Margot Anderson, Director, EMEU

Session Topics where EIA seeks ASA Committee Advice:

1.   Short-Term Forecasting Performance Measures: Accuracy Evaluation, Margot Anderson, Director, EMEU

This plenary session: (1) presented and discussed diagnostic tools for gauging forecast accuracy; (2) provided an initial assessment of forecast errors for key variables in the STEO system; and (3) discussed how the errors can be used to improve model performance.

Committee Advice:

Points from Tom Rutherford, first ASA discussant:

1.  the main reason for having a model and paying attention to what the model does and how it forecasts is to have a better handle on what data are important to be collecting;

2.  the public goods . . . produced within the DOE (is) the information, what data do we need to generate . . . a good forecast;

3.  good government forecasting encourages good work by the private sector.  So good data supports good forecasting;

4.  “. . . so the purpose of the government agency . . . is to identify what data is important and to help” . . . collect it and make it available.

5.  natural science forecasting such as with hurricanes is not a fair standard for judging energy forecasting; and,

6.  The Regional STEO may not be a good model for doing policy analysis.

Points from Moshe Feder, second ASA discussant:

1.  focus of my comments will be “on one question, which is the forecasting errors and how to measure them, what to do about them?”

2.  Demographers use brainstorming, statistical inputs and judgment for population projections. He therefore suggested that “there could be some serial correlation which (EIA) could exploit and the other part, which you said you are not doing yet, original patterns”;

3.  I suggest EIA “look at the prediction errors both temporally and spatially and see, is there anything that we are missing, and that could contribute to finding out which you need to augment your model.”

4.  . . . acknowledge that the “mean absolute error is a good measure but I think mean squared error is another tool” . . . answered by the F-statistic.
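
The error measures the discussants mention can be sketched as follows. This is a minimal illustration, not EIA's evaluation procedure: the function names are hypothetical, and the lag-1 autocorrelation is included only as one simple way to look for the serial correlation in forecast errors that the discussants raise.

```python
def forecast_errors(forecast, actual):
    """Forecast minus actual, period by period."""
    if len(forecast) != len(actual) or not forecast:
        raise ValueError("series must be non-empty and of equal length")
    return [f - a for f, a in zip(forecast, actual)]


def mean_absolute_error(errors):
    """Average size of the errors, ignoring sign."""
    return sum(abs(e) for e in errors) / len(errors)


def mean_squared_error(errors):
    """Average squared error; penalizes large misses more than MAE does."""
    return sum(e * e for e in errors) / len(errors)


def lag1_autocorrelation(errors):
    """Sample lag-1 autocorrelation of the errors.

    Values far from zero suggest the errors are serially correlated,
    i.e. last period's miss carries information about this period's.
    """
    n = len(errors)
    mean = sum(errors) / n
    denom = sum((e - mean) ** 2 for e in errors)
    if denom == 0:
        return 0.0
    num = sum((errors[t] - mean) * (errors[t - 1] - mean) for t in range(1, n))
    return num / denom


if __name__ == "__main__":
    # Hypothetical price forecasts and outcomes, for illustration only.
    errs = forecast_errors([2.10, 2.15, 2.30, 2.40], [2.00, 2.20, 2.45, 2.60])
    print(mean_absolute_error(errs), mean_squared_error(errs), lag1_autocorrelation(errs))
```

Reporting both measures side by side shows whether a few large misses dominate the record (MSE high relative to MAE) or whether errors are uniformly moderate.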

Committee Comments and Suggestions:

1.  Statistically based models like the Regional STEO usually precede judgmental discussion whereas structural models are used in weather and hurricane forecasting and based on statistics and judgment (Edmonds);

2.  Go “back and put in all the real variables and ask the question, how would my forecast have changed if I had known exactly what the world market was doing and you’d gotten all that right (in) my hypothesis . . .”  This approach “will help you know whether you’ve got a serial correlation problem in the model or whether it’s outside the model.” (Edmonds)

3.  Suggestion: If you are comparing your forecast to private forecasters, you might want to “. . . see if they are understating prices on a regular basis.” (Bernstein)

4.  Also, some things that are “missing in the models (are) futures and options and trader behavior and things like that which affect the markets in very unpredictable ways . . .”  These seem to have been having more impact on gasoline prices than ever before. (Bernstein)

5.  Think about missing data as well as model metrics. (Sitter)

EIA Intended Reaction to Committee Advice

          (EIA response(s) outstanding)

2.  Vehicle Energy Use: What We Did and What It Tells Us, Mark Schipper, Energy Markets and End Use (EMEU)

Mark Schipper discussed the methods used to develop the EIA report Household Vehicles Energy Use: Latest Data & Trends. Topics included (1) the sources – both EIA and non-EIA – for three fundamental inputs crucial to developing annual household vehicles energy consumption and expenditures information: composite fuel economy, retail fuel price, and in-possession vehicle-miles traveled; (2) the methods behind adjusting imputed composite fuel economy to calculate an on-road fuel economy; (3) the rationale behind further adjusting on-road fuel economy to calculate an in-use fuel economy based on actual household driving characteristics; and, (4) the derivation – our ultimate goal - of annual energy consumption and motor fuel expenditures information from these adjusted inputs.
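
The chain of adjustments described above can be sketched for a single vehicle as follows. This is a minimal sketch under stated assumptions: the report's actual adjustment formulas are not given here, so the two adjustments are represented as simple multiplicative factors, and all names and numbers are hypothetical.

```python
def annual_fuel_use(vmt_miles, composite_mpg, on_road_factor,
                    in_use_factor, price_per_gallon):
    """Derive annual fuel consumption and expenditure for one vehicle.

    Assumption: each adjustment is a multiplicative factor applied to
    fuel economy. The actual report may use a different functional form.
    """
    if min(vmt_miles, price_per_gallon) < 0:
        raise ValueError("miles and price must be non-negative")
    if min(composite_mpg, on_road_factor, in_use_factor) <= 0:
        raise ValueError("fuel economy and factors must be positive")

    # Step 1: composite (test-based) fuel economy -> on-road fuel economy.
    on_road_mpg = composite_mpg * on_road_factor
    # Step 2: on-road -> in-use fuel economy, reflecting how the
    # household actually drives the vehicle.
    in_use_mpg = on_road_mpg * in_use_factor
    # Step 3: consumption is miles traveled divided by miles per gallon.
    gallons = vmt_miles / in_use_mpg
    # Step 4: expenditure is consumption times the retail fuel price.
    expenditure = gallons * price_per_gallon
    return gallons, expenditure


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    gallons, dollars = annual_fuel_use(12000, 25.0, 0.8, 1.0, 2.0)
    print(gallons, dollars)
```

Because every step is a deterministic function of the three inputs, an error in any one input propagates directly into both the consumption and the expenditure estimates.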

Mr. Schipper asked the committee to comment on two questions: (1) do the extensive assumptions made in creating this data set raise any concern about its release as official EIA data, and (2) should EIA expend resources to do this again with the National Personal Travel Survey – the U.S. Department of Transportation’s household travel survey – in 2008?

ASA Committee Advice

The ASA Committee recommended that EIA describe the data as “modeled” rather than “imputed” in its related analytical and data products, because the committee viewed the derivation of the transportation statistics as a deterministic model rather than as simple point imputations. Given the modeled nature of the data and the strict documentation of the methods used, the extensive assumptions gave the committee no concerns about EIA releasing this information. The ASA Committee recommended that EIA continue its partnership with the U.S. Department of Transportation in any future DOT travel study. Acknowledging that researchers have many uses for transportation data, one member suggested that EIA, in its reporting and analysis of these data, focus on issues related to reducing gasoline demand and therefore prices, so that the data can inform decision makers on near and long term demand issues.

EIA Intended Reaction to Committee Advice

EIA intends to follow the Committee advice to continue the efforts to coordinate with the U.S. DOT’s 2008 NPTS, to the extent resources and staff levels allow. EIA has already released a compendium of statistics and an in-depth analytical report from these modeled data, and has also offered a public-use version, in microdata format, for researchers to investigate near and long term energy demand issues.

3.  Preserving EIA Trustworthy Datasets, Model Documentation and Contextual History, John Paul Deley, EIA Records Officer, National Energy Information Center

The purpose of this session is to present an overview of EIA’s recordkeeping practices and solicit advice and recommendations on issues surrounding the life-cycle management of EIA statistical datasets including documentation for: survey planning and design; components of modeling systems; standards and procedures for processing and editing; retention of information products and the applications and software used to collect, analyze, access, use and maintain EIA’s e-assets.  EIA will introduce the committee to plans for incorporating records management transparency into the agency’s routine business processes and seek advice on its current records initiatives and plans for increased efficiency in the management of its records.

The session will include an overview of federal record keeping requirements; an overview of EIA current records management practices; and an overview of 2005 records management initiatives.  It will also cover ongoing efforts to create retention and disposition schedules for electronic system program components; the efforts of the new EIA History Committee and initial deliberations on a possible EIA Content Management System (document / data repository).

The committee will be asked to address the following questions:

·  What methodologies and best practices should be put into place to ensure the “trustworthiness” (authenticity) of EIA recordkeeping systems?

·  Which web accessible finding aids (including a Master Publications Index) might improve EIA customer access to permanent (pre-web) products (e.g. this committee’s historical materials)?

·  How should modeling system documentation be preserved?  By whom and for how long?

ASA Committee Advice:

1.  EIA should create a central e-repository for the preservation of its data, documents and historical image files (including maps).

2.  EIA should adopt cross-organizational metadata standards for improving the interoperability of its information assets.

3.  EIA should create on-line indexes and finding aids to its legacy data and publications.

4.  EIA should examine industry, government and academic best practices for the preservation of modeling systems and prepare a report for future discussion at subsequent meetings of the full committee.

EIA Intended Reaction to Committee Advice

ASA advice was referred to the Business Process Improvement Workgroup of the EIA Quality Council.

ASA advice will be incorporated into future sessions of the EIA Strategic Planning process.

ASA advice has been considered by the EIA History Committee in reference to e-map preservation and documentation of pre-NEMS modeling systems.

ASA advice will be referred to NEMS re-design team when it is constituted.

4.  Learning from the Past: Updating Data Quality Efforts, Renee Miller and Alethea Jennings, SMG, EIA. 

Many years ago EIA presented data comparisons to the public as part of detailed reports assessing data quality. These comparisons documented why data series that purport to measure similar concepts differ.  An example is EIA data on motor gasoline sales, EIA data on product supplied of motor gasoline, and Federal Highway data on sales. These reports were discontinued in the early 1990s.  SMG is considering a new Web product to display such data comparisons. Because the Web presents an opportunity to present information as we obtain it, we can publish information without waiting for a detailed product to be completed.

We are now working on ideas for this new product and welcome suggestions from the Committee. This paper presents background on how data quality assessments were previously performed and discusses what did and did not work well.  It also presents an example of what we are thinking about for the future.  Past activities may serve as an impetus for other ideas for Web products and are related to the paper on survey self-assessments.

ASA Committee Advice

The ASA Committee recommended that EIA be very clear about its objectives for this presentation and for all others.  They noted that there was a mix of talking about data quality improvement and reporting about data quality.  Several members said that when we know or suspect there is a data problem we should provide something very specific (perhaps a pop-up) that says either we know there is a problem and we are fixing it or we're not sure what it is yet, but be careful. The Committee also noted that the language we use is important.  For example, saying that there is a “data quality problem” comes across differently than saying that something has “higher variance than we expected.”  They also thought that we should focus on our biggest data quality problems first.

On the Web-friendly question-and-answer product that we were proposing, the Committee saw some merit in it as a teaching tool for new users.  They advised us, however, to determine who our audience was, because the questions we presented to them did not look appropriate for sophisticated data users.  They also advised us to start by measuring data quality, then prepare an internal report, and then a Web product.  In this way there would be a strategy that doesn't increase workload too much but recognizes that these things are going to be linked to a web-based system.

EIA Intended Reaction to Committee Advice

EIA will strive towards clearly stating objectives in future presentations.  We have begun to identify our data quality problems, so that we can determine which ones are our biggest and focus on them, by contacting internal users as the Committee suggested.  We will also examine performance measures as an indication of potential areas to improve data quality.   We intend to discuss the issue of acknowledging data quality problems to our users with our Inter-office Issues Group.  We will rethink the Web-friendly product idea, but do not have an approach as yet.

5.  Can Discrepant Estimates Be a Good Thing?, Renee Miller, SMG, EIA

In August the Washington Post ran an article with this title showing “conflicting” data on the poverty rate using the official government measure and a measure based on the National Academy of Sciences recommendation.  The Census Bureau was cited as the source for both data series.  However, it is not the only agency with conflicting or potentially conflicting data.  In the presentation, “Learning from the Past: Updating Data Quality Efforts,” we show various estimates that could be surrogates for motor gasoline demand.  In addition, EIA has shown estimates of sales of fuel oil and kerosene benchmarked to product supplied and estimates that are not benchmarked, carefully labeled as different series.  For monthly natural gas production, we are showing data from our new survey that we consider to be experimental as well as our official data series.  So we think we are adequately explaining what we are doing, but are we?   We would like to have a discussion with the committee on what they think of these practices, and on how we can make things less confusing for data users and avoid headlines about conflicting data.  This will be an open discussion and we would like to hear from each of the Committee members.

ASA Committee Advice: 

The ASA Committee reinforced its recommendation from the previous discussion to disclose our data issues and challenges.  They noted that high quality software manufacturers not only disclose the problem of the week, but tell you historically what problem was generated when.  Furthermore, the Committee pointed out that we have a bigger mission and that is to educate users, so that they will be aware that data are not perfect. They also suggested that we let the user know, in general, how we deal with things that we don't expect to see in the data.  For example, we would explain to the user that we flag unusual occurrences and that someone will be trying to resolve them.  They may then see additional explanations or the flag may be removed when the problem is resolved.  By having a process in place and describing it we would be showing the users that we’re on their side and are not concealing anything.

EIA Intended Reaction to Committee Advice

We are currently working on describing the unexpected relationship we have observed between the estimate of residual fuel oil from the Manufacturing Energy Consumption Survey and EIA’s estimate of industrial consumption of residual fuel oil.  We also plan to continue to pursue hypotheses that may lead to a resolution of the problem.  We plan to raise the issue of flagging unusual occurrences such as this with EIA’s Inter-office Issues Group and will let the Committee know of our progress at the next meeting.

5.  Survey Self Assessments, Tom Broene, SMG

This session will briefly review self-assessments conducted by government offices in Europe, especially Sweden and Portugal.  Progress on this effort at EIA will be described; it began with developing a self-assessment questionnaire and then conducting meetings for each of our surveys.  The interviews touched on all aspects of survey operations, but did not cover as much detail as a quality profile.  In addition, survey managers were asked to complete a one-page table summarizing the basic quality measures and how readily available these were.  They were also asked to select a survey-specific target for improvement.  Current progress, usefulness, and options for future work will be discussed.

ASA Committee Advice

The committee recommended that the assessment program should be flexible in terms of frequency, team composition, and the level of detail in the questions.  The committee is concerned about maintaining the interest of EIA staff in these assessments in the future.  They recommended that we identify specific items learned and improvements made as a result of the interviews, and that this be completed prior to any second iteration of this effort.  They also recommended that we consider other approaches to evaluating and improving quality that would require less staff time.  The committee also offered several specific suggestions, including focusing on the biggest problem for each survey, having some external members on the interview team, and producing the written reports in a more timely manner.

EIA Intended Reaction to Committee Advice

We plan to discuss the results of the assessments with EIA management, focusing on specific items needing work and successes to share.  The schedule and scope of future work will be decided in discussions with EIA management.

6.  Data Errors, Structural Change and Time Series Shocks in the Electricity Market, Lindolfo Pedraza, author, and Joel Douglas, presenter, SAIC, SMG contractor

This paper uses publicly available micro-level electricity generation data and current imputation methods to test whether microeconomic theory can be used to improve the accuracy of current survey sampling and imputation methods, and of the time series processes these survey aggregates represent. In addition, the resulting imputation models are compared across sample years to characterize structural change in the electric industry.

ASA Committee Advice and EIA Intended Reaction (Combined)

The ASA Committee recommendations were grouped in three areas: documentation, modeling and methodologies. The modeling and methodologies issues were separated to distinguish between existing models and alternative methodologies used to improve the efficacy and efficiency of existing models.

Documentation:  The committee recommended documentation in the paper be improved, and that EIA make available a description of the method that is currently working, specifically, what would be the outcome from the ideal imputation system.

In response: (1) documentation of current imputation models is an ongoing task in CNEAF. Current efforts focus on documenting imputation systems on an annual basis to keep track of improvements in modeling methodologies and assumptions; (2) although documentation efforts focus on the reliability and improvement of existing imputation models, some effort has been spent on determining how the ideal imputation method should manage observed survey data and predict for nonrespondents; and (3) current methodology tests provide preliminary evidence of structural change between 2001 and 2002. Moreover, the strongest evidence is found in states that actively pursued deregulation efforts during those years, such as Texas, California and the Atlantic region.

Modeling:  The Committee asked, How did the sample change over time?  Should such outlier data points be flagged on the website? And, Can other information be used in the regression? Would this benefit the fit?

In response: (1) currently, survey samples are reviewed every year for representative robustness. While the EIA-826 uses a cutoff sample, some state and regional samples are balanced manually. CNEAF is currently working on the development of cutoff sampling algorithms to eliminate possible bias in the balancing process; (2) identifying outlying and influential observations is a key determinant of survey imputation accuracy. However, beyond estimation rules, what makes an observation outlying or influential is open to debate; (3) specifically, are we finding many outliers because the model has a poor fit, or is it due to many influential observations that need to be kept out of the imputation process? Because of this, we cannot publish which firms become outliers; (4) we are currently experimenting with multiple-variable regression imputation; however, we are only using other EIA data, and heavy collinearity issues have been found. In addition, outlier identification and detection become cumbersome in this scenario.
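
For readers unfamiliar with regression imputation, the general idea can be sketched as follows. This is a generic illustration, not the CNEAF imputation system: it assumes a single auxiliary variable (for example, a prior-period value) and a no-intercept model fitted by least squares, and every name in it is hypothetical.

```python
def fit_slope(x, y):
    """Least-squares slope b for the no-intercept model y = b * x."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("auxiliary variable has no variation to fit on")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def impute(records):
    """Fill missing 'y' values using a model fitted on respondents.

    records: list of dicts with keys 'x' (auxiliary value, always known)
    and 'y' (reported value, or None for a nonrespondent).
    Returns the list of y values with the gaps filled by b * x.
    """
    respondents = [r for r in records if r["y"] is not None]
    if not respondents:
        raise ValueError("no respondents to fit the model on")
    b = fit_slope([r["x"] for r in respondents],
                  [r["y"] for r in respondents])
    return [r["y"] if r["y"] is not None else b * r["x"] for r in records]


if __name__ == "__main__":
    # Hypothetical plant-level data: x is a prior-period value,
    # y is the current report (None where the plant did not respond).
    data = [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.0}, {"x": 3.0, "y": None}]
    print(impute(data))
```

An influential respondent can shift the fitted slope, and with it every imputed value, which is why the treatment of outlying observations matters so much to imputation accuracy.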

Regarding Methodologies, the Committee asked, Are you sure that there is no relationship between states?  That is, is there more to learn from all states, instead of a single state?  Is the difference between states enough to justify Seemingly Unrelated Regressions (SUR)?  Then they offered, cross sectional estimations could ‘tune’ the estimation methods rather than ‘drive’ them.  Finally, they asked, is there an efficiency gain in SUR as is?

In response: (1) preliminary tests show two aspects. There are both differences and similarities across state data. First, states have different economic and seasonal environments in addition to different mixes of residential, commercial and industrial electricity demand. Second, every state in the Nation has a different mix of electricity generation capacity by consumed fuel.  (2) In addition, while some states are net importers of electricity, others are net exporters. Moreover, some states sell their excess electricity to some states during the summer and to others during the winter. As a result, market conditions across states matter as the eastern and western interconnects balance demand and supply for electricity.  Finally, (3) preliminary results suggest these differences are real. Current work focuses on the comparison of these state parameters and the definition of appropriate testing procedures.

Regarding SUR justification: yes, preliminary results show that, in certain cases, SUR performs better than current methodologies with or without outlier detection tools. In addition, we are currently working on how to use SUR measures of fit to identify optimal stratification schemes.

7.   Frames Comparisons of the EIA-3 and EIA-860 with the Manufacturing Sector of the 2002 Economic Census and the 2002 Manufacturing Energy Consumption Survey, Vicki Haitot and Richard Hough, U.S. Census Bureau, and Shawna Waugh, SMG, EIA 

The Energy Information Administration contracted with the U.S. Census Bureau to conduct five frame evaluations for CNEAF surveys. This analysis was intended to evaluate whether or not EIA has sufficient coverage of manufacturing establishments within each survey frame. The Census Bureau has completed these evaluations and presented the results for the EIA-5 (Quarterly Coal Consumption and Quality Report, Coke Plants), EIA-63a (Annual Solar Thermal Collector Manufacturing Survey) and the EIA-63b (Annual Photovoltaic Module and Cell Manufacturing Survey) during the April 2005 meeting of the ASA Committee on Energy Statistics.  The paper contains the results documentation for all five evaluations, and the Fall 2005 presentation focused on the results of the evaluations for the EIA-3 (Quarterly Coal Consumption and Quality Report, Manufacturing Plants) and the EIA-860 (“Annual Electric Generator Report” for Combined Heat and Power Plants).

ASA Committee Advice:

The Committee thought EIA achieved fairly high coverage for the five surveys, ranging from 70 to 92 percent by volume and 75 to 100 percent by units. Committee members agreed and understood that matching, even a simple two-step process, may have resulted in a lower coverage rate, since some of the nonmatched establishments may actually be on both the EIA and Economic Census frames and were simply not matched.

One suggestion to consider if this evaluation is conducted in the future is collecting the Employer Identification Number (EIN) on EIA surveys.  This number is available for Census establishments, and collecting it on EIA frames would facilitate matching respondents on EIA frames to the Economic Census and reduce the number of nonmatches.

The committee thought it was important to take a “cost-benefit” approach when seeking to identify establishments missing from EIA’s frames. They suggested that it may be useful to focus on the portions of the frame (e.g. states or industries) with the lowest coverage.
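
The two coverage measures discussed above, by units and by volume, can be sketched with an exact match on a shared identifier such as the EIN. This is an illustration only: the Census Bureau's actual matching procedure is not described here, and the function name, record layout and data are hypothetical.

```python
def coverage_rates(census_establishments, eia_frame_ids):
    """Share of Census establishments found on an EIA frame.

    census_establishments: list of (identifier, volume) pairs, where
    volume is whatever size measure is used (e.g. fuel consumed).
    eia_frame_ids: set of identifiers present on the EIA frame.
    Returns (coverage by units, coverage by volume) as fractions.
    """
    if not census_establishments:
        raise ValueError("no Census establishments to evaluate")
    total_volume = sum(v for _, v in census_establishments)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")

    # Exact match on the identifier; an establishment with no match is
    # counted as missing, even if it is on the frame under another key.
    matched = [(i, v) for i, v in census_establishments if i in eia_frame_ids]
    by_units = len(matched) / len(census_establishments)
    by_volume = sum(v for _, v in matched) / total_volume
    return by_units, by_volume


if __name__ == "__main__":
    # Hypothetical establishments, for illustration only.
    census = [("A", 100.0), ("B", 50.0), ("C", 50.0)]
    print(coverage_rates(census, {"A", "B"}))
```

Running the same calculation separately for each state or industry would show where coverage is lowest, which is where the committee suggested concentrating any search for missing establishments.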

EIA Intended Reaction to Committee Advice

Additional resources may be required to conduct a study to identify missing establishments.  In response to this concern, SMG prepared a brief paper identifying the areas (e.g. states and NAICS codes) with the lowest coverage, as well as websites for identifying missing establishments on all five surveys.

8.  The Relationships Between Various Price Series: Are Futures Contracts Prices Good Predictors of Future Spot Prices?  Bill Trapmann and Lejla Alic, Office of Oil and Gas, EIA

The purpose of the initial analysis conducted was to determine whether the prices of the natural gas futures contracts traded at the New York Mercantile Exchange (NYMEX) can be used to predict the Henry Hub spot price. This analysis compared realized Henry Hub spot market prices for natural gas during the three most recent winters with futures prices as they evolve from April through the following February, when trading for the March contract ends. Comparing monthly futures and spot market prices provides a basis to assess the performance of futures prices as a predictor of spot prices. An examination of price data for recent years shows that futures prices are relatively poor predictors of the Henry Hub spot price that is realized during the corresponding delivery or target month, and even the final futures price for a given contract often does not anticipate correctly the realized average spot price.
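
The comparison described above can be sketched for a single delivery month as follows. This is a minimal illustration, not the analysis in the paper: the function name is hypothetical, the error measure (percent difference from the realized spot price) is one reasonable choice among several, and the prices are invented.

```python
def futures_percent_errors(futures_by_months_ahead, realized_spot):
    """Percent error of futures prices against the realized spot price.

    futures_by_months_ahead: dict mapping how many months before the
    delivery month a price was observed to the futures price for that
    delivery month.
    realized_spot: average spot price realized in the delivery month.
    A negative result means the futures price understated the spot price.
    """
    if realized_spot <= 0:
        raise ValueError("realized spot price must be positive")
    return {
        months: 100.0 * (price - realized_spot) / realized_spot
        for months, price in futures_by_months_ahead.items()
    }


if __name__ == "__main__":
    # Hypothetical prices in dollars per MMBtu, for illustration only.
    errors = futures_percent_errors({6: 5.0, 3: 7.0, 1: 8.0}, 10.0)
    for months in sorted(errors, reverse=True):
        print(months, "months ahead:", errors[months], "percent")
```

Tabulating these errors by months ahead shows whether the futures price converges on the realized spot price as the contract approaches expiration.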

Two representatives from the natural gas division presented the paper and other options for additional work, such as extending this approach to address prices of other fuels. The Committee was also asked whether the current approach should be modified in order to better meet the established objective and to address other objectives that would need to be identified. 

Committee Advice

The Committee's recommendation was to not extend this analysis into other fuels, as this analysis seemed to be sufficient to also address the generic question of the use of futures prices as predictors of spot prices.  The committee also suggested incorporating futures prices into the STEO analysis to refine the STEO price forecasts, and to conduct a comparative analysis of STEO price forecasts and futures contract prices with respect to their ability to predict prices.

EIA Intended Reaction to Committee Advice

EIA will utilize the existing report in response to inquiries concerning the pricing of other fuels as suggested above.

The additional analyses to refine the STEO methodology or examine the STEO forecasts relative to futures contract prices are under consideration for future work planning, but have not yet been implemented.