Summary of Comments from the American Statistical Association (ASA)

Committee on Energy Statistics

at a meeting with the

Energy Information Administration (EIA) 

October 5 and 6, 2006

Washington, D.C.

1.  The National Energy Modeling System:  The Next Steps Forward, Susan H. Holte, Technical Assistant to the Administrator

This presentation to the Committee provided a status report on EIA’s effort to upgrade and enhance the National Energy Modeling System (NEMS). While NEMS has been continuously updated and revised over its 15-year life to maintain relevancy, the time is right to step back, reconsider parts of the model, and request additional resources in support of the effort to upgrade and enhance the model.

At this time, an EIA team has been formed to lead the effort. The team has requested input on NEMS from stakeholders within the Department.  Several offices in the Department are very interested in the effort because they must use NEMS to illustrate the potential energy market and economic benefits of their programs as part of their budget submissions. Also, letters have been sent to various energy trade groups and research organizations, as well as to some of the National Laboratories with interest in energy modeling. The NEMS modelers and analysts in EIA’s Office of Integrated Analysis and Forecasting have also provided input on potential model enhancements that are likely out of the scope of typical annual updates.

Over the next several months, the team will synthesize the input and prioritize the potential model enhancements in support of future budget submissions and other planning activities.

EIA’s activities and approach mentioned in Ms. Holte’s presentation led to questions and discussion but not advice.  The purpose of this session was to inform the Committee.

2.  Needs and Priorities for Consumption Information in the Future,  sometimes called, “Back to the Future:  Learning from the Past About the Future of Energy Demand Data Collection”  Dwight French, EMEU, EIA


The purpose of this presentation was to acquaint the Committee with the limitations of the energy consumption data program in EIA, and to solicit the opinions of the Committee on two questions: what criteria should be used to decide which, if any, additional programs or improvements to existing programs are worth pursuing in this area, and how the need for consumption data should be balanced against the need for data from elsewhere in EIA about other aspects of energy throughput.

The energy consumption area of EIA, rather than covering all energy use and breaking it down (the “accounting” approach used by supply information systems in EIA), covers individual sectors of U.S. society, and provides information about the characteristics of the consumers in each sector, as well as their energy consumption and expenditures.  The characteristics data are used to provide more detailed understanding of energy consumption.  However, some sectors are missing from consumption system coverage.  Also, not all energy sources are covered within covered sectors, and many characteristics of consumers in the various covered sectors are not collected.  There are good reasons for these omissions:  expense of providing more information; burden control; inability of respondents to provide many technical characteristics; inability to take advantage of some detailed information that might be collected; etc.  Many of these obstacles could be overcome and the consumption data program made more complete and responsive, but it would take people and dollars to accomplish it.

Therefore, the questions for the Committee members at the session were:  what consumption area enhancements should be considered, especially keeping in mind that at most limited additional funds can be expected in the immediate future; and what criteria should EIA use in deciding how to allocate funds between EIA’s consumption data programs and other pressing EIA program needs?

ASA Committee Recommendations

(1)   Focus on the sectors of U.S. society where changes are occurring most rapidly.

(2)   Focus on directions that emerging energy policy seems to be pushing; for example, it might be wise to prepare to deal with emerging renewable energy forms, or prepare to measure emergence of alternative vehicle fuels.

(3)   Focus on doing more detailed breakdowns of “other” energy use (the remainder after identified end uses have been quantified).  Especially for electricity, “other” uses are growing over time.

(4)   Consider doing mini-surveys between regular cycles of the surveys, to get basic information more often.  Or perhaps even go to something closer to “continuous measurement,” where data on a subset of the sample are collected every year, equivalent to doing a big survey every 4 years.  Data would then be produced every year, highly correlated from year to year, through a 4-year rolling average estimator.
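
As an illustration of the rolling average estimator mentioned in item (4), the following minimal sketch assumes that one subset of the sample is surveyed each year and yields one annual subsample estimate. The function name and all figures are hypothetical and are not EIA data or an EIA method.

```python
def rolling_average_estimates(annual_estimates, window=4):
    """Average the most recent `window` annual subsample estimates.

    Returns one rolling estimate for each year in which a full window
    of annual estimates is available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    results = []
    for end in range(window, len(annual_estimates) + 1):
        recent = annual_estimates[end - window:end]
        results.append(sum(recent) / window)
    return results


# Hypothetical annual subsample estimates (arbitrary consumption units).
annual = [100.0, 104.0, 98.0, 102.0, 110.0, 106.0]
rolling = rolling_average_estimates(annual)
print(rolling)
```

Consecutive rolling estimates share three of their four annual inputs, which is why the published series would be highly correlated from year to year.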

 EIA Intended Response(s)

(1)   We try to focus on change in the various sectors, but it can be difficult because our surveys are conducted only every 4 years.  If we miss something new, it’s a while until we have another chance to address it.

(2)   It’s fair to say that EIA will be keeping a careful eye on renewable energy forms and incorporating these into data collection.  We have a problem in the consumption area with emerging subject matter areas, because our relatively small sample sizes don’t allow us to address relatively rare events.  Also, consumers tend to be ignorant of emerging areas, so it can be difficult to get reliable data on new things.  It is interesting that the hot new emerging subject matter area at the moment is alternative fuels for highway transport – an area in which EIA does not currently collect data.

(3)   EIA has done some additional breakouts of electricity use for appliances over time, but it can be difficult to model the effects of a myriad of small, not-all-that-intensive appliances that consume a significant amount of energy because there are so many, and so many different ones.

(4)   EIA has thought about doing this with MECS.  The critical issue is timing an interim effort right, budget-wise, and maybe moving a regular survey back once to make a budget available for such a special effort.  MECS would be the easiest to do logistically, and maybe could serve as a first initiative, with other consumption surveys following its lead at a later time.  The problem with taking such an approach for the consumption surveys is cost inefficiencies, though there could be some cost savings through amortized processing systems, use of already-trained interviewers, etc.

3.     Using Models to Detect Outliers in Refinery Data, Lawrence Stroud and Phillip Tseng, SMG, EIA


Using Models to Detect Outliers in Refinery Data*

Phillip Tseng, SMG, 202-287-1600

Lawrence Stroud, SMG, 202-287-1722

October 5, 2006


Refinery operations in the United States provide useful information on the supply of petroleum products. The EIA Form-810 provides a very comprehensive set of data on inputs, outputs, and changes in inventory of all refineries in the U.S. In recent years, Form-810 data have received special attention from stakeholders because they provide useful information on trends in demand for major products and can have profound implications for prices of crude oil and major petroleum products.

The goal of this paper is to review the refinery input and output data system that reports the demand for petroleum products, to identify possible data inconsistencies, and to make adjustments if necessary to the time series demand data. Several simple regression models will be used to examine the relationship between refinery gain and production of major products. The statistical relationships developed in these models can be used to identify potential outliers and to help data analysts verify data accuracy with respondents. The end product will be a methodology and a computer program that EIA may use to detect outliers and improve data quality.

The Energy Information Administration (EIA) defines demand as product supplied, which measures the disappearance of products from primary sources, i.e., refineries, natural gas processing plants, blending plants, pipelines, and bulk terminals. In general, product supplied of each product in any given period is computed as: field production, plus refinery production, plus imports, plus unaccounted for crude oil, minus stock change, minus crude oil losses, minus refinery inputs, minus exports. Refinery production is the largest component of product supplied, and gasoline dominates U.S. refinery production. Therefore, this paper will focus only on refinery operations of gasoline and its effects on refinery gain. Specifically, the focus of this paper includes inputs to refineries, production of gasoline, and refinery processing gain. Historically, refinery volumetric gain ranges from 4 to 7 percent and is closely related to gasoline production. In 2005, refinery production averaged about 17.7 million barrels per day, and a one percent increase in refinery gain amounts to more than 170 thousand barrels per day of additional production, or about 62 million barrels per year.
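
The product supplied calculation described above can be sketched as follows. This is a minimal illustration only: the function and argument names are hypothetical, the sample inputs are invented, and treating a one percent increase in gain as one percent of average refinery production is an assumption made here to reproduce the order of magnitude quoted in the text.

```python
def product_supplied(field_production, refinery_production, imports,
                     unaccounted_crude, stock_change, crude_losses,
                     refinery_inputs, exports):
    """Product supplied for one product and period, per the definition above."""
    return (field_production + refinery_production + imports
            + unaccounted_crude
            - stock_change - crude_losses - refinery_inputs - exports)


# Invented inputs, in thousand barrels per day, for illustration only.
example = product_supplied(10, 100, 20, 1, 2, 0, 5, 4)

# Refinery gain arithmetic from the text (barrels per day).
REFINERY_PRODUCTION_BPD = 17_700_000
one_percent_bpd = REFINERY_PRODUCTION_BPD // 100   # 177,000 b/d, "more than 170 thousand"
annual_at_170k = 170_000 * 365                     # 62,050,000 barrels, "about 62 million"
annual_at_one_percent = one_percent_bpd * 365      # 64,605,000 barrels
print(example, one_percent_bpd, annual_at_170k, annual_at_one_percent)
```

The annual figure of about 62 million barrels corresponds to the rounded 170 thousand barrels per day; using the full one percent of 17.7 million barrels per day gives a somewhat larger annual volume.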

*This is a working document prepared by the Energy Information Administration (EIA) in order to solicit advice and comment on statistical matters from the American Statistical Association Committee on Energy Statistics. This topic will be discussed at EIA's fall 2006 meeting with the Committee, to be held October 5 and 6, 2006.

ASA Committee Recommendations and EIA Intended Response(s)

·        Multicollinearity: the data series in the refinery gain equation may be correlated.  We should correct that before worrying about autocorrelation.  Response: we will re-run the refinery gain equation with gasoline production and crude oil input as independent variables.  We will run covariance analysis for the independent variables in the gasoline production equation.

·        Missing variables: Joann Shore, from the audience, said the equation may have missed one variable.  She indicated that the ACU cut point may vary with seasons and the model may need a variable to pick up seasonality.  Response: we can insert seasonal or monthly dummy variables to capture the effects of seasonality on refinery operations.

·        December tax treatment: Henry Brooks, from the audience, said that December production is higher than in July and August because refiners want to push inventory out to avoid taxes on inventory.  Response: if it is for tax purposes, refiners should also increase refinery inputs.  Does tax treatment also affect the way refiners operate in a way that alters the input-output relationship and technologies?

·        Use PADD data to run regression analysis instead of national data.
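
The general idea of using a regression model to flag potential outliers can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: it fits a single-regressor least squares line (for example, refinery gain against gasoline production), uses invented data, and flags observations whose residual exceeds a hypothetical threshold of two residual standard deviations. Seasonal dummy variables and PADD-level data, as suggested above, are not included.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def flag_outliers(x, y, threshold=2.0):
    """Return indices whose residual exceeds `threshold` residual std. deviations."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    dof = len(x) - 2  # two estimated parameters
    sigma = math.sqrt(sum(r * r for r in residuals) / dof)
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * sigma]


# Invented series: y is roughly 2x + 1, except for one suspicious report.
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_values = [3.1, 4.9, 7.0, 9.1, 10.9, 13.0, 20.0, 17.1, 18.9, 21.0]
flagged = flag_outliers(x_values, y_values)
print(flagged)
```

A flagged index is only a candidate for follow-up with the respondent; a large residual may reflect a real operational change rather than a reporting error.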

4.  Electricity 2008  Introduction: The Broad Plan, Robert Schnapp, Director, Electric Power Division, CNEAF, EIA   


EIA presented a summary of the current status of the “Electricity 2008” project.  This project was initiated by the Electric Power Division (EPD) to reassess the current electric power surveys, which have to be submitted to the Office of Management and Budget every three years for approval.

The Committee was presented with the major issues that the EPD is addressing.  These include: 1) merging the Form EIA-906, “Power Plant Report,” the Form EIA-920, “Combined Heat and Power Plant Report,” and the Form EIA-423, “Monthly Cost and Quality of Fuels for Electric Plants Report;” 2) merging the Form EIA-860, “Annual Electric Generator Report,” and the static information from the Form EIA-767, “Steam-Electric Plant Operation and Design Report;” 3) modifying the samples for these forms; and 4) revising the data confidentiality policy.

ASA Committee Recommendations:

The purpose of this presentation was to inform the Committee on the status of the project.  Several questions were raised in the discussion period that followed.  These included: the reasons for limiting the sample, why the CHP fuel needed to be split out, the choices for the cut-off samples, the future importance of the CHP sector, and non-response issues.

EIA Intended Response(s):

These questions were addressed and there were no outstanding issues or suggestions to change the proposed course of the project.

5.  Electricity 2008  Forms Redesign Issues Resulting From Combining Electric Power Surveys, Bob Rutchik, SMG, EIA

Abstract: This presentation outlined issues that had arisen from merging five EIA electricity survey forms into two new surveys. EIA will merge three monthly surveys into one survey that will collect data both monthly and annually. These surveys are:

·        EIA-423, Monthly Cost and Quality of Fuels for Electric Plants Report

·        EIA-906, Power Plant Report, and

·        EIA-920, Combined Heat and Power Plant Report.

The second combination will merge two annual forms into one annual survey. They are:

·        EIA-860, Annual Electric Generator Report, and

·        Parts of the EIA-767, Steam-Electric Plant Operation and Design Report

The session was intended to ask the Committee’s advice on the following:

·        Does EIA have the correct data requirements for the EIA-423, EIA-906, and EIA-920 combined form?

·        What should EIA be aware of in formulating an imputation methodology for monthly data for Combined Heat and Power Plants (CHPs) on the merged EIA-423/EIA-906/EIA-920?

·        Does EIA have the correct data requirements for the combined EIA-860 and EIA-767?

·        What does the committee think of the cut off sampling methodologies for both new surveys?

ASA Committee Recommendations:

One of EIA’s data requirements on the combined monthly surveys is fuel cost. Mr. Burton advised EIA that it is going to run into situations where respondents will have a hard time telling EIA what their fuel costs are, and that EIA should be concerned about the reliability of these data.

On imputing monthly data for CHPs, he advised EIA possibly to use different imputation methodologies for large and small facilities because of the possible difference in economics between the two.

Mr. Edmonds was also concerned about the differences between large and small facilities. By cutting back on the number of monthly respondents that EIA will sample, EIA will be sampling only large plants and not small ones. There are differences between the two in the way they operate. If EIA does not account for this difference, the quality of the data collected could be adversely affected.

Mr. Neerchal asked about using covariates other than generation capacity, such as total sales, for the sampling. He also brought up the subject of EIA asking the small plants in the combined EIA-860/EIA-767, which will be sampled once every three years, to provide their data for the other two years. EIA and other Committee members voiced concern about recall and the quality of records. This, though, is something EIA could ask about during its forms testing period.
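
The cut-off sampling discussed in this session can be illustrated with a minimal sketch. It assumes that units are ranked by a single size measure (generation capacity here, though another covariate such as total sales could be substituted) and that the largest units are selected until a target share of the total is covered. The plant identifiers, size values, and 90 percent coverage target are hypothetical.

```python
def cutoff_sample(sizes, coverage=0.9):
    """Select the largest units until they cover `coverage` of the total size.

    `sizes` maps a unit identifier to its size measure.
    """
    total = sum(sizes.values())
    selected = []
    covered = 0.0
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    for unit, size in ranked:
        if covered >= coverage * total:
            break
        selected.append(unit)
        covered += size
    return selected


# Hypothetical plants and capacities (megawatts).
plant_capacity = {"A": 500, "B": 300, "C": 120, "D": 50, "E": 30}
sampled = cutoff_sample(plant_capacity, coverage=0.9)
excluded = sorted(set(plant_capacity) - set(sampled))
print(sampled, excluded)
```

In this example the two smallest plants are never observed, so their values must be estimated from the larger plants or from other information, which is the source of the Committee's concern about differences between large and small facilities.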

EIA Intended Response(s):

EIA will address these issues, namely the quality of cost data and the differences between large and small facilities in both combined surveys, when it tests the forms with respondents.

6.  Emerging Technologies, Data, and NEMS Modeling, Chris Namovicz, OIAF, EIA


EIA provided an overview of the wind energy model used in NEMS.  The presentation focused on the challenges involved in collecting data in a market that is relatively new.  Although some data and approaches are becoming available, identifying the resource limitations is difficult, since such a small fraction of the potential resource base has been exploited.

ASA Committee Recommendations:

The Committee respondents recommended several potential lines of data for issues such as community reaction and growth limits.  Using data collected for other locally undesirable land uses (such as prisons) may help provide a gauge of the potential cost of “NIMBY” issues in the wind industry.  Some international markets are experiencing wind industry growth similar to that of the U.S., and may provide insights into issues within the U.S. market.

EIA Intended Response(s):

EIA will continue to pursue all available avenues of wind industry data, including from international markets.  EIA will consider the potential applicability of data from non-wind industries facing similar negative community reaction.

7.     Suggestions from the External Study Team to EIA and Progress for Strategic Planning, Howard Gruenspecht, Deputy Administrator, EIA

This session was presented by Howard Gruenspecht, EIA’s Deputy Administrator.  This was an information session, and the Committee asked some clarifying questions.  The External Study Team’s report may be found on the ASA Meeting Home Page at and the Transcript of the session may be found at “Transcripts and Summaries” on EIA’s Home Page at

Friday, October 6, 2006

8.  Oil Production Model, sometimes called Non-OPEC Oil Production in SAGE, Harry Vidas, Contractor to OIAF, EIA


Harry Vidas from Energy and Environmental Analysis (EEA) presented slides describing the World Oil Logistics Model (WOLM) that was prepared under contract to EIA.  WOLM is a spreadsheet model that takes a long-run crude oil supply curve from EEA’s WAU model plus historical production and reserve data and produces a forecast of annual crude oil production to 2100 by country as a function of oil price. The model is based on a "Hubbert" or "Logistic" Curve concept but modifies that concept to account for the crude oil supply curve economics and how the price of oil could affect the "shape" of the curve (that is, a higher oil price will accelerate oil exploration and skew the Hubbert curve to the left). The model also calculates reserves, reserve additions, new oil wells drilled and the number of operating oil wells as additional outputs.
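
For readers unfamiliar with the "Hubbert" or logistic curve concept, the following minimal sketch shows the standard textbook form, in which cumulative production follows a logistic curve and annual production is its rate of change. It does not reproduce WOLM or its price-dependent modification of the curve; the parameter names and values are hypothetical.

```python
import math


def cumulative_production(year, urr, steepness, peak_year):
    """Logistic cumulative production; `urr` is ultimately recoverable resource."""
    return urr / (1.0 + math.exp(-steepness * (year - peak_year)))


def annual_production(year, urr, steepness, peak_year):
    """Rate of change of the logistic curve: steepness * Q * (1 - Q / urr)."""
    q = cumulative_production(year, urr, steepness, peak_year)
    return steepness * q * (1.0 - q / urr)


# Hypothetical parameters: 100 billion barrels recoverable, peak in 2010.
URR, STEEPNESS, PEAK_YEAR = 100.0, 0.1, 2010
peak_rate = annual_production(PEAK_YEAR, URR, STEEPNESS, PEAK_YEAR)
rate_before = annual_production(2000, URR, STEEPNESS, PEAK_YEAR)
rate_after = annual_production(2020, URR, STEEPNESS, PEAK_YEAR)
print(peak_rate, rate_before, rate_after)
```

In this unmodified form the curve is symmetric about the peak year, and peak production equals steepness times the recoverable resource divided by four. A model in which higher prices accelerate exploration would depart from this symmetry.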

Mr. Vidas explained that WOLM was built to test out data and algorithms that could be used in SAGE to represent oil production for non-OPEC regions. The WOLM was built at a country level because there are differences in the amount and quality of the data for each country.  The country-level structure in the model will allow the underlying data to be kept in its original format and will permit aggregations into regions using current (or possible future changes to) SAGE region definitions.

Mr. Vidas discussed the data sources used in the model including:

·        Annual production: EIA International Energy Annual 2003

·        Cumulative production: USGS 2000 World Assessment

·        Proved reserves: Oil and Gas Journal

·        Undiscovered resource, new fields: USGS 2000 World Assessment.

·        Undiscovered resource, reserve growth: USGS 2000 World Assessment, modified by EEA.

·        Annual oil well completions: World Oil Magazine.

·        Annual number of operating oil wells: World Oil Magazine

·        Resource cost of undiscovered resource: EEA’s WAU Model

Reliability problems with the underlying data were discussed, as were the inherent uncertainties related to resource and future technology assumptions.

ASA Committee Recommendations:

Committee Recommendations received from Harry Vidas

There was general agreement in the Committee that the subject matter of how non-OPEC oil production will respond to oil prices was an important area of research for EIA to pursue.  There were also comments that the very long-run aspect of WOLM and its ability to address the issues surrounding where and when conventional oil production peaks and then begins to decline was a welcome change in focus that EIA should continue to pursue.

Specific comments and suggestions included the following:

·        Accelerated oil production caused by high prices might decrease (rather than, as is assumed in the model, increase) ultimate recovery of oil if inefficient production techniques are applied, as they were in the 1920’s and 1930’s.  EIA should consider whether this phenomenon is possible given current technical understanding and practices.

·        EIA should report the implied price elasticity for oil production coming from WOLM.  This would be of great interest to analysts and would be useful for assuring reasonable results.

·        EIA is right to be cautious of USGS reserve growth numbers based on U.S. field growth experience. More research into this area may be warranted.

·        Upstream technologies that are applied in U.S. oil fields may not be applied at the same speed or at all in some foreign settings.  This may have implications for costs and recovery factors in WOLM as well as the size of potential growth in existing fields.

·        EIA should consider whether the rate of technical change has accelerated and how that might affect how historical data can be used in statistical analyses that underlie oil supply forecasting models.

·        The lag effect between oil prices and levels of drilling and production should be considered carefully.  It may take several years of sustained high prices before investments that require high prices are made.

·        WOLM has a “patchwork” quality in that the long run oil supply curves are coming from a separate model.  It would be better to integrate all steps into a single model.

·        Since the world is diverse and complex, EIA first should concentrate on the top 10 oil producing countries in getting more reliable historical data and a reasonable forecast.

·        For many countries, the political situation and the investment climate may be just as important as the world price of oil.  Such non-price considerations also need to be reflected in EIA’s projections.

·        It’s important to draw a distinction in such a model between price and price expectations by market participants.  The translation from forecasted prices to forecasted price expectations should be more explicit.

·        Upstream technology advances are difficult to analyze in historical context and to model for the future.  One perspective that EIA should consider is that upstream technology advances are independent of oil prices and might be influenced by overall technology advances occurring in the economy as a whole.  The reasons behind the accelerating rate of economy-wide technical innovation are themselves worth investigating.

·        The definition of what is “conventional oil” is important to get right and to apply it consistently across all countries.  It is also important to have a representation of unconventional oil and how its future production will respond to oil prices and future technological advances. 

·        In some instances, conventional and nonconventional oil can both exist in the same oil field.  They are not necessarily two distinct resource “boxes.”

·        There needs to be more than one “Hubbert curve” to the degree that separate resources such as oil shales, oil sands and coal-to-liquids are being represented in the model.
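
One way to report an implied price elasticity of oil production, as suggested in the comments above, is to compare model output at two price levels. The sketch below uses the midpoint (arc) elasticity formula; the choice of formula, the function name, and the prices and production volumes are assumptions for illustration and are not WOLM results.

```python
def arc_elasticity(price_low, price_high, quantity_low, quantity_high):
    """Midpoint (arc) elasticity of quantity with respect to price."""
    if price_high == price_low:
        raise ValueError("the two prices must differ")
    pct_quantity = (quantity_high - quantity_low) / ((quantity_high + quantity_low) / 2.0)
    pct_price = (price_high - price_low) / ((price_high + price_low) / 2.0)
    return pct_quantity / pct_price


# Hypothetical model runs: production of 100 at a price of 50, and 110 at 60.
implied = arc_elasticity(50.0, 60.0, 100.0, 110.0)
print(round(implied, 3))
```

Because of the lag between prices and investment noted above, an elasticity computed this way would depend on the forecast year at which the two runs are compared.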

EIA Intended Response(s):

Possible EIA Response to Suggestions received from Harry Vidas

EIA is currently working with the spreadsheet version of WOLM to better understand how it works and what projections it produces under a variety of cases.  EIA will determine whether and how the WOLM algorithms should be introduced into SAGE after the evaluation is completed.  

9.  Weekly Modeling of Stocks of Other Oils from Monthly Data, Ruey-Pyng Lu, SMG, EIA


EIA collects data on stocks of other oils on a monthly basis.  However, weekly stocks of other oils are estimated from the monthly total petroleum stocks and petroleum products supplied because stocks of other oils are not collected on a weekly basis. Because of seasonality and unexpected disruptions, a notable difference was observed between EIA’s weekly estimates and the monthly data for stocks of other oils. Thus, we began to explore other ways to estimate weekly stocks of other oils.  Note:  The estimation procedure excluded the weekly propane data.

We evaluated alternative methods by comparing the weekly estimates with the Petroleum Supply Monthly (PSM) data.  The PSM data used for estimation purposes include weekly data for major oil products, monthly data for major oil products and the “Stocks of Other Oils” products.

EIA recommendations are:

1.     To collect weekly stocks of other oils from the sampled companies.  Nine of the largest companies may be contacted to find out how reasonable it would be for them to report one aggregate “other oils stocks” number on a weekly basis.  If ten or more need to be contacted, we can conduct a generic clearance to test the feasibility of this collection.  Historically, stocks of other oils were difficult to collect, but respondents may now have the data in some automated systems.

2.     We strongly recommend that the petroleum products team examine the unobserved components model (UCM).  We should also reconcile with EMEU to get consensus on how to publish the WPSR and STEO estimates of stocks of other oils.

3.     Document the procedures/models used to perform the estimation and the rationale for selecting the particular estimate, and make sure that the stocks of other oils numbers are reproducible.

ASA Committee Recommendations

1.     One of the ASA committee members, Moshe Feder commented: “Actually I want to say that I like the unobserved components model and I think that’s the way to go, but maybe even that can be improved.”  “I cannot speak for the committee but for myself, I endorse this approach, I think it’s great, adjusting that --- you need to keep working and making sure that those spikes will be smaller.”  “If you have just monthly data, if you put it correctly in your model, and you run the model, assuming that the model performs well, you will get weekly estimates from the model that should be quite good. That is what I suspect. So even if those data are available only monthly, since you have – also have other weekly data, I think you can actually write a model to use all the data and get the estimates that you want.”

2.     The Unobserved Components model (UCM) is the right approach. When fitting the Unobserved Components model, try to use the monthly seasonality instead of weekly seasonality components in the UCM.

3.     Plot the whole available time series data to evaluate the fitness of the model.

4.     Fit the monthly and weekly data to the UCM and adjust the models to improve the precision of the estimates.

All the BLS estimates of the unemployment rate are based on state-based modeling, and BLS uses the UCM to adjust the rate.  There is an ongoing project to do benchmarking to improve the precision of those estimates.

EIA Intended Response(s)

SMG will fit the Unobserved Components model with monthly and weekly data separately to adjust the models’ fit, then compare the forecast precision with the available data series. After adjusting the forecasts’ precision, the fitted UCM will be delivered to the Petroleum Supply division to be used in the WPSR publication.
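
To illustrate how an unobserved components model can produce weekly estimates when the target series is observed only monthly, the following minimal sketch runs a Kalman filter for the simplest such model, a local level (random walk plus noise), with missing observations in most weeks. It is not the model SMG fitted: it has no seasonal component, uses no other weekly series, and takes the two variances as fixed hypothetical values rather than estimating them. All data are invented.

```python
def local_level_filter(observations, level_var, obs_var,
                       init_level=0.0, init_var=1e7):
    """Kalman filter for a local level model.

    `observations` holds one entry per week; None marks a week in which
    the series was not observed. A large `init_var` approximates a
    diffuse (uninformative) starting point.
    """
    level, var = init_level, init_var
    estimates = []
    for y in observations:
        # Prediction: the level follows a random walk, so only its variance grows.
        var = var + level_var
        if y is not None:
            # Update: move toward the observation in proportion to the Kalman gain.
            gain = var / (var + obs_var)
            level = level + gain * (y - level)
            var = (1.0 - gain) * var
        estimates.append(level)
    return estimates


# Invented stocks observed only every fourth week.
weekly_obs = [100.0, None, None, None, 104.0, None, None, None, 102.0]
weekly_estimates = local_level_filter(weekly_obs, level_var=1.0, obs_var=1.0)
print([round(v, 2) for v in weekly_estimates])
```

With only a level component, the filtered estimate is carried forward unchanged through unobserved weeks. Adding weekly explanatory series and a monthly seasonal component, as the Committee suggested, is what would allow the weekly estimates to move between monthly observations.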

10.  An EIA Quality Product, George M. Lady, Ph.D., Contractor to SMG, EIA


On October 6, 2006 a proposed methodology for evaluating NEMS forecasts was presented to the American Statistical Association (ASA) Committee on Energy Statistics. NEMS forecasts are made conditionally, given a variety of assumptions about future economic conditions and other characteristics of energy markets. An evaluation of the forecasts calls for identifying the sources of error as they are associated with differences between the conditions assumed and those eventually prevailing. Sources of error include the following:

          Transitory Influences, e.g., weather, strikes, accidents, embargoes not accounted for in the projections.

          Institutional Influences, e.g., changes in laws and regulations and changes in data series definitions compared to model assumptions.

          Structural Influences, e.g., changes in resource availability or energy use technology compared to model assumptions.*

          Errors in Projecting Conditional Variables, e.g., differences in the eventual values of activity drivers and other exogenous factors such as GDP and population.*

          Errors in Behavioral Parameters, e.g., changes in consumer price sensitivities compared to those assumed by the forecasting methodology.*

          Uncertainty, e.g., the residual error of the projection method.

An evaluation of long term forecasts is complicated by the length of time between the forecast publication and the eventual values of the variables projected. It is not practicable to retain versions of NEMS associated with past forecasts and re-run the model at a future time using the observed, rather than assumed, conditions. The proposed evaluation methodology presented to the ASA Committee calls for processing NEMS solution data, which can be easily archived, via regression analyses of the supply and demand relationships implicit in the NEMS representation of energy markets. The regression results can then be assessed using observed, rather than assumed, values for the conditional variables. The analysis can be differentially configured to isolate the impact of each conditional assumption upon the differences between forecast and historical values.
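
The evaluation idea described above can be sketched with a minimal example. It assumes that a linear regression approximation of one NEMS relationship has already been fitted to archived solution data, and it then attributes forecast error to each conditional variable by substituting the observed value for the assumed one. The coefficients, variable names, and values are hypothetical and are not taken from NEMS.

```python
def decompose_forecast_error(intercept, coefs, assumed, observed, actual):
    """Split forecast error into conditional-variable effects and a remainder.

    Returns the original forecast, the contribution of each conditional
    variable (coefficient times observed minus assumed value), and the
    remaining error once observed conditions are used.
    """
    forecast = intercept + sum(coefs[k] * assumed[k] for k in coefs)
    contributions = {k: coefs[k] * (observed[k] - assumed[k]) for k in coefs}
    re_evaluated = forecast + sum(contributions.values())
    remainder = actual - re_evaluated
    return forecast, contributions, remainder


# Hypothetical regression approximation of an energy demand relationship.
coefs = {"gdp": 2.0, "price": -0.5}
assumed = {"gdp": 100.0, "price": 40.0}    # conditions assumed at forecast time
observed = {"gdp": 104.0, "price": 50.0}   # conditions that eventually prevailed
forecast, contributions, remainder = decompose_forecast_error(
    10.0, coefs, assumed, observed, actual=196.0)
print(forecast, contributions, remainder)
```

In this example the total error of 6.0 splits into 8.0 from the GDP assumption, -5.0 from the price assumption, and a remainder of 3.0 that reflects the other sources of error listed above. For a linear approximation the split does not depend on the order in which the variables are substituted.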

ASA Committee Suggestions:

The Committee strongly agreed with the goals and method of the proposed forecast evaluation methodology and felt that conducting the forecast evaluation was an important tool in assessing the merit of long term forecasts. The Committee specifically recommended:

(1) Undertake to develop the statistical approximations of NEMS and use the results to assess the sources of forecast error.

(2) Conduct the forecast evaluation on a regular basis and provide the results as a resource in NEMS development.

(3) Select the relationships submitted to regression analysis so as to enable a statistical approximation of NEMS itself for major categories of energy production and consumption. The Committee termed the resulting system “NEMS Lite.” It would provide an integrated assessment of the accuracy of NEMS projections.

Intended EIA Response(s):

A project is now underway to complete the specification of the NEMS forecast evaluation methodology. Once completed, EIA plans to initiate the development of the evaluation methodology. This follow-on project will develop the data and software resources for conducting the evaluations and demonstrate the method through assessments of the solution data for NEMS solutions produced in support of the Annual Energy Outlook (AEO) starting with the 1998 AEO. EIA staff will be trained in conducting the evaluations and presenting results on an annual basis. The current project and the proposed follow-on are responsive to ASA recommendations (1) and (2) above. Expanding the effort to result in a statistical approximation of important features of NEMS, a “NEMS Lite” as identified in item (3) above, remains contingent on available resources.