
Assessment of consumption and expenditure data collected from energy suppliers against bill data obtained from interviewed households: Case study with 2009 RECS

February 15, 2013

The Residential Energy Consumption Survey (RECS) is a national area-probability sample survey of statistically selected housing units across the U.S. that collects energy-related data for occupied primary housing units. The survey consists mainly of two parts: the RECS Household Survey (RECS-HS) and the Energy Supplier Survey (ESS). In the RECS-HS, respondents provide information about their household characteristics, energy-using equipment, fuels used, and other aspects of energy use. In the ESS, energy suppliers are required to provide twenty months of energy consumption and expenditure data for the RECS-HS sample households.

In the RECS-HS, households are asked to provide their energy supplier information via utility billing statements. Beginning with the 2005 RECS data collection cycle, interviewers have used portable devices to scan respondents’ utility bills so that supplier names and account numbers can be captured more easily and accurately. The scanned bills contain more than supplier names and account numbers: they also show energy consumption and expenditures for at least the current service period, and we use this information to examine the numerical accuracy of the ESS collected data. By comparing these two sources, we hope to learn more about limitations in the data we collect, which we can then attempt to address. This limited empirical study is thus an example of the research that EIA conducts to evaluate and subsequently improve the quality of the data it collects.

Research design

Population of interest
The 2009 RECS-HS has 12,083 cases. In this analysis, we compare the ESS collected data with the RECS-HS scanned electricity bill data, because electricity is the most widely used energy source. The population of interest is the 2009 RECS-HS interviewed households with scanned electricity bills for which energy suppliers provided consumption and expenditure data. This analysis population contains 6,150 households. Note that the RECS-HS sampling weights play no role in this analysis and that we do not generalize any findings to a wider population.

Data
Since the analysis population is too large to examine in its entirety, we limited ourselves to one household per supplier. The 6,150 scanned bills come from 408 unique electricity suppliers, and we randomly selected one household from each supplier. This approach rests on our assumption that data quality does not vary much within a supplier, while the variation across suppliers could be substantial. Although this assumption may not hold, we decided to use this simple approach for our initial study. Other approaches could have been used, such as acceptance sampling or probability-proportional-to-size subsampling, but those would have required considerably more resources because many more cases would be needed.
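The one-household-per-supplier selection can be illustrated with a minimal Python sketch. It is not the procedure EIA actually ran: the function name, the `(household_id, supplier_id)` input layout, and the fixed random seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def select_one_per_supplier(households, seed=0):
    """Randomly pick one household for each supplier.

    households: iterable of (household_id, supplier_id) pairs
                (a hypothetical layout, not the actual RECS file format).
    Returns a dict mapping supplier_id -> selected household_id.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group household IDs by supplier.
    by_supplier = defaultdict(list)
    for household_id, supplier_id in households:
        by_supplier[supplier_id].append(household_id)

    # Draw one household per supplier with equal probability within
    # the supplier. Sorting makes the result independent of input order.
    return {
        supplier_id: rng.choice(sorted(ids))
        for supplier_id, ids in sorted(by_supplier.items())
    }


# Toy data: three suppliers with 3, 1, and 2 households.
toy_households = [
    ("H1", "S1"), ("H2", "S1"), ("H3", "S1"),
    ("H4", "S2"),
    ("H5", "S3"), ("H6", "S3"),
]
toy_sample = select_one_per_supplier(toy_households, seed=2009)
```

Under this scheme the sample size always equals the number of suppliers, so 408 suppliers would yield 408 selected households regardless of how many households each supplier serves.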

In the analysis population, the maximum number of households linked to any one supplier is 325. The minimum is one, and 46 suppliers have only one household. The mean number of households per supplier is about 15; the median is 5.
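The mean follows directly from the counts given earlier: 6,150 households divided by 408 suppliers is roughly 15.07. The other summary figures would need the full household-to-supplier listing. The sketch below shows how such a summary could be computed; the function name and the flat list of supplier IDs are assumptions for illustration, and the toy input is not RECS data.

```python
from collections import Counter
from statistics import mean, median


def households_per_supplier_summary(supplier_ids):
    """Summarize how many households are linked to each supplier.

    supplier_ids: one supplier ID per household (hypothetical layout).
    """
    sizes = list(Counter(supplier_ids).values())
    return {
        "suppliers": len(sizes),
        "max": max(sizes),
        "min": min(sizes),
        # Number of suppliers linked to exactly one household.
        "single_household_suppliers": sum(1 for n in sizes if n == 1),
        "mean": mean(sizes),
        "median": median(sizes),
    }


# Toy input: supplier A has 3 households, B and C have 1 each.
toy_summary = households_per_supplier_summary(["A", "A", "A", "B", "C"])

# Mean implied by the counts reported in the text.
reported_mean = 6150 / 408
```

A mean (about 15) well above the median (5) indicates a right-skewed distribution: a few large suppliers serve many of the households, while most suppliers serve only a handful.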

ESS coverage period
The ESS data we examined were raw data, i.e., data submitted directly by the suppliers, on household electricity consumption and expenditures from September 2008 to April 2010. The reference period of the 2009 RECS-HS is January to December 2009.
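As a quick arithmetic check, the September 2008 to April 2010 window is the twenty months that suppliers are required to report: the twelve reference months of 2009 plus four months on either side. A small sketch, with a helper name chosen only for illustration:

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Count calendar months from the start month through the end month, inclusive."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


ess_months = months_inclusive(2008, 9, 2010, 4)         # ESS coverage period
reference_months = months_inclusive(2009, 1, 2009, 12)  # RECS-HS reference period
lead_months = months_inclusive(2008, 9, 2008, 12)       # months before the reference year
lag_months = months_inclusive(2010, 1, 2010, 4)         # months after the reference year
```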

Data extraction from scanned electricity bills
An electricity bill for each selected sample case was extracted from our image data files, and its content was manually processed to produce consumption and expenditure values consistent with the ESS data requirements. This was a time-consuming process because bill formats are irregular and the numerical information we needed was specific (i.e., electricity consumption in kWh and only the cost and tax associated with it).

Matching ESS data and scanned bill data at one billing month
Since the energy bill was scanned by the interviewer at the time of the RECS-HS interview, the bill usually contains consumption and expenditure information, if any, only for a billing period preceding the interview. The interviews took place between February and August 2010. The ESS collected data, which are expected to span September 2008 to April 2010, and the RECS-HS scanned bill data can therefore be matched at only one billing period, within the overlapping months (February to April 2010). We might expect that any systematic data problems in the ESS collected data are likely to be present in all months for a given supplier. This comparison will therefore give some, if not complete, insight into the quality of the ESS data relative to the bill data.
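A minimal Python sketch of matching at a single billing period follows. The report does not describe its exact matching rule, so the rule used here (pick the ESS record whose billing period overlaps the scanned bill's period by the most days) is an assumption, as are the `BillingRecord` type, its field names, and the toy values.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass
class BillingRecord:
    """One billing period (hypothetical layout, not the ESS file format)."""
    start: date
    end: date
    kwh: float
    dollars: float


def overlap_days(a: BillingRecord, b: BillingRecord) -> int:
    """Number of calendar days shared by two billing periods (inclusive)."""
    latest_start = max(a.start, b.start)
    earliest_end = min(a.end, b.end)
    return max(0, (earliest_end - latest_start).days + 1)


def match_bill_to_ess(
    bill: BillingRecord, ess_records: Sequence[BillingRecord]
) -> Optional[BillingRecord]:
    """Return the ESS record with the largest overlap, or None if none overlaps."""
    best = max(ess_records, key=lambda r: overlap_days(bill, r), default=None)
    if best is None or overlap_days(bill, best) == 0:
        return None
    return best


def compare(bill: BillingRecord, ess: BillingRecord) -> dict:
    """Differences between supplier-reported and bill values (ESS minus bill)."""
    return {
        "kwh_diff": ess.kwh - bill.kwh,
        "dollar_diff": ess.dollars - bill.dollars,
    }


# Toy example: one scanned bill and two supplier-reported periods.
scanned_bill = BillingRecord(date(2010, 2, 10), date(2010, 3, 11), 850.0, 95.0)
ess_history = [
    BillingRecord(date(2010, 1, 10), date(2010, 2, 9), 910.0, 101.0),
    BillingRecord(date(2010, 2, 10), date(2010, 3, 11), 850.0, 95.5),
]
matched = match_bill_to_ess(scanned_bill, ess_history)
differences = compare(scanned_bill, matched)
```

In the toy example the consumption values agree while the expenditures differ by $0.50, the kind of discrepancy such a comparison is meant to surface. A scanned bill whose period falls entirely outside the ESS coverage yields no match and drops out of the comparison.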
