Assessment of consumption and expenditure data collected from energy suppliers against bill data obtained from interviewed households: Case study with 2009 RECS
February 15, 2013
The Residential Energy Consumption Survey (RECS) is a national area-probability sample survey that statistically selects housing units across the U.S. to collect energy-related data for occupied primary housing units. The survey consists mainly of two parts: the RECS Household Survey (RECS-HS) and the Energy Supplier Survey (ESS). In RECS-HS, respondents provide information about their household characteristics, energy-using equipment, fuels used, and other information related to energy use. In ESS, the energy suppliers are required to provide twenty months of energy consumption and expenditure data for the RECS-HS sample households.
In RECS-HS, households are asked to provide their energy supplier information via utility billing statements. Beginning with the 2005 RECS data collection cycle, the interviewers used portable devices to scan the respondents’ utility bills in order to gather the supplier names and account numbers more easily and accurately. The scanned bills contain more than supplier names and account numbers: they also show the energy consumption and expenditures for at least the current service period, and we use this information to check the numerical accuracy of the ESS collected data. By comparing these sources, we hope to learn more about any limitations in the data that we collect, which we can then attempt to address. As such, this limited empirical study is an example of the research that EIA conducts to evaluate and subsequently improve the quality of the data that it collects.
Research design
Population of interest
The 2009 RECS-HS has 12,083 cases. In this analysis, we compare the ESS collected data to the RECS-HS
scanned electricity bill data, as electricity is the most commonly and widely used energy source. The
population of interest is the 2009 RECS-HS interviewed households with scanned electricity bills, for
which energy suppliers provided consumption and expenditure data. The size of this analysis population
is 6,150. Note that the RECS-HS sampling weights have no bearing on this analysis and that we do not
generalize any findings to a wider population.
Data
Since the analysis population is too large to examine in its entirety, we limited ourselves to selecting one
household per supplier. There are 408 unique electricity suppliers found in the 6,150 scanned bills, and
we randomly selected one household from each supplier. This approach is based on our assumption
that the data quality does not vary much within each supplier, while the variation could be substantial
across suppliers. Although this assumption may not hold in practice, we decided to use this simple
approach for our initial study. There are other approaches that could have been used, such as
acceptance sampling or probability proportional to size sub-sampling, but those would have required
considerably more resources since many more cases would be needed.
In the analysis population, the maximum number of households linked to any one supplier is 325. The minimum is one, which is the case for 46 suppliers. The mean number of households per supplier is about 15; the median is 5.
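The selection step described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for the study: the function name `select_one_per_supplier`, the (household ID, supplier ID) input layout, the toy data, and the seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def select_one_per_supplier(households, seed=None):
    """Pick one household at random from each supplier.

    `households` is an iterable of (household_id, supplier_id) pairs.
    The layout and names are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)

    # Group household IDs by the supplier they are linked to.
    by_supplier = defaultdict(list)
    for household_id, supplier_id in households:
        by_supplier[supplier_id].append(household_id)

    # Sorting makes the draw reproducible for a given seed,
    # regardless of the order in which the input rows arrive.
    return {
        supplier_id: rng.choice(sorted(ids))
        for supplier_id, ids in sorted(by_supplier.items())
    }


# Hypothetical toy data: five households linked to three suppliers.
example = [("H1", "S1"), ("H2", "S1"), ("H3", "S2"), ("H4", "S3"), ("H5", "S1")]
selected = select_one_per_supplier(example, seed=2009)
print(selected)
```

With an equal-probability draw of this kind, a household linked to the largest supplier (325 households) has a 1-in-325 chance of selection, whereas a household that is the only one linked to its supplier is selected with certainty.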
ESS coverage period
The ESS data we examined were raw data, i.e., data as submitted directly by the supplier, on
household electricity consumption and expenditures from September 2008 to April 2010. The reference
period of the 2009 RECS-HS is from January to December 2009.
Data extraction from scanned electricity bills
An electricity bill for each selected sample case was extracted from our image data files and its content
was manually processed to produce consumption and expenditure values consistent with the ESS data
requirements. This was a time-consuming process because of the irregularity of the bills and the
specificity of numerical information we needed to collect (i.e., electricity consumption in kWh and only
its cost and tax).
Matching ESS data and scanned bill data at one billing month
Since the interviewer scanned the energy bill at the time of the RECS-HS interview, the bill
usually contains consumption and expenditure information, if any, only for some billing period
preceding the interview, and the interviews took place between February and August 2010. The ESS
collected data, which are expected to span the period from September 2008 to April 2010, and the
RECS-HS scanned bill data can be matched only at one particular billing period within the overlapping
months (February – April 2010). We expect that any systematic problems in the ESS collected data
are likely to persist across all months for a given supplier, so a comparison at a single billing
period should give some, if not complete, insight into the quality of ESS data relative to bill data.
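The matching logic can be illustrated with a short Python sketch. It assumes, purely for illustration, that each record carries a `billing_month` key stored as a (year, month) tuple together with `kwh` and `dollars` values; these field names, the function `compare_bill_to_ess`, and the sample figures are hypothetical and do not describe the actual ESS or bill data files.

```python
# Overlap between the ESS coverage period and the interview period,
# as (year, month) tuples: February through April 2010.
OVERLAP_START = (2010, 2)
OVERLAP_END = (2010, 4)


def compare_bill_to_ess(bill, ess_records):
    """Compare one scanned bill with the ESS record for the same billing month.

    Returns the differences (ESS minus bill), or None when the bill falls
    outside the overlapping months or no ESS record matches its month.
    """
    month = bill["billing_month"]
    if not (OVERLAP_START <= month <= OVERLAP_END):
        return None

    for record in ess_records:
        if record["billing_month"] == month:
            return {
                "billing_month": month,
                "kwh_diff": record["kwh"] - bill["kwh"],
                # Round to cents to avoid floating-point noise in dollar amounts.
                "dollars_diff": round(record["dollars"] - bill["dollars"], 2),
            }
    return None


# Hypothetical values, for illustration only.
bill = {"billing_month": (2010, 3), "kwh": 850, "dollars": 96.40}
ess_records = [
    {"billing_month": (2010, 2), "kwh": 900, "dollars": 101.10},
    {"billing_month": (2010, 3), "kwh": 850, "dollars": 98.15},
]
print(compare_bill_to_ess(bill, ess_records))
```

Matching on a calendar month is a simplification; billing periods on actual bills may not align with calendar months, in which case the match would need to rely on the service period dates instead.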