Residential Energy Consumption Survey (RECS)

2001 Survey Methods

Overview

The Residential Energy Consumption Survey (RECS):

- was designed by the Energy Information Administration (EIA) to provide information about energy consumption within the residential sector.
- was conducted in two major parts: the Household Survey and the Energy Suppliers Survey.
  - The Household Survey collected information about the housing unit through personal interviews with a representative national sample of households.
  - In the Energy Suppliers Survey, data concerning actual energy consumption were obtained from household billing records maintained by the energy suppliers. The data were collected by questionnaires mailed to all the suppliers for the households in the Household Survey.

Detailed tables based on results of the Household Survey are now available on the web.

The data collection contractor collected and processed the 2001 RECS data for EIA. The data were collected using the Household Survey Questionnaire (PDF format).

This section contains detailed information about the RECS sample design, Household Survey, Low Income Home Energy Assistance Program (LIHEAP) Supplemental Sample and confidentiality of the survey information.

Sample Design
RECS Household Survey
Conducting the Interviews
Data Collection Procedures
Data Editing
Special Supplemental Survey of LIHEAP Recipients
Confidentiality of Information

Sample Design

Overview

The sample design for the 2001 RECS was based on the design used for the 1993 RECS, which was a multistage area probability design.
The universe for this sample design includes all housing units occupied as the primary residence in the 50 States and the District of Columbia.
The RECS does not cover vacant housing units, seasonal units, or second homes.
Households on military installations are included.
The definition of household is the same as that used by the U.S. Bureau of the Census.
In RECS, by definition, the number of households is the same as the number of occupied primary housing units and these terms are used interchangeably.
The universe was estimated to contain 106,989,000 households based on extrapolations from Current Population Survey (CPS) estimates at the time of the 2001 RECS (July 2001). The household definition excludes group quarters such as military barracks, dormitories, and nursing homes, which are considered to be out-of-scope.
The overall plan for the 2001 RECS included a basic sample of approximately 5,000 completed household interviews, plus a supplemental sample totaling approximately 500 completed interviews.
The basic sample was designed to represent the total population of households in the United States, with specified levels of precision for each of the nine geographically defined Census divisions.
The supplemental sample, included in the plan to meet special analytical needs, was designed to provide a representative sample of households receiving energy assistance.

Multistage Area Probability Sample

In a multistage area probability sample design, the universe is broken up into successively smaller, statistically selected areas. The process starts with the selection of primary sampling units (PSUs) and ends with the selection of individual households.

Primary Sampling Units (PSUs)

PSUs are either metropolitan areas containing a central city of 50,000 or larger population, or they are counties or groups of counties containing small cities and rural areas. In the sample design used for the 2001 RECS, the total land area of the 50 States and the District of Columbia was divided into 1,786 PSUs. These PSUs were based on county and independent city boundaries and on Metropolitan Statistical Areas (MSAs) as defined in June 1990. The primary mode of stratification of PSUs was by the nine Census divisions. Strata were separately defined within Census divisions for four populous States (California, Florida, New York, and Texas) and for two States with unique weather conditions (Alaska and Hawaii). Stratification was also based on MSA or non-MSA status of PSUs and, to the extent feasible, on dominant residential space-heating fuel and weather conditions. PSUs were grouped into 116 strata, with one PSU selected from each stratum. The PSUs that were selected for the 1993 RECS were also used for the 2001 RECS.

Secondary Sampling Units (SSUs)

A number of SSUs, usually eight or more, were selected in each PSU. SSUs consisted of one or more Census blocks, selected directly from Census statistics. Blocks were combined, as necessary, to create SSUs that contained at least 50 housing units. SSUs that contained very large numbers of housing units were divided into smaller listing segments and one listing segment was selected for detailed address listing.
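The block-combining rule can be illustrated with a minimal sketch. The text does not say how blocks were chosen for combination, so the greedy merge of consecutive blocks below is an assumption, and the block identifiers and housing-unit counts are hypothetical.

```python
def combine_blocks(block_units, min_units=50):
    """Greedily merge consecutive blocks until each group has at least
    min_units housing units (assumed rule; the source gives only the minimum)."""
    groups, current, total = [], [], 0
    for block_id, units in block_units:
        current.append(block_id)
        total += units
        if total >= min_units:
            groups.append((current, total))
            current, total = [], 0
    if current:
        # Leftover blocks fall short of the minimum: fold them into the last group.
        if groups:
            last_ids, last_total = groups.pop()
            groups.append((last_ids + current, last_total + total))
        else:
            groups.append((current, total))
    return groups


# Hypothetical Census blocks with housing-unit counts.
blocks = [("B1", 20), ("B2", 35), ("B3", 60), ("B4", 10), ("B5", 15)]
ssus = combine_blocks(blocks)
for ids, units in ssus:
    print(ids, units)
```

Every resulting group meets the 50-unit minimum whenever the blocks together contain at least 50 housing units.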

New Construction Canvass

The starting point for the SSU new construction update procedure was the set of SSUs selected for the 1997 RECS. The first step was to expand the 1997 SSUs to include selected blocks in the same PSU, creating groups of blocks with at least 400 housing units. A new construction update procedure was used to determine if significant new construction--defined as groups of 50 or more housing units--had occurred within the expanded SSUs since 1997. This was based on a canvass, primarily by telephone, of local sources of information, such as building-permit-issuing agencies, zoning boards, and tax offices. If no significant new construction had occurred, the SSU selected for the 1997 RECS was used for the 2001 RECS. If significant new construction had occurred, rough counts of the number of housing units by block were obtained for the expanded SSU, the expanded SSU was divided into segments, and a segment was selected. The selected segment was then used as the SSU for the 2001 RECS.

The detailed field listings of all housing units in the 2001 RECS SSUs were either carried over from the 1997 RECS or were created by field workers who visited the SSUs and identified each housing unit by street address, apartment number, or other obvious features.

Addresses of these housing units were placed in a database used for actual sample selection.

Sample Selection--Ultimate Clusters

Specific addresses chosen from each of the field listings comprised the ultimate clusters of the 2001 RECS sample. An ultimate cluster of housing units to be contacted for interview (averaging over four housing units for the 2001 RECS) was randomly selected by computer from the penultimate cluster; these housing units constituted the assignments given to interviewers.

Population of Special Interest

The 2001 survey featured a supplemental sample of LIHEAP recipients designed to be merged with the main RECS sample and to meet special analytical needs of the Office of Family Assistance, Family Support Administration (FSA), U.S. Department of Health and Human Services. The FSA is interested in households living below the poverty level. The initial RECS housing characteristics report will use only the households selected as part of the area probability sample.

RECS Household Survey

A complete RECS interview consists of data for a completed household questionnaire and a signed Authorization Form. The large majority of interviews were completed via a Computer Assisted Personal Interviewing (CAPI) system. EIA personnel programmed the survey instrument in the Blaise software system. The paper version of the survey instrument can be found in Form EIA-457A, "Household Questionnaire." When technical problems were encountered, interviews were completed using the paper version of the questionnaire. At the end of each interview, the household respondent was asked to sign an Authorization Form. The signed Authorization Form gave permission for EIA to obtain the housing unit's energy bills from each fuel supplier provided by the respondent during the course of the interview. The case management system employed was SurveyTrak, which was developed by the University of Michigan.

A total of 7,037 housing units were selected to participate in the 2001 RECS.
Completed interviews were obtained for 4,822 eligible households.
This section describes the procedures involved in collecting the completed interviews.

Conducting the Interviews

Interviewer Training

To accommodate interviewers from various parts of the country, two main training sessions were held, each 4 1/2 days long. The first session was held March 21 through March 25, 2001, in Atlanta, Georgia; and the second session, March 28 through April 1, 2001, in Grapevine, Texas. Only interviewers with little or no CAPI experience attended the first day of each of these sessions. To augment the staff lost to attrition and to provide extra coverage for low-response-rate areas, a third training session was held June 7 through June 10, 2001. Because all the interviewers attending this training were experienced with CAPI administration, this session was 3 1/2 days in length. Approximately 190 interviewers were in attendance across all sessions (80 at each of the main trainings and 30 at the supplemental training). Each session was led by a group of trainers who had attended a 3-day workshop in Princeton, New Jersey. Department of Energy staff, who also participated in the training, monitored all training sessions.

The Interviewers

A total of 158 interviewers completed one or more personal interviews for this study. Sixty-one interviewers (39 percent) had completed interviews during a prior RECS. The remainder were conducting their first RECS, but had prior interviewing experience, either with other survey research organizations or with the U.S. Bureau of the Census.

Interviewers conducted an average of 29 interviews. Forty interviewers completed fewer than ten interviews each, with an average of five per interviewer. Twenty-five interviewers completed 50 or more interviews each, with an average of 72 per interviewer. Thirty-two percent of the personal interviews were verified by telephone or mail to ensure that interviews were conducted as intended.

The Interview

Household interviews were conducted with the householder or the householder's partner. The questions covered energy-related features of the household, such as the type of heating and cooling systems; the fuels used for heating and cooling; household appliances and their usage; the receipt of government assistance for the cost of heating and cooling; and demographic data on household members. Interviewers also asked permission to measure the household's living quarters.

Data Collection Procedures

Multiwave, Multicontact Approach

In an effort to minimize nonresponse and, therefore, maximize the validity of the survey data, a multiwave, multicontact approach was employed. Before the initial personal contacts, a letter stressing the purpose and importance of the survey was sent to each household with a street address. Beginning in late March 2001, interviewers made several callbacks at different times of the day and different days of the week in an effort to minimize the number of uncontacted households. The interviewers also queried neighbors regarding the most opportune times to contact the prospective respondent.

After initial attempts to complete interviews at the selected housing units were exhausted, field supervisors determined which cases would be reassigned to another interviewer. Types of non-interviewed households that were reassigned included cases where the householder refused to participate and cases where the householder was not available or not at home. Types of non-interview households that were not reassigned included cases where the householder would be unable to complete an interview during the field period due to absence or illness and cases where the household had moved after the initial contact. Reassignments continued throughout the field period.

Data Collection Period

The data collection field period lasted 6 months. Approximately three-quarters of the personal interviews were completed from the end of March through the first week of July 2001 (15 weeks). Ninety-nine percent of the personal interviews were completed by September 29, 2001. In a few sample locations with low response rates, interviewing continued through October 7, 2001.

Mail Questionnaire

In late August, mail follow-up attempts were made at households that had not completed a personal interview. An abbreviated, self-administered version of the questionnaire was mailed to 1,991 of these households with a letter asking that they return the completed questionnaire in the business reply envelope provided. The mailing also included a copy of the Authorization Form for the respondents to sign. A mail questionnaire was considered usable if the respondent had completed the majority of the questionnaire and the Authorization Form was signed. A total of 241 usable mail questionnaires were returned by the end of October 2001.

Households to which a mail questionnaire was sent were given the option of completing the questionnaire via the Internet. Seven sample households responded via this mode.

Authorization Form Follow-up

A follow-up contact was attempted with all respondents who completed a personal or mail interview and reported paying for at least one fuel but did not sign an Authorization Form. Sixty-four additional forms were obtained through this effort.

Overall Response Rate

After all data collection attempts (both the personal interview and the mail questionnaire), 4,822 households completed a personal interview, 241 households completed a mail questionnaire, and seven households responded via the Internet.
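The counts above can be tallied as follows. This is only an arithmetic sketch: the ratio of completed cases to selected housing units is a raw figure, not an official response rate, because the text does not state how many of the 7,037 selected units were ineligible. The variable names are illustrative.

```python
# Counts reported in the text for the 2001 RECS main sample.
completed_by_mode = {
    "personal_interview": 4822,
    "mail_questionnaire": 241,
    "internet": 7,
}
housing_units_selected = 7037    # includes units later found ineligible
mail_questionnaires_sent = 1991

total_completed = sum(completed_by_mode.values())

# Raw ratio only: not adjusted for vacant or otherwise ineligible units.
raw_completion_ratio = total_completed / housing_units_selected

# Usable mail questionnaires returned per questionnaire mailed.
mail_return_ratio = completed_by_mode["mail_questionnaire"] / mail_questionnaires_sent

print(f"Total completed cases: {total_completed}")
print(f"Raw completion ratio:  {raw_completion_ratio:.3f}")
print(f"Mail return ratio:     {mail_return_ratio:.3f}")
```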

Response Rates and Household Characteristics

Various response and non-response rates will be compared across Census region, urban status, and housing structure type later in the analysis stage.

Data Editing

Data for completed interviews were transmitted to the data collection contractor's headquarters via modem using the SurveyTrak sample management system. All completed interviews were combined into one Blaise database for further processing. All paperwork, which included the Authorization Form, Measurement Booklet, and Housing Unit Address Lists, was mailed to the data collection contractor's headquarters. Contact information from the Blaise interview, the Housing Unit Record from the SurveyTrak sample management system, and the Authorization Form were reviewed to ensure that each had been completed correctly and that the correct housing unit had been interviewed.

Checks and edits were programmed into the Blaise data capture instrument for the Household Questionnaire to reduce the amount of missing or invalid responses. In addition, to ensure the overall quality of the data, checks were programmed in SAS and used to identify range, logic, and consistency problems.

The data collection contractor attempted to resolve inconsistencies or ambiguities in the data by referencing interviewer notes and other parts of the questionnaire. When these efforts failed to resolve important problems, particularly those involving heating fuels or heating equipment and/or relationships between questionnaire responses, a follow-up telephone contact was made with the rental agent or with a member of the household in question.

Rental Agent Survey

In addition, 401 follow-up contacts were attempted with rental agents, landlords, and apartment managers for households that did not pay for their fuels and that rented their living quarters or owned and occupied living quarters in a condominium or cooperative building or community. Respondents to this survey were asked about the heating, water heating, and cooking fuels used by tenant households; the household's heating equipment; and method of bill payment (i.e., included in rent or paid by household).

The interviews with rental agents or their representatives were conducted in November 2001. Altogether, 57 landlords or rental agents were interviewed; these interviews covered 127 households. Comparisons were made between rental agents' and household respondents' reports on their main space-heating and water-heating fuels; main space-heating equipment; fuel for cooking range; and how the fuels for all of these uses are paid. Each discrepancy was examined, and changes were made to the household data whenever it was judged that the rental agent was more knowledgeable than the household respondent on the different items of information.

Generally, the person who paid for a specific fuel for a specific use was deemed the most knowledgeable person. However, error resolutions were made only after careful examination and consideration of all available sources of information including the rental-agent questionnaire, the household questionnaire, and questionnaires of other households located in the same building. Landlords and rental agents were usually judged to be more knowledgeable about the type of main heating equipment; household respondents were typically deemed more reliable sources concerning fuel for cooking range.

Special Supplemental Survey of Low-Income Home Energy Assistance Program Recipients

LIHEAP (Low-Income Home Energy Assistance Program) is a federally funded program to help eligible low-income households meet their home heating and/or cooling needs.
The Administration for Children and Families (ACF), within the U.S. Department of Health and Human Services, is responsible for Federal programs that promote the economic and social well-being of families, children, individuals, and communities, and also administers LIHEAP at the federal level.
ACF oversees and finances a broad range of activities in partnership with State, local and tribal governmental agencies. These agencies provide direct services and assistance to children, youth, families, persons with developmental disabilities, refugees, migrants, Native Americans, legalized aliens, and others eligible to receive help under ACF legislative authorities.
ACF also conducts research, collects and analyzes data, prepares budget documents and reports to Congress, issues regulations and policy materials, publishes various technical assistance reports, and develops the Annual Government Performance and Results Act (GPRA) Plan.

Overview

The supplemental sample included 840 LIHEAP-recipient households, with a goal of completing 500 interviews. The purpose of this supplemental sample was to obtain statistically reliable home energy estimates for LIHEAP-recipient households. Sample management protocol differed from the main sample, but the same survey instrument was administered to both samples.

Of the 840 households selected to participate in the LIHEAP Supplement, completed interviews were obtained for 497 households. This section describes the sample plan and the procedures involved in collecting the completed interviews.

Sample Plan

The LIHEAP sample was restricted to 59 PSUs (primary sampling units) out of the 116 PSUs in the RECS sample, with a cluster size of three expected completes, and 2.8 clusters per PSU on average.
The criterion used to determine the allocation of the targeted 500 LIHEAP sample cases across the nine Census Divisions was the number of LIHEAP heating recipients in each division according to administrative statistics.
This allows more reliable estimates of fuel usage in cells with small sample sizes.
Strata were allocated to divisions in the same proportion as the interviews.
PSUs were combined to form strata based on a set of priority rules and then selected with probability proportionate to the size of the stratum they represented.
Using the number of 1990 Census households, clusters were allocated to each stratum by multiplying the total number of clusters allocated to the division by the percentage of total population in the division represented by the stratum.
Using the list of LIHEAP recipients within PSU counties provided by the state or local LIHEAP offices, clusters were selected based on the number of LIHEAP recipients in the ZIP Code. Households were then randomly selected from each of the selected ZIP Codes, resulting in a total of 810 households selected.
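The proportional allocation of clusters to strata described above can be sketched as follows. The stratum names, household counts, and division cluster total are hypothetical, and the text does not say how fractional clusters were rounded, so the sketch leaves the allocations unrounded.

```python
def allocate_clusters(division_clusters, stratum_households):
    """Allocate a division's clusters to its strata in proportion to each
    stratum's share of the division's 1990 Census households."""
    division_total = sum(stratum_households.values())
    return {
        stratum: division_clusters * households / division_total
        for stratum, households in stratum_households.items()
    }


# Hypothetical division with three strata (1990 Census household counts).
stratum_households = {"stratum_1": 1_200_000, "stratum_2": 800_000, "stratum_3": 500_000}
allocation = allocate_clusters(division_clusters=20, stratum_households=stratum_households)
for stratum, clusters in allocation.items():
    print(stratum, round(clusters, 2))

# Design parameters quoted in the text: 59 PSUs, 2.8 clusters per PSU on
# average, and three expected completes per cluster.
expected_completes = 59 * 2.8 * 3
print(round(expected_completes))
```

The product of the quoted design parameters comes out close to the target of 500 completed interviews.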

Interviewer Training

A subset of the 2001 RECS interviewing staff was trained via conference call on the specific requirements of the LIHEAP Supplement. Such training was necessary because of the differences in the sample management aspects of the study between the main survey and the supplemental sample. The field supervisory staff to whom the interviewers were assigned for the 2001 RECS conducted the training session.

The Interviewers

A total of 36 interviewers completed 1 or more personal interviews for the LIHEAP Supplement. Interviewers conducted an average of 12 interviews. Four interviewers completed fewer than five interviews each, with an average of three per interviewer. Six interviewers completed 20 or more interviews each, with an average of 24 per interviewer. A percentage of all interviewers working on the supplemental sample had passed verification procedures during the main RECS survey. However, verification of supplemental-sample interviews did continue at reduced levels by mail.

The Interview

Unlike the main sample, where the interview was conducted with the current occupant of an eligible housing unit, interviewers for the LIHEAP Supplement were directed to interview a household member (the householder or the householder's partner or spouse) at a designated housing unit. If the family no longer lived at that address, the interview was not conducted. Lists of LIHEAP households were obtained from individual State LIHEAP offices.

Other than the differences in sample management and the identification of the respondent to be interviewed, the same materials and protocol as those used in the main study were employed.

Data Collection Dates

The data collection field period for the LIHEAP supplement began July 20, 2001, and was completed in December. Each PSU was given a minimum of 4 weeks for completion. There was no mail questionnaire follow-up for sample households in the LIHEAP Supplement who did not complete a personal interview.

Data Collection Procedures

Before the initial personal contacts, a letter stressing the purpose and importance of the survey was sent to each household with a street address. For households with only Post Office box addresses, letters were mailed to those addresses asking the sample members to call to arrange an interview.

Beginning in mid-July 2001, interviewers made several callbacks at different times of the day and different days of the week in an effort to minimize the number of uncontacted households. The interviewers also queried neighbors regarding the most opportune times to contact the prospective respondent. For the most part, a single interviewer was responsible for a PSU. If that interviewer was unable to achieve the targeted goal for the PSU, an interviewer traveled from another PSU to complete the assignment.

Response Rates and Household Characteristics

Various response and non-response rates will be compared later in the analysis stage.

Data Editing

Data for completed interviews in the LIHEAP Supplement were subjected to the same data editing steps as those cases in the main sample.

Confidentiality of Information

EIA does not receive or take possession of the names or addresses of individual respondents or any other individually identifiable energy data that could be specifically linked with a household respondent; the data are collected for statistical purposes only.
All names and addresses and identifiable information are maintained by the data collection contractor for verification purposes only.
The household records that are placed on the public-use data file do not have name or address information.
Additional measures have been taken to mask the data for further confidentiality protection. Unlike other EIA surveys, the consumption surveys pledge confidentiality to their respondents.

2001 Data Quality

All the statistics published in the RECS tables are estimates of population values, such as the number of households using natural gas. These estimates are based on a randomly chosen subset of the entire population of households. The universe includes all households in the 50 States and the District of Columbia, including households on military installations.

The differences between the estimated values and the actual population values are due to two types of errors: sampling errors and nonsampling errors.

Sampling errors are random differences between the survey estimate and the population value that occur because the survey estimate is calculated from a randomly chosen subset of the entire population. The sampling error, averaged over all possible samples, would be zero, but since there is only one sample for the 2001 RECS, the sampling error is nonzero and unknown for the particular sample chosen. However, the sample design permits sampling errors to be estimated. The section, Estimation of Sampling Error, provides details on calculation of the sampling errors for the 2001 RECS.
Nonsampling errors are related to sources of variability that originate apart from the sampling process and are expected to occur in all possible samples or in the average of all estimates from all possible samples. Nonsampling errors can result from:
- Inaccuracies in data collection: due to questionnaire design errors, interviewer error, respondent misunderstanding, and data processing error;
- Unit nonresponse: when an entire sampled household does not participate in the survey; and
- Item nonresponse: when a particular item of interest is missing from a completed questionnaire.

Adjustments for Unit Nonresponse

Weight adjustment was used to reduce unit nonresponse bias in the survey statistics. Weights were calculated for each sample household. The household weight reflected the selection probability for that household and additional adjustments. These adjustments included correcting for potential biases arising from the failure to list all housing units in the sample area and failure to contact all sample housing units.

Six factors are used in the processing of Residential Energy Consumption Survey (RECS) results to develop an overall weight for each household for which a completed questionnaire, either a personal interview or mailed questionnaire, is obtained.

The factors are:

The basic weight
A noninterview adjustment
A first-stage ratio estimate
Three second-stage ratio adjustments

The overall household weight is the product of these six factors.
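As a minimal sketch of this product, the snippet below multiplies six factors for a single household. All factor values and key names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical factor values for one responding household.
factors = {
    "basic_weight": 20000.0,
    "noninterview_adjustment": 1.25,
    "first_stage_ratio_adjustment": 0.98,
    "second_stage_adjustment_step_1": 1.03,
    "second_stage_adjustment_step_2": 0.99,
    "second_stage_adjustment_step_3": 1.01,
}

# The overall household weight is the product of the six factors.
overall_weight = math.prod(factors.values())
print(f"Overall household weight: {overall_weight:,.1f}")
```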

The Basic Weight

The basic weight is calculated and applied to households at the Secondary Sampling Unit (SSU) level.

Basic Weight = 1 / (Probability of Selection)

For the 2001 RECS, all households in the same SSU had the same probability of selection and hence the same basic weight.
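A short sketch of the basic weight follows. It assumes the overall selection probability is the product of the conditional selection probabilities at each sampling stage, which is the usual case in a multistage design; the stage probabilities shown are hypothetical.

```python
import math


def basic_weight(stage_probabilities):
    """Return 1 / (overall probability of selection), taking the overall
    probability as the product of the per-stage selection probabilities."""
    overall_probability = math.prod(stage_probabilities)
    if not 0 < overall_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / overall_probability


# Hypothetical probabilities: PSU, SSU within PSU, housing unit within SSU.
weight = basic_weight([0.1, 0.02, 0.05])
print(round(weight))
```

Because every household in an SSU shares the same stage probabilities, the function needs to be evaluated only once per SSU.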

The Noninterview Adjustment

The noninterview adjustment factor (NIAF) compensates for nonresponse households and for nonhousehold units that were identified during the survey. Basically, this adjustment reflects the ratio of the number of completed and uncompleted responses among those selected to the number of completed responses. Since the probabilities of selection are constant within an SSU for 2001, these adjustments were applied at the SSU level.

The NIAF is computed at the SSU and is equal to:

(Total Completed Plus Uncompleted Responses in the SSU) / (Completed Responses in the SSU)

If the ratio exceeds 2.0, then the NIAF is set equal to 2.0 and the NIAFs for SSUs in the same Primary Sampling Unit (PSU) and with the same metropolitan status are increased.
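The NIAF formula and its cap can be sketched as below. The text does not specify how the NIAFs of neighboring SSUs are increased when the cap applies, so this sketch only flags capped SSUs and leaves that redistribution out. The counts are hypothetical.

```python
def noninterview_adjustment(completed, uncompleted, cap=2.0):
    """Return (NIAF, capped) for one SSU.

    NIAF = (completed + uncompleted) / completed, limited to `cap`.
    `capped` is True when the raw ratio exceeded the cap, which signals that
    other SSUs in the same PSU and metropolitan status need a larger NIAF
    (that redistribution rule is not given in the text and is not implemented).
    """
    if completed <= 0:
        raise ValueError("an SSU needs at least one completed response")
    raw_ratio = (completed + uncompleted) / completed
    return min(raw_ratio, cap), raw_ratio > cap


# Hypothetical SSUs: (completed responses, uncompleted responses).
print(noninterview_adjustment(8, 2))   # ratio below the cap
print(noninterview_adjustment(3, 5))   # ratio above the cap
```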

The First-Stage Ratio Adjustment Factors

The primary purpose of the first-stage adjustment factor is to reduce the sampling variation in the estimates of the number of housing units by main space-heating fuel resulting from sampling of PSUs during the first stage of the sample design. The correlation between main space-heating fuel and other important energy-related characteristics implies that this adjustment will also reduce the sampling variation for many important variables collected for the RECS.

In some cases, a single PSU comprising all or part of a large metropolitan area was large enough in population to be a stratum by itself. PSUs of this type are called Self-Representing (SR) PSUs because the sample from each SR PSU represents only that PSU. The first-stage ratio adjustment factor was 1.0 for all observations in SR PSUs.

In other strata, one PSU was selected from among two or more PSUs in the stratum. Each of the PSUs selected from these strata is called a Non-Self-Representing (NSR) PSU because each such PSU represents not only itself; it also represents the unselected PSUs in the stratum.

The 1990 Census data were used to determine the difference between the distribution of the main space-heating fuel in the set of selected NSR PSUs and the distribution in the set of all PSUs (selected and unselected) in the strata from which the NSR PSUs are selected. Fuels are under-represented if the percentage of households using the fuel is lower in the selected NSR PSUs than the percentage in the set of all PSUs in the NSR strata. Fuels are over-represented if the opposite occurs. The weights for the responding households in NSR PSUs are adjusted upward when their main space-heating fuel is under-represented and the weights are adjusted downward when it is over-represented.
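The direction of this adjustment can be illustrated with a sketch. The text does not give the exact formula, so the factor used here, a fuel's household share in all PSUs of the NSR strata divided by its share in the selected NSR PSUs, is an assumption. The household counts are hypothetical.

```python
def first_stage_factors(all_psu_households, selected_psu_households):
    """Assumed form of the first-stage ratio adjustment, by main heating fuel:
    (fuel share in all PSUs of the NSR strata) / (fuel share in selected NSR PSUs).
    A factor above 1 raises weights for an under-represented fuel."""
    all_total = sum(all_psu_households.values())
    selected_total = sum(selected_psu_households.values())
    factors = {}
    for fuel, households in all_psu_households.items():
        share_all = households / all_total
        share_selected = selected_psu_households[fuel] / selected_total
        factors[fuel] = share_all / share_selected
    return factors


# Hypothetical 1990 Census household counts (thousands) by main heating fuel.
all_psus = {"natural_gas": 600, "electricity": 300, "fuel_oil": 100}
selected_psus = {"natural_gas": 50, "electricity": 40, "fuel_oil": 10}

factors = first_stage_factors(all_psus, selected_psus)
for fuel, factor in factors.items():
    print(fuel, round(factor, 3))
```

In this example natural gas is under-represented in the selected PSUs, so its factor exceeds 1, while electricity is over-represented and its factor falls below 1.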

The Second-Stage Ratio Adjustments

The second-stage ratio adjustments are used to improve the accuracy of the estimates of the number of households using data obtained from the Bureau of the Census as control totals. The RECS can be used to produce an estimate of the number of households in the country, but the Bureau of the Census produces much more accurate estimates. Improving the accuracy of the data on the number of households also improves the accuracy of almost all other estimates obtained from the RECS. The first priority is the accuracy of estimates for the number of households for the nine Census divisions and for the four largest States. The second priority is the accuracy of estimates for the number of households for three demographic cells (multiperson households, single-member female households, and single-member male households).

The ratio adjustment process was carried out in three steps:

Step One, the population was divided into 15 geographical cells. (Hawaii and Alaska were treated as separate cells because their climate differs from that of the rest of the country.) Control totals giving the number of households in each cell were derived from Current Population Survey results. A ratio adjustment equal to the control total divided by the weighted count using the weights after the first-stage ratio adjustment was created. Multiplying the weights after the first-stage ratio adjustment by the ratio yields the new weights which, when summed, equal the control totals for the 15 cells. This calculation yielded a weighted total number of households equal to 106,990,000, the sum of the control totals in Table B1. Refer to Table B1 for estimates for each of the 15 geographical areas.
Step Two, control totals are used for the three demographic cells (multiperson households, single-member female households, and single-member male households), with the weights from the first step as input.
Step Three, the same as the first step except that the input weights are those resulting from the second step. This produced a set of weights whose sum reproduced the 15 geographic cell control totals and yielded estimates that are quite close to the control totals for the three demographic cells.
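The three steps can be sketched with a toy data set. The four households, two geographic cells, and control totals below are hypothetical (the RECS used 15 geographic cells), and the helper name is illustrative.

```python
def ratio_adjust(weights, cells, control_totals):
    """Scale weights so that their sum within each cell equals the control total."""
    weighted_counts = {}
    for weight, cell in zip(weights, cells):
        weighted_counts[cell] = weighted_counts.get(cell, 0.0) + weight
    return [
        weight * control_totals[cell] / weighted_counts[cell]
        for weight, cell in zip(weights, cells)
    ]


def cell_sums(weights, cells):
    sums = {}
    for weight, cell in zip(weights, cells):
        sums[cell] = sums.get(cell, 0.0) + weight
    return sums


# Hypothetical households: geographic cell, demographic cell, input weight.
geo_cells = ["A", "A", "B", "B"]
demo_cells = ["multiperson", "single_female", "multiperson", "single_male"]
input_weights = [100.0, 100.0, 100.0, 100.0]

# Hypothetical control totals; both sets sum to the same household count.
geo_totals = {"A": 250.0, "B": 150.0}
demo_totals = {"multiperson": 240.0, "single_female": 90.0, "single_male": 70.0}

step_one = ratio_adjust(input_weights, geo_cells, geo_totals)   # geographic cells
step_two = ratio_adjust(step_one, demo_cells, demo_totals)      # demographic cells
step_three = ratio_adjust(step_two, geo_cells, geo_totals)      # geographic cells again

print(cell_sums(step_three, geo_cells))    # matches geo_totals
print(cell_sums(step_three, demo_cells))   # approximates demo_totals
```

Ending on the geographic step means the geographic control totals are reproduced exactly, while the demographic totals are matched only approximately.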

Table B1. Control Totals for Ratio Adjustment of Sampling in the 2001 RECS
Location	Thousands of Households
New England	5,407
Middle Atlantic (minus New York State)	7,766
East North Central	17,091
West North Central	7,400
South Atlantic (minus Florida)	13,986
East South Central	6,818
West South Central (minus Texas)	4,133
Mountain	6,725
Pacific (minus Alaska, California, and Hawaii)	3,603
New York	7,081
Florida	6,328
Texas	7,669
California	12,347
Alaska	227
Hawaii	409
Total United States	106,990
Source: EIA's linear extrapolation from U.S. Bureau of the Census, 2000 and 2001 Current Population Survey.

Adjustments for Item Nonresponse

Item nonresponse occurs:

When respondents do not know the answer or refuse to answer a question or
When an interviewer does not ask a question or does not record an answer.

The incidence of the latter, the interviewer not asking and/or not recording the answer, was greatly reduced by the use of Computer Assisted Personal Interviewing (CAPI). The majority of nonresponse was due to interviewers recording answers of “don’t know” and “refused.”

Methods of Imputation

Values were imputed for missing question items in otherwise completed RECS Household Questionnaires. Two imputation methods were used in the 2001 RECS: “hot-deck” and “deductive” procedures.

Hot-Deck: This procedure requires sorting the file of household data by variables related to the missing item. A household is then selected that has the same values of the related variables, and this “donor” household supplies its value of the missing variable to the “donee” household. For example, a six-room, two-full-bathroom, single-family detached housing unit with a household size of three members and a household income of $50,000 per year would be the donor household for a similar housing unit, the donee, that has all the same characteristics but is missing a value for the annual household income item.
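
A hot-deck step of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run for the RECS: the field names are hypothetical, and because the text does not say how a donor is chosen among several matching households, the sketch picks one at random with a fixed seed.

```python
import random

def hot_deck(records, item, match_vars, seed=0):
    """Fill a missing item from a donor that matches on all match_vars.

    Donor selection among several matches is random here (an assumption).
    Records with no matching donor are left missing.
    """
    rng = random.Random(seed)
    donors = {}
    for rec in records:
        if rec.get(item) is not None:
            key = tuple(rec[v] for v in match_vars)
            donors.setdefault(key, []).append(rec[item])

    result = []
    for rec in records:
        rec = dict(rec)  # copy so the input is not modified
        if rec.get(item) is None:
            key = tuple(rec[v] for v in match_vars)
            if key in donors:
                rec[item] = rng.choice(donors[key])
                rec[item + "_imputed"] = True
        result.append(rec)
    return result

# Hypothetical records modeled on the example in the text.
match_vars = ["rooms", "full_baths", "unit_type", "household_size"]
households = [
    {"rooms": 6, "full_baths": 2, "unit_type": "single-family detached",
     "household_size": 3, "income": 50000},   # donor
    {"rooms": 6, "full_baths": 2, "unit_type": "single-family detached",
     "household_size": 3, "income": None},    # donee
    {"rooms": 3, "full_baths": 1, "unit_type": "mobile home",
     "household_size": 1, "income": None},    # no matching donor
]
imputed = hot_deck(households, "income", match_vars)
```

Flagging imputed values, as the sketch does with the `income_imputed` field, keeps reported and imputed responses distinguishable in later analysis.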

Deductive: This procedure uses information available from the RECS questionnaire, or from related external data sources such as utility bills and the Rental Agent Survey, that permits a value for the missing item to be logically deduced. For example, if a respondent reports that they do not use their air-conditioning at all, a value of zero can be logically deduced for a missing response to the question on the number of rooms in the home that are air-conditioned.
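
The air-conditioning example can be written as a single deductive rule. The field names below are hypothetical and the rule is only a sketch of the idea; the actual edit specifications are not given in this document.

```python
def deduce_rooms_air_conditioned(record):
    """Set a missing room count to 0 when air-conditioning is reported unused."""
    rec = dict(record)  # copy so the input is not modified
    if (rec.get("rooms_air_conditioned") is None
            and rec.get("uses_air_conditioning") is False):
        rec["rooms_air_conditioned"] = 0
        rec["rooms_air_conditioned_imputed"] = True
    return rec

# Hypothetical responses.
unused_ac = deduce_rooms_air_conditioned(
    {"uses_air_conditioning": False, "rooms_air_conditioned": None})
used_ac = deduce_rooms_air_conditioned(
    {"uses_air_conditioning": True, "rooms_air_conditioned": None})
```

When air-conditioning is in use, nothing can be deduced and the item stays missing for another method to handle.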

Most Frequently Imputed Household Questionnaire Items

Table B2 presents the household questionnaire items most frequently imputed in the 2001 RECS. In addition to these 13 items, there were 126 other questionnaire items for which responses were imputed. Of these 126 items, 34 involved imputations for 11 to 45 survey cases (between 0.23 percent and 0.93 percent of the sample), and 92 involved imputations for 10 or fewer cases (less than 0.23 percent).

Table B2. Household Questionnaire Items Most Frequently Imputed in the 2001 RECS
Imputed Item	Cases Imputed	Percentage of Total Sample	Imputation Method	Question Number
Household Income in past 12 months	487	10.1	Hot Deck	J-14
Year housing unit was built	483	10.0	Hot Deck	A-15
Were all rooms heated last winter	189	3.9	Deductive/Hot Deck	D-10
Number of rooms not heated last winter	189	3.9	Deductive/Hot Deck	D-10a
Natural gas available in neighborhood	186	3.9	Deductive/Hot Deck	A-17
Use a separate built-in range top or burners	117	2.4	Deductive	B-1
Range top or burners fuel	115	2.4	Deductive	B-1b1/B-1d
Use a separate built-in oven	106	2.2	Deductive	B-1
Oven fuel	104	2.2	Deductive	B-1b2/B-1f
Fuel used to heat hot water	88	1.8	Hot Deck	E-1
Amount of heat provided by main space heating equipment	60	1.2	Hot Deck	D-6
Entire housing unit air conditioned	55	1.1	Deductive	F-3
Number of rooms air conditioned	51	1.1	Deductive	F-3a
Source: Energy Information Administration, Office of Energy Markets and End Use, Form EIA-457A of the 2001 Residential Energy Consumption Survey (RECS). RECS Public-Use Data Files.

Comparison with Previous Surveys: The incidence of item nonresponse in the 2001 RECS is consistent with the rate experienced in 1997, when imputations were performed for 145 variables. Of these variables, 6 contained missing data for 100 or more cases and 80 contained missing data for 10 or fewer cases.

Stove and Oven Questions: Of particular note in the above table are the imputations for “use a separate built-in range top or burners” and “use a separate built-in oven.” Question B-1 of the 2001 Household Questionnaire presented respondents with a list of cooking appliances and asked them to report all that they had. Included in this list were a stove that has both burners and one or two ovens, a separate built-in range top or burners, a separate built-in oven, and a built-in or stove-top grill. In the 1997 questionnaire, no question asked about built-in or stove-top grills, and respondents were asked about the other three items in three separate questions.

The result of this change in questionnaire format was that an unusually large number of respondents who reported that they had a built-in or stove-top grill did not also report having a stove (115 cases) or an oven (104 cases). A comparison of these housing units with those reporting a built-in or stove-top grill and both a stove and an oven revealed that the obtained results were highly unusual and likely due to item nonresponse. Accordingly, it was deduced and imputed that these housing units did in fact have an oven and a stove. Since respondents who did not report having a stove or an oven were not asked about the fuel used by these appliances, the fuel used was also imputed. For those housing units that reported having both a stove and an oven, the same fuel was used for both appliances in 91 percent of the cases. Accordingly, the fuel used by the stoves and ovens was deduced to be the same one used by the built-in or stove-top grills.

Self Cleaning Features: Finally, the follow-up questions about the presence and type of self-cleaning features, which were asked when an oven was reported, were also not asked for the 104 cases where a built-in or stove-top grill was reported but an oven was not. For these cases no response value was imputed. Instead, a response of “not asked” was recorded on the data file.

Quality of Specific Data Items

This section addresses some of the difficulties encountered in trying to obtain meaningful energy data on specific Household Questionnaire items in the 2001 RECS.

Housing Unit Type

Historically, how a housing unit was characterized in the RECS, e.g., whether it was a single-family detached unit, a single-family attached unit, a mobile home, an apartment in a 2-4 unit building, or an apartment in a building with more than 4 units, was the work of the interviewer. Upon the interviewer’s arrival at the selected housing unit they completed a summary sheet that included their observation as to the type of housing unit. In addition, the interviewers also recorded the number of floors and units in apartment buildings with more than 4 units. There was no independent verification of how the unit was characterized by either the householder or others.

There are two exceptions to these procedures. In the 1997 RECS, the householder, as part of the on-site, in-person interview, characterized the type of housing unit they lived in. The householder also provided the interviewer with the number of floors in apartment buildings with more than 4 units. The interviewer accepted the householder’s responses without question.

In the 2001 RECS, responsibility for the characterization of the type of housing unit and, for apartment buildings with more than 4 units, the number of floors and units in the building, was returned to the interviewer, who recorded their observations before beginning the formal on-site, in-person household interview. However, in the 2001 interview the householder was asked to confirm the interviewer’s characterization of the housing unit. If the householder disagreed with the interviewer, they were asked for their own characterization. The interviewer resolved contradictions between the two observations. In 36 of the 4,822 housing units included in the 2001 RECS (0.75 percent), the householder disagreed with the interviewer’s characterization. Of these contradictions, 19 cases were re-characterized based on the householder’s input.

The change in the data collection procedures may have contributed to changes in the survey results. For example, the estimated number of occupied single-family detached units decreased from 63.8 million for the 1997 RECS to 63.1 million for the 2001 RECS. Conversely, the number of occupied housing units in buildings with two to four units increased from 5.6 million for the 1997 RECS to 9.5 million for the 2001 RECS.

Programmable (Set-Back or Clock) Thermostats

In the space-heating section of the 1997 RECS, respondents who reported that they had a thermostat in their home were also asked whether that thermostat was a set-back or clock thermostat and, if so, whether they actually programmed the thermostat or used only the manual features. An estimated 44.9 million households reported having set-back or clock (programmable) thermostats in 1997. Of these, an estimated 11.7 million households reported that they programmed their thermostats and an estimated 33.2 million reported that they only used the manual features. The very large number of households with programmable thermostats was in itself questionable, and even more so when compared to the 10.8 million households that reported having a programmable thermostat in the 1993 RECS, where a comparable question was included in the conservation measures and usage section of the questionnaire.

The change in the placement of the question in the 1997 RECS may have contributed to the large change in the survey results. In addition, the question concerning programmed versus manual use of the thermostats may have changed how the interviewers coded the question on the presence of a programmable thermostat.

In the 2001 RECS questionnaire, the wording of the follow-up question after the respondent reported having a thermostat was revised to more explicitly describe the type of thermostat. Specifically, the question asked: “Is that thermostat programmable? That is, can you set it so that the temperature setting automatically changes at the times of the day or night that you choose?” In response, 25.1 million households in the 2001 RECS reported that they had such a thermostat. This number of households is substantially lower than the 44.9 million households that reported having this type of thermostat in the 1997 RECS and substantially higher than the 10.8 million households in the 1993 RECS.

The 2001 estimate of the number of programmable thermostats is probably the most accurate of the three attempts to determine the actual number in U.S. housing units. Inevitably, an uncertain amount of response error will result in an uncertain amount of inaccuracy. Respondents, when asked about a programmable thermostat, may have different notions of the meaning of programming than intended by the drafters of the question.

Estimation of Sampling Error

Sampling error is the random difference between a survey estimate and an actual population value. It occurs because the survey estimate is calculated from a randomly chosen subset of the entire population. The sampling error averaged over all possible samples would be zero, but there is only one sample for the 2001 RECS. Therefore, the sampling error is not zero and is unknown for the 2001 RECS sample. However, the sample design permits sampling errors to be estimated.
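
The statement that the sampling error averages to zero over all possible samples, yet is nonzero for any one sample, can be checked on a toy population. The sketch below assumes simple random sampling without replacement from five hypothetical values, which is far simpler than the multistage RECS design; it illustrates the concept only.

```python
from itertools import combinations

# Hypothetical population values and their true mean.
population = [2, 4, 6, 8, 10]
true_mean = sum(population) / len(population)

# Sampling error of the sample mean for every possible sample of size 2.
errors = [sum(sample) / len(sample) - true_mean
          for sample in combinations(population, 2)]

mean_error = sum(errors) / len(errors)       # average over all samples
largest_error = max(abs(e) for e in errors)  # worst single sample
```

Here `mean_error` is zero while `largest_error` is 3.0, so an analyst who holds only one sample cannot know its error and must estimate its likely size instead.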

This section describes how the sampling errors were estimated and how they were made available to readers of RECS tables and analyses who are interested in the precision of the estimates in the RECS tables.

Relative Standard Errors (RSE’s)

Throughout the RECS tables, standard errors are given as percents of their estimated values; that is, as relative standard errors (RSE). The RSE is also known as the coefficient of variation.

For a given population parameter Y that is estimated by the survey statistic Ŷ, the relative standard error of Ŷ, RSE (Ŷ), and the standard error of Ŷ, S (Ŷ), are related by:

RSE (Ŷ) = [S (Ŷ)/Ŷ] × 100.

S (Ŷ) = [RSE (Ŷ)/100] × Ŷ.
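
The two formulas above are inverses of each other, as the short sketch below shows. The estimate and standard error used are hypothetical values chosen for easy arithmetic, not figures from the RECS tables.

```python
def rse_percent(estimate, standard_error):
    """Relative standard error, as a percent of the estimate."""
    return standard_error / estimate * 100.0

def standard_error_from_rse(estimate, rse):
    """Recover the standard error from an RSE given in percent."""
    return rse / 100.0 * estimate

# Hypothetical: an estimate of 20.0 million households with a
# standard error of 0.5 million households.
example_rse = rse_percent(20.0, 0.5)                     # 2.5 percent
example_se = standard_error_from_rse(20.0, example_rse)  # back to 0.5
```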

For some surveys, a convenient algebraic formula for computing variances can be obtained. However, the RECS used a multistage area sample design of such complexity (see Survey Methods) that it is virtually impossible to construct an exact algebraic expression for estimating variances. In particular, convenient formulas based on an assumption of simple random sampling, typical of most standard statistical packages, are inappropriate for the RECS estimates. Such formulas tend to give low values for standard errors, making the estimates appear much more accurate than is the case.

Balanced Half-Sample Replication

Instead, the method used to estimate sampling variances for the RECS was balanced half-sample replication. The balanced half-sample replication method involves calculating the value for a statistic using the full sample and calculating the value for each of a systematic set of half samples. (Each half sample contains approximately one-half of the observations contained in the full sample.) The variance is estimated using the differences between the value of the statistic calculated using the full sample and the values of the statistic calculated using each of the half samples.

For every estimate in the RECS tables, the RSE was computed by the balanced half-sample replication method. This RSE was used for any statistical tests or confidence intervals given in the text, or to determine if the estimate was too inaccurate to publish (RSE greater than 50 percent).
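
The replication idea can be sketched for a weighted total. Everything specific here is an assumption made for illustration: the design is simplified to strata with exactly two sampled units each, the half samples are selected with a Hadamard matrix built by the Sylvester construction, the variance is taken as the mean squared difference between the half-sample and full-sample estimates, and the data are invented. The exact formula and replicate structure used for the RECS are not given in this section.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order is a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def weighted_total(unit):
    """Weighted total for one sampled unit, a list of (weight, value) pairs."""
    return sum(w * y for w, y in unit)

def half_sample_variance(strata):
    """Full-sample total and its balanced half-sample variance estimate.

    strata is a list of (unit_a, unit_b) pairs, one pair per stratum.
    """
    order = 1
    while order <= len(strata):  # need more replicates than strata
        order *= 2
    h = hadamard(order)
    full = sum(weighted_total(a) + weighted_total(b) for a, b in strata)
    squared_diffs = []
    for row in h:
        # Column 0 is all ones, so strata use columns 1, 2, ...
        estimate = sum(
            2.0 * weighted_total(a if row[s + 1] == 1 else b)
            for s, (a, b) in enumerate(strata))
        squared_diffs.append((estimate - full) ** 2)
    return full, sum(squared_diffs) / len(squared_diffs)

# Hypothetical data: three strata, two sampled units in each.
strata = [
    ([(100.0, 2.0), (100.0, 3.0)], [(200.0, 2.0)]),
    ([(150.0, 1.0)], [(150.0, 2.0)]),
    ([(50.0, 4.0)], [(50.0, 6.0)]),
]
total, variance = half_sample_variance(strata)
rse = variance ** 0.5 / total * 100.0
publishable = rse <= 50.0  # estimates with RSE above 50 percent are withheld
```

Doubling the weight of the selected unit in each stratum makes every half sample estimate the same total as the full sample, so the spread of the half-sample estimates reflects sampling variability.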

Generalized Variances

Instead of publishing the complete set of RSE’s, a generalized variance technique is provided, by which the reader can compute an approximate RSE for each of the estimates in the detailed RECS tables. For the statistic in the ith row and jth column of a particular table, the approximate RSE is given by:

RSE (i,j) = R(i) × C(j)

where R(i) is the RSE row factor given in the last column of row i, and C(j) is the RSE column factor given at the top of column j.

This value for the relative standard error can be used to construct confidence intervals and to perform hypothesis tests by standard statistical methods. However, because the generalized variance procedure gives only approximate RSE’s, such confidence intervals and statistical tests must also be regarded as only approximate.
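
As a sketch of how a reader might apply the generalized variance formula, the code below multiplies a row factor by a column factor and builds an approximate confidence interval. The factor values and the estimate are hypothetical, and the multiplier of 1.96 for an approximate 95-percent interval is the usual normal-approximation convention, assumed here and not stated in the text.

```python
def approximate_rse(row_factor, column_factor):
    """Approximate RSE (percent) for the cell in row i and column j."""
    return row_factor * column_factor

def confidence_interval(estimate, rse, z=1.96):
    """Approximate interval: estimate plus or minus z standard errors."""
    standard_error = rse / 100.0 * estimate
    return estimate - z * standard_error, estimate + z * standard_error

# Hypothetical table cell: estimate 10.0, row factor 4.0, column factor 1.2.
cell_rse = approximate_rse(4.0, 1.2)               # about 4.8 percent
low, high = confidence_interval(10.0, cell_rse)    # about 9.06 to 10.94
```

Because the RSE from this technique is itself approximate, the interval endpoints should be read as rough bounds.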