Commercial Buildings Energy Consumption Survey (CBECS)

1995 Commercial Buildings Energy Consumption Survey: Sample Design

1. How the Survey Was Conducted: Introduction

The Commercial Buildings Energy Consumption Survey (CBECS) is conducted by the Energy Information Administration (EIA) to provide basic statistical information on energy consumption and expenditures for U.S. commercial buildings and data on energy-related characteristics of these buildings. The survey is based upon a sample of commercial buildings selected according to the sample design requirements described below. A "building" as opposed to an "establishment" is the basic unit for the CBECS because a building is the energy-consuming unit.

The CBECS is conducted in two major data-collection stages: a Building Characteristics Survey and an Energy Suppliers Survey. The first stage, the Building Characteristics Survey, collects information about selected commercial buildings through voluntary personal interviews with the buildings' owners, managers, or tenants. In 1995, the data were collected by using Computer-Assisted Personal Interviewing (CAPI) techniques. An authorization form signed by the respondent is used to secure the release of the building's energy consumption and expenditures records from the energy supplier.

The Energy Suppliers Survey, the second data-collection stage, obtains data about the building's actual consumption of energy and associated expenditures for that energy from records maintained by energy suppliers. The information is obtained by means of a mail survey conducted under EIA's mandatory data collection authority. Additionally, the CBECS asks energy suppliers about any demand-side management programs they may have provided to the building. Under EIA's direction, a survey research firm conducts both the personal interviews for the Building Characteristics Survey and the mail survey for the Energy Suppliers Survey.

Confidentiality of Information

EIA does not receive or take possession of the names or addresses of individual respondents or any other individually identifiable energy data that could be specifically linked with an individual sample building or building respondent. All names and addresses are maintained by the survey contractor for survey verification purposes only. Geographic identifiers and NOAA Weather Division identifiers are not included on any data files delivered to EIA. Geographic location information is provided to EIA at the Census division level. In addition, building characteristics, such as number of floors, building square footage, and number of workers in the building, that could potentially identify a particular responding building, are masked on data files provided to EIA, as well as on all public-use data files.

Target Population

The target population for CBECS consists of all commercial buildings in the United States larger than 1,000 square feet, with the exception of commercial buildings located on manufacturing sites. To be eligible for the survey, a building had to satisfy three criteria: (1) it had to meet the survey's definition of a building, (2) it had to be used primarily for some commercial purpose, and (3) it had to measure 1,001 square feet or more. A building is defined by CBECS as a structure totally enclosed by walls that extend from the foundation to the roof and is intended for human access. To be used primarily for some commercial purpose, the building must have more than 50 percent of its floorspace devoted to activities that are neither residential, industrial, nor agricultural. The 1995 CBECS estimated that there were 4,579 thousand buildings in the target population.

Sample Design

The sample design for the CBECS is a multistage area probability cluster sample design supplemented by a list sample of "large" buildings, recently constructed buildings, and "special" buildings (Federal Government buildings and post offices, hospitals, colleges, and universities). The area sample portion of the design is a sample from the broad spectrum of commercial buildings. The supplemental list sample provides an oversample of "large" buildings and "special" buildings. Similarly, for recently constructed buildings, the area sample is used to provide a sample from the broad spectrum of new buildings and the supplemental list sample provides an oversample of "large" new buildings.

Multistage Area Probability Sample: The area component of the CBECS sample uses a four-stage cluster sampling design that selects primary sampling units (PSU's), secondary sampling units (SSU's), segments, and, ultimately, buildings. The first three of these stages involve sampling progressively smaller geographic areas. For the 1995 CBECS, the same PSU's, SSU's, and segments that were selected for the 1986 CBECS were reused. For the fourth stage of sampling, the 1995 selection of buildings was executed by using procedures to update the 1986 CBECS building lists to include new construction in the sampled segments.

Supplementary List Sample from Lists of Large and Specialized Buildings: To ensure adequate coverage of buildings that are significant energy users, the multistage area probability sample is supplemented within each selected PSU by a sample from a list of "large" buildings (buildings over 250,000 square feet) or facilities. In addition, to improve the precision of energy consumption estimates for certain types of buildings, a supplementary sample is drawn from several lists of special buildings. These list frame files differ from the area segment listings in that the list files are primarily facility or construction-project based as opposed to building based.

Sample Selected

The goal of the 1995 CBECS sampling procedures (both the area sample and the supplemental list sample) was to achieve completed interviews for 5,500 buildings — 4,450 buildings from the area sample and 1,050 buildings from the supplemental list sample. In order to achieve the goal for number of respondents, a sample of 8,074 potential cases was selected, consisting of 6,633 buildings from the area sample frame and 1,441 buildings from the supplemental list sample frames consisting of large buildings and special buildings. Of these 8,074 buildings, 6,590 buildings were found eligible for interviewing. The three primary eligibility criteria, building definition, building use, and building size are described in the "Determining Building Eligibility" section below. Other reasons for sample building listings to be classified as ineligible included duplication of buildings, demolished buildings, buildings under construction, or commercial buildings on industrial facilities.

Response Rates

Of the 6,590 eligible buildings, interviews were completed for 87.5 percent, or 5,766 buildings (4,728 buildings from the area sample and 1,038 buildings from the supplemental list sample). Of the 5,766 CBECS respondents, 5,668 reported some energy use in the building. For 92.6 percent, or 5,250, of these buildings, an authorization form was obtained which allowed the survey contractor to contact the energy suppliers for release of the energy billing data for the building.
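The percentages quoted above can be reproduced from the counts given in the text. The short sketch below is only an arithmetic check; the variable names and the helper `pct` are illustrative and not part of any CBECS processing system.

```python
# Counts quoted in the text above.
eligible = 6_590                 # buildings found eligible for interviewing
completed = 4_728 + 1_038        # area sample + supplemental list sample
with_energy_use = 5_668          # respondents reporting some energy use
authorized = 5_250               # signed authorization forms obtained

def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

interview_rate = pct(completed, eligible)
authorization_rate = pct(authorized, with_energy_use)

print(completed, interview_rate, authorization_rate)
```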

Building Characteristics Survey

Building Eligibility

Determining building eligibility was a three-step process. The first step occurred during the development of the area and supplemental sample listings. The second step occurred when the interviewer observed the building, and the third step occurred during the interview of the building owner or manager. While criterion one, the definition of a building, can be determined during the first and second steps, criteria two and three are based more on lister or interviewer judgment and could result in exclusion of eligible buildings or the inclusion of ineligible buildings during those steps. The third step is crucial in identifying ineligible buildings. Once the interviewer begins the interview, initial screening questions instruct the interviewer to terminate the interview if criterion two or three is not met.

Criterion 1—Building Definition: The definition of a building was the same one used in previous CBECS: a structure totally enclosed by walls that extend from the foundation to the roof and intended for human access. Thus, structures such as water, radio, and television towers were excluded from the survey. Also excluded were (1) parking garages and partially open structures, such as lumber yards; (2) enclosed structures that people usually do not enter or are not buildings, such as pumping stations, cooling towers, oil tanks, statues, and monuments; and (3) dilapidated or incomplete buildings missing a roof or a wall. There is one exception to the building definition criterion: a structure built on pillars so that the first fully enclosed level is elevated. These were included because such buildings fall short of meeting the definition due only to the technical shortcoming of being raised from the foundation. They are totally enclosed, are used for common commercial purposes, and use energy in much the same way as buildings that sit directly on a foundation.

Criterion 2—Building Use: The second criterion was that a building had to be used primarily for some commercial purpose; that is, more than 50 percent of the building's floorspace must have been devoted to activities that were neither residential, industrial, nor agricultural. The primary use of the sampled building governed whether the building was included in the CBECS. In 1995, there was one exception to this criterion: commercial buildings on manufacturing sites were considered out of scope. (In previous CBECS, if a commercial building (e.g., an office building), was located on a manufacturing site, it would have been considered in scope). Examples of nonresidential buildings that were not included in the CBECS samples are:

- Farm buildings, such as barns, unless space is used for retail sales to the general public
- Industrial or manufacturing buildings that involve the processing or procurement of goods, merchandise, or food
- Buildings on most military bases
- Buildings where access is restricted for national security reasons
- Single-family detached dwellings that are primarily residential, even if the occupants use part of the dwelling for business purposes
- Mobile homes that are not placed on a permanent foundation (even if the mobile home is used for nonresidential purposes).

During the interviewing stage, interviewers were instructed not to begin interviews at buildings where they observed that 75 percent or more of the floorspace was used for residential, industrial, or agricultural purposes. Once the interview began, screening questions instructed the interviewer to terminate the interview if the respondent indicated that 50 percent or more of the square footage was used for residential, industrial, or agricultural purposes.

Criterion 3—Building Size: The third criterion was that a commercial building had to measure more than 1,000 square feet (about twice the size of a two-car garage) to be considered in scope for the 1995 CBECS. This building size criterion was met in two successive size cutoffs, which were enacted during the listing and interviewing processes. During the listing stage, buildings judged to be less than 500 square feet were not listed. Interviewers did not begin interviews when they observed a building to be 500 square feet or less. Then during the interviewing stage, interviewers asked screening questions designed to terminate the interview when the square footage was reported to be 1,000 square feet or less.
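The two-stage use and size cutoffs described for criteria two and three can be summarized in a short sketch. The function and parameter names are hypothetical, and the sketch covers only the numeric thresholds stated in the text; it does not model the building definition (criterion one) or the manufacturing-site exclusion.

```python
def passes_observation(observed_noncommercial_share, observed_square_feet):
    """Pre-interview check based on what the interviewer could observe.

    Interviews were not begun when 75 percent or more of the floorspace
    appeared residential, industrial, or agricultural, or when the
    building appeared to be 500 square feet or less.
    """
    return observed_noncommercial_share < 0.75 and observed_square_feet > 500

def passes_interview_screen(reported_noncommercial_share, reported_square_feet):
    """Screening questions at the start of the interview.

    Interviews were terminated when 50 percent or more of the floorspace
    was reported as residential, industrial, or agricultural, or when the
    reported size was 1,000 square feet or less.
    """
    return reported_noncommercial_share < 0.50 and reported_square_feet > 1000

# A building can pass the looser observation check and still be screened out.
print(passes_observation(0.6, 800), passes_interview_screen(0.6, 800))
```

The observation thresholds are deliberately looser than the interview thresholds, so borderline buildings reach the screening questions rather than being dropped on the interviewer's judgment alone.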

Data Collection

Data collection encompasses several phases, including: (1) designing the questionnaire, (2) training supervisors and interviewers, (3) collecting data, (4) minimizing nonresponse, and (5) processing the data. A survey contractor performed the data collection under the direction of EIA.

Designing the Building Characteristics Survey Questionnaire

Questionnaire design work for the 1995 CBECS was conducted by EIA. Although a set of core questions remained the same or very similar to those used in previous surveys, the 1995 Building Questionnaire was redesigned to improve data quality and to allow the data to be collected by use of Computer-Assisted Personal Interviewing (CAPI) techniques.

Training Supervisors and Interviewers

The CBECS building questionnaire is a complex instrument designed to collect data during a personal interview at the building site. Well-trained interviewers are essential to the collection of technical information. Training for the 1995 CBECS included three in-person training sessions: one session for the interviewer trainers, monitors, and regional supervisors and two sessions for the interviewers. Because the 1995 CBECS data were collected by using CAPI for the first time, all interviewers were trained in the general use of the computer and in interviewing and administering the CAPI questionnaire. Training sessions included lectures, slide presentations, and small group sessions where the interviewers practiced administering the questionnaire by using laptop computers. EIA personnel participated in all training sessions, providing an overview of the CBECS and a presentation on the key 1995 CBECS energy concepts.

Collecting the Data

Initial contacts with the building representatives were made through an introductory letter mailed to them at each building or facility in the survey sample. The letter, signed by a representative of EIA, was addressed to the building owner or manager. The letter explained that the building had been selected for the survey, introduced the survey contractor, assured the building manager that the data would remain confidential, and discussed the uses and needs for the CBECS data in setting national energy policies. To protect confidentiality, the letter was addressed by the survey contractor after it was signed at EIA. A worksheet was attached to the letter that listed several pieces of information that the respondent should have ready for the interviewer.

Data collection began August 28, 1995, and ended December 8, 1995. The data were collected by the survey contractor's field staff. This staff consisted of 149 interviewers under the supervision of seven regional supervisors and their assistants and a central office staff consisting of a project manager, a field director, and a subsampling assistant.

Interviewers: Prior to beginning the interview, the interviewer observed the outside of the building to ascertain if the structure met the size and building-use eligibility requirements of the survey. If the building failed to meet any one of the definitional criteria, the building was classified as ineligible and no interview was conducted. During the initial visit to the sampled buildings, the interviewers identified and attempted to schedule an interview with a knowledgeable respondent who met the survey criteria for a building representative. The respondent could be the owner of the building, a tenant, a hired building manager or engineer, or a spokesperson for a management company.

The Interview: Each interview began with a series of screening questions designed to verify the building's address and eligibility for the survey. Respondents were asked about the building as a whole rather than individual establishments located within the building. The completed building interview lasted an average of 40 minutes. That included the time for the interviewer to record the results of the screening, to ask all questions on the building characteristics questionnaire, and to obtain a signed authorization form from the respondent for the release of energy billing data from the energy supplier to the building. It did not include the observation time prior to the interview to determine if the building was eligible or the time needed to obtain a signed authorization form from someone other than the building respondent in those cases when the building respondent did not have the authority to sign the form.

The average time to obtain each completed interview, including interviewer preparation, travel, callbacks, interviewing, and transmitting the completed interviews to the home office, was 6 hours and 54 minutes. Each interviewer conducted an average of 53 interviews: 5 interviewers each completed 10 or fewer interviews, while 6 interviewers each completed more than 70.

Interviewer Supervision: Procedures were taken to ensure that the interviews were conducted as intended. Ten percent of each interviewer's cases were preselected for validation to verify that the interview had been conducted and that it had been conducted at the correct building according to specified procedures. That validation occurred by telephone at the survey contractor's home office. If a disproportionate percentage of an interviewer's validation cases were classified as ineligibles or nonrespondents, additional cases were selected as needed to ensure 10 percent coverage of responding cases for each interviewer. Interviewers were informed that a sample of their work would be validated, but they were not informed which completed interviews would be checked. If a field supervisor was concerned about a particular interviewer, he or she conducted discretionary validations.

Minimizing Nonresponse

Several approaches were employed in an effort to minimize nonresponse, including: advance mailings to building owners or managers; in-person visits; telephone callbacks; establishment of a toll-free "hot-line" number to address respondents' concerns or questions; personalized letters to documented refusals; and provision of additional field staff in several Metropolitan Statistical Areas to help those who still had problem cases. These approaches dealt with the three categories of nonresponse for CBECS: (1) refusals, (2) cases where the knowledgeable respondent was located outside of the sample PSU's, and (3) cases where the respondent was unavailable during the field data collection period.

An additional type of nonresponse conversion dealt with respondents who declined to sign the authorization forms that would allow their energy suppliers to release the building's energy consumption records and information on demand-side management program participation. Personalized written requests for signed authorization forms were mailed for all buildings for which energy usage had been reported and a signed form had not been obtained by an interviewer. Such requests were mailed to 219 buildings interviewed by field staff. A total of 24 signed authorization forms were received by mail.

Processing the Data

The initial processing of the CBECS data occurred at the survey contractor's home office and included receipt of the CBECS questionnaires as they were transmitted from the field, editing the questionnaires, calculating the survey weights for each building, and masking the data for confidentiality before it was transmitted to EIA. Final data preparation occurred at EIA and consisted of checking the data for internal consistency, checking the data against data from previous surveys, conducting imputation procedures for missing data, and preparing cross-tabulations for release to the public.

Data Editing: Data editing for the 1995 CBECS Building Characteristics Survey occurred at several points during data collection and processing. Initial editing occurred during the Computer-Assisted Personal Interviewing (CAPI) interview. Additional editing occurred upon receipt of the questionnaire for data processing and during data entry. The final data editing occurred during review of data frequencies and cross-tabulations.

CAPI Edits During Interview: Data collection using CAPI techniques allowed for some data editing to occur during the interview, thus ensuring higher quality of data, as well as reducing the time required for post-interview editing. Higher quality of data was achieved by programming skip patterns that prohibited the entry of ineligible codes directly into the CAPI questionnaire.

Data Editing at Home Office: Completed questionnaires were transmitted electronically to the survey contractor's home office and the hard-copy materials were mailed. Clerks reviewed the hard-copy materials to locate a signed authorization form and any hard-copy listings of account numbers and supplier customer lists used to supplement CAPI. Linkage of the building with the energy supplier was completed as part of the processing of building survey data.

Edits at this stage were of three types: (1) missing data checks, (2) automated logic checks that verified compliance with codes and skip patterns as specified in the codebook, and (3) inter-item consistency checks. The survey contractor took several steps to resolve inconsistencies or ambiguities in the data. First, the contractor reviewed other parts of the questionnaire for explanations that might help solve the problem. Several open-ended questions were included in the questionnaire that allowed the respondent to either describe or include additional information about a particular item. Also, the interviewers had been asked to write comments in "comment boxes" to explain unusual circumstances. These open-ended questions and notes were relied upon extensively in the resolution process and were very helpful in explaining some of the inconsistencies. Second, in some hard-to-resolve cases, EIA personnel provided technical guidance on how to reconcile some questionnaire responses. Finally, when these efforts failed to resolve a problem, especially when the energy sources or heating and cooling equipment were involved, the survey contractor contacted the respondent by telephone for clarification.

Overall, telephone contacts to clarify questionable or missing information were completed for the respondents of 602 buildings, 10 percent of all completed cases. All changes made to any questionnaire response as a result of these reviews were carefully documented and explained on an error-resolution sheet attached to the questionnaire.

As the last step prior to the delivery of the draft data tape to EIA, the contractor produced data frequencies and cross-tabulations. These were reviewed to reveal any outlying values and inconsistencies that the edits may not have identified. Inconsistencies were corrected by the contractor before data tapes were transmitted to EIA.

Energy Suppliers Survey

During the Building Characteristics Survey, each respondent was asked to provide the name, address, and account numbers for all suppliers of energy to the building. In addition, respondents were asked to sign an authorization form that gave permission to the suppliers to release the building's monthly billing data to EIA. EIA's survey contractor sent copies of this form to the suppliers to secure the release of the buildings' billing records, as well as information on the buildings' participation in any demand-side management programs, if programs were available from the energy supplier. Attempts were made to contact all suppliers of electricity, natural gas (including suppliers of natural gas transported for the account of others), fuel oil, and district sources (steam, hot water, and chilled water) that were identified during the Building Characteristics Survey.

Data Collection Instruments

Consumption and Expenditures Forms: Each supplier of electricity, natural gas, fuel oil, or district sources to a sampled building was asked to provide consumption and expenditures data on a mailed survey form. Because there were minor differences in data items by energy source, there were corresponding variations in the reporting forms as well. For example, the electricity forms requested kilowatt (kW) demand; the natural gas forms included transportation gas, as well as provision for reporting variable units of measures (such as therms, cubic feet, or 1,000 cubic feet); the fuel oil forms requested information about the type of fuel oil used; and the district heating forms asked for information concerning the entire district or system.

Despite the above-mentioned differences, the forms for the different fuels were similar in terms of the data requested. In each case, the supplier was asked to report the following data: (1) quantity of specific energy source consumed or delivered; (2) total cost; (3) unit of measure; (4) dates of deliveries or consumption; and (5) number of customers included in both the consumption and cost data reported on the form.

Suppliers were not required to transcribe data onto the survey forms—responses were accepted in any format (including computer printouts), as long as the necessary information was provided. Additionally, electric, natural gas, and fuel oil suppliers could submit their data on a formatted computer diskette provided by EIA. Response to the forms was mandatory for the supplier.

The data were requested for a 14-month period between December 1, 1994, and January 31, 1996, in order to ensure that data would cover a full calendar year no matter what the actual billing period had been. For example, if the billing period began on the 10th of each month, the first bill would be from December 10 through January 9. The bills were then prorated (annualized) to obtain data for the calendar year.
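The text does not spell out the proration method. One simple possibility, shown here only as a sketch, is day-weighted proration that assumes consumption is spread evenly across each billing period; the function names and the sample quantity are hypothetical.

```python
from datetime import date

YEAR_START = date(1995, 1, 1)
YEAR_END = date(1995, 12, 31)

def days_in_1995(start, end):
    """Days of the inclusive period [start, end] that fall in calendar 1995."""
    lo = max(start, YEAR_START)
    hi = min(end, YEAR_END)
    return max((hi - lo).days + 1, 0)

def annualize(bills):
    """Prorate bills to calendar year 1995.

    bills: list of (start, end, quantity) tuples with inclusive dates.
    Assumption: usage is uniform within each billing period.
    """
    total = 0.0
    for start, end, quantity in bills:
        period_days = (end - start).days + 1
        total += quantity * days_in_1995(start, end) / period_days
    return total

# The billing period from the example above: December 10 through January 9.
# Only the 9 January days of this 31-day bill count toward 1995.
first_bill = (date(1994, 12, 10), date(1995, 1, 9), 310.0)
print(annualize([first_bill]))
```

Requesting 14 months of records guarantees that every day of 1995 is covered by some bill, so the first and last bills can be split in this way rather than estimated.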

Demand-Side Management (DSM) Forms: An additional form was inserted in the electricity and natural gas usage forms to collect data about the building's participation in utility-offered energy-savings programs. Both forms collected essentially the same type of information, although each was tailored to the particular energy source, either electricity or natural gas. For example, the electricity suppliers were asked about DSM programs, such as lighting, energy-efficient motors, metered peak demand, time-of-day pricing, and standby electricity generation. The natural gas form asked about DSM programs but did not include those measures that were not applicable to natural gas suppliers, such as peak demand or time-of-day pricing.

Data Collection

Advance Mailings: An initial letter from EIA was mailed in September 1995 to electricity and natural gas utility companies that served buildings surveyed in the 1992 CBECS, explaining the survey and requesting a contact person be designated for the 1995 survey. A second letter from EIA, which included a copy of the 1992 CBECS executive summary from the Commercial Buildings Energy Consumption and Expenditures 1992, was mailed in November 1995 to companies that had not responded to the earlier request for information.

Survey Mailings: For the 5,766 buildings for which responses had been obtained in the Building Characteristics Survey, a total of 11,091 energy suppliers forms were mailed to 1,218 suppliers of energy. Of these suppliers, 518 (43 percent) were electricity and natural gas suppliers (including suppliers of gas transported for the account of others), 415 (34 percent) were fuel oil suppliers, and the remaining 285 (23 percent) were district heating suppliers.

The initial mailing of the survey forms to the energy suppliers occurred in early February 1996, with a due date of March 1, 1996, for the forms. Reminder letters to suppliers who had not returned the forms were sent shortly after the due date, with a second written request to nonrespondents in May 1996. Survey closeout was September 5, 1996; the closeout date was extended by 3 weeks to accommodate several late-responding suppliers.

Minimizing Nonresponse

Extensive efforts were made to obtain usable energy supplier data. Letters were sent and telephone prompts were made to the energy suppliers throughout the data collection period to remind them to provide the data within the required time period. In addition, a toll-free telephone hot-line number was provided to all suppliers, both in the cover letter accompanying the survey forms and on the face of each survey form. Suppliers were encouraged to call this number if they had any questions. Hot-line staff were knowledgeable regarding the most frequent technical problems encountered by suppliers and the instructions to be given to suppliers calling with these questions.

Electricity, Natural Gas, and Purchased District Heat Suppliers: The nonresponse effort for the suppliers of electricity, natural gas, and purchased district heat began with a personalized reminder letter to all companies that had not returned any survey forms as of the March 1, 1996, due date. Another nonresponse conversion letter was mailed May 1, 1996, to companies that had returned some but not all of their forms, as well as to companies that had not responded at all. Beginning May 23, 1996, nonrespondents were telephoned and asked for the date by which they expected to complete the forms. Companies were called again if that date arrived and they still had not responded. These calls resulted in 128 requests for more forms. The same nonresponse procedure was followed both for complete nonresponse by an energy supplier and for incomplete or missing buildings within a supplier's response.

Fuel Oil and Nonpurchased District Heat Suppliers: On March 6, 1996, a reminder letter was sent to each fuel oil supplier and each supplier of nonpurchased district heat that had not returned all forms. This was followed in April by a remailing of the entire packages of survey forms to those companies that had not yet responded. Telephone nonresponse conversion calls began on June 5, 1996, after a third letter was sent in May alerting the respondent to the telephone calls. The telephone calls resulted in numerous requests for additional survey forms, which were mailed in mid-June to 57 companies. When possible, the telephone interviewers attempted to obtain data over the telephone if a limited number of survey forms was missing.

Energy Suppliers Survey Response Rates: The overall response rate for the 1995 Energy Suppliers Survey was 84.9 percent (see Table 1 at end of section). The response rate is defined as:

(usable records) ÷ ((all records) - (out-of-scope records))

Each record corresponded to a single energy supplier for a particular energy source to a particular building. For example, a building with one electricity supplier, two fuel oil suppliers, and no other energy suppliers would have a total of three energy supplier records, one for electricity and two for fuel oil. Records were initially created on the basis of the Building Characteristics Survey respondents' reports of the names and addresses of their energy suppliers. A record was declared out-of-scope if it turned out to correspond to a supplier that did not actually serve the building during calendar year 1995.
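The response rate definition can be expressed as a small function. This is a minimal sketch: the status labels and the four sample records are made up for illustration and are not CBECS data.

```python
from collections import Counter

def response_rate(statuses):
    """Usable records divided by in-scope records.

    statuses: one label per supplier record, e.g. "usable",
    "nonresponse", or "out_of_scope" (hypothetical labels).
    """
    counts = Counter(statuses)
    in_scope = len(statuses) - counts["out_of_scope"]
    if in_scope == 0:
        raise ValueError("no in-scope records")
    return counts["usable"] / in_scope

# Four illustrative records: two usable, one nonresponse, one out-of-scope.
records = ["usable", "usable", "nonresponse", "out_of_scope"]
rate = response_rate(records)
print(f"{100 * rate:.1f} percent")
```

Out-of-scope records are removed from the denominator because a supplier that did not serve the building in 1995 had no data to report, so counting it would understate the response rate.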

Response rates for natural gas (not identified as gas transported for the account of others) and for electricity were 86.5 and 89.1 percent, respectively, which were similar to results obtained in previous CBECS. The response rate for the suppliers of gas transported for the account of others was 79.1 percent. The response rate for fuel oil was 77.7 percent and the rates for steam and hot water (district sources) were 62.3 percent and 25.7 percent, respectively.

Of the forms mailed, 1,516 (about 14 percent) were classified as nonresponse. That category included refusals, inability to respond within the data collection period, and inability to locate the correct account for the building.

Data Editing

As the suppliers' forms were received, they were screened for accuracy and completeness. The forms were then keyed and edited. (In 1995, for the first time, PC-based key entry was used for the suppliers survey forms.) The Energy Suppliers Survey used an extensive program of automated machine edits, including:

(1) Basic Energy Range and Skip Checks. EIA specified ranges and values to be used for the technical edits. These values were based on previous CBECS responses and on knowledge of utility rates and practices. The first edits were range and basic logic checks.

(2) Consistency Checks Among Data Items. Edit failures at these levels were most often due to coding or data entry error. If the cause of an error was not apparent to the technical reviewer, the case was referred to supervisory staff for resolution.

(3) Technical Edits. EIA specified a series of sophisticated edit checks to ensure that, to the extent possible, errors of the following types were detected and corrected: a too-long or too-short billing period; a consumption ratio that indicated there was extreme variability across the periods; a failure to report expenditures despite the presence of consumption, and vice versa; reported expenditures that were out of range for the consumption amount, for the price per unit of consumption based on known market prices, or for the metered demand.
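The kinds of checks listed in (3) can be illustrated with a minimal sketch. The thresholds below (billing period length, price-per-unit range) and the `Bill` record layout are hypothetical placeholders chosen for illustration; the source does not state the ranges or data structures EIA actually used.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical edit limits; the actual EIA ranges are not given in the source.
MIN_DAYS, MAX_DAYS = 20, 45
MIN_PRICE, MAX_PRICE = 0.01, 0.50  # dollars per unit of consumption


@dataclass
class Bill:
    start: date                    # first day of the billing period (inclusive)
    end: date                      # last day of the billing period (inclusive)
    consumption: Optional[float]   # None when not reported
    expenditure: Optional[float]   # None when not reported


def edit_flags(bill: Bill) -> List[str]:
    """Return the list of technical-edit failures for one billing record."""
    flags = []
    days = (bill.end - bill.start).days + 1
    if days < MIN_DAYS:
        flags.append("billing period too short")
    elif days > MAX_DAYS:
        flags.append("billing period too long")

    has_cons = bill.consumption is not None and bill.consumption > 0
    has_exp = bill.expenditure is not None and bill.expenditure > 0
    if has_cons and not has_exp:
        flags.append("consumption without expenditures")
    if has_exp and not has_cons:
        flags.append("expenditures without consumption")
    if has_cons and has_exp:
        price = bill.expenditure / bill.consumption
        if not (MIN_PRICE <= price <= MAX_PRICE):
            flags.append("price per unit out of range")
    return flags


print(edit_flags(Bill(date(1995, 1, 16), date(1995, 2, 15), 1000.0, None)))
```

A flagged record would then go to a reviewer rather than being corrected automatically, in line with the referral process described under (2).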

Data Adjustments: Adjustments for unit nonresponse were performed in conjunction with weighting of the sample. Cases missing all or part of calendar year 1995 consumption or expenditures were considered as a particular kind of item nonresponse.

Weather Data: A file of heating and cooling degree-days for each of the billing periods reported by each building supplier was created in the following manner:

A National Oceanic and Atmospheric Administration (NOAA) division code was assigned to each building in the CBECS sample. Working with NOAA division maps and building address information, EIA assigned one of 356 division codes to each building.

A file of NOAA data covering the 27-month period from January 1994 to March 1996 (the most recent information available at the time) was used to compute the average daily temperature for each day in the 27-month period for each weather division.

Daily heating and cooling degree-day averages were computed for each of 10 base temperatures (degrees Fahrenheit): 50, 55, 57, 60, 65, 68, 70, 73, 75, and 80. Only base temperature 65 degrees Fahrenheit is covered in this report.

Degree-day totals were constructed for each billing period, or gap between billing periods, for each energy supplier for each building. In addition, degree-day totals were constructed for each of the 12 calendar months of 1995 for each sampled building, whether or not the building had any energy supplied in 1995.
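As a rough illustration of the degree-day computation described above, the sketch below uses the standard definition: heating degree-days for a day are the amount by which the average daily temperature falls below the base temperature, and cooling degree-days are the amount by which it exceeds the base. The function names, the dictionary of daily temperatures, and the treatment of both period end dates as inclusive are assumptions made for this example.

```python
from datetime import date, timedelta


def heating_degree_days(avg_temp_f, base_f=65.0):
    """Degrees by which the daily average falls below the base (never negative)."""
    return max(0.0, base_f - avg_temp_f)


def cooling_degree_days(avg_temp_f, base_f=65.0):
    """Degrees by which the daily average exceeds the base (never negative)."""
    return max(0.0, avg_temp_f - base_f)


def period_totals(daily_avg_temps, start, end, base_f=65.0):
    """Sum degree-days over one billing period.

    daily_avg_temps maps each date to the average temperature (degrees F)
    for the building's weather division; start and end are assumed inclusive.
    """
    hdd = cdd = 0.0
    day = start
    while day <= end:
        temp = daily_avg_temps[day]
        hdd += heating_degree_days(temp, base_f)
        cdd += cooling_degree_days(temp, base_f)
        day += timedelta(days=1)
    return hdd, cdd


# Made-up temperatures for a three-day period.
temps = {date(1995, 1, 1): 30.0, date(1995, 1, 2): 40.0, date(1995, 1, 3): 70.0}
print(period_totals(temps, date(1995, 1, 1), date(1995, 1, 3)))
```

Passing a different `base_f` gives the totals for any of the other nine base temperatures.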

As part of the annualization and imputation procedures, billing period dates were imputed. The edited dates were used for the final degree-day computations.

Data Preparation for Report

After receiving the CBECS data tapes from the survey contractor, EIA data analysts reviewed and processed the data to prepare them for the final data tape. Cross-tabulations were run to check for internal consistency of the data, and the 1995 CBECS data were compared with data from previous CBECS. Commercial buildings' consumption and expenditure data are complex and interrelated. The EIA review was extensive and paid special attention to the issues of peak electricity demand, gas transported for the account of others, and incomplete data for buildings. Questions concerning data accuracy or outlier values were referred to the survey contractor for verification. EIA staff reviewed the data questionnaires at the survey contractor's site, and EIA's staff judgment was the final authority on some of the data items.

The sections above on data editing, data adjustments, and weather data provide details on the work undertaken to prepare the data for this report. In addition, if retrieval of missing data for one or more items failed, or if retrieval was not performed because the item was not a key item, data values were supplied by imputation. Additionally, the consumption and expenditures data were annualized; that is, they were adjusted by proration methods to estimates for calendar year 1995, when the reported data spanned a longer, shorter, or offset time period. When consumption or expenditures data were completely missing, the annual amounts were imputed by regression.
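The idea of prorating billing data to calendar year 1995 can be sketched as follows. This is only an illustration under a strong simplifying assumption, namely that consumption is spread evenly over the days of each billing period; the text here says only that "proration methods" were used, so the actual procedure may differ. The function names and the tuple layout are hypothetical.

```python
from datetime import date

YEAR_START, YEAR_END = date(1995, 1, 1), date(1995, 12, 31)


def days_in_1995(start, end):
    """Number of days of an inclusive billing period that fall in 1995."""
    lo, hi = max(start, YEAR_START), min(end, YEAR_END)
    return max(0, (hi - lo).days + 1)


def prorate_to_1995(bills):
    """Sum the share of each bill that falls in 1995.

    bills is a list of (start, end, amount) tuples with inclusive dates.
    Assumes uniform daily use within a billing period (a simplification).
    """
    total = 0.0
    for start, end, amount in bills:
        period_days = (end - start).days + 1
        total += amount * days_in_1995(start, end) / period_days
    return total


bills = [
    (date(1994, 12, 16), date(1995, 1, 15), 310.0),  # 15 of 31 days fall in 1995
    (date(1995, 6, 1), date(1995, 6, 30), 100.0),    # entirely within 1995
]
print(prorate_to_1995(bills))
```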

Once the annualized consumption and expenditures were computed or imputed for each building, statistical tables ("Detailed Tables") of aggregated data were then produced and analyzed.

Table 1. Response Rates for Energy Suppliers Survey by Energy Source, 1995
Survey Category	Electricity	Natural Gas	Transported Gas(a)	Fuel Oil	Steam	Hot Water	Total
Total Mailed Out	5,486	3,684	338	869	518	196	11,091
Out of Scope	206	302	42	162	27	29	768
Nonresponse	545	448	56	158	185	124	1,516
Complete: Usable Records	4,705	2,924	234	549	306	43	8,761
Complete: Unusable Records(b)	30	10	6	0	0	0	46
Response Rate(c) (percent)	89.1	86.5	79.1	77.7	62.3	25.7	84.9
(a) Transported gas is natural gas purchased from a source other than the local utility company but delivered to the building by the local utility. Transported gas is also called gas transported for the account of others.
(b) An unusable record contains all of the information requested on the survey form, but either does not cover all of the building's square footage or includes more square footage than is in the building, as defined by CBECS, and information is not available for calculating a disaggregation or aggregation factor.
(c) A response rate is calculated by dividing the number of complete usable records by the difference between the total mailed out and the number out of scope, and multiplying the result by 100.
Source: Energy Information Administration, 1995 Commercial Buildings Energy Consumption Survey
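The response rates in Table 1 can be reproduced from the counts in the table using the definition in footnote (c). The short sketch below does so; the dictionary layout and function name are illustrative choices, while the counts are transcribed from the table.

```python
# Counts transcribed from Table 1 (1995 Energy Suppliers Survey).
TABLE_1 = {
    "Electricity":     {"mailed": 5486,  "out_of_scope": 206, "usable": 4705},
    "Natural Gas":     {"mailed": 3684,  "out_of_scope": 302, "usable": 2924},
    "Transported Gas": {"mailed": 338,   "out_of_scope": 42,  "usable": 234},
    "Fuel Oil":        {"mailed": 869,   "out_of_scope": 162, "usable": 549},
    "Steam":           {"mailed": 518,   "out_of_scope": 27,  "usable": 306},
    "Hot Water":       {"mailed": 196,   "out_of_scope": 29,  "usable": 43},
    "Total":           {"mailed": 11091, "out_of_scope": 768, "usable": 8761},
}


def response_rate(usable, mailed, out_of_scope):
    """Usable records as a percent of in-scope records (footnote c)."""
    return 100.0 * usable / (mailed - out_of_scope)


for source, counts in TABLE_1.items():
    print(f"{source:16s}{response_rate(**counts):6.1f}")
```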

2. Nonsampling and Sampling Errors: Introduction

All of the statistics published in this report are estimates of population values, such as the total floorspace of commercial buildings in the United States. These estimates are based on reported data from representatives of a randomly chosen subset of the entire population of commercial buildings. As a result, the estimates always differ from the true population values.

The differences between the estimated values and the actual population values are due to two types of errors, sampling errors and nonsampling errors.

Sampling errors are random differences between the survey estimate and the population value that occur because the survey estimate is calculated from a randomly chosen subset of the entire population. The sampling error, averaged over all possible samples, would be zero, but since there is only one sample for the 1995 CBECS, the sampling error is nonzero and unknown for the particular sample chosen. However, the sample design permits sampling errors to be estimated. "Estimation of Standard Errors" describes how the sampling error is estimated and presented for statistics given in CBECS detailed tables and reports.

Nonsampling errors are related to sources of variability that originate apart from the sampling process and are expected to occur in all possible samples or in the average of all estimates from all possible samples.

The next two sections, "Data Collection Problems" and "Nonresponse," describe some of the sources of nonsampling error in the Building Characteristics Survey and how that portion of the CBECS is designed and conducted to minimize such errors. Nonsampling errors can result from (1) inaccuracy in data collection due to questionnaire design errors, interviewer error, respondent misunderstanding, and data processing errors; (2) nonresponse for an entire sampled building (unit nonresponse); and (3) nonresponse on a particular question from a responding building (item nonresponse). The section "Data Collection Problems" addresses some of the difficulties encountered in trying to obtain meaningful energy data on questionnaire items in the 1995 survey. The section "Nonresponse" presents survey design and data collection procedures used to minimize unit and item nonresponse in both the Building Characteristics Survey and the Energy Suppliers Survey.

The energy consumption and expenditures data were based on monthly billing records submitted by the buildings' energy suppliers. The section "Annual Consumption and Expenditures" provides a detailed explanation of the procedures used to obtain annual consumption and expenditure estimates from the bills, as well as the procedures used to handle partial or completely missing data. The peak electricity demand estimates were also based on the monthly billing data, as described in the section "Annual Peak Electricity Demand."

The section titled "Other Problems" discusses the reconciliation of building and supplier reports on the types of energy sources used, and attempts to collect natural gas expenditures from both the local natural gas suppliers and non-local natural gas suppliers.

Data Collection Problems

Most unit nonresponse cases occurred because an appropriate respondent was unavailable or declined to participate in the survey. Item nonresponse resulted when the building respondent did not know, or, less frequently, refused to give the answer to a particular question.

Even though the interviewer was instructed to conduct the interview with the person most knowledgeable about the building, there was a great deal of variation in how much CBECS respondents knew about their buildings. Some respondents did not know some of the information requested; some were able to provide certain information only if the questions were expressed in the particular terms they understood. This presented a special challenge to the CBECS questionnaire designers—with such a diverse population of respondents, it is difficult to construct standard wording for energy concepts that would be understood by all respondents. Thus, a certain amount of respondent error can be expected. Additionally, even when a question is worded clearly and the respondent understands the question and has the required knowledge, simple clerical errors (possibly the fault of the questionnaire layout) can sometimes lead to inaccuracies in the data. Unlike the sampling error, the magnitude of nonsampling error cannot easily be estimated from the sample data. For this reason, avoiding biases at the outset is a primary objective of all stages of survey design and field procedures. The wording and format of survey questionnaires; the procedures used to select and train interviewers; and the quality control built into the data collection, receipt, and processing operations were all designed to minimize these sources of error.

Following is a summary of some of the most significant difficulties that EIA staff has identified with the survey responses. The extent of these comments should not be viewed as a failure of the questionnaire or the interview process; the data collection process worked well. Rather, these comments indicate areas that require further refinements to improve overall data quality.

Principal Building Activity: The principal building activity refers to the primary function or activity that occupies the most floorspace in the building sampled. In some cases, particularly if the sampled building was one of a number of buildings on a facility, the respondent reported the overall function of the facility or establishment to which the building belonged. In CBECS, for instance, a library is classified as a public assembly building, but a library on a university campus may have been reported as an education building (academic or technical instruction). To help alleviate this confusion, the 1995 CBECS asked a separate question for the overall facility activity for those buildings identified as being part of a facility. The principal activities of respondent buildings were checked against other available information, including the facility activity, interviewer observations, and the building's name, and were recoded if an obvious assignment error was made.

Another difficulty with identifying principal building activities is that buildings with the same title may, in fact, have different primary functions. For example, space in a building referred to as a "courthouse" can be devoted primarily to office activities (office), to jail cells (public order and safety), or to hearing rooms (public assembly).

For some buildings, no one activity occupied 50 percent or more of the floorspace, but the activity occupying more space than any other was either industrial or residential. For example, it is possible for a building to have 30 percent of the floorspace devoted to assembly, 30 percent to food sales, and 40 percent to residential. Since more than 50 percent of the floorspace was occupied by commercial activity, these buildings were retained in the sample as commercial buildings but were included in the "Other" category.

Number of Workers: The CBECS collects data on the number of people who work in commercial buildings. Included in this number are volunteer workers, but not clients, students, or employees who work away from the building. In 1995, the number of people working during the main shift was requested. In the 1995 CBECS, if a building was not in use during the previous 12 months, it was still included in the "less-than-five" category of number of workers.

Heating and Cooling: The phrasing of questions on heating and cooling equipment has presented difficulties in every CBECS conducted thus far and, unfortunately, illustrates difficulties both in question wording and in respondent knowledge. Commercial buildings' heating and cooling systems vary greatly in design and complexity. The CBECS questionnaire designers try to formulate a few questions that could broadly characterize a building's heating and cooling system.

In previous CBECS, some building respondents (especially those from larger buildings) found the questions to be too general to adequately describe their buildings' systems. Other building respondents lacked even the rudimentary knowledge of their buildings' systems required by the questionnaire. To alleviate some of the problems encountered in earlier CBECS in which inconsistencies appeared between types of equipment, fuel sources, and the distribution system, the 1995 CBECS questionnaire limited the respondents' choices in such a way that only answers to sensible combinations of heating or cooling equipment with distribution equipment could appear.

Additionally, a general question asked the respondent to describe the heating and cooling system. This verbatim description was not coded on the computer file but was of immeasurable value in deciphering the respondents' intentions. In particular, the question of whether the building uses "heat pumps" elicited some surprising responses at some of the interviewed buildings. Several respondents indicated that they used a heat pump for heating but not cooling, or vice versa. After review of the verbatim description and callbacks to the respondents, corrections were made in cases where this information was in error. However, there were 212 cases where the heat pumps did indeed have a single use.

Electricity Generation or Cogeneration: A series of questions was asked about the buildings' electricity generating systems and the sources of electricity. Respondents were asked whether the building could generate electric power and, if yes, what was the primary use of the generators. Of the 5,656 buildings that used electricity, approximately 1,257 reported that they had the capability to generate electric power. Of these, 87 percent use the generators for emergency backup use only.

Respondents reporting that their buildings could generate electricity but that the primary use was for something other than emergency backup were then asked whether the electric power generating system was also a cogeneration system. Because the number of sampled buildings that had a cogeneration system was less than 20, the data were not published.

Two new questions were asked in 1995 in an attempt to gather information about different purchasing arrangements of electricity. With the probability of deregulation in the electric utility sector, increasing numbers of consumers will be able to purchase their electricity from nonutility sources, similar to purchasing natural gas from independent suppliers. Respondents were asked if any of the electricity used in the building was obtained from a nonutility, non-in-house source, such as an independent power producer, and, if yes, how much of the electricity used was obtained from this source. While the vast majority of buildings purchased all of their electricity from a local utility, there were 26 sampled buildings that obtained some of the electricity used from a nonutility, non-in-house source. After these 26 buildings were examined, it was determined that most of these buildings were on facilities with central heating plants or had the capability of generating electricity themselves. It appears that the respondents might have confused a nonutility source of electricity with the ability to generate electricity on the facility or in-house.

Gas Transported for the Account of Others: The respondents to the 1995 CBECS were asked whether the building bought or contracted for natural gas from someone other than the local distribution company and the name and address of the company or broker from whom the direct purchase gas was bought or contracted. This purchasing arrangement is known as "gas transported for the account of others." It is also known as "direct purchase gas" or "spot market gas."

This general question, plus several other specific, price-related questions were first asked during the building characteristics portion of the survey in the 1992 CBECS. (Prior to 1992, this information was asked only of the energy suppliers. Although suppliers could provide the volume of natural gas delivered, they could not, in many cases, report the expenditures since they did not know the purchase price of the transported gas.) It was believed that the building respondent would be better able to provide information about whether they purchased natural gas under this arrangement, who the suppliers were, and what were the wellhead costs, city gate price, local distribution company (LDC) charge, and other costs associated with gas transported for the account of others. This, however, proved to be another area where the building respondents had difficulty providing information. Therefore, based on the 1992 CBECS experience, where only 18 percent of the building respondents could report one or more of the costs associated with the purchase, the cost questions were eliminated in 1995 from the building characteristics questionnaire.

It appears that CBECS respondents, the people who are supposed to be most knowledgeable about the energy-using systems of the buildings, are not the most knowledgeable about billing arrangements. In future CBECS, it may be necessary to target the person most knowledgeable about billing with a separate data collection effort in order to make reliable estimates about gas transported for the account of others.

Renewable Energy Source: The CBECS attempted to collect information on the use of renewable energy sources by including wood and solar thermal panels in the list of possible energy sources that were used to supply energy to the building. In 1995, wood was used in about 3 percent of the buildings as an energy source. Data on the use of solar panels could not be published because either the number of buildings reporting the use was too small or the relative standard error (RSE) was greater than 50 percent. Additional questions were asked about the use of renewable energy features and the sponsors of each one. The energy features included passive solar features, photovoltaic (PV) arrays, geothermal or ground source heat pumps, wind generation, and well water for cooling. The sponsors included utilities, the Federal Government, in-house or self-sponsored, third party, or other. With the exception of passive solar features (which included trees that could be used for shade), fewer than 20 of the 5,766 sampled buildings reported each of the renewable energy features. Therefore, these data were not imputed or published.

Nonresponse

Unit Nonresponse

The response rate for the Building Characteristics Survey portion of the 1995 CBECS was 87.5 percent. That is, of the 6,590 buildings eligible for interview, 12.5 percent did not participate in the Building Characteristics Survey. The unit response rate for the Energy Suppliers Survey was 84.9 percent. The response rate for that portion of the CBECS varied by energy source.

Weight adjustment was the method used to reduce unit nonresponse bias in the survey statistics. The CBECS sample was designed so that survey responses could be used to estimate characteristics of the entire stock of commercial buildings in the United States. The method of estimation used was to calculate basic sampling weights (base weights) that related the sampled buildings to the entire stock of commercial buildings. In statistical terms, a base weight is the reciprocal of the probability of selecting a building into the sample. A base weight can be explained as the number of actual buildings represented by a sampled building—a sampled building that has a base weight of 1,000 represents itself and 999 similar (but unsampled) buildings in the total stock of buildings.

To reduce the bias from unit nonresponse in the survey statistics, the base weights of respondent buildings were adjusted upward, so that the respondent buildings would represent not only the unsampled buildings they were designed to represent, but also nonrespondent buildings and the unsampled buildings they were designed to represent. The base weights of respondent buildings were multiplied by an adjustment factor A, defined as the sum of the base weights over all buildings selected for the sample divided by the corresponding sum over all respondent buildings. Respondent weights remained nonzero after weight adjustment. Nonrespondent weights were set to zero because they were accounted for by the upward adjustment of respondent weights.

Unit nonrespondents tended to fall into certain categories. For example, nonresponse tended to be lower in the Northeast than in the Midwest (11.9 percent and 14.8 percent, respectively). To reduce nonresponse bias as much as possible, adjustment factors were computed independently within 38 subgroups according to characteristics known from the sampling stage for both responding and nonresponding buildings. These characteristics included the general building activity, the rough size of the building, Census region, and metropolitan versus nonmetropolitan location.
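The weight adjustment described above can be sketched in a few lines. The record layout (dictionaries with `cell`, `base_weight`, and `responded` keys) and the function names are hypothetical and chosen for illustration; `cell` stands in for one of the adjustment subgroups.

```python
from collections import defaultdict


def base_weight(selection_probability):
    """A base weight is the reciprocal of the probability of selection."""
    return 1.0 / selection_probability


def adjust_weights(buildings):
    """Return nonresponse-adjusted weights, computed within each cell.

    Within a cell, A = (sum of base weights over all selected buildings)
                     / (sum of base weights over respondent buildings).
    Respondents get base_weight * A; nonrespondents get zero.
    """
    selected = defaultdict(float)
    responded = defaultdict(float)
    for b in buildings:
        selected[b["cell"]] += b["base_weight"]
        if b["responded"]:
            responded[b["cell"]] += b["base_weight"]

    weights = []
    for b in buildings:
        if b["responded"]:
            factor = selected[b["cell"]] / responded[b["cell"]]
            weights.append(b["base_weight"] * factor)
        else:
            weights.append(0.0)
    return weights


# Made-up sample: one cell, two respondents and one nonrespondent.
sample = [
    {"cell": "office/South", "base_weight": 1000.0, "responded": True},
    {"cell": "office/South", "base_weight": 500.0, "responded": True},
    {"cell": "office/South", "base_weight": 500.0, "responded": False},
]
print(adjust_weights(sample))
```

In this example A = 2,000 / 1,500, so the adjusted weights of the two respondents again sum to 2,000, the total represented by all three selected buildings.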

Item Nonresponse

Nonresponses to several items in otherwise completed Building Characteristics Survey questionnaires were treated by a technique known as "hot-deck imputation." In hot-decking, when a certain response is missing for a given building, another building, called a "donor," is randomly chosen to furnish its reported value for that missing item. That value is then assigned to the building with item nonresponse (the nonrespondent, or "receiver").

To serve as a donor, a building had to be similar to the nonrespondent in characteristics correlated with the missing item. This procedure was used to reduce the bias caused by different nonresponse rates for a particular item among different types of buildings. What characteristics were used to define "similar" depended on the nature of the item to be imputed. The most frequently used characteristics were principal building activity, floorspace category, year constructed category, and Census region. Other characteristics (such as type of heating fuel and type of heating and cooling equipment) were used for specific items. To impute hot-deck values for a particular item, all buildings were first grouped according to the values of the matching characteristics specified for that item. Within each group defined by the matching variables, donor buildings were assigned randomly to receiver buildings.

As was done in previous surveys, the 1995 CBECS used a vector hot-deck procedure. With this procedure, the building that donated a particular item to a receiver also donated certain related items if any of those were missing. Thus, a vector of values, rather than a single value, is copied from the donor to the receiver. This procedure helps to keep the hot-decked values internally consistent, and avoids the generation of implausible combinations of building characteristics.
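A minimal sketch of a vector hot-deck follows. The field names, the choice of matching characteristics, and the rule that a donor must have reported every item in the vector are assumptions made for this example; the source does not specify these details. Receivers with no eligible donor in their group are simply left unimputed here.

```python
import random
from collections import defaultdict


def vector_hot_deck(buildings, match_keys, item_vector, rng=None):
    """Fill missing items (None) from a random donor in the same matching group.

    One donor supplies every missing item in item_vector for a receiver,
    which keeps the imputed values internally consistent.
    """
    rng = rng or random.Random()

    # Group complete buildings (potential donors) by matching characteristics.
    donors = defaultdict(list)
    for b in buildings:
        if all(b.get(k) is not None for k in item_vector):
            donors[tuple(b[k] for k in match_keys)].append(b)

    for b in buildings:
        missing = [k for k in item_vector if b.get(k) is None]
        if not missing:
            continue
        pool = donors.get(tuple(b[k] for k in match_keys))
        if not pool:
            continue  # no similar donor available; left missing in this sketch
        donor = rng.choice(pool)
        for k in missing:
            b[k] = donor[k]
    return buildings


# Made-up records: the second office receives both heating items from the first.
records = [
    {"activity": "office", "region": "South", "heat_fuel": "gas", "heat_equip": "boiler"},
    {"activity": "office", "region": "South", "heat_fuel": None, "heat_equip": None},
]
vector_hot_deck(records, ["activity", "region"], ["heat_fuel", "heat_equip"])
print(records[1])
```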

Annual Consumption and Expenditures

The estimates of energy consumption and expenditures in commercial buildings are for calendar year 1995. These estimates were computed from the annual consumption and expenditures determined for each building in the CBECS sample. However, these "annual" values were not obtained directly from the suppliers for the buildings. Rather, energy suppliers provided monthly billing data that were used to calculate calendar year consumption and expenditures for each building, according to the procedures described in this section. Also described in this section are the imputation procedures used in cases where the energy supplier survey data were unavailable or inadequate.

To ensure that the energy consumption for calendar year 1995 would be completely accounted for, the data requested from suppliers were bills that covered the period from December 1994 through January 1996. These bills formed the basis for the annual energy consumption and expenditures estimates.

Billing Data: Ideal and Reality

The basic consumption and expenditures data were reported for each building by billing period. Ideally, the data for each continuous-delivery energy source (electricity, natural gas, and district heating) used in each sampled building should have been in the form of complete records for every billing period within calendar year 1995, provided complete coverage for 1995, and exactly covered the energy consumed within the sampled building. The data for the discrete-delivery energy source (fuel oil) should have been in the form of complete data records for all deliveries during 1995. For both continuous- and discrete-delivery energy sources, the delivered energy source should have been used entirely within the sampled building.

In practice, though, the billing data often covered either more or less square footage than just the sampled building's square footage, or did not match the target time frame—calendar year 1995. There were several common types of discrepancies between the bill coverage and the ideal of a single building and fixed time frame:

- Bill coverage included days in 1994 and 1996, as well as in calendar year 1995. This was the typical situation for a complete billing record. Rarely would one billing period begin on January 1, 1995 and another end on December 31, 1995.
- Bill coverage spanned at least a 1-year period, but did not include all of 1995. In most such cases, the time frame covered by the bills extended from the middle of 1995 into the middle of 1996. Many energy suppliers maintain accessible billing records only for the most recent 13 months. Thus, at the time of reporting, the data available did not cover the beginning of 1995.
- Bill coverage spanned less than a 1-year period.
- Bill coverage was for several sampled buildings combined. This occurred when no authorization form was obtained to authorize the supplier to provide data for individual buildings. In such cases, the supplier reported only summed annual totals for a group of sampled buildings.
- Bill coverage included nonsampled buildings or equipment outside the sampled buildings, as well as the sampled building.
- Bill coverage excluded some of the building's occupants or tenants. This undercoverage occurred when the energy supplier had several customers in a sampled building and was unable to identify all of them on the basis of the information provided by the Building Characteristics Survey respondent. In a few cases, energy suppliers were unwilling to release information on all customers in a building, even in aggregate form, without having a separate authorization from each.

The problem of determining bill coverage was compounded by incomplete dates. In the most common case, the billing period date included a month and year, but not the day of the month.

To reconcile the discrepancies between the ideal billing data and what could actually be obtained, six processing steps were taken (detailed explanations of the processing follow):

1. Each set of bills from a particular energy supplier for a particular building was classified according to the extent of coverage in terms of both building and time frame.
2. Billing dates for all energy bills were completed.
3. Bills with full-year time-frame coverage were annualized.
4. Bills with part-year time-frame coverage were annualized.
5. Annualized bills were adjusted for building overcoverage and undercoverage.
6. Annual energy consumption and expenditures for buildings with completely missing data were imputed.

Step 1. Classifying Coverage of Building and Time Frame

This classification was performed by the CBECS contractor as part of the data collection record keeping. To track responses to the mailed Energy Suppliers Survey, the contractor determined whether a response received represented complete data for a building. In many cases, follow-up letters converted initial responses from partial to complete, or nearly complete. In other cases, the incomplete response was all that could be obtained.

Determining Time Frame: An important aspect of the time-frame classification was to determine why data were missing for part of calendar year 1995. The main question was whether consumption had actually taken place during the entire year or was actually zero during the unreported time.

If consumption occurred through the entire year, data might be missing for several reasons. For example, the supplier's active records might not go back far enough or the data may simply have been lost from the supplier's record, even though in general these records did go back to the beginning of 1995.

A more complicated situation occurred when a new customer occupied a building in the middle of the target year. The data provided for this customer, for which the authorization form was signed, would be complete, but the data for the previous occupant, who consumed energy in the first part of the year, would be missing. In that case, annual consumption would be understated if the reported 1995 data were treated as complete.

The opposite situation would occur if a customer initially occupied the building in the middle of the year and no customer occupied the building for the first part of the year. In that case—no consumption during the first part of the year—annual consumption would be overstated if the reported data were annualized.

A special set of questions on the Energy Suppliers Survey forms was designed to determine if any change in customers had occurred during the target year and, if so, how those customers were covered in the reported data. However, most suppliers did not answer those questions. As a general rule, data were treated as complete if they covered a full year, whether calendar 1995 or not. Part-year data were treated as incomplete, unless the supplier specifically indicated otherwise.

Particularly complicated were some electricity and natural gas cases where individual records were provided for each customer in a building with several customers. In most such cases, bills for all the customers covered the same time frame. Sometimes, though, different customers' records covered different time frames. In those cases, it was assumed that the data were complete for each customer, but that the customers began or ended service at different times during the year. Aggregate consumption and expenditures were therefore computed for each time period by summing whichever customers had consumption during that period. If consumption was present for a particular customer in a particular period but expenditures were missing (or vice versa), aggregate expenditures (or consumption) were left as missing.

Determining Building Coverage: Building coverage was determined from information obtained from both the Building Characteristics Survey respondent and the energy suppliers. Two types of problems could arise: (1) the energy bills covered more buildings than just the sampled building or (2) the energy bills omitted some of the building's occupants. In the first case, if the Building Characteristics Survey respondent indicated that a particular supplier's bill covered several buildings, the total square footage of buildings on that bill was requested. Then a disaggregation factor was computed as the ratio of the sampled building's square footage to this total square footage. This factor allowed the total reported consumption to be adjusted downward to cover only the sampled building. In the second case, when the billing data omitted some customers in a building, an aggregation factor was computed. This factor was usually the ratio of the number of customers in the building to the number reported. Where more detailed information was available, the aggregation factor was the ratio of the total building floorspace to the floorspace occupied by the reported customers. For those cases, the reported consumption of only a portion of the building was adjusted upward to represent consumption in the building as a whole.
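As a minimal sketch of the two coverage factors, the following Python fragment computes a disaggregation factor and an aggregation factor. The function names and all figures are hypothetical; only the ratios themselves come from the description above.

```python
def disaggregation_factor(sampled_sqft, billed_sqft):
    """Share of a multi-building bill attributed to the sampled building."""
    if billed_sqft <= 0 or sampled_sqft > billed_sqft:
        raise ValueError("billed floorspace must be positive and cover the sampled building")
    return sampled_sqft / billed_sqft


def aggregation_factor(total, reported):
    """Scale-up factor when bills omit some customers.

    total and reported are either customer counts or floorspace figures.
    """
    if reported <= 0 or total < reported:
        raise ValueError("reported must be positive and no larger than total")
    return total / reported


# Hypothetical bill covering 100,000 sq ft, of which the sampled building is 25,000.
adjusted_down = 120_000 * disaggregation_factor(25_000, 100_000)  # 30000.0
# Hypothetical building with 5 customers, 4 of them covered by the bills.
adjusted_up = 40_000 * aggregation_factor(5, 4)  # 50000.0
```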

Step 2. Assigning Billing Dates

Virtually all missing billing dates were one of two types. In the first type, the month and year were entered, but the day was missing for the beginning and ending dates of all billing periods on a record. These cases were imputed by assigning "16" to each beginning date and "15" to each ending date.
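The day-of-month assignment for this first type can be sketched in Python as follows. The function name and the (year, month) tuple layout are illustrative assumptions.

```python
from datetime import date


def impute_days(begin_ym, end_ym):
    """Fill in missing days: 16 for a beginning date, 15 for an ending date.

    begin_ym and end_ym are (year, month) tuples (hypothetical layout).
    """
    begin = date(begin_ym[0], begin_ym[1], 16)
    end = date(end_ym[0], end_ym[1], 15)
    return begin, end


begin, end = impute_days((1995, 3), (1995, 4))
print(begin, end)  # 1995-03-16 1995-04-15
```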

In the second type, the day of the month was missing for some, but not all, billing periods. For each case of this second type, the billing periods affected were either bounded (surrounded by billing periods with known beginning and ending dates) or unbounded (either at the beginning or end of the set of billing periods). Any set of consecutive bounded billing periods with missing dates was assigned billing dates that would make all billing periods in the set have as close to the same number of days as possible. Unbounded billing periods were assigned beginning and/or ending dates as needed so that the number of days in each unbounded period was the same as the median number of days in billing periods of known length.
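One way to picture the bounded and unbounded rules is the Python sketch below. It assumes that the first and last calendar days of the bounded span are known from the neighboring billing periods, that dates are inclusive, and that any extra days go to the earliest periods. None of these conventions is stated in the survey documentation, and the function names are hypothetical.

```python
from datetime import date, timedelta
from statistics import median


def split_evenly(first_day, last_day, n_periods):
    """Divide an inclusive date span into n periods of near-equal length."""
    total_days = (last_day - first_day).days + 1
    if n_periods < 1 or n_periods > total_days:
        raise ValueError("need between 1 and total_days periods")
    base, extra = divmod(total_days, n_periods)
    periods, start = [], first_day
    for i in range(n_periods):
        length = base + (1 if i < extra else 0)  # earliest periods absorb the remainder
        end = start + timedelta(days=length - 1)
        periods.append((start, end))
        start = end + timedelta(days=1)
    return periods


def unbounded_length(known_lengths):
    """Length in days assigned to an unbounded period: the median known length."""
    return median(known_lengths)


periods = split_evenly(date(1995, 3, 1), date(1995, 5, 31), 3)
lengths = [(e - b).days + 1 for b, e in periods]  # [31, 31, 30]
```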

Step 3. Annualizing Full-Year Data

One of the main reasons that the CBECS requested energy supplier data from December 1994 through January 1996 was to assure that 1995 consumption would be completely accounted for in the case of a complete response. However, unless a billing period happened to end on December 31, 1994, or December 31, 1995, consumption as reported by the energy suppliers ran over from the target period of calendar 1995, forward into 1996 and backward into 1994. In general, procedures were required to trim away these excess data. For this trimming, different procedures were used for continuous- and discrete-delivery energy sources.

Continuous-Delivery Energy Sources (electricity, natural gas, and district sources): Consumption and expenditures for a billing period that extended into 1996 were adjusted by splitting the overlapping period into two subperiods, one running from the beginning date through December 31, the other from January 1 through the billing or meter reading date. Consumption and expenditures were prorated according to the number of days in each subperiod, and the consumption and expenditures for the subperiod that fell in 1995 were included in the total expenditures and consumption for 1995. An analogous procedure was used for a billing period extending into 1994. The assumption that the use of continuous-delivery energy sources took place at a constant rate throughout the billing period may be incorrect for any particular building. However, the procedure should yield approximately unbiased overall estimates.
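The day-count proration can be sketched in Python as shown below. The function name is hypothetical, and treating both the beginning and ending dates as inclusive is an assumption; the survey documentation does not state the day-counting convention.

```python
from datetime import date


def prorate_to_year(begin, end, consumption, expenditures, year=1995):
    """Return the part of a billing period's totals that falls in `year`.

    Assumes a constant daily rate of use and inclusive begin/end dates.
    """
    total_days = (end - begin).days + 1
    overlap_start = max(begin, date(year, 1, 1))
    overlap_end = min(end, date(year, 12, 31))
    days_in_year = max((overlap_end - overlap_start).days + 1, 0)
    return (consumption * days_in_year / total_days,
            expenditures * days_in_year / total_days)


# Hypothetical bill from December 17, 1995, through January 15, 1996:
# 15 of its 30 days fall in 1995, so half of each total is kept.
kwh_1995, dollars_1995 = prorate_to_year(date(1995, 12, 17), date(1996, 1, 15), 3000, 240)
print(kwh_1995, dollars_1995)  # 1500.0 120.0
```

The same function handles a billing period that begins in 1994, because the overlap is computed against both ends of the target year.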

Discrete-Delivery Energy Source (fuel oil): Billing periods that extended outside 1995 did not affect fuel oil, because all deliveries during 1995 were accumulated for that energy source. The ending dates on the bills were used to determine which bills were for deliveries during 1995. No attempt was made to prorate bills, since there was no necessary connection between billing dates and consumption, as there was for continuous-delivery energy sources.
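For comparison with the proration used for continuous-delivery sources, the fuel oil rule amounts to a simple filter on the bill ending date. The sketch below is illustrative only; the tuple layout and all figures are hypothetical.

```python
from datetime import date


def annual_fuel_oil(bills, year=1995):
    """Accumulate deliveries whose bill ending date falls in `year`.

    bills: iterable of (ending_date, gallons, dollars). Bills are never prorated.
    """
    in_year = [b for b in bills if b[0].year == year]
    return sum(b[1] for b in in_year), sum(b[2] for b in in_year)


bills = [
    (date(1994, 12, 20), 500, 400),  # excluded: ends in 1994
    (date(1995, 2, 3), 600, 480),
    (date(1995, 11, 15), 450, 370),
    (date(1996, 1, 9), 550, 450),    # excluded: ends in 1996
]
print(annual_fuel_oil(bills))  # (1050, 850)
```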

For both continuous- and discrete-delivery cases, where the billing time frame covered a full year but was shifted so that either the beginning or the end of 1995 was not included, a similar procedure was used. In those cases, the data were annualized to a one year period within the reported time frame, to overlap as much as possible with 1995.

Step 4. Annualizing Part-Year Data

The annualization procedures for cases that had partial billing data, but less than a full year's data, were also different for continuous- and discrete-delivery energy sources.

Continuous-Delivery Energy Sources: The number of reported days of consumption was at least as large as the number of reported days of expenditures for almost all sets of bills. Expenditures were annualized by using the partial expenditures data and the annualized consumption data.

The part-year annualization method for the consumption of continuous-delivery energy sources depended on the number of days of reported consumption. If at least 331 days were reported, then consumption for the missing portion of the year was imputed by computing the average consumption per day for the adjacent billing period(s), then multiplying by the number of days of missing data. In certain cases, at least 331 days of consumption were reported, but 365 days of expenditures were reported. In those cases, the missing consumption was computed by using the average price for billing periods in which both consumption and expenditures were reported. Summing all reported and imputed consumption then yielded the total annual consumption.
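The basic imputation for a short gap can be sketched in Python as follows. The function names are hypothetical. Pooling the adjacent billing periods into one average daily rate is an assumption, since the documentation does not say how two adjacent periods were combined. The sketch does not cover the price-based variant or sets of bills with fewer than 331 reported days.

```python
def impute_gap(adjacent_periods, missing_days):
    """Consumption for a gap, from the average daily use in adjacent periods.

    adjacent_periods: list of (consumption, days) for the period(s) next to the gap.
    """
    total_use = sum(c for c, _ in adjacent_periods)
    total_days = sum(d for _, d in adjacent_periods)
    return total_use / total_days * missing_days


def annualize_consumption(reported_total, reported_days, adjacent_periods,
                          min_days=331, year_days=365):
    """Reported plus imputed consumption, or None when this rule does not apply."""
    if reported_days < min_days:
        return None  # outside the scope of this sketch
    missing_days = year_days - reported_days
    if missing_days <= 0:
        return reported_total
    return reported_total + impute_gap(adjacent_periods, missing_days)


# Hypothetical case: 335 days reported, with a 30-day gap between two 30-day bills.
total = annualize_consumption(33_500, 335, [(3000, 30), (3300, 30)])
print(total)  # 36650.0
```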

Expenditure imputations were performed after completion of all imputations for partially missing consumption since (1) consumption data were usually more complete than expenditures data, and (2) given a value for consumption, the expenditures could be estimated without difficulty.

As was true for consumption, the imputation procedure for missing continuous-delivery expenditures was determined by the number of days of reported data. If 30 or fewer days of expenditures were reported, then the expenditures were treated as completely missing. Otherwise, expenditures were imputed based on average prices within the set of bills for a given building. Using bills where both consumption and expenditures were reported, the consumption and the expenditures were summed. The average price was calculated as the sum of the expenditures divided by the sum of the consumption. This average price was multiplied by the reported (or imputed) consumption to obtain the estimated expenditures.
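A minimal Python sketch of the average-price imputation follows. It assumes that consumption is already present (reported or imputed) for every billing period and that the average price is applied only to periods lacking expenditures; the dictionary keys and function name are hypothetical.

```python
def annual_expenditures(bills):
    """Total expenditures, imputing missing periods at the building's average price.

    bills: list of dicts with 'days', 'consumption', and 'expenditures'
    (None when expenditures are missing).
    """
    reported = [b for b in bills if b["expenditures"] is not None]
    if sum(b["days"] for b in reported) <= 30:
        return None  # 30 or fewer days reported: treated as completely missing
    avg_price = (sum(b["expenditures"] for b in reported)
                 / sum(b["consumption"] for b in reported))
    return sum(b["expenditures"] if b["expenditures"] is not None
               else avg_price * b["consumption"]
               for b in bills)


bills = [
    {"days": 30, "consumption": 1000, "expenditures": 80.0},
    {"days": 31, "consumption": 1100, "expenditures": 88.0},
    {"days": 30, "consumption": 900, "expenditures": None},  # imputed at $0.08 per unit
]
total_dollars = annual_expenditures(bills)  # about 240
```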

Discrete-Delivery Energy Source: The billing dates for fuel oil, a discrete-delivery energy source, are not linked to the time of consumption. Thus, the annualized data represent the total deliveries of fuel oil during the year. Furthermore, unlike continuous-delivery bills, discrete-delivery bills tend to be irregularly spaced. Gaps between bills could represent either missing data or periods during which no deliveries were required. The completeness of a set of bills was determined by relying on reports of suppliers. A set of bills was treated as complete if the supplier stated that the bills were complete for the year, and treated as missing otherwise, even if a partial set of bills was available.

Buildings rarely had more than one supplier for a continuous-delivery energy source, such as electricity, but multiple suppliers for fuel oil occurred frequently. If data for one or more of several suppliers were missing, even though responding suppliers had reported all their 1995 deliveries, those buildings were also treated as if no data were available.

Imputations for both deliveries and expenditures used the observed price(s). An average price, Px, for each set of bills, was computed by using the data from billing periods in which both consumption and expenditures were reported. If expenditures were missing, the expenditures were imputed as Px times the quantity delivered. For missing deliveries, the reported expenditures were divided by Px to impute the amount delivered.
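The two-way use of Px can be sketched as below. This is an illustration under stated assumptions, not the survey's code: the function names are hypothetical, and each bill is represented as a (quantity, expenditures) pair with None for a missing item.

```python
def average_price(bills):
    """Px: total expenditures over total quantity, using only complete bills."""
    complete = [(q, e) for q, e in bills if q is not None and e is not None]
    return sum(e for _, e in complete) / sum(q for q, _ in complete)


def fill_bill(quantity, expenditures, px):
    """Impute whichever item is missing from the other item and Px."""
    if expenditures is None and quantity is not None:
        expenditures = px * quantity
    elif quantity is None and expenditures is not None:
        quantity = expenditures / px
    return quantity, expenditures


bills = [(500, 400.0), (600, 480.0), (None, 360.0), (450, None)]
px = average_price(bills)                       # 880 / 1100, about 0.80 per gallon
filled = [fill_bill(q, e, px) for q, e in bills]
```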

Step 5. Adjusting for Building Overcoverage and Undercoverage

The annualization procedures for full- and part-year data corrected for inconsistent time-frame coverage. After the nonmissing consumption and expenditures data were annualized, the annual values were adjusted for building coverage. Where data were requested from the supplier for a single sampled building, but were provided only for a group of buildings, including the sampled one, or were provided only for a portion of the building, the coverage adjustment was a simple multiplication of the annualized consumption and expenditures by the disaggregation or aggregation factor. As described under Step 1 above, that factor was computed by the survey contractor directly on the basis of information received on the building or suppliers survey.

Step 6. Imputing for Completely Missing Consumption and Expenditures

In a significant fraction of cases, the energy supplier did not provide the consumption or expenditures data for some or all billing periods or deliveries in 1995. Reasons for missing data included energy supplier refusal; archived, lost, or destroyed billing records; and authorization form refusal on the part of the building respondent. In other cases, the energy supplier provided data, but either the building data were combined with those of nonsampled buildings and could not be disaggregated or the consumption or expenditures, or both, were incomplete enough to be treated as missing.

The general approach taken to the problem of imputing annual consumption or expenditures was to annualize the complete or partial sets of bills first, then use those annualized bills in regression equations to develop imputed values for the data that were totally missing. The regression imputation approach was chosen because data from the Building Characteristics Survey were already available for all of the buildings that lacked energy supplier data. The first step was the estimation of missing consumption based on characteristics of buildings. After the consumption had been imputed, missing expenditures were estimated based on the reported or imputed consumption.

Completely Missing Consumption: Each of the energy sources presented in this report was imputed separately, although the overall methodology was similar for all. The consumption imputation method is, therefore, described in general terms, and refers to individual energy sources only where necessary. The regression equations were developed primarily to serve as adequate predictors of building consumption based on building characteristics.

The data used to specify regression equations and estimate the regression parameters used for consumption imputation had to meet several criteria. Only cases with essentially complete consumption data were used. For continuous-delivery energy sources, "essentially complete data" included buildings with 331 to 365 days of reported consumption; for discrete-delivery energy sources, only buildings with completely reported deliveries were included. In addition, cases were not used to estimate regression parameters if the information received from the energy supplier included too much data from unsampled buildings (before disaggregation) or the data reported from the building respondent were missing key regressor variables.

The development of regression equations began with an examination of the distributions of the dependent variable "consumption." Experience showed that the error term associated with the regression procedure is highly skewed in the positive direction. Consequently, the regression procedures used for the 1995 CBECS minimized the sum of squares of the difference between the log of the actual consumption and the log of the predicted consumption rather than the sum of squares of the difference between the actual consumption and the predicted consumption. Accordingly, the imputed consumption values were calculated by using parameter values estimated in two stages: the initial regression of consumption on building characteristics, and a bias correction. The bias correction coefficient was estimated by (1) summing the total actual consumption of cases used to estimate the regression parameters, (2) summing the total of the predicted values for these same cases, and (3) dividing the sum of the actual values (1) by the sum of the predicted values (2).
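The two-stage estimate can be illustrated with a deliberately simplified Python sketch. It uses a single regressor (the log of floorspace), which is an assumption; the actual equations used several building characteristics that are not listed here. The function names and data are hypothetical.

```python
import math


def fit_log_regression(x, y):
    """Least squares fit of log(y) on log(x), plus a ratio bias correction."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
             / sum((a - mx) ** 2 for a in lx))
    intercept = my - slope * mx
    # Stage 2: sum of actual values divided by sum of back-transformed predictions.
    predicted = [math.exp(intercept + slope * a) for a in lx]
    bias = sum(y) / sum(predicted)
    return intercept, slope, bias


def impute_consumption(x_value, params):
    """Bias-corrected prediction for a building with missing consumption."""
    intercept, slope, bias = params
    return bias * math.exp(intercept + slope * math.log(x_value))


sqft = [5_000, 12_000, 30_000, 80_000, 200_000]          # hypothetical floorspace
kwh = [60_000, 170_000, 350_000, 1_200_000, 2_500_000]   # hypothetical consumption
params = fit_log_regression(sqft, kwh)
estimate = impute_consumption(50_000, params)
```

By construction, the corrected predictions for the fitting cases sum to the actual total, which is the purpose of the bias correction described above.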

Completely Missing Expenditures: Similar to consumption imputations, expenditure imputations were performed separately for each of the four major fuels, although the overall methodologies for each fuel were similar. Again, the imputations are described in general terms, and refer to individual energy sources only where necessary.

Energy supplier rate schedules are usually structured so that the price per unit of energy falls as consumption increases. The rate schedule is usually a step function, and the definitions of steps and rates vary by energy supplier and by rate class. For the CBECS, rate schedules were not collected for the sampled buildings, but many suppliers did submit an overall rate schedule for their commercial customers, which was useful in estimating expenditures. In cases where rate schedules were not supplied, a statistical procedure was used to relate the expenditures to the consumption for imputation purposes.

As with the consumption imputations, the data used to specify the form and estimate the parameters of the expenditure imputation equations had to meet two criteria. First, only cases with essentially complete consumption and expenditures were used. For continuous-delivery energy sources, "essentially complete data" included buildings with 331 to 365 days of reported data for both consumption and expenditures; for discrete-delivery energy sources, only buildings with completely reported deliveries and expenditures were included. Second, cases were not used to estimate regression parameters if the information received from the supplier included too much data from unsampled buildings before disaggregation.

Once cases with complete expenditures data were chosen, they were used to develop an ordinary least squares regression equation to relate expenditures to consumption and to the fuel price for commercial customers. The independent variables were chosen to mimic a decreasing block rate structure. The resulting fitted equation was used to impute for cases where expenditures were missing.

Annual Peak Electricity Demand

Peak electricity demand data were requested for the same billing periods for which electricity consumption and expenditures were reported. Ideally, the metered demand represented the maximum consumption rate (in kilowatts) during the billing period. However, two special data problems affected the availability of peak electricity demand data.

First, although virtually all electricity consumption is metered, peak electricity demand is only metered where it is economical to do so. In general, peak demand meters are only installed for larger consumers of electricity. Second, in multicustomer buildings, each customer with a demand meter had its own peak demand. Since those peaks would rarely be coincident, the building peak cannot be taken as the sum of individual peaks. However, the overall building peak must be greater than or equal to the maximum customer peak. Following Step 2 described in the previous section "Annual Consumption and Expenditures," the peak electricity demand data were processed in three additional steps.

Step 1. Classifying buildings as demand-metered or not demand-metered

For the 1995 CBECS, a building was considered to be demand-metered if the billing data for any account within the building showed metered peak demand. (The 1989 CBECS obtained demand-metered information from both the building respondent and the energy supplier. However, there was considerable discrepancy between the two sources of data and subsequent CBECS obtained this information only from the energy supplier.)

Step 2. Determining the annual peak demand, the season of the peak, and the annual load factor for each building

For single-account buildings that were determined to be demand-metered, the annual peak demand was taken as the maximum of the billing period peaks. For the few buildings that had part-year electricity billing data, the annual peak was taken as the maximum of the peaks in the reported billing periods. This approach results in a slight understatement of the annual peak, because the actual peak may have occurred during one of the unreported periods. However, since the number of buildings involved was relatively small, the difference between the part-year and full-year maxima would be small in most cases.

In multicustomer buildings, the overall building peak demand was not available. However, the overall peak had to be at least as high as the highest peak reported for any single customer. In buildings where one customer's peak was substantially larger than that of any other customer, that customer's peak would have been close to the overall peak. Therefore, in processing bills from multicustomer buildings, the peak demand for any single customer was designated as a "partial peak" (associated with part of the building electricity consumption), although the overall building peak was still treated as missing.

Before assigning the peak to a season, the month of the peak was found. Since the exact time of the billing period peak was unknown, the peak was taken to have occurred in whichever month contained the most days in the billing period during which the peak occurred. Peaks occurring in November through April were classified as winter peaks, while those occurring in May through October were classified as summer peaks.
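The month and season assignment can be sketched in Python as follows. The function names are hypothetical, dates are assumed to be inclusive, and a tie between two months is resolved in favor of the earlier month, which is an assumption not addressed in the documentation.

```python
from collections import Counter
from datetime import date, timedelta


def peak_month(begin, end):
    """Month (1-12) containing the most days of the billing period."""
    days = Counter()
    d = begin
    while d <= end:
        days[(d.year, d.month)] += 1
        d += timedelta(days=1)
    # max() returns the first maximum, so ties go to the earlier month.
    (_, month), _ = max(days.items(), key=lambda kv: kv[1])
    return month


def peak_season(month):
    """May through October is summer; November through April is winter."""
    return "summer" if 5 <= month <= 10 else "winter"


# Hypothetical billing period: 11 days in April and 19 days in May.
month = peak_month(date(1995, 4, 20), date(1995, 5, 19))
print(month, peak_season(month))  # 5 summer
```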

The annual load factor was then calculated, using previously calculated annual electricity consumption, as follows:

annual load factor = (annual consumption)÷(365×24×peak annual demand)

As an edit, the annual load factor was calculated by using the partial peak, and the partial peak was set to missing if the load factor was less than 0.1 or greater than 1.
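A short Python sketch of the load factor and the associated edit is given below. It assumes annual consumption in kilowatthours and peak demand in kilowatts, so that the ratio is dimensionless; the function names and figures are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours


def annual_load_factor(annual_kwh, peak_kw):
    """Average demand over the year as a fraction of peak demand."""
    return annual_kwh / (HOURS_PER_YEAR * peak_kw)


def edit_partial_peak(annual_kwh, partial_peak_kw):
    """Set the partial peak to missing (None) if the implied load factor is implausible."""
    lf = annual_load_factor(annual_kwh, partial_peak_kw)
    return None if lf < 0.1 or lf > 1 else partial_peak_kw


print(annual_load_factor(438_000, 100))  # 0.5
print(edit_partial_peak(438_000, 1000))  # None (load factor 0.05)
print(edit_partial_peak(438_000, 40))    # None (load factor 1.25)
```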

Step 3. Imputing peak demand and season of peak for demand-metered buildings

Although any electricity consumer has a peak demand, three types of buildings were missing peak demand: (1) buildings not determined to be demand-metered; (2) buildings with completely missing supplier data; (3) multicustomer buildings, and other buildings with partial peaks. No attempt was made to impute for the first type of missing demand, mainly because buildings without demand-metering tended to be smaller than the demand-metered buildings, so that imputation would involve extrapolation beyond the range of the reported data. Accordingly, tables dealing with peak electricity demand have been limited to buildings with (reported or imputed) demand-metering. Once the decision was made to exclude buildings that had not been demand-metered, imputation became a two-step process. First, it was necessary to impute whether the building with missing data was demand-metered. If the building was imputed to be a demand-metered building, then the peak and season of the peak were imputed.

Imputation of the demand-metering attribute made use of the relationship observed within suppliers between the presence of demand-metering and annual electricity consumption. For those buildings with reported data, the probability of being a demand-metered building was estimated as a logistic function of the annual consumption. The parameters estimated from the reported data regression were used to estimate probabilities for each unclassified building, and a uniform random number was generated. If the random number was less than or equal to the estimated probability, then the building was imputed to be demand-metered. For buildings imputed to be demand-metered, the season of peak demand was imputed by hot-decking, the same method used to impute missing items from the Building Characteristics Survey.
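The random assignment of the demand-metering attribute can be sketched as follows. The coefficients shown are invented for illustration and are not the estimated parameters; the function names are likewise hypothetical, and the hot-deck step for season of peak is not shown.

```python
import math
import random


def demand_meter_probability(annual_kwh, intercept, slope):
    """Logistic function of annual consumption."""
    z = intercept + slope * annual_kwh
    return 1.0 / (1.0 + math.exp(-z))


def impute_demand_metered(annual_kwh, intercept, slope, rng):
    """Impute demand-metered status by comparing a uniform draw with the probability."""
    p = demand_meter_probability(annual_kwh, intercept, slope)
    return rng.random() <= p


rng = random.Random(1995)        # seeded so that the draw is reproducible
INTERCEPT, SLOPE = -2.0, 1e-5    # hypothetical coefficients
p = demand_meter_probability(200_000, INTERCEPT, SLOPE)  # about 0.5
metered = impute_demand_metered(200_000, INTERCEPT, SLOPE, rng)
```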

Finally, annual load factors were imputed for each building imputed to be demand-metered. Values were imputed by using parameters estimated from a linear regression of the logistic transformation of the annual load factor on various building characteristics (such as weekly operating hours, end uses of electricity, and percent of floorspace heated). Separate imputation equations were estimated for each of nine principal building activities. The imputed annual peak demand was then calculated by solving the load factor equation for the annual peak.

Load factors were imputed, and peak demand values calculated, for multiple-account buildings that had partial peaks. If the partial peak was less than the imputed peak, then the imputed peak was treated as the building's annual peak demand; otherwise, the partial peak was used.
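The last two calculations, solving the load factor equation for the peak and reconciling the result with a partial peak, can be sketched in Python. The sketch reads "logistic transformation" as the logit, ln(LF / (1 - LF)), which is an assumption; it also assumes consumption in kilowatthours and demand in kilowatts. The function names are hypothetical, and the regression that produces the predicted value is not shown.

```python
import math

HOURS_PER_YEAR = 365 * 24


def inverse_logit(z):
    """Map a predicted logit back to a load factor between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def imputed_peak(annual_kwh, predicted_logit):
    """Solve load factor = consumption / (8,760 x peak) for the peak."""
    load_factor = inverse_logit(predicted_logit)
    return annual_kwh / (HOURS_PER_YEAR * load_factor)


def final_peak(imputed_kw, partial_kw=None):
    """Use the imputed peak unless a partial peak is at least as large."""
    if partial_kw is None or partial_kw < imputed_kw:
        return imputed_kw
    return partial_kw


peak = imputed_peak(438_000, 0.0)  # logit of 0 gives a load factor of 0.5
print(peak)                        # 100.0
print(final_peak(peak, 80.0))      # 100.0 (imputed peak kept)
print(final_peak(peak, 120.0))     # 120.0 (partial peak kept)
```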

Load factors and peak intensities were computed for each building reported or imputed to have metered demand. Also of interest are the analogous ratios over a utility service region, or other large area. The ratio of a region's consumption to the annual peak for the region as a whole would represent the average utilization of the region's generating capacity. The ratio of the region's annual peak to the total floorspace in the region would represent the average capacity requirement per square foot. However, the regional peak cannot be determined from the individual annual (or even monthly) peaks alone, since these peaks are not coincident. That is, the individual peaks occur at different times, so that the sum of the individual peaks can be considerably greater than the overall regional peak.
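A small numerical illustration of non-coincident peaks follows. The two load profiles are invented, with four time intervals each; the function names are hypothetical.

```python
def coincident_peak(load_profiles):
    """Peak of the combined load: the maximum over time of the summed demands."""
    totals = [sum(interval) for interval in zip(*load_profiles)]
    return max(totals)


def sum_of_individual_peaks(load_profiles):
    """Sum of each customer's own peak, regardless of when it occurs."""
    return sum(max(profile) for profile in load_profiles)


office = [20, 80, 60, 10]      # kW by interval; peaks in the second interval
restaurant = [10, 30, 50, 70]  # kW by interval; peaks in the last interval

print(coincident_peak([office, restaurant]))          # 110
print(sum_of_individual_peaks([office, restaurant]))  # 150
```

The coincident peak also cannot be smaller than the largest individual peak (80 kW here), which is the lower bound used for multicustomer buildings.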