
Commercial Buildings Energy Consumption Survey (CBECS)

1999 CBECS Survey Data

Public Use Microdata

The Public Use Files are microdata files that contain 5,430 records, representing commercial buildings from the 50 States and the District of Columbia. Each record corresponds to a single responding, in-scope sampled building and contains information for that building such as the building size, year constructed, types of energy used, energy-using equipment, and conservation features.

CBECS data are available for the 4 Census regions and 9 Census divisions. No state-level data are available.

The 1999 Public Use Files are constructed in ASCII format. The records are comma delimited with fixed column positions.

1999 Commercial Buildings Energy Consumption Survey (CBECS) Public Use Files

File | Layout File | Data File | Revised Date
File 1: General Building Information and Energy End Uses | TXT | CSV | 10/01
File 2: Building Activities, Special Measures of Size, and Multibuilding Facilities | TXT | CSV | 10/01
File 3: Heating and Cooling Equipment and Conservation Features | TXT | CSV | 10/01
File 4: Water Heating, Refrigeration, and Office Equipment and Special Space Uses | TXT | CSV | 10/01
File 5: End Uses of Major Energy Sources, Electricity Generation, and Purchasing of Electricity and Natural Gas | TXT | CSV | 10/01
File 6: Minor Energy Sources and End Uses for Minor Energy Sources | TXT | CSV | 10/01
File 7: Lighting Percents, Equipment, and Conservation Features | TXT | CSV | 10/01
File 8: Imputation Flags for File 1 | TXT | CSV | 10/01
File 9: Imputation Flags for File 2 | TXT | CSV | 10/01
File 10: Imputation Flags for File 3 | TXT | CSV | 10/01
File 11: Imputation Flags for File 4 | TXT | CSV | 10/01
File 12: Imputation Flags for File 5 | TXT | CSV | 10/01
File 13: Imputation Flags for File 6 | TXT | CSV | 10/01
File 14: Imputation Flags for File 7 | TXT | CSV | 10/01
File 15: Consumption and Expenditures for Sum of Major Fuels and Electricity (includes Imputation Flags) | TXT | CSV | 10/01
File 16: Consumption and Expenditures for Natural Gas, Fuel Oil, and District Heat (includes Imputation Flags) | TXT | CSV | 10/01
How are these files used?      
Download all format codes (text file, 9Kb)      
Download all layout files and format codes (PDF file, 373 Kb)      
Note: The following are all text files. Layout files range from 3 to 6 Kb; data files range from 488 Kb to 1.16 MB.

File Organization

Because of the size of the CBECS questionnaire, the variables were separated into groups by subject matter. These 16 smaller files make it easier to manipulate the data.

Several variables are frequently used in the analysis of commercial energy data. These core variables are included in each group of variables:

  • PUBID7: building identifier, which is the link between files;
  • ADJWT7: adjusted sampling weight;
  • STRATUM7 and PAIR7: variance stratum and pair member, which can be used for calculating variances;
  • REGION7 and CENDIV7: Census region and division;
  • SQFT7 and SQFTC7: square footage, both exact and category;
  • PBA7: principal building activity;
  • YRCONC7: year constructed category; and
  • ELUSED7, NGUSED7, FKUSED7, PRUSED7, STUSED7, HWUSED7: a set of variables indicating whether electricity, natural gas, fuel oil, propane, district steam, or district hot water was used in the building.

For each group of variables, there are two items: a layout file and a data file. The layout file is a text file that gives, for each variable on a file, the variable name, a description, the position on the file, and the corresponding format. The data file is an ASCII comma-delimited file.
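As a minimal sketch of reading one data file in Python, the function below pairs each comma-delimited value with a variable name. It assumes the data file has no header row and that the caller supplies the variable names in the order given by the layout file; both points should be checked against the actual files. The function name `read_records` and the sample values are hypothetical.

```python
import csv
import io

def read_records(data_text, names):
    """Parse comma-delimited text into one dict per building record."""
    reader = csv.reader(io.StringIO(data_text))
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Fixed column positions may leave padding around values, so strip it.
        records.append(dict(zip(names, (field.strip() for field in row))))
    return records

# Hypothetical variable order and values, for illustration only.
names = ["PUBID7", "REGION7", "SQFT7"]
sample = "    1, 2,   5500\n    2, 4, 120000\n"
records = read_records(sample, names)
```

All values are kept as strings here, since codes such as PBA7="02" carry leading zeros; numeric variables can be converted when they are needed.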

To determine what the different values for each variable represent, use the provided text file of all the format codes (alternatively, there is a PDF document containing all the layout files and format codes). The formats are arranged in alphabetical order and are written so that they may be easily turned into a SAS format library.

To download the data files, click on "Data File" to open it, then select "File" and "Save as" to save the file to your hard drive or a disk.


To find the national estimate for... | Do this... | And you should get...
Total number of buildings | Sum ADJWT7 | 4,656,772.04 (or 4,657 thousand)
Total number of office buildings | Sum ADJWT7 for cases where PBA7="02" | 738,743.03 (or 739 thousand)
Total floorspace | Create a new variable (weighted square footage) by multiplying ADJWT7 by SQFT7 for each case, then sum this new variable | 67,337,801,620 (or 67,338 million square feet)
Total floorspace in buildings that use natural gas | Sum the new weighted square footage variable (see above) for cases where NGUSED7="1" | 45,525,252,716 (or 45,525 million square feet)
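
The four recipes in the table above can be sketched in Python as follows. The three records are invented for illustration, so the totals below are not the published estimates. The codes PBA7="06" and NGUSED7="2" are placeholders standing for "not an office" and "natural gas not used"; the real codes should be taken from the format codes file.

```python
# Hypothetical records; a real analysis would load all 5,430 cases.
records = [
    {"PUBID7": "1", "ADJWT7": 1000.0, "SQFT7": 5000, "PBA7": "02", "NGUSED7": "1"},
    {"PUBID7": "2", "ADJWT7": 500.0, "SQFT7": 20000, "PBA7": "06", "NGUSED7": "2"},
    {"PUBID7": "3", "ADJWT7": 250.0, "SQFT7": 100000, "PBA7": "02", "NGUSED7": "1"},
]

# Total number of buildings: sum of the weights.
total_buildings = sum(r["ADJWT7"] for r in records)

# Total number of office buildings: sum of the weights where PBA7 is "02".
office_buildings = sum(r["ADJWT7"] for r in records if r["PBA7"] == "02")

# Weighted square footage: weight times square footage, per case.
for r in records:
    r["WTSQFT"] = r["ADJWT7"] * r["SQFT7"]  # WTSQFT is a made-up name

# Total floorspace, then floorspace in buildings that use natural gas.
total_floorspace = sum(r["WTSQFT"] for r in records)
gas_floorspace = sum(r["WTSQFT"] for r in records if r["NGUSED7"] == "1")
```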

Each of these 16 files can be used by itself or be merged with other files for more complex analyses. By merging files together, a new file can be created that contains, for each respondent, variables from two or more files. The variable PUBID7 should be used to link the files.
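
A merge on PUBID7 might look like the sketch below, which keeps only buildings present in both inputs. The function name `merge_on_pubid` and the non-core variable names `VAR_A` and `VAR_B` are hypothetical; the core variables appear in every file, so the merged record simply carries one copy of each.

```python
def merge_on_pubid(left_rows, right_rows):
    """Join two lists of record dicts on the PUBID7 building identifier."""
    right_by_id = {row["PUBID7"]: row for row in right_rows}
    merged = []
    for left in left_rows:
        right = right_by_id.get(left["PUBID7"])
        if right is not None:
            # Shared core variables collapse into a single key each.
            merged.append({**left, **right})
    return merged

# Hypothetical extracts from two of the 16 files.
file_a = [
    {"PUBID7": "1", "ADJWT7": "1000.0", "VAR_A": "3"},
    {"PUBID7": "2", "ADJWT7": "500.0", "VAR_A": "1"},
]
file_b = [
    {"PUBID7": "2", "ADJWT7": "500.0", "VAR_B": "7"},
    {"PUBID7": "1", "ADJWT7": "1000.0", "VAR_B": "4"},
]
merged = merge_on_pubid(file_a, file_b)
```

Indexing one file by PUBID7 first means the two files do not need to be in the same order.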

The CBECS sample was designed so that survey responses could be used to estimate characteristics of the entire commercial buildings stock nationwide. All published CBECS tables report national estimates.

To arrive at national estimates from the CBECS sample, base sampling weights were calculated for each building; each base weight is the reciprocal of the probability of that building being selected into the sample. A building with a base weight of 1,000 therefore represents itself and 999 similar but unsampled buildings in the total building stock. The base weight is further adjusted to account for nonresponse bias. The variable ADJWT7 in the data file is the final weight. To obtain a national estimate, each sample building's value must be multiplied by the building's weight, and the products summed across buildings.
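
In symbols, the weighting described above can be written as follows. The notation is introduced here for illustration and is not from the CBECS documentation: \pi_i is the selection probability of building i, w_i is its final weight (ADJWT7), y_i is the building's value for the item of interest, and n is the number of records.

```latex
w_i^{\text{base}} = \frac{1}{\pi_i},
\qquad
\hat{Y} = \sum_{i=1}^{n} w_i \, y_i
```

For example, a selection probability of 1/1,000 gives a base weight of 1,000. Setting y_i = 1 for every building yields the estimated number of buildings, and setting y_i to SQFT7 yields the estimated total floorspace.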

How are the variables that begin with "Z" different from the "NON-Z" variables?

Files 8 through 14 contain variables that begin with the letter Z. These "Z variables" are also referred to as "imputation flags." Imputation is a statistical procedure used to fill in values for missing items. Missing values for many, but not all, of the variables were imputed in 1999. The imputation flag indicates whether the corresponding non-Z variable was reported, imputed, or inapplicable. There are no corresponding "Z variables" for variables from the CBECS questionnaire which were not imputed, variables where there was no missing data, and variables which were derived based on other variables.

How is the survey respondent's confidentiality protected?

The EIA neither receives nor takes possession of the names or addresses of individual respondents or any other individually identifiable energy data that could be specifically linked with a building respondent. All names and addresses are maintained by the survey contractor for survey verification purposes only. Geographic identifiers and National Oceanic and Atmospheric Administration Weather Division identifiers are not included on any data files delivered to EIA. Geographic location information is provided to EIA at the Census division level. In addition, specific building characteristics that could uniquely identify a particular responding building are masked on any data provided to EIA by the survey contractor.

The specific variables that have been modified to protect the confidentiality of respondents are:

Square Footage: For buildings over one million square feet, the numeric square footage was replaced with the weighted average square footage of all responding buildings over one million square feet. Separate weighted means were calculated for each of the four Census regions. For buildings of one million square feet or less, the numeric square footage was rounded to within 5 percent of the upper limit of the building's square footage category. If the rounded value fell below the lower limit of the category, the value was coded at the lower limit. For example, buildings in the range of 5,001 to 10,000 square feet were rounded to the nearest 500 square feet, except that buildings rounding to 5,000 were coded as 5,001.
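
The rounding rule for buildings of one million square feet or less can be sketched as below. This is an illustration, not EIA's actual procedure: the function name `mask_square_footage` is hypothetical, the category limits are passed in by the caller, the upper limit is assumed to be a multiple of 20 so that 5 percent is a whole number, and exact ties are assumed to round upward.

```python
def mask_square_footage(sqft, lower, upper):
    """Round sqft to the nearest 5 percent of the category's upper limit."""
    step = upper // 20  # 5 percent of the upper limit, e.g. 500 for 10,000
    rounded = (sqft + step // 2) // step * step  # nearest multiple, ties up
    # A value that rounds below the category is coded at the lower limit.
    return max(rounded, lower)

# The 5,001 to 10,000 square foot category from the example in the text.
print(mask_square_footage(7240, 5001, 10000))  # 7000
print(mask_square_footage(5100, 5001, 10000))  # 5001
print(mask_square_footage(9800, 5001, 10000))  # 10000
```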

Number of Workers: For buildings where the numeric number of workers was between 2,500 and 4,999, the reported number was rounded to the nearest 250. For buildings where the numeric number of workers was 5,000 or more, the reported numeric number of workers was replaced with the weighted average number of workers of all responding buildings with 5,000 or more workers. Separate weighted means were calculated for each of the four Census regions.

Number of Floors: The upper range of the number of floors was replaced with two categories: 15 to 25 floors (coded as 994 on the file) and over 25 floors (coded as 995 on the file).
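
A short sketch of this top-coding, with a hypothetical function name and the assumption that counts below 15 floors are left unchanged:

```python
def code_number_of_floors(floors):
    """Replace high floor counts with the two category codes."""
    if floors > 25:
        return 995  # over 25 floors
    if floors >= 15:
        return 994  # 15 to 25 floors
    return floors  # assumed to be reported as-is below 15 floors
```

When analyzing the data, values of 994 and 995 therefore need to be treated as category codes rather than as floor counts.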

Special Measures of Occupancy: Seven special measures of occupancy are included in the 1999 CBECS (seating capacity for religious buildings, public assembly buildings, education buildings, and food service buildings; licensed bed capacity for in-patient health care and skilled nursing buildings; and number of guest rooms for lodging buildings). These numbers were rounded as follows: fewer than 25 units (no rounding performed); 25-49 units (rounded to nearest 5); 50-99 units (rounded to nearest 10); 100-249 units (rounded to nearest 25); 250-499 units (rounded to nearest 50); 500-999 units (rounded to nearest 100); 1,000-2,499 units (rounded to nearest 250); 2,500-4,999 units (rounded to nearest 500); 5,000 or more units (rounded to nearest 1,000).
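
The rounding tiers above lend themselves to a small lookup table. The sketch below is illustrative only: the names `ROUNDING_TIERS` and `round_occupancy` are hypothetical, and exact ties are assumed to round upward, which the source does not specify.

```python
# (lower bound of tier, rounding step), in ascending order.
ROUNDING_TIERS = [
    (25, 5),
    (50, 10),
    (100, 25),
    (250, 50),
    (500, 100),
    (1000, 250),
    (2500, 500),
    (5000, 1000),
]

def round_occupancy(units):
    """Round a special measure of occupancy according to its tier."""
    step = None
    for lower, tier_step in ROUNDING_TIERS:
        if units >= lower:
            step = tier_step  # keep the step of the highest tier reached
    if step is None:
        return units  # fewer than 25 units: no rounding performed
    return (units + step // 2) // step * step  # nearest multiple of step

print(round_occupancy(24))    # 24
print(round_occupancy(37))    # 35
print(round_occupancy(120))   # 125
print(round_occupancy(6400))  # 6000
```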

Questions about CBECS may be directed to:

Joelle Michaels
Survey Manager