AMERICAN STATISTICAL ASSOCIATION (ASA)
MEETING OF THE ASA COMMITTEE ON ENERGY STATISTICS WITH THE ENERGY INFORMATION ADMINISTRATION (EIA) OF THE UNITED STATES DEPARTMENT OF ENERGY
Washington, D.C.
Friday, October 21, 2005

BETA COURT REPORTING www.betareporting.com 202-464-2400 800-522-2382

COMMITTEE MEMBERS:
NICOLAS HENGARTNER, Chair, Los Alamos National Laboratory
MARK BERNSTEIN, RAND Corporation
JOHNNY BLAIR, Abt Associates
MARK BURTON, University of Tennessee
JAE EDMONDS, University of Maryland
MOSHE FEDER, Research Triangle Institute
BARBARA FORSYTH, Westat
WALTER HILL, St. Mary's College of Maryland
NAGRAJ NEERCHAL, University of Maryland
THOMAS RUTHERFORD, MPSGE
DARIUS SINGPURWALLA, LECG
RANDY SITTER, Simon Fraser University

EIA PERSONNEL:
BOB ADLER
LEJLA ALIC
MARGOT ANDERSON
COLLEEN BLESSING
TOM BROENE
GUY CARUSO
JOHN PAUL DELEY
ALOULOU FAWZI
HOWARD BRADSHER-FREDRICK
STAN FREEDMAN
CAROL FRENCH
DWIGHT FRENCH
BILL GIFFORD
BEHJAT HOJJATI
SUSAN HOLTE
ALETHEA JENNINGS
RAY KASS
ROBERT KING
NANCY KIRKENDALL
TOM LORENZ
RUEY-PYNG LU
PRESTON McDOWNEY
RENEE MILLER
MICHAEL MORRIS
ERIK RASMUSSEN
MARK RODEKOHR
BHIMA SASTRI
MARK SCHIPPER
BOB SCHNAPP
JOHN STAUB
LAWRENCE STROUD
AMY SWEENEY
PHILLIP TSENG
KEN VAGTS
ANGELA VEITCH
BILL WATSON
SHAWNA WAUGH
PAULA WEIR

ALSO PRESENT:
SUSAN BUCCI, United States Department of Commerce, Census Bureau
STACEY COLE, United States Department of Commerce, Census Bureau
JOEL DOUGLAS, Science Applications International Corporation
VICKI HAITOT, United States Department of Commerce, Census Bureau
SUSAN LISS, Federal Highway Administration
RICHARD HOUGH, United
States Department of Commerce, Census Bureau
NANCY McGUCKIN, United States Department of Transportation
ASHLEY ROBINSON, United States Department of Commerce, Census Bureau
WILLIAM WEINIG, ASA
KATHLEEN WERT, ASA

* * * * *

C O N T E N T S

AGENDA SESSION: PAGE
Data Errors, Structural Change, and Time Series Shocks in the Electricity Market: 331
Frames Comparisons of the EIA-3 and EIA-860 with the Manufacturing Sector of the 2002 Economic Census and the 2002 Manufacturing Energy Consumption Survey: 402

* * * * *

P R O C E E D I N G S
(8:32 a.m.)

DR. HENGARTNER: All right, to order, enough of this Canadian thing. Good morning, everybody. Welcome back. We have another day of work here today. There is one announcement that I'd like to make before we start. I'd like to invite Tom Rutherford to replace Mr. Cleveland's discussion, and he will be summarizing the break-out session on the relationship between various price series: are futures contracts good predictors of future spot prices? So that's the one change to the program that I have to announce.

Next thing, I'd like those who were not here yesterday to please introduce themselves. If you can use the microphone over there that would be good.

MS. ROBINSON: Hi, my name is Ashley Robinson and I am here from the Census Bureau.

MS. BUCCI: Susan Bucci from the Census Bureau.

MS. HAITOT: Vicki Haitot from the Census Bureau.

MR. HOUGH: Rick Hough from the Census Bureau.

MR. COLE: Last and least, Stacey Cole from Census Bureau.

MS. ALIC: Lejla Alic from the Office of Oil and Gas in EIA.

DR. HENGARTNER: Welcome.
Now to Nancy to introduce the next speaker.

MS. KIRKENDALL: I would like to introduce Joel. He's going to be making the presentation for Lindolfo Pedraza. Just to put this in context, in the past you have heard us talk about looking at outlier detection and various estimation methods for some of the electric power data, the monthly EIA 826, and there is an annual 861.

Earlier this summer the same data were being used in a historical time series framework by people in support of STEO, the Short-Term Energy Outlook, down in the Office of Energy Markets and End Use. And he observed that there are still some irregularities in the time series. There are some errors that have snuck into the data, especially at the state level, and so he talked to Lindolfo and Lindolfo got excited about trying these ideas by looking at the time series context. And so that's how this stuff started. He came back and talked to me —————————— do some work on it.

So Lindolfo wanted to talk to you about his work and then two weeks ago he said oh, I'm going to Sweden, and so Joel is the one who gets stuck making the presentation today. He was gracious enough to agree to do it. So if we get into technical details and we don't know all the answers it's Lindolfo's fault. He's not here.

MR. DOUGLAS: Good morning. My name is Joel Douglas, as Nancy said, and I'm here to present the paper titled "Data Errors, Structural Change, and Time Series Shocks in the Electricity Market." The contributors to the paper are Nancy Kirkendall of the DoE, more specifically the Statistical Methods Group, Joe Sedransk of that specific group also, and Lindolfo Pedraza of Science Applications International Corporation. You went through Fred's story already so that's fine.
The two purposes of this paper, one is they wanted to improve the methods of imputation for not only data nonrespondents within a survey but also data that has not been sampled within that survey, which needs to be estimated for on a monthly basis as opposed to a yearly basis. Again, a couple of months ago the data was being sifted through so that aggregates and databases of electricity generation and sales data could be compiled for forecasting techniques over at STEO. The two forms that were being used at the time were EIA 826 and the Form 861, which I will go into in a little bit, but essentially 826 is a sample from the universe of 861.

What the colleague hoped to find specifically in a Form 826 is the published versus the reported data. If you look at the blue line it is consistently below the red line and its peaks are the peaks of the red line and its troughs are also the troughs of the red line. The movement in sync is specifically what he was looking for and the spread between the two is made up for by the imputation system. So essentially the reported data is the sample and the published data is the sample plus the imputed values.

Here is another example. This one is Mississippi residential sales. The spread is a little bit different in this one, it's a little bit greater, but if you notice it's still consistent with the troughs and the peaks and everything moving in sync more or less from about 1993 until 2000 so for a good decade.

The data, though, that he found was not always what he was expecting, as you can see in Colorado commercial sales. Around June 1997 there was a significant dropoff in the reported 826 data.
This is to be expected in some cases when there's a very large nonrespondent, although this was a significantly large nonrespondent if not more than one in that single month.

But as you can see in the red published line, the imputation system imputed for the missing respondents as well as for all of the firms not sampled within the survey, and that is the purpose of the imputation system. The imputation system deals with things like this very, very well.

And just to be clear, when I say impute I am not just speaking about imputing values for nonrespondent firms within the sample but also imputing values for the nonsampled firms that make up the universe.

If you look at New York residential sales you'll see a different kind of data error. Obviously the previous error in Colorado was a lack of response or a misreported value in the less than category, but here there was not a lack of reported value but there was a value that was reported in error. Probably someone reported in kilowatts as opposed to megawatts or what have you.

This type of problem is very vexing to the imputation system. It doesn't deal with these problems very well and, as you can see, the data error reported in blue was not fully corrected for, as seen by the published data in red. It was dampened but it still existed throughout the published data.

DR. HENGARTNER: Joel, if there was an error like kilowatts instead of megawatts why does the blue line start to go up slowly?

MR. DOUGLAS: I am not sure.

MS. KIRKENDALL: That looks like there is a whole year where that goes on.

DR. HENGARTNER: But, I mean, if someone reports badly the whole thing would go up. It wouldn't be a trend.

MS.
KIRKENDALL: The other thing that gets complicated in figuring out some of these things is that after the data are published, you collect the data, you benchmark to the most recent 861, and then at the end of the year they adjust the monthly numbers so they add up to the annual number that reflects that year. And so you have to really look to see if that's what happened to those other data because of that smoothing that maybe something --

DR. BURTON: The smoothing pulled it up. That's --

MR. BERNSTEIN: You don't change the reported data. We're talking about the blue line there. That blue line starts going up before the peak. So there was somebody misreporting before this.

DR. HENGARTNER: There's something going on and that's actually interesting.

DR. BURTON: One taught the other how to misreport.

DR. HENGARTNER: It's an interesting error.

MR. DOUGLAS: Well, anyway the imputation system, like I said, was not really designed to fix problems like this. The question that the paper addresses is: is it possible for the imputation system to fix problems like this? Are there certain algorithms or certain methods that could be incorporated just filling in missing values that could help or alleviate such obvious data errors?

A little background on the surveys. The EIA 861 is the annual electric power industry report. The data, of course, is collected annually. The data represents a census of the entire universe of power industry participants and it also represents the frame from which the monthly sample can be drawn. The data collected is at the firm level and state level and broken into four sectors, the commercial, the residential, the industrial, and the transportation sectors.
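The year-end adjustment Ms. Kirkendall mentions above, in which the monthly numbers are adjusted so that they add up to the annual number, can be illustrated with a minimal sketch. The transcript does not say how the adjustment is spread across months; the proportional (pro-rata) scaling below is an assumption, and the function name `benchmark_to_annual` is hypothetical.

```python
def benchmark_to_annual(monthly, annual_total):
    """Scale twelve monthly values so that they sum to the annual total.

    Assumption: the adjustment is proportional to each month's value.
    The transcript only states that the months are made to add up.
    """
    monthly_sum = sum(monthly)
    if monthly_sum == 0:
        raise ValueError("cannot scale a series that sums to zero")
    factor = annual_total / monthly_sum
    return [m * factor for m in monthly]


# Illustrative numbers only: months sum to 120, the annual census says 132.
monthly = [10.0] * 12
adjusted = benchmark_to_annual(monthly, 132.0)
print(sum(adjusted))
```

Under this assumption every month in the year moves by the same percentage, so a single misreported month would shift its neighbors as well. That is one way an adjustment of this kind could spread an error across a whole year of published values.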
Form EIA 826 is the monthly sample that's collected from the 861. It is a cut-off sample. The remaining nonsample firms are given an estimated or an imputed value on a monthly basis; therefore, every month you have a value, whether imputed or reported, for every single firm in the 861, but it's referred to for the rest of this presentation as 826 published data because it's not 861 census annual data.

The survey sampling methodology for the 826 is they use a cut-off sample due to the skewed nature of electricity sales and generation. A weighted regression is used to impute for the nonsample data. This is also due to the skewness of the sample. The survey model, like I said before, is the reported values plus the sum of all the imputed values and, again, as I said, the spread between the two lines that I showed you in Colorado and New York was the reported and then the reported plus the imputed, so just to revise that.

The first data cleaning technique was used before we even reached regression imputation. It is called scatter plot editing. It allows data analysts to compare the currently reported data to previously reported data or previously reported aggregate data depending on what you choose. Hopefully this will catch misreported data or data that stands out from its peers and more specifically in context with its previously reported values. In a perfect world this would be the best way to catch them, but obviously things slip by, especially when you have several thousand points in the same graph.
Scatter plot editing is currently in place on a few surveys within at least CNEAF but also in the EIA and it's coming up on several other surveys, so hopefully that will take care of a lot of the data problems in the future. In the meantime, if that's not enough, again, just to refresh, as I said, imputing monthly EIA 826 with EIA 861, the 861 is the frame.

We're talking about electricity sales although total revenues are also reported on both forms. The specific regression equation with statistic errors is Y equals beta X, where Y is the state level monthly revenue from the i-th firm in the commercial, industrial, residential, or transportation sectors and X is the state level annual revenue or sales from one of those sectors reported on an 861, and that's divided by 12 to give it a monthly average for the year, and the beta value is the growth rate and that should take care of a lot of the seasonality between the data.

Here is a more formal representation of the equation. Y is the EIA 826 monthly reported data and X is the 861 annual data average for each month within the most recently finalized year. Currently it's 2003 but soon to be 2004.

Pretty much the heart of the paper dealt with data versus expectations. There is a need to reconcile the survey sampling methodology and the desired property of accuracy that methodology ascribes to and time series, which has the desired property of consistency.

To repeat, the survey sampling is the reported plus the imputed, whereas a time series could be anything from an indicator variable or a reported value plus lagged, reported, or estimated values. The question, of course, is can both be achieved simultaneously.
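The imputation model described above (monthly Y regressed on X, the annual 861 value divided by 12, with published data equal to reported plus imputed values) can be sketched as follows. This is a minimal illustration, not EIA's production system. The talk says only that the regression is weighted; the 1/x weights used here are an assumption, and the function names, firm labels, and numbers are hypothetical.

```python
def fit_beta(x, y, weights=None):
    """Weighted least squares through the origin for y = beta * x."""
    if weights is None:
        # Assumption: weights of 1/x (requires x > 0). The transcript does
        # not state which weights the survey actually uses.
        weights = [1.0 / xi for xi in x]
    num = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return num / den


def published_total(frame_x, reported_y):
    """Return (published total, beta, imputed values by firm).

    frame_x:    firm -> annual 861 value divided by 12 (every frame firm)
    reported_y: firm -> monthly 826 value (responding sampled firms only)
    """
    sampled = list(reported_y)
    beta = fit_beta([frame_x[f] for f in sampled],
                    [reported_y[f] for f in sampled])
    # Every frame firm without a reported value gets beta * x.
    imputed = {f: beta * x for f, x in frame_x.items() if f not in reported_y}
    total = sum(reported_y.values()) + sum(imputed.values())
    return total, beta, imputed


# Illustrative frame: two large sampled firms, two small nonsampled firms.
frame_x = {"A": 100.0, "B": 50.0, "C": 10.0, "D": 5.0}
reported_y = {"A": 110.0, "B": 55.0}
total, beta, imputed = published_total(frame_x, reported_y)
print(round(beta, 3), round(total, 1))
```

With 1/x weights the fitted beta reduces to the sum of reported values over the sum of their frame values, so a single badly misreported large firm moves every imputed value. That is consistent with the behavior the speaker describes in the New York example.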
The process that Lindolfo, Nancy, and Joe Sedransk went through to begin analyzing these problems was that they first had to re-estimate the 826 line by using reported data and a consistent imputation system.

Now, the imputation system has not changed all that much over the past 10 or 15 years, but for analysis purposes they needed to recreate the lines and have a consistent process by which to impute for missing and nonsampled data for cross monthly comparisons as well as cross yearly comparisons. Hopefully the newly estimated line matches closely with the historical line. Otherwise the relationship needs to start over because we need a base of not only the imputation system but a time series language to compare other alternative imputation methods.

And, as you can see, the 861 and 826 re-estimated line matches up quite well with the original until you get to about 2002, when the relationship seems to break down for some reason, not quite known why, but it's actually very close and, as Nancy pointed out a few days ago, there are very few changes within the imputation system over the years, mostly stratification levels, different samples being taken, and such.

But as you can see we have a nice base right now for comparison of the time series as well. From now we can take this imputation system that created these lines and add or subtract stuff and attempt to fix data errors.

As I previously discussed, scatter plots, the first tool, were used in trying to uncover data errors within both 826 and 861. The next three methods are as follows. The second method was to incorporate automatic outlier and influential observation detection.
I will explain how we detect those in a minute, and in that method they would be treated as erroneous or bad data and would be overwritten with an imputed value just as if they were a nonrespondent. The second would be to take those outlying and influential observations and treat them as add-ons, that is, assume their value is good but do not add them into the model and do not let its large effects rain in on the other estimated or imputed values for either sample or nonrespondent.

The third method is the experimental method which Lindolfo had started right before he had left to go to Sweden or Denmark or wherever he is this week. And he used the seemingly unrelated regression model or the SUR model, which I will get into also.

Now, influential and outlying observations. An influential observation is a reported value that, if omitted from the regression estimation, would considerably influence the value of its corresponding predicted value. The way Nancy and Joe and Lindolfo decided to uncover these observations, they used DFFITS, greater than two over the square root of the count. An outlying observation is one with a considerable rift between its reported value and its predicted value, and they use a studentized residual on that one, greater than 3.5.

There are probably different methods of calculating these and as of right now both the influential and the outlying observations are treated as the same, whereas you could probably in some cases treat influential observations as add-ons and treat outlying observations as erroneous data to be imputed for.

So the second step after scatter plots would be to re-estimate the 826 line by detecting the outliers and influentials and imputing for their values.
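The detect-and-impute step just described can be sketched as follows, using the two cutoffs quoted in the talk: DFFITS greater than two over the square root of the count, and a studentized residual greater than 3.5. Several details are assumptions rather than the authors' documented method: the regression is taken through the origin with 1/x weights, the studentized residual is the externally studentized (deleted) version, and both statistics are compared in absolute value. The function names and data are hypothetical.

```python
import math


def fit_beta(x, y):
    # Weighted fit through the origin with weights 1/x (an assumption),
    # which reduces algebraically to sum(y) / sum(x).
    return sum(y) / sum(x)


def flag_observations(x, y, t_cut=3.5):
    """Flag firms by DFFITS and externally studentized residuals.

    Assumes at least three firms, x > 0, and data that do not fit exactly.
    """
    n = len(x)
    beta = fit_beta(x, y)
    # Rescale by sqrt(weight) = 1/sqrt(x) so the ordinary formulas apply.
    resid = [(yi - beta * xi) / math.sqrt(xi) for xi, yi in zip(x, y)]
    total_x = sum(x)
    lev = [xi / total_x for xi in x]  # leverage h_i in the rescaled model
    sse = sum(r * r for r in resid)
    dffits_cut = 2.0 / math.sqrt(n)   # cutoff quoted in the talk
    flags = []
    for r, h in zip(resid, lev):
        # Residual variance with this firm deleted (one fitted parameter).
        s2_del = (sse - r * r / (1.0 - h)) / (n - 2)
        t = r / math.sqrt(s2_del * (1.0 - h))
        dffits = t * math.sqrt(h / (1.0 - h))
        flags.append(abs(t) > t_cut or abs(dffits) > dffits_cut)
    return flags


def reimpute_flagged(x, y):
    """Blank out flagged values, refit on the rest, and impute beta * x."""
    flags = flag_observations(x, y)
    keep = [(xi, yi) for xi, yi, f in zip(x, y, flags) if not f]
    beta_clean = fit_beta([k[0] for k in keep], [k[1] for k in keep])
    cleaned = [beta_clean * xi if f else yi for xi, yi, f in zip(x, y, flags)]
    return cleaned, flags


# Illustrative data: the fourth firm reports 55000 where about 55 is expected.
x = [100.0, 80.0, 60.0, 50.0, 40.0, 30.0, 20.0, 10.0]
y = [111.0, 87.0, 67.0, 55000.0, 43.0, 34.0, 22.0, 11.0]
cleaned, flags = reimpute_flagged(x, y)
print(flags)
print(round(cleaned[3], 2))
```

This sketch treats influential and outlying observations identically, as the talk says the current method does. Treating flagged values as add-ons instead would keep the reported value in `cleaned` while still excluding it from the refit.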
This, of course, is the extreme position on the other side of treating them as add-ons. Detecting 861 outliers should have the effect of smoothing extreme imputed values, whereas 826 outlier and influential detection should have the effect of smoothing extreme monthly observations. As you can see, the results were mixed.

If you look in January 1994, the original trend of the time series was kept intact all the way almost into 1995 but, as you can see, it was not necessary because the shock up still occurred. It was not just a yearly shock that would go back down through this little time series line. In some cases the EIA 861 and 826 re-estimated line muted some of the shock. In some cases it exacerbated some of the shocks. Like I said before, the results are just mixed.

MR. RUTHERFORD: I don't understand the previous slide. The influential ones, I don't understand why you override the influential. Could you go back to the definition?

MR. DOUGLAS: Yes. The influential ones are not necessarily the largest but those that would have the most effect on the regression and, as I said, they were treated as erroneous data. But they could also be treated as add-ons or they could be kept and say that's good data and we want it in the model. The method that Nancy and Joe and Lindolfo chose was to treat both influential and outlying as the same, either impute for them or treat them as add-ons, of course, different methods. Like I said, it's the extreme position but that was the way —————— ————————————————————

DR. EDMONDS: Is there any difference between those two?

DR. HENGARTNER: Yes, one is an outlier in the X direction.
If you think of a regression of Y on X, if one of the observations on the X-axis is way out, that acts like leverage. It will move the regression line that's nullified on the X. That's an influential observation, or possibly. And an outlier on the Y, it's higher than it ought to be. So there's a distinction there. Am I correct?

MS. KIRKENDALL: I think so.

DR. HENGARTNER: And so if you admit that both forms have outliers then that's ——— ——————

MR. RUTHERFORD: Yes, if you just drop off the data and estimate it or if you in a sense drop off the data and then use those data to impute the values and then put it back in, that would be identical.

DR. HENGARTNER: Yes, when you impute there are going to be small differences because the imputed values are not necessarily on the regression line.

DR. EDMONDS: Oh, I thought you were using the regression to do the imputation.

DR. HENGARTNER: Oh, so it's not imputation. It's prediction?

MS. KIRKENDALL: Right.

DR. EDMONDS: So it would be the same if those are identical.

DR. HENGARTNER: Then it doesn't make a difference.

MS. KIRKENDALL: What doesn't make a difference?

DR. HENGARTNER: If you use the regression line to impute then --

DR. EDMONDS: Then dropping them off and just doing the regression or doing the regression and then using it to impute the value for the outlier and replacing it leaves you in the same spot.

DR. BURTON: But would you have the same regression line?

MS. KIRKENDALL: You get a different regression line especially with influential out because the influential actually moves the regression line with change of data.

DR.
EDMONDS: Oh, but I thought you dropped them out and then imputed them from the regression of the other --

MR. DOUGLAS: The regression was run and then that's how they are identified. Then they were taken out and blanked out. And then the regression was re-run and then the imputed value was put in.

DR. EDMONDS: So that's how you did it.

DR. NEERCHAL: Take it out but I think maybe what you think is we go one step further and re-estimate your ——————————————

DR. EDMONDS: Right, then it would be identical.

DR. NEERCHAL: But they are not doing that because their objective is the imputation only.

DR. EDMONDS: Right.

DR. NEERCHAL: Also in the previous graph I think the structure in that one where the jump is, the blue jumps first and then after a few years red jumps the same amount. And is there an artifact in these kinds of things?

MR. DOUGLAS: Well, each imputation year was run on its own, so when you hit right here and the data jumps up, the imputation then cleans it and keeps on with the original time series line, but then when it's not —————————— this any more ———————————— here the data jumps up again. So it's probably not necessary to clean the data at that point. There were data errors that were not permanently cleaned by this method of automatic outlier influential detection.

MS. KIRKENDALL: This one could be that they had found some new companies and added it too.

MR. DOUGLAS: Exactly, and the imputations fix that but then it got to the point where it couldn't fix it any more, where it wasn't necessary to fix it any more, because it turned out to be probably good data.

MR.
BERNSTEIN: Nancy, just on a side comment for our discussion yesterday, this is an example of something that should be flagged or should have been flagged but still if that data --

MS. KIRKENDALL: Well, but this is just ongoing. I mean, we just discovered these funny data points this year.

MR. BERNSTEIN: Right, well, but now you know this one. Whoever that data is should be flagged and said if you're going to use pre-1994 data in this time series be aware that it does this strange thing. And once you know it somebody's got to write a little sticky and make a memo out of it.

DR. EDMONDS: Actually it looks like you've already imputed backwards.

MS. KIRKENDALL: We're going to have to figure how to do that.

MR. HILL: If I remember the state it said Colorado was the one where there was a very large low outlier. It was one of your earlier. I remember it was one of your ————— ——————

MR. DOUGLAS: Yes, Colorado was the one but it wasn't the industrial. It was the commercial sector.

MR. HILL: Is it different?

MR. DOUGLAS: Yes.

DR. SITTER: Could I ask a question? You said this was calculated by year?

MR. DOUGLAS: Yes, it's calculated on a monthly basis but if you have 1994 it's using 1993 as its regressor year and when we get to 1995 it uses 1994. So when you had the shock up in 1994 and then with the newly estimated it's taking out the influential and outlier observations in both 861 and 826, so it's reading a different set of data points than the other one.

And then in a similar example we have the Arizona residential sales. If you look at the red line it's well within the yellow and the blue lines.
The peaks and the troughs and the extremes are not nearly as much and that was when both again were re-estimated, and for comparison purposes the blue line is just 861 re-estimated with outliers and influentials detected and imputed for, not 826.

So you can see the difference between the two lines but it almost mirrors the original data. So some of the extreme data points on a monthly basis were not corrected for or in this case if it even needs to be corrected for. It is more of a comparison. You see in almost every single case the red line maybe prints half of the way up the peaks and the troughs of the yellow and the blue line.

So again the results are mixed. Maybe the red line is better. Maybe it's taking out points of good data that should not be overwritten and the blue and the yellow lines are more correct.

DR. NEERCHAL: I cannot see the blue at all except in two or three places --

MR. DOUGLAS: It mirrors the original data very, very well and that's just with the regressor data ——————————————

DR. HENGARTNER: What is under the yellow one?

DR. NEERCHAL: Right under the yellow one?

DR. HENGARTNER: Yes, it's sneaking under the yellow.

DR. NEERCHAL: Then it should be green.

MR. DOUGLAS: The third method, like I said, was to treat 826 influential and outlier observations as add-ons, assuming that their reported data was good and the value was kept intact but it was not kept in the regression model so the value did not affect the imputation or estimation of other nonsampled or nonrespondent firms. And again the results were mixed, as you can see.

If you notice, the re-estimated red line before 1995 actually made data worse for some reason.
From 1995 until about 2001 it mirrored it pretty well, didn't change it all that much. In 2002 it made it worse and then in 2003 it actually made it a little bit more consistent.

So of these three methods discussed so far none of them really deals with any of the extreme data points that we've seen in some of the historical data or would have if they had been incorporated at the time the data was collected. So the two conclusions of these three methods are that automatic outlier and influential detection is not a global solution to deal with bad or erroneous data points coming through in the reported data and existing in the published data.

And a more positive conclusion is that possibly the detection of outliers and influentials could help analysts use scatter plots and run a regression and color code the most statistically significant points within that scatter plot, help them direct their efforts to those plants which need the most attention.

The final method that they tried, again with seemingly unrelated regressions, this is still very experimental. In fact I just got an e-mail from Lindolfo this morning with a bunch more graphs. I think you got it too saying look at these, aren't they really cool, and they go along with some of the graphs I'm about ready to show you.

SUR allows for information feedback across regressor groups. It uses more information than other regression types, different co-variance relationships and it also utilizes different strata. I'm sure Nancy could go into that a little bit more if there are questions on that.

This example was the one we used back in the outlier influential detection.
As you can see with the SUR method, the blue line, that shock we were just talking about, the re-estimated one delayed it for a year but then it still shocked up. With SUR it smoothed it out and was not a single shock but it's more of a consistent time series. But if you notice in the middle about 1998 the SUR creates more extreme peaks and troughs than the outlier influential detection method used previously.

And then finally New York residential sales was the second example that I gave you. If you notice the point in December of 1999 of reported data that went through to the published data, the SUR method actually smoothed that out, although the model completely broke down in 2002 for some unknown reason and created more extreme peaks and troughs in the early '90s, but the original problem of that one outlying point was solved although it did cause other problems, overall positive but mixed.

MS. KIRKENDALL: More work needed.

MR. DOUGLAS: Yes, more work needed. So anyway, in closing, improving upon the imputation methods by both acknowledging survey sampling data quality as well as the need for consistent time series processes is pretty important for EIA and various surveys. Its ramifications could be widespread not only as better time series and data quality but also by the analysts for cleaning the data and things like that. So anyway thank you for your patience and your time.

DR. HENGARTNER: Thank you very much, Joel. I'd like to invite Mark Burton to start the discussions.

DR. BURTON: Well, if it's an invitation does that mean I'm allowed to decline? As usual I feel like I'm equipped with a knife going to a gunfight.
This is interesting to me because a couple of years ago I had somebody ask me to produce some passenger vehicle mileage forecasts and I thought I have no idea how to do this. So I talked to a friend of mine and she said well, this is very easy. She said passenger vehicle miles are highly correlated with population and land area, land area is not going to change, and our friends at Census provide very good population forecasts, so do the cross-sectional regression and once you have those regression results use the population forecasts to drive the forecasts for the vehicle miles.

It worked very well and this is in a sense a probably more elegant example of the same type of methodology where you're using cross-sectional estimations to help not drive forecasts as much as tune them and so at the broadest level I find it to be very satisfying mostly because it reflects what I might have tried to do if I'd been given the job.

Beyond that I have mostly just a lot of questions and I don't think that I can say anything that will add to the methodology but I do have some questions that might help someone else do that. It seems like in the examples that are used that the sample values before the imputation are almost consistently under the values that result after the imputation so I'm wondering if there's something within the sampling process that's leading to that consistent under-reporting.

MS. KIRKENDALL: Well, I think that's just because it's from a sample, it's not from the total, and then the published line is an estimate for the total. So the sum of all the data reported on the sample will always be less than the estimated total.

DR.
BURTON: So the difference between the two is the actual difference between the annual total and the firms in the sample that weren't a part of the sample? Is that what you are saying?

MS. KIRKENDALL: Yes, the difference between those two is the estimation that we made for those companies that weren't sampled.

DR. BURTON: The second question I have is from the discussion it sounds like the regression that forms a basis for the imputation is simply regressing against the annual values. There's other information available for the firm on an annual basis. Has there been any thought of modifying that regression to include more information that might tighten the fit some? I have no idea whether that would be productive but --

MS. KIRKENDALL: Jim Knobb in CNEAF has done a lot of work with these regression equations and I believe he has tried multiple regressors. Particularly he has tried capacity.

DR. BURTON: That was done without any real improvement of it?

MR. DOUGLAS: ———————————————— wind energy and things like that.

DR. BURTON: Well, the idea is to keep the regressions as simple as possible unless you have got something that adds appreciably to the fit. So, I mean, if that makes me feel better at least somebody else already had the same idea that I had. And along those lines, Nick, I want you to know that I have a note here that says why the run-up in data errors for New York residential so I'd actually written it down before you said it.

DR. HENGARTEN: Good job. I'll take your word for it.

DR. BURTON: Another question that I had is does the set of sampled firms change over time and if so in what way?

MS. KIRKENDALL: It has changed over time.
I mean, this is a survey that started certainly in the early '90s and we had in the frame everybody that we knew about. And since then, of course, people have tracked the industry and added firms to the frame. They've also observed errors in the estimation and when they thought that the errors were too big they'd add a few firms. There have been changes in the industry in the early 2000s with deregulation and so they changed certain ———————————— firms.

DR. BURTON: Firm mergers?

MS. KIRKENDALL: Well, deregulation, they're no longer regulated now. They're public firms and we've changed how we treat them. So there have been changes over time and this isn't always true either but usually they make the changes with January data but they'd add firms to a sample.

DR. BURTON: Is there any correlation between those changes and some of the data points that are at issue?

MS. KIRKENDALL: Maybe. I think so far what's happened is they've run all the methodology but we haven't really gone in to look into the detail to see what caused funny things and that's probably a step that we need to do at some point.

DR. BURTON: The last question I have, and I apologize. Again I have very little to offer in terms of guidance and just a lot of questions but the last question I have is one of the purported goals as I read this was to try and distinguish between data issues and actual economic shocks. And I wasn't convinced that I was seeing how that happened and it's not the first time I've missed something but if you can help me see how this does that because that's a tremendously important thing to do.

If it's an actual data shock then you've got to leave it and let it have whatever influence it has.
If it's an economic shock it should be there. If it's a data issue then you want to try and remove the influence. I couldn't quite see how that distinction was being made so if either one of you --

MS. KIRKENDALL: I'm not sure we are making that distinction yet. I think what's been tried so far is this automatic detection. You call them all outliers, either take them all out, don't use them at all and impute for them, or take them out of the regression but add them back in. I don't think the solution is that if you do it all completely automatic. I think Joel said that. So you probably do need to rely on somebody who's going to talk to the company and find out whether it was a real data point. You'll just never get out of having some of that go on although in other venues we've had problems, as you heard, that respondents are not as knowledgeable as they used to be in this area. I got it off of this form in my spreadsheet. Of course, it's right.

DR. BURTON: You guys are awfully knowledgeable about what's going on. Sometimes you probably can look at circumstances in a particular state and at a point of time and say oh, yeah, that's when something happened to the grid or that's when this terminal went down. There probably would be instances where even without verifying it through the firms you guys know enough to find it.

DR. NEERCHAL: But you seem to put this drill-down feature to the scatter plot. Does that give these remarks or something from the database for the user to see? Do you have a feature like that in the scatter plot?

DR. EDMONDS: It gives a list of all the firms that made that monthly aggregate for that sector and the state whether they're imputed or whether they're observed and their values.
One thing to notice if you remember back to —————————— New York residential generation the large spike up that occurred in December of 1999. That seemed like an odd month for one spike up to occur.

DR. BURTON: For it to be nontransitory, I mean, spikes are one thing. There you had a nontransitory shift where it just went up and from then on forward it stayed there and from an analytical standpoint that's even more significant and something to deal with. The bottom line is this is very cool and the fact that you guys are engaged in an effort like this I think reflects the sophistication or elegance pervasive among the work that you all do so I like it.

DR. HENGARTEN: Thank you very much. Nagaraj, anything to add?

DR. NEERCHAL: I think first of all I should say that Joel did a fantastic job of presentation on this one. I think it was very, very helpful to me really because this new format about two discussants puts a little bit of added pressure on the lead discussant because they have to read it ahead of time and come prepared.

DR. BURTON: No, you don't.

DR. NEERCHAL: So it ended and I think that a lot of the details brought up today I think will really help if they are part of the documentation. For example, the fact that the equation of known plus unknown in a classical sampling paradigm, seen plus unseen kind of concept, I mean, it was very useful for me to really see it even though I was trying to get that from him. So you are doing a really good job.

And I think that I come to this from a different angle because I'm a statistician. I'm a, you can say, closet econometrician because my advisor was an econometrician and so I understand some of the language but in general people may not understand this.
So the first thing I would say that it seems to me that the objective is to impute data. Now, impute obvious errors I think like megawatt being mistaken for kilowatt and things like that, in that case if it's really identifying an error but once you identify it you know exactly what the answer should be.

In other cases where you are really using the prediction ability of your model in a way because you don't know. Something has gone wrong. You have not identified the reason why it is. That is a prediction problem in a way and you can call it impute because it is in the middle of the data.

And I wondered what exactly is fixing. Is it just enough to catch them or you want a method that is robust with respect to these things? Those are questions that I couldn't quite get an answer to in the documentation. So I would like to see some discussion there saying what does it mean to say you have a method that's working. I think it would be nice if there is a description of that somewhere. And I think I would do that.

And some of my other comments are specific. I believe seemingly unrelated regression, the name was coined by Zellner because saying there are two regression equations they should not really be thought of any way related. Maybe it's the economy of Japan, it's the economy of another country, and he just said if you put them together, do the regression simultaneously, you can get some small sample efficiency gain.

I think it is small sample efficiency gain because you really are using the same data as you did earlier. You just use the correlations to do a second stage estimation so as regarding that it will tell a different story. So I think that in this case on the other hand they are not unrelated at all.
It has different states. There is some borrowing and lending going on between states.

So it's an obvious thing to do a simultaneous estimation or simultaneous system. It seems to me ———————————————— happens to fall under the seemingly unrelated methodology but they're not seemingly unrelated at all.

MS. KIRKENDALL: They are seemingly related.

DR. NEERCHAL: They're seemingly related.

MS. KIRKENDALL: Not obviously related, not seemingly unrelated.

DR. NEERCHAL: Or unseemingly related so it seems to me that it's an obvious thing to do and I think this was actually partly brought out last time I think when Joel presented something related to these data that it was brought out. How come you're not borrowing strength? I think the committee had brought this up, I remember. I think that seems to me an obvious thing to do but on some references in the documentation like the correspondence between regression estimators and the ratio estimators, those relationships have to be explored under the measurement error ———————— because ratio estimators are like regression without intercept in a sense but that will assume that X with no measurement error or Y with measurement error and so some details are there and those technical details are to be brought out. We cannot say in general one beats the other. I don't think it's possible to do that. And so this is obviously a very data-driven problem and I really like it, though. I think this is really useful to do it using all the states together. We should try to do it as a system.

Some specific things that maybe I should probably volunteer later. Even if you do them one state at a time you still get unbiased estimates.
Only when you do them together you get a more efficient estimate. That's the reality. They're not biased in any way. I don't think that is the reason why they are different.

I think these are also driven by the e-mail that Bill sent to us later, the e-mail exchange between Jim Knobb and there was an e-mail if you would forward it to us later on. There's some comment about there are some production issues. It cannot be implemented during production. That's hard for me to comment on that one that I don't know but I'm looking at it from a statistician's point of view.

It seems to me that you have to do them together. So if you can borrow strength from other states you have to try to do that, I think, at least see how much you gain. So I want to stay out of that controversy if possible.

And one thing occurred to me during the presentation. It seems to me that when you do seemingly unrelated the differences become less tractable because of the smoothing. That is the other side of smoothing because before you smooth I asked him why it's jumped. He could easily explain well, it looks like some data points are not cleaned and they were being left in the data segments and so on. Easy to explain why the differences are but the moment you put this seemingly unrelated stuff because it is borrowing stuff from everywhere it is very hard to tell why the difference is still there. I think that is the flip side of smoothing. It's hard to track why they are different.

MS. KIRKENDALL: And I think Jim's bias is to keep it simple because this simple regression model has been used for a long time. He's done a good job.
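
As a minimal sketch of Dr. Neerchal's point above (equation-by-equation regressions are unbiased, and estimating them jointly with the cross-equation error correlation is what buys efficiency), the following Python fragment runs a two-step feasible GLS version of seemingly unrelated regression on simulated data for two hypothetical states. The data, the coefficients, the error correlation of 0.8, and the function names ols and sur_fgls are illustrative assumptions, not the model EIA uses in production.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients for one equation."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def sur_fgls(Xs, ys):
    """Two-step feasible GLS for a system of equations with equal sample sizes.

    Returns (per-equation OLS coefficients, per-equation SUR coefficients).
    """
    m, n = len(ys), len(ys[0])
    # Step 1: fit each equation on its own and keep the residuals.
    b_ols = [ols(X, y) for X, y in zip(Xs, ys)]
    resid = np.column_stack([y - X @ b for X, y, b in zip(Xs, ys, b_ols)])
    # Step 2: estimate the cross-equation error covariance (m x m).
    sigma = resid.T @ resid / n
    # Step 3: GLS on the stacked system with a block-diagonal design matrix.
    ks = [X.shape[1] for X in Xs]
    X_big = np.zeros((m * n, sum(ks)))
    col = 0
    for i, X in enumerate(Xs):
        X_big[i * n:(i + 1) * n, col:col + ks[i]] = X
        col += ks[i]
    y_big = np.concatenate(ys)
    # Stacked error covariance is sigma (Kronecker) I_n, so invert sigma only.
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(n))
    xt_oi = X_big.T @ omega_inv
    beta = np.linalg.solve(xt_oi @ X_big, xt_oi @ y_big)
    return b_ols, np.split(beta, np.cumsum(ks)[:-1])

# Simulated example: two hypothetical states, different regressors,
# errors correlated across states (all values are made up).
rng = np.random.default_rng(0)
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
y1 = X1 @ np.array([1.0, 2.0]) + errors[:, 0]
y2 = X2 @ np.array([-1.0, 0.5]) + errors[:, 1]

b_ols, b_sur = sur_fgls([X1, X2], [y1, y2])
print("state 1 OLS:", b_ols[0], "SUR:", b_sur[0])
print("state 2 OLS:", b_ols[1], "SUR:", b_sur[1])
```

When every equation uses the same regressors the joint estimate collapses to the equation-by-equation one, so any gain comes from differing regressors combined with correlated errors. Running the sketch over many simulated draws and comparing the spread of the two sets of estimates would be one way to do the kind of efficiency simulation asked for in the discussion.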
And I still think that there can be smart things to do with the outliers and the influentials but he doesn't like the idea of the automatic outlier detection because he thinks somebody ought to look at it. He thinks a data person ought to make sure it's right.

He has really some good points and he's interested in a system that runs smoothly and is easy to use and easy to understand and does a good job. He thinks maybe fine-tuning what they're doing is a good idea and he likes using scatter plots. He likes the visual way of looking at data.

DR. NEERCHAL: I think, suppose you have enough manpower, looking at the outliers is definitely a good thing. I don't think it's possible to tell some errors apart but if it's an obvious error or apparent shift I think really many times it could be difficult to distinguish, I think.

So I think I'm partial to the idea that every time the model flags or something, look at it but I think doing things simultaneously with all the states is easily the right thing to do. I think you'll have more to learn from that than doing them separately, definitely, so I want to just say that I definitely think that the one thing I wanted to say was that the documentation definitely needs improvement because especially when it complains that the documentation or the imputation is not very good. Then your documentation should be better. I think the presentation really helped me to understand better.

I think I'm with Mark on that one. I think this is an interesting work definitely from a statistical point of view and I think that I'd like to see if not anything else some simulation to see whether there is really efficiency gained by doing this. Are we unnecessarily ———— spending more energy?

DR.
HENGARTEN: Thank you very much. Mark has been shaking his head. He can't wait.

MR. BERNSTEIN: One thing I need to say on particularly the recent work we're just getting ready to publish, there is a lot of statistically significant difference between states particularly in demand and price relationships and things like that. So I would not do a national. The states do differ significantly enough except when they're in neighborhoods. So you can look in some senses to neighboring state trends but even then it's iffy. So I would caution against trying to pull these things together because there is enough difference.

DR. NEERCHAL: But you don't have to fit the same coefficients for all the states. I mean, you do it as a system I'm asking you, you can build the flexibility of different coefficients, perhaps, but do them simultaneously, borrow the data from each other?

MR. BERNSTEIN: But it seems that that gets more complicated particularly if there are statistically significant differences between the states, I think, in both trends and then --

DR. BURTON: I think what he's saying is that you may have consumption here at one state and here in another but that those two are correlated. Even though they're very, very different the movements are correlated that I'm selling to you or you're selling to me.

MR. BERNSTEIN: But actually there are not enough. There are enough differences in both trends and relationships between states significant enough in some fuels and not other fuels. Electricity is significant, gas is not, residential is more significant than commercial, but it really changes. I mean, what we found in this latest work is it really changes from state to state and fuel to fuel.
And so you can't do the same thing for residential electricity that you're doing for commercial natural gas, for example. So that's on the one thing.

The other, as for Mark's other statement, I'm not sure what the objective was here. There seemed to be two objectives. It disturbs me that that New York residential spike could still be maintained in the data. I mean, I know this was yesterday's discussion but that continues to bug me. It shouldn't be there. For sure that wasn't real, that couldn't happen, it's impossible. Well, it's not impossible, I suppose, but highly unlikely and it should not persist in the database.

On the other hand Iowa, which looked weird, the one you showed, actually there are good explanations why that would occur. And so it is hard sometimes to distinguish but there are clearly times when you've got these spikes, when you know it's wrong, and it just has to get out of there.

MS. KIRKENDALL: That brings up yesterday's discussion and actually Bob Schnapp is here. Bob, would you like to come down to the table?

DR. HENGARTEN: Hi, Bob. Come on down, all the way down. Please join us.

MS. KIRKENDALL: Just sit at the table.

MR. SCHNAPP: Now I'm really in trouble.

MS. KIRKENDALL: I'll give you some background because you weren't here yesterday. Yesterday we weren't talking about electric power data. We were talking about other data anomalies in the EIA. And the committee said that they wanted to see flags when we saw something that was odd. Whether we knew what the answer was or not they'd like to have users be alerted to when there are funny things in the data. And, of course, you guys have data back to the 1990s.
Rather than have to keep correcting them I'd like to say that the 1990 data are fair ———————— and we don't need to mess with them any more but there are some funny things in the old data.

So the research questions about putting flags on, how do you put flags on? How do you notify your users that there are some issues? Sometimes it will be that you know what the answer is and you could fix it if only you had the time and resources. And sometimes you might not know what the reason is. I'm not asking you a question. This is putting it in context because it's something that the committee has brought up.

MR. BERNSTEIN: But we did not say yes, that you had to fix every problem.

MS. KIRKENDALL: No.

MR. BERNSTEIN: You just need to make sure the user knew that you thought there was a problem.

MR. RUTHERFORD: That you're working on it.

MR. BERNSTEIN: That you're working on it or you could say we're not working on it, it's too far in the past, but the fact is you just got to let the user know. That's what our focus is.

MS. KIRKENDALL: So how do you do that, though? I mean, envision a spreadsheet with a bunch of numbers on it which is one of the ways we disseminate data. Do you have some other thing, a text message, some --

MR. BERNSTEIN: If there is a particular data point that you know is questionable you can put a comment there on that particular data point in the spreadsheet or somewhere else in the spreadsheet. I mean, there are different ways to do it. I'm generally not the user. It's one of my assistants who's the user. So I could sit down and ask them what's the best way for them to get it but clearly if you know there is a problem then somehow let us know.

DR.
BURTON: You may even run into an instance where a user seeing that flag says oh, I know what's going on there. In a sense it's a way to enlist --

MS. KIRKENDALL: And could help you find out what the problem was.

MR. RUTHERFORD: But just to be the devil's advocate, how is it that the people who are collecting the data are supposed to know this is a problem or this is just an anomaly, right? I mean, a lot of times the perception of what actually constitutes an outlier or what constitutes a problem is something very specific to the context in which the data is used and we can't expect the DOE analyst to be able to anticipate every particular way in which you could slice or look at the data samples.

MR. BERNSTEIN: No, and sure, there are going to be cases that are not as obvious as this. But if you get a case that's as obvious as this --

MR. RUTHERFORD: Then why do they have to flag it because you as the user can see it as well as they can?

MR. BERNSTEIN: But only if you're actually plotting it. If you're taking the data and assuming it's okay and you're doing regressions on it and you're not actually plotting it -- when we pull this big data set down we don't look to see that every number is okay. We assume it's okay and therefore start doing regressions on it. And the only way we uncovered these is because we actually started disaggregating the data to look at things in a more disaggregated way and noticed that six out of the 48 states we were looking at had some weird things in it. We wouldn't have seen it if we were doing the national --

MR. RUTHERFORD: But the thing is that there are so many different ways to slice it and then once you visualize it the right way then you can see the problem.

MR.
BERNSTEIN: There are obvious ones and there are not obvious ones and they're not going to be able to flag everything but the obvious ones when they see them and they know. And also, I mean, statistically you can figure this out as you're pulling data in.

DR. HENGARTEN: Jae?

DR. EDMONDS: I thought that was an excellent presentation. I go back to what is it we're trying to do and ask myself how does this serve the basic product and it seems to me that what you're trying to do is improve the quality of the data that you report. And so it sounds like you're doing the right thing, which is using these approaches to flag stuff that you want to go back and ask questions about, go back and say are we actually getting reported what we intended to get reported.

There's an implication that you -- there's another thing you might be doing and I think are doing all the time which is inferring the values for nonrespondents which you've got to do if you're trying to make an estimate of the universe. There is a third thing you might be doing which flows over into the STEO. Are you drifting toward replacing primary data with your model output? That troubles me a bit.

DR. BURTON: It's much cleaner, though.

DR. EDMONDS: The data sets will all look great and every once in a while you'll find that you'll get a data set and you'll actually put together a little model you'll stumble across I got a perfect fit. This is really good. But it may be too good to be true and then you find, of course, those weren't really primary data. So I think, given that your principal product is the data and that the models are in service of getting the best primary data, I just caution, be careful about drifting over into that other piece of business.
It's important to infer for nonrespondents but don't let just a model override and overwrite the primary data because it's really the primary data which is your product and in hindsight it may well turn out that this thing actually did happen when you go back and investigate it and there is real important information that you don't want to lose.

DR. BURTON: A couple of years ago I was assigned to try and estimate flood damages based on flood characteristics. And so we took all these damage estimates from flood events and started regressing them against the flows and other flood characteristics and the model fit was absolutely pristine and I looked and said this is great. Look how good this is. Turns out once we started showing this to the Corps of Engineers they said well, we didn't really actually have specific estimates. We just used the flood characteristics. No wonder my model fit so good.

DR. HENGARTEN: Walter?

MR. HILL: I found this presentation fascinating —————————————— quick question maybe given the time —————————————— In the Oklahoma data, for example, there is a shock and I'm wondering if after that you observed that there are upward estimates or perhaps the shock is propagating throughout that latter part ———————————— those latter observations because your regression model has —————————————— you may have looked at that but it might be something ——————————

MR. DOUGLAS: ———————— the base imputation for ———————————— should have matched up perfectly.

MR. HILL: It was not quite long enough for me ———————————————————— quickly. The other ———————————— so it does turn out that the problems with the estimate might occur because of the shock propagating a year later.

MS.
KIRKENDALL: I think in a lot of these cases especially where the re-estimation didn't do what we thought it might do we need to take a look and see if we can figure out why.

MR. HILL: And —————————————— my other point if that shock really should be there if I was the data user I'd want to know that there was a low outlier in that particular observation rather than have that get cleaned up and —————————————— explanation for it ———————————— something happened to the weather that —————— that observation low if that was the case or for the New York data it's the month before the millennium —————— it looks like the difference is more than celebrations but there might be some easy explanation.

DR. HENGARTEN: Do you want to reply?

MR. SCHNAPP: I've been taking down some notes as I had been hearing this. Let me give you a little bit of background. I'm not sure if it explains any of these data anomalies but let's see. In 2002 the form was changed so that we could collect information from energy service providers. I'm not just talking about Oklahoma here but I'm just talking about in general.

Before that it was easy. We would send the form to electric utilities. They would tell us who all their customers were, what their revenues, what their sales were. That was easy but when the states deregulated then we had to start collecting information from energy service providers. Well, they only give you information about what their customers and their revenues are. So now you have to pick up the other piece of the revenue from the distributors.

So theoretically inside of a state those numbers should match.
The number of customers should match and the sales should match as well, the sales versus the amount of electricity that you're transporting there for them, and what we found is that they don't.

And so the 2001 data actually did reconcile that data. We worked with Nancy's group to reconcile that. That wasn't done in 2002 or 2003. It was done this year for 2004. And this year Tom Leckey has actually gone back and revised it for 2002 and 2003 so that they are consistent and so we'll be putting out a revised historical Excel spreadsheet hopefully inside of next month or so here.

Another large change that happened was in 2003. There were residential, commercial, industrial sectors and then there was another sector called other. And most of the stuff that went into other was things like public street lighting and irrigation and transportation. And we did away with that category and we created the transportation category so that our information could be consistent with the rest of EIA's data.

And we asked them to take whatever they're reporting that should have been in the commercial or the industrial sector and move it there moving forward. And so we did that with the 2003 data for the first time and that was fairly good data. I would say the 2004 data has gotten better but what you see is then it could explain some of the shocks that you see there, either things moving in or moving out. You need to add together the commercial, the industrial, and transportation in order to see what's happening there.
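
The advice above, to add the commercial, industrial, and transportation sectors together before reading a movement as a shock, can be sketched in a few lines of Python. The sales figures are invented for illustration, the function name combined_nonresidential is hypothetical, and including the retired "other" sector in the pre-2003 sum is an assumption made here so that the two periods are comparable.

```python
def combined_nonresidential(sales):
    """Sum the sectors touched by the 2003 reclassification.

    Missing sectors count as zero, so the same function works for the
    pre-2003 layout (with "other") and the later layout (with
    "transportation").
    """
    sectors = ("commercial", "industrial", "transportation", "other")
    return sum(sales.get(name, 0.0) for name in sectors)

# Invented monthly sales for one state, before and after the change.
before = {"residential": 100.0, "commercial": 80.0,
          "industrial": 90.0, "other": 10.0}
after = {"residential": 100.0, "commercial": 86.0,
         "industrial": 92.0, "transportation": 2.0}

# Each sector moves on its own, but the combined total does not.
for name in ("commercial", "industrial"):
    print(name, "change:", after[name] - before[name])
print("combined change:",
      combined_nonresidential(after) - combined_nonresidential(before))
```

In this made-up case the commercial and industrial series each step up while the combined total is flat, which is the pattern a pure reclassification would produce.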
You also have about two years ago the acting administrator asked us to look at the sales data to see what was going on because there seemed to be something funny between the commercial and the industrial sectors. Having made lots and lots of phone calls, what we found out was that there was a number of different things going on. You have companies that merge and so the prior companies categorize certain customers, for example, as industrial because they give them a real low rate but they're really commercial customers which they categorize as industrial.

But when they're taken over by another company they say well, they're not industrial, they're commercials, they report to us as commercials. So now you see a movement of the customers and the sales and revenue going in that direction. And it might look odd but that's what happens. You also have --

MR. BERNSTEIN: Could I stop you for one second?

MR. SCHNAPP: Sure.

MR. BERNSTEIN: Do you explain that anywhere in the data?

MR. SCHNAPP: Well, I mean, the movements are fairly subtle. I mean, they're there but to explain every movement of a couple of percent is difficult.

MR. BERNSTEIN: It's not really a shock. A shock means something that's going to significantly change the analysis you're doing. So, I mean, there are subtle changes. That's fine.

MR. SCHNAPP: Right, and that's all they were talking about because like a percent or two here was throwing off the STEO model. And so they wanted to know what was going on there. We also have a case where a company starts to build a mall, for example, and all the electricity being sold to that company is categorized as industrial because an industrial facility is building this mall.
The question is after they're finished does that become a commercial sale or they leave as industrials and they do both.

So we can't control them as to who they're classifying as what. We give them our definitions and we can't sit there and hold their hand with every single customer that they have to figure out where they go.

The last thing that I wanted to mention was I'm pretty sure I'd come here and talked to you a couple of times about our Internet data collection system. And what that does is we have built-in edits so that when the respondents key in their data, when they're about to submit it, it automatically compares their data to the previous submission that they had. And if it's outside of a certain bound then it says you've got an error here. You need to fix this before you submit it.

They are allowed to submit it if they override it and explain why. And then we look at that explanation and see if it's reasonable and either accept that or we call them back. So our Internet data collection system right now for all of our forms we collect about 36-37,000 forms during the year. About 88 percent of them came in on the Internet this year.

And as far as the 826 is concerned it's probably closer to 95 percent came in on the Internet. So most of that already is clean and there are about 450 respondents on the 826 out of 3,000-ish that's on the 861. Those 450 account for somewhere between 85 and 90 percent of electricity sales.

So we're getting the bulk of it and what we're talking about in this presentation is doing the other 10 percent. So if you're looking at the national number, I mean, the impact is not going to be that great.
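As a rough illustration of the built-in edit described above, the following sketch flags a value that falls outside a bound around the respondent's previous submission and lets it through only with an override explanation. The 25 percent tolerance, the function name, and the result fields are assumptions made for illustration; the transcript does not state the actual bounds or how the collection system is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EditResult:
    accepted: bool   # True if the submission can go through
    flagged: bool    # True if the value fell outside the bound
    message: str


def check_against_previous(current: float,
                           previous: float,
                           tolerance: float = 0.25,
                           override_reason: Optional[str] = None) -> EditResult:
    """Compare a reported value with the respondent's previous submission.

    tolerance is a hypothetical relative bound (25 percent here).
    """
    if previous == 0:
        # No meaningful ratio; treat any nonzero value as needing review.
        out_of_bounds = current != 0
    else:
        out_of_bounds = abs(current - previous) / abs(previous) > tolerance

    if not out_of_bounds:
        return EditResult(True, False, "within bound")
    if override_reason and override_reason.strip():
        # Respondent overrides the edit; an analyst reviews the explanation later.
        return EditResult(True, True,
                          "override submitted for review: " + override_reason.strip())
    return EditResult(False, True,
                      "outside bound: fix the value or explain the change")
```

A flagged submission with an explanation is accepted here but stays marked, which mirrors the step where staff read the explanation and either accept it or call the respondent back.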
At the state level it's much more important and those are things that we still have to look at but we're very comfortable with our Internet data collection system. Particularly on the 826 and the 861 they really do keep perfecting them on the edit side.

And we're also, frankly, actually very excited about the scatter plots, which still could show us outliers, and the way they put it together, you see an outlier, you click on that dot, and it gives you all the information you need to know. So it's very useful to us and those are the comments that I have from having heard all of the things that I've heard here. I'm not sure if that's helped or hurt your understanding of it but that's what I am going to offer to you.

DR. HENGARTNER: Thank you. I'd like to resonate Jae's comment to understand what is it that we want to do. And the way I see it I don't see three problems. I see two problems and the first problem is the obvious one. It's the one I think everybody thinks is in terms of predicting or is the current estimate the correct one. And if you look at the models that we've used you use the past data and you're trying to figure out is this observation valid, do I make an imputation, is it an outlier.

And then there's the other question that we've talked about yesterday which is data quality. And data quality sometimes cannot be made on the fly. Sometimes 20/20 hindsight is a useful thing, especially for data quality. That's how sometimes we find those errors that, you know what, that thing just didn't look good in hindsight. And so when you're going to do the data quality you're allowed to use both past and future observations and that can change.
So, for example, if you have a change in level, if you look at this and oh, yes, there's a change in level, that doesn't seem like an outlier. It's a break, it's something that happened, but it's not an outlier. And so really thinking of what use you want to do will influence what kind of data you're using and how you're using. And for data quality you're allowed to use future observations to predict the past. At least morally I would feel satisfied if you would do that.

So I know it's a small detail, maybe, but I think it's something to keep in mind of what use you want to do with the data because it influences your model and what you're going to do with it. There are still time series but sometimes the way you're going to use the data is different. So that was my comment and I think it goes along a lot with the discussion we're having here.

DR. NEERCHAL: I think we are running over time, I think.

DR. HENGARTNER: You already had your turn.

DR. NEERCHAL: I just wanted to mention I think the NHTSA people have a data set on accidents. They have a column there on blood alcohol content and this is a notoriously difficult number to get because many people refuse to take the test and sometimes they forget because it's such an emergency that they don't have time to give this test. And so this is one of the spottiest columns in that vehicle accidents data in NHTSA.

And they have done imputation and I think that Jae's comment reminded me so they still make the original data available. So it's not imputed data only and since you already know people in NHTSA they will be able to give you more details on it. And I think in terms of how they handle the publication of it it's a good example.

DR.
HENGARTNER: Well, thank you, everybody, for this lively discussion. I'd like to invite the committee to split in two. We have break-out session.

MR. COLE: One quick question or comment, would it be possible to include a small sample of below cut-off firms in a monthly survey? These cases right now are being imputed based upon a regression model. Or as an alternative would it be possible to develop two models, one for nonresponse for the larger firms and one for nonsample cases based upon the smaller observations in the sample?

MS. KIRKENDALL: They don't have very much nonresponse on this survey but it's nice to have an automatic imputation method. I mean, in fact usually they don't have any nonresponse or if they do it's on the really small ones.

MR. COLE: Would it be possible to build a model for nonsample cases based upon the smaller units in the sample? Small firms are different from big firms and they have different behaviors and they may in fact have a different pattern. That's all.

MS. KIRKENDALL: The initial survey design for this did have a probability sample and the problem with the small firms is they don't like to report to us and when they do report to us they report bad data.

DR. HENGARTNER: Any more questions? So I'd like to invite Mark Bernstein, Mark Burton, Jae Edmonds, and Tom Rutherford to go downstairs to Room 5E-069 and the other committee members will stay up here for the break-out session. We should reconvene at 10:45.

(Recess)

MR. HOUGH: Good morning, everyone. It's tough to follow that lively discussion but we'll give it a shot here. My name is Richard Hough. I'm familiar with most of you folks. I've been coming here to talk to you about some Census Bureau project.
Now I think this is the fourth straight meeting that I've had the pleasure of coming up here and talking to you. I'm here today to talk a little bit about the final two evaluations from the frames evaluation project that we undertook with the CNEAF surveys and CNEAF stands for coal, nuclear, electric, and alternative fuels.

I'm here with Vicki Haitot, who is the senior analyst on my staff and did the majority of the groundwork for this project. Also here today from Census is Susan Bucci, who is the branch chief of the Construction and Minerals Branch, where the work was conducted. Also Stacey Cole is the branch chief from the Research and Methodology Branch within the Manufacturing and Construction Division.

Also sitting in the back there is another member of my staff, Ashley Robinson, who worked with Vicki on the project, and Bill Bostic, who is the division chief of the Manufacturing and Construction Division and he'll be available to answer any questions that I cannot when we get to the end of the presentation.

I think most of you were here in the spring when I presented the results from the first three surveys that we did a frame evaluation on. This is actually the conclusion of the project today. A little overview of what we're going to talk about today, first of all I'm going to talk to you a little bit about background and some of the purposes of the frame evaluations, give a little bit of summary on the final two surveys that we evaluated, talk to you a little bit about the generalized two-step matching process that we went through to match establishments on both survey frames.

Then Vicki is going to come up and present some of the results.
She's going to talk to you a little bit about some of the implications of the frame structure and how we dealt with that. Then she's going to give you the results of both by coverage, by establishment counts. She's going to talk to you a little bit about coverage by volume and then we'll give a summary of what we found on these two surveys.

Then I'll come back up and talk to you a little bit about some general conclusions from the project as a whole, talk a little bit about the next steps for EIA off of these results, and then we have a question for the committee that we'd like to end with, maybe possibly having a little bit of a discussion.

The purpose of the frames evaluation, we wanted to evaluate the coverage of the EIA frames, we wanted to identify differences between the frames, and we wanted to supply EIA with any characteristics of the missing establishments to point them in the right direction to go and enhance their frames if they wanted to.

The final two evaluations that we're going to be talking about today, both evaluations were conducted using the 2002 survey frames. The first one was the EIA-3, which is the quarterly coal consumption and quality report for manufacturing plants. This survey collects data on coal consumption within the manufacturing sector of the United States.

The second evaluation that we conducted was actually a EIA frame subset of two EIA surveys. The first survey was the EIA-860. This is the annual electric generator report. This survey collects information about generators from electric power producers.
The EIA-906 is the monthly power plant report and this survey collects operational information about fuel consumption and electricity generation from regulated and unregulated electric power plants.

From these two surveys we created what we called the combined heat and power plants frame. The analysis that we're going to talk about will evaluate the coverage of these plants and most of them are in the manufacturing sector. The EIA-860 had over 5700 establishments of which 562 were nonregulated and on the 906 frame in 2002.

The nonregulated portion of the electric power industry consists of these combined heat and power plants and independent power producers. The analysis will focus on the 645 combined heat and power plants or what we'll call the CHPs that were self-classified on EIA's frame within the manufacturing sector and throughout the rest of this presentation we will refer to it as the CHP frame.

In order to do the analysis we had to match units on EIA's frames to units that we have at the Census. The first step was to match reporting units to our business register and we needed to do this to determine if there were manufacturing details available in what we call the back end dataset. Now, the business register is the universe of establishments that the Census Bureau uses. It's a large data set and it contains classification information where establishments are classified based on primary activity at the plant or at the facility that's being considered.

So only the establishments that we were able to match to manufacturing in this step of the matching process are included in the results that you'll see when Vicki comes up to talk to you.
And, as I just stated, getting ahead of myself a little bit, the second step was then to take these manufacturing establishments and match them to establishments that were on the 2002 manufacturing energy consumption survey.

The manufacturing energy consumption survey is a sample survey based on the economic census. The reason we used this survey in this time, if you remember in the spring we did the analysis with the 2002 economic census. The data was better for this type of analysis on the MECS. We have more detail for electricity consumption and production and we also have more detail on coal consumption.

For the economic census we did not have a specific piece of data for coal consumption that we could match to. Coal consumption in the economic census is considered a cost of fuels and we had no way to break out what portion of that fuel was spent on coal and what portion, for example, would be spent on natural gas and other fuels used at the establishment. So we matched to the manufacturing energy consumption survey and that's the data that will be used in the results.

Vicki's going to come up and talk to you a little bit now about what she found when she did the analysis and present some of the results, all of the results, actually.

MS. HAITOT: Good morning. The coal consumption survey frame has manufacturing establishments as the target population of the survey. We used a matching program which we created from Stacey's office to match the companies on EIA-3 by name with establishments in the business register. And then we had to look at those to make sure that they were accurately matched. The matching program had a higher percentage of one to one matches.
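The name-matching step just described can be pictured with a small sketch. It is not the Census Bureau program: the normalization rules, the prefix length of 10, and the record layout are assumptions chosen for illustration. It pairs frame names with register names on a normalized name prefix and separates names with a single candidate from ambiguous ones, all of which would still need the manual review mentioned above.

```python
import re
from collections import defaultdict


def name_key(name: str, prefix_len: int = 10) -> str:
    """Uppercase, drop punctuation and spaces, keep the first prefix_len characters."""
    cleaned = re.sub(r"[^A-Z0-9]", "", name.upper())
    return cleaned[:prefix_len]


def match_by_name(frame_names, register_names, prefix_len: int = 10):
    """Return (single, ambiguous, unmatched) for the frame names.

    single:    dict frame name -> its only candidate register name
    ambiguous: dict frame name -> list of candidate register names
    unmatched: list of frame names with no candidate
    """
    # Index register names by their normalized prefix.
    index = defaultdict(list)
    for reg in register_names:
        index[name_key(reg, prefix_len)].append(reg)

    single, ambiguous, unmatched = {}, {}, []
    for name in frame_names:
        candidates = index.get(name_key(name, prefix_len), [])
        if len(candidates) == 1:
            single[name] = candidates[0]
        elif candidates:
            ambiguous[name] = list(candidates)
        else:
            unmatched.append(name)
    return single, ambiguous, unmatched


# Hypothetical names, for illustration only.
frame = ["Acme Paper Co.", "Delta Cement", "City Plant"]
register = ["ACME PAPER COMPANY INC",
            "Delta Cement Works - North",
            "Delta Cement Works - South"]
single, ambiguous, unmatched = match_by_name(frame, register)
```

A prefix comparison is deliberately crude: it tolerates differences in suffixes such as "Co." versus "Company Inc" but cannot tell apart two plants of the same company, which is why a review of addresses would follow.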
There is a threshold of 1,000 short tons of coal consumed annually which was easily implemented in this survey. There are 28 establishments on MECS that consume less than this and so we took them out of the analysis. For the CHP frame, which was the subset of the annual electric generator frame and the monthly power plant report, we had to match establishments a different way because the program had a very small percentage of matches.

The manufacturing establishments were a very small percentage of the target population for this survey and this survey had a threshold of 1-megawatt capacity. So there was no generic relationship between capacity and generation which the MECS has on their form. So we took the two files and we wanted to see what level of generation would be equivalent to the megawatt capacity of the generators so where the matching on the MECS became less frequent with EIA's establishments we determined was the threshold for MECS which was around 2,000 megawatt hours, so we took everybody that was generating less than 2,000 megawatt hours off the MECS for the comparison.

There was a little bit of confusion matching plants of the EIA-860 because of the way that they classified their establishments. The MECS is concerned with the activities of the facility level so our frame is based on which facilities. Well, EIA is concerned with activities going on in the generator stage so they're concerned with what's happening at each generator.

Well, it turned out that some of the generators could be owned by more than one company and they might not even be manufacturing.
Like, company 1 could be a utility while company 2 could be manufacturing and so it would be hard to figure out which one actually was at the facility or owned the facility so it made matching a little more difficult and it may result in some false nonmatches.

So the coverage by establishment count for the coal consumption survey, we used a ratio of 355 matched out of 452 total establishments in the sample for the 2002 MECS. We initially found 37 establishments on EIA's frame not in manufacturing. There were 23 in mining, four in utility, four were in government, four were in some other sector, and three we were unable to match. So that accounted for 14 percent of EIA's consumption, very small, so they're missing eight establishments off the MECS survey in Kentucky, seven in Ohio, and six each in Georgia, Michigan, and South Carolina.

Thirty percent of the missing establishments off of MECS were classified in NAICS-327, which is nonmetallic mineral products. For the CHP frame we've matched 410 out of 588 establishments for a 70 percent coverage rate. It was the same ratio.

The combined heat and power plant frame consisted of 645 establishments in manufacturing so there were 27 not in manufacturing, seven in utilities, three were in government, and 17 we were unable to match which could be that double ownership situation.

We excluded the unmatched plants that generated less than 2,000 megawatt hours from the MECS frame based on our threshold comparison so EIA's frame is missing 23 establishments in California, 15 in Michigan, 14 in Texas, and 32 percent of the missing establishments were classified in NAICS-322, which is paper manufacturing.
For the coverage by volume the coal consumption survey had a coverage rate of .89 which was measured as a ratio of coal consumption. This ratio for volume only uses MECS data. It tells us how much of the MECS unweighted estimate is accounted for by establishments that matched EIA's frame, the matched establishment over total in-scope establishments, so the coverage for coal consumed for fuel use was .86 and the coverage for coal consumed for nonfuel use is .98.

Now, fuel use is anything that would be used for heat, power, or electric generation. Nonfuel use would be things for like, feed stocks or raw materials inputted into the manufacturing process.

The coverage by volume for the electric generator survey had a coverage rate of .75 which was measured as the same ratio of electricity generation. So our summary is that coal consumption had a coverage by establishment count of .79, a coverage by volume of .89 in short tons. The electric generator survey had coverage by establishment count of .70 and a coverage by volume of .75 kilowatt hours. And now Rick will come up and talk to you about the overall general conclusions.

MR. HOUGH: Like I said, these are the results of the last two. The project itself, the contract was for five surveys. The next slide is going to talk a little bit about some of the results in general from those five.

From the five surveys we found a range of coverage by establishment counts from .7 to .92. We had a range of coverage by volume from .75 to 1. The average difference between the count and volume rates was about 11 percent so the coverage by volume was on average 11 percent higher than the coverage by count.
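The coverage rates quoted in this presentation are simple ratios. The sketch below recomputes the two establishment-count rates from the counts given by Ms. Haitot (355 of 452 and 410 of 588) and shows the volume-based rate as matched volume over total in-scope volume. The function names and the toy volume records are illustrative assumptions; the establishment-level MECS volumes are not given in the transcript.

```python
def coverage_by_count(matched: int, total: int) -> float:
    """Share of in-scope establishments that matched the EIA frame."""
    if total <= 0:
        raise ValueError("total must be positive")
    return matched / total


def coverage_by_volume(records) -> float:
    """records: iterable of (volume, matched_flag) for in-scope establishments.

    Returns matched volume divided by total in-scope volume (unweighted).
    """
    records = list(records)
    total = sum(volume for volume, _ in records)
    if total <= 0:
        raise ValueError("total volume must be positive")
    matched = sum(volume for volume, is_matched in records if is_matched)
    return matched / total


# Counts stated in the presentation.
coal_rate = coverage_by_count(355, 452)   # rounds to 0.79
chp_rate = coverage_by_count(410, 588)    # rounds to 0.70

# Hypothetical volumes: the large plants match, one small plant does not.
toy = [(500.0, True), (300.0, True), (150.0, True), (50.0, False)]
toy_volume_rate = coverage_by_volume(toy)  # 950 / 1000
toy_count_rate = coverage_by_count(3, 4)   # 3 of 4 establishments
```

In the toy data the unmatched establishment is small, so coverage by volume exceeds coverage by count, which is the pattern reported for the five surveys.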
Looking at these numbers and through our evaluations of missing establishments, and when we had a missing establishment we did some research on them and verified that they were actually missing in the sense that they were consuming coal or they were in reality producing electricity and should have been on EIA's frame, from that analysis one of the conclusions we made, that the missing establishments were relatively small and in some instances would have little or no effect on the published estimate if they were to go out and try and get them.

The next steps for EIA, EIA is going to use this information to determine if the survey frames have a sufficient coverage to ensure that their -- their data quality standards are met. We have already had meetings with each of the survey managers from these surveys and they will then determine the appropriate procedures to implement if they want to go after some coverage improvements.

And the one question we had is what level of coverage should be significant. So from these conclusions do you think this is sufficient?

MR. SINGPURWALLA: It seems like if I look at the general conclusions for all five frames you got some higher hit rate for some of the other surveys.

MS. FORSYTH: These are some of the lowest, aren't they?

MR. HOUGH: Correct, the first three surveys were very small compared to these last two. I think we had --

MS. HAITOT: They average between 20 and 25 establishments.

MR. HOUGH: Correct, so they were small. These last two were a lot larger and a computer program was actually used to match the last two. We did the first three by hand. So that is accurate. The lower coverages would be more reflective of a larger sample size.

MR.
SINGPURWALLA: Can you tell us how you did the computer program matching or is that --

MR. HOUGH: The computer program actually looked to match on name.

MS. HAITOT: It matched on name. First it matched like the first 10 characters of the name and then it did a little bit of programming stuff and it matched like on a number basis after the first couple digits, I think.

MR. COLE: I have to ask John.

MR. HOUGH: It went from names to address.

MS. HAITOT: Well, no, it only did name and then what Ashley and I did is we looked at those that matched to that to see if they were the same address.

MR. HOUGH: Address, so the computer program kicked out results on name and then the analysts then went in and looked at addresses and other geographic information to verify that they were actually matches.

MR. SINGPURWALLA: Yes, I was thinking like, while doing there will be some fuzzy matches but it sounds like you guys are taking that into account if it is.

MR. HOUGH: Well, the one slide, that was the key in these last two is that the target population for the coal consumption survey was manufacturing establishments so the format of their frame was much more conducive to this type of analysis. I mean, if you go back to the one diagram where we showed this, the 806, MECS is concerned with the facility. So is the coal consumption survey. So the format of their survey was much more conducive to do these types of matches and the program kicked out a lot of matches off of initial run when we did the coal consumption survey. Now, when we got to the combination from the electric power plant surveys and we pulled out the CHP units this format caused a little bit of a heartache for us because that survey is concerned at this level.
So they may have three rows with the same company name or different company names, the same address because maybe Company 2 was the facility and then Company 1 was possibly another company that might have been off in utilities. Go ahead, Vicki.

MS. HAITOT: And EIA's 860 frame, the names that they listed as their company names were sometimes just like a plant location so it would be like the city plant and it was harder to match that because it was not in that city.

MR. HOUGH: Right, their focus was on this level so where this generator was wasn't that important. The coal consumption survey was focused on manufacturing plants so the actual location of the manufacturing plant was a piece of data that they needed. But for here as long as they could get the data about this generator they're not concerned if it comes from the facility or if it comes from Company 1 or 2 as long as they can get data about what is being generated from the generator.

MR. SINGPURWALLA: Yes, so it's a different --

MR. HOUGH: Well, it was a lot more ground work. Go ahead, Shawna.

MS. WAUGH: I'd just like to add that who fills out the form also can make a difference. So if I understand correctly because I know a little bit about MECS your forms would mostly have gone within that company to the accountant?

MR. HOUGH: Well, they're instructed to send them to the engineer. Through our research we have found that yes, they do get into the hands of accountants.

MS. WAUGH: The 860 frame, actually we send it to the operator and the operator may not be the owner so it's whoever is operating the facility that receives the form.

MR. HOUGH: Right.

MR.
SINGPURWALLA: So they would put in the contact information and stuff like that and shoot it back to you?

MR. HOUGH: Right.

MR. BOSTIC: Even to add to the complexity, if you look at the example with Generator B, Company 1 and 2, you could have where they create a joint venture between the two companies so that would also complicate the matching process.

MR. HOUGH: Don't forget there were only 17 of them that fell into this but this was an issue that created a lot more work.

MS. WAUGH: Actually, Rick, there were a lot more than the 17. There were 17 that we were unable to match.

MR. HOUGH: Right.

MS. WAUGH: But this problem permeated the entire frame.

MR. HOUGH: Correct, there were 17 CHPs that we could not match to the register.

MS. HAITOT: That self-classified themselves in manufacturing because if they self-classified themselves in utility we didn't look at them.

MR. HOUGH: Right.

MS. KIRKENDALL: That is another source of difference on our forms. The companies report the NAICS code instead of what is their main business and on the MECS or on the Census Bureau forms they classify it based on other information that they report. So it's a self-reported NAICS versus an assigned NAICS based on the rulings that the Census Bureau uses and there are bound to be some differences.

DR. SITTER: But I think the issue isn't whether it made your matching hard. The issue is does .7 realistically reflect the under-coverage. I mean, what is your feeling? Did this difficulty in matching impact your bottom line?

MS. HAITOT: It might a little because of the fact that Company 1 or 2 could be in utilities and if it was in utilities then I didn't match it and I had a couple out of those 17.
They were all in the same city but I didn't know because the plant name on EIA frame might not have been an actual company's name. I didn't know which one of those they were saying was manufacturing and had a generator.

MR. HOUGH: But basically the .7 is the lowest it can be. I mean, what we're saying is this would have generated false nonmatches so it may be a little bit higher but .7, according to this analysis, is as low as it --

MS. WAUGH: I have another way to look at some data to answer that question. If you look at the paper on page 5 we've actually identified those that are by state, those that are on EIA and not on MECS, and those that are on MECS and not on EIA. And if you look at this there are some states where, for example, Arkansas there are two on the MECS that are not on the EIA and there are zero on EIA that are not on MECS. So clearly we're missing those two. But if you go down the road, some of our larger states, I can't remember which you gave --

MS. HAITOT: There's one on there. I don't remember which state. There are three missing on MECS and there are three missing on EIA. They could be the same three but I just don't know because of the name.

MS. KIRKENDALL: When we met with the electric power people to go over these results, and those are the lowest percent matches, their point was well, the purpose of our surveys is to collect total generation and the CHPs are something like 5 percent of the total generation and the rest of the generation is by utilities or IPPs.

MR. HOUGH: Right, I mean, the manufacturing portion is a very small percentage of their target population so the analysis was to evaluate their coverage of that portion.

MS.
KIRKENDALL: But actually one of the reasons we were very interested in those matches is that we thought what if we would like to use this frame for some other purpose, and we wanted to know what are its attributes before we would decide to do that.

MR. HOUGH: If you wanted to publish some data on the CHPs individually, for example.

MS. KIRKENDALL: Right, if we wanted another survey or something we wanted to target the electric generators or manufacturers this might be a handy frame to have. And so then the question is, is this kind of coverage good enough for whatever purpose we have for that. We might think it might be all right in terms of the coverage of the overall electric power sector because it's such a small piece of it although they've given us information to know where to look. We've got NAICS codes, which ought to make it easier to pick up some of them.

MR. HOUGH: Correct.

MS. KIRKENDALL: With that kind of information we might be able to do something that isn't too difficult that would improve the frame. And then states helps to narrow it down, come up with the states for ——————

MR. SINGPURWALLA: It seemed like another one of the surveys the mismatches were just coming on like three states and like, a whopper was California and Texas and something else.

MS. HAITOT: And we provided on this frame in the paper a NAICS by state of the missing establishments so they know which NAICS in each state they need to focus on.

MS. KIRKENDALL: So there's enough information that we could do something.

MR. SINGPURWALLA: Do you ever look at the coverage evaluation over time or is this the first time that you've done it?

MS.
KIRKENDALL: This is the first time we've had this kind of an exercise with the Census Bureau and they're the only folks I know that could do this kind of work for us.

MR. SINGPURWALLA: I think we were talking about this before. I don't know a whole lot about this but like, frame shifting and stuff like that, were we talking about that yesterday for some reason?

MS. WAUGH: The blenders, I think you brought up yesterday that we had some under-coverage because we didn't have the blenders in petroleum.

MS. KIRKENDALL: Well, I mean, in fact these CHPs, 15 years ago we weren't interested in CHPs because we didn't consider the electric generation. They were doing it, just we didn't think that was part of what we should look at. But then there were some laws passed that encouraged more electric generation or special provisions that made it more lucrative to do that and so it's been encouraged and that's when we started surveying it.

MR. HOUGH: And those laws are continuing to be passed at the state level. The states are giving more incentive for manufacturing establishments to co-generate.

MS. KIRKENDALL: Yes, right, and the other thing we found out when we talked to the people in the Electric Power Division was that in either 2002 or 2003 they had done a review of the frame and I don't know that we've looked at that paper or found a copy of it to see if they added companies after that. So we need to see has the frame expanded since 2002. Maybe we picked up some of them. We probably didn't pick up all of them. That would be asking too much. But at least we might get a feel for that by looking at the paper that Glen talked about.

MR.
SINGPURWALLA: I'm thinking maybe like what you just mentioned there and matching logic could be two of the things that might contribute to ————————————

MS. KIRKENDALL: Well, the other thing that we should think about particularly with the electric power form is maybe we need to try to be more careful with the names of companies because that was the big problem they had. They couldn't identify the companies.

MS. WAUGH: Yes, actually on our frame often it would just be the abbreviation for the company whereas the census was much more systematic and complete.

MR. HOUGH: Right, but then they had a multitude of information on the generator, what type of fuel it used, I mean, what its output was, and that all goes to the purpose of the survey and the coal survey, like I said, the target population, it was a lot easier because their target population was manufacturing plants.

MS. KIRKENDALL: I think the coal surveys have always carried fairly detailed name and address lists and ownership too. Come up to the table, Bill. Yes, Bill is the Director of the Coal Division so he knows.

MR. WATSON: I'm Bill Watson from EIA and my group runs this coal consumption survey. I've raised this issue before and I'll raise it in this forum. The identification number that you have in MECS, I believe, is the EIN?

MR. HOUGH: Well, the business register contains the EIN number.

MR.
WATSON: I'm looking forward to future exercises like this where we match our list of respondents against a list that you have at the Census Bureau and I'm just wondering in terms of having a better way to do the matching in the future whether it would be valuable for us to consider, if it's legally possible, adding the EIN as a piece of information that we would ask our respondents to report to us.

MR. HOUGH: It would make our job easier but I'll defer to Bill or Stacey. I mean, could they ask establishments to voluntarily report that number in your --

MR. BOSTIC: Part of the issue would be where are the legalities of looking at data from EIA and passing it on to the Census Bureau. That could be considered more or less a data sharing type exercise but I'm not sure. You would have to probably check.

MS. KIRKENDALL: Well, what we could do is to say something like it would be voluntary reporting but the only use of this data would be to provide to the Census Bureau for matching of frames or for frames evaluation purposes. So that would be okay in terms of ———————— or because we'd be telling the respondent exactly what we're going to do with it. So you could do something like that. I don't know whether OMB would approve it because it might be sensitive.

MS. WAUGH: Nancy, what about the perception from the respondent's point of view?

MS. KIRKENDALL: We'd have to find out. Any time you put anything on the form you have to go out for public comment.

MS. WAUGH: No, I'm not actually thinking about that aspect of testing the form. I'm thinking about the aspect of whether they'd consider that information confidential.

MS. KIRKENDALL: That's what you would find out in the Federal Register.

MR.
BOSTIC: I mean, you have to take into consideration there is a perception, I think, out there that government agencies share data anyhow. That's the perception and, of course, because of our legalities they don't really see that. The one thing that you might want to consider as part of the evaluation is to go outside of the manufacturing sector because, as I understand from some of the previous other evaluations, I think there were some wholesale cases.

So you can see is there a coverage problem which were also cases. In this particular evaluation we talked about utilities and we collect information on utilities in the year of the economic census. So part of what you might want to consider and take into account, do you want to expand the evaluation to be outside of the manufacturing sector to see if you have some coverage issues in some of the other sectors that are in the frame.

MS. KIRKENDALL: We ought to chat with you about where we could do something, I mean, because part of the reason why we didn't, we know about MECS and we know you have a good frame for the manufacturers. So these were the target of opportunity that we have these areas where we could do this. But it would be interesting to know where else we could do evaluations.

MR. WATSON: The second point here, again related to the coal consumption results and the coverage by volume of .89, put that in a little bit of perspective. Roughly the volume of coal in the manufacturing sector is almost 200 million tons a year now and there's something peculiar going on here. About two-thirds of it comprise coal going to things called central plants which will go away in the year 2007, we think; however, the Congress may extend that.
So when this was done I think the volume of coal wasn't quite up to the number that I mentioned here today but it's possibly around 160 million.

The point here, though, is 10 percent of that is roughly 15 to 20 million tons a year that possibly were missing and then for the entire US most of the coal, of course, goes to electric generation and we would have to factor in the missing coal there too but this is a very small amount of coal. It's 15 to 20 million tons out of about 1.1 billion.

In terms of its importance to manufacturing, though, it's not trivial. I mean, I actually don't like to see these results. I don't think people that I work for like to see them. So we are concerned about trying to fill in the gaps here even though at a certain level this looks like a small amount of coal. But I think in the scheme of things we drive ourselves crazy sometimes trying to find a million that we think we're missing so if we think we're missing 15 to 20 we will drive ourselves double crazy trying to pin this down.

The other thing I would say about this, and this gets over into the cost-benefit area, is it's extremely difficult to find these firms. We do not have a ready place we can go to try to figure out who out there might be burning coal that we don't know about. Yesterday somebody voluntarily sent us an e-mail from a firm, a new employee in the firm General Shale Brick, and they use a lot of coal to make brick, and they in our frame currently have seven sites that this person was reporting for. He said, by the way, I'm a new employee and I notice that we have four other facilities that are not on your list. How come? So, of course, we followed up right away.

MR. BOSTIC: Is he still a new employee?

MR.
WATSON: In any event it's really difficult to find these and I think there was an exercise roughly five years ago, maybe even longer than that. Actually it's about 10 years ago, sorry, where there was an attempt to look at all of the pollution discharge records from individual states because all of these little tiny coal-fired boilers are regulated under the state —————— plan regulations for pollution control. But what you get is you get a database in the State of Ohio, for example, that might have 500,000 records in it and you're culling through that and you've got most of everybody in Ohio so it's not that you're looking for obvious things. You're looking for things that are difficult to find.

So you're culling through that trying to figure out who could be burning coal out of all of the pollution discharge stuff that they report and then you have this other problem of did they accurately report the company name, the facility name, and so on, something that we see in our frame matched against what's in the State of Ohio records. And so this is extremely difficult and could be a very costly thing to do to try to fill in the gap here so I'm really searching for strategies to get at that in a way that doesn't really require enormous resources because the exercise 10 years ago, and I'm only talking from conversations that I've had with people on my team about it, evidently engaged the work of people working almost a year, four or five people full-time, to try to fill in some gaps at that point. That was the last time we did anything systematic.

MR.
HOUGH: One of the suggestions that we had talked about when we met with you folks was that you can use the NAICS-3 by state and possibly make some contacts with some trade associations and see if they will do you a favor, so to speak, and ask their members to tell them if they're consuming coal or if they'd be willing to report to you.

We use trade associations quite a bit as contacts to talk to them and ask them to give us information about the members that they have. And I agree. I mean, as a manager this does come down to a cost-benefit type scenario where you have to put in one hand, sure, .89 makes you a little bit nervous but then, like you said, the benefit to pull that up to .92, what's that going to cost you?

Well, if you have to take three or four survey analysts and have them work for a year it just might not be worth that type of expenditure. So I agree with you and that's why we provided as much information as we could under Title 13 to point you in the right direction and we'd be willing to come over and talk even more about some of the other things that may help.

MS. HAITOT: Doesn't the breakout syn fuel nodule also focus where you need to --

MR. HOUGH: Well, this breakout, I mean, those syn fuel plants, you've got them. I mean, the coal that's being consumed as material inputs, those guys are reporting to you and from what I understand they'll put their numbers out on billboards so they get their tax breaks.
But anyway it looks like where you might be missing are the establishments that are beginning to take advantage of some of the state regulations coming into effect where you may have some smaller establishments that are now starting to generate to get the tax breaks that the states are offering.

And the .86, the difference between these two at least in my opinion might tend to lead that the incentives are out there now for smaller manufacturing plants to invest the money up front to generate their own electricity and those incentives are growing. I mean, there are state incentives that are still being --

DR. FEDER: Rick, I may have missed something but we somehow assume that our universe is covered by at least the Census frame or by the EIA. Is it possible that there's quite a sizable subgroup of the universe that's not caught at all? Is there a way to estimate how it impacts the mean square error of the estimates that EIA is putting out?

Your question to the committee was what coverage is enough and I think the answer is partially dependent on how does it impact the estimates that EIA is putting out and is there any way we can know what the frame under-coverage is leading to errors in the estimates?

MS. KIRKENDALL: We think the closest thing to that is the coverage by volume because they're using volumes that we care about. So they're using electricity generation for the electricity side and coal consumption for the coal side.

DR. FEDER: But that assumes that the census data are good and perfect?

MS. KIRKENDALL: Well, it assumes the census data are perfect and it assumes that the nonmatches are really nonmatches.

DR.
FEDER: Because frames are time dependent, enterprises are generated, they change their composition, they move, so we know that there's no way whatever Census has today is accurate for today's data. So they're missing some too.

MR. HOUGH: Correct, right.

DR. FEDER: And I don't know how much.

MR. HOUGH: Well, the process that our frame goes through, the business register is updated from a multitude of different directions and different surveys. So those types of information we're hoping we're fairly well up on.

MR. BOSTIC: We get a lot of administrative records from the Internal Revenue Service, from the Bureau of Statistics. We get it whenever they send it, the most recent data, but also for the very large cases we conduct an annual company organization survey. So we attempt to put the register in the best shape that we can before we conduct an economic census every five years. And during that process we do find out about misclassification.

Even from the sources that we are receiving data sometimes they get incomplete information, which also causes us prior to the economic census to do what we call a re-file survey. We send out a questionnaire to have companies where we have partial or no classification at all to send us in their primary activity or what they're doing. We update the register prior to mailing out the economic census so that they can receive the correct form.

Nothing's perfect but we put a lot of safeguards in place. We strive for excellence because you never can get to perfection.

DR.
FEDER: So you improve your estimate in coverage but can you improve also your estimates of your under-coverage because sometimes even looking back and saying all right because I know actually from Canadian experience where the counterparts of the IRS provided us data and I know there's the filing process and you always file for previous years and so on.

So what you can do is looking back, let's say, in 2007 you can estimate how good your 2005 estimates were and make an adjustment, have an under-coverage rate for the census frame. Sometimes perfect coverage is not as good as knowing how much under-coverage there is in the same way that sometimes census, and I'm saying it to a Census Bureau person, is not as good as a good survey because a survey could be unbiased and in addition it comes with a measure of uncertainty in terms of mean square errors, standard errors, and biases. So I know what I'm saying here is quite onerous. Maybe it's not feasible. You talked about cost-benefit issue. I'm just saying that assuming that the census frame is perfect is --

MR. HOUGH: Well, no, you have to realize for the coal the frame that we were comparing to was reported establishments that told us they were consuming coal. Now, when we published MECS the only real mechanism we had to identify under-coverage for coal consumption would be to compare to historic information or look to some other piece of information that we could possibly get some evidence that the establishment actually was consuming coal.

And that's why for the nonfuel users it's easier because there are some products like coke, for example, that you just can't produce without coal.
So there are some methodologies that we use and implement to prevent the under-coverage but if we have an establishment that is actually consuming coal and historically reports to us that they're not it's very difficult to have that mechanism identifying that they're actually under-coverage in the coal consumption area.

MR. WATSON: This doesn't speak to this particular issue but in the coal area EIA has data both on the supply and demand side. So we routinely in our quarterly reports have a balance for the entire US where we look at everything that's coming in on the production side, everything that's on the supply side, everything that's consumed or on the demand side. We're also looking at stocks so we have stock change.

And if you have perfect data, accurate and perfect data, you have no discrepancy between what's supplied and what's demanded accounting for stock change. This also has to take into account, of course, exports and imports, the entire flow of energy both on the supply side and demand side, and typically what we find is we're about plus or minus something like five million tons, again out of a 1.1 billion total, and the pattern of that thing is plus and negative.

It looks pretty random. We never actually did a statistical test to determine whether it's random but it's reassuring. It doesn't necessarily say, though, that you're doing a good job in each particular area because a large negative error in one area could be offset by a large positive error somewhere else but overall it gives you a reassuring picture.
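The balance check described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sign convention, and every figure below are hypothetical and are not EIA's actual procedure or data. The sign test is one simple way to ask whether plus and minus discrepancies are about equally frequent; it is not a full test of randomness.

```python
from math import comb


def balance_discrepancy(production, imports, exports, consumption, stock_change):
    """Supply minus disposition, in the same units as the inputs.

    Assumed convention (hypothetical): stock_change > 0 means stocks grew.
    With perfect data the result would be zero.
    """
    return (production + imports) - (consumption + exports + stock_change)


def sign_test_p_value(discrepancies):
    """Two-sided exact sign test that plus and minus signs are equally likely."""
    nonzero = [d for d in discrepancies if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    positives = sum(1 for d in nonzero if d > 0)
    tail = min(positives, n - positives)
    # Probability of a split at least this lopsided under a fair coin, doubled.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Made-up quarterly figures in million tons, for illustration only.
one_quarter = balance_discrepancy(
    production=280, imports=8, exports=12, consumption=272, stock_change=1
)
mixed_signs = [3, -2, 4, -5, 1, -1, 2, -3]   # pattern of "plus and negative"
all_positive = [1, 2, 1, 3, 2, 1, 2, 1]      # what a systematic gap might look like

print(one_quarter)
print(sign_test_p_value(mixed_signs))
print(sign_test_p_value(all_positive))
```

A small p-value for a run of same-signed discrepancies would point toward a systematic gap on one side of the balance rather than offsetting noise.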
And that's also why I'm a little what, uneasy, I guess, about thinking that we might be missing 15 million tons here on this manufacturing survey because if we were to find 15 million tons here it would throw that calculation way off the zero line and it wouldn't be a plus and negative any longer. I don't know whether it's going to be plus or negative. It depends upon what signs you assign.

DR. SITTER: But couldn't it be a misclassification issue more than an under-coverage issue then? I mean, there are mis-definitions going all over the place here as to which categories these things fall into and overall you may be catching everybody. You're just not classifying them that well. When I say not well I mean still pretty good, I mean, but some of this could be misclassification in terms of sector, in terms of what kind of energy usage they're making. It's pretty complicated.

MR. WATSON: Well, in the case of coal I think the issue is not too serious. It is true with CHPs, combined heat and power, in other words a manufacturing plant that raises steam some of which it may use itself or use for thermal output and it also has a generator and it's hooked to the grid, those firms are supposed to be both in the electric sector surveys as well as on our manufacturing surveys and we try very hard not to double count. So we do a comparison there so we won't double count. And then the rest of it is primarily just electric generation with exports and imports.

DR. SITTER: What I mean is that you're carefully not double counting but some of these frame comparisons may be excluding those you've purposely not double counted the other way so it's a misclassification.

MR. WATSON: No, I understand.

MS.
WAUGH: There's also on the coal frame a little bit of over-coverage from the standpoint that the survey's intended to go out to manufacturing respondents but the establishments on their frame include government agencies, some companies in the commercial area, and I'm sure at some point in time there was some explanation as to why those companies were included before you came, Bill.

MR. WATSON: Well, even while I've been here. We should look into that because that should not be there if that's the case. We do depend upon the companies initially to tell us whether they are in the manufacturing sector and so we may not have looked very carefully at what they do relative to their information about what they think they do so we need to come back and revisit that.

MR. HOUGH: And don't forget too that the register classification is based on primary activity. That does not necessarily mean there is no manufacturing capability at the establishment. It just means that its primary activity at that establishment.

MR. WATSON: And there are some cases where it's difficult. For example, there are those munitions plants out there in Illinois that are run by private companies under government contract and there may be an issue about is that a government activity or is that a commercial activity. And in that case I know that we chose to make it a commercial NAICS Series 3 facility reporting on EIA-3. But it's right there at the line, isn't it?

MS. WAUGH: Yes, good example.

MR.
BOSTIC: I think one of the challenges you face when you think of a classification system and how companies and activities now are evolving where you have these fabless companies when we go to the semiconductor industry where the design and distribution and marketing is, say, here in the United States and it's outsourced abroad or to another company. A lot of these companies still call themselves manufacturers but they do no manufacturing whatsoever so when you talk about the small cases it may be your edits are built where you don't get certain information that raises a red flag but part of the challenge that we're having at the Census Bureau is the classification system itself.

A lot of things in the economy itself are evolving so quickly and this is to get back at the perfect frame. It's happening faster than we can keep up with which is becoming a challenge. So basically you're talking about snapshots at a point in time and there will be times where we'll miss pieces.

I mean, in conducting our surveys on an annual basis typically we always are having revisions from the next year because of new data we are capturing or they reported incorrectly given some product that they produce which we'll now move into another industry. By the time we finish issuing 473 industries from the economic census that data will change because in the process of doing the analysis we will find misclassification. And so therefore we will be moving establishments into industries as we go along.

So from the time we start with some beginning set of industries we published to the end before we get into our general statistics that represent the US that guideline has changed. So those are the challenges that we're just faced with.
It's just the nature of the beast.

MR. HOUGH: I'd just like to take a second to thank Nancy for allowing us to do this collaborative work.

MS. KIRKENDALL: Yes, it's been great. Thank you very much.

MR. HOUGH: And Shawna Waugh, who's spent many hours on the subway coming over to work with Vicki to help us with this project. So thank you and I hope that we can collaborate again on an effort like this in the future.

DR. HENGARTNER: Thank you very much. Before we proceed can I ask the committee members to come here very fast for a group photo?

(Recess)

DR. HENGARTNER: Welcome back. I'll just start the report from the breakout sessions. We had two very good sessions. I'd like to hear back first from the session on frame comparison of the EIA-3 and EIA-860 and I understand that Darius is going to summarize the discussion.

MR. SINGPURWALLA: I'll take a stab at it. We had a really good discussion on the frame comparison. A little background on it really quick for people who didn't sit in or get a chance to read the paper, the EIA contracted with the Census Bureau to conduct five frame evaluations of CNEAF surveys. For the spring session it covered more in depth three of the frame evaluations they did for surveys and for this one they focused on two. And the two that they focused on were the EIA-3, which is quarterly coal consumption and quality report, manufacturing plants, and 860, which is the annual electric generator report.
And then the goal of the analysis was to determine whether the EIA had sufficient coverage of production of solar thermal collectors and photovoltaic cells and modules, coal consumption, coke and breeze production in the manufacturing sector, and also to identify differences between the frames and identify characteristics of some of the missing establishments.

There were five evaluations that we talked about before and we focused on two. And then the methodology that they were using to do the actual frame comparison was a two-step matching process which we talked about in pretty good detail.

The first was to match the EIA reporting units to Census Bureau business register to determine if the manufacturing details will be available for comparison and if that match was found they could do comparisons on it and analysis. And then the step two of the matching, manufacturing units on EIA frames to manufacturing establishments from the 2002 Manufacturing Energy Consumption Survey, which is also known as MECS.

So that's the background of what they presented to us and at the end of the day they showed us the results of what the matching was and it's not up there any more which is too bad. But I think the overall match rate was 70 percent for over the five surveys. Is that right? It was 70 percent for the electricity generation and 79 percent for the coal evaluation coverage rate.

Oh, and then they gave us the ranges overall for the five surveys. So it ranged in coverage by establishment counts from 70 to 92 percent and a range by volume from 75 to actually 100 percent. So, I thought that they got pretty high, come to think of it.
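To make the two kinds of coverage rate concrete, here is a minimal Python sketch. It assumes coverage is measured against a reference frame (for instance MECS establishments) as the share of reference establishments, or of their reported volume, that were also found on the survey frame. That definition, the function name, and the sample records are assumptions for illustration, not figures from the paper.

```python
def coverage_rates(reference_frame):
    """Return (coverage by establishment count, coverage by volume).

    reference_frame: list of (volume, on_survey_frame) pairs, one per
    establishment on the reference frame. Hypothetical structure.
    """
    if not reference_frame:
        raise ValueError("reference frame is empty")
    total_volume = sum(volume for volume, _ in reference_frame)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    matched_count = sum(1 for _, matched in reference_frame if matched)
    matched_volume = sum(volume for volume, matched in reference_frame if matched)
    return matched_count / len(reference_frame), matched_volume / total_volume


# Made-up example: the two unmatched establishments are small consumers.
sample = [(100, True), (50, True), (5, False), (3, False), (40, True)]
by_count, by_volume = coverage_rates(sample)
print(f"coverage by count:  {by_count:.2f}")
print(f"coverage by volume: {by_volume:.2f}")
```

When the establishments missing from the survey frame are mostly small, coverage by volume comes out higher than coverage by count, which is consistent with the volume range sitting above the count range in the summary.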
After that I think the discussion focused on what I synthesize as about three major points. The first was the two-step matching process and some of the accuracy of how that was going on and how they were actually doing the matches. And this was actually pretty interesting to me. When I worked for a credit card company back in the day like five or six years ago we used to get so much different data from so many different sources. We were talking about three different credit bureaus, name and address information, and at the end of the day what we wanted to do was have one specific ID for Darius Singpurwalla and this could be coming from so many different sources.

This could be coming from something as carefully as I filled out like a credit card application when I just didn't want to screw it up because I really wanted to get that credit card and do whatever I wanted to do with it. And then you might have something as sloppy as from maybe a baseball game and I've had a couple of beers and they're giving away free hats if I fill out a credit card application and you get there and you're not filling it out as like, carefully as you should.

So, I mean, the matching is actually from the EIA frame to the Census Bureau databases. It's a pretty big problem and I think the way that you guys are doing it handling it in a two-step matching process is really a good and sophisticated way to do it. Doing the straightforward matching often doesn't get you the right results.
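The point about straightforward matching can be illustrated with a small Python sketch. It is not the Census Bureau's matching procedure: the normalization rules, the suffix list, the choice of name plus state as the key, and the sample records are all assumptions made up for this example. The first pass looks for an exact name and state match; the second pass retries on a normalized name and accepts the result only when exactly one candidate remains.

```python
import re

# Hypothetical list of corporate suffixes to ignore when comparing names.
SUFFIXES = {"INC", "CO", "CORP", "COMPANY", "LLC", "LTD"}


def normalize(name):
    """Uppercase, drop punctuation, and remove common corporate suffixes."""
    tokens = re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def two_pass_match(survey_units, register_units):
    """Match (name, state) tuples; returns {survey_unit: register_unit or None}."""
    exact = set(register_units)
    by_normalized = {}
    for name, state in register_units:
        by_normalized.setdefault((normalize(name), state), []).append((name, state))

    matches = {}
    for unit in survey_units:
        name, state = unit
        if unit in exact:
            # Pass 1: exact name and state.
            matches[unit] = unit
            continue
        # Pass 2: normalized name and state; ambiguous results stay unmatched.
        candidates = by_normalized.get((normalize(name), state), [])
        matches[unit] = candidates[0] if len(candidates) == 1 else None
    return matches


# Made-up records for illustration only.
survey = [("Alpha Brick", "TN"), ("Acme Co.", "OH"), ("Smith Mfg", "TX")]
register = [("Alpha Brick", "TN"), ("ACME COMPANY", "OH"), ("Smith Manufacturing Inc", "TX")]
for unit, hit in two_pass_match(survey, register).items():
    print(unit, "->", hit)
```

In this example the abbreviated "Smith Mfg" stays unmatched, the kind of false nonmatch that would call for further rules or clerical review using other fields such as address or ownership.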
Like, my name is a bad example because it's pretty unique but if we find someone else who's got like a more common name like Walter Hill, no offense, you're going to get a lot of different one-to-many matches and it's at that point in time you narrow it down who the Walter Hills in your universe are and then you can use further information that you have to get it down to the more one-to-one specific match and show that that business unit is being accurately represented in the Census Bureau frame.
And the only further comment I have, like, when I was with the credit card company we had, like, 15 different rules and three levels. I mean, it was just a huge headache and it was such a pain in the butt. I bet that at the credit bureau place we had a lot more data to play around with than what you guys are going to do.
So I think it's nice that you actually got it at, like, something that's a two-step level. And if you have further data that you're not using it might be a nice way to try and get that implemented to do further matching rules to get a bigger hit that way.
The second thing that I think that we talked about and focused on that was really interesting to me as well was Bill brought up when we were talking about the question that they posed to the group, what level of coverage should be considered sufficient, so we batted around that for a while. And then for the coal consumption we had the number of 79 percent coverage by establishment there. What Bill kept bringing up was asking us to think about some of the cost benefits that are associated with trying to get that actual match rate up because in essence to be able to get that up to a higher rate they're using a lot of FTE resources.
It takes a lot of work to be able to boost that up and how much of the resources should be used to actually try and get it up and, like, what kind of differences is it going to make to some of your estimates? So that was the second point that we focused on as well.
And then lastly the third one Moshe brought up as opposed to actually trying to fix up your frames and getting some of this matching and coverage correctly what if you did some sort of focus on correcting the under-coverage as opposed to fixing the frames like, looking at trying to provide better estimates for actually who you're not hitting as opposed to doing some of the background work and trying to fix up what you might not be missing in some of your frame hits.
So I thought that those were the three big things that we focused on. I mean, as all the talks have been, it's been great and very interesting to find out a lot more about this stuff.
DR. HENGARTNER: Thank you very much. Is there anybody else from that breakout session who wants to add to the summary? Thank you very much. So I will go next to the last contribution of Mark Bernstein.
MR. BERNSTEIN: Well, since two of the four people who were at the session are gone now I can say anything I want. I compliment the presenters. It was very interesting and I don't have to say for the last time they did present their objectives up front and I knew what they were trying to do and I'm very happy that that happened in the last session at my time here.
So what they did is they presented some results from a paper they're hoping to publish shortly on whether futures contracts in natural gas can help explain spot prices and came to the conclusion that they can't.
And they asked the committee a couple of things. One is should they do this for other fuels, and our reaction was, well, you don't really need to. This pretty much answers the generic question that you don't really need to spend the resources unnecessarily asking the same question for the other fuels.
We did get into a discussion of things related to STEO that we talked about yesterday which is how can understanding what happens in futures prices help STEO with its forecast in the short term, help explain some of the discrepancy, particularly the under-forecasting of prices that we saw yesterday. So one of the extensions of this might be this is not the question of whether futures prices are a good predictor of spot prices but whether futures prices can help understand discrepancies in forecasts.
They also said well, we get questions like this on all the other fuels. Do we need to answer them? We thought no, but one thing. They get a specific question from Hawaii about using futures prices at the PUC and there may be a role for EIA to do a little quick analysis answering that specific question for Hawaii on fuel prices but not necessarily need to do it generically for everything.
In fact that may be a really good use of EIA time in situations where it's pretty clear that the PUC is heading down a path of using something that analytically doesn't make any sense. It seems to be a good role for EIA to show them that it doesn't make good analytical sense. But overall we really like the paper and as soon as I can get one I would really like to have a copy of it.
And, Nick, anything else?
DR.
HENGARTNER: Yes, there's only one other thing that was discussed in that session, namely the idea of if we can combine both this analysis and the STEO analysis in the sense that we can compare. I mean, we all are saying STEO is not doing good or the prediction using the options, the futures, is not very good. But in fact if we look how good both of them do it's like comparing the S&P 500 and what the specific mutual fund is doing.
And so somehow maybe to use this either as a benchmark one way or the other to understand the fact that short-term energy prediction especially in the natural gas arena is extremely difficult and is a hard problem and showing that you know that it is a hard problem and there's no method that's going to do better than what you're doing actually gives you credibility that you've done as good as you could given the difficulty of that problem.
MR. BERNSTEIN: The question came up and the conclusion was futures prices is not a good predictor. The question was well, not good compared to what, and so that's when we got into discussion of whether it's better or worse than STEO and how does it compare to other predictors.
DR. HENGARTNER: So overall, I mean, it was a really interesting presentation. Any more?
MR. HILL: Was any reason given why the predictions were not very accurate?
MR. BERNSTEIN: Why futures prices don't track well?
MR. HILL: ———————————— the economic policy ————————————————————
MR. BERNSTEIN: On the economic policy or the futures?
MR. HILL: —————————— the futures ————————————
MR.
BERNSTEIN: No, I mean, we spent a lot of time chatting about anecdotally that you've got these traders out there and they all are doing their own thing and a lot of them were using canned models, canned packages. They're all doing the same stuff and if all their same models tend to tell them to push prices up then everybody starts buying higher prices.
Or you get one trader in one big company deciding that January is going to be a tough month and they put in a high futures price. Then everybody else starts looking, well, why did he do that or why did that company do that. Maybe we need to do it too. And then all of a sudden all the lemmings push the price up too. So there's a lot of anecdotal things that don't match what you might expect if you had a real robust market out there. So no, there's no real particular explanation but we chatted a lot.
DR. HENGARTNER: Anything else you would like to add? Then I'd like to continue and invite the public at this time for general questions. Don't be shy. Yes?
MR. WATSON: I'm Bill Watson and I do the coal surveys for the Energy Information Administration. The issue of frames is mainly at this point in my view one of prioritizing our work. We have a very focused staff and most of our staff time is spent running the survey, getting the data back in, QA'ing the data, and getting the data out to the public in a variety of reports and we are like this all the time just to accomplish that. So what I'd like to see happen here, and maybe it happens and I just don't know it, is there are obviously some indicators here from the work that's been done on frames that we have a potential problem.
And I don't know how it ranks in terms of all of the problems that EIA is dealing with that managers see but what I fear is that unless somehow this is put on an agenda and this information then is presented to people above me that this will just be an interesting anecdote to what we do and five years down the line again we will say oh, yeah, we studied the frames back there in 2005 and this is what we found but nobody decided to do anything about it and now it's a little bit worse or it's not as bad as it was then and so forth.
So I have a plea here too and maybe it's a question as well as a plea. Does this get translated into something that's a choice set for the senior managers in the agency or does it just stop here?
MS. KIRKENDALL: We really can't let it stop here. As to what we do with it we're going to have to figure that out but the frames issue, evaluating EIA frames, was something we decided to take a look at in the EIA strategic plan last time. So the strategic planners ultimately need to have a good report about the status of our frames and we had a frames team that did an internal evaluation.
I have asked Howard Bradsher-Fredrick to talk to the frames team with this new information to see if we can come up with a proposal perhaps to talk to this committee about next time. But EIA is also going to start up strategic planning again sometime in the spring.
We need to talk about where we are, what we've learned, do we need to develop resources in some areas, among all the other things we have to do too but I think we do need to have the discussion at a higher level.
MR. WATSON: Because we do not have at this point staff or budget for doing frames maintenance.
MS.
KIRKENDALL: Right, and you're not alone.
MR. WATSON: It's just not on the plate at this point.
DR. SITTER: I've got a question for you, Bill. You mentioned this looking at emissions to try and track your frame 10 years ago. Is that data easily available?
MR. WATSON: Well, it varies because these are data sets maintained by states. So if you go to the State of Ohio, as I did, roughly 24 months or so ago to their website there's a very large data set there. It looks very complete but it has air, water, land, discharges to land. It has everything in the database. It's a comprehensive database on everything that goes into the environment as a waste stream. And I downloaded it and I played with it electronically because that's what you really need to do. You're not going to be shuffling through stacks and stacks of printout which is really what we did 10 years ago.
And I found a lot of the places that we already had in the frame and I found a couple that might have been possible candidates for being in the frame but I couldn't find enough information about them there to really have confidence that I was seeing that there was something missing. I can go back to that and take another look but I must have spent probably myself alone, and I just tried this as a little experiment, a couple of weeks to filter this data. And I was trying to do this using my computer. I wasn't doing it by hand or anything like that.
DR. SITTER: Well, the reason I ask is because it sounds like a cool problem. It sounds like a cool problem for a student type of problem. Like, how do you take that kind of data and evaluate a frame? It's really a data mining problem if the data is available.
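As a rough sketch of the filtering step described above, and assuming a layout that the state data may not actually have, the following keeps only coal-burning records from a multi-media discharge file and lists facilities that are not already on the frame. Every field name, facility, and record here is hypothetical; real files would need far messier cleaning and a better key than name plus county.

```python
def frame_candidates(discharge_records, frame_keys):
    """Return (facility, county) keys for facilities that report coal
    combustion but are absent from the frame, in first-seen order."""
    seen = set()
    candidates = []
    for rec in discharge_records:
        if rec["fuel"].strip().lower() != "coal":
            continue  # drop records unrelated to coal burning
        key = (rec["facility"].strip().lower(), rec["county"].strip().lower())
        if key in frame_keys or key in seen:
            continue  # already on the frame, or already listed once
        seen.add(key)
        candidates.append(key)
    return candidates


# Hypothetical frame and discharge file; one facility appears under two media.
frame_keys = {("ohio valley forge", "scioto")}
discharge_records = [
    {"facility": "Ohio Valley Forge", "county": "Scioto", "fuel": "Coal", "medium": "air"},
    {"facility": "Buckeye Paper Mill", "county": "Ross", "fuel": "coal", "medium": "air"},
    {"facility": "Buckeye Paper Mill", "county": "Ross", "fuel": "coal", "medium": "water"},
    {"facility": "Lakeside Dairy", "county": "Erie", "fuel": "natural gas", "medium": "air"},
]

for facility, county in frame_candidates(discharge_records, frame_keys):
    print(f"possible frame addition: {facility} ({county} county)")
```

The output is only a list of leads. As noted in the discussion, each candidate would still need enough supporting information to decide whether it really belongs on the frame.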
And you have your ASA fellowship stuff that you put out and ask people if they're interested in this kind of stuff. Maybe you might want to consider you've had some experience, you know what data is out there. You could probably describe this kind of problem and think about it from that perspective as a possible proposal and maybe you'll get somebody that's interested. Then it's not a big resource problem.
MR. WATSON: Another footnote there is that increasingly EPA is taking over the regulation of some of those facilities because they've clamped down so much on electric utilities the discharges from coal burning and industrial facilities now begin to loom large. Even though they're small they begin to loom large because so much of it has been controlled at electric power plants. So EPA has a proposal, I'm not too sure how far along it is, to actually have some federal-type regulation on those facilities. At that point they will have an inventory themselves. So that would be another way that we could latch on to something that would help us quite a bit here if it happens.
DR. SITTER: But the point is this could be an interesting research problem at the master's level for a student and that may be a way for you to give them an opportunity and you a resource that you don't have.
DR. HENGARTNER: Any more questions from the audience?
Well, thank you very much for attending this meeting. I'd like to adjourn until sometime next spring.
(Whereupon, at 11:25 a.m., the PROCEEDINGS were adjourned.)
* * * * *