Sy Harding, author of Riding the Bear, writes: “In the fine print of the employment report each month, under a section titled ‘Reliability of the Estimates’, is this statement: *‘The confidence level for the monthly change in total employment is on the order of plus or minus 430,000 jobs.’*

Given our recent lament about the precision of the monthly NFP data, even we found that hard to believe. So I went to the most recent release, and scanned:

What do you know! There it was in BLS print:

Statistics based on the household and establishment surveys are subject to both sampling and nonsampling error. When a sample rather than the entire population is surveyed, there is a chance that the sample estimates may differ from the "true" population values they represent. The exact difference, or sampling error, varies depending on the particular sample selected, and this variability is measured by the standard error of the estimate. There is about a 90-percent chance, or level of confidence, that an estimate based on a sample will differ by no more than 1.6 standard errors from the "true" population value because of sampling error. BLS analyses are generally conducted at the 90-percent level of confidence.

For example, the confidence interval for the monthly change in total employment from the household survey is on the order of plus or minus 430,000. Suppose the estimate of total employment increases by 100,000 from one month to the next. The 90-percent confidence interval on the monthly change would range from -330,000 to 530,000 (100,000 +/- 430,000).

These figures do not mean that the sample results are off by these magnitudes, but rather that there is about a 90-percent chance that the "true" over-the-month change lies within this interval. Since this range includes values of less than zero, we could not say with confidence that employment had, in fact, increased. If, however, the reported employment rise was half a million, then all of the values within the 90-percent confidence interval would be greater than zero. In this case, it is likely (at least a 90-percent chance) that an employment rise had, in fact, occurred. At an unemployment rate of around 5.5 percent, the 90-percent confidence interval for the monthly change in unemployment is about +/- 280,000, and for the monthly change in the unemployment rate it is about +/- .19 percentage point.
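The arithmetic in the BLS example is easy to check. Here's a minimal sketch in Python, taking the quoted +/- 430,000 half-width (about 1.6 standard errors) at face value:

```python
# 90% confidence interval for the monthly change in household-survey
# employment, using the +/-430,000 half-width quoted by the BLS.

def ci_90(point_estimate, half_width=430_000):
    """Return the 90% confidence interval (low, high) around an estimate."""
    return point_estimate - half_width, point_estimate + half_width

low, high = ci_90(100_000)
print(low, high)              # -330000 530000, matching the BLS example

# Zero sits inside the interval, so a reported +100k gain is not
# statistically distinguishable from no change at all.
print(low <= 0 <= high)       # True
```

By the same logic, only a reported monthly change larger than 430,000 in either direction clears the interval on its own.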

Yes, it's true: zero was within the 90% confidence interval for the monthly change in nearly every month of the past two years!

Harding states this is nearly double the prior fudge factor.

*Source:*

Employment Situation: OCTOBER 2006

BLS, Friday, November 3, 2006.

http://www.bls.gov/news.release/pdf/empsit.pdf

Fudge is in the eye of the beholder.

BLS is never going to detect significant changes in employment until the change exceeds the margin of error.

You may wish to go down and argue with a stop sign, but you can’t argue with statistical science.

Wasn’t there some whining lately from certain folks about the margin of error on the study of Iraqi deaths published in the Lancet? As I recall, the margin of error was about the same.

Given that job growth figures are about the only thing such people have to tout about the economy, I don’t expect them to dismiss the unemployment rate as easily as they dismiss the death rate in Iraq.

I wonder if the error rate goes down when looking at yearly or quarterly employment change? An increase in the number of samples will usually decrease the statistical error.
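As a rough illustration (assuming, for simplicity, that monthly sampling errors are independent, which isn't exactly true for the CPS since its survey panels overlap from month to month): the error on a cumulative change grows only with the square root of the number of months, while a persistent trend accumulates linearly, so longer horizons do become statistically cleaner.

```python
import math

# Standard error implied by the BLS figure: 430,000 is about 1.6 SEs.
SE_MONTHLY = 430_000 / 1.6    # ~268,750

def se_cumulative(months, se=SE_MONTHLY):
    """SE of a sum of `months` independent monthly sampling errors."""
    return se * math.sqrt(months)

# With a hypothetical steady true gain of 100k/month, the signal-to-noise
# ratio improves as the horizon lengthens:
for months in (1, 3, 12):
    ratio = 100_000 * months / se_cumulative(months)
    print(months, round(ratio, 2))   # roughly 0.37, 0.64, 1.29
```

The 100k/month gain is a made-up input; the point is only the square-root scaling of the noise.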

When you have oddities in the economy, uncertainties go up, no matter your method. There is nothing wrong with the BLS. Their methods are always going to work best when things are chugging along normally.

The idea that the BLS is showing a “tight” labor market is in the eye of the overworked reporters who never look past the headline. Based on employment-to-population, the current data have been flat since early 2006. Unless you believe there is another big benchmark revision coming, you have no quarrel with the BLS.

As for the Lancet study, the people out there quarreling with it are qualitatively different from those quarreling with the BLS. In the end, the former group trades in the market. You don’t have to lecture me if you think I’m wrong — just take the other side of my trades.

Oops, grammar police: “..former group does not trade..”

Do this:

Count approximately 1,000 children on a playground, only once, each day, and write the figure down. You can’t disturb, corral, gate or in any other way influence the children.

I doubt you could be more accurate than 3 children per thousand, but here is what will happen:

Suppose that in reality there are 2 more children each day, but you are ignorant of this.

On any given day, you may observe a decline in your count of the children… however, after many days you will notice that your errors assemble themselves in the direction of the real change in number. That is, the distribution of error will drift toward the true value.

This will mean in the case of the BLS report that, even though we have a series of increases in employment that are, each one, within the statistical margin of error, we learn in time that they did represent a valid increase.

Finally, after a very long time, such that the number of children has increased to, say, 1,500, even though your counting errors will still be no more accurate than before (for each count), you’ll count a number far closer to the true 1,500 than to 1,000.

It is the same with BLS.
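A quick simulation of the playground thought experiment, with made-up numbers (1,000 children to start, 2 more each day, and a daily counting error with a standard deviation of about 30):

```python
import random

random.seed(0)                       # reproducible illustration
true_count, growth_per_day, count_sd = 1_000, 2, 30

counts = []
for day in range(250):               # 250 days -> truth ends near 1,500
    true_count += growth_per_day
    # Each day's count is the truth plus unbiased random error.
    counts.append(true_count + round(random.gauss(0, count_sd)))

# Individual days can show "declines", but the recent counts cluster
# near the true 1,500, not near the original 1,000.
print(min(counts[:10]), max(counts[:10]))
print(round(sum(counts[-10:]) / 10))
```

The key assumption is that the counting error is unbiased; a systematic miscount would not wash out this way.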

Cactus,

It’s even worse than that. The BLS data uses a 90% CL while the Lancet study used a 95% CL.

That's exactly right, Eclectic. And I pointed that exact thing out here before, the last time this topic came up.

It was ignored then, and now the topic is recycled again. The focus is always on the monthly slices, but never any statistical discussion of the fact that the numbers are always on the positive side, and that when you put the slices together over an annual period the margin of error decreases greatly, so those cumulative numbers represent a statistically more accurate and statistically significant increase.

If the number truly were close to zero, then the reports should fluctuate randomly around the general area of the true number. Since all the reports have printed positive for more than a year, it is statistically very unlikely that the number is anywhere near zero.
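A minimal version of that sign-test argument (assuming, hypothetically, a true change of zero and symmetric sampling error, so each monthly print is an even coin flip):

```python
# If the true monthly change were really ~0 and the sampling error were
# symmetric, each report would come out positive or negative with equal
# probability. A long unbroken run of positive prints is then very unlikely.

def prob_all_positive(n_months):
    """P(all n reports print positive | true change ~ 0, symmetric error)."""
    return 0.5 ** n_months

print(prob_all_positive(12))   # ~0.00024, about one chance in 4,000
print(prob_all_positive(24))   # ~6e-8
```

This ignores any serial correlation between monthly reports, so treat it as a back-of-the-envelope bound rather than a formal test.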

But you can make a somewhat convincing argument by myopically focusing on only the month-to-month reports and their inherent individual margins of error. And if you like, you can argue the same point twice, just for effect.

Let’s suppose true employment increased by 10,000 this month, but that in reality it was cresting and would decline before next month’s report.

Further suppose that, due to random sampling error, the BLS reported the 10,000 increase as a decline of 20,000, because its sampling error had begun to detect ‘delta.’

The media would take up the banner headline of a decrease, and the pundits would begin to build in their assumptions of how long it would take to return to job growth. Bears would say it’s worsening… Bulls would say it’s not likely to worsen.

Then… the next month, in reality employment declines by 20,000… but the BLS reports ‘0’ change or very nearly that. It’s still a tight survey and delta is being detected, but gradually, like a hill is topped in a moving car.

The bullish pundits would classify the zero as the end of the slump and begin to build assumptions for job growth.

If in the next month, the accelerating decline produced a loss of 150,000 real jobs, the BLS sampling error might only just be beginning to accelerate the error distribution to the negative side… it’s possible it might show a decline of, say, 30,000.

All of these figures are still within sampling error, even under the best efforts of BLS.

What would both pundit camps say?

Suppose the next month the real decline was 500,000 jobs, but BLS demonstrated a loss of just 80K. Minus 500k, plus or minus 430k makes the -80k data within the margin of error.

What we have is a demonstration of a possible picture of the flux reported by BLS under real circumstances of a turnover in jobs.

Eclectic – You are probably correct about multiple observations leading to a more accurate reading of employment growth, but this assumes that the error is random. It may be, but isn’t necessarily.

Using your kids-on-the-playground example, suppose the study defines a child as a prepubescent human. Further suppose that as some of the children reach puberty, they continue to attend the playground, but as they no longer qualify as children, they’re excluded from the sample. Monthly numbers may show a decline in kids’ playground attendance until one day, some years later, an enterprising researcher does a full census and finds that the postpubescent humans still attending the playground have done what comes naturally, but the resulting prepubescents were too small to be noticed by the regular researchers. Now we have a massive revision, and possibly a change in method to capture the newborns. The underlying trend may have been up all along, but the method used systematically excluded the young families.

Let’s look at it another way: what is the confidence level that the reported monthly number is accurate to within, say, +/- 30,000? My sense is that if that were made public, we’d all be golfing the first Friday of every month instead of waiting with bated breath for the employment report.

It does seem like the issue is how to interpret noisy data. By my calculations based on a normal distribution, a 100,000 increase for a measurement with a standard deviation of 270,000 says that there is about a 15% chance that employment has increased by 0-100,000.
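That figure checks out under the normal approximation (note that 270,000 is roughly the standard error implied by the quoted 430,000 being about 1.6 standard errors):

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

estimate, sd = 100_000, 270_000

# P(0 <= true change <= 100,000), given the reported +100k estimate:
p = norm_cdf((100_000 - estimate) / sd) - norm_cdf((0 - estimate) / sd)
print(p)    # ~0.144, i.e. about a 15% chance
```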

Estragon,

Your assumptions are rather esoteric. It would be almost like all of a sudden deciding to count a child who had a pet at home, but not counting them if they didn’t… or counting them if they did their homework and not counting them if they didn’t.

No, you’re still stuck on the theme of either innocent or intentional ‘manipulation.’

Tick Tock,

That’s my point exactly. I’m sure the confidence of +/- 30,000 accuracy would make the BLS staffers chuckle, since it would have to be extremely low.

So, let’s all just ignore the BLS results until they’re statistically significant. In my view jobs are a lagging indicator, certainly no more than a coincident indicator of GDP in a timing sense.

When employment turns over (a relative maximum or minimum is reached), the BLS report won’t be able to indicate it. It will then, however, be very capable of indicating d-e-l-t-a.

So, we can argue about indicators that might prove a downturn in the economy, but the wise thing is probably to ignore BLS until it follows the pack.

Barry is usually careful to distinguish between the household and payroll surveys, but not here. Many readers may not notice that the 430K confidence interval is for the household survey (in the quoted, out-of-context section). The establishment survey confidence interval is about 100K.

Eclectic and Q-Ball have the right idea in their comments here. We have also covered the issue of sampling and non-sampling error extensively on our site, and the site for the payroll employment game. The monthly reports are estimates, pending the actual reports from state agencies handling taxes for unemployment insurance. That count is the source of the benchmark revisions. Those revisions have confirmed strong job growth.

Eclectic – My point was simply that without knowing the source of the error(s), we can’t be certain of trends. If the errors aren’t random, the trend may be misleading. I don’t think the numbers are intentionally manipulated. I do think that the survey method is prone to a variety of errors, and that some of them may be systemic.

I agree completely that the series is of limited use in trying to get a read on the current state of the economy. Paul Kasriel at Northern Trust had an interesting take on this recently. He asserted that the initial claims number (which is not survey or sample based) is a better coincident indicator than NFP, which lagged by 2 quarters.

Eclectic – your playground example misses the point of a survey – it doesn’t try to count the whole population on the playground. Rather, the survey would count a few classrooms to estimate what the whole population on the playground might be.

I do agree that putting together many data points increases our confidence. But better still is triangulation with multiple data sources – unemployment claims, surveys of those not in the labor force, etc.

Paul Kasriel just wrote a nice piece suggesting that one of the few reliable jobs numbers is weekly unemployment claims. They’re never revised, and even serve as a leading indicator.

Thanks winjr… exactly what I’m in the mood for.

Oh, Mrs. Ec-lec-tic, w-h-e-r-e a-r-e y-o-u? I want to share some Big Pic with you.

No, seriously, it’s some genuinely informative material on new claims.

One figure that they have not included here that would be extremely interesting is the probability that the true number is greater than or equal to zero (with whatever given level of confidence is desired).

This should be trivial to calculate for those with access to the data, and who remember their first-year stats course. In fact, it’s a very standard test and rather surprising that BLS does not include the figure.
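For what it's worth, here is what that standard first-year test looks like, sketched under a normal approximation and using the standard error implied by the quoted +/- 430,000 at 90% confidence (430,000 / 1.6):

```python
from math import erf, sqrt

SE = 430_000 / 1.6   # ~268,750, implied by the BLS +/-430k at 90% figure

def prob_change_positive(estimate, se=SE):
    """P(true monthly change >= 0), assuming normal sampling error."""
    z = estimate / se
    return 0.5 * (1 + erf(z / sqrt(2)))

# A reported +100k gain implies only about a 64-65% chance that employment
# actually rose -- barely better than a coin flip.
print(prob_change_positive(100_000))   # ~0.645

# A reported +500k gain makes an actual rise very likely:
print(prob_change_positive(500_000))
```

This assumes sampling error is the only error in play; non-sampling error, which the BLS notes also acknowledge, would widen the true uncertainty further.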

You are all beat for really watching and commenting on this stupid blog. You must not have jobs your damn selves.