Friday, March 20, 2009

Noise in the System - Guest post with addendum

There has been some discussion in the comments about the ability of climatologists to fix the noise in the raw data which I showed in my posts of March 15 (here) and March 19, 2009 (here). The incredible variability of the data, at least as far as this scientist is concerned, makes it impossible to dig the signal out of such an incredible level of noise.

A good friend, counselor, and fellow DNA counter of mine saw the comments and sent me this note with kind permission to use it as I saw fit. His note addresses a criticism in the comments on the March 15th post.

This intro is partly taken from a bio on one of his papers. Gordon Simons is a professor emeritus of statistics at the University of North Carolina. He has twice served as the Chairman of the Department of Statistics. He earned his PhD in statistics at the University of Minnesota in 1966. He has published numerous papers concerned with probability and statistics, mostly of a theoretical nature.

{start of G. Simons' note}
I would like to submit an informal rebuttal-to-the-rebuttal comment,
relating to Glenn's March 15 posting "The Raw Truth: The Actual
Temperature readings." Here is the rebuttal:

{inserted by GRM -Queen-of-fractal-beauty wrote:}
"Knowing how to throw out the noise so that you are left with meaningful information is a science unto itself. Raw meteorological data is nearly impossible to read. That's why all the charts and graphs put out by the community show corrected or smoothed data. They aren't "hiding" the truth. They're removing the noise so that we can see the truth."{GRM -end of queen's quotation}

It seems to me that this encapsulates the case for the dismissal of Glenn's argument in this posting. If what the rebutter says here is correct, then Glenn's position is devastated. No doubt, the rebutter felt that he had served up a coup de grâce. Glenn and I were served the same argument by one of my statistical colleagues at UNC, so I think it is important to address this argument head on.

The real problem with this kind of rebuttal traces to the word "noise," and, more importantly, to how it is modelled. The "science unto itself" that the rebutter refers to is a science BASED ON MODELLING. If the "noise" is merely that resulting from randomness and nothing more, then the rebutter's conclusion seems valid to this statistician. BUT, if this "noise" contains significant components of bias, then one will unwittingly draw a flawed conclusion from the smoothing process. Why? Because the usual statistical tools of averaging and smoothing over large aggregates of data will not, indeed cannot, defang bias in the data and render it "of no concern."

A very simple example is in order to clarify my point: imagine that the length of an object is measured 1 million times with a ruler, with each measurement recorded to several decimal places of accuracy, and an average is computed. One will come up with an explicit length estimate with a very small standard error. However, suppose the ruler is flawed and that all of its measurements are too large by 1 cm. This cm is what statisticians call bias. If one knows the precise size of the bias, then no harm is done, because it can be subtracted from all of the recorded measurements before the averaging is performed. Or it can be subtracted from the average itself, to the same effect. But if one only knows that a significant bias is present without knowing its size, then the statistician is helpless. There is no tool of statistical science that can rescue the situation.
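Gordon's ruler example can be sketched numerically. This is a hypothetical illustration of mine, not part of his note; the true length, bias, and noise level are made-up values chosen only to show the effect:

```python
import random

random.seed(0)
TRUE_LENGTH = 10.0   # cm, the real length of the object (assumed)
BIAS = 1.0           # cm, every reading is too large by this amount
NOISE_SD = 0.05      # cm, random measurement error (assumed)
N = 1_000_000

# Each measurement = truth + fixed bias + random noise
readings = [TRUE_LENGTH + BIAS + random.gauss(0, NOISE_SD) for _ in range(N)]

mean = sum(readings) / N
# The standard error of the mean shrinks like 1/sqrt(N)...
se = NOISE_SD / N ** 0.5
print(f"average = {mean:.4f} cm, standard error = {se:.5f} cm")
# ...but the average still sits about 1 cm above the true 10 cm:
# averaging defeats the random noise, not the bias.
```

The million averaged readings pin down a number with great precision, and that number is still wrong by the full 1 cm of bias.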

So must a realistic model of historical temperature data include a bias component? Absolutely! Stripped of a lot of verbiage, this is the content of Glenn's argument in this posting. And, unfortunately, no one really knows how large the various biases in the temperature data are. Faced with this intractable situation, it is tempting to ignore the issue of bias -- acting, in effect, as if it does not exist. But then one is fooling oneself when one declares, as the rebutter does: "They're removing the noise so that we can see the truth." This simply is not "truth."

Glenn, you can use these comments in any way you wish.
{grm-end of Gordon's comments}

Addendum: I want to make a further comment on the issue above, which in part involves signal-to-noise. I want it to be clear that the responsibility for this part of the post is mine, not Gordon's. He has neither seen nor approved it; it comes from my experience with digital signal processing. Any errors are mine.

When the noise, even if it is random, has too large an amplitude relative to the signal you are trying to extract from the data stream (which consists of signal plus noise), the signal is totally lost in the noise and is impossible to recover -- even if one knows that the noise is Gaussian in distribution.

I took all of California's temperature stations and calculated the standard deviation of all the temperatures from the start of the record to 2005. In the raw data, the standard deviation of all temperatures is 6.2 deg F. Then I took the edited data, which is supposed to clean up the noise. After editing, the standard deviation was 6.3 deg F. Now, if all these temperatures are considered measurements of California's climate, then any change in California's climate can be known only to within +/- 6.3 degrees, a nearly 13-degree spread.

So, if California is said to have warmed up over the past century at the same rate as the globe, then California would have warmed by 1.1 deg F +/- 6.3 degrees.

The 6.3 standard deviation is related to actual temperatures. Some think that moving to an anomaly representation of the data will fix this. The interesting thing is that an anomaly is merely the temperature minus some base line--say the average temperature between 1950 and 1961 or some such. That means you are subtracting a constant. And in any statistical set of numbers, the subtraction of a constant doesn't change the standard deviation.
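This invariance is easy to check. A minimal sketch, using made-up station temperatures (the values are illustrative only):

```python
import statistics

# Hypothetical station temperatures in deg F -- illustrative values only
temps = [52.1, 58.4, 61.0, 47.3, 55.9, 66.2, 50.8, 59.5]

baseline = statistics.mean(temps)          # stands in for a 1950-1961 style baseline
anomalies = [t - baseline for t in temps]  # subtract the same constant from each value

sd_raw = statistics.stdev(temps)
sd_anom = statistics.stdev(anomalies)
print(f"SD of temperatures: {sd_raw:.3f}, SD of anomalies: {sd_anom:.3f}")
# The two standard deviations come out identical: shifting every value
# by a constant moves the mean but leaves the spread untouched.
```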

So, if the anomaly shows a warming of 1.1 deg, the standard deviation (SD) will still be 6.3.

And that then brings into play another statistical point of interest, the CV, the coefficient of variation. This is the standard deviation divided by the mean, times 100. That means that the coefficient of variation for the anomaly would be 6.3/1.1 x 100 = 573. The interesting thing about the CV is that if one inverts it (as a plain ratio, dropping the factor of 100), one gets the signal-to-noise ratio, at least that is what time series folk, like EEs, do. That means that the strength of the signal (the 1.1 deg F of warming) is only about 0.17 times the strength of the noise. In this case the noise would overwhelm any signal.
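The arithmetic can be checked in a few lines, using the numbers from the text above:

```python
signal = 1.1    # deg F, the claimed century warming
noise_sd = 6.3  # deg F, standard deviation of the station data

cv = noise_sd / signal * 100   # coefficient of variation, in percent
snr = signal / noise_sd        # signal-to-noise ratio as a plain ratio

print(f"CV  = {cv:.0f}%")      # about 573%
print(f"SNR = {snr:.2f}")      # about 0.17: the noise is roughly 6x the signal
```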

I prepared a few pictures to illustrate the effect of signal-to-noise ratio. The first picture is of a pyramid with a hole in the middle of it. That is the signal. I made this in Excel.

Here is the pyramid with the signal 9 times bigger than the noise. The pyramid is quite clear.

In this next picture, the signal is a little more than twice the amplitude of the noise. You can see that it is getting difficult to see the pyramid.

The next picture shows a 1 to 1 signal strength to noise strength. One can barely make out the pyramid.

The final picture illustrates what we know in geophysics: when the signal-to-noise ratio is 1/2 (the noise is twice the strength of the signal), you can't see the signal.
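For readers who want to reproduce the effect, here is a rough sketch of how such pictures can be generated. This is my own reconstruction, not the original spreadsheet (the originals were built in Excel); the grid size and the uniform noise distribution are arbitrary choices:

```python
import random

random.seed(1)

def pyramid(n):
    """Signal: a square pyramid of height 1 on an n x n grid."""
    c = (n - 1) / 2
    return [[1 - max(abs(i - c), abs(j - c)) / c for j in range(n)]
            for i in range(n)]

def add_noise(grid, snr):
    """Add uniform noise scaled so that signal amplitude / noise amplitude = snr."""
    amp = 1.0 / snr
    return [[v + random.uniform(-amp, amp) for v in row] for row in grid]

signal = pyramid(21)
for snr in (9, 2, 1, 0.5):  # the four cases shown in the pictures
    noisy = add_noise(signal, snr)
    # A crude "visibility" measure: noisy peak height minus a corner value.
    # At snr = 9 it stays near 1; at snr = 0.5 it is swamped by the noise.
    print(f"SNR {snr}: center-corner contrast = {noisy[10][10] - noisy[0][0]:.2f}")
```

Plotting each `noisy` grid as a surface chart reproduces the progression from a clear pyramid to pure noise.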

In the case of a 6.3 standard deviation, the noise is about six times the level of the signal. That is what it means statistically when one claims that the earth has warmed by 1.1 deg F +/- 6.3 deg.

There is not much point in showing a picture matching the California data. It doesn't get better than the picture above, which is three times less noisy than the California temperature data.

Now, how would one overcome this noise level? Taking numerous measurements and adding them together, properly aligned, will bring the pyramid out of the noise at the rate of the square root of N, where N is the number of measurements. That is how COBE and WMAP bring tiny signals out of the noise when they look at the microwave background. It is how we in geophysics bring tiny signals out of the sound wave field of the earth, which enables us to clearly picture the subsurface with sound wave reflections whose amplitudes are about one angstrom!
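The square-root-of-N effect of stacking can be demonstrated in a few lines. This is a sketch of mine, with an arbitrary signal of 1 buried in noise six times its size, roughly the California situation:

```python
import random
import statistics

random.seed(2)
SIGNAL = 1.0     # the constant signal we are trying to recover (assumed)
NOISE_SD = 6.0   # noise six times the signal, as in the California case

def measure():
    """One measurement: signal plus Gaussian noise."""
    return SIGNAL + random.gauss(0, NOISE_SD)

# Stacking: average N aligned measurements. The residual noise in the
# stacked estimate falls as 1/sqrt(N). We repeat the experiment 200
# times per N to see the spread of the stacked estimates.
for n in (1, 100, 10_000):
    stacks = [sum(measure() for _ in range(n)) / n for _ in range(200)]
    print(f"N={n:>6}: spread of stacked estimate = {statistics.stdev(stacks):.3f}")
# The spread drops from roughly 6 to roughly 0.6 to roughly 0.06 --
# but only because we can repeat the measurement, which is exactly
# what is impossible for a 1957 temperature reading.
```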

But with temperature, you can't go back to 1957 and re-measure the temperature in Peoria, Illinois. You get one measurement of it. That measurement has both signal and noise in it. You don't know what part of that temperature is signal and what part is noise. If it has a bias, then, as Gordon says, you can't remove it unless you know the amount of the bias. If it has a statistically random part on top of the true temperature, I would still contend that you can't remove that either, because you have only one sampling of the temperature on that day in that place. You don't have the multiple measurements needed to estimate the inherent fluctuations in the temperature for that day.

One final issue. This new blog has gotten some comment on other lists.
Rich Blinne wrote of my Trees Don't Lie post (here):

"One data series using one technique in one location simply doesn't cut it. Good science is not only peer review, it's also repeatability, specifically repeatability using multiple techniques. You will note the dates of the cited references (2000 and 2002). Quite a few proxy studies have been done since then. When Mann did his original hockey stick diagram in 1998 it was novel. Now it's consensus because it showed up over and over and over using different kinds of proxies and with greater geographic dispersal."

Rich makes three points. First, he claims that I used one data series. That is false: I cited two studies.

Second, he claims that the one study was at one location. That too shows that he did not do his research and simply engaged in a knee-jerk reaction. Below is a picture of the locations in the Esper study. Clearly this is far more than merely one location.

Rich also points everyone to two pictures. I would note that if you look closely at the pictures Rich points us to in that note, only the instrumental temperature records go shooting upwards like rockets. All the other proxies look much like what I posted in my Trees Don't Lie post. Apparently Rich hasn't actually looked closely at the data he claims supports his position.

In it, the red CRU instrumental record and the grey HAD record end up at +0.9 and +0.6 respectively. No other proxy is higher than +0.2. Rich seriously needs to examine the data he puts forth. Clearly the data he offers as supportive of global warming shows merely that the instrumental temperature record is out of line with the proxy record -- all of them.

Third, he claims that science is only done with peer review. Peer review is merely a way to enforce conformity. Newton, Maxwell, Darwin, Einstein and even Murray Gell-Mann produced important scientific works which were not peer reviewed. Gell-Mann's "The Eightfold Way" was among the most important particle physics papers ever written, yet it was never formally published. It just circulated. At some point I will post on the stupidity of peer review and the consensus groupthink of the sheeple scientific community, but one example will suffice. This is what peer review did for a couple of Nobel prize winning papers.

"One example is Rosalyn Yalow, who described how her Nobel-prize-winning paper was received by the journals as follows: 'In 1955 we submitted the paper to Science...the paper was held there for eight months before it was reviewed. It was finally rejected. We submitted it to the Journal of Clinical Investigations, which also rejected it.' Another example is Gunter Blobel, who, in a news conference given just after he was awarded the Nobel Prize in Medicine, said that the main problem one encounters in one's research is 'when your grants and papers are rejected because some stupid reviewer rejected them for dogmatic adherence to old ideas.' According to the New York Times, these comments 'drew thunderous applause from the hundreds of sympathetic colleagues and younger scientists in the auditorium.'" Frank Tipler, "Refereed Journals," in William Dembski, editor, Uncommon Dissent (Wilmington, Delaware: ISI Books, 2004), p. 118.

Another Nobel prize winning work was rejected because -- well, see below.

“Glaser realized that charged particles shooting through a superheated liquid will create a disturbance and trigger the boiling process as they ionize the atoms of the liquid along their paths. For a fraction of a second, a trail of bubbles will form where a particle has passed, and this trail can be photographed. But you must act quickly, or the whole liquid will begin to boil violently. Glaser therefore planned to release the pressure and then immediately restore it. Particles entering the liquid during the critical moments of low pressure would leave trails that could be photographed. The immediate restoration of pressure would mean that the liquid was once again just below boiling point, and the whole process could be repeated.”
“In the autumn of 1952, Glaser began experiments to discover if his 'bubble chamber' would work. After thoroughly considering possible liquids, he chose to use diethyl ether. With a small glass vessel holding just 3 centilitres of the liquid, he successfully photographed the tracks of cosmic rays. But he faced an uphill battle in developing his invention. He was refused support by the US Atomic Energy Commission and the National Science Foundation. They said his scheme was too speculative. And his first paper on the subject was rejected on the grounds that it used the word 'bubblet', which was not in the dictionary. But his luck changed in 1953, when a chance meeting brought the bubble chamber to fruition.”
“Glaser's first talk on his idea was to be given on the last day of the American Physical Society's meeting in Washington DC in April 1953. Among the participants at the meeting was Luis Alvarez, a distinguished physicist.”
Frank Close, Michael Marten, and Christine Sutton, The Particle Odyssey (Oxford: Oxford University Press, 2002), pp. 92-93.

I would note that J. Tuzo Wilson's paper, the one which revived continental drift, was rejected by the major journals and had to be published in an obscure journal with less stringent standards. Peer review does too much harm to be taken seriously as an arbiter of truth. In the spirit of this post, peer review seems to be a mechanism for maintaining scientific noise and rejecting scientific truth.

Yes, Rich, let's hear it for consensus peer review, a process that can't even recognize Nobel Prize quality work!


  1. "{start of G. Simons' note}"

    I can't say that I disagree with anything he had to say. He did, however, say it much better than I or woox did.

  2. Huh? He agrees with my position. Maybe you should read it again.