Tuesday, May 26, 2009

Daily Data, How bad is bad?

Dave has pointed us to a particular site that he says has the raw data, and he urges me to use that data. So I will. I decided to look at Hallettsville and Flatonia, Texas, for the year 1977. I had previously seen, on the CO2science.com site, large annual average temperature differences between Hallettsville and Flatonia around 1977. So I downloaded that year from

Now remember that this site claims that they have already edited out large standard deviation anomalies, so the data is partly edited. In spite of the claim to be raw data it isn't. They have also already done some areal adjustments, meaning that they have compared nearby towns. I will say that they didn't catch them all but that is for another post on another day about another pair of towns.

The first thing I did was download the chart of the annual temperature. And there it is, that weird jump in temperature at Hallettsville in 1977. It is about a 3 degree change.

So, the next thing I did for 1977 was to compare Hallettsville's temperature with the nearest town in the USHCN, Flatonia, TX. Flatonia is only 17 miles from Hallettsville and the topography is quite similar. I have been through these two towns during my career and they are both on the flat Texas Prairie. They should have very minor differences in temperature even on a daily basis but most assuredly on an annual basis. There are no mountains in that part of Texas.

So, I subtracted Flatonia's temperature from Hallettsville's and saw that the temperature difference was quite large throughout the year, mostly with Hallettsville being hotter.

The annual average difference in temperature between these two towns is 2.84 degrees F. You might be tempted to say that this isn't all that much. If so, that would be wrong.

A suggested set of criteria based on the horizontal temperature gradient has been devised. A weak front is one where the temperature gradient is less than 10 deg F per 100 miles; a moderate front is where the temperature gradient is 10 deg F to 20 deg F per 100 miles; and a strong front is where the gradient is over 20 deg F per 100 miles.

By those criteria, a moderate front starts at 10 degrees per hundred miles, which works out to 0.1 degree F per mile, and a strong front at 20 degrees per hundred miles, or 0.2 degree F per mile. The annual temperature gradient between Hallettsville and Flatonia, 17 miles away, is 2.84/17 = 0.17 deg F per mile, well into the moderate-front range and close to that of a strong cold front! And it lasted for a year's duration, or a significant part of that year.
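The gradient arithmetic can be checked with a few lines of Python. This is only a sketch: the function and constant names are illustrative, and the two thresholds are simply the moderate and strong front boundaries from the criteria quoted above.

```python
# Front thresholds from the quoted criteria, in deg F per 100 miles.
MODERATE_FRONT_MIN = 10.0
STRONG_FRONT_MIN = 20.0


def gradient_per_mile(delta_f, miles):
    """Horizontal temperature gradient in deg F per mile."""
    return delta_f / miles


# Annual average difference (deg F) and station separation (miles) from the post.
grad = gradient_per_mile(2.84, 17.0)
grad_per_100 = grad * 100.0

print(f"gradient: {grad:.3f} deg F/mile ({grad_per_100:.1f} deg F per 100 miles)")
print(f"ratio to moderate-front threshold: {grad_per_100 / MODERATE_FRONT_MIN:.2f}")
print(f"ratio to strong-front threshold:   {grad_per_100 / STRONG_FRONT_MIN:.2f}")
```

Expressing the result per 100 miles makes it directly comparable with the front criteria, whichever threshold one prefers to measure against.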

Now, why did I choose two neighboring towns? Because this is as close as we can come to actually verifying the temperature via duplication, and because these two towns sit on the hot Texas prairie, where there are no mountains. Even making the adiabatic lapse rate correction, the temperature difference is still above 2.5 deg F for the year, giving an adiabatically corrected gradient of 0.15 deg F/mile.

Now, what does this gradient mean? At about 15 deg F per 100 miles, it sits squarely in the moderate-front range of the criteria quoted above. Such fronts bring rain, winds and thunderstorms, yet, for the year 1977, there were no year-long storms over Hallettsville and Flatonia. In other words, this temperature difference could not possibly have existed or we would have seen the accompanying weather phenomena.

Temperature differences like these would lead to wind blowing most of the year from Hallettsville to Flatonia, NW winds, which are extremely unusual in that part of Texas. Below is the chart of the daily gradient and the 30-day running average gradient, along with markers for the January equator to North Pole temperature gradient and markers for that due to a cold front. You can see that the temperature differences between these two towns are very large by any measure available.

Now, I ran a 30-day running average over the temperature difference. The first output point is 41 days into 1977 because Flatonia is missing data and I didn't want to plot the data before the first full 30-day running average point. It is the last picture.
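A running average that refuses to output anything until it has a full window of valid days can be sketched as below. This is not the spreadsheet actually used for the chart; the station values are made up, and the 11 missing Flatonia days at the start of the year are a hypothetical gap chosen only to show how the first plotted point gets pushed out to day 41.

```python
def running_mean(values, window=30):
    """Trailing running mean.

    Returns None for any day whose window is shorter than `window`
    or contains a missing (None) value.
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out


# Hypothetical daily temperatures (deg F); None marks a missing day.
hallettsville = [53.0] * 60
flatonia = [None] * 11 + [50.0] * 49

# Daily difference, Hallettsville minus Flatonia, missing if either is missing.
diff = [
    h - f if h is not None and f is not None else None
    for h, f in zip(hallettsville, flatonia)
]

smoothed = running_mean(diff, window=30)
first_index = next(i for i, v in enumerate(smoothed) if v is not None)
print("first full 30-day average falls on day", first_index + 1)
```

Leaving the early points blank, rather than averaging over whatever days happen to exist, keeps the missing January data from distorting the start of the curve.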

Flatonia, not Hallettsville, has missing January data. I want to point that out because I am sure that Dave will want to claim that that is why Hallettsville has the high annual temperature in 1977. But that explanation can't be offered because, as the first picture shows, Hallettsville, which is not missing data, becomes suddenly very warm compared to its own earlier record, not just compared to Flatonia. The sudden warmth in Hallettsville is real and not an artifact of missing data.

Under the assumption that these two towns, only 17 miles apart, should have nearly identical temperatures throughout the year, and thus are effectively quasi-repeated measurements, we can use the two towns as a measure of the noise in the raw data. The noise level is large. The blue bar is the magnitude of the yearly spread, 3.7 deg F. That becomes the error bar for the dataset used to calculate the global warming, which is now pegged at around 1.1 deg F over the past century. Thus the signal we are trying to detect is 0.011 +/- 1.85 degree F for each year (1.1 deg F spread over 100 years, against half of the 3.7 excursion). Anyone familiar with science knows that this is far too small to be detected against this noise level. So, if temperatures are not repeatable, as this shows, we can't really know what is happening globally. The data is crap. Maybe Dave should have a critical look at the data he thinks is good.
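The size of the yearly signal against the error bar can be worked out directly. The snippet below is just that arithmetic, using the 1.1 deg F per century and 3.7 deg F spread quoted above; the variable names are illustrative.

```python
century_rise_f = 1.1      # warming over the past century, deg F
yearly_spread_f = 3.7     # yearly spread between the two towns, deg F

signal_per_year = century_rise_f / 100.0   # average rise per year
noise_half_width = yearly_spread_f / 2.0   # the +/- error bar

print(f"signal: {signal_per_year:.3f} deg F per year")
print(f"noise:  +/- {noise_half_width:.2f} deg F")
print(f"signal-to-noise ratio: {signal_per_year / noise_half_width:.4f}")
```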


  1. This might be why climatologists use "anomaly" data to track trends rather than the absolute temperature at a site.

    Part of the "processing" of the data. It isn't making the data say what it isn't saying, it is tracking a meaningful trend as opposed to noise.

    In any system in which data is collected it is possible to filter it to remove "outliers" (that is done in statistics quite a bit as I understand it), but more importantly, the reliance on a raw temperature score as opposed to an "anomaly" can give you very different ideas of what is going on.

    Filtering exists to find a signal, correct?

    Clearly, as you note, there probably isn't a 2 degree F temperature difference between these two neighboring towns, but their trends in temperature change may indicate a change. More spatial data, gridded and filtered, gives stronger signals.

    That's the point of the debate, isn't it? Climate scientists are using real-world data (which can often be inherently noisy) to see if the models are working accurately. To my limited knowledge the climatologists don't make claims around what the absolute temp will be in Hallettsville, TX next year.

    To limit the discussion to a point-by-point analysis of temperature stations is to miss the signal which can be emergent from the sum total of the data.

    Just as you wouldn't take one formation resistivity log data point and try to make a decision about the entire field.

  2. Hagiograph, anomaly data isn't any good if the mean against which it is calculated is crap. Why? If we calculated a daily average temperature mean for several years, and then subtracted that curve from the curve of 1977, 1977 would have a high anomaly. It would still look warm, when in fact the lack of physical phenomena supporting that temperature difference PROVES that that temperature can't be true. It simply can't be true there because if such a temperature difference is there, then you should see thunderstorms lasting for 3 months. Or you should see Northwesterly winds for 6 months of the year--winds go from high pressure to low pressure and a high temperature would cause high pressure. That is what temperature differences cause.

    Where are the normal expected meteorological phenomena for such temperatures???

    Thus, it is a nice try to excuse the crap data by pulling out a non sequitur, but it won't work.

    You haven't seen yet the homogeneity filter so don't be so confident that they aren't taking data and making cooling stations turn into warming stations.

    At least we agree that a 2 deg difference isn't real. I did screw up on the size of the yearly signal: I posted a .1 rise on the red triangle when I should have posted a .01 rise. The fact is the yearly rise in temperature is so tiny compared with the noise that it is unlikely that we could pick it up, even in 100 years. There are mathematical rules to how one gets a signal out of the noise.

    As to this being anecdotal, I have already said I intend to do more analysis and know that all the stations show this kind of crap. This is why the USHCN doesn't really want the truly raw data out there. It is worse than what is in this file.

    As to one resistivity log, darn straight I would condemn an oil prospect over one bad resistivity log. (You don't know much about the oil industry, judging by that comment.)