Continuous Learning - which variables?

My notification request is for the general user, not the 0.1% of us who watch the UDP packets…

3 Likes

You mistyped 10%. :slight_smile:

1 Like

Hi @peter Great questions. Antarctic – :snowflake: brrr!! The CL system is unlike traditional manual offset calibration, so it’s natural there are lots of questions. We’ll try our best to explain in simple terms. There are a bunch of proprietary techniques in play here, so apologies in advance for the need to obfuscate some of the technical details to protect our intellectual property.

Preamble — remember, at heart we’re a bunch of meteorologists. We built a reputation on quality ground level weather observations and we hold that sacred. It’s especially important to QC and curate public weather networks to ensure the quality of data across the board…which is why the CL system is so critical. Operating one of the world’s largest professional-grade networks for 20+ years…we learned a thing or two about sensor drift and the need for frequent calibration. A network is infinitely more useful if all nodes are in tune.

Short answer – The goal of CL is NOT a point-by-point matching of Smart Weather obs to the nearest reference sources. Rather, the calibrations are designed to account for a more general long-term and consistent offset from the trusted reference sources. We evaluate and QC a set of trusted reference data sources in close proximity to the user’s station over the course of day(s). Both the reference data and the Smart Weather data are passed through a series of smart algorithms that identify and remove any inconsistent or outlying data points to ensure that only the most accurate and representative data is used to generate the calibration. A regression analysis is run on the comparison data set, and if it meets a certain r-squared threshold then, and only then, a fit value including both slope and intercept is defined and applied. Data reported comes directly from the sensors and will reflect all the nuances of the sited environment unless otherwise noted.
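As a rough illustration of the steps in the short answer (outlier removal, regression, r-squared gate, slope and intercept), here is a minimal Python sketch. It is not the actual CL algorithm, which is proprietary. The median-absolute-deviation outlier rule, the 0.9 threshold, the function names, and the sample humidity values are all assumptions made for illustration.

```python
from statistics import mean, median

def drop_outliers(pairs, k=3.0):
    """Drop (station, reference) pairs whose difference is far from the
    median difference. The MAD rule and k=3 are assumptions, not CL's rule."""
    diffs = [s - r for s, r in pairs]
    med = median(diffs)
    mad = median(abs(d - med) for d in diffs)
    if mad == 0:
        # No spread to judge against; keep everything unchanged.
        return list(pairs)
    return [p for p, d in zip(pairs, diffs) if abs(d - med) <= k * mad]

def fit_calibration(pairs, r2_min=0.9):
    """Least-squares fit reference = slope * station + intercept.
    Returns (slope, intercept, r2), or None if r2 is below r2_min."""
    xs = [s for s, _ in pairs]
    ys = [r for _, r in pairs]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None  # degenerate data, no meaningful fit
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy)
    if r2 < r2_min:
        return None  # fit not good enough, so no calibration is applied
    return slope, intercept, r2

# Hypothetical humidity pairs (station %, reference %): the station reads
# about 5 points high, and the last pair is an obvious outlier.
pairs = [(55.2, 50), (60.1, 55), (64.8, 60), (70.3, 65),
         (75.0, 70), (79.9, 75), (95.0, 60)]
clean = drop_outliers(pairs)
fit = fit_calibration(clean)
print(len(clean), fit)
```

With these made-up numbers the outlier is discarded and the fit comes out with a slope near 1 and an intercept near -5, i.e. a general offset rather than a point-by-point match. When the data does not correlate well enough, `fit_calibration` returns `None` and nothing is applied.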

Longer explanation – The CL system takes a long period of past observations and analyzes them through a regression algorithm to identify and remove that general and consistent offset from the trusted reference sources. The values displayed are more accurate because of CL running and applying these occasional calibration checks. But once again, your station data is direct from your sensors. As an example, we do not (and should not) look at a pressure observation in real time and adjust it up or down based on what some reference source says at that same time. If there is some interesting mesoscale feature, say a sharp pressure bump due to a thunderstorm outflow boundary, the user will see that change in real time regardless of what the reference source says about it.
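To see why a fixed slope-and-intercept calibration leaves short-lived features visible, consider this small Python sketch. The pressure values, the fit parameters (slope 1.0, intercept -0.8 hPa), and the function name are hypothetical and chosen only for illustration.

```python
def apply_calibration(values, slope, intercept):
    """Apply one fixed linear calibration to every reading in a series."""
    return [slope * v + intercept for v in values]

# Hypothetical station pressure (hPa) with a sharp bump in the middle,
# such as a thunderstorm outflow boundary might produce.
raw = [1012.0, 1012.1, 1014.3, 1012.2, 1012.0]

# Assumed calibration: the station reads a steady 0.8 hPa high.
calibrated = apply_calibration(raw, slope=1.0, intercept=-0.8)

bump_raw = max(raw) - min(raw)
bump_calibrated = max(calibrated) - min(calibrated)
print(bump_raw, bump_calibrated)
```

Because the same correction is applied to every reading, the whole trace shifts but the bump keeps its size and timing. Only the consistent offset is removed.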

Caveat – That said, if a station is truly located in a spot that always has 5% higher humidity than the local background field due to siting, then yes, CL would catch that and remove the general 5% offset. While this could be a negative for some users, we believe it is a positive for the vast majority because it essentially means that CL is ‘fixing’ siting problems. But it is true that this makes these devices inappropriate for some use cases, like measuring humidity in a spot that you know will be consistently different from the local ambient/background humidity, or putting a bunch of AIRs around your property to analyze micro-scale humidity differences. To account for that, we do intend to offer users the ability to opt out of the CL process and apply their own calibration. Calibration is typically more complex than just a simple offset, so doing it manually would only be a good idea for true experts.

7 Likes

Superb explanation of CL @WFmarketing :vulcan_salute:

Thank you so much @WFmarketing for taking the time to provide such a detailed answer :+1:. As a scientist I would love to delve deeper into your algorithm, but fully understand that this is not possible!

I like the sound of the CL system being able to ‘correct’ for siting issues, as this is certainly the biggest issue I face in my small garden that is surrounded by other houses on all sides.

1 Like

That is a very interesting detailed response. I have a question: we have had a rather peculiar weather situation with relative humidity above 98 percent over about a week, perhaps with a drop to 92 to 95 percent in the early afternoon. Today, the RH has dropped to less than 80 percent on two occasions and I would guess it was at 85 percent for most of this morning and afternoon. These measurements were made on both Davis and Oregon stations.

Air seems to have ignored today’s drop and from this morning through to mid afternoon, it has displayed 99 percent RH (on the WD display, the temperature and humidity lines are practically coincident). On the graph showing air temperature on my Android phone, the temperature line is just the width of the dots higher than the dew point line.

It has struck me that the sensors in Air are almost totally enclosed except for a tiny hole in the side of the device. Could it be that the humidity has not been evacuated from the interior of Air, despite max air gust of 9.8 m/s (measured on Davis) and 5.8 m/s (measured on WF): respectively max average of 4.4 and 3.2 m/s? Or, is this a CL anomaly?

Hi @bne. Your station is located in Cyprus, where our CL system does not have enough comparison ground truth data to form a baseline in-situ calibration curve. Thus the CL system has not applied a tweak to your humidity yet. However, we doubt your high humidity is related to the CL system.

Instead, we suspect it is related to a known issue with prolonged high humidity exposure. Please see this thread: Humidity Appears Incorrect (solved)

1 Like

The AI must explain why humidity reporting from my new (as of 11/6) AIR is, in the last week, getting closer to what my three-year-old Davis Vantage Pro reports. Before Thanksgiving, in heavy fog, the Davis reported 97% while the AIR reported 78%. On 12/1, again in heavy fog, the Davis reported 96% while the AIR reported 89%.
However, rain is quite another issue. In the past three weeks, the skies have opened up and our parched California land has absorbed somewhere between 2.14" and 3.15" of rain as reported by the Davis and the SKY respectively. Here is the data:
[screenshot of the rain comparison data; image not available]

Clearly there are huge discrepancies on 11/28, 11/29, and 12/4. All three days had some pretty strong winds. The Davis and SKY are approximately 15 feet apart and both are in clear air.
Do you think the AI will help make the two agree better? BTW, I also have an old Rainwise rain gauge that is in need of calibration but which read 0.06" today exactly matching the Davis; it usually reads about 20% higher than the Davis.

Thanks for your reply. For your info, Cyprus is an independent island over 1000 km away from Greece. I mention this because our weather is very different from that of southern Greece because many of the fronts travel up the Aegean Sea towards the Bosporus, missing us altogether. Our climate is hotter and drier than that of Greece. That having been said, Cyprus is quite mountainous and has many microclimate regions. Where I am misses nearly all the major thunderstorms, which pass mainly to the north and south of where we are in the foothills, about 300 m up. Just this past few days, the major cities of Nicosia, Larnaca and Limassol have all had bad storm damage and flooding, where we have had practically nothing. Automatic calibration or learning based on the official weather stations in those cities would probably not be very successful.

I went through the thread that you recommended and got totally confused from the hundreds of posts and I missed out on those that seemed relevant to my problem. Today, the humidity has increased again and all three of my weather stations have the relative humidity constantly above 98 percent, with the wet bulb coincident to the temperature. I’m therefore unable to say whether there is any improvement. I’ll have to wait for better weather :slight_smile:

May I respectfully recommend that you consider limiting the length of threads?

Best regards,

Brian

Hi @bne Appreciate the follow-up! Cyprus sounds like a fantastically interesting place…would love to visit some day. The CL system will only evaluate trusted data sources within close proximity to your exact location. If there are no available data sources, then no corrections are approved / applied (as is the case with your station). For this case, we will be enabling a feature in the future that allows the customer to apply a manual calibration.
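The proximity rule described here can be pictured with a short Python sketch. This is only a guess at the general shape, not the actual CL logic: the 10 km radius, the function names, the source records, and the coordinates are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))

def nearby_sources(station, sources, max_km=10.0):
    """Keep only reference sources within max_km of the station.
    The max_km default is an assumed value, not a published CL setting."""
    lat, lon = station
    return [s for s in sources
            if haversine_km(lat, lon, s["lat"], s["lon"]) <= max_km]

# Hypothetical station and reference sources.
station = (35.00, 33.40)
sources = [
    {"name": "near", "lat": 35.01, "lon": 33.41},
    {"name": "far", "lat": 35.17, "lon": 33.36},
]
usable = nearby_sources(station, sources)
print([s["name"] for s in usable])

# With only the distant source available, nothing qualifies,
# so no calibration would be attempted.
print(nearby_sources(station, sources[1:]))
```

An empty result corresponds to the situation described above: with no trusted source close enough, no correction is approved or applied.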

Yes, some of the threads are quite onerous. Tricky to strike a balance between allowing open public discussion and cutting out a bunch of the crap. Many of the core topics are distilled in our official Help FAQs. We are also considering organizing topics into sub-categories…but right now we’re spending most of our time improving the station performance.

2 Likes

Correct. The CL system for humidity is evaluating data from a handful of stations near your location on a continuous basis and adjusting calibration.

Yes, the CL system for rain (not deployed yet) will bring your readings into better alignment. The haptic sensor in SKY is a very sensitive device capable of much higher resolution data than the average tipping bucket. However, the tuning of the sensor is dependent on the unique mounting situation for each unit. The CL system for rain (again, not yet deployed) evaluates reference data sources for precip and will tune the calibration curve for your individual device. This is a complex process as rain is a complex beast – sometimes light, sometimes heavy, and everywhere in between, influenced by wind, etc. For that reason, it’s not a simple % offset as you have demonstrated in your comparison data.

And, as I asked somewhere before, might it also depend on temperature? Has anyone investigated that dependency, either theoretically or experimentally?

1 Like

I know for a fact when temperature drops too low no precipitation is registered. :wink:

Hi @dsfg. This is where running our massive data set through machine learning algorithms will begin to identify relationships, including the effect of temperature on rain, and allow us to further refine our systems.

2 Likes

That’s what machine learning is made for. :slight_smile: good luck.

1 Like

…at least until the machine decides to lock @dsj out of the pod bay!!! :rofl:


3 Likes

Can you please define ‘trusted’ data sources? As I mentioned, there are no official sites within 18+ km and they have very different values to those of the microclimate zone where we live, including ~300 m altitude difference, no seashore, 1000+ m mountains in proximity, foehn effect, diurnal valley-directed sea breezes etc.

However, I have two other weather stations (Davis and Oregon) which agree with each other to a remarkable degree (no tweaking) and which seem very close to reality

As I write (Davis / Oregon):
Temp: 16.5 / 16.4 °C
Hum: 83 / 83%
Max gust: 5.1 N / 4.9 N m/s at 12:40 / 12:40
Gust last hour: 2.2 N / 2.9 NNE m/s at 14:27 / 14:27
Average: 1.8 ENE / 1.6 NE m/s at 12:54 / 12:47
Max temp: 17.5 / 17.7 °C at 12:48 / 12:45
Min temp: 10.4 / 10.3 °C at 06:17 / 00:00 (temp almost flat overnight)
No precipitation in last 24 h, but Davis slightly lower than Oregon as a rule

I’m confident that these two stations keep within close tolerances to real conditions. Could they be used for CL for WF?

WF by comparison: temp is good, hum is high (95%, more than 10% high), wind is haywire, and rain (when it happens) is very high.

1 Like

Can now add comparative rain:
Davis 2.6 mm
Oregon 3.8 mm
WF 5.7 mm

Comment: the Davis see-saw is corroded and slightly sluggish (awaiting replacement). The Oregon is new. Neither has been calibrated and I intend doing a calibration.
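For a sense of scale, the three rain totals above can be turned into percentage differences with a few lines of Python. The helper name is made up for illustration, and since none of the gauges has been calibrated, no reading is treated as the true value.

```python
# Rain totals from the comparison above, in mm.
rain_mm = {"Davis": 2.6, "Oregon": 3.8, "WF": 5.7}

def percent_above(value, baseline):
    """How far value sits above baseline, as a percentage of baseline."""
    return (value / baseline - 1.0) * 100.0

wf_vs_davis = percent_above(rain_mm["WF"], rain_mm["Davis"])
wf_vs_oregon = percent_above(rain_mm["WF"], rain_mm["Oregon"])
oregon_vs_davis = percent_above(rain_mm["Oregon"], rain_mm["Davis"])

print(f"WF vs Davis:     {wf_vs_davis:+.0f}%")
print(f"WF vs Oregon:    {wf_vs_oregon:+.0f}%")
print(f"Oregon vs Davis: {oregon_vs_davis:+.0f}%")
```

For this one event the WF total is roughly 119% above the Davis and 50% above the Oregon, while the Oregon is about 46% above the Davis, so the two tipping buckets also differ noticeably from each other.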

Yes, if you have a trusted source we may be able to use it as an input to the CL system. Is data from these two stations available online somewhere (e.g. Weather Underground)?

Thanks for the reply. I’ve been trying to put both stations on WU, but it shows ‘Not reporting’ for both. I haven’t been able to trace why, unless it is because it shows the location as Μοσφιλωτή, CY when I entered it as Mosfiloti.