Possible Bug in Timezones in python Hourly data #177
-
Hi y'all! I've been loving using the Python library for Meteostat. Recently, I've been looking at the tsun values from some locations in Germany and have become concerned that the Hourly Python endpoint returns incorrectly shifted values when a timezone is provided. This example will take a second to talk through, so please bear with me.

First, let's consider station 10505 near Bonn, Germany. When I retrieve hourly data for Bonn for May 29, 2021 and don't provide a timezone with the request, I get back the following data (trimmed to just time, tsun, and condition code, in CSV format):
Here we can see that the sky was mostly clear all day (condition code 1 or 2) and that first light was detected at about 5:17, since there were 43 sunny minutes in the 05:00 hour. Similarly, it appears that sundown was around 8:10 pm. If I check an external source, e.g. https://www.timeanddate.com/sun/germany/bonn?month=5&year=2021, these values seem plausibly correct. Official sunrise occurs at 5:25 am (Europe/Berlin), slightly after first light is reported in this data extraction, and official sunset is at 9:30 pm (Europe/Berlin), but it seems plausible that this station is shaded out before true sunset.

Here is the first clue that something is up: these results seem consistent assuming their timestamps are in the local timezone. If I request the same data from station 10505 but include timezone "Europe/Berlin" with the request, then I get back this data:
The timestamps in the result now include the timezone offset. However, sunup now appears to occur at 7:17 local time, almost 2 hours after true sunrise. The last light of the day is now also shown as ~40 minutes after sunset. It seems like adding the timezone parameter has caused all data to shift inaccurately by the timezone offset.

My current theory is that the raw datetimes are already in the local time zone (i.e. the timezone returned by the Station API) and are then incorrectly shifted under the assumption that they're in UTC. This code in meteostat/utilities/mutations.py, called from here, seems consistent with that theory.

Additional questions I haven't explored but would love to know the answer to:
Let me know what you think! If I can get an answer to question 2 above, I'm happy to contribute a PR or a review for changing the code in mutations.py. Or, if there's something obvious I'm missing, let me know that too :)
-Tristan
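For reference, the 2-hour shift described in this post can be reproduced with the standard library alone. This is only a minimal sketch of the two possible readings of a naive timestamp, not the code in mutations.py. The helper names are hypothetical, and the fixed UTC+2 offset stands in for Europe/Berlin on this date (an assumption made to avoid depending on a tz database).

```python
from datetime import datetime, timedelta, timezone

# Assumption: Europe/Berlin is UTC+2 (summer time) on 2021-05-29, so a fixed
# offset is used here instead of a tz database lookup.
BERLIN_SUMMER = timezone(timedelta(hours=2), "CEST")


def as_utc_then_convert(naive: datetime) -> datetime:
    """Reading A: the naive value is UTC, so express it in local time."""
    return naive.replace(tzinfo=timezone.utc).astimezone(BERLIN_SUMMER)


def as_local(naive: datetime) -> datetime:
    """Reading B: the naive value is already local wall-clock time."""
    return naive.replace(tzinfo=BERLIN_SUMMER)


# The first hourly row with sunshine in the post is stamped 05:00.
raw = datetime(2021, 5, 29, 5, 0)
converted = as_utc_then_convert(raw)
localized = as_local(raw)

print(converted.isoformat())  # 2021-05-29T07:00:00+02:00
print(localized.isoformat())  # 2021-05-29T05:00:00+02:00
```

Reading A gives the 07:00 row seen when a timezone is passed, while reading B leaves the row at 05:00. Which reading is correct depends on what the raw timestamps actually represent.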
Replies: 3 comments 1 reply
-
Hi Tristan, Thanks for reaching out! I just had a brief look, and it seems like something is off indeed. However, this doesn't affect data provided by DWD's open data interface (example: https://meteostat.net/en/station/10637?t=2021-05-29/2021-05-29). I assume it's an issue with DWD POI data. I'll look into this and get back to you. Thanks,
-
I did some further investigation and can confirm that the data you mentioned above is coming from DWD's POI dataset. But I'm not sure whether there is a time zone issue. Here is the POI data for Frankfurt Airport: And that's the data reported by DWD's quality-controlled hourly dataset: Here, everything seems alright. Now we need to figure out if something is wrong with the data for Bonn. DWD uses the UTC time zone for all datasets. It's important to remember that tsun always refers to the previous hour. So 2021-05-29 05:00:00,43.0,1.0 cannot be a CET timestamp if sunrise was at 5:17. I'd expect 43 minutes of sunshine to be reported at 6 CET if it was sunny throughout the hour. Looking at data provided by Kachelmannwetter, it seems plausible that …
-
Ah! OK, I think knowing that tsun refers to the previous/preceding hour explains most of my confusion here. First and most basic, I double-checked the data I'm getting back from the Hourly Python endpoint for Frankfurt (station 10637), and it's consistent with what you've posted here, i.e. if I don't provide a timezone with the request then I get back data showing...
and if I do provide a timezone then I see
So this is consistent with the data you link in Kachelmannwetter, and is reasonably close to a computed sunrise of 5:22 am CET. The data for Bonn shows a bit more lag between computed sunrise (5:26 CET for May 29, 2021) and the first observed tsun value. Thanks again for looking into this and sharing what you found. One final tangentially related question: you mentioned that …
I did some further investigation and can confirm that the data you mentioned above is coming from DWD's POI dataset.
But I'm not sure whether there is a time zone issue. Here is the POI data for Frankfurt Airport:
And that's the data reported by DWD's quality-controlled hourly dataset:
Here, everything seems alright.
Now we need to figure out if something is wrong with the data for Bonn. DWD uses the UTC time zone for all datasets. It's important to remember that tsun always refers to the previous hour. So 2021-05-29 05:00:00,43.0,1.0 cannot be a CET timestamp if sunrise was at 5:17. I'd expect 43 minutes of sunshine to be reported at 6 CET if it was sunny throughout the hour. Looking a…
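To make the "previous hour" reading concrete, here is a rough sketch of the arithmetic for the Bonn row stamped 05:00 with tsun = 43. It assumes the timestamp is UTC (as stated above for DWD datasets), that local time is UTC+2 on this date, and that the 43 sunny minutes ran continuously up to the end of the hour. The function name is hypothetical and not part of Meteostat.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time in Germany is UTC+2 on 2021-05-29.
LOCAL = timezone(timedelta(hours=2), "CEST")


def sunshine_window(stamp_utc: datetime, tsun_minutes: float):
    """Return (start, end, onset_if_continuous) in local time.

    tsun at a given timestamp covers the hour that ENDS at that timestamp.
    The onset is only valid if the sunshine was continuous until the end
    of that hour; with gaps, the first sunny minute would be earlier.
    """
    end = stamp_utc.replace(tzinfo=timezone.utc)
    start = end - timedelta(hours=1)
    onset = end - timedelta(minutes=tsun_minutes)
    return tuple(t.astimezone(LOCAL) for t in (start, end, onset))


start, end, onset = sunshine_window(datetime(2021, 5, 29, 5, 0), 43.0)
print(start.strftime("%H:%M"), end.strftime("%H:%M"), onset.strftime("%H:%M"))
# 06:00 07:00 06:17
```

Under these assumptions the row covers 06:00 to 07:00 local time, which puts the first recorded sunshine later than the computed sunrise rather than before it.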