Bug: Stations reporting in the future #495

Open
aschl opened this issue Jan 9, 2025 · 4 comments

Comments

@aschl

aschl commented Jan 9, 2025

Meteostat's bulk data interface (https://bulk.meteostat.net/v2/) returns data for some stations with timestamps in the future.
What is the reason for this?
[image]
The data was downloaded on Jan 6, 2025. Some stations report weather about a week later, up to Jan 15, 2025 (which is in the future relative to the download date).

@clampr
Member

clampr commented Jan 11, 2025

Hi @aschl,

This is expected behavior. Meteostat includes weather forecast models like DWD's MOSMIX in its time series. If you want actual observations only, please use the model parameter.

@aschl
Author

aschl commented Jan 13, 2025

Thanks for your reply @clampr. I went through my code to double-check, and I am not sure this is the case.
I downloaded the data in bulk using the following URL: https://bulk.meteostat.net/v2/hourly/{station_id}.csv.gz
I would expect this file to contain no model data, or am I misunderstanding something here?
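
For anyone who wants to reproduce the check, here is a minimal sketch of how rows later than the download date could be picked out of one of these files. The column order and the absence of a header row are assumptions inferred from the rows quoted in this thread, and `parse_hourly`, `rows_after`, and `fetch_hourly` are hypothetical helper names, not part of any Meteostat library.

```python
import csv
import gzip
import io
from datetime import datetime
from urllib.request import urlopen

BULK_HOURLY_URL = "https://bulk.meteostat.net/v2/hourly/{station_id}.csv.gz"

# Assumed column order; the files are assumed to have no header row.
COLUMNS = ["date", "hour", "temp", "dwpt", "rhum", "prcp", "snow",
           "wdir", "wspd", "wpgt", "pres", "tsun", "coco"]


def parse_hourly(csv_text):
    """Parse header-less hourly CSV text into dicts with an added 'time' key."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:
            continue  # skip blank lines
        row = dict(zip(COLUMNS, record))
        row["time"] = datetime.strptime(
            f"{row['date']} {row['hour']}", "%Y-%m-%d %H"
        )
        rows.append(row)
    return rows


def rows_after(rows, cutoff):
    """Return the rows whose timestamp lies strictly after the cutoff."""
    return [row for row in rows if row["time"] > cutoff]


def fetch_hourly(station_id):
    """Download and parse one station's hourly file (needs network access)."""
    url = BULK_HOURLY_URL.format(station_id=station_id)
    with urlopen(url) as response:
        text = gzip.decompress(response.read()).decode("utf-8")
    return parse_hourly(text)
```

Calling `rows_after(fetch_hourly(station_id), datetime(2025, 1, 6))` with a real station ID would then list the rows dated after the download day.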

@aschl
Author

aschl commented Jan 13, 2025

https://bulk.meteostat.net/v2/stations/full.json.gz contains the metadata for each station. An entry looks something like this:

...
{
    "id": "00FAY",
    "name": {
        "en": "Holden Agdm"
    },
    "country": "CA",
    "region": "AB",
    "identifiers": {
        "national": "32395",
        "wmo": "71227",
        "icao": "CXHD"
    },
    "location": {
        "latitude": 53.19,
        "longitude": -112.25,
        "elevation": 688
    },
    "timezone": "America/Edmonton",
    "inventory": {
        "model": {
            "start": "2021-07-13",
            "end": "2025-01-21"
        },
        "hourly": {
            "start": "2020-01-01",
            "end": "2024-12-07"
        },
        "daily": {
            "start": "2002-11-01",
            "end": "2024-03-13"
        },
        "monthly": {
            "start": 2003,
            "end": 2022
        },
        "normals": {
            "start": 1991,
            "end": 2020
        }
    }
},
...

Shouldn't https://bulk.meteostat.net/v2/hourly/{station_id}.csv.gz contain only the hourly data from "2020-01-01" to "2024-12-07" in this case?
Instead, the CSV seems to contain the model data as well:

[...]
2025-01-20,00,-24.5,-27.0,80,,,168,18.0,,1038.4,,
2025-01-20,06,-28.0,-30.4,80,,,180,11.9,,1035.6,,
2025-01-20,12,-29.6,-31.7,82,,,321,12.6,,1035.0,,

The CSV file does not flag which entries are modelled and which are actual measurements. Would it make sense to separate the hourly and model data clearly, so that https://bulk.meteostat.net/v2/hourly/{station_id}.csv.gz contains only the station's hourly observations and the model data is available via https://bulk.meteostat.net/v2/model/{station_id}.csv.gz?
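
The comparison behind this question can be sketched roughly as follows: read the "hourly" end date from a station's inventory and list the row dates that fall after it. The helper names are hypothetical, and the station dict is cut down to the inventory values quoted above. This is only an indication, since it relies on the inventory being up to date.

```python
from datetime import date


def inventory_end(station, dataset):
    """Return the inventory end date for a dataset such as 'hourly' or 'model'."""
    return date.fromisoformat(station["inventory"][dataset]["end"])


def beyond_hourly_inventory(station, row_dates):
    """Return the ISO date strings that lie after the 'hourly' inventory end."""
    hourly_end = inventory_end(station, "hourly")
    return [d for d in row_dates if date.fromisoformat(d) > hourly_end]


# Station entry reduced to the inventory ranges shown in the excerpt above.
station = {
    "id": "00FAY",
    "inventory": {
        "model": {"start": "2021-07-13", "end": "2025-01-21"},
        "hourly": {"start": "2020-01-01", "end": "2024-12-07"},
    },
}

suspect = beyond_hourly_inventory(station, ["2024-12-07", "2025-01-20"])
```

Here `suspect` contains only the dates later than the hourly inventory end, which are the rows I would have expected to be missing from the hourly file.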

@clampr
Member

clampr commented Jan 18, 2025

Hi @aschl,

Inventory data is updated with a delay. If you want to differentiate between the data sources, please use source maps.
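
As a rough illustration of how a source map could be used to separate observations from model data, here is a sketch under stated assumptions. It assumes the source map is a header-less CSV keyed by the same date and hour columns as the data file, with one source flag per parameter in place of each value. That layout, the `split_by_source` helper, and the `model_flag` value are all hypothetical and should be checked against the Meteostat bulk data documentation.

```python
import csv
import io

KEY_COLUMNS = 2  # assumed: date and hour lead every row in both files


def read_keyed(csv_text):
    """Map (date, hour) to the remaining cells of each header-less CSV row."""
    keyed = {}
    for record in csv.reader(io.StringIO(csv_text)):
        if record:
            keyed[tuple(record[:KEY_COLUMNS])] = record[KEY_COLUMNS:]
    return keyed


def split_by_source(data_text, map_text, model_flag):
    """Split data rows into (observed, modelled) using the source-map flags.

    A row counts as modelled if any of its flags equals model_flag.
    Rows without a source-map entry are treated as observed.
    """
    data = read_keyed(data_text)
    flags = read_keyed(map_text)
    observed, modelled = [], []
    for key, values in data.items():
        row_flags = flags.get(key, [])
        target = modelled if model_flag in row_flags else observed
        target.append((key, values))
    return observed, modelled
```

Treating a row as modelled when any single parameter comes from a model is the strict choice. A per-parameter split would keep more observed values but needs the column order of both files to match.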
