Include Electricity Maps open data in library #244

Open
fershad opened this issue Dec 1, 2024 · 8 comments
Labels
designing Specific outcomes to address, but that we’re not committing to roadmap

@fershad
Contributor

fershad commented Dec 1, 2024

Is your feature request related to a problem? Please describe.
Currently, there is only one source of annual average grid intensity data available in this library. However, we know that Electricity Maps also publishes their data on an annual basis and that it is available under a very open, very permissive license.

Describe the solution you'd like
@mrchrisadams has demonstrated how this data can be collected, aggregated, and displayed in this Observable Notebook. There, he uses the hourly data provided by Electricity Maps. However, that would likely be too much to ship with CO2.js (keeping package size in mind). Instead, we could look to use the annual (yearly) data they provide.

I am unsure of an easy way to get all the data from the Electricity Maps portal, besides going through each country and downloading it manually before combining it. Very open to hearing other ideas that could help automate this.

Additional context
This would be handy to have in CO2.js, and would be useful for the Grid-aware Websites library as well.

@fershad fershad added this to the 0.17 milestone Dec 1, 2024
@fershad fershad added roadmap designing Specific outcomes to address, but that we’re not committing to labels Dec 1, 2024
@fershad
Contributor Author

fershad commented Dec 1, 2024

An important thing to note here is that the default functionality should not change. Someone doing import { averageIntensities } from "@tgwf/co2" should still receive the current average intensity data available.

What this change might open up is the chance for us to export data separately, allowing users to do something like this:

import { electricityMapsAnnual } from "@tgwf/co2/data"
import { averageIntensityAnnual } from "@tgwf/co2/data" // This would be the current average intensity data that's available
import { marginalIntensityAnnual } from "@tgwf/co2/data" // This would be the current marginal intensity data that's available

@mrchrisadams
Member

mrchrisadams commented Dec 1, 2024

oh hey @fershad -

I am unsure of an easy way to get all the data from the Electricity Maps portal, besides going through each country and downloading it manually before combining it. Very open to hearing other ideas that could help automate this.

We have a script for this from the first time I did this, and it's in the data-analysis repo for the platform. I've sent you a direct link to that code.

I've also attached the notebook in HTML form, which includes the code used and commentary on creating a parquet file containing every hour of usage for every region in a given year range.

The size of the parquet file is about 14 MB for 2023, and we could use a much smaller subset of it, with only one reading for each region instead of 8760 of them (!), to make a nice rich dataset of annual figures to compare against if they don't already exist.
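As a rough illustration of collapsing 8760 hourly readings into one figure per region, here is a minimal sketch. It assumes each hourly row has already been loaded into a plain object; the `zone` and `intensity` field names and the `annualMeanByZone` function are hypothetical, and the unweighted mean is an assumption that may not match how Electricity Maps derives its own published yearly figures.

```javascript
// Sketch only: collapse hourly readings into one annual mean per zone.
// `zone` and `intensity` are hypothetical field names, not the real column names.
function annualMeanByZone(hourlyRows) {
  const totals = new Map();
  for (const { zone, intensity } of hourlyRows) {
    const entry = totals.get(zone) ?? { sum: 0, count: 0 };
    entry.sum += Number(intensity);
    entry.count += 1;
    totals.set(zone, entry);
  }

  // Unweighted arithmetic mean of the readings seen for each zone.
  const result = {};
  for (const [zone, { sum, count }] of totals) {
    result[zone] = sum / count;
  }
  return result;
}

// Tiny made-up sample, three readings instead of a full year.
const hourlySample = [
  { zone: "AU", intensity: 400 },
  { zone: "AU", intensity: 500 },
  { zone: "SE", intensity: 20 },
];
console.log(annualMeanByZone(hourlySample));
```

If the yearly files from the data portal are used directly, as proposed earlier in this thread, this aggregation step would not be needed at all.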

Electricity Maps were pretty good about publishing open data last year, so I think it's plausible we could do the same thing for 2024 as soon as it's ready.

fetch-emaps-open-data.html.zip

@mrchrisadams
Member

In response to PR #247 from @thibaudcolas, I came back to this repo, and realised I hadn't linked to the Electricity Maps data portal. It's below:

https://www.electricitymaps.com/data-portal

As soon as we see the new open data published there, I think we could generate the new data for CO2.js. That would likely be the logical time to revisit any timestamps for other data we include in the package, as mentioned in #247.

@fershad
Contributor Author

fershad commented Feb 10, 2025

Initially, we would look to add only the yearly intensity data from Electricity Maps for whatever zones are available. A quick test that adds all the available yearly data for every region increases the size of the library a fair bit, even after the generated data files have been minified.

Before new data

npm notice package size: 50.1 kB
npm notice unpacked size: 256.9 kB
npm notice total files: 55

After new data

npm notice package size: 72.5 kB
npm notice unpacked size: 446.4 kB
npm notice total files: 58

The new Electricity Maps files contain data for 160 regions. The JSON data is structured like this:

{
  "Datetime (UTC)": "2024-01-01 00:00:00",
  "Country": "Australia",
  "Zone Name": "Australia",
  "Zone Id": "AU",
  "Carbon Intensity gCO₂eq/kWh (direct)": "435.46",
  "Carbon Intensity gCO₂eq/kWh (LCA)": "488.75",
  "Low Carbon Percentage": "38.76",
  "Renewable Percentage": "38.76",
  "Data Source": "opennem.org.au; Electricity Maps Estimation"
}

Looking at the structure of the data, we can probably remove or consolidate some of the data to bring the overall size down.

  • The following properties can be removed:
    • "Datetime (UTC)"
    • "Country" - we don't have the country name in the currently exported data within CO2.js either
    • "Zone Name" - we don't have the zone name in the currently exported data within CO2.js either
    • "Data Source"
  • Rename "Zone Id" to zone. Its value would also become the key for the object.
  • We would also rename most of the remaining fields.
    We would then have an object that looks like this:
"AU": {
  "gridIntensityDirect": "435.46",
  "gridIntensityLCA": "488.75",
  "lowCarbonPercentage": "38.76",
  "renewablePercentage": "38.76"
}
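The reshaping described above could be sketched roughly as follows. This is an illustration under assumptions, not the script used for the test: `reshapeYearlyRecords` is a hypothetical name, the input is assumed to be an array of records shaped like the sample above, and values are left as strings to match the proposed output.

```javascript
// Sketch only: reshape raw yearly records into an object keyed by zone ID.
// Source field names are taken from the sample record in this thread;
// `reshapeYearlyRecords` is a hypothetical helper name.
function reshapeYearlyRecords(records) {
  const byZone = {};
  for (const record of records) {
    // "Datetime (UTC)", "Country", "Zone Name" and "Data Source" are dropped.
    byZone[record["Zone Id"]] = {
      gridIntensityDirect: record["Carbon Intensity gCO₂eq/kWh (direct)"],
      gridIntensityLCA: record["Carbon Intensity gCO₂eq/kWh (LCA)"],
      lowCarbonPercentage: record["Low Carbon Percentage"],
      renewablePercentage: record["Renewable Percentage"],
    };
  }
  return byZone;
}

const rawYearlyRecords = [
  {
    "Datetime (UTC)": "2024-01-01 00:00:00",
    "Country": "Australia",
    "Zone Name": "Australia",
    "Zone Id": "AU",
    "Carbon Intensity gCO₂eq/kWh (direct)": "435.46",
    "Carbon Intensity gCO₂eq/kWh (LCA)": "488.75",
    "Low Carbon Percentage": "38.76",
    "Renewable Percentage": "38.76",
    "Data Source": "opennem.org.au; Electricity Maps Estimation",
  },
];
console.log(JSON.stringify(reshapeYearlyRecords(rawYearlyRecords), null, 2));
```

Converting the string values to numbers at this step would likely shave a few more bytes per entry, though that goes beyond what is proposed here.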

@fershad
Contributor Author

fershad commented Feb 10, 2025

Making the change above brings the size of the library down a decent amount.

npm notice package size: 59.8 kB
npm notice unpacked size: 339.3 kB
npm notice total files: 58

That undoes some of the work that went into #121, but that's a side effect of expanding what's in the library, I suppose.
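To put the three sets of `npm notice` figures in this thread side by side, here is a small sketch that computes growth relative to the current package. The sizes are copied from the output above; the `growthPercent` helper and the variant names are made up for illustration.

```javascript
// Package sizes in kB, copied from the `npm notice` output in this thread.
const sizes = {
  before: { packed: 50.1, unpacked: 256.9 },
  allFields: { packed: 72.5, unpacked: 446.4 },
  trimmed: { packed: 59.8, unpacked: 339.3 },
};

// Percentage growth relative to a baseline, rounded to one decimal place.
function growthPercent(baseline, value) {
  return Number((((value - baseline) / baseline) * 100).toFixed(1));
}

for (const variant of ["allFields", "trimmed"]) {
  const packed = growthPercent(sizes.before.packed, sizes[variant].packed);
  const unpacked = growthPercent(sizes.before.unpacked, sizes[variant].unpacked);
  console.log(`${variant}: packed +${packed}%, unpacked +${unpacked}%`);
}
```

By this calculation the packed size grows by about 44.7% with every field included and by about 19.4% with the trimmed structure, for the 2024 data alone.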

@fershad
Contributor Author

fershad commented Feb 11, 2025

A note that all the comparisons above include only the 2024 data in CO2.js. We would look to bring in 2022 & 2023 data as well, which would obviously lead to a larger package size.

Another thing to think about is how we attribute the data, and how anyone who uses this data through CO2.js should attribute it. Electricity Maps has guidance on their website about citations: https://portal.electricitymaps.com/datasets.

  • What would we need to include in our Readme and Developer Docs to cover this off? We currently have details of other third-party data sources listed in the licenses section, would that be enough?
  • Also, would anyone using the Electricity Maps data through CO2.js need to cite it in any particular way? What (if any) guidance should we give for that?

@tonypls @madsnedergaard any guidance would be appreciated

@madsnedergaard

Hey @fershad, excited to see you found the new 2024 data!
And thanks for the ping, we'll discuss it and come back tomorrow with some guidance on how to attribute usage :)

@tonypls

tonypls commented Feb 12, 2025

Hi @fershad, it's great that you're interested in integrating our data!

What would we need to include in our Readme and Developer Docs to cover this off? We currently have details of other third-party data sources listed in the licenses section, would that be enough?

The data is provided under the ODbL license and will need to be used in accordance with it.
I suggest we:

  • Add Electricity Maps to the license section of CO2.js including a link to ODbL and our datasets page.
  • Include license information and links to our datasets page in the CO2.js README in the same way other data sources are mentioned
  • Add the same information to the CO2.js docs on data

Also, would anyone using the Electricity Maps data through CO2.js need to cite it in any particular way? What (if any) guidance should we give for that?

License info and citation guidance can be found on our newly updated datasets page.

Here's a draft of license and attribution guidance:

Electricity Maps Grid Intensity Data
-----------------------------------------------------------------

The grid intensity data for 2021-2024 is republished from Electricity Maps under the Open Database License (ODbL). Users of this data through CO2.js must:
- Attribute: Credit Electricity Maps as the source
- Share-Alike: Keep derivative works under the same license
- Keep open: Provide unrestricted versions if using DRM

For full details on the dataset, attribution requirements and citation guidance, see:
https://portal.electricitymaps.com/datasets

4 participants