
Updates to support Geocaching Home Assistant integration feature updates #23

Merged · 48 commits into Sholofly:dev · Dec 14, 2024

Conversation

@marc7s (Contributor) commented Dec 12, 2024

Introduction

This PR adds the features required for our updates to the HA Geocaching integration. These updates are:

  • Add devices with entities for Trackables and Caches (such as name and location)
  • Add list of tracked Trackables and Caches
  • Add nearby caches search

Following is an explanation of the features we have added to the Geocaching integration.

Feature overview

For each model, we create a new device that groups all relevant entities together, in the same way the current integration does with the profile entities. For each cache added in HA, we therefore add a device for that specific cache, with entities that hold its data. The same goes for each trackable. In total there are now three device types: Profile (1 created for the integration), Cache (1 created for each nearby cache and 1 for each tracked cache) and Trackable (1 created for each tracked trackable).
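As a rough sketch of this structure (the enum and its names are illustrative, not the library's actual code):

```python
from enum import Enum

class GeocachingDeviceType(Enum):
    """The three device types described above (illustrative names)."""
    PROFILE = "profile"      # 1 created for the integration
    CACHE = "cache"          # 1 per nearby cache and 1 per tracked cache
    TRACKABLE = "trackable"  # 1 per tracked trackable
```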

[screenshot]

Trackable device

Each trackable contains 7 entities. These are:

  • Current cache code (reference code of the cache it currently resides in, or Unknown)
  • Current cache name (cache name of the cache it currently resides in, or Unknown)
  • Name (the name of the trackable)
  • Owner (the display name of the user who owns the trackable)
  • Release date (the date when the trackable was released)
  • Traveled distance (the total distance it has traveled, in kilometers)
  • A device tracker entity for its current location, so that it can be used with the map card

In addition to these entities, we also store its trackable journey (from the trackable log, filtered to only the events where it moved) in the extra attributes, which can be parsed and extracted using Jinja. See an example of this below.
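As a minimal sketch of the idea (the attribute and field names here are assumptions, not the integration's actual schema), the movement-filtered journey could be serialized into the attributes roughly like this, after which a Jinja template can read it via state_attr():

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class JourneyStep:
    """One movement event from the trackable log (illustrative)."""
    date: datetime
    lat: float
    lon: float

def extra_state_attributes(journey: list[JourneyStep]) -> dict:
    """Serialize the movement log so Jinja templates can parse it."""
    return {
        "journey": [
            {"date": step.date.isoformat(), "lat": step.lat, "lon": step.lon}
            for step in journey
        ]
    }
```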

[screenshot]

Cache device

Each cache contains 7 entities. These are:

  • Favorite points (the number of favorite points the cache has)
  • Found (Yes/No, whether the authenticated user has found this specific cache)
  • Found date (the date the authenticated user found this cache, or Unknown if they have not)
  • Hidden date (the date when the cache was hidden)
  • Name (the name of the cache)
  • Owner (the display name of the user who owns the cache)
  • A device tracker entity for its current location, so that it can be used with the map card

[screenshot]

Tracked trackables and caches

During configuration, the user can select one or more caches and trackables to track. These will then be returned as part of the GeocachingStatus object with their updated information. For example, a user can track their own hidden caches and their own trackables (or any caches and trackables they are interested in) and display that information in Home Assistant.
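As a hypothetical illustration of such a configuration (the class and field names are assumptions, not the library's exact API):

```python
from dataclasses import dataclass, field

@dataclass
class TrackingSettings:
    """Illustrative container for the reference codes to track."""
    # Sets rather than lists, so duplicate reference codes are dropped
    tracked_cache_codes: set[str] = field(default_factory=set)
    tracked_trackable_codes: set[str] = field(default_factory=set)

settings = TrackingSettings(
    tracked_cache_codes={"GC1234", "GC5678"},
    tracked_trackable_codes={"TB1ABC"},
)
```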

Configuration:
[screenshot]

Displaying a Trackable:
[screenshot]

Displaying a Cache:
[screenshot]

Nearby caches

We use the home location of the HA instance as the position, and search for caches within a radius around it. These are considered "nearby caches". As with the tracked caches, we generate a cache device for each nearby cache. In the configuration step, the user can adjust the radius and the maximum number of caches to generate devices for. A known limitation of this feature is that the nearby cache devices are generated during configuration, and will therefore not update dynamically should a new cache be placed nearby. We looked into supporting this but were unable to within our time span.
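A minimal sketch of how the search parameters could be assembled (function and parameter names are assumptions, not the exact library API):

```python
def nearby_search_params(home_lat: float, home_lon: float,
                         radius_km: float, max_caches: int) -> dict:
    """Build the query for a radius search around the HA home location."""
    return {
        "lat": home_lat,
        "lon": home_lon,
        "radius_km": radius_km,
        # the search endpoint returns at most 100 caches
        "take": max(0, min(max_caches, 100)),
    }

print(nearby_search_params(57.7, 11.97, 10.0, 250))  # take is capped to 100
```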

Configuration:
[screenshot]

Displaying the nearby caches:
[screenshot]

Additions to profile device

For the profile device, which holds the integration's current feature set, we have added two sensors: the total number of nearby caches, and the total distance traveled by all tracked Trackables.

Setup

To test, we have made changes to the ha-core repository. This PR is used with the geo branch of our fork, available here. We did not manage to get it working using just the hass --skip-pip-packages geocachingapi command; it would continue to use the downloaded version from PyPI instead of our local version of this repository. We therefore made some temporary changes during development, which we will revert before opening a PR for the ha-core repository. This is our first time developing for Home Assistant, so if you know of a better way to solve this you can ignore these instructions, but be aware that our geo branch contains temporary changes used to set up our development environments.

We used Dev Containers in Visual Studio Code to develop our features. In order to get everything working, we created a bind mount for the dev container, and then we had to:

  1. Clone this repository to $HOME/Repos/geocachingapi-python, see the bind mount
  2. Clone the forked ha-core repository and switch to the geo branch
  3. Inside the HA dev container, press Ctrl + Shift + P and run Dev Containers: Rebuild Container Without Cache
  4. In the HA dev container terminal, run pip3 install -e config/custom_libraries/geocachingapi
  5. In the HA dev container terminal, run hass --skip-pip-packages geocachingapi
  6. Set up the HA instance
  7. Add the Geocaching integration

The cards we used to display the information in the previous images are available here.

Thank you for the previous work on this library and HA integration. Please let us know if you have any questions or comments. If this PR gets merged, we will try to merge the forked ha-core geo branch afterwards, after reverting our temporary development changes. Merging this PR would of course warrant a new version, so let us know which version it should be before merging.

Credits

The feature updates to the Geocaching integration and the updates to this repository have been developed by:

marc7s and others added 30 commits (November 3, 2024):

  • …ot been able to test that it works yet, /Per&Albin
  • Fix trackable parsing and add missing fields
@reinder83 (Collaborator) commented:

Haven't checked it yet but first of all, what an incredible addition!

@Sholofly (Owner) commented:

Simply lovely! Thanks! This is exactly what I hoped for when I started this package. Very happy with it!
I will take a closer look at the code in the coming days.

One question I have had in mind since the beginning:
The HA integration is using one partner consumer key (managed by Nabu Casa) for the API connection. The API documentation has the following rate limits:

[screenshot of the documented rate limits]

Given the fact that 350 users are using the integration, and a maximum of 1 200 API calls per minute is allowed, how do we make sure not to hit that rate limit, which would make the integration unusable for all users for a minute, or even longer if we hit the limit too often?

And if we do hit the limit, how will the integration respond?

It's something that we probably should have built in from the beginning but with just some generic requests the need wasn't there.

What are your views on this? We can try to contact Groundspeak to stretch the limits, but I doubt they can do that. But maybe the influence of the largest open source community in the world can convince them to move ;)

Can you make a rough estimate of the expected calls per minute? It would help to get a better view of the (possible) problem and how to overcome it.

@reinder83 (Collaborator) commented:

In addition to @Sholofly's comment, there are also limitations on basic vs. premium members: non-premium members have more restrictions on what they can see or request, so there should be built-in checks for that. More information on this:

https://api.groundspeak.com/documentation#restrictions

This is one of the reasons we kept the initial implementation very basic: we didn't have checks for that yet.

@marc7s (Contributor, author) commented Dec 13, 2024

Thank you both for your comments.

As a note:

  • For caches, we only fetch lite caches, which are limited to 10 000 per basic user per day, rather than the 3 per day allowed for full caches. So there should not be any issues here
  • For trackables, I am not sure what the limits are. There is no section on trackable limits as there is for caches; there is only a section on trackable discoveries, which should be unrelated since we are only reading data, not discovering any trackables
  • For verifying the settings, we only fetch the referenceCode field, meaning these calls do not count towards the user's limit (although rate limiting should still apply)

Here is a breakdown of the API requests:
Each time the integration updates (when _async_update_data is called), we call the update() function on the Geocaching instance to retrieve a new GeocachingStatus.

Retrieving a new status initiates the following requests:

  1. Update the user (1 API call to the /users/me endpoint)
  2. Update tracked trackables, if enabled (1 API call to the /trackables endpoint)
    2.1: For each tracked trackable, update its journey data (TT API calls to the /trackables/___/trackablelogs endpoint, where TT is the number of tracked trackables)
  3. Update the tracked caches, if enabled (1 API call to the /geocaches endpoint)
  4. Update the nearby caches, if enabled (1 API call to the /geocaches/search endpoint)

So in total, each new status yields 4 + TT API calls, with a minimum of 1 call (tracked trackables, tracked caches and nearby caches all disabled). We can expect these calls to occur within the same minute, as they are made in succession when the update is triggered. TT is tied to each integration instance and that user's configuration, with a minimum of 1 and currently no explicit maximum. However, since pagination is not implemented, the API call itself imposes a limit: we do not supply the take parameter, so the API's default of 10 is used, giving TT an effective upper limit of 10. We should probably still add handling in the code that limits the number of tracked trackables and caches.

The minimum number of requests per status update is therefore 1, and the maximum is 4 + max(TT) = 14.
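To make the breakdown concrete, here is a small sketch of the per-update call count (a plain model of the list above, not library code):

```python
def calls_per_update(tracked_trackables: int, trackables_enabled: bool,
                     caches_enabled: bool, nearby_enabled: bool) -> int:
    """Total API calls per status update, following the breakdown above."""
    calls = 1  # 1: /users/me
    if trackables_enabled:
        calls += 1 + tracked_trackables  # 2: /trackables, 2.1: one log call each
    if caches_enabled:
        calls += 1  # 3: /geocaches
    if nearby_enabled:
        calls += 1  # 4: /geocaches/search
    return calls

assert calls_per_update(0, False, False, False) == 1  # minimum
assert calls_per_update(10, True, True, True) == 14   # maximum, TT capped at 10
```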

To expand on the API-imposed limits:

  1. 50 tracked caches (imposed by max number of reference codes allowed by /geocaches endpoint)
  2. 10 tracked trackables (imposed by the default take parameter of the /trackables endpoint)
  3. 100 nearby caches (imposed by the take parameter of the /geocaches/search endpoint, but also limited in code to a value between 0-100 currently)

Some quick estimates therefore yield the following (with the current update interval of one update per hour).

With the three rate limits as:

  • RL1: 60 calls per minute per user per method
  • RL2: 1 200 calls per minute per partner consumer key
  • RL3: 6 000 calls per minute per IP address

And assuming each integration is under a unique IP address, we get:

Minimum configuration (tracked caches, tracked trackables and nearby caches all disabled):

| Requests per user per hour | Subject to RL1 | Subject to RL2 | Subject to RL3 |
| --- | --- | --- | --- |
| 1 | No | Worst case: 1 200 users. Best case: 72 000 users | No |

Maximum configuration (50 tracked caches, 10 tracked trackables, 100 nearby caches):

| Requests per user per hour | Subject to RL1 | Subject to RL2 | Subject to RL3 |
| --- | --- | --- | --- |
| 14 | No | Worst case: 85 users. Best case: 5 142 users | No |

Notes about the calculations:

  1. RL1 would only be relevant if the calls per minute per user per method exceeded 60. A single update makes at most 14 calls, of which at most 10 go to the same endpoint, so it is not relevant here.
  2. RL3 follows the same reasoning, except the threshold is 6 000 calls per minute, and we assume each integration is bound to a unique IP address.
  3. The worst case is if all Geocaching integrations happen to update during the same minute, which is highly unlikely. I have not verified this, but my guess is that the update interval starts from the time the integration was configured, so the worst case assumes all integrations were individually configured during the same minute of the day.
  4. The best case is if all Geocaching integrations are distributed evenly among the minutes of the hourly update cycle (24 * 60 = 1 440 minutes per day, recurring every 60 minutes), placing the API calls optimally, again highly unlikely. This gives a best case user count of 60 * 1 200 / requests per user per update.
  5. The truth will of course lie somewhere between the best and worst case, but the important part is that if this integration scales massively and we are only allowed a single partner consumer key, we could approach the best case by programmatically scheduling the API calls optimally among all users. That would be a very large task, requiring all integration instances to somehow coordinate which minute each one updates in, but at least there is room to scale.
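For reference, a quick sketch of these estimates (assuming hourly updates, with all of a user's calls landing in a single minute):

```python
RL2_CALLS_PER_MINUTE = 1_200  # shared partner consumer key limit
MINUTES_PER_CYCLE = 60        # updates recur every hour

def worst_case_users(calls_per_update: int) -> int:
    """All integrations update during the same minute."""
    return RL2_CALLS_PER_MINUTE // calls_per_update

def best_case_users(calls_per_update: int) -> int:
    """Updates staggered evenly across the 60 minutes of the hour."""
    return MINUTES_PER_CYCLE * RL2_CALLS_PER_MINUTE // calls_per_update

print(worst_case_users(1), best_case_users(1))    # 1200 72000
print(worst_case_users(14), best_case_users(14))  # 85 5142
```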

All in all, I do not think it will be an issue at the moment, but I do think it would be a good idea to also add some limitations in the code for all of the API calls. I also think updating the data every hour is not really necessary for this integration. We could for example update every other hour, choosing the hours based on a coin flip per instance (tails is hours 1, 3, 5, 7…; heads is hours 2, 4, 6, 8…), which would halve the API calls and, importantly, the rates, allowing for double the users.
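A sketch of the coin-flip idea (illustrative only):

```python
import random

# The "coin flip", done once when the integration instance is configured:
parity = random.randint(0, 1)
update_hours = [hour for hour in range(24) if hour % 2 == parity]
print(update_hours)  # e.g. [0, 2, 4, ...] or [1, 3, 5, ...]
```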

These are of course my own interpretations of the restrictions; I may have misinterpreted them, so it would be good to double check that, and that my math looks reasonable. I have not spent much time digging into the restrictions, so I may also have missed something.

TLDR: We do not currently enforce any limits on the number of tracked caches or tracked trackables, but could easily do so. For nearby caches we do enforce limits in the code, currently allowing the entire span (0-100 caches). All three are affected by the API-imposed limitations described above. There are ways of enforcing limits or scheduling updates to let the integration scale further under a single partner key, which could be implemented with different levels of ambition.

@marc7s (Contributor, author) commented Dec 13, 2024

> In addition to @Sholofly's comment, there are also limitations on basic vs. premium members: non-premium members have more restrictions on what they can see or request, so there should be built-in checks for that. More information on this:
>
> https://api.groundspeak.com/documentation#restrictions
>
> This is one of the reasons we kept the initial implementation very basic: we didn't have checks for that yet.

If I understand these restrictions correctly, they should not be an issue for us, as we only fetch lite geocaches, with a limit of 10 000 per basic user per day. That is far above what is currently possible to reach (150 * 24 = 3 600 per day), with most of those being the nearby caches, which we can easily reduce from the current maximum of 100. The 100 nearby caches are fetched with a single API call, but as I understand the rules, fetching data about 100 caches in one call counts as 100 towards the daily 10 000. Nevertheless, premium memberships should not be needed for this use case, if I have understood the restrictions correctly.
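A back-of-envelope check of that daily quota in the maximum configuration:

```python
caches_per_update = 50 + 100  # tracked caches + nearby caches
updates_per_day = 24          # one update per hour
daily_lite_reads = caches_per_update * updates_per_day
print(daily_lite_reads)       # 3600, well below the 10 000/day basic limit
```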

@Sholofly (Owner) previously approved these changes Dec 13, 2024, leaving a comment:


I've read everything twice and couldn't find anything to correct. I must say I am not a Python developer, so that doesn't say much coming from me. You should probably ask a native Python programmer to review it too, if you haven't done that already.

@Sholofly (Owner) commented:

Brilliant change! Please let me know if you are planning to have this reviewed by a more experienced Python developer. Otherwise I will complete the PR.

I will create a new release, 0.3.0, which will automatically publish the package to PyPI.

Commit: … handling: Raise error if too many codes were configured in settings. Automatically remove duplicate codes
@marc7s (Contributor, author) commented Dec 14, 2024

@Sholofly I made some additions to this PR:

  • Handle limits in settings (raise an error if too many reference codes are passed in during configuration)
  • Handle limits during API calls (limit applicable API calls with the take parameter, even though the settings handling should already have caught this)
  • Automatically remove duplicate reference codes (changed from list to set)
  • Separated the min(max(… construct into a clamp function to make it more readable
  • Put the limits in a limits.py file where you can configure them

These changes were made to address my previous comments, and make it easy to change the limits down the line. There are now two imposed limits in the code:

  1. For the settings: raising an error if you try to configure the API with values above the limits
  2. For the API: setting the take parameter where possible

The limits are configurable in limits.py, so you can easily change them later, for example lowering the limits to allow for more users.
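A rough sketch of what such a limits.py could look like (the constant names and the clamp signature are assumptions based on the description above, not the exact file contents):

```python
# limits.py (illustrative)
MAX_TRACKED_CACHES = 50      # /geocaches accepts at most 50 reference codes
MAX_TRACKED_TRACKABLES = 10  # matches the take limit used for /trackables
MAX_NEARBY_CACHES = 100      # maximum take for /geocaches/search

def clamp(value: int, minimum: int, maximum: int) -> int:
    """The separated min(max(...)) construct, for readability."""
    return min(max(value, minimum), maximum)
```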

Regarding the review: we have previous experience in Python, even though it is not our main language. However, I am confident enough in these changes that I do not think they need a further review. We have tried to add comments and documentation where necessary to make the code more maintainable, and to not overuse Python-specific syntax, keeping it accessible for non-Python natives. I think most of the reviewing will take place on the HA side of things, so these changes may get revisited as part of that PR, which we will try to publish in the near future. So from our end, bumping the version number and merging is fine!

@Sholofly (Owner) left a comment:


Good addition!

@Sholofly merged commit 0aebbd4 into Sholofly:dev on Dec 14, 2024. 1 check passed.