Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restarting client leads to exponential(?) growth of subscriptions #164

Open
GrimmiMeloni opened this issue Feb 19, 2025 · 4 comments
Open

Comments

@GrimmiMeloni
Copy link

GrimmiMeloni commented Feb 19, 2025

We are observing the following behavior.

Scenario:
Client is connected to a tibber pulse and has a single subscription.

What happens:

  • Subscribing works as expected.
  • In case of connection errors, retries done by the graphql client recreate the subscription with a new id, the old id is gone.

Log example

...
[pulse ] TRACE 2025/02/17 14:12:49 {"id":"9eed0f34-0e98-4949-b1fd-ecba8314ee66","type":"next","payload":{"data":{"liveMeasurement":{"power":0,"powerProduction":483.59,"lastMeterConsumption":17492.18065271,"lastMeterProduction":535.57547554,"currentL1":null,"currentL2":null,"currentL3":null}}}} server
[pulse ] TRACE 2025/02/17 14:12:59 failed to read JSON message: failed to get reader: context deadline exceeded server
[pulse ] TRACE 2025/02/17 14:12:59 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:12:59 {"type":"connection_ack"} server
[pulse ] TRACE 2025/02/17 14:12:59 {"id":"bd4e24bf-9db1-43e6-8db0-bf2c68804630","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:13:04 {"id":"bd4e24bf-9db1-43e6-8db0-bf2c68804630","type":"next","payload":{"data":{"liveMeasurement":{"power":0,"powerProduction":473.48,"lastMeterConsumption":17492.18065271,"lastMeterProduction":535.5775975,"currentL1":null,"currentL2":null,"currentL3":null}}}} server
...

So far so good. Now next:

  • During one of the above mentioned reconnects the server fails to acknowledge the subscription. Similar log scenario as above, but this time the subscribing fails and the client exits gracefully.
[pulse ] TRACE 2025/02/17 14:15:39 failed to read JSON message: failed to get reader: context deadline exceeded server
[pulse ] TRACE 2025/02/17 14:15:39 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:39 {"type":"connection_ack"} server
[pulse ] TRACE 2025/02/17 14:15:39 {"id":"e6edc84a-62dd-4870-bcd7-ef8e6fd4e1a6","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:39 {"id":"e6edc84a-62dd-4870-bcd7-ef8e6fd4e1a6","type":"next","payload":{"errors":[{"message":"unable to start stream api:90c423074bbc4ff58fe47ddb8fce7020 for device 90c42307-4bbc-4ff5-8fe4-7ddb8fce7020; userId 6d634dc6-cb9b-48fd-9ff2-1a665ed80b9b","locations":[{"line":1,"column":28}],"path":["liveMeasurement"]}]}} server
[pulse ] TRACE 2025/02/17 14:15:39 {"id":"e6edc84a-62dd-4870-bcd7-ef8e6fd4e1a6","type":"complete"} server
[pulse ] ERROR 2025/02/17 14:15:39 Message: unable to start stream api:90c423074bbc4ff58fe47ddb8fce7020 for device 90c42307-4bbc-4ff5-8fe4-7ddb8fce7020; userId 6d634dc6-cb9b-48fd-9ff2-1a665ed80b9b, Locations: [{Line:1 Column:28}], Extensions: map[], Path: [liveMeasurement]
[pulse ] TRACE 2025/02/17 14:15:39 no running subscription. exiting... client

To my understanding, in this moment flow control given back to the calling code, in our case evcc.
As we need the client to retry connecting, even in situation where the client exits, we do the following:

  1. we call SubscriptionClient.Subscribe(...)
  2. we restart the client via SubscriptionClient.Run()

I would expect from hereon, the subscription just again does its thing like before, i.e. send an connection_init and once successfully subscribed, it just streams the events.

However, what we DO see, is a multiplication of the subscription, which I frankly don't understand. So what we see in the logs after calling .Run() is a burst of consecutive connection_init requests, which then lead to MULTIPLE subscriptions, where there actually should only be a single subscription.

[pulse ] TRACE 2025/02/17 14:15:39 no running subscription. exiting... client
[pulse ] TRACE 2025/02/17 14:15:39 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:41 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:41 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:41 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:41 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:41 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:41 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:42 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:48 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:48 {"type":"connection_ack"} server
[pulse ] TRACE 2025/02/17 14:15:48 {"id":"27599e79-183d-460e-94f9-d8047a56d9c8","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:48 {"id":"99d89c41-a145-40eb-bae3-11826e9d26df","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:48 {"id":"fb6ccf36-5c4b-4fdd-9ed5-1543e4368be1","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:48 {"id":"27599e79-183d-460e-94f9-d8047a56d9c8","type":"next","payload":{"errors":[{"message":"unable to start stream api:90c423074bbc4ff58fe47ddb8fce7020 for device 90c42307-4bbc-4ff5-8fe4-7ddb8fce7020; userId 6d634dc6-cb9b-48fd-9ff2-1a665ed80b9b","locations":[{"line":1,"column":28}],"path":["liveMeasurement"]}]}} server
[pulse ] TRACE 2025/02/17 14:15:48 {"id":"27599e79-183d-460e-94f9-d8047a56d9c8","type":"complete"} server
[pulse ] ERROR 2025/02/17 14:15:48 Message: unable to start stream api:90c423074bbc4ff58fe47ddb8fce7020 for device 90c42307-4bbc-4ff5-8fe4-7ddb8fce7020; userId 6d634dc6-cb9b-48fd-9ff2-1a665ed80b9b, Locations: [{Line:1 Column:28}], Extensions: map[], Path: [liveMeasurement]
[pulse ] TRACE 2025/02/17 14:15:49 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:58 {"type":"connection_init","payload":{"token":"***"}} client
[pulse ] TRACE 2025/02/17 14:15:58 {"type":"connection_ack"} server
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"4ff3daa4-8b0c-4154-aaef-592507787842","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"6c08576b-402a-487f-9b54-f7fd77d64537","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"a295186e-83a6-40a1-bf7a-9d00aa30b929","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"abfd1e95-3a40-4d94-97f6-3b868e4b126e","type":"subscribe","payload":{"query":"subscription ($homeId:ID!){liveMeasurement(homeId: $homeId){power,powerProduction,lastMeterConsumption,lastMeterProduction,currentL1,currentL2,currentL3}}","variables":{"homeId":"***"}}} client
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"4ff3daa4-8b0c-4154-aaef-592507787842","type":"next","payload":{"errors":[{"message":"unable to start stream api:90c423074bbc4ff58fe47ddb8fce7020 for device 90c42307-4bbc-4ff5-8fe4-7ddb8fce7020; userId 6d634dc6-cb9b-48fd-9ff2-1a665ed80b9b","locations":[{"line":1,"column":28}],"path":["liveMeasurement"]}]}} server
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"4ff3daa4-8b0c-4154-aaef-592507787842","type":"complete"} server
[pulse ] ERROR 2025/02/17 14:15:58 Message: unable to start stream api:90c423074bbc4ff58fe47ddb8fce7020 for device 90c42307-4bbc-4ff5-8fe4-7ddb8fce7020; userId 6d634dc6-cb9b-48fd-9ff2-1a665ed80b9b, Locations: [{Line:1 Column:28}], Extensions: map[], Path: [liveMeasurement]
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"6c08576b-402a-487f-9b54-f7fd77d64537","type":"complete"} client
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"a295186e-83a6-40a1-bf7a-9d00aa30b929","type":"complete"} client
[pulse ] TRACE 2025/02/17 14:15:58 {"id":"abfd1e95-3a40-4d94-97f6-3b868e4b126e","type":"complete"} client
...

This "pattern" continues, and subscriptions start to pile up. After a few minutes we already have 12 subscriptions, where there used to be a single one.

I am not sure if this is a bug, or if we are doing something wrong on the client side by i.e. reusing the SubscriptionClient. Any advise would be helpful.

@hgiasac
Copy link

hgiasac commented Feb 19, 2025

@GrimmiMeloni if it's possible can you share the code block example on how you handle the connection restart? There would be some race conditions so we need to improve the sync lock logic

@GrimmiMeloni
Copy link
Author

@hgiasac
Copy link

hgiasac commented Feb 20, 2025

@GrimmiMeloni After reading the code, I have a few comments:

  • This line can be removed. The subscription client now keeps raw subscription information. When restarting, the client will reset the original state, not be removed as before. So when you call subscribe again the subscription will be duplicated.
  • Let's update the library with the latest commit on master.

Then test again to verify if the duplicated subscription issue still happens.

@GrimmiMeloni
Copy link
Author

Thanks @hgiasac - will do as suggested and follow up here once we have feedback from the field.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants