Skip to content

Conversation

guilhem
Copy link
Contributor

@guilhem guilhem commented Nov 21, 2024

This pull request focuses on simplifying the volume attachment logic in the ControllerServer by removing redundant checks and related methods. The most important changes include the removal of the checkAttachmentCapacity function and its associated methods, as well as the related test cases.

Context

CSI caller (like Kubernetes) should already respect volume limitation from NodeGetInfo.
We don't need another check and logic at attach.

Motivation

Reduce logic and complexity and rely more on Linode API calls.
If there is an error, return it to the user for easier debug.

Risk

Making more failing attach requests and maybe hitting a rate limit.
This should not be a problem, as CSI attacher have a backoff logic.
Moreover, current limit logic is already doing api calls that should have hit API ratelimit.

Possible Mitigations

  • Implementing a rate limiting on api call.
  • As GCP, implementing a csiErrorBackoff mechanism

IMHO, both are not mandatory for the moment.

Simplification of volume attachment logic:

  • Removed the checkAttachmentCapacity function and its invocation from the ControllerPublishVolume method in internal/driver/controllerserver.go.
  • Deleted the canAttach function, which was used to determine if an additional volume could be attached to an instance, from internal/driver/controllerserver_helper.go.
  • Removed the checkAttachmentCapacity function, which checked if an instance could accommodate additional volume attachments, from internal/driver/controllerserver_helper.go.

Removal of related test cases:

  • Deleted the TestCheckAttachmentCapacity test function from internal/driver/controllerserver_helper_test.go, which tested the checkAttachmentCapacity logic.

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Pull Request Guidelines:

  1. Does your submission pass tests?
  2. Have you added tests?
  3. Are you addressing a single feature in this PR?
  4. Are your commits atomic, addressing one change per commit?
  5. Are you following the conventions of the language?
  6. Have you saved your large formatting changes for a different PR, so we can focus on your work?
  7. Have you explained your rationale for why this feature is needed?
  8. Have you linked your PR to an open issue


// maxAllowedVolumeAttachments calculates the maximum number of volumes that can be attached to a Linode instance,
// taking into account the instance's memory and currently attached disks.
func (cs *ControllerServer) maxAllowedVolumeAttachments(ctx context.Context, instance *linodego.Instance) (int, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also remove this helper func(). Its not used anywhere else except the funcs you are removing :)

Copy link
Contributor

@komer3 komer3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this proposal. This will definitely simplify the volume attachment logic (great for us when debugging and understanding what is happening).

Just to make sure we aren't missing any cases, we should test the CSI driver by running the all the e2e tests we have plus run manual test to see how CSI driver handles things when we hit the max limit.

@wbh1
Copy link
Collaborator

wbh1 commented Nov 22, 2024

@guilhem - do you intend to re-open #310? I believe those changes will still be beneficial in order to properly implement the CSI spec and behave the way that kube-scheduler expects.

@guilhem
Copy link
Contributor Author

guilhem commented Nov 22, 2024

@wbh1 totally.
Was on draft just for discussion.
I'm opening it off team is interested

@guilhem guilhem marked this pull request as ready for review November 22, 2024 16:52
@guilhem guilhem requested review from a team as code owners November 22, 2024 16:52
Copy link

codecov bot commented Nov 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.79%. Comparing base (8df2d46) to head (6f49ca5).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #312   +/-   ##
=======================================
  Coverage   74.79%   74.79%           
=======================================
  Files          22       22           
  Lines        2396     2396           
=======================================
  Hits         1792     1792           
  Misses        499      499           
  Partials      105      105           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@guilhem
Copy link
Contributor Author

guilhem commented Nov 25, 2024

duplicate #322

@guilhem guilhem closed this Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants