Unable to Get ECR Creds - context deadline exceeded #1029

Closed
mzameer777 opened this issue Sep 25, 2024 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@mzameer777

What happened:
I'm trying to pull images from a private ECR registry. I have installed and configured the ecr-credential-provider plugin, but I'm getting the following error in the kubelet logs and can't figure out how to proceed:

{"ts":1727300113642.7864,"caller":"plugin/plugin.go:235","msg":"Failed getting credential from external registry credential provider: error execing credential provider plugin ecr-credential-provider for image XXX.dkr.ecr.us-west-2.amazonaws.com/cilium/cilium: context deadline exceeded: I0925 21:34:13.657531    6329 main.go:129] Getting creds for private image XXX.dkr.ecr.us-west-2.amazonaws.com/cilium/cilium\nW0925 21:34:13.657585    6329 main.go:65] No region found in the image reference, the default region will be used. Please refer to AWS SDK documentation for configuration purpose."}

The plugin binary is executed, but it fails with "context deadline exceeded".

My configuration is below. I'm using Talos, so this is the credential provider config patch:

machine:
  kubelet:
    credentialProviderConfig:
      apiVersion: kubelet.config.k8s.io/v1
      kind: CredentialProviderConfig
      providers:
        - name: ecr-credential-provider
          matchImages:
            - "*.dkr.ecr.*.amazonaws.com"
          defaultCacheDuration: "12h"
          apiVersion: credentialprovider.kubelet.k8s.io/v1

Environment:

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Sep 25, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Sep 25, 2024
@cartermckinnon
Contributor

cartermckinnon commented Sep 26, 2024

You're probably getting the timeout here:

output, err := e.ecr.GetAuthorizationToken(&ecr.GetAuthorizationTokenInput{})

Do your nodes have network access to the ECR endpoint?

(The warning you're seeing in the output there is misleading but harmless. There's a fix for that in #1030.)
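
One way to check reachability from a node is sketched below. GetAuthorizationToken is a call to the ECR API, which is a different host from the `*.dkr.ecr.*` registry that serves image layers, so the two can have different connectivity. This is a minimal sketch, not part of the provider: `ecrAPIEndpoint` and `checkReachable` are hypothetical helper names, and the code assumes the standard `api.ecr.<region>.amazonaws.com` hostname (it rejects other partitions and FIPS hosts rather than guessing).

```go
package main

import (
	"fmt"
	"net"
	"os"
	"regexp"
	"time"
)

// ecrHost matches registry hosts such as
// 123456789012.dkr.ecr.us-west-2.amazonaws.com and captures the region.
var ecrHost = regexp.MustCompile(`^[^./]+\.dkr\.ecr\.([a-z0-9-]+)\.amazonaws\.com(/|$)`)

// ecrAPIEndpoint (hypothetical helper) derives the ECR API host:port
// from an image reference. The hostname pattern is an assumption.
func ecrAPIEndpoint(image string) (string, error) {
	m := ecrHost.FindStringSubmatch(image)
	if m == nil {
		return "", fmt.Errorf("not a recognized ECR image reference: %q", image)
	}
	return "api.ecr." + m[1] + ".amazonaws.com:443", nil
}

// checkReachable (hypothetical helper) attempts a plain TCP connection
// and gives up after the timeout. It does not test TLS or IAM.
func checkReachable(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: ecrcheck <image>")
		os.Exit(2)
	}
	addr, err := ecrAPIEndpoint(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := checkReachable(addr, 5*time.Second); err != nil {
		fmt.Fprintf(os.Stderr, "cannot reach %s: %v\n", addr, err)
		os.Exit(1)
	}
	fmt.Printf("TCP connection to %s succeeded\n", addr)
}
```

A successful dial only shows that the API endpoint resolves and accepts connections from the node. A hang or timeout here would be consistent with the "context deadline exceeded" error in the kubelet log.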

@mzameer777
Author

I can confirm that the node has network connectivity to the ECR VPC endpoint and that it has full ECR permissions.

What else can I look for? Is there a way to debug this further in my environment?

@cartermckinnon
Contributor

You verified that aws ecr get-login-password works on the node?

You can try to reproduce the cred provider failure with something like:

echo '{"kind":"CredentialProviderRequest","apiVersion":"credentialprovider.kubelet.k8s.io/v1","image":"'"$IMAGE"'"}' | ecr-credential-provider
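
The same reproduction can also be driven from Go, which sidesteps shell quoting for the image value and makes it easy to apply an explicit timeout. This is a rough sketch under stated assumptions: `buildRequest` and `runPlugin` are hypothetical helpers, the 30-second timeout is arbitrary, and the plugin is assumed to read the request from stdin with no arguments, as in the one-liner above.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// credentialProviderRequest mirrors the three fields used in the
// one-line reproduction; it is not the upstream Go type.
type credentialProviderRequest struct {
	Kind       string `json:"kind"`
	APIVersion string `json:"apiVersion"`
	Image      string `json:"image"`
}

// buildRequest (hypothetical helper) encodes the request for an image.
func buildRequest(image string) ([]byte, error) {
	return json.Marshal(credentialProviderRequest{
		Kind:       "CredentialProviderRequest",
		APIVersion: "credentialprovider.kubelet.k8s.io/v1",
		Image:      image,
	})
}

// runPlugin (hypothetical helper) feeds the request to the plugin on
// stdin and returns whatever it wrote to stdout and stderr.
func runPlugin(ctx context.Context, pluginPath string, request []byte) (string, string, error) {
	cmd := exec.CommandContext(ctx, pluginPath)
	cmd.Stdin = bytes.NewReader(request)
	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	err := cmd.Run()
	return stdout.String(), stderr.String(), err
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: repro <plugin-path> <image>")
		os.Exit(2)
	}
	req, err := buildRequest(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Arbitrary timeout chosen for this sketch.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	stdout, stderr, err := runPlugin(ctx, os.Args[1], req)
	cancel()
	fmt.Fprint(os.Stderr, stderr) // plugin log lines
	if err != nil {
		fmt.Fprintln(os.Stderr, "plugin failed:", err)
		os.Exit(1)
	}
	// On success stdout carries registry credentials; avoid pasting it anywhere public.
	fmt.Println(stdout)
}
```

If the plugin hangs until the timeout, the process is killed and the error is reported along with any log lines the plugin wrote to stderr before that point.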

@mzameer777
Author

I was able to resolve this: my cluster did not have connectivity to the ECR API. Thanks for helping me debug this.
