DeltaLake >= v0.20.0 throws "invalid peer certificate: BadSignature" error (ARM/Kubernetes/EKS) #3193

Open
klauss42 opened this issue Feb 5, 2025 · 5 comments
Labels
bug Something isn't working

klauss42 commented Feb 5, 2025

Environment

Python 3.10 job on AWS EKS on ARM nodes

Delta-rs version:
0.20.0 and later

Binding:
Python

Environment:

  • Cloud provider: AWS
  • OS: Linux
  • Other: ARM architecture

Bug

What happened:
After a version upgrade we are running into an exception when running a Python job in Kubernetes on an ARM node. The job uses a k8s serviceAccount to get properly configured AWS IAM policies to access S3. In deltalake versions before v0.20.0 everything worked fine for months. Using version > 0.19.2 breaks our code.

[2025-02-05T16:33:59Z WARN  aws_config::web_identity_token] STS returned an error assuming web identity role error=dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer certificate: BadSignature (DispatchFailure(DispatchFailure { source: ConnectorError { kind: Io, source: hyper::Error(Connect, Custom { kind: Other, error: Custom { kind: InvalidData, error: InvalidCertificate(BadSignature) } }), connection: Unknown } }))
[2025-02-05T16:33:59Z WARN  aws_config::meta::credentials::chain] provider failed to provide credentials provider=WebIdentityToken error=an error occurred while loading credentials: dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer certificate: BadSignature (ProviderError(ProviderError { source: DispatchFailure(DispatchFailure { source: ConnectorError { kind: Io, source: hyper::Error(Connect, Custom { kind: Other, error: Custom { kind: InvalidData, error: InvalidCertificate(BadSignature) } }), connection: Unknown } }) }))
[2025-02-05T16:33:59Z WARN  aws_config::meta::credentials::chain] provider failed to provide credentials provider=DefaultChain error=an error occurred while loading credentials: dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer certificate: BadSignature (ProviderError(ProviderError { source: DispatchFailure(DispatchFailure { source: ConnectorError { kind: Io, source: hyper::Error(Connect, Custom { kind: Other, error: Custom { kind: InvalidData, error: InvalidCertificate(BadSignature) } }), connection: Unknown } }) }))
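The warnings above come from the web identity token provider, which reads `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` from the environment. Before suspecting TLS, it may be worth confirming from inside the pod that the serviceAccount actually injected these. The following is a minimal sketch; the function name `web_identity_status` is made up for illustration and is not part of deltalake.

```python
import os

# Variables the web identity provider expects (set by the EKS service account integration).
REQUIRED = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def web_identity_status(env=os.environ):
    """Report whether the web identity settings are present (hypothetical helper)."""
    status = {name: bool(env.get(name)) for name in REQUIRED}
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    # The token file must exist and be readable by the job's user.
    status["token_file_readable"] = bool(token_file) and os.access(token_file, os.R_OK)
    # The region is optional here; it is only reported for context.
    status["region"] = env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")
    return status
```

Calling `print(web_identity_status())` in the failing pod shows which settings are visible. If everything is present, the problem is more likely in the TLS connection to STS than in the serviceAccount setup.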

What you expected to happen:
Latest versions should continue to work :-)

How to reproduce it:

I stripped down the issue to the following simple snippet:

import pandas as pd
from deltalake import write_deltalake, DeltaTable

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
}
df = pd.DataFrame(data)

s3path = "s3://<BUCKET>/xxx/"
_storage_options = {
    "AWS_S3_ALLOW_UNSAFE_RENAME": "true",
}
write_deltalake(s3path, df, mode="append", storage_options=_storage_options)

dt = DeltaTable(s3path)
print(f"version: {dt.version()}")

The problem does not show up on local runs on Mac M3 or on Windows. It only happens when running the code on Linux in a Kubernetes pod. As we only have ARM nodes, I don't know if this is an ARM-only problem.

When using DeltaLake v0.19.2 the above works without error.
When using DeltaLake >= v0.20.0 (0.22.0 and 0.24.0 also fail), we see the following error:

[2025-02-05T16:33:59Z WARN  aws_config::web_identity_token] STS returned an error assuming web identity role error=dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer certificate: BadSignature (DispatchFailure(DispatchFailure { source: ConnectorError { kind: Io, source: hyper::Error(Connect, Custom { kind: Other, error: Custom { kind: InvalidData, error: InvalidCertificate(BadSignature) } }), connection: Unknown } }))
[2025-02-05T16:33:59Z WARN  aws_config::meta::credentials::chain] provider failed to provide credentials provider=WebIdentityToken error=an error occurred while loading credentials: dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer certificate: BadSignature (ProviderError(ProviderError { source: DispatchFailure(DispatchFailure { source: ConnectorError { kind: Io, source: hyper::Error(Connect, Custom { kind: Other, error: Custom { kind: InvalidData, error: InvalidCertificate(BadSignature) } }), connection: Unknown } }) }))
[2025-02-05T16:33:59Z WARN  aws_config::meta::credentials::chain] provider failed to provide credentials provider=DefaultChain error=an error occurred while loading credentials: dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer certificate: BadSignature (ProviderError(ProviderError { source: DispatchFailure(DispatchFailure { source: ConnectorError { kind: Io, source: hyper::Error(Connect, Custom { kind: Other, error: Custom { kind: InvalidData, error: InvalidCertificate(BadSignature) } }), connection: Unknown } }) }))
Traceback (most recent call last):
  File "/app/src/controlling_v2/jobs/certbug.py", line 16, in <module>
    write_deltalake(s3path, df, mode="append", storage_options=_storage_options)
  File "/app/venv/lib/python3.10/site-packages/deltalake/writer.py", line 298, in write_deltalake
    table, table_uri = try_get_table_and_table_uri(table_or_uri, storage_options)
  File "/app/venv/lib/python3.10/site-packages/deltalake/writer.py", line 761, in try_get_table_and_table_uri
    table = try_get_deltatable(table_or_uri, storage_options)
  File "/app/venv/lib/python3.10/site-packages/deltalake/writer.py", line 774, in try_get_deltatable
    return DeltaTable(table_uri, storage_options=storage_options)
  File "/app/venv/lib/python3.10/site-packages/deltalake/table.py", line 415, in __init__
    self._table = RawDeltaTable(
OSError: Operation not supported: an error occurred while loading credentials

More details:

klauss42 added the bug label Feb 5, 2025
@ion-elgreco
Collaborator

The error implies that something is wrong either with the certificates you are using or with how the certificates are parsed.

To my understanding, aws_config uses hyper for the HTTP client and rustls for the certificate handling.
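One way to narrow down "certificates vs. parsing" might be to attempt the same handshake from inside the pod with a different TLS stack. The sketch below uses Python's `ssl` module (OpenSSL) against a regional STS endpoint. The helper names `sts_host` and `check_tls` are made up for illustration, and the hostname pattern assumes the standard AWS partition.

```python
import socket
import ssl


def sts_host(region):
    """Regional STS hostname, assuming the standard AWS partition."""
    return f"sts.{region}.amazonaws.com"


def check_tls(host, port=443, timeout=5.0):
    """Try a verified TLS handshake using Python's ssl module (hypothetical helper)."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                return {
                    "ok": True,
                    "tls_version": tls.version(),
                    "issuer": cert.get("issuer"),
                    "not_after": cert.get("notAfter"),
                }
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return {"ok": False, "error": repr(exc)}
```

For example, `check_tls(sts_host("<your region>"))` run in the failing pod. A successful handshake here only shows that OpenSSL accepts the chain presented to the pod. It does not prove anything about rustls, but it would point toward the parsing or verification side rather than the certificates themselves.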

@ion-elgreco
Collaborator

I can see this dependency tree in aws_smithy:

── hyper-rustls v0.24.2
│   │   │   │   │   │   ├── futures-util v0.3.31 (*)
│   │   │   │   │   │   ├── http v0.2.12 (*)
│   │   │   │   │   │   ├── hyper v0.14.32 (*)
│   │   │   │   │   │   ├── log v0.4.22 (*)
│   │   │   │   │   │   ├── rustls v0.21.12   <--- Maybe this version is incompatible?

@rtyler, do you have more insight into the inner workings of rustls and AWS (EKS)?

@klauss42
Author

klauss42 commented Feb 5, 2025

@ion-elgreco Sorry, but I don't know anything about Rust; I am simply using deltalake from Python. We only discovered that the deltalake library no longer works in our k8s/ARM environment when upgrading to v0.20.0.

@klauss42
Author

klauss42 commented Feb 5, 2025

I tried to reproduce the behavior using a minimal Docker container. Maybe this helps to analyze or reproduce the problem.

Dockerfile:

FROM python:3.10.14-slim

ARG DL_VERSION
ENV DL_VERSION=$DL_VERSION

RUN pip install deltalake==$DL_VERSION

WORKDIR /app

RUN cat <<EOF > /app/certbug.py
from deltalake import DeltaTable

s3path = "s3://<enter S3 bucket>/<existing deltatable>/"
dt = DeltaTable(s3path)
print(f"version: {dt.version()}")
EOF

ENTRYPOINT ["python", "-u", "certbug.py"]

Replace <enter S3 bucket>/<existing deltatable> with values from your environment.

Build it:

docker build -t klauss42/deltalake-certbug:0.19.2 --build-arg DL_VERSION=0.19.2 .
docker push klauss42/deltalake-certbug:0.19.2

docker build -t klauss42/deltalake-certbug:0.20.0 --build-arg DL_VERSION=0.20.0 .
docker push klauss42/deltalake-certbug:0.20.0

Replace klauss42 with a valid Docker registry.

k8s job manifests:

apiVersion: batch/v1
kind: Job
metadata:
  name: certbug-v19
  namespace: default
  labels:
    app: certbug
spec:
  backoffLimit: 0
  template:
    metadata:
      labels:
        app: certbug
    spec:
      serviceAccountName: <service account with permissions to access S3 bucket>
      containers:
        - name: certbug
          image: klauss42/deltalake-certbug:0.19.2
      restartPolicy: Never
---
apiVersion: batch/v1
kind: Job
metadata:
  name: certbug-v20
  namespace: default
  labels:
    app: certbug
spec:
  backoffLimit: 0
  template:
    metadata:
      labels:
        app: certbug
    spec:
      serviceAccountName: <service account with permissions to access S3 bucket>
      containers:
        - name: certbug
          image: klauss42/deltalake-certbug:0.20.0
      restartPolicy: Never

Replace <service account with permissions to access S3 bucket> with a valid serviceAccount and adjust the Docker registry accordingly.

Deploy it:

kubectl apply -f job.yaml

Result

The job using v0.19.2 logs:

version: 14

The job using v0.20.0 logs:

[2025-02-05T18:47:31Z WARN  aws_config::web_identity_token] STS returned an error assuming web identity role error=dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer certificate: BadSignature (DispatchFailure(DispatchFailure { so
[2025-02-05T18:47:31Z WARN  aws_config::meta::credentials::chain] provider failed to provide credentials provider=WebIdentityToken error=an error occurred while loading credentials: dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid p
[2025-02-05T18:47:31Z WARN  aws_config::meta::credentials::chain] provider failed to provide credentials provider=DefaultChain error=an error occurred while loading credentials: dispatch failure: io error: error trying to connect: invalid peer certificate: BadSignature: invalid peer
Traceback (most recent call last):
  File "/app/certbug.py", line 18, in <module>
    dt = DeltaTable(s3path)
  File "/usr/local/lib/python3.10/site-packages/deltalake/table.py", line 412, in __init__
    self._table = RawDeltaTable(
OSError: Operation not supported: an error occurred while loading credentials

@ion-elgreco
Collaborator

ion-elgreco commented Feb 5, 2025

Can you build from the commit of the 0.19.2 release and check whether it still works?

If it does, build the 0.20.0 release commit and see whether it fails. If it does, compare the cargo trees of the two release commits.

This will probably give a hint about what changed between the two versions in terms of rustls.
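The cargo tree comparison could be scripted roughly as below. This is only a sketch: it assumes the output of `cargo tree` for each release commit has been saved to a text file, and the helper names and the keyword list (`rustls`, `webpki`, `ring`, `hyper`) are guesses at what is relevant, not something confirmed in this thread.

```python
import re
from collections import defaultdict

# Matches entries such as "├── rustls v0.21.12 (*)" in `cargo tree` output.
CRATE_RE = re.compile(r"([A-Za-z0-9_-]+) v(\d+\.\d+\.\d+\S*)")


def crate_versions(tree_text):
    """Map each crate name to the set of versions found in the tree text."""
    versions = defaultdict(set)
    for line in tree_text.splitlines():
        match = CRATE_RE.search(line)
        if match:
            versions[match.group(1)].add(match.group(2))
    return versions


def diff_trees(old_text, new_text, keywords=("rustls", "webpki", "ring", "hyper")):
    """Return TLS-related crates whose version sets differ between two trees."""
    old, new = crate_versions(old_text), crate_versions(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        # Compare whole name parts so that "ring" does not match "tracing".
        parts = set(re.split(r"[-_]", name))
        if not parts & set(keywords):
            continue
        before, after = old.get(name, set()), new.get(name, set())
        if before != after:
            changes[name] = (sorted(before), sorted(after))
    return changes
```

With the two outputs saved as, say, `tree-0.19.2.txt` and `tree-0.20.0.txt` (hypothetical file names), `diff_trees(open("tree-0.19.2.txt").read(), open("tree-0.20.0.txt").read())` returns a dict mapping each changed crate to its old and new version lists.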
