Skip to content

apache-airflow-providers-fab 3.4.0: Intermittent OperationalError (4031) when FAB session reuses a MySQL connection dropped due to idle timeout #62903

@davidjfrickert

Description

@davidjfrickert

Apache Airflow Provider(s)

fab

Versions of Apache Airflow Providers

apache-airflow-providers-fab==3.4.0

Apache Airflow version

3.1.7

Operating System

Docker

Deployment

Official Apache Airflow Helm Chart

Deployment details

API Server config to allow okta login. Otherwise pretty standard. Snippet below:

extraEnvFrom: |
  - secretRef:
      name: airflow-okta-creds
// (...)
data:
  metadataSecretName: airflow-db-conn
  resultBackendSecretName: airflow-db-conn
// (...)
env:
  - name: AIRFLOW__API__EXPOSE_CONFIG
    value: "False"
  - name: AIRFLOW__API__BASE_URL
    value: "<REDACTED>"
  - name: AIRFLOW__EMAIL__EMAIL_BACKEND
    value: "airflow.utils.email.send_email_smtp"
  - name: AIRFLOW__SMTP__SMTP_HOST
    value: "<REDACTED>"
  - name: AIRFLOW__SMTP__SMTP_MAIL_FROM
    value: "<REDACTED"
  - name: AIRFLOW__SMTP__SMTP_STARTTLS
    value: "False"
  - name: AIRFLOW__WEBSERVER__SHOW_TRIGGER_FORM_IF_NO_PARAMS
    value: "True"
  - name: AIRFLOW__WEBSERVER__WARN_DEPLOYMENT_EXPOSURE
    value: "False"
  - name: AIRFLOW__CORE__PARALLELISM
    value: "256"  
## Invalidate pool connections idle for more than 300s.
## Set below MySQL's default wait_timeout (600s) to avoid using server-closed connections.
## A lower value also speeds up recovery when the max DB connection limit is exceeded.
## NOTE: Does not apply to FAB since it does not use pool, but it uses single shared session.
  - name: AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_RECYCLE
    value: "300"
// (...)
apiServer:
  apiServerConfig: |
    import os
    from flask_appbuilder.security.manager import AUTH_OAUTH

    AUTH_TYPE = AUTH_OAUTH

    # registration configs
    AUTH_USER_REGISTRATION = True  # allow users who are not already in the FAB DB

    # the list of providers which the user can choose from
    OAUTH_PROVIDERS = [
      {
          "name": "okta",
          "icon": "fa-circle-o",
          "token_key": "access_token",
          "remote_app": {
              "client_id": os.environ["OKTA_CLIENT_ID"],
              "client_secret": os.environ["OKTA_CLIENT_SECRET"],
              "api_base_url": "<REDACTED>",
              "client_kwargs": {"scope": "openid profile email groups"},
              "server_metadata_url": "<REDACTED>",
              "access_token_url": "<REDACTED>",
              "authorize_url": "<REDACTED>",
          },
      },
    ]

    # a mapping from the values of `userinfo["role_keys"]` to a list of FAB roles
    AUTH_ROLES_MAPPING = {
        "<REDACTED>": ["Viewer"],
        "<REDACTED>": ["Admin"],
    }

    # if we should replace ALL the user's roles each login, or only on registration
    AUTH_ROLES_SYNC_AT_LOGIN = True

    # force users to re-auth after 30min of inactivity (to keep roles in sync)
    PERMANENT_SESSION_LIFETIME = 1800

What happened

MySQL server drops idle connections after 600s of inactivity. When this happens, FAB auth fails with error 500.

What you think should happen instead

This code should use connection pooling and allow the user to configure pool recycle via AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_RECYCLE.
The auth flow should not return error 500 and instead either idle sessions are dropped client-side via config above or server dropped sessions are handled gracefully.

How to reproduce

  • Deploy Airflow with MySQL and FAB plugin to Okta
  • Let connection be dropped by MySQL
  • Try authenticating --> ERROR 500
  • Try authenticating again --> OK

Anything else

Docker image built on 3.1.7 tag of this repo + adding apache-airflow-providers-fab==3.4.0 instead of current version in 3.1.7.

Full error log:

gh-issue-fab-3.4.0-mysql-disconnect-log.txt

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions