Deadlock on ingest attachment nodes during log rollover  #131404

@Rassyan

Elasticsearch Version

8.16.1

Installed Plugins

No response

Java Version

bundled

OS Version

Problem Description

Summary

This is the same issue as #91964, which remains unresolved: a reproducible Java-level deadlock on Elasticsearch ingest nodes running the attachment processor, triggered during log rollover. The deadlock hangs the ingest node indefinitely and is most likely to occur under heavy ingest and logging load. The discussion in #93878 focused on log loss, logging bridges, and log4j-related permission issues, but did not identify or address the JVM-level deadlock caused by a lock cycle between log4j2's RollingFileManager and a java.io.BufferedWriter.

Details

  • Elasticsearch version: 8.16.1
  • log4j2 version: 2.19.0 (bundled)
  • Node type: Dedicated ingest node

jstack evidence

Below is a minimal excerpt from a real production jstack, showing the deadlock:

Found one Java-level deadlock:
=============================
"elasticsearch[1751529376016684232][cluster_coordination][T#1]":
  waiting to lock monitor 0x00007f3f214190e0 (object 0x00000010012ae8f0, a org.apache.logging.log4j.core.appender.rolling.RollingFileManager),
  which is held by "elasticsearch[1751529376016684232][write][T#16]"

"elasticsearch[1751529376016684232][write][T#16]":
  waiting to lock monitor 0x00007f3f02e4a000 (object 0x0000001011305730, a java.io.BufferedWriter),
  which is held by "elasticsearch[1751529376016684232][write][T#15]"

"elasticsearch[1751529376016684232][write][T#15]":
  waiting to lock monitor 0x00007f3f214190e0 (object 0x00000010012ae8f0, a org.apache.logging.log4j.core.appender.rolling.RollingFileManager),
  which is held by "elasticsearch[1751529376016684232][write][T#16]"
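
For readers unfamiliar with this kind of cycle, here is a minimal, self-contained sketch of the same shape: two threads that take two monitors in opposite order. It is not the log4j2 or Elasticsearch code path. `MANAGER_LOCK`, `WRITER_LOCK`, the thread names, and the class name are hypothetical stand-ins for the RollingFileManager and BufferedWriter monitors in the excerpt above. The sketch uses the standard `ThreadMXBean.findMonitorDeadlockedThreads()` call, which performs the same kind of monitor-cycle detection that jstack reports.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockCycleSketch {
    // Hypothetical stand-ins for the two monitors named in the jstack excerpt.
    private static final Object MANAGER_LOCK = new Object(); // "RollingFileManager"
    private static final Object WRITER_LOCK = new Object();  // "BufferedWriter"

    /**
     * Starts two daemon threads that acquire the two monitors in opposite
     * order, then polls the JVM until it reports a monitor deadlock.
     * Returns the number of deadlocked threads, or 0 if none was seen.
     */
    public static int reproduceAndDetect() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        // Mirrors write[T#16]: holds the manager, wants the writer.
        start("sketch-T16", MANAGER_LOCK, WRITER_LOCK, bothHoldFirst);
        // Mirrors write[T#15]: holds the writer, wants the manager.
        start("sketch-T15", WRITER_LOCK, MANAGER_LOCK, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no cycle exists
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static void start(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread also holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds this monitor.
                }
            }
        }, name);
        t.setDaemon(true); // the stuck threads must not keep the JVM alive
        t.start();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + reproduceAndDetect());
    }
}
```

As in the production dump, neither thread can make progress once the cycle forms, and any third thread that later needs either monitor (the cluster_coordination thread above) blocks behind them.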

Impact

  • Ingest nodes become permanently stuck, requiring a process restart.
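  • All ingest pipelines on the node are affected, and possibly cluster coordination as well, since a cluster_coordination thread is blocked on the same RollingFileManager monitor.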

Attachments

  • Full jstack available upon request.

Steps to Reproduce

  1. Run an ingest pipeline that uses the attachment processor.
  2. Apply ingest load. The deadlock almost always occurs during log rollover.

This matches what was reported in #93878 (comment).

Logs (if relevant)

jstack.txt
