Skip to content

Potential deadlock from GraphListenerAdapterToPipelineListener #700

@Dohbedoh

Description

@Dohbedoh

Jenkins and plugins versions report

Environment
* core: 2.387.3
* opentelemetry:2.11.0
* workflow-api:1208.v0cc7c6e0da_9e
* workflow-cps:3653.v07ea_433c90b_4
* workflow-job:1289.1291.vb_7c188e7e7df
* workflow-support:839.v35e2736cfd5c

Issue

Jenkins shows a deadlock between open-telemetry GraphListenerAdapterToPipelineListener and the Jenkins initialization thread:

==============
Deadlock Found
==============
"Jenkins initialization thread" id=50 (0x32) state=BLOCKED cpu=95%
    - waiting to lock <0x23e558fe> (a org.jenkinsci.plugins.workflow.cps.CpsFlowExecution)
      owned by "Running CpsFlowExecution[Owner[myfolder/myjob/17:myfolder/myjob #17]]" id=1246 (0x4de)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.isComplete(CpsFlowExecution.java:1277)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.getCurrentExecutions(CpsFlowExecution.java:1019)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ResumeStepExecutionListener.onResumed(FlowExecutionList.java:267)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionListener.fireResumed(FlowExecutionListener.java:85)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun.onLoad(WorkflowRun.java:589)
    at hudson.model.RunMap.retrieve(RunMap.java:233)
    at hudson.model.RunMap.retrieve(RunMap.java:61)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:650)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.load(AbstractLazyLoadRunMap.java:632)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:530)
    at hudson.model.RunMap.getById(RunMap.java:213)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.run(WorkflowRun.java:965)
    at org.jenkinsci.plugins.workflow.job.WorkflowRun$Owner.get(WorkflowRun.java:977)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:76)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$1.computeNext(FlowExecutionList.java:68)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:146)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:141)
    at org.jenkinsci.plugins.workflow.flow.FlowExecutionList$ItemListenerImpl.onLoaded(FlowExecutionList.java:197)
    at jenkins.model.Jenkins.<init>(Jenkins.java:1040)
    at hudson.model.Hudson.<init>(Hudson.java:86)
    at hudson.model.Hudson.<init>(Hudson.java:82)
    at hudson.WebAppMain$3.run(WebAppMain.java:247)

"Running CpsFlowExecution[Owner[myfolder/myjob/17:myfolder/myjob #17]]" id=1246 (0x4de) state=BLOCKED cpu=57%
    - waiting to lock <0x76460394> (a hudson.model.RunMap)
      owned by "Jenkins initialization thread" id=50 (0x32)
    at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:519)
    at jenkins.model.lazy.LazyBuildMixIn.getBuildByNumber(LazyBuildMixIn.java:237)
    at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:234)
    at org.jenkinsci.plugins.workflow.job.WorkflowJob.getBuildByNumber(WorkflowJob.java:105)
    at hudson.model.Run.fromExternalizableId(Run.java:2498)
    at org.jenkinsci.plugins.workflow.support.steps.build.DownstreamFailureCause.getDownstreamBuild(DownstreamFailureCause.java:47)
    at org.jenkinsci.plugins.workflow.support.steps.build.DownstreamFailureCause.getShortDescription(DownstreamFailureCause.java:63)
    at io.jenkins.plugins.opentelemetry.job.MonitoringPipelineListener.lambda$endCurrentSpan$11(MonitoringPipelineListener.java:308)
    at io.jenkins.plugins.opentelemetry.job.MonitoringPipelineListener$$Lambda$1490/0x00000008425bd440.apply(Unknown Source)
    at [email protected]/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at [email protected]/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at [email protected]/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at [email protected]/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at [email protected]/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
    at [email protected]/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at [email protected]/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
    at io.jenkins.plugins.opentelemetry.job.MonitoringPipelineListener.endCurrentSpan(MonitoringPipelineListener.java:308)
    at io.jenkins.plugins.opentelemetry.job.MonitoringPipelineListener.onEndParallelStepBranch(MonitoringPipelineListener.java:291)
    at io.jenkins.plugins.opentelemetry.job.jenkins.GraphListenerAdapterToPipelineListener.fireOnAfterEndParallelStepBranch(GraphListenerAdapterToPipelineListener.java:129)
    at io.jenkins.plugins.opentelemetry.job.jenkins.GraphListenerAdapterToPipelineListener.onNewHead(GraphListenerAdapterToPipelineListener.java:64)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.notifyListeners(CpsFlowExecution.java:1588)
    at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.notifyNewHead(CpsThreadGroup.java:492)
    at org.jenkinsci.plugins.workflow.cps.FlowHead.setNewHead(FlowHead.java:158)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.onProgramEnd(CpsFlowExecution.java:1297)
    at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution.croak(CpsFlowExecution.java:882)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.reportProblem(CpsVmExecutorService.java:57)
    at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:69)
    at [email protected]/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
    at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
    at jenkins.util.ErrorLoggingExecutorService$$Lambda$1086/0x0000000841d64840.run(Unknown Source)
    at [email protected]/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at [email protected]/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at [email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at [email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at [email protected]/java.lang.Thread.run(Thread.java:829)

Could be due to the usage of a synchronous GraphListener:

What Operating System are you using (both controller, and any agents involved in the problem)?

  • JVM Implementation

    • Name: OpenJDK 64-Bit Server VM
    • Vendor: Red Hat, Inc.
    • Version: 11.0.19+7-LTS
  • Operating system

    • Name: Linux
    • Architecture: amd64
    • Version: 3.10.0-1160.92.1.el7.x86_64

Reproduction steps

No consistently reproducible. But basically:

  • launch many many resumable pipelines
  • restart Jenkins

Expected Results

No deadlocks

Actual Results

deadlock

Anything else?

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions