Skip to content

streamingccl: data race from Subscription.Subscribe #83694

@stevendanna

Description

@stevendanna

Describe the problem

When attempting to remove skip.UnderRace from TestPartitionedStreamReplicationClient, I hit:

==================
WARNING: DATA RACE
Write at 0x00c00146e328 by goroutine 714:
  runtime.slicecopy()
      GOROOT/src/runtime/slice.go:284 +0x0
  bufio.(*Reader).Read()
      GOROOT/src/bufio/bufio.go:238 +0x6c4
  io.ReadAtLeast()
      GOROOT/src/io/io.go:328 +0xdd
  io.ReadFull()
      GOROOT/src/io/io.go:347 +0x324
  github.com/lib/pq.(*conn).recvMessage()
      github.com/lib/pq/external/com_github_lib_pq/conn.go:1018 +0x2bd
  github.com/lib/pq.(*conn).recv1Buf()
      github.com/lib/pq/external/com_github_lib_pq/conn.go:1059 +0x3a
  github.com/lib/pq.(*rows).Next()
      github.com/lib/pq/external/com_github_lib_pq/conn.go:1573 +0x20e
  github.com/lib/pq.(*rows).Close()
      github.com/lib/pq/external/com_github_lib_pq/conn.go:1530 +0xc4
  database/sql.(*Rows).close.func1()
      GOROOT/src/database/sql/sql.go:3280 +0x57
  database/sql.withLock()
      GOROOT/src/database/sql/sql.go:3396 +0xb5
  database/sql.(*Rows).close()
      GOROOT/src/database/sql/sql.go:3279 +0x24d
  database/sql.(*Rows).awaitDone()
      GOROOT/src/database/sql/sql.go:2933 +0x10b
  database/sql.(*Rows).initContextClose·dwrap·27()
      GOROOT/src/database/sql/sql.go:2917 +0x71

Previous read at 0x00c00146e328 by goroutine 478:
  runtime.slicecopy()
      GOROOT/src/runtime/slice.go:284 +0x0
  database/sql.cloneBytes()
      GOROOT/src/database/sql/convert.go:494 +0x38c
  database/sql.convertAssignRows()
      GOROOT/src/database/sql/convert.go:261 +0x357
  database/sql.(*Rows).Scan()
      GOROOT/src/database/sql/sql.go:3246 +0x526
  github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient.(*partitionedStreamSubscription).Subscribe.func1()
      github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient/partitioned_stream_client.go:304 +0x104
  github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient.(*partitionedStreamSubscription).Subscribe()
      github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient/partitioned_stream_client.go:316 +0x4a1
  github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient.Subscription.Subscribe-fm()
      github.com/cockroachdb/cockroach/pkg/ccl/streamingccl/streamclient/client.go:113 +0x6b
  github.com/cockroachdb/cockroach/pkg/util/ctxgroup.Group.GoCtx.func1()
      github.com/cockroachdb/cockroach/pkg/util/ctxgroup/ctxgroup.go:169 +0x4c
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      golang.org/x/sync/errgroup/external/org_golang_x_sync/errgroup/errgroup.go:74 +0x91

This looks to me like the race is actually in the lib/pq driver in which it is accessing the rows buffer in its asynchronous context cancellation handling at the same time as accessing it from Scan.

See also:

Jira issue: CRDB-17212

Metadata

Metadata

Assignees

Labels

A-tenant-streamingIncluding cluster streamingC-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions