db: possible obsolete file bug #5420

@jbowens

Description

In Open we scan for obsolete files to delete:

pebble/open.go, lines 592 to 602 at a5edd16:

if !d.opts.ReadOnly {
	// Get a fresh list of files, in case some of the earlier flushes/compactions
	// have deleted some files.
	ls, err := opts.FS.List(dirname)
	if err != nil {
		return nil, err
	}
	d.scanObsoleteFiles(ls, flushableIngests)
	d.deleteObsoleteFiles(jobID)
}
// Else, nothing is obsolete.
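
For context, a minimal sketch of what such a scan amounts to, using a hypothetical live set and file names rather than Pebble's real version state: anything the directory listing turns up that the current version doesn't reference gets deleted.

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// deleteObsolete is a hypothetical stand-in for scanObsoleteFiles +
// deleteObsoleteFiles: list the directory and remove any file the
// live set doesn't reference.
func deleteObsolete(dir string, live map[string]bool) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if !live[e.Name()] {
			// Unreferenced, so presumed obsolete.
			if err := os.Remove(filepath.Join(dir, e.Name())); err != nil {
				return err
			}
			fmt.Println("deleted:", e.Name())
		}
	}
	return nil
}

func main() {
	dir, _ := os.MkdirTemp("", "scan")
	defer os.RemoveAll(dir)
	os.WriteFile(filepath.Join(dir, "000001.sst"), nil, 0o644)
	os.WriteFile(filepath.Join(dir, "000007.sst"), nil, 0o644) // orphan, e.g. from a crash
	_ = deleteObsolete(dir, map[string]bool{"000001.sst": true})
}

The correctness of that diff depends on the live set being at least as current as the directory listing, which is exactly what's in question below.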

But, suspiciously, this runs after we may already have scheduled asynchronous flushes and compactions:

pebble/open.go, lines 515 to 535 at a5edd16:

// Register with the CompactionScheduler before calling
// d.maybeScheduleFlush, since completion of the flush can trigger
// compactions.
d.compactionScheduler.Register(2, d)
if !d.opts.ReadOnly {
	d.maybeScheduleFlush()
	for d.mu.compact.flushing {
		d.mu.compact.cond.Wait()
	}
	// Create an empty .log file for the mutable memtable.
	newLogNum := d.mu.versions.getNextDiskFileNum()
	d.mu.log.writer, err = d.mu.log.manager.Create(wal.NumWAL(newLogNum), int(jobID))
	if err != nil {
		return nil, err
	}
	// This isn't strictly necessary as we don't use the log number for
	// memtables being flushed, only for the next unflushed memtable.
	d.mu.mem.queue[len(d.mu.mem.queue)-1].logNum = newLogNum
}

Isn't it possible for this scan to find a file that was created by an in-flight flush or compaction, determine it's obsolete and delete it?
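
To make the suspected interleaving concrete, here's a self-contained sketch (hypothetical file names and helpers, not Pebble's actual APIs): a flush's output lands on disk before the scan's notion of live files catches up, so the scan would classify the brand-new sstable as obsolete.

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

func main() {
	dir, _ := os.MkdirTemp("", "open-race")
	defer os.RemoveAll(dir)

	// Live files as of the state the scan consults.
	live := map[string]bool{"000001.sst": true}
	os.WriteFile(filepath.Join(dir, "000001.sst"), nil, 0o644)

	var flush sync.WaitGroup
	flush.Add(1)
	go func() {
		defer flush.Done()
		// Stand-in for the flush scheduled earlier in Open: its output
		// exists on disk before the scan learns about it.
		os.WriteFile(filepath.Join(dir, "000002.sst"), []byte("flushed"), 0o644)
	}()
	flush.Wait() // Force the bad interleaving deterministically.

	// The scan: the directory listing sees 000002.sst, but the stale
	// live set does not.
	entries, _ := os.ReadDir(dir)
	for _, e := range entries {
		if !live[e.Name()] {
			fmt.Println("would delete in-flight flush output:", e.Name())
		}
	}
}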

Jira issue: PEBBLE-1213

Metadata
Labels

A-storage, C-bug (Something isn't working), P-2 (Issues/test failures with a fix SLA of 3 months), T-storage
