Support Lucene index scrubbing of missing entries #3009
Result of fdb-record-layer-pr on Linux CentOS 7
b754ff7 to 8e23ece
0f8f0d8 to df17ad5
try (final FDBRecordContext context = openContext()) {
    // Write some documents
    dataModel.saveRecords(15, 1007, context, 1);
You only save to group 1. This helps to ensure that you have multiple partitions, but means you aren't testing whether it scrubs all the groups.
✅
@@ -32,13 +37,25 @@
 * the test execution.
 */
public class MockedLuceneIndexMaintainer extends LuceneIndexMaintainer {
This approach is very Lucene-specific. It's sufficient for the test in question, but if you replaced the index maintainer entirely with a NoOp maintainer, the same process could be used for other scrubbing tests with less work.
This isn't needed here, but it's something to keep in mind as you look into additional scrubbing (other index types, or dangling entries).
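A minimal sketch of what the NoOp idea could look like, using a plain in-memory model rather than the Record Layer's index maintainer API. Every name here (`NoOpMaintainerSketch`, `IndexWriter`, `REAL`, `NO_OP`, `save`, `missingEntries`) is hypothetical and chosen only for illustration. The point is that a writer which silently drops index updates leaves records saved without entries, which is the state a "missing entries" scrubbing test needs, regardless of index type.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model; none of these names come from the Record Layer.
final class NoOpMaintainerSketch {
    // Stand-in for the index-writing half of an index maintainer.
    interface IndexWriter {
        void write(Set<Long> indexEntries, long primaryKey);
    }

    // Normal behaviour: every saved record gets an index entry.
    static final IndexWriter REAL = (entries, pk) -> entries.add(pk);
    // Test double: drops the write, so the record ends up without an entry.
    static final IndexWriter NO_OP = (entries, pk) -> { };

    private final Map<Long, String> records = new TreeMap<>();
    private final Set<Long> indexEntries = new HashSet<>();

    // Saves the record, then lets the supplied writer decide whether to index it.
    void save(long primaryKey, String text, IndexWriter writer) {
        records.put(primaryKey, text);
        writer.write(indexEntries, primaryKey);
    }

    // Report-only check: primary keys of records that have no index entry.
    List<Long> missingEntries() {
        List<Long> missing = new ArrayList<>();
        for (Long pk : records.keySet()) {
            if (!indexEntries.contains(pk)) {
                missing.add(pk);
            }
        }
        return missing;
    }
}
```

Saving some records with `REAL` and others with `NO_OP` gives a store where exactly the `NO_OP` records should be reported by `missingEntries()`.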
*/
public class LuceneIndexScrubbingToolsMissing implements IndexScrubbingTools<FDBStoredRecord<Message>> {
public class LuceneIndexScrubbingToolsMissing extends ValueIndexScrubbingToolsMissing {
I think I agree with your comment that extending ValueIndexScrubbingToolsMissing feels wrong.
Perhaps a BaseScrubbingToolsMissing would make sense, although I think the only method that actually should be shared is getCursor, so probably a utility class would make more sense.
Leaving as you have it until we have a third Missing implementation also seems reasonable, as it might also align with better abstracting synthetic records in general for use across scrubbing, indexing and IndexMaintenance.
public CompletableFuture<Pair<MissingIndexReason, Tuple>> detectMissingIndexKeys(FDBStoredRecord<Message> rec) {
// return the first missing (if any).
@SuppressWarnings("PMD.CloseResource")
private CompletableFuture<Pair<MissingIndexReason, Tuple>> detectMissingIndexKeys(final FDBRecordStore store, FDBStoredRecord<Message> rec) {
I didn't notice until I went to look at the similarity between this and ValueIndexScrubbingToolsMissing, but you don't check the index filter before saving. I think that means you will have false-negatives if any index entries are filtered.
It's probably worth adding a boolean field to MyParentRecord, adding an additional parameter to saveRecords to save filtered-out records, and changing the index definition to filter out anything with that field set to true (maybe something like isHidden).
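A minimal sketch of the concern, using plain in-memory collections rather than the Record Layer API. `FilteredScrubSketch`, `Rec`, and `detectMissing` are hypothetical names, and `isHidden` is only the field name floated above. The sketch shows how consulting the index filter changes which records a report-only check flags: a record excluded by the filter has no index entry by design, so a check that ignores the filter classifies it incorrectly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of a "missing entries" check that honours an index filter.
final class FilteredScrubSketch {
    // Hypothetical record with a flag that the index filter uses to exclude it.
    static final class Rec {
        final long primaryKey;
        final boolean isHidden;

        Rec(long primaryKey, boolean isHidden) {
            this.primaryKey = primaryKey;
            this.isHidden = isHidden;
        }
    }

    // Returns primary keys of records that the filter says should be indexed
    // but that have no entry in indexedKeys. Nothing is repaired.
    static List<Long> detectMissing(List<Rec> records,
                                    Set<Long> indexedKeys,
                                    Predicate<Rec> indexFilter) {
        List<Long> missing = new ArrayList<>();
        for (Rec rec : records) {
            if (!indexFilter.test(rec)) {
                continue; // filtered out: no index entry is expected
            }
            if (!indexedKeys.contains(rec.primaryKey)) {
                missing.add(rec.primaryKey);
            }
        }
        return missing;
    }
}
```

Passing `r -> !r.isHidden` as the filter skips hidden records, while passing `r -> true` models a check that ignores the filter and therefore also flags them.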
4b17834 to 0546e22
dataModel.saveRecords(3, 10, context, 1);
dataModel.saveRecords(2, 20, context, 3);
dataModel.saveRecords(5, 20, context, 4);
Does this guarantee that scrubbing scrubs across all partitions?
Probably not. I'll add an explicit merge.
Update:
Added explicit merges.
Added more records.
Reduced partition's high watermark.
c1a9d18 to 8799693
Resolved conflicts
4f275e8 to 016b254
To verify Lucene index validity, support "Report Only" scrubbing for:
Dangling Lucene index entries: iterate "all entries" (similar to LuceneScanAllEntriesTest) and validate that all pointers lead to existing records.
Missing Lucene index entries: iterate all records, validate that their primary keys are represented in the “primary key to Lucene segment” map, and that the Lucene segment exists.
This resolves #3008
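The two checks described above can be sketched with plain in-memory collections. This is an illustration under stated assumptions, not the implementation in this PR: `LuceneScrubSketch`, `findMissing`, `findDangling`, and the reason strings are hypothetical, and the maps and sets stand in for the record store, the "primary key to Lucene segment" map, and the set of live segments. Both methods only report; neither repairs anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical report-only scrubbing checks over in-memory stand-ins.
final class LuceneScrubSketch {
    // Missing entries: iterate all records; each primary key must appear in the
    // primary-key-to-segment map, and the segment it names must exist.
    static List<String> findMissing(Set<Long> recordKeys,
                                    Map<Long, String> pkToSegment,
                                    Set<String> segments) {
        List<String> issues = new ArrayList<>();
        for (Long pk : recordKeys) {
            String segment = pkToSegment.get(pk);
            if (segment == null) {
                issues.add(pk + ": no primary-key mapping");
            } else if (!segments.contains(segment)) {
                issues.add(pk + ": segment " + segment + " does not exist");
            }
        }
        return issues;
    }

    // Dangling entries: iterate all index entries; each must point at a record
    // that still exists.
    static List<Long> findDangling(Iterable<Long> indexEntryKeys, Set<Long> recordKeys) {
        List<Long> dangling = new ArrayList<>();
        for (Long pk : indexEntryKeys) {
            if (!recordKeys.contains(pk)) {
                dangling.add(pk);
            }
        }
        return dangling;
    }
}
```

Returning the findings instead of acting on them mirrors the "Report Only" mode: the caller decides whether to log, count, or fail a test on a non-empty result.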