Skip to content

Bug: #Predicate macro incorrectly fails for ‘final’ classes #1173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
bdkjones opened this issue Feb 14, 2025 · 7 comments
Open

Bug: #Predicate macro incorrectly fails for ‘final’ classes #1173

bdkjones opened this issue Feb 14, 2025 · 7 comments
Assignees

Comments

@bdkjones
Copy link

bdkjones commented Feb 14, 2025

Summary

I've discovered that Foundation's #Predicate fails when a class is marked as final. Here's a very simple test case:

final class Master {
    var id: UUID = UUID()
}

let u = UUID()
let p = #Predicate<Master>{ $0.id == u }

That expands to:

Foundation.Predicate<Master>({
    PredicateExpressions.build_Equal(
        lhs: PredicateExpressions.build_KeyPath(
            root: PredicateExpressions.build_Arg($0),
            keyPath: \.id
        ),
        rhs: PredicateExpressions.build_Arg(u)
    )
})

It compiles correctly, but at runtime crashes on the build_KeyPath line: Thread 1: Fatal error: Predicate does not support keypaths with multiple components

This is obviously not correct; there is only one component in the KeyPath. If you remove final from the class declaration, it works fine.

I see where this check exists in Foundation (

func _validateForPredicateUsage(restrictArguments: Bool = false) {
var ptr = unsafeBitCast(self, to: UnsafeRawPointer.self)
ptr = ptr.advanced(by: Self.WORD_SIZE * 3) // skip isa, type metadata, and KVC string pointers
let header = ptr.load(as: UInt32.self)
ptr = ptr.advanced(by: Self.WORD_SIZE)
let firstComponentHeader = ptr.load(as: UInt32.self)
switch firstComponentHeader._keyPathComponentHeader_kind {
case 1: // struct/tuple/self stored property
fallthrough
case 3: // class stored property
// Key paths to stored properties are only single-component if MemoryLayout.offset(of:) returns an offset
func project<T>(_: T.Type) -> Bool {
_keyPathOffset(T.self, self) == nil
}
if _openExistential(Self.rootType, do: project) {
fatalError("Predicate does not support keypaths with multiple components")
}
case 2: // computed
var componentWords = 3
if firstComponentHeader._keyPathComponentHeader_computedIsSettable {
componentWords += 1
}
if firstComponentHeader._keyPathComponentHeader_computedHasArguments {
if restrictArguments {
fatalError("Predicate does not support keypaths with arguments")
}
let capturesSize = ptr.advanced(by: Self.WORD_SIZE * componentWords).load(as: UInt.self)
componentWords += 2 + (Int(capturesSize) / Self.WORD_SIZE)
}
if header._keyPathHeader_bufferSize > (Self.WORD_SIZE * componentWords) {
fatalError("Predicate does not support keypaths with multiple components")
}
case 4: // optional chain
fatalError("Predicate does not support keypaths with optional chaining/unwrapping")
default: // unknown keypath component
fatalError("Predicate does not support this type of keypath (\(firstComponentHeader._keyPathComponentHeader_kind))")
}
}
}
) but I don't see why final should affect whether a UUID property has an offset.

Any Value

The behavior is not limited to UUID. I can reproduce it with a String, Int, and other values.

Speculation

I can’t find any information on how the presence of final affects the memory layout of a class such that offset(of:) would fail. I thought maybe the issue affects only types that are capable of storing their value without allocation (i.e. tagged pointers), but again I can’t see how final would affect that.

Intermittent

IF a class has a certain combination of properties, this issue does not always manifest. For example, Predicates ran just fine here:

final class Master
{
    var title: String = “”
    var numbers: [Int] = []
    var isEnabled: Bool = false
    var percentage: Double = 57.5
}

But add var id: UUID = UUID() and that fails until final is removed. Given the offset(of:) check, I figured the issue might have something to do with the exact way properties are aligned and packed into the storage for the class. But that’s just a guess.

Environment

Xcode 16.2.

Since #Predicate isn’t available in Playgrounds, I created a new project from the “command line application” template in Xcode. I changed no build settings.

@jmschonfeld jmschonfeld self-assigned this Feb 14, 2025
@jmschonfeld
Copy link
Contributor

I think this and #1174 are both a symptom of MemoryLayout.offset(of:) not being a suitable check for single component key paths as it can fail for unrelated reasons (or pass for multi-component key paths). It works most of the time for structs but it fails incorrectly for some classes and passes incorrectly for nested structs. We need to transition away from offset(of:) and instead perform a robust check - I started using offset(of:) because the pre-existing bit-checking code was fragile in the face of changes in the stdlib but I think that's still more reliable than offset(of:).

The real solution is that now that swiftlang/swift#70451 is merged, we should start using the _SwiftKeyPathBufferHeader_IsSingleComponentFlag flag to validate this. However, that's only landed on main so it won't be in the runtime until that version of Swift is released. We'll need to switch to something else in the meantime - probably by inspecting the bytes of the keypath manually again

@bdkjones
Copy link
Author

@jmschonfeld Ah, that might explain why the behavior is intermittent depending on how many (and what kind) of properties a class has. Glad to hear there's an idea for a workaround.

@bdkjones
Copy link
Author

Slightly unrelated, but is it possible to consider a design that eliminates this restriction altogether? That is, could Predicate be expanded to handle multi-component KeyPaths and even Optional chaining?

Context:

MongoDB recently killed Realm, which is why I’m now forced to build a new data framework on top of Couchbase, which offers a multi-user, offline-first sync.

Realm’s “Swift Query Syntax” was magical. It was concise and allowed easily navigating relationships. The closures below got rewritten to NSPredicates. The code is open source on GitHub. An example query across relationships:

let objects: [Foo] = someRealm.objects(Foo.self, where: { $0.parentObject?.nextParent?.someProperty == “blah” }).sortedBy({ $0.title, .ascending })

This would be a LOT more work than merely fixing the offset(of:) issue, but it would leave Foundation with an extremely natural, capable predicate syntax.

@jmschonfeld
Copy link
Contributor

Slightly unrelated, but is it possible to consider a design that eliminates this restriction altogether?

In a future world where we have a lot more APIs around KeyPath, potentially, but not today. The main problems with allowing multi-component key paths or key paths with extra components like optional chaining are:

  • Serialization: the PredicateCodableConfiguration type which lists all allowed key paths to be serialized finds the identifier to serialize for a keypath using KeyPath's Equatable conformance. This works well when all key paths must be a single component and are only property accessors, but breaks when multiple components exist. For example, the configuration might have allowed the \Employee.officeAddress and the \Address.streetNumber key paths, but if the predicate itself contained \Employee.officeAddress.streetNumber then serialization would fail since that keypath wasn't allowed even if its components were
  • Conversion to alternate formats: similar to the serialization issue, when walking a predicate tree and encountering a PredicateExpressions.KeyPath, there is little you can do with a multi-component keypath since you can't compare it to known key paths (by using the Equatable conformance) and Foundation wouldn't be able to expose APIs like the kind property since a single KeyPath can represent both the property referencing an array and the count of the array itself

If in the future we had robust APIs for enumerating the components of a keypath and splitting a keypath into multiple key paths representing each component then perhaps it could be possible since then the implementation of Predicate could break key paths down into single-component key paths where necessary. But without those APIs available to both Foundation and third party developers that walk predicate trees, multi-component key paths become a bit too problematic when doing anything other than in-memory evaluation 🙁

@bdkjones
Copy link
Author

@jmschonfeld Ah; makes sense. I've likewise found KeyPath to be frustratingly limited when I need to do dynamic things with it (other than parsing PredicateExpressions, which is a recent rabbit hole for me.)

I guess replacing offset(of:) will have to suffice!

@bdkjones
Copy link
Author

bdkjones commented Feb 20, 2025

I'm not sure what defines a "multi-component" KeyPath. For example, I was surprised to see that this works:

class Alpha {
    var id: UUID = UUID()
    var child: Bravo = Bravo()
}

class Bravo {
    var id: UUID = UUID()
}


let someID: UUID = UUID()

// No Problem:
let x = #Predicate<Alpha>{ $0.child.id == someID }

The predicate ran fine, filtered a Collection, etc. But before I build a whole framework on top of this, I'd like to verify that this is indeed valid and not just a false-negative from offset(of:), as described above. In other words, it's not going to suddenly be an error when this bug is fixed?

Or is "multi-component" something more convoluted like: someFoo[keyPath: \Person.pets][0][keyPath: \Pet.name]

@jmschonfeld
Copy link
Contributor

That predicate is indeed valid, but that's due to how the predicate macro handles key paths. Rather than turning $0.child.id into a single KeyPath operator with \.child.id, the #Predicate macro actually turns this into two keypath operators: one which applies the \.child keypath to the $0 variable, and the other which takes that keypath operator as an input and applies the second \.id keypath. In that case, each individual keypath is indeed a single component and so that way when converting key paths you still only need to worry about \Alpha.child and \Bravo.id instead of also worrying about multi-component key paths like \Alpha.child.id. If you expand the macro, you should see it expanding to two calls to build_KeyPath which is how this still results in all single-component key paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants