Commit e2fa131

mdznr, xwu, and natecook1000 authored
Add partitioned(by:) (#152)

* Add `partitioned(_:)`

  `partitioned(_:)` works like `filter(_:)`, but also returns the excluded elements by returning a tuple of two `Array`s

* For collections with fewer than 8 elements, use the `Sequence`-based implementation

  This constant was determined using benchmarking. More information: #152 (comment)

* Remove check for collections fewer than 8 elements
* Make `_partitioned` `internal`
* Prefer `Array` over `ContiguousArray`
* Document `partitioned(_:)` on `Collection`
* Remove `partitioned(upTo:)`
* Remove `_tupleMap`
* Remove `_partitioned` and use it inline (since it’s no longer used by the `Collection` implementation)
* Remove unnecessary conversion of `Array` to `Array`
* Correct indentation

  Co-authored-by: Xiaodi Wu <[email protected]>

* Consistent syntax

  Co-authored-by: Xiaodi Wu <[email protected]>

* Add an external `by:` label to `partitioned`
* Add labels to returned tuple `falseElements`, `trueElements`
* Correct function signature
* Rename `belongsInSecondCollection` parameter name to simply `predicate`

  The parameter name was potentially confusing. Unlike the other `partition` functions, this function can rely on its named tuple to clarify its behavior.

* Update copyright information

  Co-authored-by: Nate Cook <[email protected]>

* Update documentation

  Co-authored-by: Nate Cook <[email protected]>

* Update comment
* Add a precondition to ensure that the count matches up with the number of actual elements found while iterating

Co-authored-by: Xiaodi Wu <[email protected]>
Co-authored-by: Nate Cook <[email protected]>

1 parent aeead5a commit e2fa131

4 files changed, +196 -2 lines changed

Guides/Partition.md

Lines changed: 24 additions & 1 deletion
@@ -42,6 +42,20 @@ let p = numbers.partitioningIndex(where: { $0.isMultiple(of: 20) })
 // numbers[p...] = [20, 40, 60]
 ```
 
+The standard library’s existing `filter(_:)` method provides functionality to
+get the elements that do match a given predicate. `partitioned(by:)` returns
+both the elements that match the predicate as well as those that don’t, as a
+tuple.
+
+```swift
+let cast = ["Vivien", "Marlon", "Kim", "Karl"]
+let (longNames, shortNames) = cast.partitioned(by: { $0.count < 5 })
+print(longNames)
+// Prints "["Vivien", "Marlon"]"
+print(shortNames)
+// Prints "["Kim", "Karl"]"
+```
+
 ## Detailed Design
 
 All mutating methods are declared as extensions to `MutableCollection`.
@@ -69,11 +83,17 @@ extension Collection {
         where belongsInSecondPartition: (Element) throws -> Bool
     ) rethrows -> Index
 }
+
+extension Sequence {
+    public func partitioned(
+        by predicate: (Element) throws -> Bool
+    ) rethrows -> (falseElements: [Element], trueElements: [Element])
+}
 ```
 
 ### Complexity
 
-The existing partition is an O(_n_) operations, where _n_ is the length of the
+The existing partition is an O(_n_) operation, where _n_ is the length of the
 range to be partitioned, while the stable partition is O(_n_ log _n_). Both
 partitions have algorithms with improved performance for bidirectional
 collections, so it would be ideal for those to be customization points were they
@@ -82,6 +102,9 @@ to eventually land in the standard library.
 `partitioningIndex(where:)` is a slight generalization of a binary search, and
 is an O(log _n_) operation for random-access collections; O(_n_) otherwise.
 
+`partitioned(by:)` is an O(_n_) operation, where _n_ is the number of elements
+in the original sequence.
+
 ### Comparison with other languages
 
 **C++:** The `<algorithm>` library defines `partition`, `stable_partition`, and
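
The guide describes `partitioned(by:)` as `filter(_:)` that also keeps the rejected elements. A minimal sketch of that relationship follows. `partitionedSketch(by:)` is a hypothetical stand-in that copies the `Sequence` signature from the guide so the snippet compiles without the Algorithms package; it is not the package's code, and `isShort` is an illustrative name.

```swift
// Hypothetical stand-in for `partitioned(by:)`; mirrors the signature in the
// guide but is defined here only so the example is self-contained.
extension Sequence {
  func partitionedSketch(
    by predicate: (Element) throws -> Bool
  ) rethrows -> (falseElements: [Element], trueElements: [Element]) {
    var falseElements: [Element] = []
    var trueElements: [Element] = []
    for element in self {
      if try predicate(element) {
        trueElements.append(element)
      } else {
        falseElements.append(element)
      }
    }
    return (falseElements, trueElements)
  }
}

let cast = ["Vivien", "Marlon", "Kim", "Karl"]
let isShort: (String) -> Bool = { $0.count < 5 }

// One pass over the input yields both halves.
let result = cast.partitionedSketch(by: isShort)

// The same halves obtained with two separate `filter(_:)` passes.
let acceptedViaFilter = cast.filter(isShort)
let rejectedViaFilter = cast.filter { !isShort($0) }

print(result.falseElements == rejectedViaFilter)  // Prints "true"
print(result.trueElements == acceptedViaFilter)   // Prints "true"
```

Note the order of the tuple: elements for which the predicate returns `false` come first, which is why the guide's example binds `longNames` before `shortNames`.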

README.md

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ Read more about the package, and the intent behind it, in the [announcement on s
 #### Subsetting operations
 
 - [`compacted()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Compacted.md): Drops the `nil`s from a sequence or collection, unwrapping the remaining elements.
+- [`partitioned(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Partition.md): Returns the elements in a sequence or collection that do and do not match a given predicate.
 - [`randomSample(count:)`, `randomSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection.
 - [`randomStableSample(count:)`, `randomStableSample(count:using:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/RandomSampling.md): Randomly selects a specific number of elements from a collection, preserving their original relative order.
 - [`striding(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Stride.md): Returns every nth element of a collection.

Sources/Algorithms/Partition.swift

Lines changed: 135 additions & 1 deletion
@@ -2,7 +2,7 @@
 //
 // This source file is part of the Swift Algorithms open source project
 //
-// Copyright (c) 2020 Apple Inc. and the Swift project authors
+// Copyright (c) 2020-2021 Apple Inc. and the Swift project authors
 // Licensed under Apache License v2.0 with Runtime Library Exception
 //
 // See https://swift.org/LICENSE.txt for license information
@@ -204,3 +204,137 @@ extension Collection {
   }
 }
 
+//===----------------------------------------------------------------------===//
+// partitioned(by:)
+//===----------------------------------------------------------------------===//
+
+extension Sequence {
+  /// Returns two arrays containing, in order, the elements of the sequence that
+  /// do and don’t satisfy the given predicate.
+  ///
+  /// In this example, `partitioned(by:)` is used to separate the input based on
+  /// whether a name is shorter than five characters:
+  ///
+  ///     let cast = ["Vivien", "Marlon", "Kim", "Karl"]
+  ///     let (longNames, shortNames) = cast.partitioned(by: { $0.count < 5 })
+  ///     print(longNames)
+  ///     // Prints "["Vivien", "Marlon"]"
+  ///     print(shortNames)
+  ///     // Prints "["Kim", "Karl"]"
+  ///
+  /// - Parameter predicate: A closure that takes an element of the sequence as
+  /// its argument and returns a Boolean value indicating whether the element
+  /// should be included in the second returned array. Otherwise, the element
+  /// will appear in the first returned array.
+  ///
+  /// - Returns: Two arrays with all of the elements of the receiver. The
+  /// first array contains all the elements that `predicate` didn’t allow, and
+  /// the second array contains all the elements that `predicate` allowed.
+  ///
+  /// - Complexity: O(*n*), where *n* is the length of the sequence.
+  @inlinable
+  public func partitioned(
+    by predicate: (Element) throws -> Bool
+  ) rethrows -> (falseElements: [Element], trueElements: [Element]) {
+    var lhs = [Element]()
+    var rhs = [Element]()
+
+    for element in self {
+      if try predicate(element) {
+        rhs.append(element)
+      } else {
+        lhs.append(element)
+      }
+    }
+
+    return (lhs, rhs)
+  }
+}
+
+extension Collection {
+  /// Returns two arrays containing, in order, the elements of the collection
+  /// that do and don’t satisfy the given predicate.
+  ///
+  /// In this example, `partitioned(by:)` is used to separate the input based on
+  /// whether a name is shorter than five characters.
+  ///
+  ///     let cast = ["Vivien", "Marlon", "Kim", "Karl"]
+  ///     let (longNames, shortNames) = cast.partitioned(by: { $0.count < 5 })
+  ///     print(longNames)
+  ///     // Prints "["Vivien", "Marlon"]"
+  ///     print(shortNames)
+  ///     // Prints "["Kim", "Karl"]"
+  ///
+  /// - Parameter predicate: A closure that takes an element of the collection
+  /// as its argument and returns a Boolean value indicating whether the element
+  /// should be included in the second returned array. Otherwise, the element
+  /// will appear in the first returned array.
+  ///
+  /// - Returns: Two arrays with all of the elements of the receiver. The
+  /// first array contains all the elements that `predicate` didn’t allow, and
+  /// the second array contains all the elements that `predicate` allowed.
+  ///
+  /// - Complexity: O(*n*), where *n* is the length of the collection.
+  @inlinable
+  public func partitioned(
+    by predicate: (Element) throws -> Bool
+  ) rethrows -> (falseElements: [Element], trueElements: [Element]) {
+    guard !self.isEmpty else {
+      return ([], [])
+    }
+
+    // Since collections have known sizes, we can allocate one array of size
+    // `self.count`, then insert items at the beginning or end of that contiguous
+    // block. This way, we don’t have to do any dynamic array resizing. Since we
+    // insert the right elements on the right side in reverse order, we need to
+    // reverse them back to the original order at the end.
+
+    let count = self.count
+
+    // Inside of the `initializer` closure, we set what the actual mid-point is.
+    // We will use this to partition the single array into two.
+    var midPoint: Int = 0
+
+    let elements = try [Element](
+      unsafeUninitializedCapacity: count,
+      initializingWith: { buffer, initializedCount in
+        var lhs = buffer.baseAddress!
+        var rhs = lhs + buffer.count
+        do {
+          for element in self {
+            if try predicate(element) {
+              rhs -= 1
+              rhs.initialize(to: element)
+            } else {
+              lhs.initialize(to: element)
+              lhs += 1
+            }
+          }
+
+          precondition(lhs == rhs, """
+            Collection's `count` differed from the number of elements iterated.
+            """
+          )
+
+          let rhsIndex = rhs - buffer.baseAddress!
+          buffer[rhsIndex...].reverse()
+          initializedCount = buffer.count
+
+          midPoint = rhsIndex
+        } catch {
+          let lhsCount = lhs - buffer.baseAddress!
+          let rhsCount = (buffer.baseAddress! + buffer.count) - rhs
+          buffer.baseAddress!.deinitialize(count: lhsCount)
+          rhs.deinitialize(count: rhsCount)
+          throw error
+        }
+      })
+
+    let lhs = elements[..<midPoint]
+    let rhs = elements[midPoint...]
+    return (
+      Array(lhs),
+      Array(rhs)
+    )
+  }
+}
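
The `Collection` overload fills a single buffer from both ends and then reverses the back half. A simplified sketch of that idea follows, assuming an `Array` input and a non-throwing predicate, so the `catch` cleanup path from the commit is not needed. `twoEndedSplit(_:where:)` and its local names are hypothetical and not part of the commit.

```swift
// Hypothetical helper illustrating the two-ended fill; simplified from the
// commit's `Collection` overload (Array input, non-throwing predicate).
func twoEndedSplit<Element>(
  _ input: [Element],
  where predicate: (Element) -> Bool
) -> (falseElements: [Element], trueElements: [Element]) {
  guard !input.isEmpty else { return ([], []) }

  // Set inside the closure: index where the `true` elements begin.
  var midPoint = 0

  let combined = [Element](unsafeUninitializedCapacity: input.count) {
    buffer, initializedCount in
    let base = buffer.baseAddress!
    var front = 0             // next free slot counting up from the start
    var back = buffer.count   // one past the next free slot counting down

    for element in input {
      if predicate(element) {
        back -= 1
        base.advanced(by: back).initialize(to: element)
      } else {
        base.advanced(by: front).initialize(to: element)
        front += 1
      }
    }

    // Elements written from the back arrive in reverse order; restore it.
    buffer[back...].reverse()
    midPoint = back
    initializedCount = buffer.count
  }

  return (Array(combined[..<midPoint]), Array(combined[midPoint...]))
}

let split = twoEndedSplit([1, 2, 3, 4, 5, 6], where: { $0.isMultiple(of: 2) })
print(split.falseElements)  // Prints "[1, 3, 5]"
print(split.trueElements)   // Prints "[2, 4, 6]"
```

Because every input element lands in exactly one slot, `front` and `back` meet once the loop finishes; the commit checks this with a `precondition` to guard against collections whose `count` disagrees with iteration.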

Tests/SwiftAlgorithmsTests/PartitionTests.swift

Lines changed: 36 additions & 0 deletions
@@ -133,4 +133,40 @@ final class PartitionTests: XCTestCase {
       }
     }
   }
+
+  func testPartitionedWithEmptyInput() {
+    let input: [Int] = []
+
+    let s0 = input.partitioned(by: { _ in return true })
+
+    XCTAssertTrue(s0.0.isEmpty)
+    XCTAssertTrue(s0.1.isEmpty)
+  }
+
+  /// Test the example given in the `partitioned(by:)` documentation
+  func testPartitionedExample() throws {
+    let cast = ["Vivien", "Marlon", "Kim", "Karl"]
+    let (longNames, shortNames) = cast.partitioned(by: { $0.count < 5 })
+    XCTAssertEqual(longNames, ["Vivien", "Marlon"])
+    XCTAssertEqual(shortNames, ["Kim", "Karl"])
+  }
+
+  func testPartitionedWithPredicate() throws {
+    let s0 = ["A", "B", "C", "D"].partitioned(by: { $0 == $0.lowercased() })
+    let s1 = ["a", "B", "C", "D"].partitioned(by: { $0 == $0.lowercased() })
+    let s2 = ["a", "B", "c", "D"].partitioned(by: { $0 == $0.lowercased() })
+    let s3 = ["a", "B", "c", "d"].partitioned(by: { $0 == $0.lowercased() })
+
+    XCTAssertEqual(s0.0, ["A", "B", "C", "D"])
+    XCTAssertEqual(s0.1, [])
+
+    XCTAssertEqual(s1.0, ["B", "C", "D"])
+    XCTAssertEqual(s1.1, ["a"])
+
+    XCTAssertEqual(s2.0, ["B", "D"])
+    XCTAssertEqual(s2.1, ["a", "c"])
+
+    XCTAssertEqual(s3.0, ["B"])
+    XCTAssertEqual(s3.1, ["a", "c", "d"])
+  }
 }
