Commit 9836a4f

[Chunked] Add chunks(of:) variant that divides a collection in chunks of a given size (#54)
2 parents 5b7993c + ad8d6ed commit 9836a4f

4 files changed: +398 -7 lines changed

Guides/Chunked.md

+24 -6
@@ -4,7 +4,10 @@
 [Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/ChunkedTests.swift)]
 
 Break a collection into subsequences where consecutive elements pass a binary
-predicate, or where all elements in each chunk project to the same value.
+predicate, or where all elements in each chunk project to the same value.
+
+Also includes a `chunks(ofCount:)` method that breaks a collection into
+subsequences of a given `count`.
 
 There are two variations of the `chunked` method: `chunked(by:)` and
 `chunked(on:)`. `chunked(by:)` uses a binary predicate to test consecutive
@@ -26,17 +29,32 @@ let chunks = names.chunked(on: \.first!)
 // [["David"], ["Kyle", "Karoy"], ["Nate"]]
 ```
 
-These methods are related to the [existing SE proposal][proposal] for chunking a
-collection into subsequences of a particular size, potentially named something
-like `chunked(length:)`. Unlike the `split` family of methods, the entire
-collection is included in the chunked result — joining the resulting chunks
-recreates the original collection.
+The `chunks(ofCount:)` method takes a `count` parameter (required to be > 0) and
+separates the collection into chunks of that count. If the count of the base
+`Collection` is evenly divisible by the `count` parameter, every chunk has a
+count equal to the parameter. Otherwise, the last chunk contains the
+remaining elements.
+
+```swift
+let names = ["David", "Kyle", "Karoy", "Nate"]
+let evenly = names.chunks(ofCount: 2)
+// equivalent to [["David", "Kyle"], ["Karoy", "Nate"]]
+
+let remaining = names.chunks(ofCount: 3)
+// equivalent to [["David", "Kyle", "Karoy"], ["Nate"]]
+```
+
+The `chunks(ofCount:)` method is the one described in the [existing SE proposal][proposal].
+Unlike the `split` family of methods, the entire collection is included in the
+chunked result: joining the resulting chunks recreates the original collection.
 
 ```swift
 c.elementsEqual(c.chunked(...).joined())
 // true
 ```
 
+See the detailed design section of the [proposal][proposal] for more information.
+
 [proposal]: https://github.com/apple/swift-evolution/pull/935
 
 ## Detailed Design
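
The count-based chunking described in the guide change can be approximated without the package. The sketch below is a hypothetical eager helper built only on the standard library; the name `eagerChunks(ofCount:)` is illustrative and not part of swift-algorithms, and unlike the `chunks(ofCount:)` added in this commit it allocates an array of slices instead of returning a lazy `ChunkedByCount` view. It shows the two behaviours the guide describes: the last chunk holds the remainder, and joining the chunks recreates the original collection.

```swift
// Hypothetical stand-in for chunks(ofCount:), assuming an Array base.
// This is NOT the package's implementation.
extension Array {
  func eagerChunks(ofCount count: Int) -> [ArraySlice<Element>] {
    precondition(count > 0, "Cannot chunk with count <= 0!")
    // Walk the chunk start offsets; clamp the final chunk to the array's end.
    return stride(from: 0, to: self.count, by: count).map { start in
      self[start..<Swift.min(start + count, self.count)]
    }
  }
}

let names = ["David", "Kyle", "Karoy", "Nate"]

// 4 elements in chunks of 2: every chunk is full.
let evenly = names.eagerChunks(ofCount: 2).map(Array.init)

// 4 elements in chunks of 3: the last chunk holds the single remaining element.
let remaining = names.eagerChunks(ofCount: 3).map(Array.init)

// Joining the chunks gives back the original elements in order.
let roundTrip = names.elementsEqual(names.eagerChunks(ofCount: 3).joined())
```

`Swift.min` is spelled out because, inside an `Array` extension, an unqualified `min` resolves to the instance method rather than the free function.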

README.md

+1 -1
@@ -34,7 +34,7 @@ Read more about the package, and the intent behind it, in the [announcement on s
 
 #### Other useful operations
 
-- [`chunked(by:)`, `chunked(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes.
+- [`chunked(by:)`, `chunked(on:)`, `chunks(ofCount:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes, or into chunks of a given count.
 - [`indexed()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Indexed.md): Iterate over tuples of a collection's indices and elements.
 - [`trimming(where:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Trim.md): Returns a slice by trimming elements from a collection's start and end.
 
Sources/Algorithms/Chunked.swift

+304
@@ -246,3 +246,307 @@ extension Collection {
     try chunked(on: projection, by: ==)
   }
 }
+
+//===----------------------------------------------------------------------===//
+// chunks(ofCount:)
+//===----------------------------------------------------------------------===//
+
+/// A collection that presents the elements of its base collection
+/// in `SubSequence` chunks of any given count.
+///
+/// A `ChunkedByCount` is a lazy view on the base Collection, but it does not implicitly confer
+/// laziness on algorithms applied to its result. In other words, for ordinary collections `c`:
+///
+/// * `c.chunks(ofCount: 3)` does not create new storage
+/// * `c.chunks(ofCount: 3).map(f)` maps eagerly and returns a new array
+/// * `c.lazy.chunks(ofCount: 3).map(f)` maps lazily and returns a `LazyMapCollection`
+public struct ChunkedByCount<Base: Collection> {
+
+  public typealias Element = Base.SubSequence
+
+  @usableFromInline
+  internal let base: Base
+
+  @usableFromInline
+  internal let chunkCount: Int
+
+  @usableFromInline
+  internal var startUpperBound: Base.Index
+
+  /// Creates a view instance that presents the elements of `base`
+  /// in `SubSequence` chunks of the given count.
+  ///
+  /// - Complexity: O(n)
+  @inlinable
+  internal init(_base: Base, _chunkCount: Int) {
+    self.base = _base
+    self.chunkCount = _chunkCount
+
+    // Compute the upper bound of the first chunk upfront in order
+    // to make `startIndex` an O(1) lookup.
+    self.startUpperBound = _base.index(
+      _base.startIndex, offsetBy: _chunkCount,
+      limitedBy: _base.endIndex
+    ) ?? _base.endIndex
+  }
+}
+
+extension ChunkedByCount: Collection {
+  public struct Index {
+    @usableFromInline
+    internal let baseRange: Range<Base.Index>
+
+    @usableFromInline
+    internal init(_baseRange: Range<Base.Index>) {
+      self.baseRange = _baseRange
+    }
+  }
+
+  /// - Complexity: O(1)
+  @inlinable
+  public var startIndex: Index {
+    Index(_baseRange: base.startIndex..<startUpperBound)
+  }
+  @inlinable
+  public var endIndex: Index {
+    Index(_baseRange: base.endIndex..<base.endIndex)
+  }
+
+  /// - Complexity: O(1)
+  public subscript(i: Index) -> Element {
+    precondition(i < endIndex, "Index out of range")
+    return base[i.baseRange]
+  }
+
+  @inlinable
+  public func index(after i: Index) -> Index {
+    precondition(i < endIndex, "Advancing past end index")
+    let baseIdx = base.index(
+      i.baseRange.upperBound, offsetBy: chunkCount,
+      limitedBy: base.endIndex
+    ) ?? base.endIndex
+    return Index(_baseRange: i.baseRange.upperBound..<baseIdx)
+  }
+}
+
+extension ChunkedByCount.Index: Comparable {
+  @inlinable
+  public static func == (lhs: ChunkedByCount.Index,
+                         rhs: ChunkedByCount.Index) -> Bool {
+    lhs.baseRange.lowerBound == rhs.baseRange.lowerBound
+  }
+
+  @inlinable
+  public static func < (lhs: ChunkedByCount.Index,
+                        rhs: ChunkedByCount.Index) -> Bool {
+    lhs.baseRange.lowerBound < rhs.baseRange.lowerBound
+  }
+}
+
+extension ChunkedByCount:
+  BidirectionalCollection, RandomAccessCollection
+where Base: RandomAccessCollection {
+  @inlinable
+  public func index(before i: Index) -> Index {
+    precondition(i > startIndex, "Advancing past start index")
+
+    var offset = chunkCount
+    if i.baseRange.lowerBound == base.endIndex {
+      let remainder = base.count % chunkCount
+      if remainder != 0 {
+        offset = remainder
+      }
+    }
+
+    let baseIdx = base.index(
+      i.baseRange.lowerBound, offsetBy: -offset,
+      limitedBy: base.startIndex
+    ) ?? base.startIndex
+    return Index(_baseRange: baseIdx..<i.baseRange.lowerBound)
+  }
+}
+
+extension ChunkedByCount {
+  @inlinable
+  public func distance(from start: Index, to end: Index) -> Int {
+    let distance =
+      base.distance(from: start.baseRange.lowerBound,
+                    to: end.baseRange.lowerBound)
+    let (quotient, remainder) =
+      distance.quotientAndRemainder(dividingBy: chunkCount)
+    return quotient + remainder.signum()
+  }
+
+  @inlinable
+  public var count: Int {
+    let (quotient, remainder) =
+      base.count.quotientAndRemainder(dividingBy: chunkCount)
+    return quotient + remainder.signum()
+  }
+
+  @inlinable
+  public func index(
+    _ i: Index, offsetBy offset: Int, limitedBy limit: Index
+  ) -> Index? {
+    guard offset != 0 else { return i }
+    guard limit != i else { return nil }
+
+    if offset > 0 {
+      return limit > i
+        ? offsetForward(i, offsetBy: offset, limit: limit)
+        : offsetForward(i, offsetBy: offset)
+    } else {
+      return limit < i
+        ? offsetBackward(i, offsetBy: offset, limit: limit)
+        : offsetBackward(i, offsetBy: offset)
+    }
+  }
+
+  @inlinable
+  public func index(_ i: Index, offsetBy distance: Int) -> Index {
+    guard distance != 0 else { return i }
+
+    let idx = distance > 0
+      ? offsetForward(i, offsetBy: distance)
+      : offsetBackward(i, offsetBy: distance)
+    guard let index = idx else {
+      fatalError("Out of bounds")
+    }
+    return index
+  }
+
+  @usableFromInline
+  internal func offsetForward(
+    _ i: Index, offsetBy distance: Int, limit: Index? = nil
+  ) -> Index? {
+    assert(distance > 0)
+
+    return makeOffsetIndex(
+      from: i, baseBound: base.endIndex,
+      distance: distance, baseDistance: distance * chunkCount,
+      limit: limit, by: >
+    )
+  }
+
+  // Convenience to compute offset backward base distance.
+  @inline(__always)
+  private func computeOffsetBackwardBaseDistance(
+    _ i: Index, _ distance: Int
+  ) -> Int {
+    if i == endIndex {
+      let remainder = base.count % chunkCount
+      // We have to take it into account when calculating offsets.
+      if remainder != 0 {
+        // Distance "minus" one (at this point distance is negative)
+        // because we need to adjust for the last position, which has
+        // a variable (remainder) number of elements.
+        return ((distance + 1) * chunkCount) - remainder
+      }
+    }
+    return distance * chunkCount
+  }
+
+  @usableFromInline
+  internal func offsetBackward(
+    _ i: Index, offsetBy distance: Int, limit: Index? = nil
+  ) -> Index? {
+    assert(distance < 0)
+    let baseDistance =
+      computeOffsetBackwardBaseDistance(i, distance)
+    return makeOffsetIndex(
+      from: i, baseBound: base.startIndex,
+      distance: distance, baseDistance: baseDistance,
+      limit: limit, by: <
+    )
+  }
+
+  // Helper to compute index(offsetBy:) index.
+  @inline(__always)
+  private func makeOffsetIndex(
+    from i: Index, baseBound: Base.Index, distance: Int, baseDistance: Int,
+    limit: Index?, by limitFn: (Base.Index, Base.Index) -> Bool
+  ) -> Index? {
+    let baseIdx = base.index(
+      i.baseRange.lowerBound, offsetBy: baseDistance,
+      limitedBy: baseBound
+    )
+
+    if let limit = limit {
+      if baseIdx == nil {
+        // If we go past the bounds while advancing forward and the
+        // limit is the `endIndex`, since the computation on base
+        // doesn't take into account the remainder, we have to make
+        // sure that passing the bound was because of the distance
+        // not just because of a remainder. Special casing is less
+        // expensive than always using count (which could be O(n) for
+        // a non-random-access collection base) to compute the base
+        // distance taking remainder into account.
+        if baseDistance > 0 && limit == endIndex {
+          if self.distance(from: i, to: limit) < distance {
+            return nil
+          }
+        } else {
+          return nil
+        }
+      }
+
+      // Checks for the limit.
+      let baseStartIdx = baseIdx ?? baseBound
+      if limitFn(baseStartIdx, limit.baseRange.lowerBound) {
+        return nil
+      }
+    }
+
+    let baseStartIdx = baseIdx ?? baseBound
+    let baseEndIdx = base.index(
+      baseStartIdx, offsetBy: chunkCount, limitedBy: base.endIndex
+    ) ?? base.endIndex
+
+    return Index(_baseRange: baseStartIdx..<baseEndIdx)
+  }
+}
+
+extension Collection {
+  /// Returns a `ChunkedByCount<Self>` view presenting the elements
+  /// in chunks whose count is the given `count` parameter.
+  ///
+  /// - Parameter count: The size of the chunks. If the count of the
+  ///   base `Collection` is evenly divisible by `count`, all the
+  ///   chunks will have a count equal to `count`.
+  ///   Otherwise, the last chunk will contain the remaining elements.
+  ///
+  ///     let c = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+  ///     print(c.chunks(ofCount: 5).map(Array.init))
+  ///     // [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
+  ///
+  ///     print(c.chunks(ofCount: 3).map(Array.init))
+  ///     // [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
+  ///
+  /// - Complexity: O(1) for a random-access base, otherwise O(`count`)
+  @inlinable
+  public func chunks(ofCount count: Int) -> ChunkedByCount<Self> {
+    precondition(count > 0, "Cannot chunk with count <= 0!")
+    return ChunkedByCount(_base: self, _chunkCount: count)
+  }
+}
+
+// Conditional conformances.
+extension ChunkedByCount: Equatable where Base: Equatable {}
+
+// Since we have another stored property of type `Index` on the
+// collection, synthesis of `Hashable` conformance would require
+// a `Base.Index: Hashable` constraint, so we implement the hasher
+// only in terms of `base`. Since the computed index is based on it,
+// it should not make a difference here.
+extension ChunkedByCount: Hashable where Base: Hashable {
+  public func hash(into hasher: inout Hasher) {
+    hasher.combine(base)
+  }
+}
+extension ChunkedByCount.Index: Hashable where Base.Index: Hashable {}
+
+// Lazy conditional conformance.
+extension ChunkedByCount: LazySequenceProtocol
+  where Base: LazySequenceProtocol {}
+extension ChunkedByCount: LazyCollectionProtocol
+  where Base: LazyCollectionProtocol {}