Commit d010286

Merge branch 'master' into master
2 parents ff8ac73 + 0f45497 commit d010286

12 files changed

+267
-50
lines changed

CHANGELOG.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
# Changelog
22
## Next Release
33
* Allow empty sequence expressions `seq()`, `pseq()` (#159)
4+
* Add `no_wrap` option to `head()`, `head_option()`, `first()`, `last()` and `last_option()`, as well as to `seq()`, `pseq()` and `Sequence` constructor
45

56
## Release 1.3.0
67
* added precompute attribute to reverse transformation (#137)
@@ -72,7 +73,7 @@ Reaching `1.0` primarily means that API stability has been reached so I don't ex
7273
* Parallel functions: `map`, `select`, `filter`, `filter_not`, `where`, `flatten`, and `flat_map`
7374
* Compressed file IO support for `gzip`/`lzma`/`bz2` as detailed at https://github.com/EntilZha/PyFunctional/issues/54
7475
* Cartesian product from `itertools.product` implemented as `Pipeline.cartesian`
75-
* Website at [pyfunctional.org](http://www.pyfunctional.org) and docs at [docs.pyfunctional.org](http://docs.pyfunctional.org)
76+
* Website at [pyfunctional.pedro.ai](http://pyfunctional.pedro.ai) and docs at [docs.pyfunctional.pedro.ai](http://docs.pyfunctional.pedro.ai)
7677

7778
### Bug Fixes
7879
* No option for encoding in `to_json` https://github.com/EntilZha/PyFunctional/issues/70

DEVELOPER.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ and that pylint succeeds
99
2. Push commit which is the candidate release to Github master
1010
3. Wait for tests to pass on [TravisCI](https://travis-ci.org/EntilZha/PyFunctional)
1111
4. Wait for coverage to complete at 100% on [Coveralls](https://coveralls.io/github/EntilZha/PyFunctional)
12-
5. Wait for docs to complete building successfully at [docs.pyfunctional.org/en/latest](http://docs.pyfunctional.org/en/latest/)
12+
5. Wait for docs to complete building successfully at [docs.pyfunctional.pedro.ai/en/latest](http://docs.pyfunctional.pedro.ai/en/latest/)
1313

1414
### Testing Local Release
1515
1. Run `docker run -it python bash` for clean python installation

README.md

Lines changed: 44 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
# PyFunctional
22
![Build Status](https://github.com/EntilZha/PyFunctional/workflows/Python%20package/badge.svg)
33
[![Code Coverage](https://codecov.io/gh/EntilZha/PyFunctional/branch/master/graph/badge.svg)](https://codecov.io/gh/EntilZha/PyFunctional)
4-
[![ReadTheDocs](https://readthedocs.org/projects/scalafunctional/badge/?version=latest)](http://docs.pyfunctional.org)
4+
[![ReadTheDocs](https://readthedocs.org/projects/scalafunctional/badge/?version=latest)](http://docs.pyfunctional.pedro.ai)
55
[![PyPI version](https://badge.fury.io/py/PyFunctional.svg)](https://badge.fury.io/py/PyFunctional)
66

77
## Features
@@ -268,7 +268,7 @@ all at once rather than in multiple loops using `multiprocessing`
268268

269269
## Documentation
270270
Shortform documentation is below and full documentation is at
271-
[docs.pyfunctional.org](http://docs.pyfunctional.org/en/latest/functional.html).
271+
[docs.pyfunctional.pedro.ai](http://docs.pyfunctional.pedro.ai/en/latest/functional.html).
272272

273273
### Streams API
274274
All of `PyFunctional` streams can be accessed through the `seq` object. The primary way to create
@@ -313,12 +313,12 @@ seq.sqlite3('filepath', 'select * from data')
313313
```
314314

315315
For more information on the parameters that these functions can take, reference the
316-
[streams documentation](http://docs.pyfunctional.org/en/latest/functional.html#module-functional.streams)
316+
[streams documentation](http://docs.pyfunctional.pedro.ai/en/latest/functional.html#module-functional.streams)
317317

318318
### Transformations and Actions APIs
319319
Below is the complete list of functions which can be called on a stream object from `seq`. For
320320
complete documentation reference
321-
[transformation and actions API](http://docs.pyfunctional.org/en/latest/functional.html#module-functional.pipeline).
321+
[transformation and actions API](http://docs.pyfunctional.pedro.ai/en/latest/functional.html#module-functional.pipeline).
322322

323323
Function | Description | Type
324324
------- | ----------- | ----
@@ -362,10 +362,10 @@ Function | Description | Type
362362
`sorted(key=None, reverse=False)/order_by(func)` | Returns elements sorted according to python `sorted` | transformation
363363
`reverse()` | Returns the reversed sequence | transformation
364364
`slice(start, until)` | Sequence starting at `start` and including elements up to `until` | transformation
365-
`head()` / `first()` | Returns first element in sequence | action
366-
`head_option()` | Returns first element in sequence or `None` if its empty | action
367-
`last()` | Returns last element in sequence | action
368-
`last_option()` | Returns last element in sequence or `None` if its empty | action
365+
`head(no_wrap=None)` / `first(no_wrap=None)` | Returns first element in sequence (if `no_wrap=True`, the result will never be wrapped with `Sequence`) | action
366+
`head_option(no_wrap=None)` | Returns first element in sequence or `None` if it's empty (if `no_wrap=True`, the result will never be wrapped with `Sequence`) | action
367+
`last(no_wrap=None)` | Returns last element in sequence (if `no_wrap=True`, the result will never be wrapped with `Sequence`) | action
368+
`last_option(no_wrap=None)` | Returns last element in sequence or `None` if it's empty (if `no_wrap=True`, the result will never be wrapped with `Sequence`) | action
369369
`len()` / `size()` | Returns length of sequence | action
370370
`count(func)` | Returns count of elements in sequence where `func(element)` is True | action
371371
`empty()` | Returns `True` if the sequence has zero length | action
@@ -397,6 +397,7 @@ Function | Description | Type
397397
`to_pandas(columns=None)` | Converts the sequence to a pandas DataFrame | action
398398
`cache()` | Forces evaluation of sequence immediately and caches the result | action
399399
`for_each(func)` | Executes `func` on each element of the sequence | action
400+
`peek(func)` | Executes `func` on each element of the sequence but returns the element | transformation
400401

401402
### Lazy Execution
402403
Whenever possible, `PyFunctional` will compute lazily. This is accomplished by tracking the list
@@ -411,11 +412,17 @@ undesirable to keep recomputing the same value. Below are some examples of inspe
411412

412413
```python
413414
def times_2(x):
414-
print(x)
415415
return 2 * x
416-
elements = seq(1, 1, 2, 3, 4).map(times_2).distinct()
416+
417+
elements = (
418+
seq(1, 1, 2, 3, 4)
419+
.map(times_2)
420+
.peek(print)
421+
.distinct()
422+
)
423+
417424
elements._lineage
418-
# Lineage: sequence -> map(times_2) -> distinct
425+
# Lineage: sequence -> map(times_2) -> peek(print) -> distinct
419426

420427
l_elements = elements.to_list()
421428
# Prints: 1
@@ -425,7 +432,7 @@ l_elements = elements.to_list()
425432
# Prints: 4
426433

427434
elements._lineage
428-
# Lineage: sequence -> map(times_2) -> distinct -> cache
435+
# Lineage: sequence -> map(times_2) -> peek(print) -> distinct -> cache
429436

430437
l_elements = elements.to_list()
431438
# The cached result is returned so times_2 is not called and nothing is printed
@@ -436,6 +443,30 @@ Files are given special treatment if opened through the `seq.open` and related A
436443
multiple iteration over a single file object while correctly handling iteration termination and
437444
file closing.
438445

446+
### `no_wrap` option
447+
Even though functions like `first()` are supposed to return a single element, if the element is an iterable,
448+
then it is wrapped into a `Sequence`. For instance:
449+
450+
```
451+
>>> a = seq(list(), list())
452+
>>> type(a.first())
453+
<class 'functional.pipeline.Sequence'>
454+
```
455+
456+
That behaviour can be changed with the `no_wrap` option:
457+
458+
```
459+
>>> type(a.first(no_wrap=True))
460+
<class 'list'>
461+
```
462+
463+
The option is also accepted by `seq()`/`pseq()` as well as by the `Sequence()` constructor, for example:
464+
465+
```
466+
>>> type(seq([list(), list()], no_wrap=True).last())
467+
<class 'list'>
468+
```
469+
439470
## Road Map Idea
440471
* SQL based query planner and interpreter
441472
* `_` lambda operator
@@ -472,7 +503,7 @@ In order to be merged, all pull requests must:
472503
To learn more about me (the author) visit my webpage at
473504
[pedro.ai](https://www.pedro.ai).
474505

475-
I created `PyFunctional` while using Python extensivel, and finding that I missed the
506+
I created `PyFunctional` while using Python extensively, and finding that I missed the
476507
ease of use for manipulating data that Spark RDDs and Scala collections have. The project takes the
477508
best ideas from these APIs as well as LINQ to provide an easy way to manipulate data when using
478509
Scala is not an option or PySpark is overkill.
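
The README hunks above describe `peek(func)` as a transformation that runs `func` on each element and passes the element through. A minimal stand-alone sketch of that idea, assuming a plain generator is a fair model of a lazy transformation, might look like the following. The `peek` function here is a hypothetical stand-in written for illustration, not PyFunctional's implementation, and `dict.fromkeys` is used only as an order-preserving substitute for `distinct()`.

```python
def peek(func, iterable):
    """Hypothetical stand-in for a peek transformation: call func on each
    element for its side effect, then yield the element unchanged."""
    for element in iterable:
        func(element)
        yield element


def times_2(x):
    return 2 * x


seen = []  # records what the peek callback observes

# Building the pipeline consumes nothing, because map and peek are both lazy.
pipeline = peek(seen.append, map(times_2, [1, 1, 2, 3, 4]))
assert seen == []

# dict.fromkeys is an order-preserving stand-in for distinct() (assumption).
result = list(dict.fromkeys(pipeline))
```

In this sketch the callback sits after `map(times_2)`, so it observes the doubled values, duplicates included (`seen` ends up as `[2, 2, 4, 6, 8]`), while `result` holds the de-duplicated `[2, 4, 6, 8]`. Where a peek step is placed in a chain therefore determines which intermediate values it reports.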

docs/index.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ PyFunctional Documentation
77
=============================
88

99
Welcome to the `PyFunctional` documentation. For a tutorial of how to use this package you should
10-
visit `pyfunctional.org <http://www.pyfunctional.org/>`_.
10+
visit `pyfunctional.pedro.ai <http://pyfunctional.pedro.ai/>`_.
1111

1212
The documentation on this site should primarily be used as an API reference. The Streams
1313
documentation covers ways to read data into `PyFunctional` while the Transformations and Actions

functional/pipeline.py

Lines changed: 46 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@
2020
is_namedtuple,
2121
is_tabulatable,
2222
identity,
23+
default_value,
2324
)
2425
from functional.io import WRITE_MODE, universal_write_open
2526
from functional import transformations
@@ -32,7 +33,9 @@ class Sequence(object):
3233
functional transformations and reductions in a data pipeline style
3334
"""
3435

35-
def __init__(self, sequence, transform=None, engine=None, max_repr_items=None):
36+
def __init__(
37+
self, sequence, transform=None, engine=None, max_repr_items=None, no_wrap=None
38+
):
3639
# pylint: disable=protected-access
3740
"""
3841
Takes a Sequence, list, tuple, or iterable sequence and wraps it around a Sequence object.
@@ -43,6 +46,7 @@ def __init__(self, sequence, transform=None, engine=None, max_repr_items=None):
4346
:param transform: transformation to apply
4447
:param engine: execution engine
4548
:param max_repr_items: maximum number of items to print with repr
49+
:param no_wrap: default value of no_wrap for functions like first() or last()
4650
:return: sequence wrapped in a Sequence
4751
"""
4852
self.engine = engine or ExecutionEngine()
@@ -58,6 +62,7 @@ def __init__(self, sequence, transform=None, engine=None, max_repr_items=None):
5862
raise TypeError("Given sequence must be an iterable value")
5963
if transform is not None:
6064
self._lineage.apply(transform)
65+
self.no_wrap = no_wrap
6166

6267
def __iter__(self):
6368
"""
@@ -164,9 +169,9 @@ def __add__(self, other):
164169
:return: concatenated sequence with other
165170
"""
166171
if isinstance(other, Sequence):
167-
return Sequence(self.sequence + other.sequence)
172+
return Sequence(self.sequence + other.sequence, no_wrap=self.no_wrap)
168173
else:
169-
return Sequence(self.sequence + other)
174+
return Sequence(self.sequence + other, no_wrap=self.no_wrap)
170175

171176
def _evaluate(self):
172177
"""
@@ -185,9 +190,9 @@ def _transform(self, *transforms):
185190
sequence = None
186191
for transform in transforms:
187192
if sequence:
188-
sequence = Sequence(sequence, transform=transform)
193+
sequence = Sequence(sequence, transform=transform, no_wrap=self.no_wrap)
189194
else:
190-
sequence = Sequence(self, transform=transform)
195+
sequence = Sequence(self, transform=transform, no_wrap=self.no_wrap)
191196
return sequence
192197

193198
@property
@@ -220,7 +225,7 @@ def cache(self, delete_lineage=False):
220225
self._lineage = Lineage(engine=self.engine)
221226
return self
222227

223-
def head(self):
228+
def head(self, no_wrap=None):
224229
"""
225230
Returns the first element of the sequence.
226231
@@ -234,11 +239,15 @@ def head(self):
234239
...
235240
IndexError: list index out of range
236241
242+
:param no_wrap: If set to True, the returned value will never be wrapped with Sequence
237243
:return: first element of sequence
238244
"""
239-
return _wrap(self.take(1)[0])
245+
if default_value(no_wrap, self.no_wrap, False):
246+
return self.sequence[0]
247+
else:
248+
return _wrap(self.take(1)[0])
240249

241-
def first(self):
250+
def first(self, no_wrap=None):
242251
"""
243252
Returns the first element of the sequence.
244253
@@ -252,11 +261,12 @@ def first(self):
252261
...
253262
IndexError: list index out of range
254263
264+
:param no_wrap: If set to True, the returned value will never be wrapped with Sequence
255265
:return: first element of sequence
256266
"""
257-
return self.head()
267+
return self.head(no_wrap=no_wrap)
258268

259-
def head_option(self):
269+
def head_option(self, no_wrap=None):
260270
"""
261271
Returns the first element of the sequence or None, if the sequence is empty.
262272
@@ -266,13 +276,14 @@ def head_option(self):
266276
>>> seq([]).head_option()
267277
None
268278
279+
:param no_wrap: If set to True, the returned value will never be wrapped with Sequence
269280
:return: first element of sequence or None if sequence is empty
270281
"""
271282
if not self.sequence:
272283
return None
273-
return self.head()
284+
return self.head(no_wrap=no_wrap)
274285

275-
def last(self):
286+
def last(self, no_wrap=None):
276287
"""
277288
Returns the last element of the sequence.
278289
@@ -286,11 +297,15 @@ def last(self):
286297
...
287298
IndexError: list index out of range
288299
300+
:param no_wrap: If set to True, the returned value will never be wrapped with Sequence
289301
:return: last element of sequence
290302
"""
291-
return _wrap(self.sequence[-1])
303+
if default_value(no_wrap, self.no_wrap, False):
304+
return self.sequence[-1]
305+
else:
306+
return _wrap(self.sequence[-1])
292307

293-
def last_option(self):
308+
def last_option(self, no_wrap=None):
294309
"""
295310
Returns the last element of the sequence or None, if the sequence is empty.
296311
@@ -300,11 +315,12 @@ def last_option(self):
300315
>>> seq([]).last_option()
301316
None
302317
318+
:param no_wrap: If set to True, the returned value will never be wrapped with Sequence
303319
:return: last element of sequence or None if sequence is empty
304320
"""
305321
if not self.sequence:
306322
return None
307-
return self.last()
323+
return self.last(no_wrap=no_wrap)
308324

309325
def init(self):
310326
"""
@@ -546,6 +562,21 @@ def for_each(self, func):
546562
for e in self:
547563
func(e)
548564

565+
def peek(self, func):
566+
"""
567+
Executes func on each element of the sequence and returns the element
568+
569+
>>> seq([1, 2, 3, 4]).peek(print).map(lambda x: x ** 2).to_list()
570+
1
571+
2
572+
3
573+
4
574+
[1, 4, 9, 16]
575+
576+
:param func: function to execute
577+
"""
578+
return self._transform(transformations.peek_t(func))
579+
549580
def filter(self, func):
550581
"""
551582
Filters sequence to include only elements where func is True.
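
The `head()` and `last()` hunks above resolve the effective setting with `default_value(no_wrap, self.no_wrap, False)`. The helper's body is not part of this diff, so the sketch below assumes it returns its first argument that is not `None`. `MiniSequence`, `Wrapped` and `wrap_if_list` are hypothetical names introduced only to make the precedence visible; they are simplified stand-ins, not the library's `Sequence` or `_wrap`.

```python
def default_value(*values):
    """Assumed semantics: return the first argument that is not None."""
    for value in values:
        if value is not None:
            return value
    return None


class Wrapped(list):
    """Hypothetical stand-in for Sequence, so that wrapping is observable."""


def wrap_if_list(value):
    # Simplified stand-in for _wrap: only lists are wrapped in this sketch.
    return Wrapped(value) if isinstance(value, list) else value


class MiniSequence:
    def __init__(self, items, no_wrap=None):
        self.items = list(items)
        self.no_wrap = no_wrap  # instance-level default, as in the diff

    def first(self, no_wrap=None):
        # Per-call argument wins, then the instance default, then False.
        if default_value(no_wrap, self.no_wrap, False):
            return self.items[0]
        return wrap_if_list(self.items[0])


nested = MiniSequence([[1, 2], [3]])
unwrapped_by_default = MiniSequence([[1, 2], [3]], no_wrap=True)

default_type = type(nested.first())                          # Wrapped
per_call_type = type(nested.first(no_wrap=True))             # list
instance_type = type(unwrapped_by_default.first())           # list
override_type = type(unwrapped_by_default.first(no_wrap=False))  # Wrapped
```

Under these assumptions an explicit `no_wrap=False` at the call site overrides an instance created with `no_wrap=True`, because `False` is not `None` and is therefore selected first. That is the reason the parameters default to `None` rather than `False`.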
