feat!: Move all feature extractors into namespace only + speedup Lempel Ziv #113

Merged: 11 commits, Nov 7, 2023
1 change: 1 addition & 0 deletions Cargo.toml
@@ -18,3 +18,4 @@ faer = {version = "0.14.1", features = ["ndarray"]}
ndarray = "0.15.6"
numpy = "0.20.0"
serde = {version = "1.0.190", features=["derive"]}
hashbrown = "0.14.2"
6 changes: 3 additions & 3 deletions README.md
@@ -52,7 +52,7 @@ pip install "functime[llm,lgb]"
```python
import polars as pl
from functime.cross_validation import train_test_split
from functime.feature_extraction import add_fourier_terms
from functime.seasonality import add_fourier_terms
from functime.forecasting import linear_model
from functime.preprocessing import scale
from functime.metrics import mase
@@ -101,9 +101,9 @@ View the full walkthrough on forecasting [here](https://docs.functime.ai/forecas
### Feature Extraction

`functime` comes with 100+ [time-series feature extractors](https://docs.functime.ai/feature-extraction/).
These features are easily accessible via our custom `ts` (time-series) namespace on any `Polars` Series or expression.
Every feature is easily accessible via `functime`'s custom `ts` (time-series) namespace, which works with any `Polars` Series or expression. To register the custom `ts` `Polars` namespace, you must first import `functime` in your module.

To register the custom `ts` `Polars` namespace, you must first import `functime` in your module!
To register the custom `ts` `Polars` namespace, you must first import `functime`!

```python
import polars as pl
2 changes: 1 addition & 1 deletion docs/code/quickstart.py
@@ -4,10 +4,10 @@
import polars as pl

from functime.cross_validation import train_test_split
from functime.feature_extraction import add_fourier_terms
from functime.forecasting import auto_linear_model, linear_model, naive, snaive
from functime.metrics import smape
from functime.preprocessing import scale
from functime.seasonality import add_fourier_terms

start_time = default_timer()

62 changes: 61 additions & 1 deletion docs/feature-extraction.md
@@ -5,4 +5,64 @@
Check out the [API reference](ref/feature-extraction.md) for a list of supported feature extractors.

## Usage Examples
WIP

Every feature is easily accessible via `functime`'s custom `ts` (time-series) namespace, which works with any `Polars` Series or expression. To register the custom `ts` `Polars` namespace, you must first import `functime`.

```python
import polars as pl
import numpy as np
import functime  # importing functime registers the `ts` namespace

# Load commodities price data
y = pl.read_parquet("https://github.com/neocortexdb/functime/raw/main/data/commodities.parquet")

# Get column names ("commodity_type", "time", "price")
entity_col, time_col, value_col = y.columns

# Extract a single feature from a single time-series
binned_entropy = (
    pl.Series(np.random.normal(0, 1, size=10))
    .ts.binned_entropy(bin_count=10)
)

# 🔥 Also works on LazyFrames with query optimization
features = (
    pl.LazyFrame({
        "index": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        "value": np.random.normal(0, 1, size=10)
    })
    .select(
        pl.col("value").ts.binned_entropy(bin_count=10),
        pl.col("value").ts.lempel_ziv_complexity(threshold=3),
        pl.col("value").ts.longest_streak_above_mean(),
    )
)

# 🚄 Extract features blazingly fast on many
# stacked time-series using `group_by`
features = (
    y.group_by(entity_col)
    .agg(
        pl.col(value_col).ts.binned_entropy(bin_count=10),
        pl.col(value_col).ts.lempel_ziv_complexity(threshold=3),
        pl.col(value_col).ts.longest_streak_above_mean(),
    )
)

# 🚄 Extract features blazingly fast on windows
# of many time-series using `group_by_dynamic`
features = (
    # Compute rolling features at yearly intervals
    y.group_by_dynamic(
        time_col,
        every="12mo",
        by=entity_col,
    )
    # Aggregate within each window; use the value column name read above
    .agg(
        pl.col(value_col).ts.binned_entropy(bin_count=10),
        pl.col(value_col).ts.lempel_ziv_complexity(threshold=3),
        pl.col(value_col).ts.longest_streak_above_mean(),
    )
)

```
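
For readers unfamiliar with the `lempel_ziv_complexity(threshold=...)` feature used in the examples above, here is a minimal pure-Python sketch of the idea. It is not the implementation in this PR (which is written in Rust). It assumes the series is binarized by comparing each value with `threshold` and then parsed left to right into previously unseen phrases, in the style of common reference implementations. The function name and the choice to return the phrase count divided by the series length are illustrative assumptions.

```python
from typing import Sequence


def lempel_ziv_complexity(values: Sequence[float], threshold: float) -> float:
    """Illustrative sketch: number of distinct phrases divided by series length."""
    # Assumption: binarize the series by comparing each value with the threshold.
    bits = [v > threshold for v in values]
    n = len(bits)
    if n == 0:
        return 0.0

    seen = set()  # phrases discovered so far (a hash set keeps lookups cheap)
    start, length = 0, 1
    while start + length <= n:
        phrase = tuple(bits[start:start + length])
        if phrase in seen:
            # Already known: extend the candidate phrase by one symbol.
            length += 1
        else:
            # New phrase: record it and start parsing right after it.
            seen.add(phrase)
            start += length
            length = 1
    return len(seen) / n
```

The hash-set lookup is the hot path in this parsing loop, which is one plausible reason a fast hash set matters for speeding up this feature; treat that as an interpretation rather than a statement about the PR's internals.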
4 changes: 2 additions & 2 deletions docs/forecasting.md
@@ -29,7 +29,7 @@ Load a collection of time series, also known as panel data, into a [`polars.Lazy
import polars as pl
from functime.cross_validation import train_test_split
from functime.metrics import mase
from functime.feature_extraction import add_calendar_effects
from functime.seasonality import add_calendar_effects


# Load data
@@ -271,7 +271,7 @@ forecaster = linear_model(

```python
from functime.forecasting import linear_model
from functime.feature_extraction import add_fourier_terms
from functime.seasonality import add_fourier_terms
from functime.preprocessing import roll

# Include Fourier terms to model complex seasonality
4 changes: 1 addition & 3 deletions docs/ref/feature-extraction.md
@@ -1,5 +1,3 @@
::: functime.feature_extraction

## Time-series Feature Extractors

:::functime.feature_extraction.tsfresh
:::functime.feature_extractors
3 changes: 3 additions & 0 deletions docs/ref/seasonality.md
@@ -0,0 +1,3 @@
## Seasonality and Holiday Effects

:::functime.seasonality
51 changes: 0 additions & 51 deletions docs/ref/tsfresh.md

This file was deleted.

6 changes: 3 additions & 3 deletions docs/seasonality.md
@@ -35,7 +35,7 @@ If you choose the dummy variable strategy, beware of the "dummy variable trap" (
- year: 1999, 2000, ..., 2023 (any year)

```python
from functime.feature_extraction import add_calendar_effects
from functime.seasonality import add_calendar_effects

# Returns X with one categorical column "month" with values 1,2,...,12
X_new = X.pipe(add_calendar_effects(["month"])).collect()
@@ -58,7 +58,7 @@ For example, if `sp=12` and `K=3`, `X_new` would contain the columns `sin_12_1`,

```python
from functime.offsets import freq_to_sp
from functime.feature_extraction import add_fourier_terms
from functime.seasonality import add_fourier_terms

sp = freq_to_sp["1mo"][0]
X_new = X.pipe(add_fourier_terms(sp=sp, K=3)).collect()
@@ -69,7 +69,7 @@ X_new = X.pipe(add_fourier_terms(sp=sp, K=3)).collect()
`functime` has a wrapper function around the [`holidays`](https://pypi.org/project/holidays/) Python package to generate categorical features for special events. Dates without a holiday are filled with nulls.

```python
from functime.feature_extraction import add_holiday_effects
from functime.seasonality import add_holiday_effects

# Returns X with two categorical columns "holiday__US" and "holiday__CA"
north_america_holidays = add_holiday_effects(country_codes=["US", "CA"])