Skip to content

Releases: business-science/pytimetk

Pytimetk 0.4.0

18 Mar 16:07
Compare
Choose a tag to compare

pytimetk 0.4.0

Feature Engineering Module:

  1. augment_pct_change(): pandas and polars engines

Finance Module Updates:

  1. augment_macd(): MACD, pandas and polars engines
  2. augment_bbands(): Bollinger Bands, pandas and polars engines
  3. augment_atr(): Average True Range, pandas and polars engines
  4. augment_ppo(): Percentage Price Oscillator, pandas and polars engines
  5. augment_rsi(): Relative Strength Index, pandas and polars engines
  6. augment_qsmomentum(): Quant Science Momentum Indicator, pandas and polars engines
  7. augment_roc(): Rate of Change (ROC), pandas and polars e ngines

Polars Upgrades

  • Migrate to polars 0.20.7

Pytimetk 0.3.0

12 Jan 19:42
Compare
Choose a tag to compare

pytimetk 0.3.0

Correlation Funnel

The R package correlationfunnel has been ported inside pytimetk:

  • binarize()
  • correlate()
  • plot_correlation_funnel()

Core:

  • filter_by_time() - Filtering with time-based strings

Feature Engineering:

  • augment_diffs() - Can now add differenced columns
  • augment_fourier() - Can now add fourier features.

Finance Module:

  • augment_cmo(): Chande Momentum Oscillator (CMO)

New Polars Backends:

  • augment_diffs()
  • augment_fourier()
  • augment_cmo()

General

  • Make memory reduction optional #275

pytimetk 0.2.1

05 Nov 17:45
Compare
Choose a tag to compare

Fixes #216 - Incorrect sort order of polars engine. Affected: augment_rolling(), agument_expanding()

pytimetk 0.2.0

03 Nov 17:39
Compare
Choose a tag to compare

Anomaly Detection

  • anomalize(): A new scalable function for detecting time series anomalies (outliers)
  • plot_anomalies(): A scalable visualization tool for inspecting anomalies in time series data.
  • plot_anomalies_decomp: A scalable visualization tool for inspecting the observed, seasonal, trend, and remainder decomposition, which are useful for telling you whether or not anomalies are being detected to your preference.
  • plot_anomalies_cleaned(): A scalable visualization tool for showing the before and after transformation for the cleaned vs uncleaned anomalies.

New Functions:

  • apply_by_time(): For complex apply-style aggregations by time.
  • augment_rolling_apply(): For complex rolling operations using apply-style data frame functions.
  • augment_expanding(): For expanding calculations with single-column functions (e.g. mean).
  • augment_expanding_apply(): For complex expanding operations with apply-style data frame functions.
  • augment_hilbert(): Hilbert features for signal processing.
  • augment_wavelet(): Wavelet transform features.
  • get_frequency(): Infer a pandas-like frequency. More robust than pandas.infer_freq.
  • get_seasonal_frequency(): Infer the pandas-like seasonal frequency (periodicity) for the time series.
  • get_trend_frequency(): Infer the pandas-like trend for the time series.

New Finance Module

More coming soon.

  • augment_ewm(): Exponentially weighted augmentation

Speed Improvements

Polars Engines:

  • summarize_by_time(): Gains a polars engine.
    • 3X to 10X speed improvements.
  • augment_lags() and augment_leads(): Gains a polars engine. Speed improvements increase with number of lags/leads.
    • 6.5X speed improvement with 100 lags.
  • augment_rolling(): Gains a polars engine. 10X speed improvement.
  • augment_expanding(): Gains a polars engine.
  • augment_timeseries_signature(): Gains a polars engine. 3X speed improvement.
  • augment_holiday_signature(): Gains a polars engine.

Parallel Processing and Vectorized Optimizations:

  • pad_by_time(): Complete overhaul. Uses Cartesian Product (Vectorization) to enhance the speed. 1000s of time series can now be padded in seconds.
    • Independent review: Time went from over 90 minutes to 13 seconds for a 500X speedup on 10M rows.
  • future_frame(): Complete overhaul. Uses vectorization when possible. Grouped parallel processing. Set threads = -1 to use all cores.
    • Independent Review: Time went from 11 minutes to 2.5 minutes for a 4.4X speedup on 10M rows
  • ts_features: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
  • ts_summary: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
  • anomalize: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
  • augment_rolling() and augment_rolling_apply(): Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.

Helpful Utilities:

  • parallel_apply: Mimics the pandas apply() function with concurrent futures.
  • progress_apply: Adds a progress bar to pandas apply()
  • glimpse(): Mimics tidyverse (tibble) glimpse function

New Data Sets:

  • expedia: Expedia hotel searches time series data set

3 New Applied Tutorials:

  1. Sales Analysis Tutorial
  2. Finance Analysis Tutorial
  3. Demand Forecasting Tutorial
  4. Anomaly Detection Tutorial

Final Deprecations:

  • summarize_by_time(): kind = "period". This was removed for consistency with pytimetk. "timestamp" is the default.
  • augment_rolling(): use_independent_variables. This is replaced by augment_rolling_apply().

pytimetk 0.1.0

03 Oct 02:51
Compare
Choose a tag to compare

About the Initial release.

This release includes the following features:

  1. A workhorse plotting function called plot_timeseries() 💪
  2. Three (3) data wrangling functions that will simplify 90% of time series tasks 🙏
  3. Five (5) "augmentor" functions: These add hundreds of features to time series to help in predictive tasks 🧠
  4. Two (2) time series feature summarizes: identify key aspects of your time series 🔍
  5. Nine (9) pandas series and DatetimeIndex helpers (work more easily with these timestamp data structures) ⏲
  6. Four (4) date utility functions that fill in missing function gaps in pandas 🐼
  7. Two (2) Visualization utilities to help you customize your visualizations and make them look MORE professional 📈
  8. Two (2) Pandas helpers that help clean up and understand pandas data frames with time series 🎇
  9. Twelve (12) time series datasets that you can practice PyTimeTK time series analysis on 🔢

The PyTimeTK website comes with:

  1. Two (2) Getting started tutorials
  2. Five (5) Guides covering common tasks
  3. Coming Soon: Applied Tutorials in Sales, Finance, Demand Forecasting, Anomaly Detection, and more.