PyTimeTK Roadmap #2

Open
13 of 30 tasks
mdancho84 opened this issue Apr 16, 2021 · 4 comments

@mdancho84
Contributor

mdancho84 commented Apr 16, 2021

Phase 1: MVP Package

Develop a minimal package with the most important functions.

Use this guide: https://py-pkgs.org/03-how-to-package-a-python

Priority 1 - Core Data and Data Frame Operations

  • summarise_by_time() / summarize_by_time()
  • Data Sets
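As a rough illustration of what summarise_by_time() is meant to do, here is a minimal sketch built on pandas resampling. The function name `summarize_by_time_sketch` and its parameters are assumptions made for this example; they are not the package's API.

```python
import pandas as pd


def summarize_by_time_sketch(data, date_column, value_column, freq="MS", agg_func="sum"):
    """Hypothetical stand-in: aggregate one value column over a time frequency."""
    return (
        data
        .set_index(date_column)[value_column]
        .resample(freq)      # bucket rows by calendar period ("MS" = month start)
        .agg(agg_func)       # apply the aggregation within each bucket
        .reset_index()
    )


df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-05", "2021-01-20", "2021-02-03", "2021-02-28"]),
    "value": [10, 20, 5, 7],
})

monthly = summarize_by_time_sketch(df, "date", "value", freq="MS", agg_func="sum")
print(monthly)
```

The result has one row per month, with the date set to the first day of that month.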

Priority 2 - Plot Time Series

  • plot_time_series() - Not sure if we should go with plotly or altair for interactive mode. I feel we should go with plotnine for non-interactive. Will need smooth_vec().

Priority 3 - Data Wrangling

  • future_frame() - We will also need tk_make_future_timeseries() and tk_make_timeseries()
  • pad_by_time()
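The two wrangling helpers above could look roughly like this in pandas. This is only a sketch: `pad_by_time_sketch`, `future_frame_sketch`, and the `length_out` parameter are hypothetical names chosen for illustration, and a regular daily frequency is assumed.

```python
import pandas as pd


def pad_by_time_sketch(data, date_column, freq="D"):
    """Hypothetical: insert rows for missing timestamps, leaving values as NaN."""
    full_index = pd.date_range(data[date_column].min(), data[date_column].max(), freq=freq)
    return (
        data
        .set_index(date_column)
        .reindex(full_index)        # gaps become rows filled with NaN
        .rename_axis(date_column)   # restore the date column name
        .reset_index()
    )


def future_frame_sketch(data, date_column, length_out, freq="D"):
    """Hypothetical: append `length_out` future timestamps after the last date."""
    last = data[date_column].max()
    # Generate one extra period and drop the first so the last observed date is not repeated.
    future_dates = pd.date_range(last, periods=length_out + 1, freq=freq)[1:]
    future = pd.DataFrame({date_column: future_dates})
    return pd.concat([data, future], ignore_index=True)


df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-04"]),
    "value": [1.0, 2.0, 4.0],
})

padded = pad_by_time_sketch(df, "date", freq="D")
extended = future_frame_sketch(padded, "date", length_out=3, freq="D")
print(extended)
```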

Priority 4 - Augment Operations

Note - These functions should overwrite columns that are named the same in the input data frame.

  • tk.augment_timeseries_signature() - tk.get_timeseries_signature()
  • tk.augment_holiday_signature() - Uses holidays package
  • tk.augment_lags() / tk.augment_leads()
  • tk.augment_rolling()
  • tk.augment_fourier()
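To make the overwrite rule in the note concrete, here is a minimal sketch of a lag augmenter in pandas. The function name `augment_lags_sketch` and the `<column>_lag_<n>` naming scheme are assumptions for this example, not a confirmed design.

```python
import pandas as pd


def augment_lags_sketch(data, value_column, lags=(1,)):
    """Hypothetical: add lagged copies of a column, overwriting same-named columns."""
    out = data.copy()  # leave the caller's data frame untouched
    for lag in lags:
        # Plain column assignment replaces an existing column of the same name in place.
        out[f"{value_column}_lag_{lag}"] = out[value_column].shift(lag)
    return out


df = pd.DataFrame({
    "value": [1, 2, 3],
    "value_lag_1": [-1, -1, -1],  # stale column that should be overwritten
})

result = augment_lags_sketch(df, "value", lags=(1, 2))
print(result)
```

A lead version would follow the same pattern with a negative shift.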

Priority 5 - TS Features

  • tk.ts_features()

Phase 2: Expand Functionality

Anomalize in Python

  • Convert Anomalize R package to tk.anomalize()

Time Series Plotting Utilities

  • Plot ACF
  • Plot Anomalies
  • Plot Seasonality
  • Plot STL Decomposition
  • Plot Time Series Regression

Time Series Inspection, Frequency, and Trend

  • TS Summary: tk.ts_summary()
  • Time Scale Template
  • Automatic Frequency Detection
  • Automatic Trend Detection

Applied Tutorials

  • Sales CRM Database Analysis
  • Finance Investment Analysis
  • Demand Forecasting
  • Anomaly Detection
  • Clustering

Phase 3: Extend Sklearn

  • Time Series Splitting / Cross Validation Functionality
  • Preprocessors & Feature Engineering
  • Vectorized Functions - Box Cox,
  • Plot Time Series CV
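For reference, scikit-learn already ships a basic expanding-window splitter, `TimeSeriesSplit`, which is a natural starting point for this phase. The sketch below only shows how that existing class behaves on toy data; it makes no claim about what the package will add on top.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Twelve ordered observations, e.g. one year of monthly data.
X = np.arange(12).reshape(-1, 1)

splitter = TimeSeriesSplit(n_splits=3)
folds = list(splitter.split(X))

for i, (train_idx, test_idx) in enumerate(folds):
    # Each training window ends before its test window starts, so no future data leaks in.
    print(f"fold {i}: train={train_idx.tolist()} test={test_idx.tolist()}")
```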

Phase 4: Fill in Function Gaps Where Needed

Add additional functionality that was not identified in Phases 1-3.

@mdancho84
Contributor Author

mdancho84 commented Sep 26, 2023

We're Tracking Using GH Projects Now (Details Here)

Project Plan - Timetk: https://github.com/orgs/business-science/projects/1

Please let me know if you would like to contribute and I will set you up as an Outside Collaborator.

@mdancho84 mdancho84 self-assigned this Sep 27, 2023
@mdancho84 mdancho84 changed the title Roadmap Sep 28, 2023
@mdancho84 mdancho84 changed the title Timetk Roadmap Oct 3, 2023
@isachng93

Hi mdancho84,

Thanks for bringing timetk to python!

Would you consider to build-in in phase 3, and X13ARIMA SEATS in phase 2?

@joshdunnlime

joshdunnlime commented Nov 21, 2023

A nice extra addition to the augment module would be: tk.augment_periodic_spline. This can have accuracy benefits over Fourier encoding, but does come at the cost of many more features.

In the sklearn example below, we see 12 spline features vs 2 Fourier, but a significant improvement in RMSE and MAE:
https://scikit-learn.org/stable/auto_examples/applications/plot_cyclical_feature_engineering.html#periodic-spline-features

@mdancho84
Contributor Author

@isachng93

Would you consider to build-in in phase 3, and X13ARIMA SEATS in phase 2?

We are evaluating modeling and forecasting next. Will keep you posted. It may be a separate package.

3 participants