Code template for running autoML regression in pycaret.
Link to website : https://pycaret.org/
Link to repository : https://github.com/pycaret/pycaret
Data Science Team & Tech Lead
Code template for running autoML classification in pycaret.
Link to website : https://pycaret.org/
Link to repository : https://github.com/pycaret/pycaret
Code template for running autoML regression in auto-sklearn.
Link to website : https://automl.github.io/auto-sklearn/master/
Link to repository : https://github.com/automl/auto-sklearn
Code template for running autoML classification in auto-sklearn.
Link to website : https://automl.github.io/auto-sklearn/master/
Link to repository : https://github.com/automl/auto-sklearn
Code template for running autoML regression in autokeras.
Link to website : https://autokeras.com/
Link to repository : https://github.com/keras-team/autokeras
Code template for running autoML classification in autokeras.
Link to website : https://autokeras.com/
Link to repository : https://github.com/keras-team/autokeras
I have tested and reviewed a few Python packages for data processing and/or exploratory data analysis (EDA). Most of these packages attempt to automate parts of the data processing and/or EDA process, or provide a suite of functions to manipulate and visualize data.
The main objective here is to review and explore Python packages that will shorten the time needed for data processing and/or exploratory data analysis.
In summary, sweetviz may be the best option in a business setting (with a focus on business understanding), while autoviz and pandas-profiling are solid choices in an R&D setting (with a focus on deep-dive analysis).
Package | sweetviz | pandas-profiling | autoviz | lux-api | dtale | dataprep |
---|---|---|---|---|---|---|
Version | 2.1.3 | 3.0.0 | 0.0.83 | 0.3.2 | 1.56.0 | 0.3.0 |
Recommended for exploration | Yes | No | Yes | No | No | Yes |
Recommended for production | No | No | No | No | No | No |
Ease of use | Yes | Yes | Yes | Yes | Yes | Yes |
Computation speed | Fast | Medium | Fast | Fast | Fast | Fast |
Installation complexity | Low | Low | Low | Low | Medium | Low |
Target variable-centric | Yes | No | Yes | Yes | No | Yes |
Missing data check | No | Yes | No | No | Yes | Yes |
Per variable summary statistics | Yes | Yes | No | No | Yes | Yes |
AutoEDA focus | No | Yes | Yes | No | No | Yes |
Score | 4 | 3 | 2 | 2 | 1 | 1 |
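
To make the "Missing data check" and "Per variable summary statistics" rows of the table concrete, here is a minimal standard-library sketch of what such checks compute. The sample records and column names are hypothetical, and this is not how any of the reviewed packages implements these features.

```python
import statistics

# Hypothetical sample records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000.0},
    {"age": None, "income": 61000.0},
    {"age": 45, "income": None},
    {"age": 29, "income": 48000.0},
]


def missing_counts(records):
    """Count missing (None) values per column."""
    columns = records[0].keys()
    return {c: sum(1 for r in records if r[c] is None) for c in columns}


def summary_stats(records):
    """Compute count, mean, min and max per column, ignoring missing values."""
    summary = {}
    for c in records[0].keys():
        values = [r[c] for r in records if r[c] is not None]
        summary[c] = {
            "count": len(values),
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


missing = missing_counts(rows)
stats = summary_stats(rows)
print(missing)
print(stats)
```

The EDA packages reviewed here wrap checks of this kind, together with plots, into a single report call, which is where the time saving comes from.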
I have tested and reviewed a few Python packages for time-series data analysis, mostly for forecasting. Most of these packages are one-stop-shop machine learning packages, and some of them also include autoML functionality.
The main objective here is to review and explore Python packages that will shorten the time needed for time-series data analysis.
In summary, kats is the most promising one-stop-shop machine learning package for time-series analysis. pycaret-ts-alpha is likely to become a strong contender once it matures out of alpha status and is officially integrated into pycaret.
These libraries tend to be a bit rough around the edges in terms of documentation and API implementation, especially the newer packages. Support for multivariate time series forecasting is also on the weaker side, as most of them focus on univariate time series forecasting.
pytorch-forecasting deserves a special mention as it is the only library with a deep learning focus. While I agree that deep learning is very sexy to play with, I am still quite reserved about applying deep learning to time series problems. Compared to traditional statistical models that have tens of parameters, deep learning models often have millions or billions of parameters to train. Fitting an N-BEATS model with 1.6 million parameters to the air passenger data, which has only hundreds of data points, feels wrong.
Or as John von Neumann famously said, “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”
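
The mismatch can be put in rough numbers. The sketch below assumes the classic air passenger series with 144 monthly observations (1949 to 1960) and a hypothetical statistical model with 10 parameters, standing in for "tens of parameters". The 1.6 million figure is the N-BEATS parameter count quoted above.

```python
def params_per_observation(n_params, n_observations):
    """Return how many trainable parameters there are per data point."""
    return n_params / n_observations


# Assumption: the classic air passenger series, 144 monthly observations.
N_OBSERVATIONS = 144
# Parameter count of the N-BEATS model mentioned in the text.
NBEATS_PARAMS = 1_600_000
# Hypothetical parameter count for a traditional statistical model.
STAT_PARAMS = 10

nbeats_ratio = params_per_observation(NBEATS_PARAMS, N_OBSERVATIONS)
stat_ratio = params_per_observation(STAT_PARAMS, N_OBSERVATIONS)

print(f"N-BEATS: {nbeats_ratio:,.0f} parameters per observation")
print(f"Statistical model: {stat_ratio:.3f} parameters per observation")
```

Under these assumptions the deep learning model has roughly eleven thousand parameters per observation, while the statistical model has far fewer parameters than observations.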
Package | kats | pmdarima | sktime | pytorch-forecasting | pycaret-ts-alpha | autots |
---|---|---|---|---|---|---|
Version | 0.1.0 | 1.8.2 | 0.5.3 | 0.9.0 | 3.0.0.dev1624743408 | 0.3.2 |
Recommended for exploration | No | No | No | No | No | No |
Recommended for production | Yes | Yes | No | No | No | No |
Ease of use | Yes | Yes | No | Yes | No | Yes |
Computation speed | Fast | Fast | Fast | Slow | Medium | Slow |
Installation complexity | Low | Low | Low | Medium | Medium | Low |
One-stop shop | Yes | No | Yes | No | Yes | Yes |
AutoML focus | No | No | No | No | Yes | Yes |
Deep learning focus | No | No | No | Yes | No | No |
Score | 4 | 3 | 2 | 2 | 1 | 1 |