Code template for running exploratory data analysis in sweetviz.
Link to website : None
Link to repository : https://github.com/fbdesignpro/sweetviz
Code template for running exploratory data analysis in pandas-profiling.
Link to website : https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/
Link to repository : https://github.com/pandas-profiling/pandas-profiling
Code template for running exploratory data analysis in lux.
Link to website : None
Link to repository : https://github.com/lux-org/lux
Code template for running exploratory data analysis in dtale.
Link to website : None
Link to repository : https://github.com/man-group/dtale
Code template for running exploratory data analysis in dataprep.
Link to website : https://dataprep.ai/
Link to repository : https://github.com/sfu-db/dataprep
Code template for running exploratory data analysis in autoviz.
Link to website : None
Link to repository : https://github.com/AutoViML/AutoViz
I tested and reviewed several Python packages for data processing and exploratory data analysis (EDA). Most of them either automate parts of the data processing or EDA workflow, or provide a suite of functions for manipulating and visualizing data.
The main objective is to identify Python packages that shorten the time needed for data processing and EDA.
In summary, sweetviz may be the best option in a business setting (where the focus is on business understanding), while autoviz and pandas-profiling are solid choices in an R&D setting (where the focus is on deep-dive analysis).
Package | sweetviz | pandas-profiling | autoviz | lux-api | dtale | dataprep |
---|---|---|---|---|---|---|
Version | 2.1.3 | 3.0.0 | 0.0.83 | 0.3.2 | 1.56.0 | 0.3.0 |
Recommended for exploration | Yes | No | Yes | No | No | Yes |
Recommended for production | No | No | No | No | No | No |
Ease of use | Yes | Yes | Yes | Yes | Yes | Yes |
Computation speed | Fast | Medium | Fast | Fast | Fast | Fast |
Installation complexity | Low | Low | Low | Low | Medium | Low |
Target variable-centric | Yes | No | Yes | Yes | No | Yes |
Missing data check | No | Yes | No | No | Yes | Yes |
Per variable summary statistics | Yes | Yes | No | No | Yes | Yes |
AutoEDA focus | No | Yes | Yes | No | No | Yes |
Score | 4 | 3 | 2 | 2 | 1 | 1 |
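
To make the summary recommendation concrete, here is a minimal sketch of generating a report with sweetviz or pandas-profiling, assuming versions close to those in the table above. The `RECOMMENDED` lookup, the helper names (`is_installed`, `write_report`, `demo`), the toy column names, and the output file names are all hypothetical and only for illustration; they are not the templates in this repository. Each library is imported lazily, so the module loads even when a package is missing.

```python
import importlib.util

# pip package name -> importable module name (assumed for the versions in the table).
MODULES = {
    "sweetviz": "sweetviz",
    "pandas-profiling": "pandas_profiling",
}

# Hypothetical lookup encoding the summary recommendation; the keys are illustrative.
RECOMMENDED = {
    "business": "sweetviz",
    "research": "pandas-profiling",
}


def is_installed(package):
    """Return True if the package's module can be located without importing it."""
    return importlib.util.find_spec(MODULES[package]) is not None


def write_report(df, package, out_path):
    """Write an HTML EDA report for `df` using the named package."""
    if package == "sweetviz":
        import sweetviz as sv
        # analyze() builds the report; show_html() writes it to disk.
        sv.analyze(df).show_html(out_path, open_browser=False)
    elif package == "pandas-profiling":
        from pandas_profiling import ProfileReport
        # minimal=True skips the more expensive computations.
        ProfileReport(df, minimal=True).to_file(out_path)
    else:
        raise ValueError(f"no template for {package!r}")


def demo():
    """Build a toy DataFrame and write one report per installed package."""
    import pandas as pd

    df = pd.DataFrame({
        "age": [20 + i % 30 for i in range(100)],
        "churned": [i % 2 for i in range(100)],
    })
    for setting, package in RECOMMENDED.items():
        if is_installed(package):
            write_report(df, package, f"{setting}_report.html")
```

Calling `demo()` writes `business_report.html` and `research_report.html` for whichever of the two packages is installed. For a target-centric sweetviz report, `sv.analyze(df, target_feat="churned")` is the usual variation.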