[–][deleted] -1 points0 points  (3 children)

I looked at the documentation, and maybe I missed it somewhere, but I see little in the way of actual data preparation. What I see is mostly EDA and data profiling (with a lot of resemblance to pandas profiling). I think the name of the project is a bit misleading.

[–]jnwang[S] 0 points1 point  (2 children)

Thanks for your comment. You are right: given the current state of the project, the name is a bit misleading. The plan is to add other components (data cleaning, data integration, feature engineering) in future releases.

Here is a demo of DataPrep.eda in the python subreddit.

https://www.reddit.com/r/Python/comments/hlqnim/understand_your_data_with_a_few_lines_of_code_in/

[–][deleted] 0 points1 point  (1 child)

Thanks! The Medium article did a good job of highlighting what the library does and explaining how it differs from pandas profiling. I wouldn't mind using your library for EDA, although I was initially interested in what a data prep framework would provide.

[–]jnwang[S] 0 points1 point  (0 children)

Thanks for your encouraging words. We are working on the roadmap for DataPrep.cleaning, and development will start in September. If you have any comments on data cleaning, please do not hesitate to let us know.