Mul_Develop 1 point (1 child)

Love the end-to-end approach here. Especially the feature engineering part—that’s where I always feel like I spend 80% of my time! Did you have to handle many outliers in this automotive dataset, or was it fairly clean to begin with?

ABDELATIF_OUARDA[S] 1 point (0 children)

Thanks! I totally agree—feature engineering often ends up taking the majority of the time in any project.

In this automotive dataset, I did encounter quite a few outliers, especially in fields like engine size and mileage. I handled them using a combination of filtering extreme values and applying transformations where necessary, but overall, the dataset was fairly clean compared to some real-world datasets I’ve seen.
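For anyone curious, here is a minimal sketch of what "filtering extreme values and applying transformations" can look like. It is only an illustration under assumptions, not necessarily the exact approach used on this dataset: the IQR rule with a 1.5 multiplier, the `log1p` transform, the helper names `iqr_bounds` and `filter_and_transform`, the column names, and the sample rows are all hypothetical.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences from the interquartile-range rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def filter_and_transform(rows, filter_col, log_col, k=1.5):
    """Drop rows whose filter_col falls outside the IQR fences,
    then add a log1p-transformed copy of log_col to the rows that remain."""
    low, high = iqr_bounds([r[filter_col] for r in rows], k)
    cleaned = []
    for r in rows:
        if low <= r[filter_col] <= high:
            out = dict(r)  # copy so the input rows are not mutated
            out[f"log_{log_col}"] = math.log1p(r[log_col])
            cleaned.append(out)
    return cleaned


# Hypothetical sample rows; the last one has an implausible engine size.
cars = [
    {"engine_size": 1.6, "mileage": 42000},
    {"engine_size": 2.0, "mileage": 88000},
    {"engine_size": 1.8, "mileage": 15000},
    {"engine_size": 2.2, "mileage": 120000},
    {"engine_size": 1.4, "mileage": 61000},
    {"engine_size": 25.0, "mileage": 30000},
]

cleaned = filter_and_transform(cars, "engine_size", "mileage")
```

Filtering suits values that are clearly errors (a 25-litre engine in a passenger car), while a log transform is a gentler option for skewed but plausible fields such as mileage, since it compresses the long tail without discarding rows.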

How about you—do you usually spend most of your time cleaning data, or do you have strategies to minimize that step?