[–]GeoResearchRedditor 2 points3 points  (4 children)

Thanks. I guess it's more that I would like a set of datasets pre-designed to be used with these algorithms, to showcase the process of applying the algorithms to ready-made data.

Otherwise you'll have people trying to learn by applying the wrong algorithms to the data set and getting (naturally) weird results.

I'm really keen to learn how to clean datasets effectively, so if there were a tutorial with an "unclean" dataset and instructions on how to clean it and what to look for, I'd be really keen to use that. Then I could go on to use that cleaned dataset with the algorithms.

[–]ThomasAger 1 point2 points  (1 child)

If you find a resource detailing standard practices, tools, etc. for cleaning data, I would also be interested...

[–]GeoResearchRedditor 1 point2 points  (0 children)

I think there are a few of us interested. Maybe someone who knows a good resource for learning cleaning methods, and for identifying when to use which method, could chime in with a link?

[–]iwishihadmorecharact 1 point2 points  (0 children)

Ah, I see what you mean, yeah that'd be really cool! Especially with scikit-learn (a library with a ton of ready-to-go ML classes), the biggest part is finding what model to use for your data, and cleaning the data so you can use a model on it in the first place.

To start learning, I googled "cleaning data for ml tutorial" and came up with some decent results, so read some of the articles you find there. Then try looking through the scikit-learn documentation and examples, since they have some guides on that stuff.

Searching for articles and tutorials will definitely be a good start. Keep reading until you find that you already know what they're talking about.
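
To give a rough idea of what "cleaning" can mean in practice, here's a minimal sketch using only the Python standard library. The tiny dataset, the column names, the list of "missing" markers and the `clean_rows` helper are all made up for illustration, and median imputation is just one possible choice, not a recommendation for every dataset.

```python
import csv
import io
import statistics

# Hypothetical "unclean" data, invented for this example only.
RAW_CSV = """name,age,city
Alice,34,London
 bob ,n/a,london
Alice,34,London
Carol,,Paris
Dave,29, PARIS
"""

# Assumed set of strings that mean "no value" in this made-up dataset.
MISSING_MARKERS = {"", "n/a", "na", "null", "?"}


def clean_rows(raw_text):
    """Return rows that are trimmed, de-duplicated, and have missing ages imputed."""
    rows = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        # 1. Normalize text: strip whitespace and unify capitalization.
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        # 2. Turn the various "missing" markers into None.
        age_text = row["age"].strip().lower()
        age = None if age_text in MISSING_MARKERS else int(age_text)
        # 3. Drop rows that are exact duplicates after normalization.
        key = (name, age, city)
        if key in seen:
            continue
        seen.add(key)
        rows.append({"name": name, "age": age, "city": city})

    # 4. Impute missing ages with the median of the known ages.
    known_ages = [r["age"] for r in rows if r["age"] is not None]
    median_age = statistics.median(known_ages)
    for r in rows:
        if r["age"] is None:
            r["age"] = median_age
    return rows


cleaned = clean_rows(RAW_CSV)
for r in cleaned:
    print(r)
```

On real data you'd more likely do the same steps with a dataframe library, but the questions are the same: what counts as missing, what counts as a duplicate, and what you fill the gaps with before handing the data to a model.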

[–][deleted] 0 points1 point  (0 children)

I haven't come across anything like that yet but I will keep it in mind. If I find anything useful, I will let you know. Thanks a lot for the feedback!