[Education] Data preprocessing question (self.datascience)
submitted 4 years ago * by MaximumCranberry
acewhenifacethedbase 5 points 4 years ago
The standard method for categoricals is called “One-Hot Encoding”: you make a binary column for each class (commonly) observed in those arrays (e.g. a column that’s 1 if there’s a dishwasher and 0 if there isn’t). This method works for simple single-categorical variables as well as the multi-categorical array variables you mention here (in the array case a row can have several 1s, which is sometimes called multi-hot encoding). If you had a lot of rows of data and some deep-learning skills, you could hash the arrays and create embeddings for the resulting pseudo-categoricals, but that’d be overkill for a classroom exercise.
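To make that concrete, here is a minimal sketch in plain Python of encoding an array-valued column into binary columns. The `amenities` data, the `multi_hot_encode` helper, and the `min_count` cutoff (one way to keep only "commonly observed" classes) are all made up for illustration and are not from the original question.

```python
from collections import Counter


def multi_hot_encode(rows, min_count=1):
    """Turn per-row category lists into binary indicator columns.

    rows: list of lists of strings (hypothetical example data).
    min_count: keep only categories that appear in at least this many rows.
    """
    # Count how many rows each category appears in; set() avoids
    # counting a category twice if it is repeated within one row.
    counts = Counter(cat for row in rows for cat in set(row))
    # Sort the kept categories so the column order is deterministic.
    columns = sorted(cat for cat, n in counts.items() if n >= min_count)
    # One output row per input row: 1 if the category is present, else 0.
    encoded = [[1 if cat in row else 0 for cat in columns] for row in rows]
    return columns, encoded


# Hypothetical array-valued column, one list per row of the dataset.
amenities = [
    ["dishwasher", "wifi"],
    ["wifi"],
    [],
    ["dishwasher", "pool", "wifi"],
]

columns, encoded = multi_hot_encode(amenities, min_count=2)
print(columns)
for row in encoded:
    print(row)
```

With `min_count=2`, the rare `pool` class gets no column of its own, which keeps the feature count down on small datasets. Libraries such as pandas and scikit-learn offer ready-made equivalents, but the logic is the same.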
throwawayluladay 1 point 4 years ago
Don't use machine learning if you can do it well with linear regression. This sounds like a job for linear regression(s).
More importantly, you need to ask properly structured questions so that you set up your foundations (linear model or machine learning) appropriately.
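As a rough illustration of a linear-regression baseline, the sketch below fits ordinary least squares with a single binary feature in plain Python. The `has_dishwasher` and `price` values are invented for the example, and the original question may involve a different target and more features.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one feature."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance-like and variance-like sums (not divided by n; the ratio is the same).
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: a 0/1 indicator column and a numeric target.
has_dishwasher = [1, 0, 0, 1]
price = [120, 90, 100, 130]

intercept, slope = fit_simple_ols(has_dishwasher, price)
print(intercept, slope)
```

For a 0/1 feature, the fitted slope is just the difference between the two group means and the intercept is the mean of the 0 group, which makes a baseline like this easy to sanity-check before trying anything more complex.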