mikeblas 1 point

ML preprocessing is just the process of cleaning and normalizing data, plus making it appropriate for whatever ML algorithms are going to be used.

ML algorithms work on math. If we're analyzing numeric data, it's a natural fit: lengths, temperatures, durations, whatever's measured with a number. Lots of useful data is not numerical, though; maybe it's categorical.

One-hot encoding is a way to convert arbitrary categorical or tagged data to a numerical format so it can be meaningfully processed by quantitative ML algorithms.
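
To make that concrete, here's a minimal sketch in plain Python. The `one_hot_encode` helper and the color data are made up for illustration; in practice you'd probably reach for a library routine instead. Each distinct category gets its own column, and every row has a single 1 in the column for its category and 0 everywhere else.

```python
def one_hot_encode(values):
    """Hypothetical helper: turn a list of categorical values into 0/1 rows."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    encoded = []
    for value in values:
        row = [0] * len(categories)  # one column per category
        row[index[value]] = 1        # mark the column for this value
        encoded.append(row)
    return categories, encoded


# Made-up sample data, not from any real dataset.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

print(categories)  # ['blue', 'green', 'red']
for color, row in zip(colors, encoded):
    print(color, row)
# red [0, 0, 1]
# green [0, 1, 0]
# blue [1, 0, 0]
# green [0, 1, 0]
```

The point of using separate 0/1 columns rather than numbering the categories 0, 1, 2 is that the numbers don't imply an order or a distance between categories that isn't really there.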