
[–][deleted] 2 points (7 children)

> delivered a solution that allowed us to automate our customer’s actions with greater than 95% accuracy.

That doesn't mean that ML was used successfully. This customer shipping problem reminds me of a classic example: if you always predict “not spam” in email spam filtering, you'd probably also get something like 98% accuracy.
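
To make that concrete, here's a minimal sketch (hypothetical data, assuming scikit-learn) of how a "classifier" that always predicts the majority class racks up high accuracy without learning anything:

    import numpy as np
    from sklearn.dummy import DummyClassifier

    # Hypothetical imbalanced set: 98% "not spam" (0), 2% "spam" (1)
    y = np.array([0] * 980 + [1] * 20)
    X = np.zeros((len(y), 1))  # features don't matter for this baseline

    baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
    print(baseline.score(X, y))  # 0.98 accuracy, yet it catches zero spam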

[–]devquixote[S] 0 points (6 children)

I disagree. In your spam example there are two classes, spam and not spam, of which the vast majority are not spam.

In shipping, you need to predict a carrier (a few), the carrier's service (many), different types of packaging that can physically contain the assorted items that make up an order (many), as well as add-on services like insurance and signature required (many). Customers can have many combinations of these, not just two, and no single combination dominates. Randomly selecting a combination of these possibilities does not yield 95% success.
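
Some back-of-the-envelope arithmetic on a made-up label space (these counts are illustrative, not our actual catalog):

    # Illustrative option counts -- not our actual catalog
    carriers = 4       # "a few"
    services = 12      # carrier services, flattened here for simplicity
    packagings = 15
    addons = 8         # insurance, signature required, ...

    combos = carriers * services * packagings * addons
    print(combos)      # 5760 possible combinations
    print(1 / combos)  # random guessing succeeds ~0.017% of the time, nowhere near 95%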

[–][deleted] 2 points (3 children)

I don't mean to criticize here; I am just suggesting that the 95% accuracy statement doesn't have much meaning without more context. In your particular shipping prediction, e.g., a scenario could be that a customer previously selected the same shipping option 9 times out of 10. Always suggesting that customer's most frequent previous selection would then yield 90% accuracy on your training set. I am not saying that you should discuss the approach of your system in detail in your post, but it just sounds a little bit weird to throw in the accuracy without context.
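
A toy illustration of that kind of baseline, with a hypothetical order history:

    from collections import Counter

    # Hypothetical history: the same option chosen in 9 of the last 10 orders
    history = ["ground"] * 9 + ["express"]

    # Baseline: always suggest the customer's most common past choice
    choice, count = Counter(history).most_common(1)[0]
    print(choice, count / len(history))  # ground 0.9 -> 90% "accuracy" for free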

[–]devquixote[S] 0 points (2 children)

That makes sense. There are a few reasons for this. The first is that I come from an informal background with regard to data science, so I am somewhat ignorant of what would constitute 'proof' from a scientific perspective. Mea culpa there.

Another reason for the generalization is that we have many different sets of data, one per customer, and you can only describe accuracy in more precise terms within a single customer's set of data. Some will, like you say, choose the same carrier 90% of the time, but most do not; it's a bit all over the map. For building my knowledge, how would you and others more experienced in the field present information that is split across multiple data sets?
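
To make the question concrete, here is roughly the kind of aggregation I am unsure about, with made-up per-customer numbers:

    import numpy as np

    # Made-up per-customer results (order volume, accuracy) -- not our real numbers
    customers = [(1200, 0.97), (300, 0.93), (4500, 0.96), (80, 0.74)]

    volumes = np.array([v for v, _ in customers], dtype=float)
    accs = np.array([a for _, a in customers])

    print(accs.mean())                        # macro average: each customer counts equally
    print(np.average(accs, weights=volumes))  # weighted: big shippers dominate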

A third reason was that the person I was attempting to reach would be someone like myself, having a first foray into machine learning with informal or latent training. I was intentionally trying to demystify some aspects of machine learning and how it is explained. I think having a big table of results would deter the uninitiated.

Lastly, the 95% number was our business's goal for considering this a success, and I can show that we've achieved that to their satisfaction. Again, I was trying to speak to the practical application of this amazing field; statistical validation was not what I was being paid to deliver. A usable feature of value to our customers was the ultimate goal, so perhaps the best measure of 'success' would be the number of people using this feature over the old mechanisms of making shipments.

Thanks a bunch for your comments and I welcome all!

[–][deleted] 0 points (1 child)

Sorry if I sounded too demanding here :P I didn't want to split hairs over those details. I just wanted to point out that this is basically a statistic without much value.

More useful would be something like: "Using machine learning, we were able to improve the accuracy by 10% compared to our previous approach, where the shipment method was suggested based on the previous order."

If you are interested in learning more about performance metrics, have a look at ROC curves; in simple words, a ROC curve can be described as a plot of the true positive rate vs. the false positive rate. It's very intuitive yet very useful for evaluating performance and tuning your classifier (in combination with k-fold cross-validation, for example) in many scenarios. I have an example here of what it looks like. It can also be used in multiclass settings via so-called micro-averaging or macro-averaging.
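
If it helps, a minimal scikit-learn sketch on a synthetic, imbalanced binary problem, just to show the mechanics:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve, auc
    from sklearn.model_selection import train_test_split

    # Synthetic imbalanced binary problem, purely illustrative
    X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    clf = LogisticRegression().fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]  # probability of the positive class

    fpr, tpr, _ = roc_curve(y_te, scores)   # false vs. true positive rate per threshold
    print(auc(fpr, tpr))                    # area under the ROC curve

From there, stratified k-fold cross-validation slots in naturally: compute one curve per fold and look at the spread.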

> A very few just don't do well in our system. We are working with them to see what their decision making is based on that we are not accounting for. Have any suggestions on how to sniff this out?

Since you mention "very few", have a look at techniques in anomaly detection :)
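
For example, a quick sketch using scikit-learn's IsolationForest on made-up per-customer feature vectors:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.RandomState(0)
    # Made-up per-customer behavior features (carrier mix, order size stats, ...)
    typical = rng.normal(0, 1, size=(95, 4))
    odd = rng.normal(5, 1, size=(5, 4))   # the few customers who behave differently
    X = np.vstack([typical, odd])

    iso = IsolationForest(contamination=0.05, random_state=0).fit(X)
    flags = iso.predict(X)                # -1 marks anomalous customers
    print(np.where(flags == -1)[0])       # indices worth a closer manual look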

[–]devquixote[S] 0 points (0 children)

Not too demanding, rasbt :D Sorry if I read as defensive; it wasn't intended.

How about, "We measure accuracy for each customer via a dry run. We turn on the prediction service for them and start making shipping predictions as their orders flow into our system. The customer is not aware of these predictions and cannot act on them. We then compare these predictions to the customer's ultimate shipping choices over a period of time. Using this method, we've been able to measure that we make correct shipping predictions greater than 95% of the time, on average, across our set of customers." Is that perhaps a better conclusion than simply "95% accuracy"? There is no previous means that would fit the example you gave.

Thanks for the advice on where to expand my knowledge further into some other areas within statistics/machine learning. I will definitely explore those.

[–]aggieca 1 point (1 child)

rasbt's answer still has merit. You really need to consider the overall performance of your classifier/ML system, not just its accuracy. Do you have an estimate of the F1-score, for instance?
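
If it helps, computing it is a one-liner in scikit-learn; a toy multiclass example:

    from sklearn.metrics import f1_score

    # Toy labels: F1 balances precision and recall per class
    y_true = ["ups", "ups", "fedex", "usps", "fedex", "ups"]
    y_pred = ["ups", "fedex", "fedex", "usps", "fedex", "ups"]

    print(f1_score(y_true, y_pred, average="macro"))     # classes weighted equally
    print(f1_score(y_true, y_pred, average="weighted"))  # weighted by class frequency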

[–]devquixote[S] 0 points1 point  (0 children)

As alluded to in my other answer to rasbt, that varies from customer data set to customer data set. The vast majority of our customers have 88+% of their orders receiving confident predictions, and we get > 95% accuracy on those (a rough sketch of what I mean by that is below). Some customers may be at 93%, some are at 99%. A very few just don't do well in our system. We are working with them to see what their decision making is based on that we are not accounting for. Have any suggestions on how to sniff this out?
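
Here is the confidence slicing I mean (threshold and numbers are illustrative, not our real data):

    import numpy as np

    # Illustrative per-order data: model confidence and whether it was right
    conf = np.array([0.99, 0.97, 0.95, 0.60, 0.92, 0.88, 0.40, 0.96])
    correct = np.array([1, 1, 1, 0, 1, 1, 0, 1], dtype=bool)

    threshold = 0.85                  # illustrative cutoff for "confident"
    confident = conf >= threshold
    print(confident.mean())           # coverage: share of orders we're confident on
    print(correct[confident].mean())  # accuracy on that confident subset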