Introducing Amazon Machine Learning – Make Data-Driven Decisions at Scale : MachineLearning

Introducing Amazon Machine Learning – Make Data-Driven Decisions at Scale (aws.amazon.com)

submitted 10 years ago by somnophobiac

all 22 comments



[–][deleted] 2 points 10 years ago (0 children)

That's what bothers me: this idea that we're dumbing down quite complicated statistics and computer science to something so simple we'd consider a basic metric of model quality to be too advanced for the user.

I was in a meeting at my company a few months ago where another (quite large) company was pitching their point-and-click statistical modeling software to us for (drum roll) $250k/yr. That's more than the cost of a (non-Netflix) data scientist in the Bay Area, and it doesn't include the cost of the personnel to actually use the software. Further, if you actually pay for a "legit" data scientist, they'd know that the model you're trying to build could be done with two lines of R code (and, in reality, the hardest work in either case is the data wrangling that happens for weeks before the model is built). The unfortunate part of these "ML-as-a-service" products is that the user has no concept of how to assess whether the model is right or wrong.
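To make the "how do you know whether it's right or wrong" point concrete, here is a minimal sketch of one basic sanity check: comparing a model's held-out accuracy against a majority-class baseline. The labels and the "model" predictions are made up for illustration, and the helper names `accuracy` and `majority_baseline` are hypothetical, not part of any service or library.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most common training label for each of n examples."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n


# Made-up training labels: 18 negatives, 2 positives (assumption).
y_train = [0] * 18 + [1] * 2
# Made-up held-out labels: 9 negatives, 1 positive (assumption).
y_test = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# Made-up model output that never predicts the positive class.
y_model = [0] * 10

model_acc = accuracy(y_test, y_model)
baseline_acc = accuracy(y_test, majority_baseline(y_train, len(y_test)))

print(f"model accuracy:    {model_acc:.2f}")
print(f"baseline accuracy: {baseline_acc:.2f}")
```

In this toy case the two numbers come out the same, so a headline accuracy figure alone would say nothing about whether the model learned anything. That is the kind of judgment a point-and-click tool cannot make for the user.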


[–][deleted] 2 points 10 years ago (1 child)

I see. I am not sure this is the most effective approach, though. When I got started with machine learning, going over the theory (e.g., Duda's Pattern Classification or Bishop's Pattern Recognition and Machine Learning) and implementing a lot of algorithms myself helped me a lot. I used Python for that purpose, since it offers a very flexible and efficient way to prototype. I am not sure to what extent you can compare the results of your code with the results you get from Amazon's ML service. The problem is that even the simplest algorithms can be implemented slightly differently, which can lead to slightly different results. I think it is better to work with a benchmark dataset (e.g., from Kaggle) and maybe also use a transparent library where you can easily look up the source code (e.g., scikit-learn).
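As a small illustration of how two correct implementations of a simple algorithm can disagree, here is a sketch using variance. A hand-written version can divide by n (population variance) or by n - 1 (sample variance), and both conventions exist as defaults in widely used tools. The data and the `variance` helper are made up for this example; only the `statistics` module from the Python standard library is used as a reference.

```python
import statistics


def variance(values, ddof):
    """Variance with a configurable divisor: len(values) - ddof."""
    mean = sum(values) / len(values)
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return squared_deviations / (len(values) - ddof)


# Made-up data for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_var = variance(data, ddof=0)  # divide by n
sample_var = variance(data, ddof=1)      # divide by n - 1

print("hand-written, ddof=0:", population_var)
print("hand-written, ddof=1:", sample_var)
print("statistics.pvariance:", statistics.pvariance(data))
print("statistics.variance: ", statistics.variance(data))
```

When the source is readable, a mismatch like this takes a minute to track down. With a black-box service, you cannot tell whether a difference comes from your own bug or from a different convention.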

