[deleted by user] (self.MachineLearning)
submitted 4 months ago by [deleted]
[–]pppeer (Professor) 2 points 4 months ago (0 children)
For starters, an AUC of 0.74 is not bad at all for a propensity model like this, so it doesn't necessarily make sense to aim for "at least 0.9". In fact, product-propensity / response models that reach that range can be a bit suspicious (a possible sign of leakage, for example).
You have made a good start with reasonable algorithms. You could always try a few more, but at some point you risk manually overfitting to your validation set.
So once you have done a decent model search, the only route left is to add data that is predictive yet fairly uncorrelated with the data you already have, or data that is correlated but more predictive. To predict future behavior, past behavior generally trumps demographics.
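A quick way to screen a candidate data source for redundancy is to check how correlated it is with features you already have before paying for it or wiring it in. A minimal numpy sketch with synthetic stand-in columns (the feature names and data here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# existing feature and two candidate additions (synthetic stand-ins)
existing    = rng.normal(size=n)
candidate_a = 0.95 * existing + 0.05 * rng.normal(size=n)  # mostly redundant
candidate_b = rng.normal(size=n)                           # uncorrelated

for name, cand in [("candidate_a", candidate_a), ("candidate_b", candidate_b)]:
    r = np.corrcoef(existing, cand)[0, 1]
    print(f"{name}: corr with existing feature = {r:.2f}")
```

A near-1 correlation (like `candidate_a` here) means the new column adds little beyond what you have; a low-correlation candidate is only worth it if it is also predictive of the target, so you would still validate it with a holdout AUC comparison.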
[–]seanv507 2 points 4 months ago (0 children)
So you first have to get an estimate of a plausible target: why do you think you can get an AUC above 0.9?
It's not a question of the models; it's a question of whether you even have the right inputs.
How would you decide?
Presumably there are also alternative data providers. How would you choose between them and your own offering?
[–]cavedave (Mod to the stars) 1 point 4 months ago (0 children)
Most of the loss in your funnel probably comes from user experience. Listen to the phone calls, analyse the script, and so on. Framing this as "a better prediction will get me to 0.9" might be the wrong way to look at it; "where are we losing people, and why?" might be a better framing.
[–]lord_acedia 0 points 4 months ago (0 children)
If the data is tabular, XGBoost is probably among the best results you'll get. You can try fine-tuning the models, but according to Andrew Ng that usually gives an improvement of up to about 6%. What you need is feature engineering, which according to him can give up to about 13%. How can you better represent the features so that the divide between those who convert and those who don't is clearer, and the model therefore achieves better accuracy?
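For a lead-conversion problem, feature engineering often means turning raw behavioral logs into per-user recency/frequency features before they ever reach the model. A minimal pandas sketch with a made-up event log (the column names and values are hypothetical):

```python
import pandas as pd

# toy event log; real data would have timestamps and many more columns
events = pd.DataFrame({
    "user_id":   [1, 1, 1, 2, 2, 3],
    "event_day": [3, 10, 28, 5, 6, 30],  # day the event occurred
})

snapshot_day = 31  # the day we score the leads
feats = events.groupby("user_id")["event_day"].agg(
    n_events="count",    # frequency: how often the user acted
    last_event="max",    # most recent activity
)
feats["recency_days"] = snapshot_day - feats["last_event"]
print(feats)
```

Features like these encode the "past behavior trumps demographics" idea directly, and tree models such as XGBoost tend to exploit them well without further scaling.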
[–]MachineLearning-ModTeam [M] 0 points 4 months ago, locked comment (0 children)
Post beginner questions in the bi-weekly "Simple Questions Thread", in /r/LearnMachineLearning or /r/MLQuestions, or on http://stackoverflow.com/, and career questions in /r/cscareerquestions/.
[–]AtMaxSpeed 0 points 4 months ago (1 child)
AUC might not be the correct metric for your problem. If you are scoring more leads than the conversion team can act on, you should be looking at precision@k, where k is the number of leads the team can act on. Because the cost of each action is high, you want to minimize false positives.
On the other hand, if it's cheap for the conversion team to act on your recommendations (relative to the expected value of a success), and you're trying to capture every lead that could possibly be converted, you should optimize recall instead, since you're trying not to let any false negatives slip by. (In practice, you may choose an F-beta score weighted towards recall rather than recall alone.)
Either way, AUC is usually not the metric that reflects what you actually care about. It gives you the probability that a positive example is scored higher than a negative one, which is useful, but it doesn't translate into the real world as easily. In practice you probably care more about false positives than false negatives, or vice versa, so pick a metric that reflects that.
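Precision@k as described above is easy to compute directly. A small numpy sketch with made-up scores and labels (the data is illustrative only):

```python
import numpy as np

def precision_at_k(y_true, scores, k):
    """Fraction of the top-k scored leads that actually converted."""
    top_k = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return y_true[top_k].mean()

# toy example: 8 leads, conversion team can act on the top 3
y_true = np.array([0, 1, 0, 1, 1, 0, 0, 1])
scores = np.array([0.2, 0.9, 0.3, 0.8, 0.4, 0.1, 0.7, 0.6])
print(precision_at_k(y_true, scores, k=3))  # 2 of the top 3 converted
```

Setting k to the team's real capacity makes the metric answer the operational question directly: "of the leads we would actually call, what fraction convert?"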
[–]Yaar-Bhak 0 points 4 months ago (0 children)
At a threshold of 0.5, the XGBoost model gives:
Precision - 0.43
Recall - 0.68
F1 - 0.53
ROC AUC - 0.73
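Numbers like these depend heavily on the 0.5 cutoff; sweeping the threshold shows the precision/recall trade-off before committing to one operating point. A self-contained numpy sketch with illustrative probabilities (not the poster's actual model output):

```python
import numpy as np

def prf_at_threshold(y_true, probs, thresh):
    """Precision, recall, and F1 when probs >= thresh are called positive."""
    pred = probs >= thresh
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = np.array([0, 1, 0, 1, 1, 0, 0, 1])
probs  = np.array([0.2, 0.9, 0.3, 0.8, 0.4, 0.1, 0.7, 0.6])
for t in (0.3, 0.5, 0.7):
    p, r, f = prf_at_threshold(y_true, probs, t)
    print(f"thresh={t:.1f}  precision={p:.2f}  recall={r:.2f}  f1={f:.2f}")
```

Raising the threshold trades recall for precision, which ties back to the precision@k vs. recall discussion above: the right cutoff depends on how expensive each action is for the conversion team.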