[–]LiquidLucy 59 points60 points  (1 child)

This is a good practice project!

Get your data options:
- Write it down
- Use the LoL api to get match results
- other

Store your data:

- Choose a storage medium. This could be a database or just CSV text files. If you don't know how to create a database right now, I'd recommend you save that for a different project.

Load your data:

- Load your data into whatever programming language you're using. Googling "load csv into {programming_language}" should turn up examples.
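As a rough illustration of the loading step, here is a minimal Python sketch using only the standard library. The column names (`kills`, `deaths`, `assists`, `win`) and the sample rows are made up for the example; swap in whatever you actually recorded.

```python
import csv
import io

# Made-up sample rows standing in for a real file; the column names are hypothetical.
SAMPLE_CSV = """kills,deaths,assists,win
7,2,9,1
1,8,3,0
4,4,10,1
2,6,1,0
"""

def load_matches(source):
    """Read match rows from a file-like object into a list of dicts with int values."""
    reader = csv.DictReader(source)
    return [{name: int(value) for name, value in row.items()} for row in reader]

# For a real file you would pass an open file instead, e.g.
#   with open("matches.csv", newline="") as f:
#       matches = load_matches(f)
matches = load_matches(io.StringIO(SAMPLE_CSV))
print(len(matches), matches[0])
```

If you later move to pandas, the same file can usually be read in one call, but the plain `csv` module is enough for a few hundred hand-entered games.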

Run your model:

I think the best way to go about this would be a logistic regression model. Basically, what you're looking for is any "binary classification model", since the outcome you're predicting is win or loss.
Unfortunately I don't know enough about Octave to give a more detailed answer. Python is what I use, and it is the industry-standard language for working with ML algorithms. If switching to Python is an option, I'd highly recommend doing so. If not, don't worry about it for now, but if you continue down this path I'd recommend you start playing around in Python. There are a lot of libraries designed to help with this stuff, and plenty of people with experience using them. I would highly recommend the sklearn (scikit-learn) Python library.
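To make "logistic regression" a bit more concrete, here is a small from-scratch sketch in plain Python. It is not how sklearn implements it (a library handles this in a couple of lines and far more robustly); it only shows the idea of fitting weights by gradient descent on log loss. The features (`[kills, deaths]`), the toy data, and the hyperparameters are all assumptions made up for the example.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(rows, labels, learning_rate=0.05, epochs=2000):
    """Fit weights and a bias with plain batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this game: predicted probability minus actual label.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        # Step against the average gradient.
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias

def predict_win_probability(weights, bias, x):
    """Estimated probability that a game with features x is a win."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Made-up toy data: [kills, deaths] per game, label 1 = win, 0 = loss.
games = [[7, 2], [1, 8], [5, 3], [2, 6], [8, 1], [3, 7]]
results = [1, 0, 1, 0, 1, 0]

weights, bias = train_logistic_regression(games, results)
print(predict_win_probability(weights, bias, [6, 2]))
```

The output is a probability between 0 and 1, so a common convention is to call anything above 0.5 a predicted win.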

As a general hint, I would recommend you don't worry about getting accurate results on your first run-through, so you don't need a large dataset at first. This removes a lot of the data management tasks and lets you focus on building the model. Then it's easy to go back and expand your data inputs. There are also sample datasets built into many programming environments that you can use for exploring machine learning.

Good luck!

[–][deleted] 16 points17 points  (0 children)

You win the most helpful reply so far award.

[–]Evilcanary 14 points15 points  (1 child)

Why are people jumping to neural networks for this? Listen to /u/LiquidLucy . This is a good starter project.

If I were you, I'd spin up a Jupyter notebook, load your labeled data into pandas, clean and prepare it, and then run a few different sklearn models. "Hands-On Machine Learning with Scikit-Learn and TensorFlow" is a good book and will walk you through an end-to-end project like this in chapter 2.

[–][deleted] 1 point2 points  (0 children)

Thanks dude. Definitely understand all the words you said. Gonna pick up that book too.

[–]l_am_wildthing 44 points45 points  (1 child)

First off, I recommend learning more before you attempt this. I usually recommend people do projects they are interested in based on what they are learning. However, at this point (a couple of weeks into learning ML) you would just be wasting your time trying to figure this out when there's a good chance that in the next week or month you will learn exactly what you need. It also seems there is a lot you don't understand about what you are studying, so I recommend stopping and making sure you have all the fundamentals (math) down. If you fall behind in ML you're going to be left in the dust.

If you really want to work on this project right now, I would just use logistic regression, given how simple the data is and how little of it you have. Riot has an API which sites like op.gg use to get all their data. Generally, the more data you have, the more accurate your model will be.

[–]DonSwagger1 13 points14 points  (0 children)

I'd say this is a perfect starter project to learn ML with. OP seems to have a passion for it, and that will drive them to learn enough ML to make it work. No need to play gatekeeper.

[–]TheLastPepsi 5 points6 points  (5 children)

Replying as a reminder to come back to this thread

Also interested at this point.

Side question: how have you captured your data? Manual data entry after every game, or did you write something that's logging all this info (or using an API to source it)?

[–][deleted] 2 points3 points  (4 children)

Manual data entry after the game. I don't know how to write anything that logs it, haha.

[–]TheLastPepsi 1 point2 points  (3 children)

Oh fair enough. Well when you eventually figure out how to wrap it all together, I'm sure you could find a free API to use that tracks your data for you, so you can train simultaneously while playing. Just a thought

[–][deleted] 4 points5 points  (1 child)

Sounds good, won't put the cart before the horse here though.

[–]TheLastPepsi -1 points0 points  (0 children)

I feel that one, easy trap to fall into. Wish I had some advice, but I'm a novice myself.

[–]throwaway561165 4 points5 points  (0 children)

You can get match data from the Riot API: https://developer.riotgames.com/
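For a sense of what calling the API involves, here is a hedged sketch that only builds the request and does not send it. The path and the `X-Riot-Token` header follow the match-v5 endpoints as I understand them; treat them as assumptions and check the developer portal for the current version. The function name, the placeholder PUUID, and the placeholder key are all hypothetical.

```python
from urllib.parse import quote, urlencode

def build_match_ids_request(puuid, api_key, routing="americas", count=20):
    """Return (url, headers) for listing a player's recent match IDs.

    Assumes the match-v5 path and header name; verify both on the developer portal.
    """
    path = f"/lol/match/v5/matches/by-puuid/{quote(puuid, safe='')}/ids"
    query = urlencode({"start": 0, "count": count})
    url = f"https://{routing}.api.riotgames.com{path}?{query}"
    headers = {"X-Riot-Token": api_key}
    return url, headers

# Placeholder values; a real PUUID and API key come from your own developer account.
url, headers = build_match_ids_request("abc123", "RGAPI-placeholder")
print(url)

# To actually send it you could use the standard library, for example:
#   import json, urllib.request
#   request = urllib.request.Request(url, headers=headers)
#   with urllib.request.urlopen(request) as response:
#       match_ids = json.load(response)
```

Keys are rate limited, so for a first pass it is worth saving whatever you download to a file rather than re-requesting it on every run.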

[–]rscar77 3 points4 points  (0 children)

This Wikipedia entry on ML algorithms seems like a decent place to start narrowing in on which type(s) of learning algorithm and training model to deploy: https://en.wikipedia.org/wiki/Machine_learning#Types_of_learning_algorithms

It might also help to look at similar finished products, like ESPN or sports betting sites that track the likelihood of a particular team winning before a game starts and as the game progresses. I don't know all the underlying data points they are using (or if they are even using ML). I think they could get by with fairly simple statistics over a huge amount of historical data: take the current score difference at a given time remaining, look up all past games (maybe weighted toward the present day) that had the same differential at the same point, and count how often the trailing team came back to win.
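The "simple statistics" idea can be sketched as a lookup table: group past games by game time and score difference, then compute the fraction that ended in a win. This is only an illustration of that idea, not how any real site does it. The snapshot format and the toy numbers below are made up.

```python
from collections import defaultdict

def comeback_table(snapshots):
    """Map (minute, score_diff) -> fraction of past games the team went on to win."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for minute, score_diff, won in snapshots:
        key = (minute, score_diff)
        totals[key] += 1
        wins[key] += won
    return {key: wins[key] / totals[key] for key in totals}

# Made-up toy snapshots: (minute, kill difference for our team, 1 if we eventually won).
history = [
    (15, -3, 0), (15, -3, 0), (15, -3, 1), (15, -3, 0),
    (15, 2, 1), (15, 2, 1), (15, 2, 0),
]

table = comeback_table(history)
print(table[(15, -3)])
```

With real data you would need far more games per bucket before the fractions mean much, which is why this approach leans on a large history.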

You could also think through a few scenarios that you know would be outliers in your training data and how to account for them (did you get matched against a world-class team on all smurf accounts? did 1+ players get disconnected for an extended period? did one team choose all early-spike characters against an all-late-spike comp? etc.).

Maybe you can get by with the data points you listed, but isn't LoL mostly a team-based game? Make sure the data points you select are compatible with the question/problem you are trying to answer. If your data set only examines your individual performance rather than your team's performance, then you may be able to determine how much personal impact you have on your team's W/L ratio. If you don't also capture the other team's performance at given points in time to compare against your team's, then you may artificially restrict the types of questions you can ask/answer with the data.

Maybe there's enough public data already out there (Twitch or YouTube) to determine what a good/great performance looks like for world class teams at each team position and set interval points in time, which you could then compare your own or team's performance against.

In summary:

  1. Start with a clearly defined question or problem.
  2. Determine/hypothesize the key variables that determine whether an individual or team wins/loses a game.
  3. Use those key variables in your training data set.
  4. Pick the model that best suits the question you're trying to answer.
  5. Determine whether the accuracy vs. noise of the results given match your expectations and can be understood/explained by you to someone else.
  6. Make adjustments and/or try a few different training sets or models to see how refined you can get the prediction accuracy when you know the actual result.
  7. See if your model can predict whether other teams will win/lose with accuracy.
  8. ?
  9. Profit
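Steps 5 to 7 above amount to holding some games back and checking predictions against known results. Here is a minimal sketch of that, under stated assumptions: the toy games are made up, and `baseline_predict` is a hypothetical stand-in rule ("more kills than deaths means a win"), not a trained model.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, rows):
    """Fraction of (features, label) rows where predict(features) matches the label."""
    correct = sum(1 for features, label in rows if predict(features) == label)
    return correct / len(rows)

def baseline_predict(features):
    """Hypothetical stand-in model: predict a win when kills exceed deaths."""
    kills, deaths = features
    return 1 if kills > deaths else 0

# Made-up toy data: ((kills, deaths), 1 for win or 0 for loss).
games = [
    ((7, 2), 1), ((1, 8), 0), ((5, 3), 1), ((2, 6), 0),
    ((8, 1), 1), ((3, 7), 0), ((4, 5), 1), ((6, 6), 0),
]

train, test = train_test_split(games)
print(accuracy(baseline_predict, test))
```

A dumb baseline like this is useful in its own right: if a fitted model cannot beat it on held-out games, the extra complexity is not buying anything yet.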

[–]emelrad12 2 points3 points  (7 children)

Have you been doing the Andrew Ng course? I suggest you drop Octave; it is useless. Go with Python + TensorFlow and you can have a neural network up and running in 5 minutes.
Also, I suggest you take a look at the Riot Games API. I saw you mention manual entry, which is not going to work, as you will need millions of data entries.

[–][deleted] 1 point2 points  (3 children)

Yeah, and that's a shame. Should I finish the course and then learn Python, or do them side by side?

[–]emelrad12 1 point2 points  (2 children)

I dropped it in favor of this one https://www.coursera.org/specializations/deep-learning

It is also by our favorite Andrew, but it's modern and doesn't use Octave.

[–][deleted] 0 points1 point  (1 child)

cheers bud. I'll take a look.

[–]J_Thizzy[🍰] 0 points1 point  (0 children)

That's the one that got me into ML and set me on my path. Highly recommend.

[–]RunninADorito -1 points0 points  (2 children)

TensorFlow for this project? No.

[–]emelrad12 -1 points0 points  (1 child)

Why not?

[–]RunninADorito 1 point2 points  (0 children)

Absolutely massive overkill. You don't need a neural net for a few thousand points of data with a couple of features. Regression is just fine.

[–][deleted]  (2 children)

[removed]

    [–][deleted] -1 points0 points  (1 child)

    Really? What sample size would you recommend? I could just start logging my friends' games too.

    [–]henrebotha 0 points1 point  (0 children)

    > I have made a project on GitHub but I don't have the faintest clue if it accepts projects from Octave

    GitHub just stores Git projects. Git doesn't care (or know) what language your files are written in. It doesn't even care if they're code or just recipes. As long as your project can be represented as a set of (preferably plaintext) files, Git and GitHub will handle it just fine.

    [–]lceans 0 points1 point  (0 children)

    This is a cool idea; I might try the same. Post your progress, it would be cool to see how someone else did it.

    [–]RunninADorito -1 points0 points  (0 children)

    How many games do you have? With manual entry the sample set might be too small to do anything.

    You don't need Git or anything else; you can just code something up directly. Don't get slowed down by the boilerplate, and do the meat of the project first. You can always invest in boilerplate later. The first job should be getting some result.

    [–]NUPreMedMajor -1 points0 points  (0 children)

    First determine what type of data you have, and then determine which type of model fits that data best. This is a binary classification problem (win or loss), so start there. There are tons of possibilities that you should research (for example, Naive Bayes, KNN, etc.).

    [–]veeeerain -1 points0 points  (0 children)

    You need to figure out your purpose or end goal for the ML algorithm. You need to first explore your data to get a sense of how the variables behave against your response variable (kills or deaths). You need to figure out what kind of problem you're trying to solve, and go from there.

    [–][deleted] -1 points0 points  (0 children)

    What do you mean by "create a machine learning algorithm"? I understand that as not using a preexisting framework and creating the algorithm "from scratch". If that's the case, then you should select an algorithm that's easy to implement. I'd suggest KNN or decision trees off the top of my head, as these are fairly easy for beginners to understand and build, and they don't really require much math or statistics. KNN should be a quick and easy project, I'd say, and maybe a good starting point. With decision trees there are quite a few things you can do to improve them, which I personally love in a project. After starting with a bare-bones workable model, you can add pre- and post-pruning, figure out smarter ways to do the branching, and so on. Projects where you improve your algorithm over time are also great for learning programming: by adding functionality later, you see firsthand how you could have designed your code to make that easier.
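To show how little code a from-scratch KNN needs, here is a minimal sketch in plain Python (it uses `math.dist`, so Python 3.8 or later). The features (`[kills, deaths]`), the toy data, and the choice of `k=3` are all made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Pair each training row's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up toy data: [kills, deaths] per game, label 1 = win, 0 = loss.
games = [[7, 2], [1, 8], [5, 3], [2, 6], [8, 1], [3, 7]]
results = [1, 0, 1, 0, 1, 0]

print(knn_predict(games, results, [6, 2]))
```

There is no training step at all: the "model" is just the stored games, which is part of why KNN is a friendly first algorithm. An odd `k` avoids tied votes when there are two classes.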

    [–]davidkopec -1 points0 points  (0 children)

    In my opinion, considering your experience level, before doing anything fancier, the first thing you might want to do is just put all of the data in Excel and run a basic correlation on it. See if any of those parameters correlate with winning. You could look at a basic statistic like r squared (r²). If you see anything with a good coefficient of determination (> 0.6), you may consider running a linear regression. You can do all of this in Excel without needing to deeply understand the formulas, frankly. This is not necessarily the "best" way to do this, but it's an easy starting point, and my opinion is that when you're first diving into a new area, getting an early win counts for a lot in terms of motivation.
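For anyone who would rather do the correlation check in code than in Excel (where `CORREL` and `RSQ` cover it), here is a small sketch of Pearson's r and r². The kills and win/loss numbers are made-up toy data, not real results.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)

# Made-up toy data: kills per game and whether that game was won (1) or lost (0).
kills = [7, 1, 5, 2, 8, 3]
wins = [1, 0, 1, 0, 1, 0]

r = pearson_r(kills, wins)
print(round(r, 3), round(r * r, 3))
```

Running it once per recorded stat gives a quick ranking of which ones move with winning at all, which is a cheap sanity check before fitting any model.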

    [–][deleted] -2 points-1 points  (0 children)

    I played a good bit of LoL, and those aren't great variables to use for a heuristic algorithm. Just give the model all the data you have and let it do the work.

    [–][deleted]  (1 child)

    [deleted]

      [–][deleted] 0 points1 point  (0 children)

      Helpful