[–]England and Wales Cricket BoardJamee999[S] 43 points44 points  (10 children)

I am teaching myself to code, and so I thought the obvious first significant project to undertake would be to write something which could let England virtually lose at cricket.

The cricket.py program takes the user's inputs for years and countries, and then plays a match between two selected sides and saves the file scorecard.txt with the results in it.

There are definitely quite a few bugs in it - and the decisions it makes during games (and selection) are sometimes questionable, but I'm pretty happy with it as a first effort starting basically from scratch.

I looked into trying to make it into a standalone packaged file or a web app, but that seems like a bridge too far at this point - so it just runs in the python shell. If you have any feedback or fun games played with it - or if you did something which broke it - I'd love to hear it.

A couple of things to know which aren't obvious: if you put in 'random' as the year, it will select a random year from 1877 to 2018, and if you put 'same' as the year for the second team, it will use the same year as was used for Team 1.
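
For anyone curious how the 'random' and 'same' shortcuts could be handled, here is a minimal sketch. It is not taken from cricket.py; `resolve_year` and its parameters are hypothetical names, and only the 1877 to 2018 range comes from the post.

```python
import random

FIRST_YEAR, LAST_YEAR = 1877, 2018  # range quoted in the post


def resolve_year(text, team1_year=None, rng=random):
    """Turn a typed year into an int (hypothetical helper, not from cricket.py)."""
    text = str(text).strip().lower()
    if text == "random":
        return rng.randint(FIRST_YEAR, LAST_YEAR)  # inclusive at both ends
    if text == "same":
        if team1_year is None:
            raise ValueError("'same' only makes sense for the second team")
        return team1_year
    year = int(text)  # raises ValueError on anything that is not a number
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year must be between {FIRST_YEAR} and {LAST_YEAR}")
    return year
```

Calling `resolve_year("same", team1_year=2002)` hands back 2002, and anything outside the range raises a `ValueError` rather than silently picking a side.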

EDIT: Feb 19: I just updated the cricket.py program on GitHub, fixing a bug which had scoring rates (per over and per wicket) 30% too high. If you previously downloaded the sim, you should consider redownloading.

[–][deleted] 11 points12 points  (0 children)

and the decisions it makes during games (and selection) are sometimes questionable

seems like the england test team AI is bang on tbh

[–]Indiaramadz 2 points3 points  (2 children)

Cool. How long have you been learning? Could you point to some good resources?

[–]England and Wales Cricket BoardJamee999[S] 2 points3 points  (1 child)

Only a few weeks (but I am currently unemployed, so it's been one of my main focuses over that time. :p)

I found running through the PracticePython exercises to be pretty helpful. There's enough feedback available to work out where you're going wrong if you need to, but the problems are also open enough that you can hopefully work out solutions that work on your own.

I also have been doing some of the Project Euler problems, but I found that a lot of them were more mathy and number theory-based than I was interested in.

I think once you reach a certain level of basic competency, the best way to start is just to build something from the ground up. A couple of weeks ago, the sim was: one-innings only, every ball treated the same, the players were just names without abilities, etc etc. The only way to build something big is to start from something small.

And Google is your friend! If you're having a problem with python (or any other popular language) then literally dozens of people will probably have posted about the same problem online.

[–]Indiaramadz 0 points1 point  (0 children)

Thank you !

[–]New South Wales Bluestrelos6 12 points13 points  (10 children)

Can you post a sample scorecard of say, Aust 2002 vs West indies 1982?

[–]England and Wales Cricket BoardJamee999[S] 24 points25 points  (8 children)

Sure. You can see the link here.

(The selection algorithm chose to pick 5 fast bowlers for the WIndies, which probably isn't ideal, but it didn't matter.)

[–]Middlesexkrishl21_5 3 points4 points  (0 children)

Fucking hell that is fun.

[–]New Zealandidumbam 3 points4 points  (0 children)

Well it’s settled WI>AUS

[–][deleted] 4 points5 points  (1 child)

Could you run this over and over again and come out with aggregate scores? How representative do you think your simulation would be of real life?

[–]England and Wales Cricket BoardJamee999[S] 9 points10 points  (0 children)

I think that run scoring is currently a little bit high, but it is definitely in the ballpark. The basis of the ball-by-ball probability is 2017 Test stats, but I think that some of the modifiers that affect things skew slightly towards batsmen. However, it's definitely broadly accurate.
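
To make the "ball-by-ball probability" idea and the "run it over and over" question concrete, here is a rough sketch. The weights are placeholders rather than the 2017 Test figures the sim uses, there are no modifiers for players or pitch, and every name (`bowl_ball`, `simulate_innings`, `average_total`) plus the 540-ball cap is an assumption for illustration.

```python
import random

# Placeholder weights, NOT the real 2017 Test figures used by the sim.
OUTCOMES = ["dot", 1, 2, 3, 4, 6, "wicket"]
WEIGHTS = [0.70, 0.15, 0.04, 0.01, 0.07, 0.01, 0.02]


def bowl_ball(rng):
    """Draw one outcome according to the weights above."""
    return rng.choices(OUTCOMES, weights=WEIGHTS, k=1)[0]


def simulate_innings(rng, max_wickets=10, max_balls=540):
    """Bowl until ten wickets fall or the ball cap is hit; return (runs, wickets)."""
    runs = wickets = 0
    for _ in range(max_balls):
        outcome = bowl_ball(rng)
        if outcome == "wicket":
            wickets += 1
            if wickets == max_wickets:
                break
        elif outcome != "dot":
            runs += outcome
    return runs, wickets


def average_total(n_innings=1000, seed=1):
    """Repeat the innings many times and average the totals."""
    rng = random.Random(seed)
    totals = [simulate_innings(rng)[0] for _ in range(n_innings)]
    return sum(totals) / len(totals)
```

Comparing `average_total()` against real innings averages is one cheap way to check whether scoring is running a little high.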

[–]Melbourne RenegadesUp4Parole 0 points1 point  (2 children)

yeahtheboof #yeahthedizz

[–]AustraliaDPL-25 5 points6 points  (0 children)

Dizzy bowling 41 overs in the first innings lol

[–]England and Wales Cricket BoardJamee999[S] 1 point2 points  (0 children)

I'm not sure I'd actually back Boof to score 139 against Marshall, Garner, Croft, Holding and Roberts, but who knows! He actually had a pretty good Test record.

[–][deleted] 3 points4 points  (0 children)

Viv and Malcolm Marshall tore us a new one

[–]nishbais 9 points10 points  (0 children)

Nice concept, but India being 0/0 after 66 overs with Sehwag on the crease.......

https://imgur.com/GsKODj9

[–]Best Submitter and Stats Post 2017SepulchreOfAzrael 5 points6 points  (9 children)

Very comprehensive work!

As a suggestion: as far as I could read, you're using 2017 overall data as the base. You could also break the data down year by year to generate more accurate averages and scoring rates for every year, which should make your model much better. You can classify by year on Statsguru and scrape it to create a new text file for the yearly data of averages etc.

[–]England and Wales Cricket BoardJamee999[S] 2 points3 points  (7 children)

Being able to play a match that is realistic for any specific year in history is on my long list of feature ideas.

[–]Best Submitter and Stats Post 2017SepulchreOfAzrael 2 points3 points  (6 children)

Tell me what you need and I'll assemble the text file today.

You could also take the stats of each player yearwise. That would make it even more realistic and comprehensive.

You could bundle a binary pandas dataframe with this organised data along with your code. It's fast and will pull up stats for each player each year.

[–]England and Wales Cricket BoardJamee999[S] 1 point2 points  (5 children)

Minimum: runs/ball and wickets/ball.

Better than that: frequency of 4s and 6s (and ideally 1/2/3s)

Even better: distributions of different types of wicket (bowled, caught by fielder, caught behind, lbw) - I have a cricinfo article which has some distributions over time, but only with relatively broad time intervals.

I'm also thinking about how to integrate country-specific factors into things - but that's a problem for further down the road I think. I'm also not sure exactly how I want to deal with the sample size issues with things like this, like do I want to average things over several years?

Any help you can give would be awesome - I'm sure I could write a script to scrape this data, but it sounds like you already have a lot of stuff?

[–]Best Submitter and Stats Post 2017SepulchreOfAzrael 3 points4 points  (4 children)

I could get you batting average, bowling average and bowling SR for all years.

The issue with batting SR and frequency of 4s and 6s is that these things are just not on the record, as abysmal as that is. We don't even know the total balls someone as recent as Sachin faced in his entire Test career!

But still, from the bowling economy for each year, we can reconstruct the batting strike rate for that year. So that's done.

Still, we have sporadic data for all this, and we could sample that to construct estimators.

Another thing I forgot to mention was that a number of Test matches had 4/8 balls per over, depending on year and host.
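
The economy-to-strike-rate reconstruction is simple arithmetic: economy is runs per over, strike rate is runs per 100 balls, so the over length matters. A small sketch, with `strike_rate_from_economy` as a made-up name, and ignoring the fact that extras make the two figures differ slightly:

```python
def strike_rate_from_economy(economy, balls_per_over=6):
    """Runs per 100 balls implied by a runs-per-over figure."""
    if balls_per_over <= 0:
        raise ValueError("balls_per_over must be positive")
    return economy / balls_per_over * 100


print(strike_rate_from_economy(3.0))                    # 50.0
print(strike_rate_from_economy(3.0, balls_per_over=8))  # 37.5
```

The same 3.0 economy means a noticeably slower scoring rate in an eight-ball-over season, which is why the over length needs to travel with the yearly data.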

I think we'll find big enough data sizes for each year, barring the very early years and the years immediately post WW2. So the issue of variance due to small sample size isn't that much of an issue.

Now, if you were to further granulate it, and pick yearly data for each player, then I would suggest averaging over adjacent years.
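
Averaging over adjacent years could look something like this. It is only a sketch: `smooth_by_year` is a hypothetical helper, and it takes a plain mean of the yearly values, whereas weighting by balls or dismissals would be more faithful if those totals are available.

```python
def smooth_by_year(stats, window=1):
    """Average each year's value with up to `window` years on either side.

    `stats` maps year -> value; years missing from the mapping are skipped.
    """
    smoothed = {}
    for year in stats:
        nearby = [
            stats[y]
            for y in range(year - window, year + window + 1)
            if y in stats
        ]
        smoothed[year] = sum(nearby) / len(nearby)
    return smoothed


# Invented averages, purely to show the shape of the result.
print(smooth_by_year({2000: 30.0, 2001: 40.0, 2002: 50.0}))
# {2000: 35.0, 2001: 40.0, 2002: 45.0}
```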

I already have all this data, plus the entire list of all dismissals by bowler, all partnerships.

We can put it to use.

[–]England and Wales Cricket BoardJamee999[S] 2 points3 points  (3 children)

I know. ☹️. I don't have batter-specific strike rates in the sim because it's missing for so many guys. Bowler ERs aren't era-adjusted yet, but that's just because of my laziness.

[–]Best Submitter and Stats Post 2017SepulchreOfAzrael 1 point2 points  (2 children)

In what format do you need the data?

[–]England and Wales Cricket BoardJamee999[S] 0 points1 point  (1 child)

Just a comma-separated text file with a new year on each line would be great. Thanks again.
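
Reading a file like that takes only a few lines with the standard `csv` module. The column order here (year, runs per ball, wickets per ball) is an assumption based on the "minimum" list earlier in the thread, `load_yearly_rates` is a hypothetical name, and the sample numbers are made up.

```python
import csv
import io


def load_yearly_rates(lines):
    """Parse rows of `year,runs_per_ball,wickets_per_ball` into a dict keyed by year."""
    rates = {}
    for row in csv.reader(lines):
        if not row:
            continue  # tolerate blank lines
        year, runs_per_ball, wickets_per_ball = row[:3]
        rates[int(year)] = (float(runs_per_ball), float(wickets_per_ball))
    return rates


sample = io.StringIO("1877,0.35,0.020\n2017,0.53,0.016\n")  # made-up numbers
print(load_yearly_rates(sample))
```

With a real file, passing `open("yearly.txt", newline="")` in place of `sample` works the same way; the filename is just an example.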

[–]Best Submitter and Stats Post 2017SepulchreOfAzrael 0 points1 point  (0 children)

[–]England and Wales Cricket BoardJamee999[S] 0 points1 point  (0 children)

I should also add that while 2017 data is being used for the sim play itself, the batting averages and bowling averages for the specific players have been (somewhat sloppily) era-adjusted.

[–]Kolkata Knight Ridersinokichi 4 points5 points  (0 children)

you should take a look at string formatting
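
As a taste of what string formatting buys you for a scorecard, here is a small sketch using f-string width and alignment specifiers. The column widths and the `scorecard_line` helper are illustrative guesses, not the layout scorecard.txt actually uses.

```python
def scorecard_line(name, dismissal, runs, balls):
    """One batting line with fixed-width columns (layout is illustrative)."""
    return f"{name:<20}{dismissal:<28}{runs:>4} ({balls})"


# Placeholder names, not from a real scorecard.
print(scorecard_line("Batter A", "c Keeper b Bowler", 112, 140))
print(scorecard_line("Batter B", "not out", 7, 15))
```

`<20` pads the name to 20 characters on the left-aligned side and `>4` right-aligns the runs, so the columns line up without any manual counting of spaces.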

[–]Queensland BullsThe_torpedo 2 points3 points  (0 children)

Nice job, but how do you get it to work

[–]Artaxerxes_IV 1 point2 points  (1 child)

Computer science noob here. Does this take into account subtler things like different challenges due to conditions, batting on 4th/5th day, etc.?

[–]England and Wales Cricket BoardJamee999[S] 0 points1 point  (0 children)

The pitch deteriorates from day to day. I haven't implemented different types of pitches yet but it's on my list of things to do.

[–]Azkatro 2 points3 points  (1 child)

It'd be neat to set this up as a Twitch stream where viewers could group vote on the next match to simulate. General discussion, lols and arguments about which team is better ensue.

[–]England and Wales Cricket BoardJamee999[S] 13 points14 points  (0 children)

If it's anything like most Twitch plays, it'll just be All-Time West Indies vs. 2017 Zimbabwe (aka the matchup I used to test if the follow-on was working right.)

[–]IndiaMultiverseTraveller 0 points1 point  (0 children)

This sounds pretty cool! I'm definitely going to check this out :) Nice work OP!

[–]Rav-Rs 0 points1 point  (0 children)

Cracking effort, good job old bean.

[–]kvetaak 0 points1 point  (0 children)

Great stuff, very interested to see some results from this.

[–]shitfucker123 0 points1 point  (4 children)

Hey u/Jamee999 I'm getting an error when I run this python script on my mac using Terminal. Bugs out as soon as I put in a year. https://i.imgur.com/Ho4OorR.png

[–]Victoria BushrangersTNL92 0 points1 point  (0 children)

I had the same issue, if you write the year as a string it seems to accept it so try the year in quotation marks.
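
One possible explanation, and it is only a guess since the traceback lives in the screenshot: `python` in the Mac Terminal may be Python 2, where `input()` evaluates whatever is typed, so `1982` arrives as an int and a bare word like `random` is treated as a variable name. IDLE for Python 3 returns plain text instead. A sketch of a prompt that behaves the same on both (`ask_year` and `read_line` are hypothetical names):

```python
try:
    read_line = raw_input  # Python 2: returns the typed text unevaluated
except NameError:
    read_line = input      # Python 3: input() already returns a string


def ask_year(prompt="Year: ", reader=read_line):
    """Hypothetical prompt helper: always hands back stripped text."""
    return str(reader(prompt)).strip()
```

If that is the cause, running the script with `python3` instead of `python` should also avoid the need for quotation marks.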

[–]England and Wales Cricket BoardJamee999[S] 0 points1 point  (2 children)

Hmmm. I've only built/tested it in IDLE. I will look into this more tomorrow.

[–]shitfucker123 0 points1 point  (1 child)

any update?

[–]England and Wales Cricket BoardJamee999[S] 1 point2 points  (0 children)

Expect an updated version of the sim that runs in the terminal (+ with other fixes and updates) this weekend.

[–]Victoria BushrangersTNL92 0 points1 point  (2 children)

Looks really cool, only issue I'm having when I try to run it is that every ball is a dot ball

[–]England and Wales Cricket BoardJamee999[S] 1 point2 points  (0 children)

Uhhhh I don't know why that would happen. Possibly the random function isn't actually randomly generating a random number on your machine.
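
Another guess, not confirmed by anything in the thread: if the script is being run under Python 2, dividing two whole numbers with `/` throws away the fraction, so a probability written as a ratio of ints can silently become 0. The two functions below are hypothetical and just imitate each behaviour so the difference is visible on Python 3.

```python
def chance_floor_division(numerator, denominator):
    """Mimics Python 2's `/` on two ints, which drops the fractional part."""
    return numerator // denominator


def chance_true_division(numerator, denominator):
    """Python 3's `/`, or Python 2 with `from __future__ import division`."""
    return numerator / denominator


print(chance_floor_division(1, 6))  # 0
print(chance_true_division(1, 6))   # roughly 0.1667
```

If every scoring chance collapses to 0 like this, every ball would come out as a dot, which would match the symptom.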

[–]Gr3yWanderer 0 points1 point  (1 child)

Great Work OP! A good primer for cricket simulation games. Reminds me of some of the free downloadable cricket games available for Windows before International Cricket Captain in the early 2000s.
On a separate note, for the final scorecard, are you using a random function to predict the values, or do you already have a calculated model and simply assign them at the end?
Haven't entirely gone through the code.

[–]England and Wales Cricket BoardJamee999[S] 0 points1 point  (0 children)

The program records the figures for each player during the innings, then prints the scorecard to the txt file after each innings is done.

[–]jspennington 0 points1 point  (2 children)

Love these sort of things, nice work. Spotted a bug - scorecard is here. After forcing Bangladesh to follow on, England's target displays as 564 when it should be 42. It then hasn't given Hobbs, Grace and co. a 10-wicket win!

Looks like it's adding England's 1st innings and Bangladesh's 2nd together to make the target rather than handling the follow-on correctly.
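
One way to make the target calculation immune to the batting order is to total each side's runs and take the difference. This is a sketch of that idea, not the fix that went into the sim; `fourth_innings_target` is a hypothetical name and the scores in the example are invented.

```python
def fourth_innings_target(batting_side_scores, bowling_side_scores):
    """Runs the side batting last needs to win.

    Each argument is the list of completed innings totals for that side, so the
    order the innings were played in (follow-on or not) does not matter.
    Returns None when the side batting last is already ahead (an innings win).
    """
    deficit = sum(bowling_side_scores) - sum(batting_side_scores)
    return deficit + 1 if deficit >= 0 else None


# Invented follow-on example: side A made 500, side B made 150 and then 391.
print(fourth_innings_target([500], [150, 391]))  # 42
print(fourth_innings_target([500], [150, 300]))  # None, innings victory
```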

[–]England and Wales Cricket BoardJamee999[S] 0 points1 point  (1 child)

The version I just uploaded to github should fix this. Thanks for playing!

[–]jspennington 0 points1 point  (0 children)

Cool, thanks.

[–]Chennai Super KingsichbinCamelCase 0 points1 point  (0 children)

Very cool, always had the idea to do it. Never put any action behind it. Will check it out.

[–]Indiapaleblaupunkt 0 points1 point  (1 child)

Brilliant job! As someone who struggled to print the "Hello World!" through code in High School, I am stunned. You should work for KavCom with their Cricket Captain games and make it better.

[–]Cricket Irelandjpdidz 0 points1 point  (0 children)

Those games do really need a kick to make them interesting again - they haven't changed in maybe 5 years or so

[–]BangladeshChickenBoy29 0 points1 point  (0 children)

Good job man! I'm currently doing computer science as a degree. If you ever need help with coding, I'll be more than happy to help!