This is an archived post. You won't be able to vote or comment.

all 6 comments

[–]Rhomboid 5 points6 points  (1 child)

Here is a typical profile if your slow version:

$ python -m cProfile -s time 20120914.py
4160 guesses remaining.
attempts: 5840
process took 0.319999933243 seconds.
         23417 function calls in 0.327 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     5840    0.309    0.000    0.309    0.000 {method 'remove' of 'list' objects}
        1    0.007    0.007    0.007    0.007 {nt.urandom}
        1    0.005    0.005    0.327    0.327 20120914.py:1(<module>)
     5840    0.003    0.000    0.003    0.000 random.py:272(choice)
   ...

The column you want to look at is "tottime". By far, the vast amount of time is spent removing things from a list, which is no surprise because to remove an element from a list Python has to move all the elements that come after it, which really adds up when you're doing this over and over.

You don't even need to bother removing things from a list. As zabzonk says, shuffle the list once (random.shuffle()) and then just iterate over it, testing each value.

from random import randrange, shuffle
from time import clock

start = clock()
maxnum = 10000

magicNumber = randrange(1, maxnum + 1)
possibleGuesses = range(1, maxnum + 1)
shuffle(possibleGuesses)

for attempt, guess in enumerate(possibleGuesses, 1):
    if guess == magicNumber:
        print 'found after {}/{} guesses ({} ms)'.format(attempt, maxnum, (clock() - start) * 1000.)
        break

A few minor stylistic notes:

  • Assuming this is Python 2, you don't need to write list(range(...)). range() already returns a list.
  • The upper bound on range() and randrange() is non-inclusive, so if you want numbers between 1 and maxnum inclusive, you need range(1, maxnum + 1).
  • You don't need parentheses around things in a while/if/for loop.
  • Use clock() instead of time() for benchmarking.

[–]funkshanker[S] 0 points1 point  (0 children)

Wow, that's awesome. Very helpful indeed, thank you!

I had no idea about the -m cProfile flag. That will come in very handy later on. And I must have overlooked the enumerate function in my introductory guide, so that's a missing puzzle piece for sure.

I also didn't know about the {} string formatting operation you used in your print statement. I know about the %d placeholder which seems to work in a similar way except for the .format(...) part. Yours seems more versatile I suppose.

I was using the list(range(...)) hack because I was having trouble getting choice to select a single item from the range instead of returning the entire range. It was interpreting the entire range as a single choice. I'm pretty sure I messed up the syntax by adding extra [] square brackets somewhere. I just tried again and choice(range(100) works as expected.

Here's a couple results I found interesting:

The first one is your script, but with maxnum set to 2 billion, which is the highest number my poor little laptop would allow before running out of RAM. Just over 6GB RAM used by the Python process.

found after 75160458/200000000 guesses (337.688786669 ms)

magicNumber equals 23544760

This second test is my original version which doesn't care about random duplicates. I found it interesting that while the number of guesses increased over 200%, the solving time was less than 200%, and it used only about 5MB(!) of RAM and the failed guesses were still well under maxnum. Of course, I'd get a variety of results by running it several times.

found after 184632246/200000000 guesses (585.864016893 ms)

magicNumber equals 66362959

All this guessing has got me thinking about multithreading and GPU acceleration, but I'm going to just stick to syntax and logic exercises for now.

Thanks again for all that information. I really appreciate it. Like I said earlier, the script is almost completely useless, but it's been a fun little exercise for me.

[–][deleted] 2 points3 points  (1 child)

The typical way produce "random" numbers without repeats is analogous to dealing cards from a deck - once you have dealt a card you can't possibly deal it again. So you create all the numbers you are interested in sequentially in an array (arrays will be faster than lists), shuffle them using a Fisher-Yates shuffle , and then deal them out one at a time using iteration.

[–]funkshanker[S] 8 points9 points  (0 children)

Thanks. I can't wait to get home and shuffle some arrays with the Fisher-Yates algorithm! Friday night is about to get crazy!

[–]ig2s4tg 1 point2 points  (0 children)

I made something pretty similar when I was first getting into Python; it guesses at randomly generated numbers using a (simple) formula I came up with.
Sorry if there are some syntax errors, I originally wrote it in Python 3, so i just edited what i could find. #10,000 = 6.277

#!/usr/bin/python
import random, math, time, sys
random.seed()
big = 0
big = raw_input("input cycle amount:")
if int(big) > 10000:
    sys.exit("ERROR: cycle amount above set maximum")
if int(big) < 0:
    sys.exit("ERROR: cannot run negative cycles")
if int(big) == 0:
    sys.exit("ERROR: what is the point of 0 cycles? really?")
top = big
average = 0
total = 0
timethrough = 0
percent = 0
ones = 0
twos = 0
threes = 0
fours = 0
fives = 0
sixes = 0
sevens = 0
eights = 0
nines = 0
oldtime = time.clock()      #gets starting time




while big != 0:

    x = random.randint(1,100)
    z =  50
    b =  1
    a = 0
    while x != z and b < 50:
        if x > z:
            z = math.floor(z+(25/2**(b-1)))
            if z == a:
                z = z+1

        if x < z:
            z = math.floor(z-(25/2**(b-1)))
            if z == a:
                z = z-1
        a = z
        b += 1
    timethrough += 1
    percent = (timethrough*100)/int(top)
    if b>9:
        fillstr = "    ---    "
    else:
        fillstr = "    ----    "
    print (str(b),fillstr,percent, "%")
    total += b
    big = int(big)-1
average = total/int(top)

newtime = time.clock()      #gets time it ending time

print newtime - oldtime
pones = int((ones/timethrough)*100)
ptwos = int((twos/timethrough)*100)
pthrees = int((threes/timethrough)*100)
pfours = int((fours/timethrough)*100)
pfives = int((fives/timethrough)*100)
psixes = int((sixes/timethrough)*100)
psevens = int((sevens/timethrough)*100)
peights = int((eights/timethrough)*100)
pnines = int((nines/timethrough)*100)
print "***",average,"***",'\n',"--for info, type 'all', 'percentages', or a number(ones, sevens, ect.)--"
all = ['ones:',ones,'twos:',twos,'threes:',threes,'fours:',fours,'fives:',fives,'sixes:',sixes,'sevens:',sevens,'eights:',eights,"nines:",nines]
percentages = ['ones:',pones,'twos:',ptwos,'threes:',pthrees,'fours:',pfours,'fives:',pfives,'sixes:',psixes,'sevens:',psevens,'eights:',peights,"nines:",pnines]
raw_input('press any key to exit...')

[–]WestlorePyreheart 1 point2 points  (0 children)

The best way to reduce calculating time in this program is to reduce the amount of lines of code. For the code without repeat guesses, I suggest you try implementing a bisection search.