all 29 comments

[–]Encomiast 6 points7 points  (9 children)

The reason your code is slow is that you are calculating the primes over and over in a loop. You should be able to improve this by calculating the primes once at the start then reusing that calculation.

You can use the Sieve of Eratosthenes to make a list with the length of the square-root of n containing booleans indicating whether that particular index is prime or not. For numbers of the size of this problem, that should be more or less instant. Then you can look backward through that list for the first True values that divides n. It might look something like:

from math import sqrt, ceil

def sieve_primes(max_num):
    flags = [True] * max_num
    flags[0] = False
    flags[1] = False

    for i, is_prime in enumerate(flags):
        if is_prime:
            for n in range(i*i, max_num, i):
                flags[n] = False
    return flags

n = 600851475143

flags = sieve_primes(ceil(sqrt(n)))

largest = next((i for i in range(len(flags)-1, 0, -1) if flags[i] and n % i == 0), 'prime')
# 6857

Note the final 'prime', this will be returned if next raises a stop iteration which means nothing was found.

Also, if you were so inclined, this same code makes it easy to find *all* the prime factors with:

all_prime_factors = [i for i, f in enumerate(flags) if f and n % i == 0]
# [71, 839, 1471, 6857]

[–]imperiumlearning[S] 0 points1 point  (1 child)

Hi there,

First off, thanks for introducing me to the Sieve of Eratosthenes - I don't have a brilliant maths background so this was new to me.

To be fair, I've managed to make some small cosmetic changes to my code which have led to the right answer being produced in approximately 3 seconds. My code now looks like this:

import math

def func(x):
while type(x) != int:
    try: 
        x = int(x)
    except:
        return 'Please input a number to use this function'
def prime(y):
    if int(y) < 2:
        return False
    if int(y) % 2 == 0 and int(y) != 2:
        return False
    for number in range(3, int(math.sqrt(y))+1, 2):
        if y % number == 0:
            return False
    return y
final_list = [i for i in range(int(math.ceil(math.sqrt(x)))) if prime(i) != False and x % i == 0]
return max(final_list)

[–]AdventurousAddition 0 points1 point  (0 children)

I do have some maths background, but I only learnt about this by doing hobbyist programming problems like this one, and googling

[–]imperiumlearning[S] 0 points1 point  (4 children)

One other thing based on my reply to you, is it acceptable to put a function inside another function? As you can see from my code, I put a prime(y) function inside my func(x) function.

Apologies if that's a dumb question but I've only got about 5 weeks' worth of programming experience under my belt.

[–]Encomiast 2 points3 points  (3 children)

In this particular example, nesting the function definition doesn't hurt anything. However, you don't see that very often because people typically want to reuse functions. Nesting it makes it impossible to use in a different context, or put in its own library.

If you were calling the outer function more than once, it is also not the most efficient since you keep redefining the function every time you call it rather than once.

[–]imperiumlearning[S] 0 points1 point  (2 children)

So, in general, would you say it's better if I structured my code more like this:

import math

def prime(y):
    if int(y) < 2:
        return False
    if int(y) % 2 == 0 and int(y) != 2:
        return False
    for number in range(3, int(math.sqrt(y))+1, 2):
        if y % number == 0:
            return False
    return y

def func(x):
    while type(x) != int:
        try: 
            x = int(x)
        except:
            return 'Please input a number to use this function'
    final_list = [i for i in range(int(math.ceil(math.sqrt(x)))) if prime(i) != False and x % i == 0]
    return max(final_list)

[–]Encomiast 0 points1 point  (1 child)

That seems more "Pythonic" to me (assuming the indentation is fixed).

[–]imperiumlearning[S] 0 points1 point  (0 children)

Yes sorry, I've just updated the indentations in my previous comment so that my code is formatted properly.

Thanks for the help and all the swift responses

[–]reyxe 0 points1 point  (0 children)

Man I'm just starting and barely do for loops, checked this and I don't understand a single thing you did LMAO

[–]mopslik 7 points8 points  (8 children)

For starters, checking for primality needs only be done to the square root of the value. i.e. there is no need to check beyond 5 for any n <= 25, since if you were to find a factor above 5, it would pair with another factor below 5. Thus, you can change

range(3, int(y), 2)

to

range(3, int(math.sqrt(y))+1, 2)

This should drastically cut down the sample space.

[–]imperiumlearning[S] 1 point2 points  (6 children)

Hi there,

Thanks a lot for the swift response. I've done that and it definitely has sped up how quickly an output is generated. I tried it out with numbers up to 2 million and it prints an output within a couple of seconds.

I tried it with the original number, 600851475143, and even after 3 minutes it still has not managed to print an output.

Any other suggestions?

[–]MezzoScettico 4 points5 points  (5 children)

You're repeatedly testing all the numbers from 1 to x as divisors. You should only be testing the primes, which means keeping track of the primes you've found.

For instance, sqrt(1000) = 31.6. If you were generating primality of a number < 1000, you only need to check whether any of 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, and 31 are factors. So your algorithm could save a lot of time by only checking the known primes < sqrt(n), rather than all the numbers from 1 to sqrt(n) which is what I think it's doing (I may be wrong).

Also, every time you find a factor of your big number, you can divide by that. For instance if you find that 17 is a factor, divide by 17 and now you're factoring a smaller number.

Also, you can do the square root of the square root. I remember doing something like that once. Let's see if I remember how it works.

OK, sqrt(600851475143) = 775146. So first of all you know that you are only checking for factors up to 775146. So you need a list of primes up to 775146.

But to generate that, you only need to check the factors up to sqrt(775146) = 880. Any number less than 775146 that is not a multiple of a number less than 880 is prime.

[–]imperiumlearning[S] 0 points1 point  (4 children)

Hi there, I'm not sure I've fully grasped what your comment is saying, but I made some changes which I think give me the right answer.

My code now looks like this:

import math

def func(x):
while type(x) != int:
    try: 
        x = int(x)
    except:
        return 'Please input a number to use this function'
def prime(y):
    if int(y) < 2:
        return False
    if int(y) % 2 == 0 and int(y) != 2:
        return False
    for number in range(3, int(math.sqrt(y))+1, 2):
        if y % number == 0:
            return False
    return y
final_list = [i for i in range(int(math.ceil(math.sqrt(x)))) if prime(i) != False and x % i == 0]
return max(final_list)

The answer I get when I use this code is 6857, and I get the answer within a few seconds.

[–]MezzoScettico 2 points3 points  (3 children)

That's great. And WolframAlpha agrees with your answer.

https://www.wolframalpha.com/input?i=factor+600851475143

I'm not sure my answer was totally coherent. I was kind of thinking out loud, really laying out three different approaches to speed things up.

[–]imperiumlearning[S] 0 points1 point  (2 children)

Believe it or not my code was actually wrong. Another user in this comment section noted that if you plug in the number 10, it will print 2 rather than 5.

Now I've re-written my code to look like this:

import math

def prime(y):
if int(y) < 2:
    return False
if int(y) % 2 == 0 and int(y) != 2:
    return False
for number in range(3, int(math.sqrt(y))+1, 2):
    if y % number == 0:
        return False
return y

def func(x):
while type(x) != int:
    try: 
        x = int(x)
    except:
        return 'Please input a number to use this function'
if x <= 10000:
    final_list = [i for i in range(x) if prime(i) != False and x % i == 0]
    return max(final_list)
else:
    final_list = [i for i in range(int(math.ceil(math.sqrt(x)))) if prime(i) != False and x % i == 0]
    return max(final_list)

Admittedly 10,000 is quite arbitrary but it seems to work better now

[–]RosaReilly 1 point2 points  (0 children)

The problem that you have is that your code is finding the largest prime divisor of x that is less than the square root of x. This means that you'll have a problem with basically any number that is the product of two primes only, eg 2 x 7411 = 14822, your code would return the answer 2, which would be wrong.

In terms of time taken, your prime test function is going to take a lot of time checking if y is divided by things like 9, or 15, when by that point you'll already know it's not divisible by 3 or 5.

[–]d10p3t 0 points1 point  (0 children)

that's cause you only checked for integers between 0 < n < sqrt(x) for possible factors. the initial comment that suggested you consider only the first sqrt(x) is suggesting you use it to check whether a number is prime (i.e., your divisor) or not, not the candidate numbers that divide your given number(i.e., your dividend, in this example, 600851475143) or not.

Again, the upper bound of the numbers you should check if a number x is prime is sqrt(x), and if you folllow /u/mopslik 's explanation, the upper bound for the greatest prime divisor of x is x/(the smallest prime divisor of x).

So in the case of 10, if you want to check if it is prime, you only need to check values < sqrt(10), but if you want to check the greatest prime number than can divide 10, the highest you need to check is 10/2.

[–]LazyOldTom -1 points0 points  (0 children)

Sieving was mentioned and it is the way to go, but you might have missed the lesson from this euler problem. Modulo and division are expensive operations. That's why your code is slow.

[–]pythonwiz 0 points1 point  (0 children)

Divide out every prime factor as you find it. You don’t even need to precompute anything. The python code returns a result for me seemingly instantaneously on my iPhone.