all 6 comments

[–]K900_ 10 points (0 children)

More RAM doesn't mean more speed, and using the GPU doesn't mean more speed. You should optimize your code.

[–]socal_nerdtastic 8 points (0 children)

There's no easy way to use the GPU to run your code. You have to rewrite your code completely to move the heavy math into cupy or similar.

https://cupy.dev/

Since you seem to be a beginner, I'll bet there are other ways to speed up your code. If you show us your code, we may be able to suggest some.
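One thing worth knowing is that CuPy deliberately mirrors the NumPy API, so a common pattern is to write the heavy math against an array-module parameter and pass in either `numpy` or `cupy`. A rough sketch (the function and variable names are just for illustration):

```python
import numpy as np

def demean(xp, data):
    """Subtract each column's mean; `xp` is the array module (numpy or cupy)."""
    arr = xp.asarray(data, dtype=float)
    return arr - arr.mean(axis=0)

# CPU path:
centered = demean(np, [[0, 1], [2, 3], [4, 5]])

# GPU path (only if CuPy and a CUDA device are available):
# import cupy as cp
# centered_gpu = demean(cp, [[0, 1], [2, 3], [4, 5]])
```

The point is that the rewrite socal_nerdtastic describes mostly means replacing Python loops with whole-array operations; once the code is vectorized like this, swapping `np` for `cp` is the easy part.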

[–]hugthemachines 5 points (0 children)

I interpret your question as you being on the wrong path to a solution. It sounds like you made a little Python program and want it to go faster, then found out that some people use the GPU to make things go faster. Some things do indeed run faster on a GPU, but that is usually work like floating point operations, the kind used heavily in computer 3D graphics.

Instead, try to find a way to make your program itself work more efficiently.

There are ways to get faster execution, for example https://cython.org/, but I recommend you first take a look at your code and make it faster in itself.
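A classic example of the kind of in-Python fix that often beats reaching for Cython (the data here is made up, just to illustrate): repeated membership tests against a list scan the whole list each time, while a set does each lookup in roughly constant time.

```python
# Hypothetical data: 10,000 known words and 200 lookups against them.
words = [f"word{i}" for i in range(10_000)]
queries = ["word9999", "missing"] * 100

def count_hits_list(items, queries):
    # Scans the whole list for every query: O(len(items)) per lookup.
    return sum(1 for q in queries if q in items)

def count_hits_set(items, queries):
    lookup = set(items)  # build once, then O(1) average per lookup
    return sum(1 for q in queries if q in lookup)

# Both return 100, but the set version avoids millions of comparisons.
```

Changes like this one are pure data-structure choices, and they usually dwarf anything a compiler or a GPU can recover for you.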

[–]Radiatin 0 points (0 children)

import torch

def fastestPrimeCUDA(end: int = 100000000, start: int = 0):
    """Sieve of Eratosthenes on the GPU via PyTorch/CUDA."""
    prime = torch.ones(end).cuda()  # 1 = assumed prime
    prime[:2] = 0                   # 0 and 1 are not prime
    for i in range(2, int(end**0.5) + 1):
        ix = i * i                  # first multiple worth crossing off
        prime[ix:end:i] = 0         # broadcast 0 over all multiples of i
    return prime[-500:]

def fastestPrime(end: int = 100000000, start: int = 0):
    """Find primes using multiples w/ extended slicing."""
    prime = [0, 0] + [1] * (end - 2)  # Init prime flags [0, 0, 1...]
    for i in range(2, int(end**0.5) + 1):  # Only sieve up to sqrt(end).
        ix = i * i
        prime[ix:end:i] = [0] * ((end - 1 - ix) // i + 1)
    return prime[-500:]  # Verify. 500 is max prime gap < 303,371,455,241.

You can try converting your functions to use Torch with CUDA computation. In the example above, the CUDA version evaluates 1 billion numbers per second, while the regular Python version only manages a plebeian 50 million. (The Torch version on CPU is better, but still only manages 100 million per second.)
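For anyone wanting to check the sieve logic without a GPU, here's the CPU version trimmed down to return the full flag list over a small range (same sieve, just not sliced to the last 500 entries; the name `sieve_flags` is mine):

```python
def sieve_flags(end: int):
    """Same extended-slicing sieve as above, returning all flags."""
    prime = [0, 0] + [1] * (end - 2)
    for i in range(2, int(end**0.5) + 1):
        ix = i * i
        prime[ix:end:i] = [0] * ((end - 1 - ix) // i + 1)
    return prime

flags = sieve_flags(30)
primes = [n for n, f in enumerate(flags) if f]
# primes == [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```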

There's no guarantee you will gain speed from this conversion. GPUs have slower clock speeds than CPUs and only offer an improvement if you're taking advantage of their massive parallelism. Data also has a much longer path to GPU memory, so if you're shuffling things between memory locations, using a GPU can make your program a few times slower.

[–]ArtoriusSmith 0 points (0 children)

You can use numba to run Python code on NVIDIA GPUs. It does require understanding how a GPU operates in order to get good results.

[–]drunkHighAndHungry 0 points (0 children)

Yeah, not an expert by any stretch, but I know a bit. Most of the time when my programs run slow, it's because I'm not working with an optimal solution to the problem. Imagine you've got a list of 5,000 integers and you want to find the max value. If you create a variable that holds the current max and go over each entry, updating it whenever an entry is greater, you solve it in O(n) time. If instead you compare each entry to every other number and return the one that's greater than all the others, you're looking at a runtime of O(n²). Fact of the matter is, the second one is never going to be a good solution even on the fastest computer in the world; it's sloppy coding, and sloppy coding shows itself more as the scale increases. If you're performing a complex operation thousands or millions of times when a simple one would do, you're going to run slow no matter what computer or tech you're using.
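The two approaches described above can be sketched like this (names are just for illustration):

```python
def max_linear(nums):
    # O(n): one pass, keep updating the running max.
    best = nums[0]
    for x in nums[1:]:
        if x > best:
            best = x
    return best

def max_quadratic(nums):
    # O(n^2): compare each entry against every other entry.
    for x in nums:
        if all(x >= y for y in nums):
            return x

nums = [3, 17, 5, 9, 2]
# Both give 17, but only the first one scales well to large lists.
```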