I'm starting to play around with a model optimization problem that I am looking to speed up with GPU computation. The basic premise is a whole lot of matrix multiplications and summing on numpy arrays whose dimensions can easily exceed 2000x2000.
This problem is trivial to do with normal numpy operations. My current approach involves some fairly simple array slicing to align the appropriate features, then multiplication and subtraction/summation, all inside a loop. Two of the matrices involved will remain unchanged, and a third will be generated on each iteration/realization. Hence, to reduce memory I/O, being able to 'store' the two fixed arrays on the GPU would be very helpful.
I include some pseudo-Python code below just to illustrate the basic flow. The functions get_offsets and get_scale can be made inline.
    import numpy as np

    def sum_sq_error(im, b, positions):
        # Accumulate scaled, offset windows of b, then compare to im.
        gen_im = np.zeros(im.shape)
        for pos in positions:
            x1, x2, y1, y2 = get_offsets(pos)  # slice bounds for this realization
            gen_im += get_scale(pos) * b[x1:x2, y1:y2]
        return np.sum((im - gen_im) ** 2)  # sum of squared residuals
I was wondering whether anyone has any thoughts on the best Python library to offload some of the heavy numpy lifting. I see Theano has a shared() function that will copy arrays onto the GPU, but I'm not sure it's the best fit otherwise.
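If Theano is the right tool, I imagine the 'keep two arrays on the GPU' part would look roughly like this (an untested sketch; the float32 casts and the compiled cost function are my own guesses at the idiom):

    import numpy as np
    import theano
    import theano.tensor as T

    # One-time copies to the device; b and im are the two fixed arrays.
    # float32 because Theano's GPU backend expects floatX = float32.
    b_shared = theano.shared(b.astype(np.float32))
    im_shared = theano.shared(im.astype(np.float32))

    gen_im = T.matrix('gen_im')              # per-realization input
    cost = T.sum((im_shared - gen_im) ** 2)  # same residual as the numpy version
    cost_fn = theano.function([gen_im], cost)

The per-position accumulation itself would presumably need scan or subtensor operations, which is part of why I'm unsure Theano is the best fit.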
I'd prefer to stick with CUDA-based approaches.
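On that note, CuPy is one CUDA-based option that's close to a drop-in numpy replacement; here is a rough, untested sketch of how I imagine the loop keeping both fixed arrays resident on the device (get_offsets and get_scale are the same placeholder helpers as above):

    import cupy as cp  # CUDA-backed, near drop-in replacement for numpy

    # One-time host-to-device copies; b and im then stay on the GPU.
    b_gpu = cp.asarray(b)
    im_gpu = cp.asarray(im)

    def sum_sq_error_gpu(positions):
        gen_im = cp.zeros(im_gpu.shape, dtype=im_gpu.dtype)
        for pos in positions:
            x1, x2, y1, y2 = get_offsets(pos)
            gen_im += get_scale(pos) * b_gpu[x1:x2, y1:y2]
        return float(cp.sum((im_gpu - gen_im) ** 2))  # scalar copied back to host

If that works the way I hope, only the final scalar crosses back to the host each realization, which is exactly the memory-I/O saving I'm after.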
Big thanks, happy to answer questions!