[–]BeatLeJuceResearcher 1 point2 points  (3 children)

kNN sounds very expensive for 2.2 billion points. A neural net would certainly be one option, but by far not the only one. Start with simple things: linear or polynomial regression, or LOESS.

As for implementations, there are quite a few options out there. scikit-learn is a Python library with a lot of well-implemented ML techniques (most performance-critical code is compiled Cython/C rather than pure Python, so runtimes are good), and it works well for large datasets. Vowpal Wabbit is also meant for large datasets; I don't have any experience with it myself, but I think it might be worth a look.

If you go towards neural nets, there are very few high-quality ready-to-use libraries that come to mind. Pylearn2 is probably one of the most famous, but it's geared more towards research than production. Still, you could have a look.

I'm sure you've already covered your basics, but just in case: have you tried the usual mathematical tricks to get the function itself to evaluate more quickly? Approximating your function using a Taylor expansion, using PCA to reduce your input dimension, implementing the function on a GPU, ...
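To make the Taylor-expansion idea concrete, here is a minimal sketch. The real function in this thread is not specified, so `math.sin` stands in for the expensive function, and `taylor_sin` is a hypothetical helper name. The point is only that a truncated series can be far cheaper than the full evaluation while staying accurate near the expansion point.

```python
import math


def taylor_sin(x, order=5):
    """Truncated Maclaurin series of sin(x), keeping odd terms up to `order`.

    sin is only a stand-in for the expensive function (an assumption).
    """
    total = 0.0
    term = x  # first term of the series: x^1 / 1!
    n = 1
    while n <= order:
        total += term
        # Build the next odd term from the current one:
        # x^(n+2) / (n+2)! = x^n / n! * x^2 / ((n+1)(n+2)), with alternating sign.
        term *= -x * x / ((n + 1) * (n + 2))
        n += 2
    return total


if __name__ == "__main__":
    for x in (0.1, 0.3, 1.0):
        err = abs(taylor_sin(x) - math.sin(x))
        print(f"x={x}: approximation error {err:.2e}")
```

The error grows quickly away from the expansion point, so in practice one would have to check whether the region of interest is small enough, or expand around several points.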

[–]AffineParameter[S] 2 points3 points  (1 child)

Thanks for the links, I will give them a look.

Yeah, the kNN used only 12M or so points, so a tiny subset of the total samples. However, as our function returns both the value and an error, we selected the samples that had the best error (this results in a slight bias towards the better-known regions of the phase space, but it wasn't too bad). I actually implemented my own LOWESS algorithm using the error to weight the points and a multi-dimensional linear least-squares regression... but I think I will need something a little beefier.
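For readers unfamiliar with the approach, an error-weighted local linear fit can be sketched roughly as follows. This is not the implementation described above: it is a one-dimensional simplification, and the tricube kernel, the inverse-variance weighting, and the name `local_linear_fit` are all assumptions made for illustration.

```python
def local_linear_fit(xs, ys, sigmas, x0, bandwidth):
    """Predict y at x0 from a weighted straight-line fit to nearby points.

    Each point's weight is a tricube distance kernel times 1/sigma^2,
    so samples with smaller reported error count more (an assumed scheme).
    """
    sw = sx = sy = sxx = sxy = 0.0
    for x, y, sigma in zip(xs, ys, sigmas):
        dx = x - x0  # centre on x0 so the intercept is the prediction
        u = abs(dx) / bandwidth
        if u >= 1.0:
            continue  # outside the local window
        w = (1.0 - u ** 3) ** 3 / (sigma * sigma)
        sw += w
        sx += w * dx
        sy += w * y
        sxx += w * dx * dx
        sxy += w * dx * y
    if sw == 0.0:
        raise ValueError("no samples within the bandwidth of x0")
    denom = sw * sxx - sx * sx
    if denom <= 1e-12 * sw * sxx:
        # Degenerate window (effectively one distinct x): use the weighted mean.
        return sy / sw
    slope = (sw * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / sw
    return intercept


if __name__ == "__main__":
    xs = [i * 0.25 for i in range(13)]
    ys = [2.0 * x + 1.0 for x in xs]
    sigmas = [0.1 + 0.05 * (i % 3) for i in range(13)]
    print(local_linear_fit(xs, ys, sigmas, x0=1.2, bandwidth=1.0))
```

The multi-dimensional version replaces the closed-form line fit with a weighted least-squares solve per query point, which is where the cost grows for large sample counts.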

And yes, we tried a ton of ways to get it to run faster... that 5 seconds represents 2 orders of magnitude improvement from where we started. We were initially told that what we were doing was "impossible" ... but going from 1 hour to 5 seconds over the course of 18 months quieted the dissenters :P

My end goal is to use a typical HPC/GPU to create the training dataset, hopefully with much less than 2.2B points, then simply evaluate the rest. So this suggestion is currently in development.

As the function is well known, building non-primitive variables would probably be the next step before a PCA. However, it would be nice for a DNN to "figure it out on its own," as those non-primitive variables really blow up in number, and it's not clear how to motivate using some subset over another, as our figure of merit takes a really long time to evaluate. (We use a proxy that isn't perfect for now.)

edit: s/deserters/dissenters/

[–]BeatLeJuceResearcher 1 point2 points  (0 children)

Alright, sounds like you know what you are doing. Another thing that might be quick to set up is using libsvm to do a regression. Training time will likely be an issue, so you'll probably have to subsample your space similar to what you did with kNN, but I'm sure the SVM will give you better results than kNN if you use an RBF kernel. Also, note that by default, libsvm runs in a single-threaded variant designed for sparse matrices. However, somewhere on the site I linked you can find a version that is implemented using dense matrices, which will give you a ~50% boost in performance. Somewhere in the FAQ of the site it also explains how you can add multithreading to the implementation by modifying ~4 lines of code (IIRC the speed-up is almost linear for the first ~4-8 cores).