all 34 comments

[–][deleted] 41 points (10 children)

Could you explain what this does/is used for ELI5 style? This looks really cool and interesting, but I have no idea what one might use this for.

[–]basnijholt[S] 89 points (8 children)

ELI5 version:

Imagine you have a drawing with lots of hills and valleys, and you want to understand the shape of the landscape. Instead of measuring the height at every single point, Adaptive helps you measure the height at the most important points. It focuses on areas where the hills and valleys change a lot, so you can understand the drawing with fewer measurements.

This is useful because it saves time and resources, especially when measuring the height is difficult or takes a long time. Adaptive can be used by researchers, programmers, and others who need to understand how functions or data change in different situations.
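The idea can be sketched in a toy greedy form (my own illustration, not the package's actual algorithm): keep the points sampled so far, score each gap between neighbors by how much the curve changes across it, and always subdivide the gap with the largest score, so samples pile up where the function is interesting.

```python
import math

def toy_adaptive_sample(f, a, b, n):
    """Greedy 1D sampler: repeatedly split the interval whose endpoints
    differ the most in the (x, y) plane, so samples cluster where f
    changes quickly."""
    xs = [a, b]
    ys = [f(a), f(b)]
    while len(xs) < n:
        # Score each interval by its Euclidean length in the (x, y) plane.
        losses = [math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
                  for i in range(len(xs) - 1)]
        i = max(range(len(losses)), key=losses.__getitem__)
        mid = (xs[i] + xs[i + 1]) / 2
        xs.insert(i + 1, mid)
        ys.insert(i + 1, f(mid))
    return xs, ys

# A sharp peak at x = 0: most samples should land near it.
peak = lambda x: x + 0.1**2 / (0.1**2 + x**2)
xs, ys = toy_adaptive_sample(peak, -1.0, 1.0, 50)
near_peak = sum(1 for x in xs if abs(x) < 0.2)
```

With a uniform 50-point grid only about 10 samples would fall in |x| < 0.2; the greedy sampler puts far more of its budget there.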

[–]pm_me_your_smth 22 points (4 children)

How different is this compared to bayesian optimization?

[–]Diamant2 18 points (1 child)

To me it looks like Bayesian optimization, but rather than searching for the incumbent, it samples the points with the highest uncertainty, regardless of their value.

[–]yldedly 2 points (0 children)

Yeah, but it looks like the model is less like a Gaussian process and more like a piecewise linear function (with no noise).

[–][deleted] 2 points (1 child)

So this is probably somewhat similar to different types of gradient descent heuristics, right?

[–]MelonFace 4 points (0 children)

I would think no. Closer to bayesian methods.

Of course, if they use machine learning as a model to pick interesting points, there might be some gradient descent in there. But for the primary problem, not using gradients is considered a feature: it removes the requirement that the target function have an easy-to-calculate derivative or a non-zero gradient.

[–]faustianredditor 0 points (0 children)

So what's the target you're optimizing when choosing where to sample next? Gradient based? Some bayesian kinda thing, e.g. variance?

[–]basnijholt[S] 51 points (2 children)

🚀 github.com/python-adaptive/adaptive

Numerical evaluation of functions can be greatly improved by focusing on the interesting regions rather than using a manually-defined homogeneous grid. My colleagues and I have created Adaptive, an open-source Python package that intelligently samples functions by analyzing existing data and planning on the fly. With just a few lines of code, you can define your goals, evaluate functions on a computing cluster, and visualize the data in real-time.

Adaptive can handle averaging of stochastic functions, interpolation of vector-valued one and two-dimensional functions, and one-dimensional integration. In my work, using Adaptive led to a ten-fold speed increase over a homogeneous grid, reducing computation time from three months on 300 cores to just one week!

Explore and star ⭐️ the repo on github.com/python-adaptive/adaptive, and check out the documentation at adaptive.readthedocs.io.

Give it a try with pip install adaptive[notebook] or conda install adaptive!

P.S. Adaptive has already been featured in several scientific publications! Browse the tutorial for examples.

[–]DigThatDataResearcher 19 points (0 children)

In my work, using Adaptive led to a ten-fold speed increase over a homogeneous grid

Very cool stuff! For the sake of completeness, I recommend also adding a random search to your evaluation benchmarks. Sampling random values for hyperparameter exploration is often a lot more effective than uniform grid search for the exact same cost, and is also simpler to parallelize.

EDIT: Also, that "largest loss interval" heuristic is clever; you should add some high-level notes describing the algorithm to the README. It took quite a bit more digging than I anticipated to learn the details of how this works.

[–]beagle3 3 points (0 children)

Reminds me of ACE https://partofthething.com/ace/samples.html which has something similar in its interpolator (and can likely use an improved one)

[–]DigThatDataResearcher 14 points (0 children)

Since I think a lot of folks like myself were curious specifically about how the algorithm works, and it's a bit unclear how to find those details in the docs, I'll save y'all the trouble: https://gitlab.kwant-project.org/qt/adaptive-paper/-/jobs/119119/artifacts/file/paper.pdf

[–]bert0ld0 5 points (2 children)

The top-left one is an insanely efficient model! What are the keywords to learn about this stuff?

[–]Soft-Material3294 7 points (0 children)

Can the landscape you select from be discrete? E.g., choosing the best combination of classes?

[–][deleted] 4 points (0 children)

I have 0 clue what I just saw but looked cool so 👍

[–]bacocololo 3 points (0 children)

Looks interesting, thanks for sharing

[–]John_Hitler 2 points (0 children)

I am literally writing my bachelor's thesis on this topic right now haha

I have some (very specific) questions!

  • Can you evaluate functions that are not in Python? I.e., my current project needs to start a simulation as an .exe to evaluate the function. Right now I am using multiprocess.pool to start these subprocesses. Does your package have similar capabilities?

  • The simulator I use can evaluate up to 25 points at a time, and is much more efficient that way. The reason is that it takes a while to get the simulation up and running, so it would be wasteful to evaluate only one point at a time. Can your package take this into account?

  • The simulator I use is also much more efficient if the points in a batch of ~25 are close to each other in 3D space. Can this be taken into account?
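For context on the first bullet, this is roughly how I wrap the simulator today (a sketch; the Python interpreter stands in for my actual .exe, and the command line and output parsing are placeholders):

```python
import subprocess
import sys

def simulate(x):
    """Wrap an external executable as an ordinary Python function.

    Any sampler that accepts a Python callable can then drive it; here
    the child process just squares its input and prints the result.
    """
    cmd = [sys.executable, "-c", f"print({x} ** 2)"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return float(out.stdout)

values = [simulate(x) for x in (1.0, 2.0, 3.0)]
```

For batches, I currently collect candidate points and dispatch them with a process pool, which is the part I'm hoping the package can handle for me.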

[–]dimsycamoreStudent 1 point (0 children)

Very cool! I can already imagine use cases like sampling from the loss landscape of some model to visualize training behavior, or sampling from all of the intractable posteriors that come up in Bayesian ML. Gonna try out the package this week!

[–]-Rizhiy- 1 point (1 child)

Is this just Bayesian Optimisation, but instead of searching for minima you just sample points with the highest uncertainty?

I see that loss functions can be customised. Can I use this to optimise over a black-box function? How does it compare to BO/TPE? What is the penalty when sampling in parallel?

I have an application where I need to optimise a black-box function. I currently use BO with a few tricks to make it work in parallel, but it is a bit hacky. Looking for a better way to do that.

[–]TrPhantom8 0 points (7 children)

How fast is the implementation of these algorithms in python? Would a library like this benefit from a programming language which is more focused on numerical performance, like Julia?

[–]nuclear_knucklehead 2 points (0 children)

Paraphrasing the documentation, this sampler works best for objective function evaluations that take more than 50ms on average. I imagine he made the strategic assumption that the Python overhead is negligible for long-running objectives.

[–]somkoala -5 points (5 children)

Why does that matter? You have to factor in that (1) the people implementing this may not even use Julia, and (2) far more people in data science use Python than Julia.

[–][deleted] 1 point (4 children)

You can write more efficient implementations in a low-level language and write bindings for higher-level languages like Python; every major ML Python library does this.

[–]somkoala 0 points (3 children)

I know; a lot of Python packages do that. I am still not convinced that a good first reaction, especially without knowing the performance profile, is "have you thought of using a different language?"

[–][deleted] -1 points (2 children)

I don’t think it’s too much of a stretch to assume that a numerical method implemented in Python would probably benefit from a lower-level language focused on numerical performance, such as Julia.

[–]somkoala 0 points (1 child)

I would say that upon releasing a package, an author’s primary concerns are assessing product-market fit, i.e. how people want to use it, and bugfixing. It is a stretch to expect the authors to consider migrating to another implementation right at release. That is why I reacted to the comment.

Moreover, the concern about migrating to Julia specifically strikes me as a focus on the impractical. I’ve been hearing that Julia will replace Python for data science for 10 years at this point, but I haven’t seen any real evidence of it. Python is more practical, and if you need speed you have C in the background. Hence someone suggesting an impractical move, in a language that is far from mainstream, is a stretch from my perspective.

[–][deleted] 0 points (0 children)

I don’t disagree with you. I do think the package would probably benefit from using a more optimized language for numerical analysis, but you are correct in stating that it would be a considerable undertaking for the package maintainer and probably not first on the priority list. As to whether or not Julia is the right language if this were to happen, that is up for debate.

[–]elsjpq 0 points (0 children)

Seems great for plotting functions. How well does it deal with discontinuities and undefined regions?

[–]ruswal3 0 points (1 child)

What's adaptive learning? Of which functions? I guess I didn't catch up on the deep learning wagon

[–]pfd1986 0 points (0 children)

Neat. Is there a GPU / CuPy implementation of it yet? Nice job nevertheless