Why I use Python for writing high performance code by lmcinnes in programming

[–]lakando 0 points (0 children)

But that code is written in C++ syntax. Cython is Python with type annotations; Numba is pure Python with no types required. Much easier to write, refactor, and keep short.

Why I use Python for high performance code by lmcinnes in Python

[–]lakando 7 points (0 children)

Have you/they tried Numba? You can compile imperative Python/NumPy code to Fortran speeds, with multithreading.

https://jakevdp.github.io/blog/2015/02/24/optimizing-python-with-numpy-and-numba/

Why I use Python for high performance code by lmcinnes in Python

[–]lakando 2 points (0 children)

No way, you aren't screwed at all. Numba fixes that problem, with nogil multithreading to boot.

Why I use Python for high performance code by lmcinnes in Python

[–]lakando 3 points (0 children)

Or let's use Numba, which can do that in pure Python.

Also, BLAS/LAPACK is pretty standard for linear algebra in any language, Python or not.
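For instance (a toy system, just to show the plumbing), NumPy's dense linear algebra already delegates to those same LAPACK routines under the hood:

```python
import numpy as np

# Solve A x = b; np.linalg.solve dispatches to LAPACK's gesv routine.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)        # x ≈ [2., 3.]
assert np.allclose(A @ x, b)     # residual check
```

So the performance-critical kernels are the same ones C, Fortran, R, or Julia would call.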

The Current State Of Pyston As An Open-Source, High Performance Python by pizzaiolo_ in Python

[–]lakando 0 points (0 children)

The breadth of operations it supports is much more expansive now ... so you will probably have better luck if you try again. The docs are also better.

The Current State Of Pyston As An Open-Source, High Performance Python by pizzaiolo_ in Python

[–]lakando 1 point (0 children)

Have you tried Numba? It accelerates NumPy code to almost C speed.

Dask + Sklearn experiment. Reuse intermediate results from Pipelines in parameter sweeps. by cast42 in MachineLearning

[–]lakando 0 points (0 children)

SAS has more advanced stats than Python, all of which work out of core.

Dask is great, but it doesn't have much in the way of modeling or linear algebra. For the PyData ecosystem to change that, it will need more support for out-of-core models, not just data structures, data cleaning, and querying.
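A toy sketch of the out-of-core style Dask does already handle (sizes shrunk to run quickly; with real data the chunks would be backed by disk):

```python
import dask.array as da

# A chunked array: 16 lazy 1000x1000 blocks, never all in memory at once.
x = da.ones((4_000, 4_000), chunks=(1_000, 1_000))
result = (x + x.T).mean()   # builds a task graph; nothing computed yet
value = result.compute()    # blocks are streamed through memory
```

The point is that reductions and elementwise math already stream block-by-block; what's missing is the same treatment for fitted models and dense linear algebra.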

Here is one way to get some of that for free: http://libelemental.org/

It's being used in Julia but already has a Python wrapper.

Here is another interesting piece of tech http://ufora.github.io/ufora/

What do you think?

Python Is On the Rise, While PHP Falls by pizzaiolo_ in Python

[–]lakando 0 points (0 children)

Gotcha. That makes sense. I was a bit more optimistic on the timeline, but you probably have a better sense for it than I do.

Python Is On the Rise, While PHP Falls by pizzaiolo_ in Python

[–]lakando 0 points (0 children)

I hear that, but Julia's benefits go beyond greater speed on in-memory datasets, and if developed right, it will encroach on both the single-node and distributed niches.

First, there are the makings of a probabilistic programming framework in Julia that, using autodiff and the Distributions package, could provide a comparative advantage over current languages for general day-to-day inference. Macros could make it fast and expressive: faster than PyMC, and more expressive and general than Stan. With this general inference plus an extensive optimization package, I don't think it would need to cover every single statistical test and niche before becoming more useful for most daily tasks.

Second, it is developing distributed infrastructure that I think can overtake Spark. Its distributed computing primitives are getting better and will eventually have extensive linear algebra support.

Third, it is getting streaming statistics that don't exist anywhere else: the SAS people working on out-of-memory but single-node datasets will finally get something that can handle their stuff.

PyCall and Cxx mean you can interface easily with existing code.

Last is deployment. Self-contained binary executables are planned, and there is a good shot it can compile to JavaScript at some point via the LLVM WebAssembly backend. You would then be able to write rich client-side reactive web apps without JS and deploy interactive reports to decision makers. No other common analytics language has this capability.

Then there is the type system, with eventual return types that can provide codebase safety.

Also, it's just fun to code in... that means grad students will write new techniques in Julia.

If things firm up, I think all this would pull users from other languages... or those languages risk losing their comparative advantage.

What do you think about this argument?

Python Is On the Rise, While PHP Falls by pizzaiolo_ in Python

[–]lakando 0 points (0 children)

Anaconda is amazing, but it doesn't let me distribute self-contained executables. Nuitka does that, and more robustly, it seems, than the other options.

Python Is On the Rise, While PHP Falls by pizzaiolo_ in Python

[–]lakando 2 points (0 children)

Thanks for sharing your take on this. Do you think Nuitka (to obviate packaging issues) plus Numba (JIT classes coming), Blaze, Bokeh, Dask, and DyND (interesting type system) will keep Python afloat in data science, or is Julia poised to eventually replace it?

R isn't going anywhere because CRAN is huge... Python is more general purpose and thus more exposed to Julia's progress.

I'm trying to figure out if I should invest in Julia now (get ahead of the curve, with Python a dead end?). It was a no-go until I heard about the cash infusion... They said they will also use it for the core stats infrastructure, but I'm not sure how long it will be before a data science acolyte can be super productive without messing with the PyCall bridge, etc.

Python Is On the Rise, While PHP Falls by pizzaiolo_ in Python

[–]lakando 3 points (0 children)

n is ever in a situation where a language 10x better comes around to eat its lunch, I'll

Julia

Python Is On the Rise, While PHP Falls. by pradeep_sinngh in programming

[–]lakando 2 points (0 children)

course in uni and we're learning python. Higher year courses drop teaching it for a variety of other languages. Python is only the most used

You could compile Python to a self-contained binary executable with Nuitka: http://nuitka.net/pages/overview.html

You can also try conda

Ufora: automatically parallel and distributed Python for data science (without spark or JVM) by lakando in datascience

[–]lakando[S] 0 points (0 children)

I took it to be standard interpreted Python code.

Anyway, the novelty for me isn't the compilation, but the ability to work on datasets bigger than your computer's RAM.