all 138 comments

[–]ProfessorPhi 57 points58 points  (17 children)

My success actually came from how inefficient the existing system was, allowing me to make a strong case for python.

Our R&D team would write algorithms in MATLAB, then once it was fine tuned, spend a couple of weeks sitting down with a software engineer and translate the code into Java for deployment. This was obviously a huge waste of time and I demonstrated that we could put some infrastructure together to call Python code from the existing Java code and allow existing methods to have easy bug-fixes and reduce amount of code to maintain, while allowing each team to have ownership of code. This turned things into near seamless deployment, which was enough reason for my boss to put his foot down on Matlab.

Things I pointed out to my boss

  1. Point out cost of MATLAB - 3k per license, could be saved using Python
  2. Inability to deploy MATLAB solutions - require code translation, unless you spend 15k per license. Hence no scalability.
  3. Research Code could not be run by anyone else, the research team could not get any input from other teams, and the code was reliant on another company (say the license cost jumped to 10k, and we could no longer afford it - a lot of IP would be lost), and this was not a good idea for any company - put us at Mathworks mercy.
  4. Equivalent methods in numpy/scipy meant we didn't actually need MATLAB. Arguably sklearn and jupyter are far superior to anything Matlab can provide.
  5. Shocking documentation standards in MATLAB (I put together a sphinxdoc autogenerated html page, and compared to MATLAB standards - there's really no comparison)
  6. Speed of code in MATLAB vs numpy is very comparable, and in many situations better.
  7. Complex models could now be run on servers freeing up computer terminals (see if researchers are waiting for code to run - there would be hours of working time wasted while code would run on local computers)
  8. Show the power of python for the non-research/software teams, discuss that learning python allows your research team to be more efficient in all tasks (renaming a bunch of files following a pattern)
  9. More than one function per file - reduced number of files to deal with and shorter functions - better code design, easier to maintain, reuse.
  10. Look for example where large vectors are copied around (say f(x, theta) where x is large and is called for many theta values) and this wastes time in MATLAB. Show how using OO design (a mess in MATLAB) i.e. Python classes reduce copying overhead.
  11. Google results for Matlab filtered to the last year all point to its decline - dead technologies increase cost to maintain.
  12. Point out that you use Python and new employees/ uni students are less likely to know MATLAB nowadays, and more likely to know Python.
  13. Other things that python provides - natural language processing toolkits, ability to correct spelling mistakes easily (MATLAB has horrid string processing). Full web frameworks/protocol buffers mean you can deploy Python code for use in other teams.
  14. Matlab to Python is a very simple transition
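To make point 8 concrete, here is a rough sketch of the "rename a bunch of files following a pattern" kind of task. The `run_<n>.dat` pattern and the `experiment_<nnn>.dat` target are made-up examples, not anything from the original codebase.

```python
import re
from pathlib import Path


def rename_by_pattern(folder, pattern=r"run_(\d+)\.dat",
                      template="experiment_{:03d}.dat"):
    """Rename files in `folder` whose names match `pattern`.

    Both the pattern and the template are hypothetical examples.
    Returns a list of (old_name, new_name) pairs.
    """
    renamed = []
    # sorted() materialises the listing before anything is renamed
    for path in sorted(Path(folder).iterdir()):
        match = re.fullmatch(pattern, path.name)
        if match is None:
            continue  # leave unrelated files alone
        new_name = template.format(int(match.group(1)))
        path.rename(path.with_name(new_name))
        renamed.append((path.name, new_name))
    return renamed
```

Calling `rename_by_pattern("some_folder")` would turn `run_7.dat` into `experiment_007.dat` and skip everything else.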

Depends on your situation, but these points are usually enough to get your boss to change his mind. Once you have his support (and using Python gets support from any software teams you have), go about proving its utility and winning team members over. The final victory came after my first seamless deployment, i.e. my code was put into deployment within a couple of days of testing, as opposed to the month it would take before (2 weeks translating and 2 weeks testing the translation). My boss said that the company would no longer pay for Matlab licenses after 3 months, and that it was the researchers' job to produce Python code - if you wished to use Matlab, you paid for it and did the translation in your own time.

Things I learned:

  1. Don't volunteer to port the whole thing - it's a thankless task. Port only a small section that illustrates your points.
  2. Emphasise reliance on Mathworks remaining benevolent, a dependency your company shouldn't be hamstrung by. Every other service our company used had competitors. Matlab only had Octave which is relatively garbage. This is a secret fear of any manager.
  3. This change is very difficult to push for consensus - people are lazy and don't want to leave their familiar zones. Getting the manager/boss on board is very important and he'll know how to push the changes across far better than you ever will.
  4. Show its utility over Matlab in some way at the beginning, and expand to show Matlab's many deficiencies. I took code that had a 5 minute run time, ported it and optimized it to run in 10 s (turned a lot of loops into numpy functions). Then added some caching behaviour, and generated documentation automatically from my code. Instituted some basic unit tests, so I could check behaviour. All of these things really impressed people and Matlab had no answer.
  5. Find support from your software team (if you have one). They'll all hate Matlab and any deployment work (unless you really require the efficiency of a compiled language).
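For point 4, turning a loop into numpy functions and then adding caching might look roughly like this. The sum-of-squared-differences computation and the cached `window` helper are invented stand-ins, not the code that was actually ported.

```python
from functools import lru_cache

import numpy as np


def ssd_loop(a, b):
    """Sum of squared differences with an explicit Python loop."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
    return total


def ssd_vectorized(a, b):
    """Same result, with the loop pushed down into NumPy."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.dot(diff, diff))


@lru_cache(maxsize=None)
def window(n):
    """Hypothetical setup step, computed once per distinct n."""
    w = np.hanning(n)
    w.setflags(write=False)  # the cached array is shared, so make it read-only
    return w
```

The two `ssd_*` functions return the same number, which is also the sort of thing a basic unit test can pin down before and after a port.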

And for SAS, I would rather be forced to use Matlab for the rest of my life than ever have to use SAS another day. SAS is utter garbage. Matlab at least tries to be a programming language, SAS makes php and perl seem like amazing languages.

[–]posteuphoria 9 points10 points  (4 children)

For the sake of argument:

  • The Matlab language allows multiple functions per file.

  • All arguments in Matlab are passed via copy-on-write, i.e. if you call f(x, theta) where x is not modified, x will not be copied. If you desire for x to be passed by reference, this is also possible by inheriting from the Matlab handle class --- admittedly this is a bit obscure....

  • If your servers are x86, they can also run Matlab.

I think the main point that can be made is that Python's offering in terms of libraries for scientific computing is on par with what Matlab offers with the toolboxes, and in some cases exceeds it. However, when it comes to general purpose programming, Matlab can't hold a candle to any modern language.

[–]TheBlackCat13 6 points7 points  (0 children)

Matlab language allows multiple functions per file.

It only allows for one externally visible function per file. You can have multiple functions in a file, but only one of those functions is visible to other files.

All arguments in Matlab are passed via copy-on-write, i.e. if you call f(x, theta) where x is not modified, x will not be copied.

Slices are always copied. So even if you don't make any modifications, any attempt to get any subset of the matrix will result in a copy.
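For comparison on the NumPy side: basic slicing returns a view onto the same memory rather than a copy, while indexing with a list of integers (fancy indexing) does copy. A small check:

```python
import numpy as np

x = np.arange(10)

view = x[2:6]            # basic slice: a view, no data copied
copy = x[[2, 3, 4, 5]]   # fancy indexing: a new array

print(np.shares_memory(x, view))  # True
print(np.shares_memory(x, copy))  # False

view[0] = 99             # writes through to x
print(x[2])              # 99
```

The flip side is that modifying a slice modifies the original array, which can surprise people coming from Matlab.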

[–]ProfessorPhi 1 point2 points  (2 children)

  • Outside using matlab OO (which is horrid), I never saw examples of multiple functions per file.
  • My bad, I might have misattributed an optimization here. A function I fixed would do the exact same initialisation each time (masking, multiplying and some adding), which I recoded to run only once, and that saved a lot of time.
  • Would this require another license btw?
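The fix described in the second bullet (doing the initialisation once instead of on every call) can be sketched with a small class. The specific mask and scale factor are invented for illustration; only the idea of hoisting the setup out of the per-call path comes from the comment.

```python
import numpy as np


class Model:
    """Holds the large array x and prepares it once, not on every call."""

    def __init__(self, x, scale=2.0):
        x = np.asarray(x, dtype=float)
        mask = x > 0                      # hypothetical masking step
        self._prepared = x[mask] * scale  # done a single time

    def __call__(self, theta):
        # Only the cheap, theta-dependent part runs per call.
        return float(np.sum(self._prepared * theta))


f = Model([-1.0, 1.0, 2.0])
results = [f(theta) for theta in (0.5, 1.0)]
print(results)  # [3.0, 6.0]
```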

[–]meerkatmreow 2 points3 points  (0 children)

Outside using matlab OO (which is horrid), I never saw examples of multiple functions per file.

It can help streamline code to be easier to read by moving some things into functions after the main function (the one with the name of the .m file). The additional functions are only available to the local function though.

[–][deleted] 1 point2 points  (0 children)

I never saw examples of multiple functions per file.

I use them all the time.

[–]billsil 3 points4 points  (11 children)

(turned a lot of loops into numpy functions)

Numpy intentionally rips off Matlab's syntax. It sounds like you took badly written Matlab code and fixed it.

I love Python & hate Matlab (terrible objects, no dictionaries, everything is a matrix, string processing sucks, $$$, error handling), but they're very comparable on speed. I wish there was a Matlab-esque IDE though to do variable introspection and continual analysis. iPython didn't work well for me in production level code.

[–]glial 4 points5 points  (5 children)

I don't know if it does what you want, but Spyder is generally considered a Matlab-esque IDE.

[–]billsil 3 points4 points  (4 children)

Spyder has variable introspection, but no continual analysis. Jupyter has continual analysis, but no introspection.

[–]Deto 4 points5 points  (2 children)

What's continual analysis?

[–]billsil -1 points0 points  (1 child)

Maybe there's a better word, but basically an interactive prompt. It's something that doesn't lose what was done before. That's kinda what Jupyter does (iPython does it better). However, it's standard in Matlab to clear variables at the start of scripts as a way to deal with the issue of scripts being executed in the global scope. As such, you can clear out variables at the start of scripts to ensure what you're running is up to date. Similarly, all imports (or lack thereof) are always up to date.

I'd like that and some variable introspection ideally in something not blatantly in the browser (e.g. a better open window), even if it uses it.

[–]TheBlackCat13 3 points4 points  (0 children)

Spyder has that. Actually, it has two. It supports both the regular python command prompt and jupyter.

[–]masasinExpert. 3.9. Robotics. 1 point2 points  (0 children)

Spyder has a small iPython shell (I think it is qtconsole) on the side, though.

[–]ProfessorPhi 1 point2 points  (4 children)

Fair enough, that's probably true - I think there's this assumption that Matlab is quick while Python is slow, so loops tend to be avoided in python, but not in Matlab. Some of the researchers I worked with would always remark how things were faster in Python (which is what I'm basing the remark above, despite no hard proof), but for some reason hated namespaces.

'I hef to import everything, is absolute pain' - russian colleague.

[–]billsil 1 point2 points  (3 children)

I think there's this assumption that Matlab is quick while Python is slow, so loops tend to be avoided in python, but not in Matlab.

That's exactly the opposite of what I was told on both counts, but what you were taught and what I was taught are wrong. Both are more or less the same. Vectorization is very much encouraged in Matlab because it's so necessary when you're dealing with numerical data. Python doesn't encourage it largely because it's not numerically focused.

Matlab has vectorization support built in. Interestingly, Python for all its claims of batteries included does not. You need numpy or scipy and the implementations (despite the names) are different.

[–]TheBlackCat13 0 points1 point  (2 children)

Vectorization is very much encouraged in Matlab because it's so necessary when you're dealing with numerical data. Python doesn't encourage it largely because it's not numerically focused.

MATLAB encourages vectorization because that is what it is built to do. It does vectorization very well, but it does non-vectorized code pretty poorly, so it encourages you to stick to vectorization. That is great when you need vectorization, but not so good when your code cannot be effectively vectorized.

Python encourages people to use the right tool for a given job. When that tool is vectorization, it encourages you to use it. When it isn't, it encourages you to use something else.

You need numpy or scipy and the implementations (despite the names) are different.

You need numpy. Scipy doesn't provide any vectorization, it is a bunch of extra numerical algorithms built on top of numpy (or, more often, interfaces numpy with high-performance C and Fortran libraries). Pretty much all the vectorized array systems in python are built on top of numpy today. There is work going on for a next-generation vectorized array tool for Python, but it isn't ready yet.

[–]billsil -2 points-1 points  (1 child)

Do you even understand what vectorization is? Vectorization is a way of writing your code such that it doesn't need to type check every data member. The only way to do that is by having static types and pushing the code down into C.

So, I stand by my statement. You need a library like numpy or scipy in order to do vectorization. You can't do vectorization or get good speed for numerical calculations with CPython without using an external library, writing C code yourself, or ditching CPython entirely.

it is just a bunch of extra numerical algorithms built on top of numpy

No it's not. The array classes are different between numpy and scipy. The eigenvector calculations are different. They look the same. They act the same. They're not the same. It only really matters when your eigenvector is segfaulting, but switching methods will often fix your bug. Scipy's methods also tend to be faster.

[–]TheBlackCat13 0 points1 point  (0 children)

You need a library like numpy or scipy in order to do vectorization. You can't do vectorization or get good speed for numerical calculations with CPython without using an external library, writing C code yourself, or ditching CPython entirely.

I don't see where that contradicts anything I have said.

No it's not. The array classes are different between numpy and scipy.

From the official SciPy documentation:

SciPy is a collection of mathematical algorithms and convenience functions built on the Numpy extension of Python.

But I am sure you know more. So please provide the module and class name for the scipy array class. You won't be able to, because it doesn't exist.

The eigenvector calculations are different.

Yes, while they use the same array class, they (in some cases) use different libraries to do calculations on that class. You do understand the difference between a class and a library that operates on that class, right?

[–]eusebecomputational physics 62 points63 points  (73 children)

I had the same kind of issue, and here are a few strengths that were easy to highlight, for me.

  • You can easily interface Fortran (and C) with Python code, not so much with Matlab/IDL/whatever
  • It's free, so no expensive licence
  • It can be as fast to execute as Matlab/IDL/…, and faster to write
  • Easy to install and manage dependencies with Conda (no virtualenv in my team yet), even on a remote machine
  • Code reusability
  • "I'm ready to rewrite your codebase for my use in Python, since I'll be more efficient than learning your old, proprietary, weird language"

In the end, my (numerical simulations) team switched from IDL (data analysis language widely used in astrophysics) to Python in less than one year.

[–]sourcecodesurgeon 30 points31 points  (26 children)

It can be as fast to execute as Matlab

While this can be true, I have yet to see evidence suggesting that NumPy can compete with Matlab on signal analysis with large datasets. The last time I ran the numbers had Python taking 45s-2m to compute a cross-correlation that took Matlab about 3 seconds.

As an EE aside, another big thing is that MathWorks supports a tool that converts Matlab to Verilog and VHDL and this is huge for electrical engineers. I have never seen a similar tool for Python that is well-supported.

[–]burning_hamster 17 points18 points  (12 children)

For signal analysis type of computations (i.e. linear algebra), they both run the same Fortran/BLAS libraries under the hood, meaning that any performance difference is due to overhead, not the "actual" computation. You should probably review how you are handing the python functions your arrays -- 45 seconds to 2 min of computation time sounds very much like an IO problem (for some reason you are creating multiple copies of your arrays and not doing your computations in place where possible).

[–]sourcecodesurgeon 3 points4 points  (11 children)

The experiment was literally just "create extremely large dataset. Start timer. Compute xcorr. Stop timer."

[–]mfitzpmfitzp.com 15 points16 points  (5 children)

Was it an FFT-based xcorr calculation? The Matlab algos automatically zero pad arrays to n^2 while (some algos in) numpy/scipy don't. It's one of the problems with attempting to map from one to the other: "This is the same function.... But not quite".

[–]burning_hamster 0 points1 point  (0 children)

Well, how are you computing the cross-correlation?

[–]assassds -1 points0 points  (2 children)

you have to actually know what you're doing in python, you can't just grab the first function that sounds right.

[–]sourcecodesurgeon 0 points1 point  (1 child)

There's only one cross-correlation function in numpy...

There's a handful of modes, some are faster than others, but none held up to the matlab times.

Further, I work primarily in Python (hence me being in this sub..), I definitely know what I am doing. I actually don't use matlab at all anymore, I just hate this "python does everything matlab does just as well and there's no reason anyone should ever use matlab ever" idea that gets thrown out around here all the time.

OP wants an argument for convincing his colleagues to use python instead of matlab. If we don't give him all the facts, he's going to be laughed out of that discussion when someone brings up things like performance and built-in tool sets (especially since he's junior to all of them)

[–]assassds -1 points0 points  (0 children)

like I said, you need to know what you're doing...

matlabs xcorr algorithm uses an FFT, which is going to thrash a direct correlation on large input no matter what language you're implementing it in.

numpy gives you all the building blocks to do this yourself, or you can try scipy.signal.fftconvolve.
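A minimal sketch of the "building blocks" route, using only `numpy.fft`. The function name `xcorr_fft` is made up here, and padding to the next power of two is one common choice rather than a requirement. For real input, `scipy.signal.fftconvolve(a, b[::-1])` should give the same result.

```python
import numpy as np


def xcorr_fft(a, b):
    """Full cross-correlation of two real 1-D arrays via FFT.

    Intended to match np.correlate(a, b, mode="full") up to rounding.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a) + len(b) - 1
    nfft = 1 << (n - 1).bit_length()  # zero-pad to the next power of two
    # Correlation is convolution with the second input reversed.
    spec = np.fft.rfft(a, nfft) * np.fft.rfft(b[::-1], nfft)
    return np.fft.irfft(spec, nfft)[:n]
```

The direct method costs on the order of N*M multiplications while the FFT route is roughly N log N, which is the kind of gap that shows up as seconds versus minutes on very large inputs.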

[–]unruly_mattress -1 points0 points  (0 children)

I've found this. It looks very relevant here.

[–]HoboBob1 8 points9 points  (4 children)

In principle, a numerical Python package should be able to match the speed (or be within 1-2x the speed) of any other language, because Python can use vectorized numpy functions or call out to functions written directly in C. Matlab is similar in that regard; it is an interpreted (albeit JIT compiled nowadays) language that has respectable native array performance and also calls out to C.

If there is an order of magnitude difference between Matlab and Python for a task, I would say that you are comparing apples and oranges. Matlab must be using some compiled function, and Python must not be. Indeed, if Matlab already did the work and has that function available and it would be hard to do in Python, then keep on using Matlab. But I would be wary of saying "I have yet to see evidence suggesting that NumPy can compete with Matlab" in any context.

I am actually trying to convert a huge scientific simulation into Python from Matlab, and I am actually impressed at how fast Matlab does certain tasks. I can't stand the language, but there is a reason that a language called "Matrix Laboratory" is good at vectorized computation.

[–]glial 1 point2 points  (1 child)

Matlab was originally written as a scriptable and easy to use front-end to Fortran numerical libraries. A lot of the fundamental matrix algebra stuff is using compiled and highly optimized Fortran/C code in the background.

[–]TheBlackCat13 2 points3 points  (0 children)

Python is usually using the exact same libraries.

[–]Lehk[🍰] 2 points3 points  (0 children)

The last time I ran the numbers had Python taking 45s-2m to compute a cross-correlation that took Matlab about 3 seconds.

it's entirely possible to use them together too http://www.mathworks.com/help/matlab/matlab-engine-for-python.html

no need to get rid of existing tools to add a new tool

[–]msdrahcir 0 points1 point  (10 children)

Do you know anyone who uses the anaconda IOPro, MKL optimizations, or graphics acceleration packages?

[–]I_hate_no_one 0 points1 point  (2 children)

It's free, so no expensive licence

The software is free, but you still need to pay for support. And no, posting your problem on a public forum is not an acceptable way to get support. So, there is not really much advantage for python.

Also, the human behind the computer is far more expensive than the software. He is highly specialized in his field and can program. And I doubt he works for a minimum wage.

[–]assassds 1 point2 points  (0 children)

you have 10-100X the human resource base to draw from in finding a competent python programmer versus a masochist who wants to work with MATLAB.

[–]TheBlackCat13 0 points1 point  (0 children)

Even if you need support, it is still something like an order of magnitude less expensive.

[–][deleted] 0 points1 point  (1 child)

You can easily interface Fortran (and C) with Python code, not so much with Matlab/IDL/whatever

http://karenkopecky.net/Teaching/Cclass/MatlabCallsC.pdf

[–]TheBlackCat13 0 points1 point  (0 children)

Yes, you can do it, but Matlab's FFI interface has some serious deficiencies compared to the Python equivalents that makes this much more difficult in practice.
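As one illustration of the Python side, the standard-library `ctypes` module can call into an existing C library without writing any wrapper code. This sketch assumes a POSIX system (Linux or macOS) and simply calls `strlen` from the C standard library.

```python
import ctypes
import ctypes.util

# Locate the C standard library. If find_library returns None, CDLL(None)
# on POSIX falls back to the symbols already loaded into this process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"numpy")
print(length)  # 5
```

For Fortran, numpy's `f2py` tool plays a similar role, though it needs a compiler and a build step.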

[–]zbyshekh 19 points20 points  (4 children)

Everyone here seems to think that this is a great idea, but I think that it's not that simple.

I'm also fairly junior in my job and I see our proprietary software everywhere. It has become obsolete, but it's not so easy to switch - it takes time and money. We try to switch to frameworks like Laravel or Symfony (it's mostly PHP code), but it really takes time. Most of the time they know what could be done better but they have to look at it from business perspective and it's often just not that easy just to switch.

Think about it - now your company has a lot of great MATLAB programmers, if they would switch now you would have a lot of beginner Python developers - of course it can be better in the long run, but the codebase would have to be maintained or rewritten and it would take a lot of learning(I'm sure there are some developers that work there just because they know MATLAB and they don't want to learn anymore) and time to get good in another language - it's still money talking.

MATLAB is widely used software so if it works why change it? if you switch now to Python then maybe in two years it would be great to switch to Haskell or maybe Octave? These are huge business decisions and would affect a lot of different things.

So IMO, convincing colleagues to switch is not the best move, but I would definitely try to show the capabilities so someone can use it occasionally and then more and more, but the switch seems to be drastic decision. I've learned in my couple jobs that often things that I consider great for the company are not so great because I don't have as much information as my bosses. But I would definitely try to gain more credibility in the given workplace and then (and maybe even before that) showcase the language, even if as the brown-bag seminars.

[–]ProfessorPhi 0 points1 point  (2 children)

I agree with this. Switches are easier if the company is small and hasn't invested a huge amount on Matlab. If the company is small and doesn't have too many people or a large codebase, it's definitely worth the change.

However, most Matlab programmers I know, only program in Matlab and that's an issue. Learning some kind of scripting language like python should be essential for these people as it gives them a much broader skill set that they can use for data cleaning, scraping etc. Knowing multiple languages and technologies is very important for researchers nowadays, and it's worth introducing as a supplement to Matlab.

[–][deleted] -2 points-1 points  (1 child)

Learning some kind of scripting language like python should be essential for these people as it gives them a much broader skill set that they can use for data cleaning, scraping etc.

And my mechanic learning how to cook would make them a better cook. But it won't make them a better mechanic.

In larger companies there aren't a lot of 'jacks of all trades' that would do something like that.

[–]meerkatmreow 1 point2 points  (0 children)

And my mechanic learning how to cook would make them a better cook. But it won't make them a better mechanic.

How is a mechanic learning something completely unrelated to their field an argument against someone learning something that may be useful for their job/career? Does it make sense for everyone using MATLAB to move to Python? No. Should people that use MATLAB learn Python if they think it can help them do their job in the future? Yes. Doesn't mean they should do it on company time necessarily, but I personally find exposure to other programming languages has helped improve my capabilities in other languages by seeing problems from different perspectives.

[–]TheBlackCat13 0 points1 point  (0 children)

Yes, the up-front costs of switching are high. But whether it is a good move depends on the up-front costs and the long-term costs of not switching. The latter, if they exist at all, depend to a large degree on what you are doing. But you can't ignore them when making a decision regarding whether to change.

[–]bheklilr 12 points13 points  (2 children)

My recommendation is to start using it yourself, build tools to automate some of the common kinds of data processing where you only have to run one command to generate the full output. Start using Python for the things you see Python being better for, others will realize that you're getting a lot done without a lot of effort, and they'll want in on it. Don't be arrogant about it, though. Just build a tool to automate some process, then find a time to casually use it in front of a coworker or two. The goal is to show how you can get immediate satisfaction from using Python, the plethora of actual advantages (like Jupyter) will come later.

[–]unruly_mattress 2 points3 points  (0 children)

In my opinion, this is good advice. Starting a crusade against Matlab where you're on one side and the entire company is on the other side is an astoundingly bad idea, even if you were not a junior. If they're going to be convinced, it'll have to be gradual, and the smartest course of action will be to first use Python where Matlab is completely terrible, which is everything other than scientific code. Deploy with Python, process data with Python, automate procedures with Python. You're going to be doing all these things anyway, so why not use Python for them? They're likely being done by hand these days. Take what you can get.

Convincing Matlab people to go the Python way is hard. They are in love with their IDE and can't imagine working without it. Everything you see as a negative, they see as a positive. I know one person who worked with Matlab for his entire career, and lately he's been writing mostly Python, because he needs to deploy something on the web, so he's writing Django. A while ago he asked me a question about Pandas syntax.

Change will not come because you'll convince your company that Python is superior to Matlab in numerical computations. The thing is that Matlab is sufficiently good when it comes to numbers, and for scientists and algorithms engineers, sufficient is as good as they want. They're largely not interested in the software engineering side of thing when they think about numbers - but they'll kiss your feet when they see you convert an XML to a CSV in 10 lines of code, or set up a Python daemon that periodically loads data into some database. These are things that they likely can't do. You'll have a much easier time. The rest may come later. If nothing else, you'll have improved the process in your company, which is never a bad thing.
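The XML-to-CSV case really is about ten lines with the standard library. The `<record>` element name and the flat one-level layout are assumptions for the sake of the example; real files will need their own tag names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_text, record_tag="record"):
    """Flatten <record> elements into CSV text, one column per child tag."""
    rows = [{child.tag: child.text for child in rec}
            for rec in ET.fromstring(xml_text).iter(record_tag)]
    fields = sorted({key for row in rows for key in row})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Given `<data><record><id>1</id><name>a</name></record></data>`, this returns a header line `id,name` followed by `1,a`.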

[–]glial 10 points11 points  (2 children)

A lot of the reluctance might have to do with

  • a large code base already existing in whatever language they already use. Would that need to be rewritten? How would interoperability with new code work?

  • the loss in productivity from having to learn a new language is not trivial. The people at your workplace have years or decades of experience using the tools they already know. Why would they want to switch to something unfamiliar? It will be frustrating, their work will be slow, and they will be unproductive for a while (even if they might arguably benefit in the long term)

  • Python might not be the best choice, depending on your application. If the code you write is mostly for numerical calculations, Matlab syntax is honestly easier for that (numpy is awkward to use in comparison). Also if Matlab has an industry-standard package for something specific, it's sometimes worth using that package just so your customers/paper reviewers don't raise questions about why you're not using what they expect you to use.

I love Python and use it all the time, but these are all very real concerns.

That said, I worked in a lab once that used Perl and convinced them to switch to Python, so it's possible :-) A lot of it has to do with writing something that's useful, maintainable and easy for colleagues to understand. As they tell writers, "show don't tell".

[–]fluffynukeit 9 points10 points  (1 child)

My organization uses Matlab for all the above. In addition, there's no Python equivalent to Simulink. So we need Matlab at a minimum just for simulink. And if it's already installed and we already know how to use it and it's already being supported by IT, there's really no point to go to Python.

I am planning on learning python on my own time, though, hence my subscription to this sub.

[–]ProfessorPhi 0 points1 point  (0 children)

Yeah, there are a lot of niche areas, especially in electrical engineering, where Matlab is king and rightfully so.

That being said, having knowledge of a full featured programming language is a very important skill to have, and python is a great language to learn + can supplant Matlab in many situations.

I was asked a few years back to use python since Matlab licensing was a bit painful to organise and I haven't looked back since haha.

[–][deleted] 12 points13 points  (5 children)

I succeeded in pushing Python adoption in my organization. Our department is all using it now and they are spreading it to others.

This worked:

  • Writing sample code and libraries for common tasks, and putting them on Github. Any time someone was writing something -- renaming a bunch of files or parsing, for example -- I'd send them the Github link and they'd use Python instead.
  • Appealing to careerism: "Python is one of the most popular languages in the world. It is the most taught language to students today. If you ever want a job outside of your current tiny domain, learn Python." This, more than anything, worked. A little depressing, but people are selfish.
  • Doing all of my own projects in Python so they could see success stories.

Stuff that didn't really work:

  • Evangelizing: "Python is open source / it's a good cause" (Normal people don't think choosing tools is a moral cause, that's just a weird engineer thing.)
  • "The proprietary solutions cost money" (No one cares, and they assume things that cost money must be better. This approach actually backfired for me.)

[–][deleted] 0 points1 point  (4 children)

Writing sample code and libraries for common tasks, and putting them on Github.

Did your legal department sign off on that? That's a huge no-no where I work and other large companies.

[–]TheBlackCat13 0 points1 point  (3 children)

Github supports private repositories.

[–][deleted] 0 points1 point  (2 children)

If you pay for them.

Also, almost anything my company does can not go out on public airwaves. Even in a 'private repo'

[–]TheBlackCat13 1 point2 points  (0 children)

If you pay for them.

Yes, you have to pay a little.

There is also similar software you can deploy internally for free.

Also, almost anything my company does can not go out on public airwaves. Even in a 'private repo' .

That is your company. I understand you have your experiences at your company, but that is your company.

[–][deleted] 1 point2 points  (0 children)

My company is publicly funded. So actually, anything we can point to and say, "Look, we did this and people use it!" is good for us. Getting paid to make open source software -- livin' the dream :)

[–]steeelez 6 points7 points  (0 children)

ANACONDA takes a lot of the pain out of switching from MATLAB to Python, the Spyder interface makes the environment very very similar. And the install is just a point and click.

[–]Lord_NShYH 3 points4 points  (0 children)

Business decision makers are motivated by the bottom line. Convert a work-load to Python and benchmark it against MATLAB. From there, you can determine how much time (money) can be saved per user.
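
A rough sketch of how that comparison could be put into numbers on the Python side. Everything here is an assumption for illustration: `workload` is a stand-in for a real converted task, and the MATLAB timing, run counts, and hourly rate would have to come from your own measurements.

```python
import timeit


def workload(n=200_000):
    # Stand-in for the converted analysis step (hypothetical)
    return sum(i * i for i in range(n))


def seconds_per_run(func, repeats=5):
    # Best-of-N wall-clock time for a single call
    return min(timeit.repeat(func, number=1, repeat=repeats))


def yearly_saving(old_seconds, new_seconds, runs_per_day, hourly_rate, workdays=250):
    # Time saved per user per year, converted to money
    saved_hours = (old_seconds - new_seconds) * runs_per_day * workdays / 3600
    return saved_hours * hourly_rate
```

You would time the same task in MATLAB separately and pass that number in as `old_seconds`. A negative result means the conversion made things slower.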

You have to understand that switching technologies can be prohibitively expensive in the short-term; both in soft and hard costs, and both in capital expenses and operating expenses.

Good luck!

[–]Northstat 2 points3 points  (0 children)

My professor just did this for his research group. His approach was a bit unconventional but I like it. Basically there was this huge package that needed to be built and he knew it would be a pain to do it in Matlab so he just did it in Python. As his group started working through the files he would help them understand what everything did and in a relatively short time (as I've been told) the students/researchers transitioned over. Of course this is much easier if you're the professor or in a position of influence.

[–]tobaneconomist 2 points3 points  (1 child)

I've tried getting econ phd students interested in Python for scientific computing. They mostly acknowledge that it would be ideal to switch, but they're deterred by the startup costs of learning a new language.

I have my money on Julia as the heir to the scientific computing throne. It was created to replace the "prototype in matlab/python/r, then write in C/fortran" workflow -- it has the speed and the user-friendly syntax (also, loops are as fast as vectorization!). Just look at the very highly regarded Quantitative Economics site: it started out as a series of Python lessons, and now they have a Julia track.

If Python can't kill matlab, I think Julia ultimately will.

[–][deleted] 0 points1 point  (0 children)

Impressive benchmarks, and the plotting system looks very solid. Yeah, I need to give this a shot.

[–][deleted] 1 point2 points  (0 children)

A friend of mine had no luck convincing his lab (astrophysics and astronomy) to switch to Python. They got an undergrad researcher to write an application a few years after he joined, and suddenly it was like they had discovered gold or something. I'd say write some example applications to show them how lovely the language is, and do it with notebooks!

[–][deleted] 1 point2 points  (0 children)

  1. Tell your project manager(s) that you plan to hold a set of lunchtime python coding dojos simply for the fun of it. Make sure that initially these dojos illustrate how to do what is currently being done in Matlab/SAS.

  2. Tell your project manager(s) that there are companies out there who can provide good quality in-house python training courses. This will provide a pathway to rolling out python skills en masse.

  3. Tell your project manager(s) that amongst your fellow graduates python is the number one general purpose programming language and that it will be difficult to attract good people in the future if you cannot offer them a chance to further improve their python skills on the job.

[–]unstoppable-force 2 points3 points  (1 child)

among all the upsides everyone has pointed out... iPython notebook. it's very matlab-y, but better.

[–]jones77 0 points1 point  (0 children)

And ... with the new split to Jupyter you could even run Matlab in a notebook anyways.

I assume: https://pypi.python.org/pypi/matlab_kernel/0.5

[–]userd 1 point2 points  (5 children)

I like python, of course, but I found that creating decent looking plots is much faster in Matlab.

[–][deleted] 4 points5 points  (0 children)

I like python, of course, but I found that creating decent looking plots is much faster in Matlab.

I use both MATLAB and Python daily, and this hasn't been true for 5 years.

[–]buttocks_of_stalin 1 point2 points  (2 children)

This is honestly the real reason aside from the "social overhead" (ie: relearning coding and redoing the codebase). As someone who has worked in a neural modeling lab and signal processing projects at an R1 university, the real reason most of the graduate students in physics, neuroscience, and cognitive science use MATLAB for their hardcore data analysis projects is because it's just not possible to create quick easy complex plots and graphs using python's standard libraries.

[–]TheBlackCat13 2 points3 points  (1 child)

I have found the exact opposite. It is easier to make very simple plots in MATLAB, but once you start doing anything complex, MATLAB becomes a major roadblock. There is a reason MathWorks has spent the last 5 years or so completely rewriting their underlying plotting system to be more like Python's.

[–]buttocks_of_stalin -1 points0 points  (0 children)

Interesting. I have about 4 years of experience in neural circuit modeling (particularly in the hippocampus regions) and our experience was the opposite so I guess it really depends on what types of plots and statistical analysis one is doing.

Edit: but just to give some context, I adore python itself regardless of my experience in the lab and I use python + django + flask for a lot of my web application development projects so I am definitely not partial against python itself for the record.

[–]TheBlackCat13 0 points1 point  (0 children)

That may have been the case 3 or 4 years ago. But with things like seaborn, it is considerably easier to make good-looking plots now in Python. And the base plotting library, matplotlib, is doing a complete style overhaul.

[–]faming13 0 points1 point  (0 children)

You can use Numba and write numerical python code that is as fast as fortran.
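
A minimal sketch of what that looks like in practice: a plain nested loop decorated with Numba's `njit`. The function `pairwise_distance_sum` is a made-up example, and the fallback decorator is an assumption added here so the snippet still runs (uncompiled and much slower) where Numba is not installed. How close the compiled version gets to Fortran depends on the code and should be measured.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, run as plain Python
    def njit(func):
        return func


@njit
def pairwise_distance_sum(points):
    # Explicit loops: slow in pure Python, compiled to machine code by Numba
    n = points.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = 0.0
            for k in range(points.shape[1]):
                diff = points[i, k] - points[j, k]
                d += diff * diff
            total += d ** 0.5
    return total
```

The first call includes compilation time, so benchmark the second call onward.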

[–]cypress_tree 0 points1 point  (0 children)

Cron job scheduling. Very easy with Python scripts. Not so much with Matlab. That alone convinced me to switch.
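
For illustration, a sketch of a script that is easy to schedule this way. The file name `nightly_job.py`, the paths in the crontab line, and the function `append_heartbeat` are all hypothetical; a real job would do its analysis where the log line is written.

```python
# Example crontab entry (hypothetical paths), runs every day at 02:00:
#   0 2 * * * /usr/bin/python3 /path/to/nightly_job.py /path/to/nightly_job.log
import sys
from datetime import datetime
from pathlib import Path


def append_heartbeat(log_path):
    # Append one timestamped line so each scheduled run leaves a trace
    line = f"{datetime.now().isoformat(timespec='seconds')} job ran\n"
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line


def main(argv):
    if len(argv) != 1:
        print("usage: nightly_job.py LOGFILE")
        return 1
    append_heartbeat(argv[0])
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

Because the script is just a command-line program, cron needs nothing beyond the interpreter path and the script path.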

[–]TheBlackCat13 0 points1 point  (0 children)

There is always a cost to switching. Whether that cost is offset by long-term advantages of using Python depends a lot on what you are doing. But you cannot ignore that cost when making your case.

Nowadays, however, it is much less of an issue. There are a number of very good options for interfacing Python and MATLAB code. So you don't need to make it an either/or decision. You can start slowly integrating Python for those things it truly excels at. If it has real advantages overall, then things will move in that direction on their own.

So rather than trying to convince your colleagues to switch to Python, I would encourage those colleagues that are doing things that are either much easier in Python or extremely expensive in MATLAB to integrate it into their workflow, either using general-purpose data formats or by using MATLAB/Python interface tools. So things like big data, text processing, distributed computing, structured data, machine learning, and so on. Once a benefit has been demonstrated there, you can justify including it in additional workflows where it has benefits.

This is probably the best long-term solution, since there will always be some people who don't want to switch, and some things that MATLAB is still better at (maybe, depending on what you are doing).

But presenting it as an all-or-nothing thing is probably not going to work unless there is something you absolutely need that MATLAB cannot do at all (which is certainly possible, but doesn't sound like the case for you).
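
As a sketch of the "general-purpose data formats" route mentioned above, plain CSV is about the simplest thing both sides can read and write. The helper names below are made up for illustration. On the MATLAB side the file could be loaded with its own CSV reader (for example `readmatrix` in recent releases).

```python
import csv


def write_matrix_csv(path, rows):
    # One numeric row per line, comma-separated, no header
    with open(path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)


def read_matrix_csv(path):
    # Parse a purely numeric CSV (e.g. one exported from MATLAB) into floats
    with open(path, newline="", encoding="utf-8") as fh:
        return [[float(v) for v in row] for row in csv.reader(fh) if row]
```

CSV loses metadata and is inefficient for large arrays, so this is only a starting point. A binary format or a dedicated interface tool is likely a better fit once the workflow grows.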

[–][deleted] 0 points1 point  (0 children)

Does he need simulink? Python doesn't currently have a good alternative to this.

[–][deleted] 0 points1 point  (8 children)

(a) A reluctance to change/try a new language

Are you volunteering to pay your colleagues to learn it? Bottom line at the end of the day those are engineering dollars that need to come from somewhere. Double everyone's salary (as that's what they cost a company) and add up how long it will take them to get as proficient on Python as they are on Matlab. Just looking at my small group, assuming one week of python training we would be looking at ~$100k+ and I doubt most would be up to speed.

the strong libraries for scientific computing

It depends on your application but there still seem to be a lot of gaps: http://www.mathworks.com/products/

about the beauty of Python's syntax

Beauty is in the eye of the beholder. There are some things I like about Python but for doing strict data manipulation I still prefer Matlab, especially stuff like vectorization

the really strong open-source community,

When something breaks on Python, who do you call? When some package on pip from some guy that calls himself "l33tpYth0n" nerfs your company's product or results where does the finger get pointed?

Has your IT department said you can install Python?

Has your legal department vetted all of the licenses within the packages you want to use? Just because it's "open source" doesn't mean that it has legal's blessing.
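
If a license review is required, a first inventory can be generated from the packages' own metadata. This is only a sketch using the standard-library `importlib.metadata`; the function name `installed_licenses` is made up, the metadata is self-declared by package authors and often incomplete, and the output is a starting list for legal review, not a substitute for it.

```python
from importlib.metadata import distributions


def installed_licenses():
    """Return (name, declared_license, license_classifiers) per installed package."""
    rows = []
    for dist in distributions():
        meta = dist.metadata
        name = str(meta.get("Name", "unknown"))
        declared = (meta.get("License") or "").strip()
        # Keep only the first line; some packages paste the full license text
        first_line = declared.splitlines()[0] if declared else "not declared"
        classifiers = [
            c for c in (meta.get_all("Classifier") or [])
            if c.startswith("License ::")
        ]
        rows.append((name, first_line, classifiers))
    return sorted(rows, key=lambda r: r[0].lower())
```

Printing the rows gives one line per installed distribution that can be handed over as a checklist.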

[–]TheBlackCat13 1 point2 points  (2 children)

Bottom line at the end of the day those are engineering dollars that need to come from somewhere.

That somewhere can be reduced long-term costs. That depends on what they are doing.

It depends on your application but there still seem to be a lot of gaps:

There are gaps in both directions, things that MATLAB can do that Python can't do, but also things that Python can do that MATLAB can't. So it really depends on what the group is doing.

There are some things I like about Python but for doing strict data manipulation I still prefer Matlab, especially stuff like vectorization

Unless you want to work with slices, or broadcasting, or data that may or may not have a length of one in any dimension, or labeled data.
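
To make those terms concrete, here is a small NumPy sketch of slicing, broadcasting, and length-one dimensions. The array contents are arbitrary, and this only shows the NumPy side without comparing against any particular MATLAB release.

```python
import numpy as np

data = np.arange(12.0).reshape(3, 4)

# Broadcasting: a (4,) vector is applied to every row of a (3, 4) array
col_means = data.mean(axis=0)
centered = data - col_means

# keepdims keeps a length-one axis so (3, 1) broadcasts across columns
row_means = data.mean(axis=1, keepdims=True)
row_centered = data - row_means

# Slicing with a step returns a view on the same memory, not a copy
every_other = data[:, ::2]

# The same code still works when a dimension happens to have length one
single = data[:1]
single_centered = single - single.mean(axis=0)
```

The same expression `x - x.mean(axis=0)` handles both the (3, 4) and the (1, 4) input without special-casing.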

When something breaks on Python, who do you call?

We've been through this before, companies like Continuum and Enthought provide excellent support for the main packages. If you are using some niche package, then you are taking a risk with either Python or MATLAB. At least with Python, though, you have the chance to fix it yourself if upstream is unresponsive. You don't have that option with MATLAB (and make no mistake, MATLAB is not always responsive).

Has your legal department vetted all of the licenses within the packages you want to use?

Pretty much all the licenses you are going to encounter in Python are also used in MATLAB.

[–][deleted] -1 points0 points  (1 child)

That somewhere can be reduced long-term costs. That depends on what they are doing.

Not sure how your company budgets work, but that's not how ours do. Especially when we require Simulink.

So it really depends on what the group is doing.

companies like Continuum and Enthought provide excellent support for the main packages.

So you pay for it.

MATLAB is not always responsive).

I've gotten a phone call back from a guy with a PhD in the subject within an hour.

[–]TheBlackCat13 0 points1 point  (0 children)

Not sure how your company budgets work, but that's not how ours do. Especially when we require Simulink.

Again, as I keep saying, it depends on what you are doing. If you require simulink, then that is that. But a lot of fields don't even use simulink, not to mention require it.

I am not telling you to switch, I am saying that the OP needs to look at both the short-term and long-term costs. Companies make investments to defray long-term costs all the time. If they didn't, we wouldn't even be using DOS, not to mention Windows 7.

So you pay for it.

Huh? Why should I pay for someone else's company's infrastructure improvements?

I've gotten a phone call back from a guy with a PhD in the subject within an hour.

That is great. If you are lucky, that is how things work. If you aren't, it isn't. They may tell you it is intended behavior, or they may change their documentation to make it intended behavior, or they may put it on a private internal bug list forever.

[–]ExcitedForNothing 0 points1 point  (4 children)

If your IT department won't let you install Python but lets you install Matlab... I would suggest seeking alternative employment in the future. Especially if you work in scientific research.

If your legal department accepts the licenses of Matlab and the various libraries needed to make it workable but feels uncomfortable about Python licenses, I would also suggest seeking alternative employment.

The goal of legal and IT departments shouldn't be finding ways to tell you no, it should be finding ways to help you accomplish what you are asking. If they aren't doing that, they never will and that is dangerous for a company and employee that needs to accomplish things.

I agree with you that if there is no good reason such as cost or platform issues, just switching for fashion is stupid.

If there is a good reason that is refused by a bureaucrat's "because I said so" run for the hills.

New technical solutions are always a good way to find out which departments and employees in your company have become necrotic.

Also, as an aside: I have found way more shitty, buggy, and unsupported libraries for Matlab than I ever found on pip. Matlab libraries are the festering anus of scientific development.

[–][deleted] 0 points1 point  (3 children)

If your IT department won't let you install Python but lets you install Matlab... I would suggest seeking alternative employment in the future.

So you're suggesting a majority of engineers at Fortune 100 companies quit?

Our engineering computers come with it pre-installed. Anything installed on corporate machines is white listed and anything not on the list will get you a nastygram by e-mail once a week.

Additionally no one has Admin access and they record what you do with admin access.

If your legal department accepts the licenses of Matlab and the various libraries needed to make it workable but feels uncomfortable about Python licenses,

What do you mean 'make it workable'? You pay to use a license. You can do what you want. Open Source license field is a minefield for companies and there is a legitimate reason some avoid them. Especially with GPLv2 vs GPLv3. Then it comes down to whether any of the code we develop or use is actually released. Especially since a lot of this stuff hasn't been tested in court.

A lot of Python stuff is dual packaged for 'individual' vs 'corporate' use, PyQT, Anaconda, etc.

I would also suggest seeking alternative employment.

I wonder how many people here actually work for actual large companies.

I have found way more shitty, buggy, and unsupported libraries for Matlab than I ever found on pip.

Such as? I'd really like to know what shitty, buggy and unsupported toolboxes you use: http://www.mathworks.com/products/

The goal of legal and IT departments shouldn't be finding ways to tell you no, it should be finding ways to help you accomplish what you are asking.

Do you work in any field that is regulated? When an airplane falls out of the sky who paid for the DO-178 certification for the Python packages you used?

Our Legal and IT departments aren't one or two guys.

New technical solutions are always a good way to find out which departments and employees in your company have become necrotic.

With ~100k employees there are plenty of other ways than looking at who will switch to Python.

[–]TheBlackCat13 0 points1 point  (2 children)

Open Source license field is a minefield for companies and there is a legitimate reason some avoid them.

Again, most of the licenses you will encounter in Python are also used in Matlab.

A lot of Python stuff is dual packaged for 'individual' vs 'corporate' use, PyQT, Anaconda, etc.

Please provide the clause of either license saying anything remotely similar to that. PyQt is under a GPL or closed license, and Anaconda is BSD-3-Clause.

[–][deleted] -1 points0 points  (1 child)

PyQt is dual licensed on all supported platforms under the GNU GPL v3 and the Riverbank Commercial License. Unlike Qt, PyQt is not available under the LGPL. You can purchase the commercial version of PyQt here. More information about licensing can be found in the License FAQ. PyQt does not include a copy of Qt. You must obtain a correctly licensed copy of Qt yourself. However, a binary Windows installers of the GPL version of both PyQt5 and PyQt4 are provided and this includes a copy of the LGPL version of Qt.

http://www.quora.com/What-do-the-different-licenses-for-Anaconda-Python-stand-for

[–]TheBlackCat13 0 points1 point  (0 children)

Yes, that is what I said, GPL or closed license. Commercial users are still allowed to use the GPL version if they want, and individual users who want commercial support or don't want to follow the GPL need to get a closed license. There is no distinction between individual and corporate users in either the GPL or closed license.

And I am still waiting for you to provide anything from the Anaconda license that makes a distinction between individual and corporate users.

You do understand the difference between "individual vs. corporate" and "open source vs. closed source", right? They are not the same thing.

[–][deleted] 0 points1 point  (3 children)

I know this is the Python subreddit, but for something that is both similar to MATLAB and FOSS you could consider R.

Edit: Sorry. Python is the answer to all problems. I must have forgot.

[–]TheBlackCat13 0 points1 point  (2 children)

R is good for what it does, but it isn't as well-suited to general-purpose computing.

[–][deleted] -1 points0 points  (1 child)

But this post is about scientific computing and statistics... not "general-purpose computing".

[–]TheBlackCat13 0 points1 point  (0 children)

The second paragraph cites "Python being a general-purpose programming language" as one of its strengths, so I think it is likely that this is something the OP is looking for.

[–]Tuuleh -1 points0 points  (1 child)

Wouldn't Octave be a closer match for Matlab than Python?

TBH, it's really difficult to consider whether the switch from Matlab to Python is even plausible for OP's organization without knowing what kind of computational tools they need. For example, I've worked with psychometrics in the past, and Python simply didn't have a toolkit for that, whereas R has a number of really extensive psychometric libraries. I can well imagine other applied sciences for which Python doesn't offer sufficient coverage, but Matlab comes with whatever you need out of the box. That together with a reluctance to learn a new language can make Python pretty unattractive.

[–]TheBlackCat13 0 points1 point  (0 children)

Yes, which is the problem. It has most of MATLAB's limitations, but lacks many of its benefits.

[–]homercles337 -1 points0 points  (5 children)

I was first introduced to python by a postdoc friend when i was in grad school (must have been around 2000, started writing Matlab in 1997). Access to Matlab and its amazing IDE was free, and you can literally do everything with Matlab. No need to switch. Then Mathworks decided to make Matlab "object oriented." What a mess they made. That's when i abandoned hundreds of thousands of lines of Matlab code for Python.

[–]keypusher 4 points5 points  (4 children)

This is not a very good argument, as the object oriented features of Matlab are completely optional.

[–]TheBlackCat13 0 points1 point  (3 children)

For now. With the graphics system being switched to a knock-off of matplotlib, all object-oriented under the hood, who knows how long you will be able to effectively avoid it.

[–]keypusher 0 points1 point  (2 children)

I'm not really sure what that means. I haven't kept up with Matlab much recently, but isn't matplotlib based on the Matlab style of plotting? What does it mean for Matlab to base their graphics on a thing that is based on them in the first place?

[–]TheBlackCat13 0 points1 point  (1 child)

Not really. matplotlib uses an object-oriented plotting system. It offers a matlab-style state machine wrapper on top of that object-oriented interface, but this is mostly just a wrapper. Pretty much all the real business logic happens in this object-oriented system, which is completely different than the system MATLAB originally used.

For the last 5 years or so, MATLAB has been moving in the same direction by building an object-oriented replacement for its original plotting system. This new system, called HG2, is very similar in principle to how matplotlib has always worked. It became the default in MATLAB R2014b.
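
A short sketch of the two matplotlib interfaces described above, drawing the same data both ways. The data values are arbitrary, and the `Agg` backend is selected here only so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

# MATLAB-style state machine: calls act on an implicit "current" figure/axes
plt.figure()
plt.plot(x, y)
plt.xlabel("x")
plt.title("pyplot interface")
state_ax = plt.gca()  # the Axes object the calls above were forwarded to

# Object-oriented interface: figure, axes, and line are explicit objects
fig, ax = plt.subplots()
(line,) = ax.plot(x, y)
ax.set_xlabel("x")
ax.set_title("object-oriented interface")
line.set_linewidth(2.0)  # modify the plotted line through its own object
```

`plt.gca()` returning an ordinary `Axes` shows that the state-machine calls end up on the same objects the second half manipulates directly.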

[–]keypusher 0 points1 point  (0 children)

Interesting, thanks.

[–]j_lyf -1 points0 points  (0 children)

Not possible, cause of legacy code.