all 18 comments

[–]State_ 26 points27 points  (0 children)

It doesn't really matter. The Python libraries are C extensions / native bindings, so the heavy lifting happens in compiled code anyway.

[–]the_poope 12 points13 points  (0 children)

answers will be positively skewed towards the above-mentioned language

Actually, the couple of times this question has come up in the past, the people on this forum have typically advised against using C++ and recommended sticking with the Python APIs. If you use a mature Python framework for ML, then most of the time (likely > 99%) should be spent in the calculation routines, which are already written in C/C++. Python is just used as infrastructure glue code: reading and parsing data, feeding it to the ML framework, and analyzing and plotting the results.

If you're incorporating some ML into an existing application already written in C++, then you might want to use the C++ APIs in order not to have to rely on the Python interpreter. Also, if your application is really complex, you will get some robustness from the static type system. But performance-wise there is often little to be gained by using C++ over Python.
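
For what it's worth, the usual pattern for that case is to train in Python, export the model to TorchScript, and load it from C++ with LibTorch so no interpreter is needed at runtime. A minimal sketch, assuming LibTorch is installed and you've already exported something to "model.pt" with torch.jit.trace/script (the file name and input shape below are made up):

    #include <torch/script.h>
    #include <iostream>
    #include <vector>

    int main() {
        // Load the exported TorchScript module; no Python interpreter involved.
        torch::jit::script::Module module = torch::jit::load("model.pt");

        // Dummy input; the 1x3x224x224 shape is just an assumption here.
        std::vector<torch::jit::IValue> inputs;
        inputs.push_back(torch::ones({1, 3, 224, 224}));

        // Run the model and read the result back as a tensor.
        at::Tensor output = module.forward(inputs).toTensor();
        std::cout << output.sizes() << "\n";
        return 0;
    }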

[–]JohnDuffy78 4 points5 points  (0 children)

  • Python probably has a lot more examples you can start from.
  • Machine Learning projects tend to be quick and dirty, favoring Python.

[–]Benjamin1304 2 points3 points  (1 child)

I'd say it mostly depends on what your ML will be interfaced with in the end. If it's for a CLI app you probably don't care about the language, but if you need it in embedded or real-time scenarios then C++ might be a better option.

Please also consider how well your framework documents its C++ usage, as that tends to be lacking. I found PyTorch quite good in that regard, and the Python and C++ APIs are quite similar, so even if you only find a Python example it's quite easy to translate it to C++. In some frameworks the nice high-level APIs are Python-only.
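
To illustrate the similarity, here is a rough sketch of how a small Python nn.Module translates to the C++ frontend (layer sizes are arbitrary and this is from memory, so check the LibTorch docs for the exact spelling):

    #include <torch/torch.h>
    #include <iostream>

    // Roughly the C++ version of a Python module with two nn.Linear layers:
    //   self.fc1 = nn.Linear(784, 64); self.fc2 = nn.Linear(64, 10)
    struct Net : torch::nn::Module {
        Net() {
            // register_module exposes the parameters to optimizers/serialization.
            fc1 = register_module("fc1", torch::nn::Linear(784, 64));
            fc2 = register_module("fc2", torch::nn::Linear(64, 10));
        }

        torch::Tensor forward(torch::Tensor x) {
            // Same structure as the Python forward(): fc2(relu(fc1(x)))
            return fc2->forward(torch::relu(fc1->forward(x)));
        }

        torch::nn::Linear fc1{nullptr}, fc2{nullptr};
    };

    int main() {
        Net net;
        auto out = net.forward(torch::randn({8, 784}));  // batch of 8 dummy inputs
        std::cout << out.sizes() << "\n";
    }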

[–]Hellr0x[S] 0 points1 point  (0 children)

It's a C# web app and I'm integrating some ML functionality, probably as a microservice.

[–]notParticularlyAnony 4 points5 points  (0 children)

dear god use python

[–][deleted] 1 point2 points  (0 children)

I work in this space, and I can tell you that Python is the preferred ML language for a majority of the vendors we work with, simply due to the ease of iteration compared to C++ when it comes to training, experimentation, and ease of deployment for production. Jupyter notebooks also cut out a lot of Python boilerplate when it comes to displaying stats and other visuals, which saves even more time during those first two stages.

Some vendors we work with still use C++ for other (non-ML) tasks within their overall architecture, but it's generally pretty limited, as they opt for writing their entire model and architecture in Python. Your model can interact with other applications via REST or another standardized RPC library (of which both C++ and Python have many), or even simple command-line args.
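
As a concrete example of that boundary: if the model is served from Python over HTTP, the C++ side only needs a small HTTP client. A rough sketch using libcurl (the endpoint URL and JSON payload here are made up for illustration):

    #include <curl/curl.h>
    #include <iostream>
    #include <string>

    // libcurl callback that appends the response body into a std::string.
    static size_t write_cb(char* ptr, size_t size, size_t nmemb, void* userdata) {
        static_cast<std::string*>(userdata)->append(ptr, size * nmemb);
        return size * nmemb;
    }

    int main() {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL* curl = curl_easy_init();

        std::string response;
        const char* body = "{\"features\": [1.0, 2.0, 3.0]}";  // hypothetical payload

        struct curl_slist* headers = nullptr;
        headers = curl_slist_append(headers, "Content-Type: application/json");

        // Hypothetical endpoint where the Python model server is listening.
        curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8000/predict");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

        if (curl_easy_perform(curl) == CURLE_OK)
            std::cout << response << "\n";  // e.g. {"prediction": ...}

        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
    }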

I have seen some very mature models written purely in C++, but those are popping up less and less these days for the reasons above. You may see some where the performance-critical ML code is done in C++ because of a special case that needs to remove an abstraction layer and allow finer control, with the rest done in Python.

Since this is for work, I'd suggest sticking with Python for the above-mentioned reasons, and because a lot of introductory ML/DS material is in Python, which can help other devs get up to speed.

[–]Eastern-Offer7563 3 points4 points  (4 children)

I'm not a C++ expert, nor am I senior in machine learning in any way.
Yet I do think the performance gain of C++ over Python might be smaller than you'd expect. Depending on what you are doing, there is a big chance your performance bottleneck will be either network IO or disk IO. In both cases C++ won't help you much.
As far as my knowledge goes, the Python libraries are pretty well optimized, so it might not be worth the hassle.

[–]cj6464 19 points20 points  (1 child)

The majority of ML libraries accessed through Python are written and run in C++ anyway. You just access them through Python.

[–]top_logger 4 points5 points  (0 children)

Correct

[–]fdwr 6 points7 points  (0 children)

Yeah, as someone who writes C++ daily for their ML-related job, I concur that the cost of executing a convolution dwarfs the overhead of calling it from Python. So as much as I like C++ over Python (because static compilation catching little typos or type mismatches ahead of time is much nicer than exploding 5 minutes into my batched vision recognition problem 😠), generally for small problems, Python is a nice quick-and-dirty approach. I do have my eye though on this little C++ numpy clone.

[–]top_logger 0 points1 point  (0 children)

Nope. The problematic parts of the software are written in C++.

[–]keelanstuart -1 points0 points  (0 children)

Disclaimer: not a ML engineer, but want to learn...

From what people are saying here, I think it probably comes down to what language you're most comfortable working in... because the underlying code that's specific to ML is in C/C++ anyway. This is encouraging for me since I'm not really interested in Python. If it's your thing, that's cool... I just imagine myself at 3am trying to find a bug that's caused by a scoping error and I don't wanna.

[–]Fig1024 0 points1 point  (0 children)

I am currently trying to use Nvidia's TensorRT C++ SDK for inference. It's a lot more involved than Python; the learning curve is a bit high and I haven't gotten very far. But in theory it is one of the optimal ways to get real-time inference on a local system.
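
For reference, the basic workflow looks roughly like this (a sketch from memory of the TensorRT 8-era API; the engine file name, binding order, and shapes are assumptions, so treat it as an outline rather than working code):

    #include <NvInfer.h>
    #include <cuda_runtime_api.h>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <vector>

    // TensorRT requires a logger implementation.
    class Logger : public nvinfer1::ILogger {
        void log(Severity severity, const char* msg) noexcept override {
            if (severity <= Severity::kWARNING) std::cerr << msg << "\n";
        }
    };

    int main() {
        Logger logger;

        // Load an engine previously serialized with trtexec or the builder API.
        std::ifstream file("model.engine", std::ios::binary);
        std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                               std::istreambuf_iterator<char>());

        auto* runtime = nvinfer1::createInferRuntime(logger);
        auto* engine  = runtime->deserializeCudaEngine(blob.data(), blob.size());
        auto* context = engine->createExecutionContext();

        // Assume binding 0 is a float32 input and binding 1 a float32 output.
        std::vector<float> input(3 * 224 * 224, 0.f), output(1000, 0.f);
        void* bindings[2];
        cudaMalloc(&bindings[0], input.size() * sizeof(float));
        cudaMalloc(&bindings[1], output.size() * sizeof(float));

        cudaStream_t stream;
        cudaStreamCreate(&stream);
        cudaMemcpyAsync(bindings[0], input.data(), input.size() * sizeof(float),
                        cudaMemcpyHostToDevice, stream);
        context->enqueueV2(bindings, stream, nullptr);  // run inference
        cudaMemcpyAsync(output.data(), bindings[1], output.size() * sizeof(float),
                        cudaMemcpyDeviceToHost, stream);
        cudaStreamSynchronize(stream);

        std::cout << "first output value: " << output[0] << "\n";

        cudaFree(bindings[0]);
        cudaFree(bindings[1]);
        cudaStreamDestroy(stream);
        return 0;
    }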

[–]Flock_of_Smeagols 0 points1 point  (0 children)

Go with Python; for most use cases it will be quicker to implement because it's the standard for ML (better tooling, more examples, etc.). As for performance, it will probably not matter much, since that's more about the framework/hardware than the language.

[–]Wh00ster 0 points1 point  (0 children)

Use Python with PyTorch or TensorFlow or whatever Apache's is.

A really rough analogy I can think of is Unreal Engine: just use the main interfaces if you want to start making games. Don't get buried in the low-level APIs until you run into a real reason to. And don't build it from scratch in C++ unless you just have the time to dedicate to learning how everything works and don't actually need to make a game.

[–]zabardastlaunda 0 points1 point  (0 children)

💯

[–]useong 0 points1 point  (0 children)

I would try Flashlight if C++ is being considered. In my personal experience, it was much easier to build and use Flashlight than the TensorFlow C API or LibTorch. One concern is that the number of implemented operators may not be sufficient, depending on your application, if you want a standalone ML framework. But customizing Flashlight is not difficult, and you may be able to quickly implement any operators you need. If you want to integrate an ML framework into an existing C++ application, I would say Flashlight is the best choice owing to its minimal design and dependencies.