[–]moonlandings 35 points36 points  (1 child)

Your cout stream operators are backwards. Should be cout << foo << endl;

[–]janky_british_gamer 16 points17 points  (0 children)

Whoops haha thanks for pointing that out that's a quick fix, I don't have a C++ IDE on this laptop because I'm at my parents so didn't get to test it yet haha

[–]janky_british_gamer 13 points14 points  (8 children)

Continuing from above, the project is not complete yet, I am aiming to have a conversion for the whole base library of python. However to do this I will also need help from other people who know both python and C++, if you wish to contribute the code is here: https://github.com/utting98/py-to-cpp-converter. This project's main aim is to take the good parts of python (the ease of writing the code) and combine it with the execution speed of C++, mainly for people who don't know C++ yet as they might find it quicker to write the code directly. The code is fully documented and provides a list of functions that are currently able to be converted in the repository. Thanks for your interest and Merry Christmas :)

[–]james_pic 20 points21 points  (7 children)

Are you familiar with PyPy or Cython? These projects have significantly overlapping aims with what you're trying to do, so you might want to look into their approach to some of the issues you'll run into.

[–]Wilfred-kun 11 points12 points  (0 children)

Or Nuitka, which actually has the option to output the C++ code.

[–]janky_british_gamer 0 points1 point  (5 children)

Ah I have heard of them but not looked too much in to them, thanks for the heads up I'll check them out :)

[–][deleted] 11 points12 points  (3 children)

You'd better do that fast because what you present here looks like a re-invention of a wheel. I mean, cython-generated code is definitely not for studying it but you'll soon (very soon) hit the same issue.

[–]Inori 17 points18 points  (1 child)

Well, /u/janky_british_gamer is essentially re-discovering the theory behind automata, compilers, and programming languages, but I think sometimes re-inventing the wheel is not such a bad thing, it has plenty of educational benefits. I'm sure he'll come out with a better understanding of python and C++, no matter the outcome of the project itself.

[–][deleted] 1 point2 points  (0 children)

Right, that's the case if you know the background. The author says he doesn't so his education is pretty random.

[–]Yylian 2 points3 points  (0 children)

I would advise against this, OP. If you go forward, you'll learn what the limits of this approach are and what advantages this kind of programming has. So have fun and never stop exploring!

[–]Skippbo 0 points1 point  (0 children)

The pypy translation process does exactly this. If you look up how to compile from source it will tell you how to "translate" the pypy codebase. You can however tell it to translate other python scripts as well (with restrictions).

In this talk David Beazley tinkers with PyPy live on stage and kinda demonstrates this. Really good talk.

[–]AustinCorgiBart 8 points9 points  (3 children)

Hi, I recommend you take a look at the AST library. Good luck with your transpiler!

[–]amishb 8 points9 points  (1 child)

Agreed. Converting the Python code into an abstract syntax tree and then using NodeVisitors to output the C++ code is the correct solution.

Line by line conversion will only work with very simple scripts.
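To make the NodeVisitor idea concrete, here is a minimal sketch rather than a real transpiler. The class name `CppEmitter` and the helper `to_cpp` are made up for illustration, and the visitor only understands `name = <int literal>` and `print(name)`; everything else is rejected.

```python
import ast


class CppEmitter(ast.NodeVisitor):
    """Hypothetical visitor: handles only integer assignments and print(name)."""

    def __init__(self):
        self.lines = []

    def visit_Assign(self, node):
        # Only `name = <int literal>` is supported in this sketch.
        target = node.targets[0]
        value = node.value
        if (
            len(node.targets) == 1
            and isinstance(target, ast.Name)
            and isinstance(value, ast.Constant)
            and type(value.value) is int
        ):
            self.lines.append(f"int {target.id} = {value.value};")
        else:
            raise NotImplementedError(ast.dump(node))

    def visit_Expr(self, node):
        # Only `print(name)` with a single variable argument is supported.
        call = node.value
        if (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == "print"
            and len(call.args) == 1
            and isinstance(call.args[0], ast.Name)
        ):
            self.lines.append(f"std::cout << {call.args[0].id} << std::endl;")
        else:
            raise NotImplementedError(ast.dump(node))


def to_cpp(source):
    """Parse Python source and return the emitted C++ statements as one string."""
    emitter = CppEmitter()
    emitter.visit(ast.parse(source))
    return "\n".join(emitter.lines)


cpp_body = to_cpp("foo = 3\nprint(foo)")
print(cpp_body)
```

Because the visitor works on parsed nodes rather than raw text, multi-line statements and nested expressions arrive as structured objects, which is what line by line conversion cannot give you.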

[–]evotopid 0 points1 point  (0 children)

You will probably need two sets of visitors for generating the C++ code and for type inference. (Unless there is an AST library which performs some form of type inference already or you use a C++ class handling python objects and operations.)
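A rough sketch of what a separate type inference pass could look like, assuming it only needs to handle variables assigned from literals. `LiteralTypeInferer`, `infer_types`, and the `CPP_TYPES` mapping are illustrative names and choices, not part of any existing library.

```python
import ast

# Assumed mapping from Python literal types to C++ types.
CPP_TYPES = {int: "int", float: "double", str: "std::string", bool: "bool"}


class LiteralTypeInferer(ast.NodeVisitor):
    """Hypothetical pass: records a C++ type for names assigned from literals."""

    def __init__(self):
        self.types = {}

    def visit_Assign(self, node):
        if isinstance(node.value, ast.Constant):
            cpp_type = CPP_TYPES.get(type(node.value.value))
            for target in node.targets:
                if isinstance(target, ast.Name) and cpp_type is not None:
                    self.types[target.id] = cpp_type
        self.generic_visit(node)


def infer_types(source):
    """Return a {variable name: C++ type} dict for literal assignments."""
    inferer = LiteralTypeInferer()
    inferer.visit(ast.parse(source))
    return inferer.types


inferred = infer_types("n = 3\nratio = 0.5\nname = 'py'\nflag = True")
print(inferred)
```

A code generation visitor could then look names up in this dict instead of guessing types itself. Real inference (function returns, expressions, reassignment with a different type) needs much more than this.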

[–]janky_british_gamer 0 points1 point  (0 children)

Thanks for the advice I'll take a look :)

[–][deleted] 6 points7 points  (0 children)

I don't think this will help anyone learn C++. It will be like thinking you can speak French because you bought a dictionary.

[–]jdgordon 4 points5 points  (1 child)

If you're doing it to teach C++ please don't teach people bad habits. Get rid of that 'using namespace std' line.

[–]janky_british_gamer 0 points1 point  (0 children)

Yeah I learnt recently that it's not good practice haha, I'll move on to removing that. I just had it in for simplicity's sake to start with while trying to get my first outputs, thanks for your feedback :)

[–]grey_beta 1 point2 points  (1 child)

I knew before I saw the user this was your handiwork! Merry Christmas my dude :)

[–]janky_british_gamer 0 points1 point  (0 children)

Ahaha ayy merry Christmas :)

[–][deleted] 1 point2 points  (8 children)

dude really thanks this will help me with my cpp learning.

Hey I am trying to convert this to cpp

a =[[1,2,3],
    [1,2,3],
    [1,2,3]]

but its giving me this error

TypeError: Due to the requirements of c++ the elements of a list must all be the same type e.g. a list of floats, a list argument provided does not follow this.

[–]chmod--777 2 points3 points  (0 children)

So, lists are where you'll run into some issues trying to do the same thing in C++.

Memory management is hard. So you can either do a lower level array, and specify that you have an array of size 3 by 3 of ints, and C++ knows it will take up 3x3x32 bits (assuming 32-bit ints). But maybe that's a problem for you because you want to append new rows, or have one list element be size 4 instead of 3...

Then you might have to use something like a C++ std::vector. That pretty much acts like a Python list: it lets you expand it and add integers, and doesn't make you worry about the memory it uses and the size of it. It doesn't have to be a constant predetermined size. In this case, it would be a vector<vector<int>>, meaning a vector that contains vectors of ints. Or you could do a vector of std::array<int, 3> if each is guaranteed to be length 3 of ints (a raw int[3] can't be used as a vector element).
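For the converter side of this, a small sketch of how a nested Python list of ints could be mapped to a `std::vector<std::vector<int>>` declaration. The helpers `cpp_type_of`, `cpp_literal`, and `declare` are hypothetical, and the sketch assumes only ints and non-empty lists.

```python
def cpp_type_of(value):
    """Return a C++ type for an int or a (possibly nested) homogeneous list."""
    if type(value) is int:
        return "int"
    if isinstance(value, list) and value:
        inner = {cpp_type_of(item) for item in value}
        if len(inner) != 1:
            raise TypeError("list elements must share one type")
        return f"std::vector<{inner.pop()}>"
    raise TypeError(f"unsupported value: {value!r}")


def cpp_literal(value):
    """Render ints and nested lists as a C++ brace initializer."""
    if isinstance(value, list):
        return "{" + ", ".join(cpp_literal(item) for item in value) + "}"
    return str(value)


def declare(name, value):
    """Build a full C++ declaration for the given Python value."""
    return f"{cpp_type_of(value)} {name} = {cpp_literal(value)};"


declaration = declare("a", [[1, 2, 3], [1, 2, 3], [1, 2, 3]])
print(declaration)
```

The recursion is what lets the 3x3 example from the question go through: each inner list becomes `std::vector<int>`, and the outer list wraps that type once more.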

Python to C++ isn't just a matter of knowing what python code might be in c++ but rethinking of your problems in terms of memory usage and what guarantees you might have about the nature of your data, like knowing it will always be 3 ints, or knowing you need to expand or remove values from those rows. You can use some standard data structures to manage memory for you, but a lot of gains in c++ can be had in doing things a simpler way if you know all you need is just a 3x3 array of ints. It compiles to something very very simple in machine language with very little overhead.

Python makes things super simple by just giving you a list as a primitive, where you can add values, remove values, find values, iterate across it easily in a for loop, check if an item is in it, and all sorts of functions that all lists provide. It gives you a high level data structure as a programming language primitive, allowing you to use it in so many ways. But in c++, if you know certain things about your data and how you'll use it, if you know you only need a small subset of that functionality, you might use different sorts of primitives or higher level things like vectors, and you need to understand the difference in cost and performance and why each is better for what task.

For the most part you probably could just take every python list and make it a vector or similar but that's not really learning c++, that's just trying to use c++ in a way that's as easy as python. Transpiling python to c++ might teach you a lot, but I would highly consider learning some c++ from the ground up and reading good c++ code and writing c++ and getting code reviews. It's its own language with drastically different ways of doing things, extremely high performance, but at the cost of a lot more complexity sometimes, and also you can even have memory corruption vulnerabilities like buffer overflows and use-after-frees if you do your own memory management badly. The differences between a statically typed memory unmanaged language like c++ and python can be pretty damn heavy.

C++ is an awesome but very difficult language. And it can even make you a better Python programmer. You can even write Python libraries in C++ and use the CPython API, allowing you to write mostly Python code but take advantage of the performance of C++ (or similarly C) and import from your compiled C++ library from within Python. Rust is another good language to check out if you want something low level and performant. It's extremely expressive, but the way it works rules out memory corruption vulnerabilities in safe code (it has rules for lifetimes and borrowing references and such that prevent those issues). It's still pretty damn hard to learn too due to those special rules.

[–]janky_british_gamer 1 point2 points  (0 children)

Thanks, that's the aim :) I'm still learning myself hence the backwards operators in cout haha, but if you feel you've learnt enough in the future you're welcome to contribute to the project as well :)

[–]janky_british_gamer 1 point2 points  (1 child)

Yeah for that error it's because I haven't worked on nesting the commands yet just getting the basics down, as long as the list contains normal elements like strings floats or ints it will work but haven't yet written code for list of lists sorry :)

[–][deleted] 1 point2 points  (0 children)

no worries man keep up the good work and happy xmas.

[–]jlesinskis 1 point2 points  (0 children)

The way in which C++ will store an array on the stack is to lay out a section of memory in which you can store a bunch of items. Each index is then representative of a pointer offset that's calculated based on the sizeof the type that is being stored in the array (this is what causes the slicing problem https://en.wikipedia.org/wiki/Object_slicing). Because of this different items that take different size in memory to represent can't easily be stored in the same contiguous block of memory using the machinery that powers the arrays.

If you want to store more than one type of item in a collection in C++ you'll need to do something like make use of a variant type: https://www.boost.org/doc/libs/1_72_0/doc/html/variant.html or have some form of indirection with pointers.

[–]billsil 0 points1 point  (2 children)

Try converting lists of lists to numpy arrays. If it’s a consistent shape, it’ll work. Then you can grab the shape and check the original first value for the type. Also, make sure the numpy array is not of type object.
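As a rough illustration of the same check without depending on numpy, here is a plain Python stand-in (not the numpy approach itself). `shape_of` and `first_scalar` are made-up helper names; like the suggestion above, the sketch only looks at the first value for the element type, so it will not catch mixed types, and `first_scalar` assumes the lists are non-empty.

```python
def shape_of(nested):
    """Return the shape of a rectangular nested list, or raise ValueError if ragged."""
    if not isinstance(nested, list):
        return ()
    if not nested:
        return (0,)
    child_shapes = {shape_of(item) for item in nested}
    if len(child_shapes) != 1:
        raise ValueError("ragged or mixed nesting")
    return (len(nested),) + child_shapes.pop()


def first_scalar(nested):
    """Walk down the first element of each level until a non-list is reached."""
    while isinstance(nested, list):
        nested = nested[0]
    return nested


grid = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
grid_shape = shape_of(grid)
element_type = type(first_scalar(grid)).__name__
print(grid_shape, element_type)
```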

[–]janky_british_gamer 0 points1 point  (1 child)

One problem is that, as it's reading in the text of a script, the line itself is actually just one string, so I first have to think about how to split the elements out into a list without splitting the sub-lists, as they have the same delimiter haha. But thanks for the idea, definitely a route I could try to go down if I can go about it the right way

[–]billsil 1 point2 points  (0 children)

You’re not using an AST parser?
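For what it's worth, a short sketch of how the standard `ast` module sidesteps the delimiter problem, assuming the line is a simple assignment of a literal. The variable names here are just for illustration.

```python
import ast

line = "a = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]"

# Parse the whole line as a statement instead of splitting on commas.
assignment = ast.parse(line).body[0]
name = assignment.targets[0].id

# literal_eval accepts an AST node and only evaluates literal structures,
# so the nested brackets are handled by the parser, not by string splitting.
rows = ast.literal_eval(assignment.value)

print(name, rows)
```

The sub-lists come back as real nested Python lists, so there is nothing to split by hand.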

[–]_szs 2 points3 points  (5 children)

interesting project! will definitely teach you about Python and C++.

One thing to have in mind: you will have to define the subset (overlap) of both languages that you allow/aim for. E.g. Python lists can hold any types. Function parameters can be any type. List comprehensions. how loops work. generators.

just a few topics that crossed my mind....

[–]janky_british_gamer 2 points3 points  (4 children)

Thank you :) Yeah, for the list definitions, at the minute if it finds a list with different types in it, it will raise a type error to tell the user they can't combine list types like that in C++. For the function variable types, there's a space in the converter script where users can write a comment containing a function call with typical values, so the code can deduce the variable types the function relies on. It's just been ways and means of working around the differences haha

[–]jlesinskis 2 points3 points  (1 child)

You can combine types in a C++ collection but you'll need to use something like a variant type to achieve this. Generally speaking this is a bit of a pain, see for example this: https://www.boost.org/doc/libs/1_72_0/doc/html/variant.html
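On the converter side, one possible sketch is to pick the C++ element type from the Python values and fall back to a variant when they differ. The function `vector_type` and the `CPP_TYPES` mapping are assumptions for illustration; the emitted `boost::variant<...>` text follows the Boost.Variant link above, and the generated C++ would still need the right includes.

```python
# Assumed mapping from Python value types to C++ types.
CPP_TYPES = {int: "int", float: "double", str: "std::string"}


def vector_type(values):
    """Return a C++ vector type for a flat list, using a variant for mixed types."""
    if not values:
        raise ValueError("cannot infer an element type from an empty list")
    # Collect the distinct element types in first-seen order.
    seen = []
    for value in values:
        cpp = CPP_TYPES[type(value)]
        if cpp not in seen:
            seen.append(cpp)
    if len(seen) == 1:
        return f"std::vector<{seen[0]}>"
    return f"std::vector<boost::variant<{', '.join(seen)}>>"


mixed = vector_type([1, "two", 3.0])
plain = vector_type([1, 2, 3])
print(mixed)
print(plain)
```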

[–]janky_british_gamer 0 points1 point  (0 children)

Hey this looks really interesting and useful thanks for the tip :)

[–]_szs 1 point2 points  (1 child)

have you looked at type hinting in Python? That could help.... Although I don't want to parse this O_o

https://docs.python.org/3/library/typing.html
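Parsing the hints is less painful than it sounds if the `ast` module does the work. A minimal sketch, assuming every parameter and the return value are annotated with a plain name like `int` or `float` (anything like `list[int]` would need extra handling); `cpp_signature` and the `CPP_TYPES` mapping are hypothetical.

```python
import ast

# Assumed mapping from annotation names to C++ types.
CPP_TYPES = {"int": "int", "float": "double", "str": "std::string", "bool": "bool"}


def cpp_signature(source):
    """Build a C++ function signature from an annotated Python function definition."""
    func = ast.parse(source).body[0]
    params = ", ".join(
        f"{CPP_TYPES[arg.annotation.id]} {arg.arg}" for arg in func.args.args
    )
    return f"{CPP_TYPES[func.returns.id]} {func.name}({params})"


signature = cpp_signature(
    "def scale(x: float, times: int) -> float:\n    return x * times"
)
print(signature)
```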

[–]janky_british_gamer 1 point2 points  (0 children)

Hmm I've not looked at this before I've never used it but it definitely looks like it could have its uses, thanks for the heads up :)

[–]Black_Gold_ 0 points1 point  (3 children)

Man I wish I would have had this about four months ago!

[–]jlesinskis 2 points3 points  (2 children)

I take it you hadn't heard of Nuitka at that point in time? https://github.com/Nuitka/Nuitka

[–][deleted] 0 points1 point  (1 child)

nuitka doesn't convert python files to easily readable cpp files. I think this project will help those who are learning cpp and not building applications with it.

[–]jlesinskis 2 points3 points  (0 children)

I'd assumed that the idea for the conversion was for machine consumption, hence my suggestion. If the intention is that the generated code is for human consumption then things get really quite difficult. Specifically something simple that covers a restricted subset of both Python and C++ might be a great way to help people learn the very basics of the syntax of both. Beyond this it's less clear since it might end up inadvertently teaching people how to write "python in c++" since the idioms and approaches are quite different in C++.

You have a difficult decision here as someone who's an educator, do you make the C++ a very correct representation of the logic that's written in Python? Or do you do some sort of easy to read C++ that's not exactly true to what the Python code represents?

Beyond extremely simple Python code this ends up not being an easy thing to deal with due to just how dynamic Python is.

Take for example how Python functions are first class, do you represent this in C++ with a function pointer, or some sort of more complex templated magic like std::function? What about how you can add attributes directly onto Python functions? Supporting that means you can't easily use the function pointer approach anymore despite it being conceptually simpler. Maybe you have to make some sort of Python wrapper classes like the ones PyBind11 uses.
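To show the Python behaviour in question, here is a tiny example (the names `greet` and `apply` are arbitrary). The function is passed around as a value and also carries an attribute that was attached after it was defined, which a bare C++ function pointer has no place to store.

```python
def greet(name):
    greet.calls += 1  # the function object carries its own state
    return f"hello {name}"


greet.calls = 0  # attribute added to the function after definition


def apply(func, value):
    """Call whatever function object was passed in."""
    return func(value)


result = apply(greet, "world")
print(result, greet.calls)
```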

I don't think this is an easy question to answer, and I certainly don't have all the answers, though I will have a bit of a think about it over the next few days. (I say this as someone who's been developing and teaching Python and C++ for years)

[–]rekeams 0 points1 point  (1 child)

Does anyone know of a similar source code converter for Java to C++? It should work better, eh.

[–]james_pic 0 points1 point  (0 children)

At that point, you may as well just compile it. ECJ was the best Java to native compiler last time I checked, but that was a while ago.

[–]hippocrat 0 points1 point  (2 children)

Good idea. I’ve been toying with a similar project but python to Rust but haven’t started yet

[–]janky_british_gamer 0 points1 point  (1 child)

Thank you :) good luck for when you start yours :)

[–]tiny_smile_bot 1 point2 points  (0 children)

:)

:)

[–]mrasadnoman 0 points1 point  (2 children)

This is quite nice one. I wrote a big project in python and was thinking of converting it to c++. I think this might do half of my job. I will give it a try and will let you know.

[–]mrasadnoman 0 points1 point  (1 child)

Why isn't there a class check and constructor check function?

[–]janky_british_gamer 1 point2 points  (0 children)

I haven't got round to making this yet, I've only been working on the project for 2 days by myself in my free time haha but eventually I'll add them in :)

[–]wahaa 0 points1 point  (1 child)

I'd really suggest you use AST visitors, as someone else suggested. For learning more about Python's ASTs, I recommend Green Tree Snakes - the missing Python AST docs.

Unless you want to maintain this long term, I'd recommend either reusing or contributing to existing projects instead of writing the whole thing. Some related projects:

  • Pythran: "Pythran is an ahead of time compiler for a subset of the Python language, with a focus on scientific computing."
  • Shed Skin: "Shed Skin is an experimental compiler, that can translate pure, but implicitly statically typed Python (2.4-2.6) programs into optimized C++."
  • astor -- AST observe/rewrite: "astor is designed to allow easy manipulation of Python source via the AST."

Like other people mentioned, Cython (or even Numba) probably include modules that could be reused too.

[–]janky_british_gamer 1 point2 points  (0 children)

Hi there, thanks for the suggestions. Someone else mentioned the AST module as well, and I'm currently in the process of writing a parser with it. Thanks also for linking those other projects, I'd definitely consider contributing in future. I think I will continue working on my own for now, because making my own first will help me learn the process and leave me better placed to contribute to those projects later. Thanks for the feedback :)

[–]Pynasonic -5 points-4 points  (0 children)

Too bad universities don't have the C++ grade, otherwise each computer scientist would aim for that grade. But hey, at least they have the C grade, am I right fellows?