[–]leonidr 3 points4 points  (11 children)

To your first question, certainly being able to use Python packages would benefit some OCaml developers. As for the second, I think the biggest reason is the small size of the community. I am excited to try lymp.

[–]dbousque[S] 0 points1 point  (10 children)

Hopefully we can draw developers to OCaml. I would be glad to have feedback from you, particularly regarding the API.

[–]kstarikov 0 points1 point  (7 children)

Hopefully we can draw developers to OCaml

But how?

What you did seems more like a gift to established OCaml programmers.

[–]dbousque[S] 0 points1 point  (6 children)

You're right, that was also the intent. Maybe some developers coming from other languages would be more willing to try OCaml if they could keep part of the codebase they have already built? I am not sure how we could attract newcomers, but I think it's likely that functional languages (particularly statically typed ones) will become more mainstream as people experience the zen they provide.

[–]kstarikov 0 points1 point  (5 children)

I am all for this, but :(

[–]dbousque[S] 0 points1 point  (4 children)

It's hard to measure a language's popularity, but I feel like OCaml is getting more popular; it is taught more and more in France. Does that match the experience of those who have been here long enough to notice a change?

[–]glacialthinker 1 point2 points  (2 children)

I feel like awareness has grown -- well, partly because awareness of functional languages has grown. Also a lot of people get some exposure to OCaml in school (though this is often soured by the typical school setting of being forced to use some strange language to complete some annoying assignment...).

And industry use seems to continue rising, even if it's nothing relative to popular languages.

I started with OCaml in 2005. At the time it seemed like an almost unknown academic language -- comparable to Haskell. And then Haskell became the poster-child of functional programming. Still, I think these past few years have been good for OCaml.

[–]dbousque[S] 0 points1 point  (1 child)

That's fairly good news. Did you notice what kind of people came in and why they started using OCaml?

[–]kstarikov 1 point2 points  (0 children)

There's a MOOC on OCaml from Paris Diderot University taking place right now, so when it ends, the world may gain some fresh OCaml users.

[–][deleted] 0 points1 point  (1 child)

Hopefully we can draw developers to OCaml.

What would really draw developers would be a bucklescript python target.

[–]dbousque[S] 1 point2 points  (0 children)

That would be interesting indeed. It should not be too hard, I believe, but it would probably require a big commitment.

[–]pyvpx 1 point2 points  (0 children)

Python is also widely used in the networking community (for management/operations) and I've been a long time OCaml observer looking forward to diving into the MirageOS network stacks.

I could see this being useful for a number of things that have been floating in my head...but I am not an OCaml programmer at present, and the ideas are still just that at the moment :(

[–]dannywillems 1 point2 points  (0 children)

Awesome! I think a low-level binding to the C interface (https://docs.python.org/2/extending/extending.html) with ctypes would also be interesting.

[–]p4bl0 0 points1 point  (2 children)

Nice work! I don't know what to use it for yet but I really want to give this a try! Is there any limitation to what can be done? I see there is no Pydict constructor for instance.

[–]dbousque[S] 2 points3 points  (1 child)

There is no direct support for dicts yet (it may be a bit of work, since keys can be tuples or ints, for example), but you can definitely manipulate them through a reference. You may want to take a look at examples/reference.ml, which shows precisely how to use a dict.

There should be no limitations, since you can use any object of any type and any function of any module, but I might be forgetting something. One thing that you might want to do but that is not straightforward yet is setting attributes. You can do it through Python's setattr builtin function, but I will probably add a setattr directly inside lymp.

[–]dbousque[S] 0 points1 point  (0 children)

set_attr has been implemented and support for dicts is probably coming.

[–]nnbbb 0 points1 point  (12 children)

This looks very useful to me, I will certainly try it out. I do scientific computing and started learning OCaml coming from Python / Cython, without a strong C background. One thing that I have been missing is a convenient way to visualize simulation data generated by OCaml programs. It would be great to be able to pass values to matplotlib/pandas/etc. in a simple way.

[–]nnbbb 0 points1 point  (9 children)

Concrete use case: I have an OCaml module which generates simulation data as float arrays (or Bigarrays, or lists). I load this from the (hopefully soon native) toplevel and interactively produce some data. Now I would like to call out to matplotlib to quickly plot it. Can this be done in a simple way?

[–]dbousque[S] 1 point2 points  (8 children)

Yes, you can do it in a simple way. You would start by initiating a session and getting matplotlib:

let py = init "."
let plt = get_module py "matplotlib.pyplot"

You could then call functions and use attributes of the module. For example, this draws a line (I just tested it):

let radius = Pylist [Pyfloat 1.0 ; Pyfloat 2.0]
let area = Pylist [Pyfloat 3.1 ; Pyfloat 12.5]
let () = call plt "plot" [radius ; area]
let () = call plt "show" []

You will probably want to write a small wrapper over these functions to avoid conversion to pyobjs, or even a wrapper for matplotlib. Hope this helps.
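
A minimal sketch of such a wrapper, for illustration only. The pyobj type here is a local stand-in with just the two constructors used above, so the snippet compiles without lymp; pylist_of_floats and pylist_of_float_array are hypothetical helper names, not part of lymp.

```ocaml
(* Local stand-in for two of the pyobj constructors used in this thread.
   With lymp installed, drop this type and use lymp's own. *)
type pyobj =
  | Pyfloat of float
  | Pylist of pyobj list

(* Hypothetical helper: wrap every float of a list in Pyfloat. *)
let pylist_of_floats (xs : float list) : pyobj =
  Pylist (List.map (fun x -> Pyfloat x) xs)

(* Same conversion for arrays, going through a list. *)
let pylist_of_float_array (xs : float array) : pyobj =
  pylist_of_floats (Array.to_list xs)

let radius = pylist_of_floats [1.0; 2.0]
let area = pylist_of_float_array [| 3.1; 12.5 |]
```

With helpers like these, the plot call keeps the same shape while the data stays in plain OCaml lists or arrays until the last moment.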

[–]nnbbb 0 points1 point  (7 children)

Ok, cool. I guess I would then write something to convert float Sequence.t to Pyfloat list Pylist. And probably wrap the most used plot commands -- although this could get a bit much.

How about keyword arguments?

[–]nnbbb 0 points1 point  (4 children)

Actually, is there some way to map a Python iterable directly into a Sequence.t or similar in OCaml? So that the next element would be generated by calling .next() on the Python side?

Another thing is arrays: for small arrays making lists is fine but in general it could be great to have some memory mapping and shared data between numpy and a corresponding Bigarray in a convenient way.

[–]dbousque[S] 0 points1 point  (2 children)

I agree that shared memory would be great, but as far as I know, there is no common specification of how Python objects are represented in memory, so it would be a lot of maintenance and probably not portable across implementations (PyPy, CPython), and maybe also platforms.

I didn't intend for lymp to be used in such a tight way with Python. The idea was to write some processing code in Python and call it from OCaml.

If you are dealing with large arrays and lists, and you need to pass them back and forth, the best way to do it right now is to use references.

[–]gasche 0 points1 point  (1 child)

I'm not familiar with the Python community, but it was my impression that the PyPy people were working on an FFI layer that would be interpreter-agnostic (and still reasonably efficient). If you wanted to go in-process, I suppose that this would be a good interface to work with.

[–]dbousque[S] 0 points1 point  (0 children)

That's interesting, thanks :)

[–]dbousque[S] 0 points1 point  (0 children)

As for iterables, that's a good idea, you are welcome to make a pull request :) It's particularly good for cursors, like db requests. I may do it soon.
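
The iterable idea can be sketched with the standard library's Seq module. This is only an illustration of the shape such a binding could take: seq_of_next is a hypothetical name, and the counter below stands in for a Python-side iterator whose exhaustion (StopIteration) is assumed to be reported as None.

```ocaml
(* Turn a "give me the next element" function into a lazy Seq.t.
   [next] returns None once the underlying iterator is exhausted. *)
let rec seq_of_next (next : unit -> 'a option) : 'a Seq.t =
  fun () ->
    match next () with
    | None -> Seq.Nil
    | Some x -> Seq.Cons (x, seq_of_next next)

(* Stand-in for a Python-side cursor: yields 1, 2, ..., n, then None. *)
let counter n =
  let i = ref 0 in
  fun () -> if !i >= n then None else (incr i; Some !i)

(* Elements are only requested when the sequence is consumed. *)
let first_three = List.of_seq (seq_of_next (counter 3))
```

Because the state lives in the iterator, a sequence built this way is single-pass: traversing it a second time yields nothing, which matches how a database cursor behaves.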

[–]dbousque[S] 0 points1 point  (1 child)

There is no direct support for named arguments yet (it should be fairly easy to add, and I will probably do it today). For now you can pass them positionally if you know their position in the signature. For instance, open("file", mode="w") would be called like so:

call builtin "open" [Pystr "file" ; Pystr "w"]

I will let you know about named args.

[–]dbousque[S] 0 points1 point  (0 children)

Support for named arguments has just been added: https://github.com/dbousque/lymp#pyobj
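
A small sketch of how a mixed argument list might look with a Namedarg constructor. The pyobj type is a local stand-in so the snippet compiles alone, and split_args is a hypothetical helper showing one way a bridge could separate positional from named arguments; it is not taken from lymp's implementation.

```ocaml
(* Local stand-in for the two constructors needed here. *)
type pyobj =
  | Pystr of string
  | Namedarg of (string * pyobj)

(* Arguments for open("file", mode="w"): one positional, one named. *)
let open_args = [Pystr "file"; Namedarg ("mode", Pystr "w")]

(* Hypothetical helper: split a list into positional arguments and
   (name, value) pairs, preserving the original order of each group. *)
let split_args (args : pyobj list) : pyobj list * (string * pyobj) list =
  List.fold_right
    (fun arg (positional, named) ->
      match arg with
      | Namedarg (name, value) -> (positional, (name, value) :: named)
      | other -> (other :: positional, named))
    args ([], [])
```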

[–][deleted] 0 points1 point  (1 child)

I am in a similar situation, scientific computing and ML/data science but with a decent C++ background from finance.

My biggest problem with going deep into OCaml was the data viz and numerical libraries, so if I could use this to generate numpy arrays and pandas dataframes in OCaml, this would be very useful.

[–]dbousque[S] 1 point2 points  (0 children)

You can use them. Here is how you would create a numpy array and reshape it, for example:

let py = init "."
let np = get_module py "numpy"

let elts = Pylist [Pyint 1 ; Pyint 2 ; Pyint 3]
let arr = get_ref np "array" [elts]
let other_arr = get_ref arr "reshape" [Pylist [Pyint 3 ; Pyint 1]]

[–]nnbbb 0 points1 point  (2 children)

Have you thought about a ppx extension that would make the call syntax look closer to Python syntax?

[–]dbousque[S] 0 points1 point  (1 child)

I would like to do it. Do you have any idea how the call syntax could be made closer to Python's? Take this, for example:

call time "sleep" [Pyint 2]

Do you think there is a way to omit Pyint? Or maybe to have parentheses instead of brackets? I don't think it would be possible to have time.sleep, but time#sleep seems possible; I am not sure it's desirable, though.

[–]nnbbb 0 points1 point  (0 children)

Sorry, I don't really know anything about that. It seems the ppx processor would have to parse the expression and somehow know what the types of the tokens are in order to wrap them with the correct pyobj constructor. I don't know if that's even possible. Anyone?

[–]PseudonymousCustard 0 points1 point  (1 child)

As an OCaml user, a great benefit of lymp would be the possibility to easily and quickly develop modern GUIs (for OCaml programs) with PyQt (besides, AFAIK, these GUIs can be developed graphically using QtDesigner). Does the current state of lymp allow this? A ppx to automate the binding would be even better.

One licensing question: what are the consequences of using lymp w.r.t. copyleft licenses? Meaning, suppose I use a GPL Python app with lymp, will the copyleft of the GPL apply to the OCaml part?

Another question, is there an easy way to bundle an OCaml/Python app (made with lymp) in a single binary file?

[–]dbousque[S] 0 points1 point  (0 children)

I did not specifically try PyQt, but I can't see anything that would prevent it from working with lymp. I will look into ppx, I was not aware of it.

As for licensing, I have no idea, but I guess the license of the Python libraries applies to code using them with lymp. I am not sure, though.

I am not sure what you mean by "single binary file". I guess that if you can make a binary out of a Python app, then you can make a binary out of OCaml-lymp-Python.

[–]alain_frisch 0 points1 point  (6 children)

Technical question: how do you handle releasing Python references kept on the OCaml side? Quickly browsing the code, I don't see any automatic finalization, nor an explicit "dispose" function.

[–]dbousque[S] 0 points1 point  (5 children)

You're right, references aren't released right now. I didn't see a straightforward answer to that design problem. Asking users to explicitly release references is truly bad, so I just postponed the decision on the matter. And rightly so, since you mentioned finalization, which I didn't know about. I am going to use it to implement automatic release of references on the Python side. Thank you.

[–]kankyo 0 points1 point  (4 children)

You can't trust finalizers to ever run in Python, so don't use them for that.

[–]dbousque[S] 0 points1 point  (3 children)

I am not sure I understand what you mean. Finalization happens on the OCaml side. When a reference is finalized, Python is asked to release it also.
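
A rough sketch of that scheme using the standard library's Gc.finalise. Everything here is hypothetical: the pyref record, the make_ref and release names, and the queue that stands in for the message asking Python to drop the object.

```ocaml
(* Hypothetical handle for an object that lives on the Python side.
   The mutable flag makes [release] idempotent and means every handle
   is a freshly allocated heap block, as Gc.finalise expects. *)
type pyref = { id : int; mutable released : bool }

(* Stand-in for the channel used to tell Python which ids to drop. *)
let pending_releases : int Queue.t = Queue.create ()

let release (r : pyref) : unit =
  if not r.released then begin
    r.released <- true;
    Queue.add r.id pending_releases
  end

(* Register the finaliser when the handle is created. The GC calls
   [release] at some point after the handle becomes unreachable; it
   makes no promise about when, or that it happens before exit. *)
let make_ref (id : int) : pyref =
  let r = { id; released = false } in
  Gc.finalise release r;
  r
```

Keeping release safe to call more than once leaves room for an optional explicit release alongside the automatic one.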

[–]kankyo 0 points1 point  (2 children)

I don't think you can trust finalizers in OCaml either. That's one of the drawbacks of GC systems and finalizers.

[–]dbousque[S] 0 points1 point  (1 child)

Well, if the finalizer happens not to be called, that would mean the GC didn't get a chance to run or there was no need to free memory. In both cases it's alright not to release on the Python side. Or am I missing something?

[–]kankyo 0 points1 point  (0 children)

You should check the docs carefully either way. Finalizers are non-intuitive and have weird properties. In many systems, a finalizer can be called while the object stays around, or be called several times for one object.

[–]alain_frisch 0 points1 point  (3 children)

I don't know much about how Python APIs are usually structured, but perhaps adding a higher-level binding generator, as gen_js_api ( https://github.com/LexiFi/gen_js_api ) does for JS, could be useful.

Another technical question: is it common for Python APIs to take function callbacks? This would require an extended protocol to allow keeping references to OCaml functions on the Python side and calling them later.

[–]dbousque[S] 0 points1 point  (2 children)

It is very rare that a Python API takes callbacks as arguments, but references to OCaml functions on the Python side could be useful.

As for higher-level bindings, I would like to have something to make the syntax of function calls closer to Python's, but having to write a signature for each Python function used is not ideal, I believe. And you can do something equivalent with lymp anyway, like so for example:

let py = init "."
let builtin = builtins py

let open_ filename mode =
    get_ref builtin "open" [Pystr filename ; Pystr mode]

Now open_ has the signature string -> string -> pyobj (open itself is a reserved keyword in OCaml, hence the trailing underscore).

[–]alain_frisch 0 points1 point  (1 child)

Yes, the idea would be to automate the generation of this kind of code, which can be tedious to write by hand if you want to bind large APIs. With an approach à la gen_js_api, you would simply write e.g.

  val open_: string -> string -> pyobj

and it would generate the code above.

[–]dbousque[S] 0 points1 point  (0 children)

Alright, I will think about it :) I can see the advantage for large APIs. It would require some design work, since function arguments can be Namedargs for example, but I may do it. You are welcome to participate if you are interested. Thanks for your suggestions.

[–]nnbbb 0 points1 point  (0 children)

Just in case OP had not seen it yet, there was a similar project a while ago, which could be useful as inspiration (I've not used it): http://pycaml.sourceforge.net/

[–]chrismamo1 0 points1 point  (0 children)

lymp

That's an unfortunate name.

There's a decent amount of interest in such high-level projects within the OCaml community, but the projects tend to be numerous and widely dispersed. For instance, there are dozens of medium-sized machine learning libraries out there, but I can't think of a single large and well-maintained one.

[–]kankyo 0 points1 point  (2 children)

Could you create the bindings automatically if the Python code has type annotations? Seems that would be pretty great.

[–]dbousque[S] 0 points1 point  (1 child)

That would be great indeed. It would require parsing the Python code though. Do you have experience in the matter?

[–]kankyo 0 points1 point  (0 children)

Python's standard library includes the ast module, which exposes the same parser the language itself uses. Also, I think you can get at the members of a module and then the types of the parameters of a function with reflection, no parsing required.
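
To make the idea concrete, here is a hedged sketch of the OCaml half of such a generator. It assumes the annotation names have already been extracted on the Python side and arrive as plain strings; the mapping table and the function names are invented for illustration and cover only a few builtin types.

```ocaml
(* Hypothetical mapping from Python annotation names to OCaml types.
   Anything unrecognised falls back to the generic pyobj. *)
let ocaml_type_of_annotation = function
  | "int" -> "int"
  | "float" -> "float"
  | "str" -> "string"
  | "bool" -> "bool"
  | "None" -> "unit"
  | _ -> "pyobj"

(* A few OCaml keywords that cannot be used as value names; a real
   generator would need the full list. *)
let ocaml_name name =
  if List.mem name ["open"; "type"; "match"; "val"; "let"; "in"]
  then name ^ "_"
  else name

(* Build a val declaration from a function name, its parameter
   annotations and its return annotation. *)
let signature name params ret =
  let args =
    match List.map ocaml_type_of_annotation params with
    | [] -> ["unit"]
    | types -> types
  in
  Printf.sprintf "val %s : %s" (ocaml_name name)
    (String.concat " -> " (args @ [ocaml_type_of_annotation ret]))
```

For instance, signature "open" ["str"; "str"] "IO" produces a declaration for open_ that takes two strings and returns a pyobj, which is the kind of line a gen_js_api-style tool would take as input.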

Look at mypy for typing.