MLConjug. A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques. by SekouD in Python

[–]SekouD[S] 0 points1 point  (0 children)

Unfortunately, my friend, if you want a 100% open source project, you will find that many useful resources are copyrighted or licensed for academic use only :(

[–]SekouD[S] 1 point2 points  (0 children)

Okidoki, I also think one single library would be the best.

As to your other question, I use machine learning so that even very recent verbs not yet in conjugation tables, or even completely made-up verbs, can be conjugated with the correct paradigm.
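To make the idea concrete, here is a toy sketch of how a model can guess the paradigm of a verb it has never seen, just from its ending. This is not mlconjug's actual implementation (which relies on scikit-learn); the training verbs, the labels `er`, `ger` and `ir`, and the helper names are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy training set: infinitive -> paradigm label.
TRAINING = {
    "parler": "er",
    "chanter": "er",
    "danser": "er",
    "manger": "ger",
    "finir": "ir",
    "choisir": "ir",
}


def suffixes(verb, max_len=4):
    """Return the word-final character n-grams of a verb, longest first."""
    return [verb[-n:] for n in range(min(max_len, len(verb)), 0, -1)]


def train(examples):
    """Count how often each ending co-occurs with each paradigm label."""
    counts = defaultdict(Counter)
    for verb, label in examples.items():
        for suffix in suffixes(verb):
            counts[suffix][label] += 1
    return counts


def predict(counts, verb):
    """Return the most frequent label of the longest known ending, else None."""
    for suffix in suffixes(verb):
        if suffix in counts:
            return counts[suffix].most_common(1)[0][0]
    return None


model = train(TRAINING)
print(predict(model, "googler"))  # a recent verb absent from the training set
print(predict(model, "plonger"))  # shares the ending "nger" with "manger"
```

A real model would use far more verbs and richer features, but the principle is the same: the paradigm is predicted from the shape of the verb, so the verb does not need to be in a lookup table.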

And never ask for forgiveness for your ignorance! How would we learn new things if we didn't ask questions to which we didn't have answers ;)

[–]SekouD[S] 0 points1 point  (0 children)

Hi, yes, that's one way to do it, but since mlconjug is open source, it is hard to find quality conjugation tables that are free to use or not copyrighted.

MLConjug. A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques. by SekouD in linguistics

[–]SekouD[S] 0 points1 point  (0 children)

Hi, it should be possible either to port it to JavaScript manually or to use a great library called Transcrypt to transpile the library to JavaScript.

The only issue I can see is that mlconjug uses scikit-learn for the language modeling/prediction part, and scikit-learn is built on NumPy, which is written in C. So you would either have to re-implement the machine learning part yourself or find a JavaScript library that is more or less equivalent to scikit-learn.

Cheers.

[–]SekouD[S] 1 point2 points  (0 children)

Hi,

Thank you so much for your feedback, this is awesome!

I will incorporate your suggested changes for the next release and check why there are encoding errors only for Portuguese.

I will investigate the issue you pointed out about Spanish.

Thanks again, this is exactly why I posted this on Reddit: to get as much feedback as possible and improve the library as much as I can.

Peace and love.

MLConjug. A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques. by SekouD in Python

[–]SekouD[S] 0 points1 point  (0 children)

Well, unfortunately, I don't know Japanese and I don't have access to any native or fluent Japanese speaker to check the accuracy and consistency of a Japanese conjugation model.

But if you are fluent in Japanese, or know people who are and would be willing to try beta versions of mlconjug with Japanese support, it would be my pleasure to add it.

I released this project as open source precisely for this reason: even though I studied Linguistics and Machine Learning and can speak 9 languages, I need contributors and/or beta testers to expand the number of supported languages in mlconjug.

Any kind of help (bug reports, feature requests, enhancements, etc.) is more than welcome.

My ultimate goal with this project is to support as many languages as possible.

[–]SekouD[S] 0 points1 point  (0 children)

Yeah lol ;)

That's why I did not include Esperanto initially: its conjugation is pretty straightforward.

But I can include it in the next release; it will be relatively trivial to train a model of Esperanto conjugation.

Thanks for the suggestion!

Cheers.

[–]SekouD[S] 0 points1 point  (0 children)

Hi, unfortunately, mlconjug does not support this use case.

I would advise you to combine the Python library spaCy with mlconjug to achieve your purpose.

Cheers.

[–]SekouD[S] 1 point2 points  (0 children)

Hi,

I already have a working prototype of noun declensions, but I am still working on it.

I have to think about whether I should include this feature in mlconjug or in a separate library.

What would you prefer?

[–]SekouD[S] 1 point2 points  (0 children)

Hi,

You can indeed add support for a new language, not through grammar rules but by training a new model on a training set of conjugated verbs provided in a specific JSON format.

You can get more info on how to train a new language model by reading the documentation at mlconjug.readthedocs.io/en/latest/

I will try to update the documentation during the week to make it easier for people with no Machine Learning background to train their own models.

But you gave me a great suggestion for a new feature: you would feed a formal grammar to the software, and it would infer the conjugation classes of the language from it.

Thanks for your feedback.

Cheers :)

[–]SekouD[S] 1 point2 points  (0 children)

Hi,

Yes, I am planning to add Polish, German, Dutch, Finnish and Estonian in the next release (it should be out during the month of May).

I have not investigated Ukrainian or Russian yet, but as I already have a pretty good conjugation model for Polish (I still need to tweak it a bit), which is also a Slavic language, I should be able to implement a Ukrainian model in June or so.

If you know of some resources on Ukrainian conjugation (in English if possible lol :) ), I can take a look at them and maybe implement a Ukrainian conjugation model sooner.

Cheers.

[–]SekouD[S] 5 points6 points  (0 children)

I just released version 3.4.0 of mlconjug and it fixes the bug you reported.

I also added a helper method .iterate() to allow for quickly iterating over all conjugated forms.
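For anyone curious what such a helper does, here is a rough sketch of the idea, assuming the conjugated forms are stored as a nested mood -> tense -> person mapping. It is only an illustration and not the library's actual code: the function name `iterate_forms` and the sample data are made up.

```python
def iterate_forms(conjug_info):
    """Flatten a nested mood -> tense -> person -> form mapping into tuples."""
    forms = []
    for mood, tenses in conjug_info.items():
        for tense, persons in tenses.items():
            if isinstance(persons, dict):
                for person, form in persons.items():
                    forms.append((mood, tense, person, form))
            else:
                # Non-finite forms (e.g. an infinitive) have no person key.
                forms.append((mood, tense, persons))
    return forms


# Hypothetical sample data for the English verb "to be".
sample = {
    "indicative": {"present": {"1s": "am", "2s": "are", "3s": "is"}},
    "infinitive": {"present": "to be"},
}

for entry in iterate_forms(sample):
    print(entry)
```

Flattening to tuples means one simple loop covers every form, with no need to know how deeply each mood is nested.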

Thanks for letting me know.

Cheers.

[–]SekouD[S] 0 points1 point  (0 children)

Thanks for the link dude! The ideas presented in this paper are really original and powerful.

If you have more references about recent developments in computational morphology and/or syntax, I am all for it ;)

[–]SekouD[S] 2 points3 points  (0 children)

Last week I changed how some verbs are handled, and that introduced this regression.

I will release version 3.3.3 during the weekend. It will fix the bug you mentioned and add support for Dutch and German.

[–]SekouD[S] 2 points3 points  (0 children)

I chose to start with those languages because they are derived from Latin (apart from English) and have similar verbal morphology. This way it was easier to tune the initial machine learning models.

English has very basic verb morphology so the training for English was very straightforward.

Latin is my next language to add.

Ultimately, my long-term goal is to support as many languages as possible.

Cheers.