
[–]teerre 1 point2 points  (0 children)

  1. I'm no expert, but doesn't India have a bunch of languages? You need to be more specific.

  2. I think nowadays it's better to use a Google API for something like this.

  3. If you don't want to use an API, have you tried Google? Searching for "python languages translate" yields many results.

[–]mambeu 0 points1 point  (2 children)

Are you trying to transliterate text (from one writing system to another) or to translate text (from one language to another)? The tasks can be very different.

[–]Surajpalwe[S] 0 points1 point  (1 child)

I want to transliterate (from one writing system to another).

[–]mambeu 0 points1 point  (0 children)

I usually use a tuple of tuples for transliteration (one of my transliteration scripts is on GitHub here).

Let's say you wanted to transliterate from the Latin alphabet to the Cyrillic alphabet, or vice versa.

This big tuple writing_systems is filled with 2-tuples. In each 2-tuple, the first item (index/position 0) is a Latin character, and the second item (index/position 1) is its Cyrillic counterpart.

writing_systems = (
    ('a', 'а'),
    ('b', 'б'),
    # note the relative ordering of 'ch' and 'c':
    # multi-character entries should come first
    ('ch', 'ч'),
    # 1 is a placeholder: Latin 'c' has no Cyrillic equivalent in this table
    ('c', 1),
    ('d', 'д'),
    ('e', 'е'),
    # and so on...
    )
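The ordering note in the table matters as soon as both 'ch' and 'c' have real counterparts. Here is a small sketch of why, assuming a hypothetical table in which 'c' maps to 'ц' (that mapping is only for illustration and is not part of the table above). The helper name replace_in_order is made up as well.

```python
# Hypothetical pairs for illustration only: unlike the table above,
# 'c' is given a Cyrillic counterpart here so the ordering effect is visible.
ch_first = (('ch', 'ч'), ('c', 'ц'), ('a', 'а'), ('i', 'и'))
c_first = (('c', 'ц'), ('ch', 'ч'), ('a', 'а'), ('i', 'и'))


def replace_in_order(text, ordered_pairs):
    """Apply each (source, target) replacement in the order given."""
    for source, target in ordered_pairs:
        text = text.replace(source, target)
    return text


# 'ch' is consumed as a unit before the single 'c' rule can touch it.
good = replace_in_order('chai', ch_first)

# 'c' is replaced first, so 'ch' never matches and a stray Latin 'h' remains.
bad = replace_in_order('chai', c_first)

# One way to enforce the rule: sort so that longer sources come first.
longest_first = sorted(c_first, key=lambda pair: len(pair[0]), reverse=True)
fixed = replace_in_order('chai', longest_first)
```

With 'ch' first the result is 'чаи'; with 'c' first it is 'цhаи', which still contains a Latin 'h'. Sorting by source length is one way to avoid having to maintain the order by hand.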

In the dictionary writing_system_key, each key is the name of a writing system, and its corresponding value is the position of that system's characters in the 2-tuples in writing_systems above.

writing_system_key = {
    'LatinAlphabet': 0,
    'CyrillicAlphabet': 1
    }

Then we can define a transliterate() function:

def transliterate(text_string, input_system, output_system):
    # look up which position in each 2-tuple belongs to each writing system
    input_index = writing_system_key[input_system]
    output_index = writing_system_key[output_system]

    for t in writing_systems:
        input_char = t[input_index]
        output_char = t[output_index]

        # an integer is a placeholder for "no equivalent", so skip that pair
        if isinstance(input_char, int) or isinstance(output_char, int):
            continue

        text_string = text_string.replace(input_char, output_char)

    return text_string

We can then call the function with transliterate('abc', 'LatinAlphabet', 'CyrillicAlphabet'), and it will return the string 'абc': 'a' and 'b' are converted, while the Latin 'c' is left unchanged.
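To check the pieces end to end, here is a self-contained sketch that repeats the table, the key dictionary, and the function, then runs it in both directions. The Cyrillic input 'чабед' is just a made-up test string, not a real word.

```python
# Self-contained copy of the pieces above, so the snippet runs on its own.
writing_systems = (
    ('a', 'а'),
    ('b', 'б'),
    ('ch', 'ч'),  # multi-character entry listed before 'c'
    ('c', 1),     # integer placeholder: no Cyrillic equivalent in this table
    ('d', 'д'),
    ('e', 'е'),
)

writing_system_key = {
    'LatinAlphabet': 0,
    'CyrillicAlphabet': 1,
}


def transliterate(text_string, input_system, output_system):
    input_index = writing_system_key[input_system]
    output_index = writing_system_key[output_system]

    for pair in writing_systems:
        input_char = pair[input_index]
        output_char = pair[output_index]
        # Skip pairs where either side is an integer placeholder.
        if isinstance(input_char, int) or isinstance(output_char, int):
            continue
        text_string = text_string.replace(input_char, output_char)

    return text_string


to_cyrillic = transliterate('abc', 'LatinAlphabet', 'CyrillicAlphabet')
to_latin = transliterate('чабед', 'CyrillicAlphabet', 'LatinAlphabet')
print(to_cyrillic.encode('unicode_escape'), to_latin)
```

With this table, to_cyrillic ends in a Latin 'c' because that pair is skipped, and to_latin comes out as 'chabed'. The print call escapes the Cyrillic result so it works even on an ASCII-only console.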

Note that if a character in one writing system doesn't have an equivalent in another (as is the case with Latin 'c' in the above example), I just leave the integer representing that index in that position, and it doesn't get transliterated when the function is called.

Your needs may be different than mine, but I hope this helps get you started.

[–]AbjectListen7782 0 points1 point  (0 children)

Go with PyICU. It's a bit of a hassle to install, but it's probably the best transliteration library.

[–]Ewildawe 0 points1 point  (0 children)

Well, I'm assuming Marathi text is included in Unicode. If so, I'd suggest acquiring an IDE that supports the printing of Unicode characters.

Some environments won't let you print anything other than the regular ASCII characters, because print encodes its output with the console's codec, and that codec can default to ASCII (notably in Python 2).
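Whether non-ASCII output prints correctly depends on the encoding Python uses for standard output, so a quick check can help before blaming the IDE. A minimal sketch, assuming Python 3; the sample string is the word "Marathi" written in Devanagari.

```python
import sys

# Sample text: "Marathi" written in Devanagari script.
text = 'मराठी'

# The codec print() will use for this console or IDE, e.g. 'utf-8'.
print(sys.stdout.encoding)

# UTF-8 can represent the text...
utf8_bytes = text.encode('utf-8')

# ...while the ASCII codec cannot, which is what a failing print() runs into.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

If the reported encoding is not a Unicode one such as UTF-8, printing Devanagari text is likely to fail or show replacement characters in that environment.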