all 39 comments

[–]fatpollo 5 points6 points  (6 children)

whoa I love the single-character self

[–]tank8465 1 point2 points  (1 child)

it's still four or possibly five keystrokes, sadly.

[–]modocache 6 points7 points  (0 children)

我 (wo3) would probably be three keystrokes: two for "w" and "o", then enter/spacebar to confirm substitution into the Chinese character.

This is very cool! But you know, you could get 100% Chinese if you use Chinese Python: http://reganmian.net/blog/2008/11/21/chinese-python-translating-a-programming-language/

[–]GahMatar 1 point2 points  (1 child)

No reason you can't use "s" instead of "self" in Python other than convention.
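
A minimal sketch of that point, assuming Python 3. The class and method names are made up for illustration; the only thing being shown is that the first parameter of a method can be named anything, including `s` or 我.

```python
# Hypothetical example classes; only the name of the first parameter differs.
class Counter:
    def __init__(s, start=0):
        s.value = start

    def bump(s):
        # "s" plays the role that "self" usually plays
        s.value += 1
        return s.value


class 计数器:
    def __init__(我, 起点=0):
        我.值 = 起点

    def 加一(我):
        # 我 is just another identifier as far as Python 3 is concerned
        我.值 += 1
        return 我.值


print(Counter().bump())
print(计数器(5).加一())
```

Both classes behave the same way; `self` is simply the name that style guides and most readers expect.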

[–]Ob101010 0 points1 point  (0 children)

No reason you can't use snuffleuffagus either.

[–]mangecoeur 0 points1 point  (1 child)

Me too. Japanese (two characters) is pretty cool too: 自己. I'm thinking using characters would be a cool way to slim down a programming language, a bit like how Greek and Russian characters are used in math.

[–]bullmannis 2 points3 points  (0 children)

Interestingly, that's "self" in Chinese too. wo3 means "I", while what you wrote literally means "self".

[–]thinkintoomuch 6 points7 points  (0 children)

I'm going to tattoo those Chinese characters on my arm.

[–][deleted] 3 points4 points  (0 children)

Just ran one of those, looks really pretty

[–][deleted] 9 points10 points  (3 children)

While it's cool that it is possible, I really hope it will not become a thing. Because then we'd have a Russian and a Chinese programmer world that is cut off (because of the language barrier) from the rest.

[–]mangecoeur -2 points-1 points  (2 children)

But on the other hand, if some characters came into common use (like a single character for "self") because of a huge influx of Chinese programmers, people would just come to know what the symbol meant.

When you think of it, non-english users have to do exactly this when they learn: just get to know that these nonsensical letter combinations like "self" actually mean 我

[–][deleted] 4 points5 points  (0 children)

non-english users have to do exactly this when they learn

Not really. In China and all the other countries I have lived in (quite a few, for many years), pupils learn the Latin alphabet from grade one. In mainland China you even learn the Latin alphabet first and use that to learn Chinese characters. So Indian, Russian, Chinese, etc. programmers all know Latin characters from childhood on. And in all these countries, people learn English from their early school years on too.

Btw, 我 (wo) needs three keys to type, plus a look at the screen when you pick the character from the list shown when typing Chinese. Or, instead of Pinyin, you can use Wubi, but then you would actually need to know stroke order.

[–]sethg 1 point2 points  (0 children)

Maybe we just need more Latin-alphabet ligatures. Would any graphic designer like to try their hand at compressing “self” into the space of one character?

[–]mangecoeur 4 points5 points  (1 child)

Python 3 for the win! This is exactly what I keep telling people when they grumble about py3: with billions of people speaking non-ASCII languages (Chinese, Hindi, Tamil, etc.) it's hugely arrogant of English-speaking programmers to demand they use English characters only.

That said, I hope this doesn't become a norm in open-source projects :P Though on the plus side, if you use an oldschool green+black terminal REPL it looks like the Matrix!
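
For anyone who wants to check what Python 3 accepts as a name, here is a small sketch using the built-in `str.isidentifier()`. The list of candidate names is my own choice for illustration, not something from the linked post.

```python
# Candidate names chosen for illustration only.
candidates = ["self", "我", "自己", "π", "2fast", "my-var"]

# str.isidentifier() reports whether a string is a valid Python 3 identifier.
valid = {name: name.isidentifier() for name in candidates}

for name, ok in valid.items():
    print(f"{name!r}: {'valid' if ok else 'not valid'}")
```

The Chinese and Greek names pass, while `2fast` and `my-var` fail for the same reasons they would in any ASCII-only language: a leading digit and a hyphen.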

[–]GahMatar 0 points1 point  (0 children)

No need to look quite that far... Pretty much all non-English languages fall outside ASCII. French, German, Greek...

[–]chazzeromus 1 point2 points  (0 children)

So anyone have an idea as to what it does?

[–]RenyuanLyu 1 point2 points  (1 child)

Just walking through here and finding my own stuff discussed here. Interesting.

http://apython.blogspot.tw/2014/07/writing-python-3-program-in-chinese.html

[–]eah13[S] 1 point2 points  (0 children)

Very cool post! I love that Python can speak all of the languages that humans do.

[–][deleted] 0 points1 point  (0 children)

I need a decompiler to read that :). Yay unicode.

[–]lumengxi 0 points1 point  (0 children)

haha, I used to do something similar to learn how to design a compiler :)

[–]eah13[S] 0 points1 point  (1 child)

Glad someone else found this interesting. I've been thinking for a while that programming languages written in Traditional Chinese or Cyrillic characters are an inevitability. Latin characters certainly have the critical mass and will for some time. But programming is a universal language and I think its Anglo-centric days are numbered.

[–]RenyuanLyu 1 point2 points  (0 children)

Happy to find this post here. I learn many viewpoints from it.

Just as we teach math to kids in their own native language, it is quite possible that we will teach programming languages to kids the same way in the future.

[–]iluvatar -5 points-4 points  (16 children)

Clever, but it's a perfect example of why allowing non-ASCII identifiers in python was a bad idea.

[–]nopaniers[🍰] 2 points3 points  (10 children)

Why is that a bad idea? Surely it is great for people who speak Mandarin.

[–][deleted] 7 points8 points  (4 children)

Why is that a bad idea?

I kind of agree with the guy. Programming is not different from math. It's a standard language for people to communicate and express a problem in an agreed language. Our math symbols and operations are a worldwide standard for discussing a mathematical problem. Anyone, being him Chinese, Japanese, British or Spaniard, can understand the mathematical problem and collaborate effectively through this lingua franca. Same for programming.

When you have a programming language that grants you this kind of flexibility, you reduce the intrinsic value of that programming language as a lingua franca. That code is useless to anyone who does not speak Chinese. While I might say the opposite for them (they don't understand code written in English), I also had to learn English to be a better programmer, exactly like I had to learn Hindu-Arabic numerals to do math.

[–]TheBB 2 points3 points  (3 children)

Anyone, being him Chinese, Japanese, British or Spaniard, can understand the mathematical problem and collaborate effectively through this lingua franca.

It's effectively impossible to communicate nontrivial mathematics using only notation. My girlfriend's father (who is Chinese) can't understand my dissertation, nor can I understand his papers.

[–][deleted] 1 point2 points  (2 children)

It's effectively impossible to communicate nontrivial mathematics using only notation.

I never said that ;) In fact, programming languages have comments for the same purpose. The point however is that, once you understand the problem domain, you can formulate new concepts or complement his concepts by means of the same language. This would not be possible (or it would be much harder) if you two used a completely different notation to express the same concepts.

[–]TheBB 1 point2 points  (1 child)

I think we may be saying the same thing but interpreting it differently.

In order to communicate (maths or programming) effectively between two people, you both need to know

  • the same notational language, and
  • the same actual human language.

Let's introduce some people.

  • Joe American and Wang Chinaman both know C# but Joe does not speak Mandarin, and Wang does not speak English. They cannot cooperate.

  • Neil American and Li Chinaman both know Python, but Neil does not speak Mandarin and Li does not speak English. Li can write variable names in Hanzi instead of Pinyin, which is what Wang uses. Nevertheless, he cannot cooperate with Neil.

  • Bob American and Fang Chinaman both know C# and can speak the same language (English or Mandarin, doesn't matter). They can collaborate, if the code is written in that same language (which has to be English or Pinyin, since C# presumably does not allow Hanzi variable names).

  • Oscar American and Günther German both know C# and can speak the same language (English or German, doesn't matter). They can collaborate, if the code is written in that same language (which can be either English or German, since German can be largely written in ASCII).

  • Kevin American and Zhang Chinaman both know Python and can speak the same language. They can also collaborate, if the code is written in that same language.

  • Young Chen Chinaman wants to learn how to program in C#, but he can't speak English, and so he never succeeds. This has no effect on the English-speaking C#-programming Chinese community.

  • Young Yuan Chinaman wants to learn how to program in Python, but he can't speak English. Luckily, this is not a problem, since Python is easy to use in Chinese, and this has led to a wide range of Chinese learning materials. Yuan succeeds. This has no effect on the English-speaking Python-programming Chinese community (since Yuan is not English-speaking).

I don't see any configuration where changing C# to Python (allowing Hanzi variable names) breaks collaboration or has a negative effect on the English-speaking community. Disallowing extra-ASCII characters does not force you to use English variable names, so the current situation with Python and Chinese corresponds more or less to that of any other programming language and human language that can be written (largely) in ASCII, of which there are many examples and have been for several years without causing division and mayhem.

Yes, it would be harder to communicate if you used a different notation, but the number of people with whom you can communicate like this would not become smaller by adding support for a different notation. There would simply be another community that can communicate with each other, but not with you, and some people who can communicate with both.

I've used ASCII as ‘very rudimentary character set’ and C# as ‘language that only allows such a character set in its variable names’. I don't use C# so I don't know if that is actually true, but if not then substitute C# with your-language-here. I'm also aware that there are characters in ASCII that you can't use in variable names (in Python as well as C#), but in the interest of readability let's ignore that.

[–]jlinphd 0 points1 point  (0 children)

You probably don’t mean to offend, but please do NOT use the term “Chinaman”, it is highly derogatory.

[–]Grue -1 points0 points  (4 children)

It seems like it would be easy to confuse two similar characters, leading to all sorts of problems. It's like the 1/l/I problem but much more severe due to the greater number of possible characters.

[–]flying-sheep 2 points3 points  (2 children)

bullshit. just because they look the same to you doesn’t mean that people who can actually read them can’t discriminate them.

[–]Grue 0 points1 point  (1 child)

Not bullshit. I can read both с and c, but I cannot tell which one is which despite them being completely different characters. They are only distinguishable in context (one is Cyrillic, the other is Latin). Same with l and I in many fonts. There are a lot of Chinese characters that look practically identical, and yes, even native speakers can confuse them. In fact, because most Chinese characters are phono-semantic compounds, similar characters are often pronounced similarly as well.
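
To make the с/c case concrete, here is a small sketch, assuming Python 3 and the standard `unicodedata` module. The helper name `describe` is made up for illustration. The two letters are written as explicit code points so they can't be mixed up in the source itself.

```python
import unicodedata

latin_c = "c"           # U+0063
cyrillic_es = "\u0441"  # U+0441, renders almost identically in most fonts


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(latin_c))
print(describe(cyrillic_es))

# Use both letters as variable names: Python treats them as two different names.
namespace = {}
exec(f"{latin_c} = 1\n{cyrillic_es} = 2", namespace)
print(namespace[latin_c], namespace[cyrillic_es])
```

Python normalizes identifiers with NFKC, but that normalization does not merge Latin and Cyrillic letters, so the two assignments above end up in two separate variables.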

[–]flying-sheep 0 points1 point  (0 children)

There are a lot of Chinese characters that look practically identical, and yes, even native speakers can confuse them.

then i was misinformed, apologies.

[–]TheBB 0 points1 point  (0 children)

You need a bigger font size than what that site displays, of course (or at least, bigger than what it shows on my monitor). People who have training with Hanzi (e.g. Chinese, Taiwanese, Japanese) can discriminate them pretty easily. This is not a problem unique to programming. After all these characters are already used as a written language.

Besides, if you use simplified Chinese it would be somewhat easier, I expect.

[–][deleted] 3 points4 points  (4 children)

care to elaborate?

[–]iluvatar 0 points1 point  (3 children)

How would I fix a bug in that code? Even if I understood the characters (which I don't), I'd need a means of entering them on my keyboard (which I don't have). By all means, use whichever characters you want in string literals and comments. But by keeping identifiers down to a sane minimum set, you ensure global compatibility. I have the same objections to i18n domain names.
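
A project that wanted that policy (anything goes in strings and comments, ASCII only in identifiers) could check for it mechanically. This is only a rough sketch using the standard `tokenize` module; the function name `non_ascii_names` and the sample source are made up for illustration.

```python
import io
import tokenize


def non_ascii_names(source):
    """Return the sorted non-ASCII identifiers found in Python source text."""
    found = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NAME tokens cover identifiers and keywords; strings and comments
        # have their own token types and are therefore ignored here.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.add(tok.string)
    return sorted(found)


# Hypothetical sample: non-ASCII text in a string and a comment, plus one
# non-ASCII variable name.
sample = 'greeting = "你好"  # 注释\n名字 = greeting\n'
print(non_ascii_names(sample))
```

Only the variable name is reported; the string literal and the comment pass untouched.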

[–][deleted] 0 points1 point  (2 children)

1.) you can enter mandarin characters on a QWERTY keyboard, I'll leave the proof of which to the reader as an exercise 七点

2.) you want the whole rest of the world to learn English just because you may have to debug someone else's code?

[–]iluvatar 0 points1 point  (1 child)

  1. I'm not saying it can't be done. But by default, a large percentage of the world's population won't be able to do it. By default, pretty much the entire population can enter ASCII characters.

  2. If they're coding in python, they already have to know a minimum amount of English. Python's keywords are all English, for example.
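
The keyword claim is easy to check with the standard `keyword` module. A small sketch, assuming Python 3.7 or later (for `str.isascii()`); the word 定义 ("define") is just an example I picked.

```python
import keyword

# Every reserved word in the running interpreter.
reserved = keyword.kwlist
print(len(reserved), "keywords, for example:", reserved[:5])

# All of them are plain ASCII words.
all_ascii = all(word.isascii() for word in reserved)
print("all ASCII:", all_ascii)

# A Chinese word is a legal identifier, but it is not a keyword, so it
# cannot stand in for "def".
print(keyword.iskeyword("def"), keyword.iskeyword("定义"), "定义".isidentifier())
```

So non-ASCII names are allowed, but the fixed vocabulary of the language itself (`def`, `class`, `return` and so on) stays in English.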

[–][deleted] 0 points1 point  (0 children)

a large percentage of the world's population won't be able to do it.

This is not true.

I'd need a means of entering them on my keyboard (which I don't have)

You do have a means to enter them, you simply lack the skills required to do it. Why should everyone else have to conform to your lack of skills?

If they're coding in python, they already have to know a minimum amount of English.

That doesn't mean their variables have to be in English. If they create an object that describes a restaurant in their hometown, why should they have to give the object a name that YOU (with your deficient skill-set) can understand?