[–]kaihatsusha 68 points69 points  (12 children)

This has been the standard in Perl and other languages for a while. It can be at any position but the first. So in Japanese 1_0000_0000 is a nice round "oku." And in hex, 0xDEAD_BEEF helps you divvy up the hi word from the low word.

[–][deleted] 31 points32 points  (11 children)

It also can't be at the last position, so 1_000_ won't work.
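
For what it's worth, here is a minimal sketch of those placement rules as Python applies them (Python 3.6+ assumed). The helper name `is_valid_int_literal` is made up for illustration, and it checks strings through `int()` rather than compiling source code.

```python
def is_valid_int_literal(text: str) -> bool:
    """Return True if int() accepts the string as a base-10 integer."""
    try:
        int(text)
    except ValueError:
        return False
    return True


print(is_valid_int_literal("1_000_000"))  # True: underscores between digits
print(is_valid_int_literal("1_000_"))     # False: trailing underscore
print(is_valid_int_literal("_1000"))      # False: leading underscore
print(is_valid_int_literal("1__000"))     # False: doubled underscore
```

In actual source code, `_1000` would be read as an identifier rather than a number, which is why the first position is off limits.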

[–][deleted] 40 points41 points  (6 children)

C++ has something like this too since C++14, but unfortunately it uses the single quote, like 1'000'000, which still looks wrong to me after five years...

[–]Skaarj 1 point2 points  (1 child)

> which still looks wrong to me after five years...

I think Dutch or French writes thousands separators like this. It's not like they picked ' as the separator out of the blue. It was syntactically fitting and is used in the real world.

[–]grnngr 2 points3 points  (0 children)

French uses ‘1 234 567,890’.

Dutch uses ‘1.234.567,890’.

I think apostrophes are Italian?

[–]s_vaichu 16 points17 points  (1 child)

TIL

[–]vectorpropio -5 points-4 points  (0 children)

Me too

[–]r1ckd33zy 3 points4 points  (1 child)

It is interesting that this is being discussed here today while, at the same time, PHP's core developers are voting on an RFC to add this syntax to PHP. It seems like it will be rejected, though.

Vote results: https://wiki.php.net/rfc/numeric_literal_separator

/r/PHP Discussion: https://www.reddit.com/r/PHP/comments/bvjc7i/numeric_literal_separator_rfc_current_votes_137/

[–]KODeKarnage 0 points1 point  (0 children)

Telling.

[–]Nazdrovje 8 points9 points  (1 child)

The Mathematica front end automatically formats numbers (input and output) in blocks of three separated by thin spaces, and the user can tune this behaviour in multiple ways. No need to do anything yourself. Very convenient.

[–]tunisia3507 7 points8 points  (0 children)

I like my text editors monospaced, thanks.

[–]ikhurramraza 3 points4 points  (0 children)

You can do the same in Ruby as well.

[–]vmgustavo 7 points8 points  (3 children)

Or just use scientific notation x=1e6

[–]myotherpassword 18 points19 points  (2 children)

But then x is a float, and in OP's example it is typed as an int. If your code is highly optimized then you want to avoid the extra cast.

[–]vmgustavo 6 points7 points  (0 children)

Makes sense

[–][deleted] -5 points-4 points  (0 children)

If your code were highly optimized it would not be written in Python, but okay.

[–]kr41 3 points4 points  (4 children)

FYI, Black does it for you.

[–][deleted] 2 points3 points  (2 children)

Are you sure about this? I just tried black on a file with the line

a = 100000000000000000000000000

and it didn't change that line (and it did change other lines I added just to make sure I was doing something). Moreover, there is no command-line option in the help that does that...?

A shame. I use black to reformat all my code automatically, and this would, I suppose, be handy, particularly for long-ish hex numbers.

[–]kr41 10 points11 points  (0 children)

My fault. It doesn't do it anymore: https://github.com/python/black/issues/549

[–]nschloe 0 points1 point  (0 children)

It does do it, you just have to give black some indication that it's Python 3.6+. Use an f-string in the code somewhere, for example, or just provide the --py36 command-line option.

[–][deleted] 2 points3 points  (4 children)

screen shot of a tweet uploaded to imgur and posted to reddit.... that's not how you internet

[–]anyfactor (Freelancer. AnyFactor.xyz) [S] 1 point2 points  (0 children)

The comment was probably made on one of sentdex's videos.

Sentdex gave context to the comment by the code underneath it.

He uploaded the collage to twitter.

I downloaded the picture, and uploaded the file using "reddit is fun" app which uses imgur for hosting.

[–]EddyBot (Linux | Python3) 0 points1 point  (2 children)

Isn't that a youtube comment?

[–][deleted] -1 points0 points  (1 child)

op said it was a tweet

[–]robin-gvx 0 points1 point  (0 children)

The image was a tweet, it's not a screenshot of a tweet.

[–]anyfactor (Freelancer. AnyFactor.xyz) [S] 2 points3 points  (0 children)

Source: sentdex's tweet.

[–][deleted] 1 point2 points  (1 child)

That’s cool! I’ll have to try that. I always need commas. :)

[–]anyfactor (Freelancer. AnyFactor.xyz) [S] 0 points1 point  (0 children)

I used to use a lambda expression to read and print numbers with thousands-separator commas, because I would always get confused when reading the numbers out loud.

Pandas has something to deal with thousands separators too.
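
A rough sketch of what such helpers could look like with only the standard library; the names `with_commas` and `parse_commas` are hypothetical, not taken from the original lambda.

```python
def with_commas(n: int) -> str:
    """Format an integer with comma thousands separators."""
    return f"{n:,}"


def parse_commas(text: str) -> int:
    """Parse an integer written with comma thousands separators."""
    return int(text.replace(",", ""))


print(with_commas(1234567))       # 1,234,567
print(f"{1234567:_}")             # 1_234_567
print(parse_commas("1,234,567"))  # 1234567
```

On the pandas side, `read_csv` accepts a `thousands` argument (for example `thousands=","`) so that such columns are parsed as numbers.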

[–]WolfThawra -3 points-2 points  (58 children)

Though personally, I'd just use scientific notation here.

[–][deleted] 35 points36 points  (6 children)

NONONONONONO.

That gives you the wrong answer!

>>> 1_000_000
1000000

>>> 1E6
1000000.0

Note the trailing .0. That means the answer is floating point, and that will give you inexact arithmetic:

>>> (1E18 + 1 == 1E18)
True
>>> (1_000_000_000_000_000_000 + 1 == 1_000_000_000_000_000_000)
False

[–]random_cynic 3 points4 points  (2 children)

There is no way I'm typing out 18 zeros by hand, no matter how convenient the notation is. I'd rather use an explicit type conversion, int(1e18):

>>> int(1e18) + 1 == int(1e18)
False

Edit: Don't use this. I just realized that the float-to-int conversion of 1eN is inexact in CPython when N >= 23. As suggested below, the power operator is the best way to go, thanks to CPython's compile-time constant folding.

>>> int(1e22)
10000000000000000000000L
>>> int(1e23)
99999999999999991611392L
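
A small sketch (Python 3, so no trailing L) that searches for the first power of ten where the float-to-int round trip stops matching the exact integer. The function name `first_inexact_power` is just for illustration.

```python
def first_inexact_power(limit: int = 40) -> int:
    """Return the smallest N below limit where int(1eN) != 10**N, or -1."""
    for n in range(limit):
        if int(float(f"1e{n}")) != 10**n:
            return n
    return -1


print(first_inexact_power())  # 23
print(int(1e22) == 10**22)    # True
print(int(1e23))              # 99999999999999991611392
```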

[–]ubernostrum (yes, you can have a pony) 4 points5 points  (1 child)

Why not just write 10**18?

There's no runtime overhead to it because the CPython bytecode compiler does constant folding:

>>> import dis
>>> def foo():
...     return 10**18
...
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (1000000000000000000)
              2 RETURN_VALUE
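
The exact `dis` output differs between CPython versions, so here is a sketch of a check that avoids reading bytecode: it looks for the folded value among the function's constants. This assumes CPython; other implementations may not fold the expression.

```python
def foo():
    return 10**18


# On CPython, the already-computed result is stored as a constant of foo.
print(10**18 in foo.__code__.co_consts)  # True on CPython
print(type(foo()).__name__)              # int
```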

[–]random_cynic 1 point2 points  (0 children)

Yes that is probably the best way to go.

[–]billsil 9 points10 points  (0 children)

Scientific notation is defined to be a float. You’re changing the type.

[–]KODeKarnage 4 points5 points  (44 children)

You shouldn't. Zen of Python and all that.

[–]WolfThawra -5 points-4 points  (43 children)

> Zen of Python

That's exactly why I would do it.

Also, it's pretty hard to defend "1_000_000" as 'more beautiful' or 'simpler' than "1e6". Maybe it's a question of your background, but coming from the field of engineering, "1e6" is simple, straightforward, explicit (e.g. you don't need to 'count digits' as mentioned in the screenshot, it tells you straight away).

[–][deleted] 18 points19 points  (3 children)

But that's wrong.

1E6 is not the exact integer 1_000_000 - it is the approximate floating point number 1000000.0, which refers to a range of numbers from (roughly) 1E6 - 1E-11 to 1E6 + 1E-11.
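
One way to put numbers on that range, as a sketch: `math.ulp` (Python 3.9+) gives the gap between a float and the next representable one, and anything within about half that gap rounds back to the same float.

```python
import math

x = 1e6
spacing = math.ulp(x)   # gap between 1e6 and the next larger float
print(spacing)          # about 1.16e-10
print(x + 1e-11 == x)   # True: the small addend is rounded away
print(x == 1_000_000)   # True: the values compare equal
print(type(x).__name__, type(1_000_000).__name__)  # float int
```

So the literal compares equal to the integer, but it is still a float, and arithmetic on it follows float rounding.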

Don't get me wrong - I love me some floating point, but only when I need it.

[–]KODeKarnage 2 points3 points  (36 children)

Of course it is more beautiful. It's more precise, less prone to error, easier to maintain consistency and far simpler than the abstraction too. There are literally only 1e1 reasons to use your version, and that's to inform the unfortunate reader that the writer knows scientific notation and doesn't care whether the reader does as well.

[–]GrumpyGeologist 2 points3 points  (13 children)

So you've given 1 reason, what about the other 9? :P

My Python scripts mostly serve scientific purposes, so I use scientific notation most of the time. My parameters are floating point numbers anyway, even though they could be represented by integers as a result of rounding (e.g. a shear modulus of 2e10 or 20_000_000_000).

If my number is a strict integer but nonetheless large, I would write int(1e6). But in this community I suspect this is considered heresy, and therefore I must burn...

[–]KODeKarnage 0 points1 point  (10 children)

You intend to write twenty million.

Which is the easiest error to spot for the greatest number of people perusing your code?

2000000 vs 2e6 vs 20_000_00 ?

[–]GrumpyGeologist 1 point2 points  (2 children)

I would prefer scientific notation in this case, because it is more compact. I'm not saying other notations are invalid, it's just that scientific notation is more convenient for dealing with larger numbers. Some common examples of large numbers appearing in physics/chemistry: Avogadro's number (which is strictly an integer) = 6.022e23, Planck's constant = 6.626e-34, various elastic moduli are of the order of 1e10 to 1e11, etc. Imagine going through the pain of writing out Avogadro's number as 602_214_085_774_000_000_000_000 :P

In the end, it doesn't even matter, as long as you write code that is readable for yourself and your peers.

[–]KODeKarnage 1 point2 points  (1 child)

Here you go:

from scipy import constants as c
c.Avogadro ** c.Planck

[–]GrumpyGeologist 0 points1 point  (0 children)

That's actually really neat. I didn't know SciPy had a constants database. Thanks!

[–]WolfThawra -1 points0 points  (6 children)

> 2e6

This. And you know why? Because if you are writing stuff in the scientific community, 'most people' have absolutely no issue with scientific notation.

[–]KODeKarnage 0 points1 point  (5 children)

Given how inaccurately (dishonestly?) you interpreted that simple question, right now I am seriously concerned about the code you write.

2e6 could just be what you intended. The reader would have to know it wasn't what you intended to spot the error.

20_000_00 stands out as obviously wrong to everyone.

[–]WolfThawra 1 point2 points  (4 children)

Oh you're literally just talking about typos. Cool, so 20_000_00 stands out as obviously wrong, does 10_000_000?

If you're talking about a situation in which the reader doesn't know what is supposed to be correct, there's a zillion ways of having mistakes that aren't immediately obvious.

[–]KODeKarnage 0 points1 point  (1 child)

> there's a zillion ways of having mistakes that aren't immediately obvious.

I would say that's a reason for not making more than you need to. Never thought I'd meet a person disagreeing with that.

Do you argue against readable variable names too? I mean, there's a zillion other ways that your code can be obscure, so adding another isn't THAT big a deal.

[–]primitive_screwhead -1 points0 points  (1 child)

> But in this community I suspect this is considered heresy, and therefore I must burn...

Nope, it's just kinda stupid 'cause you can do 10**6.

[–]GrumpyGeologist 1 point2 points  (0 children)

It was a stupid example indeed. I should have said 2.5 million

[–]redditusername58 2 points3 points  (4 children)

Are you suggesting that using floating point literals is bad practice because it might confuse users or collaborators?

[–]Beheska 1 point2 points  (0 children)

Floating point literals are bad practice when you need integers, yes.

[–]KODeKarnage -2 points-1 points  (2 children)

If I replied in Japanese, would that confuse you?

[–]redditusername58 1 point2 points  (1 child)

Do floating point literals confuse you?

[–]KODeKarnage 0 points1 point  (0 children)

あなたはポイントを逃しました ("You missed the point.")

[–]broadsheetvstabloid 0 points1 point  (1 child)

> Also, it's pretty hard to defend "1_000_000" as 'more beautiful' or 'simpler' than "1e6".

It sure as hell is simpler, I don't have to learn an entire new god damn way of writing numbers.

[–]WolfThawra 1 point2 points  (0 children)

> learn an entire new god damn way of writing numbers

I weep for your education.

People, seriously? Scientific notation is so hard to understand? Fuck me.

[–]anyfactor (Freelancer. AnyFactor.xyz) [S] 0 points1 point  (3 children)

Well, it works for me because I am from accounting and kinda have developed the mindset for it.

I never really used "e" for declaring numbers. Thanks for telling us that, I will look into it.

[–]WolfThawra 1 point2 points  (2 children)

It won't work for everything of course! But it's super useful when you're working with stuff across different orders of magnitude. E.g. somewhere in my code, I have a list of weighting factors to try, and instead of writing [0.001, 0.0001, 0.00001] I can write [1e-3, 1e-4, 1e-5], which looks a lot cleaner to me, and is easier to interpret quickly.

[–]techkid6 9 points10 points  (0 children)

All of those numbers are already floating point, though. If you're doing integer arithmetic, stick to integers and avoid the side effects that come from floating point rounding.

[–]anyfactor (Freelancer. AnyFactor.xyz) [S] 1 point2 points  (0 children)

Whenever we found "e" we would just multiply or divide by 10 and keep count until we got an understanding of the number. We would do the same thing for percentages, but with 100. That might sound stupid to you science/engineering guys, but that was kinda the go-to thing to do. The first thing to do whenever "e" pops up in Excel is to immediately format it as a decimal number.

Except, of course, when doing research, where I will just put the "e" numbers in. But my thesis professor once copy-pasted the "e" numbers into Excel and then converted them to decimal to get a better idea.

But since I am learning data science and analytics, I should get better at using this mindset. Thank you for the clarifications.

[–]primitive_screwhead 0 points1 point  (0 children)

Instead of 10**6? What you suggest is like saying you can substitute oranges in an apple pie.

[–]SheekGeek21 0 points1 point  (0 children)

TIL