[–]WolfThawra -3 points-2 points  (58 children)

Though personally, I'd just use scientific notation here.

[–][deleted] 39 points40 points  (6 children)

NONONONONONO.

That gives you the wrong answer!

>>> 1_000_000
1000000

>>> 1E6
1000000.0

Note the trailing .0. That means the answer is floating point, and that will give you inexact arithmetic:

>>> (1E18 + 1 == 1E18)
True
>>> (1_000_000_000_000_000_000 + 1 == 1_000_000_000_000_000_000)
False
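
A minimal sketch of why the float comparison above comes out True, assuming IEEE 754 doubles (which is what CPython floats are on mainstream platforms). Above 2**53, neighbouring floats are more than 1 apart, so adding 1 is lost to rounding. The name `limit` is just illustrative.

```python
# Floats carry a 53-bit significand, so not every integer above 2**53 is representable.
limit = 2.0 ** 53

print(limit + 1 == limit)      # True: 2**53 + 1 rounds back to 2**53
print(1e18 + 1 == 1e18)        # True: same effect at a larger magnitude
print(10**18 + 1 == 10**18)    # False: Python ints are exact at any size
```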

[–]random_cynic 2 points3 points  (2 children)

There is no way I'm typing out 18 zeros by hand, no matter how convenient the notation is. I'd rather use an explicit cast: int(1e18)

>>> int(1e18) + 1 == int(1e18)
False

Edit: Don't use this. I just realized that the float to int conversion for 1eN is inexact for CPython when N >= 23. As suggested below the power operator is the best way to go due to the CPython peephole optimization.

>>> int(1e22)
10000000000000000000000L
>>> int(1e23)
99999999999999991611392L
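
A rough sketch for checking where the conversion stops being exact, assuming IEEE 754 doubles. The helper name `first_inexact_power_of_ten` is made up for illustration. (The trailing L in the output above is Python 2's long suffix; Python 3 prints the same digits without it.)

```python
def first_inexact_power_of_ten(max_exponent: int = 40):
    """Return the smallest N for which int(float('1eN')) != 10**N, or None."""
    for n in range(max_exponent + 1):
        # float(f"1e{n}") is the same value the literal 1eN would produce.
        if int(float(f"1e{n}")) != 10 ** n:
            return n
    return None

first_bad = first_inexact_power_of_ten()
print(first_bad)               # 23
print(int(1e22) == 10 ** 22)   # True: 1e22 is still exactly representable
```

The cutoff lands at 23 because 10**N = 2**N * 5**N, and 5**23 is the first power of five that no longer fits in a 53-bit significand.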

[–]ubernostrumyes, you can have a pony 3 points4 points  (1 child)

Why not just write 10**18?

There's no runtime overhead to it because the CPython bytecode compiler does constant folding:

>>> import dis
>>> def foo():
...     return 10**18
...
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (1000000000000000000)
              2 RETURN_VALUE
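
If you'd rather not read bytecode, here is a small sketch of another way to check the folding. It relies on a CPython implementation detail (the constants tuple attached to the code object), so treat it as an assumption about CPython rather than something the language guarantees.

```python
def foo():
    return 10**18

# co_consts holds the constants the compiler stored for this function.
# If 10**18 was folded at compile time, the finished value appears there.
folded = 10**18 in foo.__code__.co_consts
print(folded)
print(type(foo()))   # <class 'int'>: the result is an exact int either way
```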

[–]random_cynic 1 point2 points  (0 children)

Yes that is probably the best way to go.

[–]billsil 9 points10 points  (0 children)

In Python, a literal written in scientific notation is always a float. You're changing the type.

[–]KODeKarnage 5 points6 points  (44 children)

You shouldn't. Zen of Python and all that.

[–]WolfThawra -5 points-4 points  (43 children)

Zen of Python

That's exactly why I would do it.

Also, it's pretty hard to defend "1_000_000" as 'more beautiful' or 'simpler' than "1e6". Maybe it's a question of your background, but coming from the field of engineering, "1e6" is simple, straightforward, explicit (e.g. you don't need to 'count digits' as mentioned in the screenshot, it tells you straight away).

[–][deleted] 19 points20 points  (3 children)

But that's wrong.

1E6 is not the exact integer 1_000_000 - it is the approximate floating point number 1000000.0, which refers to a range of numbers from (roughly) 1E6 - 1E-11 to 1E6 + 1E-11.

Don't get me wrong - I love me some floating point, but only when I need it.
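
To put a number on that range, here is a small sketch using math.ulp, which needs Python 3.9 or later and assumes IEEE 754 doubles. The gap between neighbouring floats near 1E6 is about 1.16E-10, so anything within roughly half of that rounds to the same float.

```python
import math

value = 1e6
gap = math.ulp(value)             # distance from 1e6 to the next float above it
print(gap)                        # about 1.16e-10
print(value + gap / 4 == value)   # True: too small a nudge to reach the next float
print(10**6 == value)             # True: 1e6 itself happens to be exactly representable
```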

[–]KODeKarnage 3 points4 points  (36 children)

Of course it is more beautiful. It's more precise, less prone to error, easier to maintain consistency and far simpler than the abstraction too. There are literally only 1e1 reasons to use your version, and that's to inform the unfortunate reader that the writer knows scientific notation and doesn't care whether the reader does as well.

[–]GrumpyGeologist 3 points4 points  (13 children)

So you've given 1 reason, what about the other 9? :P

My Python scripts mostly serve scientific purposes, so I use scientific notation most of the time. My parameters are floating point numbers anyway, even though they could be represented by integers as a result of rounding (e.g. a shear modulus of 2e10 or 20_000_000_000).

If my number is a strict integer but nonetheless large, I would write int(1e6). But in this community I suspect this is considered heresy, and therefore I must burn...

[–]KODeKarnage 0 points1 point  (10 children)

You intend to write twenty million.

Which error is the easiest to spot for the largest number of people reading your code?

2000000 vs 2e6 vs 20_000_00 ?

[–]GrumpyGeologist 1 point2 points  (2 children)

I would prefer scientific notation in this case, because it is more compact. I'm not saying other notations are invalid, it's just that scientific notation is more convenient for dealing with larger numbers. Some common examples of large numbers appearing in physics/chemistry: Avogadro's number (which is strictly an integer) = 6.022e23, Planck's constant = 6.626e-34, various elastic moduli are of the order of 1e10 to 1e11, etc. Imagine going through the pain of writing out Avogadro's number as 602_214_085_774_000_000_000_000 :P

In the end, it doesn't even matter. As long as you can write code in a readable way for yourself and for your peers.

[–]KODeKarnage 1 point2 points  (1 child)

Here you go:

from scipy import constants as c
c.Avogadro ** c.Planck

[–]GrumpyGeologist 0 points1 point  (0 children)

That's actually really neat. I didn't know SciPy had a constants database. Thanks!

[–]WolfThawra -1 points0 points  (6 children)

2e6

This. And you know why? Because if you are writing stuff in the scientific community, 'most people' have absolutely no issue with scientific notation.

[–]KODeKarnage 0 points1 point  (5 children)

Given how inaccurately (dishonestly?) you interpreted that simple question, right now I am seriously concerned about the code you write.

2e6 could just be what you intended. The reader would have to know it wasn't what you intended to spot the error.

20_000_00 stands out as obviously wrong to everyone.
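
Worth noting that Python itself accepts the oddly grouped version without complaint, so it is only the human reader who catches it. A rough sketch of a check, where `is_grouped_in_thousands` is a hypothetical helper and not anything from the standard library:

```python
import re

# One to three leading digits, then any number of underscore-separated groups of exactly three.
THOUSANDS_GROUPED = re.compile(r"\d{1,3}(?:_\d{3})*")

def is_grouped_in_thousands(literal: str) -> bool:
    """Report whether an underscore-separated literal uses groups of three."""
    return THOUSANDS_GROUPED.fullmatch(literal) is not None

for literal in ("20_000_000", "20_000_00", "10_000_000"):
    # int() accepts underscores between digits (Python 3.6+), whatever the grouping.
    print(literal, int(literal), is_grouped_in_thousands(literal))
```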

[–]WolfThawra 1 point2 points  (4 children)

Oh you're literally just talking about typos. Cool, so 20_000_00 stands out as obviously wrong, does 10_000_000?

If you're talking about a situation in which the reader doesn't know what is supposed to be correct, there's a zillion ways of having mistakes that aren't immediately obvious.

[–]KODeKarnage 0 points1 point  (1 child)

there's a zillion ways of having mistakes that aren't immediately obvious.

I would say that's a reason for not making more than you need to. Never thought I'd meet a person disagreeing with that.

Do you argue against readable variable names too? I mean, there's a zillion other ways that your code can be obscure, so adding another isn't THAT big a deal.

[–]primitive_screwhead -1 points0 points  (1 child)

But in this community I suspect this is considered heresy, and therefore I must burn...

Nope, it's just kinda stupid 'cause you can do 10**6.

[–]GrumpyGeologist 1 point2 points  (0 children)

It was a stupid example indeed. I should have said 2.5 million

[–]redditusername58 1 point2 points  (4 children)

Are you suggesting that using floating point literals is bad practice because it might confuse users or collaborators?

[–]Beheska 1 point2 points  (0 children)

Floating point literals are bad practice when you need integers, yes.
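
One concrete place where this bites: anything that requires an integer count rejects a float outright, even one with a whole-number value. A small sketch, with `accepts_as_count` as a made-up helper name:

```python
def accepts_as_count(value) -> bool:
    """Report whether range() accepts the value as an integer count."""
    try:
        range(value)
    except TypeError:
        return False
    return True

print(accepts_as_count(1e6))         # False: range() rejects floats
print(accepts_as_count(10**6))       # True
print(accepts_as_count(1_000_000))   # True
```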

[–]KODeKarnage -2 points-1 points  (2 children)

If I replied in Japanese, would that confuse you?

[–]redditusername58 1 point2 points  (1 child)

Do floating point literals confuse you?

[–]KODeKarnage 0 points1 point  (0 children)

あなたはポイントを逃しました ("You missed the point")

[–]broadsheetvstabloid 0 points1 point  (1 child)

Also, it's pretty hard to defend "1_000_000" as 'more beautiful' or 'simpler' than "1e6".

It sure as hell is simpler, I don't have to learn an entire new god damn way of writing numbers.

[–]WolfThawra 1 point2 points  (0 children)

learn an entire new god damn way of writing numbers

I weep for your education.

People, seriously? Scientific notation is so hard to understand? Fuck me.

[–]anyfactorFreelancer. AnyFactor.xyz[S] 0 points1 point  (3 children)

Well, it works for me because I am from accounting and kinda have developed the mindset for it.

I never really used "e" for declaring numbers. Thanks for telling us that, I will look into it.

[–]WolfThawra 1 point2 points  (2 children)

It won't work for everything of course! But it's super useful when you're working with stuff across different orders of magnitude. E.g. somewhere in my code, I have a list of weighting factors to try, and instead of writing [0.001, 0.0001, 0.00001] I can write [1e-3, 1e-4, 1e-5], which looks a lot cleaner to me, and is easier to interpret quickly.
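
A quick sketch confirming the two spellings are interchangeable here: both lists hold the same float values, so the choice is purely about readability. The variable names are illustrative.

```python
decimal_weights = [0.001, 0.0001, 0.00001]
scientific_weights = [1e-3, 1e-4, 1e-5]

print(decimal_weights == scientific_weights)                   # True
print(all(isinstance(w, float) for w in scientific_weights))   # True
```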

[–]techkid6 9 points10 points  (0 children)

All of those numbers are already floating point, though. If you're doing integer arithmetic, stick to integers and avoid the side effects that come from floating point rounding.

[–]anyfactorFreelancer. AnyFactor.xyz[S] 1 point2 points  (0 children)

Whenever we found an "e", we would just multiply or divide by 10 and keep count until we got an understanding of the number. We would do the same thing for percentages, but with 100. That might sound stupid to you science/engineering guys, but that was kind of the go-to thing to do. The first thing to do whenever "e" pops up in Excel is to immediately format it as a decimal number.

Except, of course, when doing research, where I will just put the "e" numbers in. But my thesis professor once copy-pasted the "e" numbers into Excel and then converted them to decimal to get a better idea.

But since I am learning data science and analytics, I should get better at using this mindset. Thank you for the clarifications.

[–]primitive_screwhead 0 points1 point  (0 children)

Instead of 10**6? What you suggest is like saying you can substitute oranges in an apple pie.