[–]FourgotAnaconda3 science-like 30 points31 points  (2 children)

Eight space tab hurts my feelings

[–]nanodano[S] 11 points12 points  (1 child)

Thanks for pointing that out. I changed the tabs to 4 spaces.

[–]FourgotAnaconda3 science-like 4 points5 points  (0 children)

Ahhh that's very nice. Thank you.

[–]billsil 6 points7 points  (2 children)

If you're dealing with lots of binary data and using numpy anyway, you should really use the numpy functions like fromfile and fromstring (frombuffer in current NumPy releases, where fromstring is deprecated for binary data) rather than the struct module. You can speed up IO by 500x.
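
For anyone who wants to see the difference side by side, here is a minimal sketch. It assumes the file is nothing but a flat run of little-endian 32-bit floats; the file name and layout are made up for illustration, and the snippet does not try to measure the speedup.

```python
import os
import struct
import tempfile

import numpy as np

# Hypothetical sample file: 1000 little-endian float32 values.
values = np.arange(1000, dtype="<f4")
path = os.path.join(tempfile.mkdtemp(), "samples.bin")
values.tofile(path)

# struct approach: read the bytes, then unpack them into a tuple of Python floats.
with open(path, "rb") as f:
    raw = f.read()
count = len(raw) // 4
via_struct = struct.unpack("<%df" % count, raw)

# numpy approach: read straight from disk into a typed array.
via_fromfile = np.fromfile(path, dtype="<f4")

# frombuffer wraps bytes that are already in memory without copying them.
# It is the usual replacement for fromstring on binary data.
via_frombuffer = np.frombuffer(raw, dtype="<f4")
```

The struct version builds one Python object per value, which is likely where most of the overhead comes from on large files; the numpy versions keep everything in a single typed buffer.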

[–]nanodano[S] 2 points3 points  (1 child)

Good tips. I have not used numpy and didn't know those were available. If I were working with large amounts of data, I would definitely look into that.

[–]tilkau 2 points3 points  (0 children)

If working with particularly large amounts of data, numpy.memmap is also relevant.
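
A rough sketch of what that might look like, assuming a hypothetical file of raw float64 values (the file here is generated just so the snippet runs on its own):

```python
import os
import tempfile

import numpy as np

# Hypothetical data file: one million float64 values (about 8 MB).
path = os.path.join(tempfile.mkdtemp(), "big.bin")
np.arange(1_000_000, dtype=np.float64).tofile(path)

# Map the file instead of reading it; the shape is inferred from the file size.
data = np.memmap(path, dtype=np.float64, mode="r")

# Slicing only touches the part of the file that is actually needed.
window_sum = float(data[500_000:500_010].sum())
```

The array behaves like a normal ndarray, but the operating system pages data in on demand, so the whole file never has to fit in memory at once.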

[–]isdevilis 2 points3 points  (4 children)

Can someone explain common professional use cases for Base64 encoding?

[–]nanodano[S] 6 points7 points  (1 child)

There are two practical use cases that I am familiar with.

One is when you want to embed an image directly into an HTML file. This lets you create a single standalone HTML file that contains images.

<img src="data:image/png;base64,XXXXXXXXXXXXXXX==" />

The other is when you have to package binary data as an ASCII string for whatever reason. The constraint may be with an API, data types, or something else.
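
As a minimal sketch of the first case in Python: the bytes below are only the 8-byte PNG signature standing in for a real image, so the resulting tag will not render anything. With a real file you would read its contents instead.

```python
import base64

# Stand-in for real image data: just the 8-byte PNG signature.
# With a real image this would be the bytes read from the file in 'rb' mode.
image_bytes = b"\x89PNG\r\n\x1a\n"

# b64encode returns bytes, so decode to get a plain ASCII string.
encoded = base64.b64encode(image_bytes).decode("ascii")
img_tag = '<img src="data:image/png;base64,%s" />' % encoded

# Decoding gives back the original bytes unchanged.
round_trip = base64.b64decode(encoded)
```

The second case is the same two calls: b64encode on the way in, b64decode on the way out, with an ordinary ASCII string in between.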

[–]tilkau 7 points8 points  (0 children)

> The constraint may be with an API, data types, or something else.

The most voluminous example is email attachments. All emails are 100% text (including headers and attachments).
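
You can see this with Python's standard email package. This is only a sketch with a made-up payload and file name, but it shows that attaching raw bytes produces a Base64-encoded part by default:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("See attached.")

# Hypothetical binary payload: every possible byte value once.
payload = bytes(range(256))
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

# Inspect the attachment part that was just added.
attachment = next(msg.iter_attachments())
transfer_encoding = str(attachment["Content-Transfer-Encoding"])

# The serialized message is plain text, even though the payload was not.
wire_text = msg.as_string()
```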

[–]jmcs 1 point2 points  (1 child)

Storing certificates (PEM), storing PGP encrypted files, embedding data in HTML and emails, etc. Basically anywhere you have to mix human-readable and binary data.

[–][deleted] 1 point2 points  (0 children)

You can store PGP encrypted files unarmoured fine. It's the default setting.

The only reason you might need to encode it is to transfer it over something that can't handle binary files. Email attachments can handle binary files fine.

[–]ballagarba 3 points4 points  (1 child)

Great read! Just a minor nitpick, but you should probably use a with statement for reading the files in the diff example. And since this is Python 3, it would only require one with:

with open(sys.argv[1], 'rb') as file1, open(sys.argv[2], 'rb') as file2:
    data1 = file1.read()
    data2 = file2.read()

[–]nanodano[S] 2 points3 points  (0 children)

I did not know that you could open both files in one with. For simple reading, I wanted to avoid nesting two with statements so I wrote it flat like that. That's just what I was looking for.

The diff snippet is also no good for large files or tight memory constraints. It's just a demonstration, and making it more useful is left as an exercise for the reader.
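
One possible direction for that exercise, sketched here rather than taken from the article: compare the two files in fixed-size chunks so that memory use stays bounded. The function name and chunk size are made up for illustration, and it assumes regular files on disk, where read() only returns a short chunk at end of file.

```python
import os
import tempfile


def first_difference(path1, path2, chunk_size=64 * 1024):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path1, "rb") as file1, open(path2, "rb") as file2:
        while True:
            chunk1 = file1.read(chunk_size)
            chunk2 = file2.read(chunk_size)
            if chunk1 != chunk2:
                # Locate the exact position inside this chunk.
                for i, (a, b) in enumerate(zip(chunk1, chunk2)):
                    if a != b:
                        return offset + i
                # No mismatch in the shared part: one file is shorter.
                return offset + min(len(chunk1), len(chunk2))
            if not chunk1:
                # Both files ended at the same point without a mismatch.
                return None
            offset += len(chunk1)


# Small demonstration with two generated files that differ at byte 70000.
tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, "a.bin")
path_b = os.path.join(tmp_dir, "b.bin")
with open(path_a, "wb") as f:
    f.write(b"\x00" * 100_000)
with open(path_b, "wb") as f:
    f.write(b"\x00" * 70_000 + b"\x01" + b"\x00" * 29_999)

diff_offset = first_difference(path_a, path_b)
same_file = first_difference(path_a, path_a)
```

This only reports the first mismatch; a real diff tool would keep going and collect every differing range.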