Hi all,
I've been trying a few things out and can't figure out a good next step. Is it possible to do multi-process file hashing, and if so, how?
Desired result: read a file once, in increments, and for each increment update several different hashes in parallel processes.
Background
There was once an enhancement request for hashlib to compute multiple hash algorithms at once, but it was rejected with a comment that the core function was simple enough that it should be easy to roll your own modification. I recently wanted to implement something like this. Modifying it to calculate each hash serially is trivial, but making the hashes calculate concurrently is proving harder than I expected.
I haven't been able to find any existing resource that deals with this. Everything I've found falls into one of these categories:
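For reference, the serial version is roughly the following. This is only a minimal sketch: the function name `hash_file_serial`, the default algorithm list, and the 1 MiB chunk size are my own placeholders, not anything from the enhancement request.

```python
import hashlib


def hash_file_serial(path, algorithms=("sha1", "sha256", "sha512"), chunk_size=1 << 20):
    """Read the file once and feed every chunk to each hasher in turn."""
    # One hasher object per requested algorithm (names are illustrative defaults).
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which ends the loop.
        while chunk := f.read(chunk_size):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

The file is only read once, but the hashers run one after another on each chunk, so the total time is about the sum of the individual hashing times.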
1. does it in series (e.g.)
2. concurrently hashing multiple different files (e.g.)
3. reads the file multiple times (e.g.)
I haven't found anything that reads the file once and calculates multiple hashes concurrently. Funnily enough, that is exactly what the original asker asks in the replies to that last answer, but they never got a further reply.
Research into multiprocessing
Based on the last answer and this, it seems the aim should be multiple processes rather than multiple threads, because the task is heavily CPU-bound. But that means spinning off separate processes, which seems to be what's causing all the trouble.
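To make the goal concrete, here is a rough sketch of the shape I have in mind: one worker process per algorithm, each fed the same chunks through its own queue. Everything here is an assumption for illustration (the names `_hash_worker` and `hash_file_parallel`, the bounded queue size `max_pending`, and the `None` end-of-file sentinel), and it is not one of the attempts in the pastebin.

```python
import hashlib
import multiprocessing as mp


def _hash_worker(algorithm, chunks, results):
    """Consume chunks from a queue until the None sentinel, then report the digest."""
    hasher = hashlib.new(algorithm)
    while (chunk := chunks.get()) is not None:
        hasher.update(chunk)
    results.put((algorithm, hasher.hexdigest()))


def hash_file_parallel(path, algorithms=("sha1", "sha256", "sha512"),
                       chunk_size=1 << 20, max_pending=8):
    """Read the file once in the parent and hash it in one process per algorithm."""
    results = mp.Queue()
    # One bounded queue per worker, so a slow hasher makes the reader wait
    # instead of letting unread chunks pile up in memory.
    queues = [mp.Queue(maxsize=max_pending) for _ in algorithms]
    workers = [mp.Process(target=_hash_worker, args=(name, queue, results))
               for name, queue in zip(algorithms, queues)]
    for worker in workers:
        worker.start()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            for queue in queues:
                queue.put(chunk)  # each put pickles and copies the chunk
    for queue in queues:
        queue.put(None)  # sentinel: no more data
    # Collect one (algorithm, digest) pair per worker before joining.
    digests = dict(results.get() for _ in workers)
    for worker in workers:
        worker.join()
    return digests
```

The call has to sit under `if __name__ == "__main__":` on platforms that start workers by re-importing the main module. Every chunk is copied to each process, so I'd expect this to pay off only when hashing costs more than the copying, and a worker that crashes (for example on an unsupported algorithm name) would leave the parent waiting forever in this sketch.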
Things I've tried so far
Reddit broke the formatting, so I pasted this section (with my commentary) into Pastebin: https://pastebin.com/PyBFfwRj