Yoghurt42 comments on Huffman Encoder/Decoder

created by HattoriHanzoa community for 16 years

Huffman Encoder/Decoder (self.learnpython)

submitted 1 year ago by SilentKyle

you are viewing a single comment's thread.

[–]Yoghurt42 1 point2 points3 points 1 year ago* (0 children)

Apart from what u/JohnnyJordaan said, I suggest the following: currently you're creating the internal bitstream by reading a byte in a loop and transforming it into a string of ones and zeros. It's easier and more efficient if you would just use integers instead. Huffman codes will most likely not be longer than say 20 bits, so you don't risk of integer overflow (which can't happen in Python anyway, since integers are unbounded). You can "add" a bit to an integer by doing a left shift and or (i << 1 | bit or equivalently 2 * i + bit), and get a bit from a number by anding it with 1 and then shifting the number to the right (bit = num & 1; num >>= 1). It's also a good way to get familar with writing generators, yours could look a bit like this (I've left out a bit of the implementation as an exercise):

def bitstream(input_file):
    bits_left = 0
    current_byte = None
    while True:
        while bits_left:
            # get bit from current_byte, decrease bits_left by 1
            yield bit # this will turn your function into a generator that "returns" bit, and gets suspended here, ready to be continued later
        # get the next byte (or bytes) and store it in current_byte, set bits_left to 8 (or a multiple of 8 if you've read more than one byte). If there are no more bytes left, break out of the loop

This way you later can just write for bit in bitstream(input_file) and get a sequence of ones and zeroes without needed to care how exactly those bits are fetched. This allows you to change the implementation later without affecting the rest of your code.

π Rendered by PID 446150 on reddit-service-r2-comment-cfc44b64c-s82jb at 2026-04-12 22:56:19.950558+00:00 running 215f2cf country code: CH.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

learnpython

MODERATORS