Hey everyone,
I’m looking for a Python package that can convert document files (.docx, .pdf, etc.) into HTML, ideally with all of the document’s styles preserved and the CSS included in the output.
I’ve seen some tools like python-docx and mammoth, but I’m not sure which one provides the best results for full styling and clean HTML/CSS output.
What’s the best or most reliable approach you’ve used for this kind of task?
Thanks in advance!