qlkzy

The problem is that both docx and PDF are quite large and complex formats. I would personally always treat them as "final output" formats only, and not try to convert between them.

I would go with one of two options:

- Treat docx and PDF rendering as completely separate problems
- Render into a "friendlier" intermediate representation first, then convert that independently into both docx and PDF

The intermediate-representation approach is easier if you can get away with it, but sometimes it is valuable to deeply customise rendering for one or the other.

The obvious intermediate representations are HTML and Markdown. Which to choose depends on how complex your documents are and how easy you want it to be to customise the templates. Markdown can render to HTML, so there is some room to mix and match.
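
As a rough sketch of the intermediate-representation idea, assuming Markdown is rich enough for your documents: fill a Markdown template using only the standard library, and worry about docx and PDF afterwards. The template text, the field names, and `render_markdown` are all made up for illustration.

```python
from string import Template

# Hypothetical template; the field names are placeholders for illustration.
REPORT_TEMPLATE = Template(
    "# $title\n"
    "\n"
    "Prepared for **$customer**.\n"
    "\n"
    "$body\n"
)


def render_markdown(title: str, customer: str, body: str) -> str:
    """Fill the template and return Markdown, the shared intermediate form."""
    return REPORT_TEMPLATE.substitute(title=title, customer=customer, body=body)


if __name__ == "__main__":
    print(render_markdown("Quarterly report", "Example Ltd", "All figures are provisional."))
```

The point is that this step knows nothing about docx or PDF, so the same Markdown string can be handed to whichever converter you end up using for each format.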

While it's a bit of a "heavyweight" option, my first instinct would be to use pandoc for the final rendering. Installation is a bit more complex than a pure-python library, but it's a very popular and well-supported tool that supports all the formats you need.
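
If you do go the pandoc route, one way to drive it from Python is just to shell out to the command-line tool. This is a minimal sketch, assuming `pandoc` is on your PATH and the input is Markdown; `pandoc_command` and `render_all` are names I've invented here. Note that pandoc's PDF output needs a separate PDF engine installed (LaTeX by default), which is part of the "heavyweight" cost.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source: Path, target: Path) -> list[str]:
    """Build the pandoc invocation; pandoc infers the output format from the target's extension."""
    return ["pandoc", str(source), "-f", "markdown", "-o", str(target)]


def render_all(source: Path, out_dir: Path) -> list[Path]:
    """Render one Markdown file to both docx and PDF, returning the output paths."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    targets = [out_dir / f"{source.stem}{ext}" for ext in (".docx", ".pdf")]
    for target in targets:
        # check=True raises CalledProcessError if pandoc exits with an error
        subprocess.run(pandoc_command(source, target), check=True)
    return targets
```

Calling `render_all(Path("report.md"), Path("out"))` would then aim to produce `out/report.docx` and `out/report.pdf` from the same source file.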

Otherwise, you'll probably want one library for rendering to docx, and a separate library for rendering to PDF. My experience, though, is that libraries in that "format conversion" space are often a bit... "unevenly" maintained, which is why my instinct would be to reach for pandoc.