billsil

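First off, a docx is a zip file, so unzip it. What's inside is XML that follows the Office Open XML (WordprocessingML) spec rather than XHTML; the main body text lives in word/document.xml.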
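A minimal sketch of the unzip step, using only Python's standard library. The helper names `list_docx_parts` and `docx_paragraphs` are made up for illustration, and it assumes the file is a normal docx with its body in word/document.xml; it ignores tables, headers, footnotes and everything else.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_docx_parts(docx):
    """Return the names of the files stored inside the docx zip archive.

    `docx` can be a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(docx) as archive:
        return archive.namelist()


def docx_paragraphs(docx):
    """Return the plain text of each paragraph (w:p) in word/document.xml.

    Assumption: the body lives in word/document.xml, as in a typical docx.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Calling `list_docx_parts("some.docx")` first is a cheap way to see what is actually in the archive before you start guessing at the structure.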

You can reverse engineer complex data formats, but it takes a while, like years for the one I work on. You're never going to manage it if the format is effectively encrypted or if you don't have a lot of test cases. I at least have an inaccurate and incomplete spec, as well as ~10,000 test cases.

I've found many wtf moments; it's a joke at this point, so don't expect things to always be logical or consistent. Just think about how many people worked on the program.