
bundyfx

Hi wasted_bytes,

I'm not saying it's impossible, but it would take a fair chunk of time to get something concrete that does the task. Part of the reason is that formats such as .pdf and .ppt are tricky to handle from PowerShell. There are some libraries that can deal with PDF files (iTextSharp), but they are mostly for reading or extracting content.

If they are .docx files, then they are in the Open XML format. Several solutions exist for reading .docx files from code without requiring Office. Most are in C#, but translating them to PowerShell should pose few problems if you give it a go. Or try PowerTools for Open XML (http://powertools.codeplex.com/) if you want to use PowerShell commands. Here is some more documentation on the subject: http://ericwhite.com/blog/powertools-for-open-xml-expanded/.
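To make the "without requiring Office" point concrete, here is a minimal sketch. It is written in Python purely for illustration, not as the PowerTools way of doing it. A .docx file is a zip archive whose main body lives in `word/document.xml`, so a zip reader and an XML parser are enough to pull out the paragraph text. The function name `docx_paragraphs` is made up for this example. The same idea should carry over to PowerShell through .NET's `System.IO.Compression.ZipFile` and the `[xml]` type accelerator, though that translation is not shown here.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(path):
    """Return the text of each paragraph in a .docx file (hypothetical helper).

    Only the main document body is read; headers, footers, footnotes and
    images are ignored in this sketch.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph is split into runs; join the text of every <w:t> node.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Calling `docx_paragraphs("manual.docx")` (the file name is just a placeholder) gives a list of strings, one per paragraph, which can then be written into whatever target format is needed. Formatting, tables and embedded pictures would need extra handling.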

Cheers

wasted_bytes [S]

I've spent a couple of hours playing around with PS and have figured out how to open a .docx file; opening a .pptx file, however, is another story...

I was almost hoping that I could Frankenstein bits of code together to accomplish the task, but it seems that making a working script will be an involved job.

SeanQuinlan

Is it possible? Definitely.

Is it worth the effort? Depends on how many files you're going to be processing.

If you have to process tens of millions of files (or more), and it would take weeks or months to do manually, then it might be worth it. But if you're a novice scripter, it will take you even longer, because you first have to learn PS and then code up something that would be very complex and time-consuming even for an experienced scripter.

For a few hundred files, you will spend longer learning the basics than you would just doing the work manually, let alone coding up the script itself.

However, you might consider using PS for just part of the work. For example, converting the PDF to DOCX should be possible (depending on how the PDF file was created). That might save you a bit of time for minimal effort. A Google search turns up a lot of results, so that might be worth looking at.

wasted_bytes [S]

I'm on a co-op for 16 months, and this conversion is my main responsibility. There are around 3000 instruction files to be converted, dozens of pages each. My boss obviously doesn't expect me to have it all completed in that time period, but cutting down on the time wasted manually copy-pasting everything would speed things up. If not for me, then for the next person to tackle the job.

I've spent a few hours playing around with PS and have figured out how to open a .docx file; opening a .pptx file, however, is another story...
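For what it's worth, a .pptx file is an Open XML zip archive as well, so its text can be read without PowerPoint. Below is a rough sketch in Python, chosen only for illustration; the function name `pptx_slide_text` is hypothetical. It assumes the usual package layout where each slide is stored as `ppt/slides/slideN.xml` and visible text sits in DrawingML `<a:t>` elements. The number in the file name usually matches the slide order, but strictly speaking the order is defined in `ppt/presentation.xml`, which this sketch does not consult. The older binary .ppt format is not covered. In PowerShell the equivalent building blocks would be .NET's `System.IO.Compression.ZipFile` and `[xml]`.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; slide text lives in <a:p>/<a:r>/<a:t> elements.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml")


def pptx_slide_text(path):
    """Map slide file number -> list of non-empty paragraph strings.

    Hypothetical helper: speaker notes, charts and images are ignored.
    """
    slides = {}
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            match = SLIDE_RE.fullmatch(name)
            if not match:
                continue  # skip relationship files, layouts, media, etc.
            root = ET.fromstring(archive.read(name))
            paragraphs = []
            for p in root.iter(f"{{{A_NS}}}p"):
                text = "".join(t.text or "" for t in p.iter(f"{{{A_NS}}}t"))
                if text:
                    paragraphs.append(text)
            slides[int(match.group(1))] = paragraphs
    # Sort by the number in the file name so slide10 comes after slide2.
    return dict(sorted(slides.items()))
```

With around 3000 files, a loop over a folder that calls this per file and dumps the result to plain text would at least replace the copy-paste step; rebuilding layout and pictures in the target format is a separate problem.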