from MS Word or Excel to .txt

This is like using "Save as Text" in Word or Excel. It handles .doc, .docx (Office 2007) and .xls files.
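
To illustrate why a .docx converts cleanly, here is a minimal sketch in Python. It is not how this program does the conversion. It assumes only that a .docx is a zip archive whose main text sits in word/document.xml, as the Office Open XML format specifies. The function name docx_to_text is made up for this example, and headers, footnotes and text boxes are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements inside a .docx
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path_or_file):
    """Return the body text of a .docx, one line per paragraph.

    Hypothetical helper for illustration only: it reads just
    word/document.xml and joins the text runs of each paragraph.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each <w:t> element holds one fragment of visible text
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Calling docx_to_text("report.docx") (the file name is just an example) would return the text as one string, which can then be written out to a .txt file. The older binary .doc and .xls formats are not zip archives, so this sketch does not apply to them.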


from PDF

... into plain text. This is not guaranteed to work with every PDF, as the format has changed over the years and some files are complex.

Converting PDFs to plain text can be extremely tricky, even if you own a licensed copy of the Adobe software (Adobe themselves created the PDF format in 1993). That is because a PDF is a representation of all the dots, colours, shapes and lines in a document, not a long string of words. When all you have is an image of the text, it can be very hard to determine the underlying words and sentences. A second problem is that PDFs can be set with security rights that prevent copying, printing, editing and so on. Other formats (.TXT, .DOC, .DOCX, .XML, .HTML, .RTF etc.) are OK in principle, because they store the words and sentences themselves rather than only an image of them.
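
The security problem can be spotted before attempting a conversion. The sketch below, in Python, is a rough heuristic and not this program's own method. It assumes that a PDF starts with a "%PDF-" header near the top of the file and that an encrypted PDF carries an /Encrypt entry in its trailer, both of which follow the PDF specification. The function name inspect_pdf is invented for this example. A plain byte search can give a false alarm if the text "/Encrypt" happens to appear elsewhere in the file, and an encrypted PDF may still permit copying.

```python
def inspect_pdf(data: bytes) -> dict:
    """Quick, heuristic checks on the raw bytes of a PDF file.

    Hypothetical helper for illustration only; it does not parse the PDF.
    """
    return {
        # The header should appear within the first 1024 bytes of the file
        "looks_like_pdf": b"%PDF-" in data[:1024],
        # Encrypted files reference an encryption dictionary via /Encrypt
        "has_encrypt_entry": b"/Encrypt" in data,
    }
```

To use it, read the file in binary mode, for example with open("paper.pdf", "rb") (the file name is an example only), and pass the bytes to inspect_pdf. If has_encrypt_entry is true, a failed or empty conversion is more likely to be caused by security settings than by the layout of the page.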


