From rick Sat Nov 9 12:40:09 2002
Date: Sat, 9 Nov 2002 12:40:09 -0800
To: ilug@linux.ie
Subject: Re: [ILUG] MS word parser ?

Quoting Paul Reilly (paulr@maths.tcd.ie):

> I'm looking for a command line tool which can parse word docs
> and spit out the text in a reasonably intelligent way. strings just
> isn't up to it. any ideas?

catdoc
Antiword
wvWare (formerly mswordview -- http://wvware.sourceforge.net/wvInfo.html)
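
For anyone wanting to script this, here is a rough sketch of how the three tools above might be tried in turn. It assumes that antiword and catdoc write the extracted text to standard output, and that the wvWare package installs a wvText wrapper taking an input and an output file name. The function name doc2txt is made up for illustration and is not part of any of these packages.

```shell
#!/bin/sh
# Sketch only: doc2txt is a hypothetical helper, not shipped by any tool.
# It uses the first converter found on the PATH and prints text to stdout.
doc2txt() {
    if command -v antiword >/dev/null 2>&1; then
        antiword "$1"                      # text goes to stdout
    elif command -v catdoc >/dev/null 2>&1; then
        catdoc "$1"                        # text goes to stdout
    elif command -v wvText >/dev/null 2>&1; then
        # Assumed usage: wvText <input.doc> <output.txt>
        tmp=$(mktemp) || return 1
        wvText "$1" "$tmp" && cat "$tmp"
        status=$?
        rm -f "$tmp"
        return "$status"
    else
        echo "doc2txt: no converter found" >&2
        return 1
    fi
}
```

Typical use would be something like `doc2txt report.doc > report.txt`, where report.doc is a placeholder file name.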

--
Cheers,
Rick Moen                  Emacs is a decent operating system,
rick@linuxmafia.com        but it still lacks a good text editor.


From: Rick Moen <rick@linuxmafia.com>
To: ilug@linux.ie
Subject: Re: [ILUG] MS word parser ?
Date: Mon, 11 Nov 2002 13:59:53 -0800

Quoting Alan Horkan (horkana@tcd.ie):

> Abiword is the answer.

[snip]

> I think the command line options are:
> abiword -to text documentname.doc
> or possibly
> abiword --to txt documentname.doc

The manpage is more than a little unclear, verging on downright
erroneous:

    -[to]
        Target format of file. For conversion of AbiWord documents.
        [abw, zabw, rtf, txt, utf8, html, latex, etc]

I've tested this, and what they _really_ mean is the literal string "-to"
followed by one of those format-type indicators. Works very well.
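
As a minimal sketch of that invocation, using the flag spelling tested above: the wrapper name word2txt is hypothetical, and where the converted file ends up is not stated in this thread, so check the result on your own system. Other AbiWord releases may spell the option differently (for example `--to=txt`), so treat the flag as version-dependent.

```shell
#!/bin/sh
# Sketch only: word2txt is a hypothetical wrapper name.
# Passes the literal string "-to" followed by a format indicator (txt here).
word2txt() {
    abiword -to txt "$1"
}
```

Usage would be along the lines of `word2txt documentname.doc`.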

It would be nice if OpenOffice.org documented their command-line
switches, if any.

--
Cheers,                    Before enlightenment, caffeine.
Rick Moen                  After enlightenment, caffeine.
rick@linuxmafia.com