Indexing PDF documents requires one of those external programs. We should ship one of them out of the box.
Nutch ships with a PDF plugin based on PDFBox: http://svn.apache.org/viewcvs.cgi/incubator/nutch/trunk/src/plugin/parse-pdf/