Namespaces

Classes

A decorator for File or a subclass that provides a method for extracting the full text from the file's external contents.

Text extractor that uses the PHP function strip_tags to keep only the text. Adequate for indexing, but not ideal for producing readable text.

Text extractor that calls the pdftotext command-line tool to convert PDF files to text.

Text extractor that calls an Apache Solr instance and extracts content via the "ExtractingRequestHandler" endpoint.

Enables text extraction of file content via the Tika REST server.

Enables text extraction of file content via the Tika command-line interface (CLI).
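
As a rough illustration of the strip_tags-based approach described above, the following is a minimal sketch, not the module's actual implementation. The function name extractTextFromHtml is hypothetical. The removal of script and style blocks, the entity decoding, and the whitespace collapsing are assumptions added for the example; the description above only states that strip_tags is used.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the documented classes.
function extractTextFromHtml(string $html): string
{
    // Assumption: drop <script> and <style> elements together with their contents,
    // because strip_tags() on its own would leave their inner text in the output.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);

    // Put a space in front of every tag so that adjacent blocks
    // (for example "</h1><p>") do not fuse into a single word.
    $html = str_replace('<', ' <', $html);

    // Remove all remaining markup.
    $text = strip_tags($html);

    // Turn entities such as &amp; back into plain characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse runs of whitespace; good enough for indexing, not for display.
    return trim(preg_replace('/\s+/u', ' ', $text));
}
```

The result is a single line of words with no structure, which matches the note above that this kind of output suits a search index better than a reader.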