TikaServerTextExtractor
class TikaServerTextExtractor extends FileTextExtractor
Enables text extraction of file content via the Tika Rest Server
Traits
Configurable | Provides extensions to this object to integrate it with standard config API methods. |
Injectable | A class that can be instantiated or replaced via DI |
Config options
priority | int | Tika server is pretty efficient so use it immediately if available |
server_endpoint | string | Server endpoint |
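As a rough illustration of how these options might be set from PHP, the fragment below goes through the `config()` accessor listed under Methods. The namespace, the `set()` and `get()` calls on the accessor, and the endpoint URL (port 9998 is the usual Tika server default) are assumptions rather than details taken from this page, and the fragment presumes an already bootstrapped SilverStripe application.

```php
<?php
// Assumed namespace; check the module version you have installed.
use SilverStripe\TextExtraction\Extractor\TikaServerTextExtractor;

// Point the extractor at a locally running Tika server (hypothetical URL).
TikaServerTextExtractor::config()->set('server_endpoint', 'http://localhost:9998');

// Read the value back through the same accessor.
$endpoint = TikaServerTextExtractor::config()->get('server_endpoint');
```

Setting the value in the project's configuration files is likely the more common route; the PHP form is shown only because it maps directly onto the methods documented here.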
Properties
protected static | array | $sorted_extractor_classes | Cache of extractor class names, sorted by priority | from FileTextExtractor |
protected | TikaRestClient | $client | | |
protected | array | $supportedMimes | Cache of supported mime types | |
Methods
config() | Get a configuration accessor for this class. Shorthand for Config::inst()->get($this->class, ...). |
uninherited() | Gets the uninherited value for the given config option |
create() | An implementation of the factory method, allows you to create an instance of a class |
singleton() | Creates a class instance by the "singleton" design pattern. |
get_extractor_classes() | Gets the list of prioritised extractor classes |
get_extractor() | Get the text file extractor for the given class |
for_file() | Given a File object, decide which extractor instance to use to handle it |
getPathFromFile() | Some text extractors (like pdftotext) may require a physical file to read from, so this writes the current file contents to a temp file and returns its path |
Details
static Config_ForClass
config()
Get a configuration accessor for this class. Shorthand for Config::inst()->get($this->class, ...).
mixed
stat(string $name)
deprecated
Get inherited config value
mixed
uninherited(string $name)
Gets the uninherited value for the given config option
$this
set_stat(string $name, mixed $value)
deprecated
Update the config value for a given property
static Injectable
create(mixed ...$args)
An implementation of the factory method, allows you to create an instance of a class
This method will defer class substitution to the Injector API, which can be customised via the Config API to declare substitution classes.
This can be called in one of two ways: either calling via the class directly, or calling on Object and passing the class name as the first parameter. The following are equivalent:
$list = DataList::create(SiteTree::class);
$list = SiteTree::get();
static Injectable
singleton(string $class = null)
Creates a class instance by the "singleton" design pattern.
It will always return the same instance for this class, which can be used for performance reasons and as a simple way to access instance methods which don't rely on instance data (e.g. the custom SilverStripe static handling).
static protected array
get_extractor_classes()
Gets the list of prioritised extractor classes
static protected FileTextExtractor
get_extractor(string $class)
Get the text file extractor for the given class
static FileTextExtractor|null
for_file(File|string $file)
Given a File object, decide which extractor instance to use to handle it
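Since `for_file()` is documented as returning either an extractor or null, callers need to handle the null case before asking for content. The sketch below shows one way to do that. `extractWithResolver` and its `$resolver` parameter are hypothetical names introduced for illustration; the resolver is passed in as a callable so the function stays independent of the framework.

```php
<?php
/**
 * Resolve an extractor for $file and return its text, or null when no
 * extractor is found.
 *
 * $resolver is any callable shaped like for_file(): it takes the file and
 * returns an extractor object or null.
 */
function extractWithResolver(callable $resolver, $file): ?string
{
    $extractor = $resolver($file);
    if ($extractor === null) {
        // No extractor claims this file.
        return null;
    }
    return $extractor->getContent($file);
}
```

In an application this might be called as `extractWithResolver([FileTextExtractor::class, 'for_file'], $file)`, assuming `FileTextExtractor` is imported from the module's namespace.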
static protected string
getPathFromFile(File $file)
Some text extractors (like pdftotext) may require a physical file to read from, so this writes the current file contents to a temp file and returns its path
bool
isAvailable()
No description
bool
supportsExtension(string $extension)
No description
bool
supportsMime(string $mime)
No description
string
getContent(File|string $file)
Given a File instance, extract the contents as text.
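A caller will probably want to check `isAvailable()` before calling `getContent()`, since the extractor depends on an external server. The following is a minimal sketch of that guard. `extractIfAvailable` is a hypothetical helper, and the extractor is accepted untyped so that any object exposing the two documented methods will work.

```php
<?php
/**
 * Return the extracted text, or null when the extractor reports that it is
 * not available (for example, no Tika server is reachable).
 */
function extractIfAvailable($extractor, $file): ?string
{
    if (!$extractor->isAvailable()) {
        return null;
    }
    return $extractor->getContent($file);
}
```

Returning null rather than an empty string lets the caller distinguish "could not extract" from "file contains no text".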
TikaRestClient
getClient()
No description
string
getServerEndpoint()
No description
float
getVersion()
Get the version of Tika installed, or 0 if not installed
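Because `getVersion()` is documented as returning 0 when Tika is not installed, a version check has to treat 0 as "absent" rather than as a very old release. The sketch below illustrates this. `tikaMeetsVersion` is a hypothetical helper, and it compares versions as plain floats, so it is only meaningful for version numbers that order correctly as decimals.

```php
<?php
/**
 * True when the extractor reports an installed Tika whose version is at
 * least $minimum; false when Tika is missing (version 0) or too old.
 */
function tikaMeetsVersion($extractor, float $minimum): bool
{
    $version = $extractor->getVersion();
    return $version > 0 && $version >= $minimum;
}
```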