FileTextExtractor
abstract class FileTextExtractor
A decorator for File or a subclass that provides a method for extracting full-text from the file's external contents.
Traits
Configurable | Provides extensions to this object to integrate it with standard config API methods. |
Injectable | A class that can be instantiated or replaced via DI. |
Config options
priority | int | Set priority from 0-100. |
Properties
protected static | array | $sorted_extractor_classes | Cache of extractor class names, sorted by priority |
Methods
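static Config_ForClass | config() | Get a configuration accessor for this class. Shorthand for Config::inst()->get($this->class, ...). |
mixed | uninherited(string $name) | Gets the uninherited value for the given config option. |
static Injectable | create(mixed ...$args) | An implementation of the factory method; allows you to create an instance of a class. |
static Injectable | singleton(string $class = null) | Creates a class instance by the "singleton" design pattern. |
static protected array | get_extractor_classes() | Gets the list of prioritised extractor classes. |
static protected FileTextExtractor | get_extractor(string $class) | Get the text file extractor for the given class. |
static FileTextExtractor|null | for_file(File|string $file) | Given a File object, decide which extractor instance to use to handle it. |
static protected string | getPathFromFile(File $file) | Some text extractors (like pdftotext) may require a physical file to read from, so write the current file contents to a temp file and return its path. |
abstract bool | isAvailable() | Checks if the extractor is supported on the current environment, for example if the correct binaries or libraries are available. |
abstract bool | supportsExtension(string $extension) | Determine if this extractor supports the given extension. |
abstract bool | supportsMime(string $mime) | Determine if this extractor supports the given MIME type. |
abstract string | getContent(File|string $file) | Given a File instance, extract the contents as text. |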
Details
static Config_ForClass
config()
Get a configuration accessor for this class. Shorthand for Config::inst()->get($this->class, ...).
mixed
uninherited(string $name)
Gets the uninherited value for the given config option
static Injectable
create(mixed ...$args)
An implementation of the factory method, allows you to create an instance of a class
This method will defer class substitution to the Injector API, which can be customised via the Config API to declare substitution classes.
This can be called in one of two ways: either calling via the class directly, or calling on Object and passing the class name as the first parameter. The following are equivalent:
    $list = DataList::create(SiteTree::class);
    $list = SiteTree::get();
static Injectable
singleton(string $class = null)
Creates a class instance by the "singleton" design pattern.
It will always return the same instance for this class, which can be used for performance reasons and as a simple way to access instance methods which don't rely on instance data (e.g. the custom SilverStripe static handling).
static protected array
get_extractor_classes()
Gets the list of prioritised extractor classes
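As a rough illustration of what "prioritised" could mean here, the sketch below orders a map of class names by their priority value. The helper name sortExtractorsByPriority and the example class names are hypothetical, and the ordering direction (higher priority tried first) is an assumption rather than something this page states.

```php
<?php
// Hypothetical helper: orders extractor class names by priority.
// Assumption: a higher priority value means the extractor is tried first.
function sortExtractorsByPriority(array $priorities): array
{
    // arsort() sorts by value in descending order and preserves the keys.
    arsort($priorities);
    return array_keys($priorities);
}

// Hypothetical class names with priorities in the documented 0-100 range.
$sorted = sortExtractorsByPriority([
    'ExampleHtmlExtractor' => 10,
    'ExamplePdfExtractor' => 50,
    'ExampleTikaExtractor' => 90,
]);

print_r($sorted);
```

Caching the sorted list, as the $sorted_extractor_classes property suggests, would avoid repeating the sort on every lookup.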
static protected FileTextExtractor
get_extractor(string $class)
Get the text file extractor for the given class
static FileTextExtractor|null
for_file(File|string $file)
Given a File object, decide which extractor instance to use to handle it
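The selection step might look roughly like the following sketch, which walks an already ordered list and returns the first extractor that is available and claims the file by extension or, failing that, by MIME type. ExampleExtractor, ExampleRuleExtractor and pickExtractor are hypothetical stand-ins so the example runs without the framework; the real method works with File objects and the class's own extractor list.

```php
<?php
// Stand-in for the capability checks documented on this page.
interface ExampleExtractor
{
    public function isAvailable(): bool;
    public function supportsExtension(string $extension): bool;
    public function supportsMime(string $mime): bool;
}

// Hypothetical extractor driven by fixed lists of extensions and MIME types.
final class ExampleRuleExtractor implements ExampleExtractor
{
    private $available;
    private $extensions;
    private $mimes;

    public function __construct(bool $available, array $extensions, array $mimes)
    {
        $this->available = $available;
        $this->extensions = $extensions;
        $this->mimes = $mimes;
    }

    public function isAvailable(): bool
    {
        return $this->available;
    }

    public function supportsExtension(string $extension): bool
    {
        return in_array(strtolower($extension), $this->extensions, true);
    }

    public function supportsMime(string $mime): bool
    {
        return in_array(strtolower($mime), $this->mimes, true);
    }
}

// Returns the first usable extractor, or null when none matches.
// $extractors is assumed to be ordered already (most preferred first).
function pickExtractor(array $extractors, string $extension, string $mime): ?ExampleExtractor
{
    foreach ($extractors as $extractor) {
        if (!$extractor->isAvailable()) {
            continue; // a required binary or library is missing
        }
        if ($extension !== '' && $extractor->supportsExtension($extension)) {
            return $extractor;
        }
        // The MIME check is only reached when the extension check failed.
        if ($mime !== '' && $extractor->supportsMime($mime)) {
            return $extractor;
        }
    }
    return null;
}

$unavailable = new ExampleRuleExtractor(false, ['pdf'], []);
$byMime = new ExampleRuleExtractor(true, [], ['application/pdf']);

$chosen = pickExtractor([$unavailable, $byMime], 'pdf', 'application/pdf');
var_dump($chosen === $byMime);
```

In this sketch the first extractor is skipped because it reports itself as unavailable, so the second one is chosen through its MIME type.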
static protected string
getPathFromFile(File $file)
Some text extractors (like pdftotext) may require a physical file to read from, so write the current file contents to a temp file and return its path
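A minimal sketch of the temp-file approach, using only PHP built-ins, is shown below. The function name writeToTempFile, the 'extract_' prefix and the use of the system temp directory are assumptions for illustration; the real method receives a File and may choose a different location.

```php
<?php
// Hypothetical helper: writes raw contents to a temp file and returns its path.
function writeToTempFile(string $contents, string $prefix = 'extract_'): string
{
    // tempnam() creates an empty, uniquely named file and returns its path.
    $path = tempnam(sys_get_temp_dir(), $prefix);
    if ($path === false) {
        throw new RuntimeException('Could not create a temporary file.');
    }
    if (file_put_contents($path, $contents) === false) {
        throw new RuntimeException('Could not write to ' . $path);
    }
    return $path;
}

$path = writeToTempFile("Hello, extractor.\n");
try {
    // A command-line tool such as pdftotext would be pointed at $path here.
    echo file_get_contents($path);
} finally {
    unlink($path); // in this sketch the caller cleans up the temp file
}
```

Whoever requests the path needs to delete the file afterwards, otherwise temp files accumulate with every extraction.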
abstract bool
isAvailable()
Checks if the extractor is supported on the current environment, for example if the correct binaries or libraries are available.
abstract bool
supportsExtension(string $extension)
Determine if this extractor supports the given extension.
If support is determined by MIME type only, this should return false.
abstract bool
supportsMime(string $mime)
Determine if this extractor supports the given MIME type.
Only called if supportsExtension() returns false.
abstract string
getContent(File|string $file)
Given a File instance, extract the contents as text.
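To show how the four abstract methods fit together, here is a sketch of a concrete extractor for plain-text files. ExampleBaseExtractor is a stand-in that reproduces only the abstract methods listed above so the code runs without the framework, and ExamplePlainTextExtractor is hypothetical. The sketch also assumes $file is a filesystem path, whereas the real method may also receive a File.

```php
<?php
// Stand-in for FileTextExtractor so the sketch runs without the framework.
// Only the four abstract methods documented above are reproduced.
abstract class ExampleBaseExtractor
{
    abstract public function isAvailable();
    abstract public function supportsExtension($extension);
    abstract public function supportsMime($mime);
    abstract public function getContent($file);
}

// Hypothetical extractor for plain-text files; $file is a path in this sketch.
class ExamplePlainTextExtractor extends ExampleBaseExtractor
{
    public function isAvailable()
    {
        return true; // no external binary or library is needed
    }

    public function supportsExtension($extension)
    {
        return strtolower($extension) === 'txt';
    }

    public function supportsMime($mime)
    {
        return strtolower($mime) === 'text/plain';
    }

    public function getContent($file)
    {
        // Plain text needs no conversion, so the file is returned as-is.
        $content = file_get_contents($file);
        return $content === false ? '' : $content;
    }
}

$extractor = new ExamplePlainTextExtractor();
var_dump($extractor->supportsExtension('TXT'));
var_dump($extractor->supportsMime('application/pdf'));
```

An extractor that wraps an external tool would typically do its environment check in isAvailable() instead of returning true unconditionally.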