abstract class FileTextExtractor (View source)

A decorator for File or a subclass that provides a method for extracting full-text from the file's external contents.

Traits

Provides extensions to this object to integrate it with standard config API methods.

A class that can be instantiated or replaced via DI

Config options

priority int

Priority of this extractor, as an integer from 0 to 100.

Properties

protected static array $sorted_extractor_classes

Cache of extractor class names, sorted by priority

Methods

public static 
config()

Get a configuration accessor for this class. Shorthand for Config::inst()->get($this->class, ...).

public
mixed
uninherited(string $name)

Gets the uninherited value for the given config option

public static 
create(mixed ...$args)

An implementation of the factory method, allows you to create an instance of a class

public static 
singleton(string $class = null)

Creates a class instance by the "singleton" design pattern.

protected static 
array
get_extractor_classes()

Gets the list of prioritised extractor classes

protected static 
get_extractor(string $class)

Get the text file extractor for the given class

public static 
for_file(File|string $file)

Given a File object, decide which extractor instance to use to handle it

protected static 
string
getPathFromFile(File $file)

Some text extractors (like pdftotext) may require a physical file to read from, so this method writes the current file contents to a temporary file and returns its path.

public
bool
isAvailable()

Checks whether the extractor is supported in the current environment, for example whether the required binaries or libraries are available.

public
bool
supportsExtension(string $extension)

Determine if this extractor supports the given extension.

public
bool
supportsMime(string $mime)

Determine if this extractor supports the given MIME type.

public
string
getContent(File|string $file)

Given a File instance, extract the contents as text.

Details

static Config_ForClass config()

Get a configuration accessor for this class. Shorthand for Config::inst()->get($this->class, ...).

Return Value

Config_ForClass

mixed uninherited(string $name)

Gets the uninherited value for the given config option

Parameters

string $name

Return Value

mixed

static Injectable create(mixed ...$args)

An implementation of the factory method, allows you to create an instance of a class

This method will defer class substitution to the Injector API, which can be customised via the Config API to declare substitution classes.

This can be called in one of two ways - either calling via the class directly, or calling on Object and passing the class name as the first parameter. The following are equivalent: $list = DataList::create(SiteTree::class); $list = SiteTree::get();

Parameters

mixed ...$args

Return Value

Injectable

static Injectable singleton(string $class = null)

Creates a class instance by the "singleton" design pattern.

It will always return the same instance for this class, which can be used for performance reasons and as a simple way to access instance methods which don't rely on instance data (e.g. the custom SilverStripe static handling).

Parameters

string $class

Optional classname to create, if the called class should not be used

Return Value

Injectable

The singleton instance
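
The "same instance every time" behaviour described above can be illustrated with a minimal sketch. This is not the framework's implementation: the trait and class names (`SingletonSketch`, `ExtractorRegistrySketch`) are hypothetical, and the sketch keeps its own per-class cache rather than going through any dependency injection container.

```php
<?php
// Simplified stand-in for singleton(); assumption: a plain static cache is
// enough to show the behaviour. The real method is not modelled here.
trait SingletonSketch
{
    /** @var array<string, object> one cached instance per class name */
    private static $instances = [];

    public static function singleton(?string $class = null)
    {
        // Fall back to the called class when no class name is passed
        $class = $class ?? static::class;
        if (!isset(self::$instances[$class])) {
            self::$instances[$class] = new $class();
        }
        return self::$instances[$class];
    }
}

// Hypothetical class used only for the demonstration.
class ExtractorRegistrySketch
{
    use SingletonSketch;
}

$a = ExtractorRegistrySketch::singleton();
$b = ExtractorRegistrySketch::singleton();
var_dump($a === $b); // bool(true)
```

Because both calls return the same object, the singleton is only suitable for methods that do not depend on per-instance state.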

static protected array get_extractor_classes()

Gets the list of prioritised extractor classes

Return Value

array
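
As a rough illustration of how a prioritised list of class names might be produced from the priority config option, the sketch below sorts a map of class name to priority. The function name `sortByPrioritySketch` and the example class names are hypothetical, and the ordering (highest priority first) is an assumption that this page does not state.

```php
<?php
/**
 * Sort extractor class names by priority.
 *
 * Assumption: a higher number means the extractor is tried earlier.
 *
 * @param array<string, int> $priorities class name => priority (0 to 100)
 * @return string[] class names, highest priority first
 */
function sortByPrioritySketch(array $priorities): array
{
    // arsort() sorts by value in descending order and preserves the keys
    arsort($priorities);
    return array_keys($priorities);
}

// Hypothetical class names and priorities, for demonstration only.
$sorted = sortByPrioritySketch([
    'HtmlExtractorSketch' => 10,
    'PdfExtractorSketch'  => 90,
    'TikaExtractorSketch' => 50,
]);
print_r($sorted);
```

The result of such a sort is what a cache like $sorted_extractor_classes would hold, so the sort only needs to run once per request.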

static protected FileTextExtractor get_extractor(string $class)

Get the text file extractor for the given class

Parameters

string $class

Return Value

FileTextExtractor

static FileTextExtractor|null for_file(File|string $file)

Given a File object, decide which extractor instance to use to handle it

Parameters

File|string $file

Return Value

FileTextExtractor|null
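
The selection described here can be sketched as a loop over extractors that are already sorted by priority. This is an illustrative reading of the method descriptions on this page, not the module's source: `ExtractorSketch`, `FixedExtractorSketch` and `pickExtractorSketch` are hypothetical names, and the sketch works on a file path string only.

```php
<?php
// Stand-in for the subset of extractor methods used during selection.
interface ExtractorSketch
{
    public function isAvailable(): bool;
    public function supportsExtension(string $extension): bool;
    public function supportsMime(string $mime): bool;
}

// Hypothetical extractor configured with fixed lists, for demonstration only.
class FixedExtractorSketch implements ExtractorSketch
{
    private $available;
    private $extensions;
    private $mimes;

    public function __construct(bool $available, array $extensions, array $mimes)
    {
        $this->available = $available;
        $this->extensions = $extensions;
        $this->mimes = $mimes;
    }

    public function isAvailable(): bool
    {
        return $this->available;
    }

    public function supportsExtension(string $extension): bool
    {
        return in_array($extension, $this->extensions, true);
    }

    public function supportsMime(string $mime): bool
    {
        return in_array($mime, $this->mimes, true);
    }
}

/**
 * Return the first usable extractor, or null if none matches.
 *
 * @param ExtractorSketch[] $extractors already sorted by priority
 */
function pickExtractorSketch(array $extractors, string $path, ?string $mime = null): ?ExtractorSketch
{
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    foreach ($extractors as $extractor) {
        if (!$extractor->isAvailable()) {
            continue; // e.g. a required binary or library is missing
        }
        if ($extractor->supportsExtension($extension)) {
            return $extractor;
        }
        // The MIME check is consulted only when the extension check fails
        if ($mime !== null && $extractor->supportsMime($mime)) {
            return $extractor;
        }
    }
    return null;
}

$pdf = new FixedExtractorSketch(true, ['pdf'], ['application/pdf']);
$html = new FixedExtractorSketch(true, ['html', 'htm'], ['text/html']);
var_dump(pickExtractorSketch([$pdf, $html], 'report.PDF') === $pdf); // bool(true)
```

Callers should be prepared for a null result, which matches the FileTextExtractor|null return type listed above.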

static protected string getPathFromFile(File $file)

Some text extractors (like pdftotext) may require a physical file to read from, so this method writes the current file contents to a temporary file and returns its path.

Parameters

File $file

Return Value

string

Exceptions

Exception
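
A minimal sketch of the "write to a temp file and return its path" idea follows. It takes the contents as a plain string rather than a File object, and the function name `writeToTempFileSketch` and the `extract_` prefix are hypothetical; how the real method obtains the contents from a File is not shown on this page.

```php
<?php
/**
 * Write a string to a new temporary file and return the file's path.
 * $contents stands in for whatever the File object would provide.
 *
 * @throws RuntimeException if the file cannot be created or written
 */
function writeToTempFileSketch(string $contents): string
{
    // tempnam() creates an empty, uniquely named file and returns its path
    $path = tempnam(sys_get_temp_dir(), 'extract_');
    if ($path === false) {
        throw new RuntimeException('Could not create a temporary file');
    }
    if (file_put_contents($path, $contents) === false) {
        throw new RuntimeException("Could not write to {$path}");
    }
    return $path;
}

$path = writeToTempFileSketch("Hello, extractor\n");
echo file_get_contents($path);
unlink($path); // the caller is responsible for cleaning up
```

In this sketch the caller owns the temporary file and removes it once the external tool has finished reading it.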

abstract bool isAvailable()

Checks whether the extractor is supported in the current environment, for example whether the required binaries or libraries are available.

Return Value

bool

abstract bool supportsExtension(string $extension)

Determine if this extractor supports the given extension.

If support is determined by MIME type only, then this should return false.

Parameters

string $extension

Return Value

bool

abstract bool supportsMime(string $mime)

Determine if this extractor supports the given MIME type.

Will only be called if supportsExtension returns false.

Parameters

string $mime

Return Value

bool

abstract string getContent(File|string $file)

Given a File instance, extract the contents as text.

Parameters

File|string $file

Either the File instance, or a file path for a file to load

Return Value

string
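
To show how the four abstract methods fit together, here is a hedged sketch of a plain-text extractor. So that it runs on its own, it extends a stand-in abstract class (`TextExtractorSketch`) that mirrors the signatures listed on this page rather than the real FileTextExtractor, and it handles only the file path form of the `$file` argument. All class names in the block are hypothetical.

```php
<?php
// Stand-in base class mirroring the abstract methods documented above.
abstract class TextExtractorSketch
{
    abstract public function isAvailable(): bool;
    abstract public function supportsExtension(string $extension): bool;
    abstract public function supportsMime(string $mime): bool;
    abstract public function getContent($file): string;
}

// Hypothetical extractor for plain-text files.
class PlainTextExtractorSketch extends TextExtractorSketch
{
    public function isAvailable(): bool
    {
        // Needs no external binary or library, so it is always available
        return true;
    }

    public function supportsExtension(string $extension): bool
    {
        return strtolower($extension) === 'txt';
    }

    public function supportsMime(string $mime): bool
    {
        return strtolower($mime) === 'text/plain';
    }

    public function getContent($file): string
    {
        // This sketch accepts a path only; a File object is not modelled
        if (!is_string($file) || !is_readable($file)) {
            throw new InvalidArgumentException('Expected a readable file path');
        }
        $content = file_get_contents($file);
        if ($content === false) {
            throw new RuntimeException("Could not read {$file}");
        }
        return $content;
    }
}

// Usage: write a sample file, then extract its text.
$sample = tempnam(sys_get_temp_dir(), 'txt_');
file_put_contents($sample, 'plain text body');
$extractor = new PlainTextExtractorSketch();
echo $extractor->getContent($sample);
unlink($sample);
```

A real subclass would extend FileTextExtractor itself and would also need to handle a File instance in getContent.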