class HTMLTextExtractor extends FileTextExtractor

Text extractor that uses the PHP function strip_tags() to get just the text. Good enough for indexing, but not ideal for producing readable text.
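As a rough illustration of the strip_tags() approach, here is a minimal standalone sketch. The function name extractTextSketch, the choice of script and style as non-content elements, and the list of block tags are all assumptions made for this example; this is not the module's actual implementation.

```php
<?php
// Hypothetical helper for illustration only; not part of the module.
function extractTextSketch(string $html): string
{
    // Assumption: <script> and <style> elements carry no indexable content.
    // Remove them whole, because strip_tags() alone would keep their inner text.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);

    // Insert a newline after common closing block tags so that adjacent
    // blocks do not run together once the tags are gone.
    $html = preg_replace('#</(p|div|h[1-6]|li|tr)>#i', "$0\n", $html);

    // Strip the remaining tags and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Collapse runs of spaces and tabs, then trim every line.
    $text = preg_replace('/[ \t]+/', ' ', $text);
    $lines = array_map('trim', explode("\n", $text));

    return trim(implode("\n", $lines));
}
```

Output of this kind is adequate for a search index, but list structure, tables and link targets are lost, which is why it reads poorly as display text.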

Properties

public string $class from  SS_Object
protected array $extension_instances from  SS_Object
protected $beforeExtendCallbacks

List of callbacks to call before extend is called on extensions, grouped by method name.

from  SS_Object
protected $afterExtendCallbacks

List of callbacks to call after extend has been called on extensions, grouped by method name.

from  SS_Object
protected static array $sorted_extractor_classes

Cache of extractor class names, sorted by priority

from  FileTextExtractor

Methods

public static 
config()

Get a configuration accessor for this class. Shorthand for Config::inst()->get($this->class, .....).

protected
beforeExtending(string $method, callable $callback)

Allows user code to hook into Object::extend prior to control being delegated to extensions. Each callback will be reset once called.

protected
afterExtending(string $method, callable $callback)

Allows user code to hook into Object::extend after control has been delegated to extensions. Each callback will be reset once called.

public static 
create()

An implementation of the factory method pattern that allows you to create an instance of a class.

public static 
singleton()

Creates a class instance by the "singleton" design pattern.

public static 
create_from_string($classSpec, $firstArg = null)

Create an object from a string representation. It treats it as a PHP constructor without the 'new' keyword. It also manages to construct the object without the use of eval().

public static 
parse_class_spec($classSpec)

Parses a class-spec, such as "Versioned('Stage','Live')", as passed to create_from_string().
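To make the class-spec format concrete, the following is a simplified standalone sketch of splitting such a string into a class name and an argument list. The function name parseClassSpecSketch is hypothetical, and the parsing here (a regular expression plus str_getcsv()) is a stand-in that handles only flat, single-quoted string arguments; it should not be read as how parse_class_spec() itself works.

```php
<?php
// Hypothetical, simplified parser for illustration only.
// Returns a pair: [class name, list of string arguments].
function parseClassSpecSketch(string $classSpec): array
{
    // Match "ClassName" optionally followed by "(...)".
    if (!preg_match('/^\s*(\w+)\s*(?:\((.*)\))?\s*$/s', $classSpec, $m)) {
        throw new InvalidArgumentException("Cannot parse class spec: $classSpec");
    }

    $rawArgs = $m[2] ?? '';
    if (trim($rawArgs) === '') {
        // No parentheses, or empty parentheses: no arguments.
        return [$m[1], []];
    }

    // Treat the argument list as comma-separated values quoted with '.
    $args = str_getcsv($rawArgs, ',', "'", '\\');

    return [$m[1], $args];
}
```

For the spec "Versioned('Stage','Live')" this sketch yields the class name Versioned and the two arguments Stage and Live. Nested arrays, numbers and double-quoted strings are deliberately out of scope.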

public static 
strong_create()

Similar to Object::create(), except that classes are only overloaded if you set the $strong parameter to TRUE when using Object::useCustomClass()

public static 
useCustomClass(string $oldClass, string $newClass, bool $strong = false)

This method allows you to overload classes with other classes when they are constructed using the factory method Object::create()

public static 
string
getCustomClass(string $class)

If a class has been overloaded, get the class name it has been overloaded with - otherwise return the class name
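The relationship between useCustomClass(), getCustomClass() and create() can be pictured with a small standalone registry. ClassRegistrySketch and its internals are assumptions made for this example; the $strong flag and the framework's injector are left out.

```php
<?php
// Hypothetical registry for illustration only; omits the $strong flag.
final class ClassRegistrySketch
{
    /** @var array<string,string> Map of original class name to replacement. */
    private static array $overrides = [];

    // Register $newClass as the replacement for $oldClass.
    public static function useCustomClass(string $oldClass, string $newClass): void
    {
        self::$overrides[$oldClass] = $newClass;
    }

    // Return the replacement if one is registered, else the name itself.
    public static function getCustomClass(string $class): string
    {
        return self::$overrides[$class] ?? $class;
    }

    // Factory: resolve the class name first, then instantiate it.
    public static function create(string $class, ...$args): object
    {
        $actual = self::getCustomClass($class);
        return new $actual(...$args);
    }
}
```

Code that always goes through the factory picks up the replacement class automatically, whereas code that calls new directly bypasses the registry.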

public static 
any
static_lookup($class, $name, null $default = null)

Get the value of a static property of a class, even if that property is declared protected (but not private), without any inheritance, merging or parent lookup if it doesn't exist on the given class.

public static 
get_static($class, $name, $uncached = false) deprecated

No description

public static 
set_static($class, $name, $value) deprecated

No description

public static 
uninherited_static($class, $name, $uncached = false) deprecated

No description

public static 
combined_static($class, $name, $ceiling = false) deprecated

No description

public static 
addStaticVars($class, $properties, $replace = false) deprecated

No description

public static 
add_static_var($class, $name, $value, $replace = false) deprecated

No description

public static 
has_extension(string $classOrExtension, string $requiredExtension = null, bool $strict = false)

Return TRUE if a class has a specified extension.

public static 
add_extension(string $classOrExtension, string $extension = null)

Add an extension to a specific class.

public static 
remove_extension(string $extension)

Remove an extension from a class.

public static 
array
get_extensions(string $class, bool $includeArgumentString = false)

No description

public static 
get_extra_config_sources($class = null)

No description

public
__construct()

No description

public
mixed
__call(string $method, array $arguments)

Attempts to locate and call a method dynamically added to a class at runtime if a default cannot be located

public
bool
hasMethod(string $method)

Return TRUE if a method exists on this object

public
array
allMethodNames(bool $custom = false)

Return the names of all the methods available on this object

protected
defineMethods()

Adds any methods from Extension instances attached to this object.

protected
array
findMethodsFromExtension(object $extension)

No description

protected
addMethodsFrom(string $property, string|int $index = null)

Add all the methods from an object property (which is an Extension) to this object.

protected
removeMethodsFrom(string $property, string|int $index = null)

Remove all the methods from an object property (which is an Extension) from this object.

protected
addWrapperMethod(string $method, string $wrap)

Add a wrapper method - a method which points to another method with a different name. For example, Thumbnail(x) can be wrapped to generateThumbnail(x)
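The wrapper idea can be sketched with PHP's __call() magic method. WrapperSketch is a hypothetical class written for this example; addWrapperMethod() is made public here only so it can be called from outside, and the dispatch shown is an assumption rather than the framework's own code.

```php
<?php
// Hypothetical class for illustration only.
class WrapperSketch
{
    /** @var array<string,string> Lower-cased wrapper name to real method name. */
    private array $wrappers = [];

    // Register $method as an alias that forwards to $wrap.
    public function addWrapperMethod(string $method, string $wrap): void
    {
        $this->wrappers[strtolower($method)] = $wrap;
    }

    // The real method that the wrapper points at.
    public function generateThumbnail(int $size): string
    {
        return "thumbnail:{$size}";
    }

    // Called by PHP when an undefined method is invoked on this object.
    public function __call(string $method, array $arguments)
    {
        $key = strtolower($method);
        if (!isset($this->wrappers[$key])) {
            throw new BadMethodCallException("Method '$method' does not exist");
        }
        $target = $this->wrappers[$key];
        return $this->{$target}(...$arguments);
    }
}
```

After registering Thumbnail as a wrapper for generateThumbnail, a call to Thumbnail(64) is forwarded with the same arguments.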

protected
createMethod(string $method, string $code)

Add an extra method using raw PHP code passed as a string

public
stat($name, $uncached = false)

No description

public
set_stat($name, $value)

No description

public
uninherited($name)

No description

public
bool
exists()

Return true if this object "exists" i.e. has a sensible value

public
string
parentClass()

No description

public
bool
is_a(string $class)

Check if this class is an instance of a specific class, or has that class as one of its parents

public
string
__toString()

No description

public
mixed
invokeWithExtensions(string $method, mixed $argument = null)

Calls a method if available on both this object and all applied Extensions, and then attempts to merge all results into an array

public
array
extend(string $method, mixed $a1 = null, mixed $a2 = null, mixed $a3 = null, mixed $a4 = null, mixed $a5 = null, mixed $a6 = null, mixed $a7 = null)

Run the given function on all of this object's extensions. Note that this method originally returned void, so if you wanted to return results, you're hosed
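A minimal standalone sketch of the extension loop follows. ExtensibleSketch and addExtension() are hypothetical names, and the rule of collecting only non-null return values is an assumption of this example.

```php
<?php
// Hypothetical class for illustration only.
class ExtensibleSketch
{
    /** @var object[] Extension instances attached to this object. */
    private array $extensions = [];

    public function addExtension(object $extension): void
    {
        $this->extensions[] = $extension;
    }

    // Call $method on every extension that defines it and
    // collect the non-null return values in call order.
    public function extend(string $method, ...$arguments): array
    {
        $values = [];
        foreach ($this->extensions as $extension) {
            if (method_exists($extension, $method)) {
                $value = $extension->$method(...$arguments);
                if ($value !== null) {
                    $values[] = $value;
                }
            }
        }
        return $values;
    }
}
```

Extensions that do not define the method are skipped silently, so callers can fire a hook without knowing which extensions are attached.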

public
getExtensionInstance(string $extension)

Get an extension instance attached to this object by name.

public
bool
hasExtension(string $extension)

Returns TRUE if this object instance has a specific extension applied in $extension_instances. Extension instances are initialized at constructor time, meaning if you use add_extension() afterwards, the added extension will just be added to new instances of the extended class. Use the static method has_extension() to check if a class (not an instance) has a specific extension.

public
array
getExtensionInstances()

Get all extension instances for this specific object instance.

public
mixed
cacheToFile(string $method, int $lifetime = 3600, string $ID = false, array $arguments = array())

Cache the results of an instance method in this object to a file, or if it is already cached, return the cached results

public
clearCache($method, $ID = false, $arguments = array())

Clears the cache for the given cacheToFile call

protected
mixed
loadCache(string $cache, int $lifetime = 3600)

Loads a cache from the filesystem if a valid one is present and within the specified lifetime

protected
saveCache(string $cache, mixed $data)

Save a piece of cached data to the file system

protected
string
sanitiseCachename(string $name)

Strip a file name of special characters so it is suitable for use as a cache file name
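The caching helpers above fit together roughly as sketched below. The function names, the explicit $dir parameter, the replacement character and the use of serialize() are all assumptions made for this example and may differ from the framework's behaviour.

```php
<?php
// Hypothetical helpers for illustration only.

// Replace anything that is not a letter, digit, underscore or hyphen.
function sanitiseCachenameSketch(string $name): string
{
    return preg_replace('/[^A-Za-z0-9_\-]+/', '_', $name);
}

// Serialize $data into a file named after the sanitised cache key.
function saveCacheSketch(string $dir, string $cache, $data): void
{
    $path = $dir . '/' . sanitiseCachenameSketch($cache);
    file_put_contents($path, serialize($data));
}

// Return the cached value, or null if it is missing or older than $lifetime seconds.
function loadCacheSketch(string $dir, string $cache, int $lifetime = 3600)
{
    $path = $dir . '/' . sanitiseCachenameSketch($cache);
    clearstatcache(true, $path);
    if (!is_file($path) || filemtime($path) + $lifetime < time()) {
        return null;
    }
    return unserialize(file_get_contents($path));
}
```

A cacheToFile-style wrapper would first try the load step, and only on a miss call the real method and pass its result to the save step.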

protected static 
array
get_extractor_classes()

Gets the list of prioritised extractor classes

protected static 
get_extractor(string $class)

Get the text file extractor for the given class

protected static 
string
get_mime(string $path)

Attempt to detect mime type for given file

public static 
for_file(string $path)

No description

public
bool
isAvailable()

Checks if the extractor is supported on the current environment, for example if the correct binaries or libraries are available.

public
bool
supportsExtension(string $extension)

Determine if this extractor supports the given extension.

public
bool
supportsMime(string $mime)

Determine if this extractor supports the given mime type.
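The three checks above suggest how an extractor might be chosen for a file. The sketch below is standalone: SimpleExtractorSketch, its priority field, the "higher priority first" ordering and pickExtractorSketch() are assumptions for illustration and do not describe how for_file() is implemented.

```php
<?php
// Hypothetical extractor description for illustration only.
final class SimpleExtractorSketch
{
    public string $name;
    public int $priority;
    public bool $available;
    /** @var string[] Lower-case file extensions handled by this extractor. */
    public array $extensions;

    public function __construct(string $name, int $priority, bool $available, array $extensions)
    {
        $this->name = $name;
        $this->priority = $priority;
        $this->available = $available;
        $this->extensions = $extensions;
    }

    public function isAvailable(): bool
    {
        return $this->available;
    }

    public function supportsExtension(string $extension): bool
    {
        return in_array(strtolower($extension), $this->extensions, true);
    }
}

// Return the highest-priority extractor that is available and
// supports the file's extension, or null if none qualifies.
function pickExtractorSketch(array $extractors, string $path): ?SimpleExtractorSketch
{
    $extension = pathinfo($path, PATHINFO_EXTENSION);
    usort($extractors, function ($a, $b) {
        return $b->priority <=> $a->priority;
    });
    foreach ($extractors as $extractor) {
        if ($extractor->isAvailable() && $extractor->supportsExtension($extension)) {
            return $extractor;
        }
    }
    return null;
}
```

Checking isAvailable() before the extension lets a lower-priority extractor take over when a preferred one lacks its binaries or libraries.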

public
string
getContent(string $path)

Extracts content from regex, by using strip_tags() combined with regular expressions to remove non-content tags like