(:~
 : This <a href="http://docs.basex.org/wiki/Module_Library">XQuery Module</a> extends the <a href="http://www.w3.org/TR/xpath-full-text-10">W3C Full Text Recommendation</a> with some useful functions: The index can be directly accessed, full-text results can be marked with additional elements, or the relevant parts can be extracted. Moreover, the score value, which is generated by the <code>contains text</code> expression, can be explicitly requested from items.
 : 
 : @author BaseX Team
 : @see http://docs.basex.org/wiki/Module_Library
 :)
module namespace ft = "http://basex.org/modules/ft";
declare namespace bxerr = "http://basex.org/errors";

(:~
 : Returns all text nodes from the full-text index of the database <code>$db</code> that contain the specified <code>$terms</code>.
 : The options used for tokenizing the input and building the full-text index will also be applied to the search terms. For example, if the index terms have been stemmed, the search string will be stemmed as well.
 : <p>The <code>$options</code> argument can be used to control full-text processing. Options can be specified either</p>
 : <ul> <li> as children of an <code>&lt;options/&gt;</code> element, e.g.: </li> </ul>
 : <pre class="brush:xml"> &lt;options&gt; &lt;key1 value='value1'/&gt; ... &lt;/options&gt; </pre>
 : <ul> <li> as a map that contains all key/value pairs: </li> </ul>
 : <pre class="brush:xml"> map { "key1": "value1", ... } </pre>
 : <p>The following options are supported (the introduction to <a href="http://docs.basex.org/wiki/Full-Text">Full-Text</a> processing gives equivalent expressions in the XQuery Full-Text notation):</p>
 : <ul>
 : <li> <code>mode</code>: determines how tokens are searched. Allowed values are <code>any</code>, <code>any word</code>, <code>all</code>, <code>all words</code>, and <code>phrase</code>. <code>any</code> is the default search mode. </li>
 : <li> <code>fuzzy</code>: turns fuzzy querying on or off. Allowed values are <code>true</code> and <code>false</code>. By default, fuzzy querying is turned off. </li>
 : <li> <code>wildcards</code>: turns wildcard querying on or off. Allowed values are <code>true</code> and <code>false</code>. By default, wildcard querying is turned off. </li>
 : <li> <code>ordered</code>: requires that all tokens occur in the order in which they are specified. Allowed values are <code>true</code> and <code>false</code>. The default is <code>false</code>. </li>
 : <li> <code>content</code>: specifies that the matched tokens need to occur at the beginning or end of a searched string, or need to cover the entire string. Allowed values are <code>start</code>, <code>end</code>, and <code>entire</code>. By default, the option is turned off. </li>
 : <li> <code>scope</code>: defines the scope in which tokens must be located. The option has the following sub-options:
 : <ul>
 : <li> <code>same</code>: can be set to <code>true</code> or <code>false</code>. It specifies whether tokens need to occur in the same or in different units. </li>
 : <li> <code>unit</code>: can be <code>sentence</code> or <code>paragraph</code>. It specifies the unit for finding tokens. </li>
 : </ul> </li>
 : <li> <code>window</code>: sets up a window in which all tokens must be located. By default, the option is turned off. It has the following sub-options:
 : <ul>
 : <li> <code>size</code>: specifies the size of the window in terms of <i>units</i>. </li>
 : <li> <code>unit</code>: can be <code>words</code>, <code>sentences</code> or <code>paragraphs</code>. The default is <code>words</code>. </li>
 : </ul> </li>
 : <li> <code>distance</code>: specifies the distance in which tokens must occur. By default, the option is turned off. It has the following sub-options:
 : <ul>
 : <li> <code>min</code>: specifies the minimum distance in terms of <i>units</i>. The default is <code>0</code>. </li>
 : <li> <code>max</code>: specifies the maximum distance in terms of <i>units</i>. The default is <code>∞</code>. </li>
 : <li> <code>unit</code>: can be <code>words</code>, <code>sentences</code> or <code>paragraphs</code>. The default is <code>words</code>. </li>
 : </ul> </li>
 : </ul>
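 : <p>A minimal usage sketch, assuming a database named <code>DB</code> (a hypothetical name) exists and was created with a full-text index. It asks for the text nodes that contain the term <code>QUERY</code>:</p>
 : <pre class="brush:xquery"> ft:search("DB", "QUERY") </pre>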
 :
 : @error bxerr:BXDB0002 The addressed database does not exist or could not be opened.
 : @error bxerr:BXDB0004 The full-text index is not available.
 : @error bxerr:BXFT0001 The fuzzy and wildcard options cannot both be specified.
 :)
declare function ft:search($db as xs:string, $terms as item()*) as text()* external;

(:~
 : Returns all text nodes from the full-text index of the database <code>$db</code> that contain the specified <code>$terms</code>.
 : The options used for tokenizing the input and building the full-text index will also be applied to the search terms. For example, if the index terms have been stemmed, the search string will be stemmed as well.
 : <p>The <code>$options</code> argument can be used to control full-text processing. Options can be specified either</p>
 : <ul> <li> as children of an <code>&lt;options/&gt;</code> element, e.g.: </li> </ul>
 : <pre class="brush:xml"> &lt;options&gt; &lt;key1 value='value1'/&gt; ... &lt;/options&gt; </pre>
 : <ul> <li> as a map that contains all key/value pairs: </li> </ul>
 : <pre class="brush:xml"> map { "key1": "value1", ... } </pre>
 : <p>The following options are supported (the introduction to <a href="http://docs.basex.org/wiki/Full-Text">Full-Text</a> processing gives equivalent expressions in the XQuery Full-Text notation):</p>
 : <ul>
 : <li> <code>mode</code>: determines how tokens are searched. Allowed values are <code>any</code>, <code>any word</code>, <code>all</code>, <code>all words</code>, and <code>phrase</code>. <code>any</code> is the default search mode. </li>
 : <li> <code>fuzzy</code>: turns fuzzy querying on or off. Allowed values are <code>true</code> and <code>false</code>. By default, fuzzy querying is turned off. </li>
 : <li> <code>wildcards</code>: turns wildcard querying on or off. Allowed values are <code>true</code> and <code>false</code>. By default, wildcard querying is turned off. </li>
 : <li> <code>ordered</code>: requires that all tokens occur in the order in which they are specified. Allowed values are <code>true</code> and <code>false</code>. The default is <code>false</code>. </li>
 : <li> <code>content</code>: specifies that the matched tokens need to occur at the beginning or end of a searched string, or need to cover the entire string. Allowed values are <code>start</code>, <code>end</code>, and <code>entire</code>. By default, the option is turned off. </li>
 : <li> <code>scope</code>: defines the scope in which tokens must be located. The option has the following sub-options:
 : <ul>
 : <li> <code>same</code>: can be set to <code>true</code> or <code>false</code>. It specifies whether tokens need to occur in the same or in different units. </li>
 : <li> <code>unit</code>: can be <code>sentence</code> or <code>paragraph</code>. It specifies the unit for finding tokens. </li>
 : </ul> </li>
 : <li> <code>window</code>: sets up a window in which all tokens must be located. By default, the option is turned off. It has the following sub-options:
 : <ul>
 : <li> <code>size</code>: specifies the size of the window in terms of <i>units</i>. </li>
 : <li> <code>unit</code>: can be <code>words</code>, <code>sentences</code> or <code>paragraphs</code>. The default is <code>words</code>. </li>
 : </ul> </li>
 : <li> <code>distance</code>: specifies the distance in which tokens must occur. By default, the option is turned off. It has the following sub-options:
 : <ul>
 : <li> <code>min</code>: specifies the minimum distance in terms of <i>units</i>. The default is <code>0</code>. </li>
 : <li> <code>max</code>: specifies the maximum distance in terms of <i>units</i>. The default is <code>∞</code>. </li>
 : <li> <code>unit</code>: can be <code>words</code>, <code>sentences</code> or <code>paragraphs</code>. The default is <code>words</code>. </li>
 : </ul> </li>
 : </ul>
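 : <p>A sketch of how options might be passed as a map, assuming a hypothetical database <code>DB</code> with a full-text index; the search terms are illustrative. The first call requires both terms to occur, the second enables fuzzy matching and steps to the parent elements of the matching text nodes:</p>
 : <pre class="brush:xquery">
 : ft:search("DB", ("2010", "2011"), map { "mode": "all" }),
 : ft:search("DB", "Hello World", map { "fuzzy": true() })/..
 : </pre>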
 :
 : @error bxerr:BXDB0002 The addressed database does not exist or could not be opened.
 : @error bxerr:BXDB0004 The full-text index is not available.
 : @error bxerr:BXFT0001 The fuzzy and wildcard options cannot both be specified.
 :)
declare function ft:search($db as xs:string, $terms as item()*, $options as item()) as text()* external;

(:~
 : Checks if the specified <code>$input</code> items contain the specified <code>$terms</code>.
 : The function does the same as the <a href="http://docs.basex.org/wiki/Full-Text">Full-Text</a> expression <code>contains text</code>, but options can be specified more dynamically. The <code>$options</code> are the same as for <a href="#ft:search">ft:search</a>, and the following ones are supported in addition:
 : <ul>
 : <li> <code>case</code>: determines how character case is processed. Allowed values are <code>insensitive</code>, <code>sensitive</code>, <code>upper</code> and <code>lower</code>. By default, search is case-insensitive. </li>
 : <li> <code>diacritics</code>: determines how diacritical characters are processed. Allowed values are <code>insensitive</code> and <code>sensitive</code>. By default, search is diacritics-insensitive. </li>
 : <li> <code>stemming</code>: determines whether tokens are stemmed. Allowed values are <code>true</code> and <code>false</code>. By default, stemming is turned off. </li>
 : <li> <code>language</code>: determines the language. This option is relevant for stemming tokens. All language codes are supported. The default language is <code>en</code>. </li>
 : </ul>
 :
 : @error bxerr:BXFT0001 The fuzzy and wildcard options cannot both be specified.
 :)
declare function ft:contains($input as item()*, $terms as item()*) as xs:boolean external;

(:~
 : Checks if the specified <code>$input</code> items contain the specified <code>$terms</code>.
 : The function does the same as the <a href="http://docs.basex.org/wiki/Full-Text">Full-Text</a> expression <code>contains text</code>, but options can be specified more dynamically. The <code>$options</code> are the same as for <a href="#ft:search">ft:search</a>, and the following ones are supported in addition:
 : <ul>
 : <li> <code>case</code>: determines how character case is processed. Allowed values are <code>insensitive</code>, <code>sensitive</code>, <code>upper</code> and <code>lower</code>. By default, search is case-insensitive. </li>
 : <li> <code>diacritics</code>: determines how diacritical characters are processed. Allowed values are <code>insensitive</code> and <code>sensitive</code>. By default, search is diacritics-insensitive. </li>
 : <li> <code>stemming</code>: determines whether tokens are stemmed. Allowed values are <code>true</code> and <code>false</code>. By default, stemming is turned off. </li>
 : <li> <code>language</code>: determines the language. This option is relevant for stemming tokens. All language codes are supported. The default language is <code>en</code>. </li>
 : </ul>
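 : <p>A brief sketch with made-up input strings. The first call checks whether either of the two terms occurs in the input; the second compares the strings with diacritics taken into account:</p>
 : <pre class="brush:xquery">
 : ft:contains("John Doe", ("jack", "john"), map { "mode": "any" }),
 : ft:contains("Häuser", "Hauser", map { "diacritics": "sensitive" })
 : </pre>
 : <p>Given the defaults listed above, the first call should yield <code>true</code> because case is ignored, and the second should yield <code>false</code>.</p>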
 :
 : @error bxerr:BXFT0001 The fuzzy and wildcard options cannot both be specified.
 :)
declare function ft:contains($input as item()*, $terms as item()*, $options as item()) as xs:boolean external;

(:~
 : Puts a marker element around the resulting <code>$nodes</code> of a full-text index request.
 : The default name of the marker element is <code>mark</code>. An alternative name can be chosen via the optional <code>$name</code> argument.
 : Please note that:
 : <ul>
 : <li> the full-text expression that computes the token positions must be specified as argument of the <code>ft:mark()</code> function, as all position information is lost in subsequent processing steps. You may need to specify more than one full-text expression if you want to use the function in a FLWOR expression, as shown in Example 2. </li>
 : <li> the XML node to be transformed must be an internal "database" node. The <code>transform</code> expression can be used to apply the method to a main-memory fragment, as shown in Example 3. </li>
 : </ul>
 :)
declare function ft:mark($nodes as node()*) as node()* external;

(:~
 : Puts a marker element around the resulting <code>$nodes</code> of a full-text index request.
 : The default name of the marker element is <code>mark</code>. An alternative name can be chosen via the optional <code>$name</code> argument.
 : Please note that:
 : <ul>
 : <li> the full-text expression that computes the token positions must be specified as argument of the <code>ft:mark()</code> function, as all position information is lost in subsequent processing steps. You may need to specify more than one full-text expression if you want to use the function in a FLWOR expression, as shown in Example 2. </li>
 : <li> the XML node to be transformed must be an internal "database" node. The <code>transform</code> expression can be used to apply the method to a main-memory fragment, as shown in Example 3. </li>
 : </ul>
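 : <p>Two hedged sketches of the notes above. The first assumes that the query is evaluated against an opened database containing <code>p</code> elements; the element name, the search term and the marker name <code>b</code> are illustrative. The second wraps a main-memory fragment in a transform expression with an empty <code>modify</code> clause, so that the function operates on a copied node:</p>
 : <pre class="brush:xquery">
 : ft:mark(//p[text() contains text 'real'], 'b')
 : </pre>
 : <pre class="brush:xquery">
 : copy $p := &lt;xml&gt;hello world&lt;/xml&gt;
 : modify ()
 : return ft:mark($p[text() contains text 'world'], 'b')
 : </pre>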
 :)
declare function ft:mark($nodes as node()*, $name as xs:string) as node()* external;

(:~
 : Extracts and returns relevant parts of full-text results. It puts a marker element around the resulting <code>$nodes</code> of a full-text index request and chops irrelevant sections of the result.
 : The default tag name of the marker element is <code>mark</code>. An alternative tag name can be chosen via the optional <code>$name</code> argument.
 : The default length of the returned text is <code>150</code> characters. An alternative length can be specified via the optional <code>$length</code> argument. Note that the effective text length may differ from the specified length for formatting and readability reasons.
 : For more details on this function, please have a look at <a href="#ft:mark">ft:mark</a>.
 :)
declare function ft:extract($nodes as node()*) as node()* external;

(:~
 : Extracts and returns relevant parts of full-text results. It puts a marker element around the resulting <code>$nodes</code> of a full-text index request and chops irrelevant sections of the result.
 : The default tag name of the marker element is <code>mark</code>. An alternative tag name can be chosen via the optional <code>$name</code> argument.
 : The default length of the returned text is <code>150</code> characters. An alternative length can be specified via the optional <code>$length</code> argument. Note that the effective text length may differ from the specified length for formatting and readability reasons.
 : For more details on this function, please have a look at <a href="#ft:mark">ft:mark</a>.
 :)
declare function ft:extract($nodes as node()*, $name as xs:string) as node()* external;

(:~
 : Extracts and returns relevant parts of full-text results. It puts a marker element around the resulting <code>$nodes</code> of a full-text index request and chops irrelevant sections of the result.
 : The default tag name of the marker element is <code>mark</code>. An alternative tag name can be chosen via the optional <code>$name</code> argument.
 : The default length of the returned text is <code>150</code> characters. An alternative length can be specified via the optional <code>$length</code> argument. Note that the effective text length may differ from the specified length for formatting and readability reasons.
 : For more details on this function, please have a look at <a href="#ft:mark">ft:mark</a>.
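 : <p>A minimal sketch, assuming the query is evaluated against an opened database containing <code>p</code> elements. The search term, the marker name <code>b</code> and the length <code>50</code> are arbitrary illustrative values:</p>
 : <pre class="brush:xquery"> ft:extract(//p[text() contains text 'real'], 'b', 50) </pre>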
 :)
declare function ft:extract($nodes as node()*, $name as xs:string, $length as xs:integer) as node()* external;

(:~
 : Returns the number of occurrences of the search terms specified in a full-text expression.
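 : <p>A minimal sketch, assuming the query is evaluated against an opened database; the term <code>QUERY</code> is illustrative. It counts the occurrences of the term in all elements whose text matches:</p>
 : <pre class="brush:xquery"> ft:count(//*[text() contains text 'QUERY']) </pre>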
 :)
declare function ft:count($nodes as node()*) as xs:integer external;

(:~
 : Returns the score values (0.0 - 1.0) that have been attached to the specified items. <code>0</code> is returned as the value if no score was attached.
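 : <p>A minimal sketch that requests the score attached to the result of a <code>contains text</code> expression; the strings are illustrative:</p>
 : <pre class="brush:xquery"> ft:score('a' contains text 'a') </pre>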
 :)
declare function ft:score($item as item()*) as xs:double* external;

(:~
 : Returns all full-text tokens stored in the index of the database <code>$db</code>, along with their numbers of occurrences.
 : If <code>$prefix</code> is specified, the returned nodes will be restricted to the tokens starting with that prefix. The prefix will be tokenized according to the full-text options used for creating the index.
 :
 : @error bxerr:BXDB0002 The addressed database does not exist or could not be opened.
 : @error bxerr:BXDB0004 The full-text index is not available.
 :)
declare function ft:tokens($db as xs:string) as element(value)* external;

(:~
 : Returns all full-text tokens stored in the index of the database <code>$db</code>, along with their numbers of occurrences.
 : If <code>$prefix</code> is specified, the returned nodes will be restricted to the tokens starting with that prefix. The prefix will be tokenized according to the full-text options used for creating the index.
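 : <p>A minimal sketch, assuming a hypothetical database <code>DB</code> with a full-text index. It lists the indexed tokens that start with the prefix <code>a</code>:</p>
 : <pre class="brush:xquery"> ft:tokens("DB", "a") </pre>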
 :
 : @error bxerr:BXDB0002 The addressed database does not exist or could not be opened.
 : @error bxerr:BXDB0004 The full-text index is not available.
 :)
declare function ft:tokens($db as xs:string, $prefix as xs:string) as element(value)* external;

(:~
 : Tokenizes the given <code>$input</code> string, using the current default full-text options or the <code>$options</code> specified as second argument. The following options are available:
 : <ul>
 : <li> <code>case</code>: determines how character case is processed. Allowed values are <code>insensitive</code>, <code>sensitive</code>, <code>upper</code> and <code>lower</code>. By default, processing is case-insensitive. </li>
 : <li> <code>diacritics</code>: determines how diacritical characters are processed. Allowed values are <code>insensitive</code> and <code>sensitive</code>. By default, processing is diacritics-insensitive. </li>
 : <li> <code>stemming</code>: determines whether tokens are stemmed. Allowed values are <code>true</code> and <code>false</code>. By default, stemming is turned off. </li>
 : <li> <code>language</code>: determines the language. This option is relevant for stemming tokens. All language codes are supported. The default language is <code>en</code>. </li>
 : </ul>
 : <p>The <code>$options</code> argument can be used to control full-text processing. Options can be specified either</p>
 : <ul> <li> as children of an <code>&lt;options/&gt;</code> element, e.g.: </li> </ul>
 : <pre class="brush:xml"> &lt;options&gt; &lt;key1 value='value1'/&gt; ... &lt;/options&gt; </pre>
 : <ul> <li> as a map that contains all key/value pairs: </li> </ul>
 : <pre class="brush:xml"> map { "key1": "value1", ... } </pre>
 :)
declare function ft:tokenize($input as xs:string) as xs:string* external;

(:~
 : Tokenizes the given <code>$input</code> string, using the current default full-text options or the <code>$options</code> specified as second argument. The following options are available:
 : <ul>
 : <li> <code>case</code>: determines how character case is processed. Allowed values are <code>insensitive</code>, <code>sensitive</code>, <code>upper</code> and <code>lower</code>. By default, processing is case-insensitive. </li>
 : <li> <code>diacritics</code>: determines how diacritical characters are processed. Allowed values are <code>insensitive</code> and <code>sensitive</code>. By default, processing is diacritics-insensitive. </li>
 : <li> <code>stemming</code>: determines whether tokens are stemmed. Allowed values are <code>true</code> and <code>false</code>. By default, stemming is turned off. </li>
 : <li> <code>language</code>: determines the language. This option is relevant for stemming tokens. All language codes are supported. The default language is <code>en</code>. </li>
 : </ul>
 : <p>The <code>$options</code> argument can be used to control full-text processing. Options can be specified either</p>
 : <ul> <li> as children of an <code>&lt;options/&gt;</code> element, e.g.: </li> </ul>
 : <pre class="brush:xml"> &lt;options&gt; &lt;key1 value='value1'/&gt; ... &lt;/options&gt; </pre>
 : <ul> <li> as a map that contains all key/value pairs: </li> </ul>
 : <pre class="brush:xml"> map { "key1": "value1", ... } </pre>
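 : <p>A short sketch with illustrative input strings. The first call relies on the default options, the second keeps diacritics:</p>
 : <pre class="brush:xquery">
 : ft:tokenize("No Doubt"),
 : ft:tokenize("École", map { "diacritics": "sensitive" })
 : </pre>
 : <p>With the defaults described above, the first call should yield the two lower-cased tokens <code>no</code> and <code>doubt</code>.</p>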
 :)
declare function ft:tokenize($input as xs:string, $options as item()) as xs:string* external;

(:~
 : Normalizes the given <code>$input</code> string, using the current default full-text options or the <code>$options</code> specified as second argument. The function accepts the same arguments as <a href="#ft:tokenize">ft:tokenize</a>.
 :)
declare function ft:normalize($input as xs:string) as xs:string* external;

(:~
 : Normalizes the given <code>$input</code> string, using the current default full-text options or the <code>$options</code> specified as second argument. The function accepts the same arguments as <a href="#ft:tokenize">ft:tokenize</a>.
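 : <p>A minimal sketch with an illustrative input string. It normalizes the string while preserving character case; diacritics are handled according to the default options:</p>
 : <pre class="brush:xquery"> ft:normalize("Häuser am Meer", map { "case": "sensitive" }) </pre>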
 :)
declare function ft:normalize($input as xs:string, $options as item()) as xs:string* external;



