Legato

GoFiler Legato Script Reference

 

Legato v 1.5e

Application v 5.25b

  

 

Chapter Eleven: SGML Functions (continued)

SGMLFindText Function

Overview

The SGMLFindText function scans a file and returns a table of matching positions. The function operates in the same manner as the Find function within the Page View editor with the exception that all matches are returned.

Syntax/Parameters

Syntax

int[][] = SGMLFindText ( handle hObject | string name, string match,
                    [string options], [int s_x, int s_y] );

Parameters

hObject or name

A handle to an SGML, Edit, or Mapped Text Object; or

A string containing a fully qualified filename.

match

A string containing the text to match, up to 1024 characters long. The string must be unencoded; in other words, all special characters should be native rather than written as character entities. The string can also contain special match characters; see Remarks below.

options

An optional string containing search options. See Remarks below. The parameter can be omitted when supplying a start position.

s_x, s_y

Optional int values specifying the position to start the search.
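As a minimal sketch of the start-position form, the call below omits the options string and passes only s_x and s_y, which the syntax above allows. The file path, search text, and coordinates are hypothetical, and AddMessage and ArrayGetAxisDepth are used as they are commonly documented elsewhere in the Legato SDK; treat them as assumptions and check them against your version.

```legato
int main() {
    int pos[][];        // result table, one row per match

    // Options omitted; the search starts at x = 0, y = 120 (hypothetical values).
    // The path assumes C-style escaping of backslashes in string literals.
    pos = SGMLFindText("C:\\Filings\\sample.htm", "net income", 0, 120);

    AddMessage("Matches from position (0, 120): %d", ArrayGetAxisDepth(pos));
    return ERROR_NONE;
}
```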

Return Value

A table of int values containing a row for each matching entry, or an empty array on error. Use the GetLastError function to retrieve a formatted error code. If no match is found, the array will be empty and the last error value will be ERROR_EOD.
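Because an empty array can mean either "no match" or a genuine failure, a caller will usually want to tell the two apart. The sketch below shows one way to do that, assuming the last error can be compared directly with ERROR_EOD as the text above suggests. The file path and search text are hypothetical, and ArrayGetAxisDepth and AddMessage are assumed from the wider Legato SDK.

```legato
int main() {
    int pos[][];        // result table, one row per match
    int rows, rc;

    // Hypothetical file and search text
    pos = SGMLFindText("C:\\Filings\\sample.htm", "net income");
    rows = ArrayGetAxisDepth(pos);

    if (rows == 0) {
        rc = GetLastError();
        if (rc == ERROR_EOD) {
            // End of data reached without a match; not a failure
            AddMessage("No matches found");
            return ERROR_NONE;
        }
        // Any other code is treated as a real error
        AddMessage("Search failed, error %08X", rc);
        return rc;
    }

    AddMessage("Found %d matches", rows);
    return ERROR_NONE;
}
```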

Remarks

The SGMLFindText function performs a text match by examining the content of the marked-up text (or data) between the SGML tags. By default, the function is tuned to basic HTML: it skips heading information and is sensitive to inline and block elements.

Each returned row in the table has five column index entries (with key names):

0 – s_x   — Starting X position, native.

1 – s_y   — Starting Y position.

2 – e_x   — Ending X position, native.

3 – e_y   — Ending Y position.

4 – flags — Crossing flags. If the parser crossed a tag to perform the match, bits are set in this value.
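The column layout above can be read back with a simple loop. This is a sketch only: it uses the numeric column indexes 0 through 4 listed above, and it only tests whether the flags value is non-zero, since the individual bit meanings are not given in this section. The file path and search text are hypothetical; ArrayGetAxisDepth and AddMessage are assumed from the wider Legato SDK.

```legato
int main() {
    int pos[][];        // result table, one row per match
    int rows, ix;

    pos = SGMLFindText("C:\\Filings\\sample.htm", "net income");
    rows = ArrayGetAxisDepth(pos);

    ix = 0;
    while (ix < rows) {
        // Columns 0-3: s_x, s_y, e_x, e_y
        AddMessage("Match %d: (%d, %d) to (%d, %d)", ix + 1,
                   pos[ix][0], pos[ix][1], pos[ix][2], pos[ix][3]);
        // Column 4: non-zero when the match crossed a tag
        if (pos[ix][4] != 0) {
            AddMessage("  crossed one or more tags, flags %08X", pos[ix][4]);
        }
        ix++;
    }
    return ERROR_NONE;
}
```

Since the text above says the columns also carry key names, the values could presumably be read by key (for example, "s_x") instead of by index.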

Match options can be set using the options parameter. The string must be in “parameter: value” format:

“Whole Words Only” — When set to “true” or 1, a complete word must be matched. The default is false.

“Match Case” — When set to “true” or 1, the string must match exactly. The default is false or case-insensitive.

“Equivalent Characters” — When set to “true” or 1, certain characters such as dashes are treated as the same character without regard to the type of character. For example, en dash versus plain dash. The default is false.

“Non Breaking Spaces” — When set to “true” or 1, non-breaking spaces are treated as white space. The default is false.
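As an illustration of the "parameter: value" format, the sketch below turns on a single option. It deliberately sets only one, because this section does not state how several options are separated within the string. The file path and search text are hypothetical, and ArrayGetAxisDepth and AddMessage are assumed from the wider Legato SDK.

```legato
int main() {
    int pos[][];        // result table, one row per match

    // Case-sensitive search: "Net Income" should match, "net income" should not
    pos = SGMLFindText("C:\\Filings\\sample.htm", "Net Income",
                       "Match Case: true");

    AddMessage("Case-sensitive matches: %d", ArrayGetAxisDepth(pos));
    return ERROR_NONE;
}
```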

The match string supports certain escape sequences, each introduced by a leading backslash ( \ ) or caret ( ^ ). These are as follows:

b    — Match any block boundary.

f    — Match a force break tag (<BR>).

r    — Match a force break tag (<BR>) (same as f).

c    — Match a cell boundary (<TD> or <TH>).

p    — Match a paragraph boundary (<P>).

t    — Match a tab character.

nnnn — Match the specified character code in decimal. For example, ^151^.
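The escape sequences above might be used as in the following sketch. It uses the caret form so that the match string does not depend on how backslashes are handled inside a string literal, which is an assumption about the language rather than something stated here. The file path and the searched words are hypothetical; the ^151^ form is copied from the example above.

```legato
int main() {
    int cells[][];      // matches spanning a cell boundary
    int codes[][];      // matches on a character code

    // "Revenue" at the end of one table cell followed by "Expenses" in the next
    cells = SGMLFindText("C:\\Filings\\sample.htm", "Revenue^cExpenses");

    // Character code 151 in decimal, written as in the example above
    codes = SGMLFindText("C:\\Filings\\sample.htm", "^151^");

    AddMessage("Cell-boundary matches: %d", ArrayGetAxisDepth(cells));
    AddMessage("Character 151 matches: %d", ArrayGetAxisDepth(codes));
    return ERROR_NONE;
}
```

Matches that span a boundary like this would be expected to come back with a non-zero flags column, since the parser has to cross a tag to make them.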

The remaining content of the match string is treated as literal, unencoded text. For example, searching for “&amp;” looks for the five characters “&”, “a”, “m”, “p”, and “;” as they would appear in the displayed text, not for the character entity in the HTML code.

Related Functions

Platform Support

Go13, Go16, GoFiler Complete, GoFiler Corporate, GoFiler, GoFiler Lite, GoXBRL

Legato IDE, Legato Basic