Legato

GoFiler Legato Script Reference

 

Legato v 1.5e

Application v 5.25b

  

 

Appendix B — XML Data Sheet Format Specification

1.0 Introduction

1.1 General

1.1.1 Scope

This document covers the Data Sheet XML format used by the Legato Data Sheet class and various Data View based windows such as EDGAR XML templates. This specification covers XDS Mark II. Previous versions of XDS are not documented.

The format has been designed to reduce file size and improve speed. Therefore, tags are abbreviated: a cell is represented with the element <c> as opposed to <CELL>, and attributes such as FLAGS have been shortened to f. Further, many attributes can be implied or their values are significantly truncated. For example, a flags attribute might look something like FLAGS="0x0000414" but when stored will be simply f="414". It is important to recognize this, since the value 414 is hexadecimal and can be misinterpreted as decimal.
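The hex/decimal pitfall can be illustrated with a short Python sketch. The function names are hypothetical and are not part of Legato or GoFiler; the only behavior taken from the text is that stored values are hex without a 0x prefix or leading zeros.

```python
def parse_short_hex(value: str) -> int:
    """Interpret a shortened XDS hex attribute such as f="414"."""
    return int(value, 16)


def format_short_hex(value: int) -> str:
    """Write an integer with no 0x prefix and no leading zeros."""
    return format(value, "x")


flags = parse_short_hex("414")
print(flags)                    # decimal equivalent of hex 414
print(format_short_hex(flags))  # back to the stored form
```

Reading the same attribute with a plain decimal conversion would silently produce a different flag set, which is the misinterpretation the paragraph above warns about.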

Data files created by external programs or manually edited can be very sparse. The minimal framework only requires an XML signature, a sheet and rows and cells. The only attribute that is required in the specification is the number of columns (which will default to 10 if omitted). On the other hand, rows and cells can directly address positions in the sheet matrix which allows for wide swathes of the cell matrix to default to a null data condition.

Coding can be error checked by opening the file in Data View (or a Data Sheet script object) and reviewing the resulting log. An example error log from a Data View open:

This can be used to help find coding errors and to improve and debug programs that write XDS data. On error, the XDS loader will continue and attempt to read as much information as possible.

Finally, there presently is no defined schema document. This specification serves as the layout and schema.

 

1.1.2 Conventions

Definitions

The following value types are used with attributes:

x  Hex value. To save space, hex values are shortened with no leading 0x or 0’s. It is important for readers not to confuse hex and decimal values.
n  Decimal value.
s  String data.

Many attributes are not required. If an attribute-value is omitted the default value will apply.

Spaces and Encoding

While the XML header shows UTF-8, the current Data View editor only supports ANSI. Characters above 0x7F will be written by the application in character entity format as character values, for example, dagger &#134;. Values can be read in 8-bit ANSI but generators are cautioned that UTF-8 will be implemented at a future date at which point such coding could become problematic.
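As a rough illustration of the character entity convention, the following Python sketch escapes text so that only 7-bit ASCII remains. It assumes that "ANSI" means the Windows-1252 code page, which is consistent with the dagger example (&#134;) but is not stated in this specification; the function name is hypothetical.

```python
from xml.sax.saxutils import escape


def encode_ansi_text(text: str) -> str:
    """Escape text so that only 7-bit ASCII remains.

    Assumption: "ANSI" means Windows-1252, so each character above 0x7F is
    written as a numeric reference to its 8-bit code page value.
    """
    out = []
    for ch in escape(text):          # handles &, < and > first
        byte = ch.encode("cp1252")   # raises UnicodeEncodeError if unmappable
        if byte[0] > 0x7F:
            out.append("&#%d;" % byte[0])
        else:
            out.append(ch)
    return "".join(out)


print(encode_ansi_text("dagger \u2020"))
```

Note that a generic XML parser would read &#134; as the Unicode code point 134 (a control character), not as a dagger, so values written this way are specific to the application's ANSI interpretation.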

Spaces are all treated as white space except inside data cells and attributes, where space characters and tabs are loaded verbatim. Returns within cell data should be represented using the &#13; character.

Other

Items in the documentation that are marked with a dagger (†) indicate information that is carried in the XDS file format but is not necessarily used within views and is reserved for future use.

 

1.2 Basic Structure

The following is a minimal representation of an XML sheet as wrapped for the XML Data Sheets (XDS) format:

<?xml version="1.0" encoding="UTF-8"?> 
<xds xmlns="http://www.novaworkssoftware.com/schemas/xds">
  <s cols="5">
    <r><c>a1</c><c>b1</c><c>c1</c><c>d1</c><c>e1</c></r>
    <r><c>a2</c><c>b2</c><c>c2</c><c>d2</c><c>e2</c></r>
    <r><c>a3</c><c>b3</c><c>c3</c><c>d3</c><c>e3</c></r>
    <r><c>a4</c><c>b4</c><c>c4</c><c>d4</c><c>e4</c></r>
    <r><c>a5</c><c>b5</c><c>c5</c><c>d5</c><c>e5</c></r>
   </s>
</xds>
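An external program could produce a file of this minimal shape along the following lines. This is a sketch using Python's standard xml.etree.ElementTree module; the function name and the choice to derive cols from the first row are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XDS_NS = "http://www.novaworkssoftware.com/schemas/xds"


def build_minimal_xds(rows):
    """Return XDS text for a list of equal-length rows of strings."""
    cols = len(rows[0]) if rows else 0
    root = ET.Element("xds", {"xmlns": XDS_NS})
    sheet = ET.SubElement(root, "s", {"cols": str(cols)})
    for row in rows:
        r = ET.SubElement(sheet, "r")
        for text in row:
            ET.SubElement(r, "c").text = text   # cells are written in order, no p=
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_minimal_xds([["a1", "b1"], ["a2", "b2"]]))
```

The output is not indented like the listing above; white space between tags is not significant, so the compact form carries the same sheet.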

When being used as a Data Form, a Forms View, or any template driven source employing Data View, the XML must be wrapped with the XML start tag and the root element.

 

2.0 Data View XDS

2.1 XDS Structure

Sheet data can stand on its own as part of an import or export to a Data Sheet Object or as part of an overall workbook. The next section discusses the sheet content. When saved from Data View, the file will contain an XML wrapper and information.

The following is a typical XDS file saved from Data View:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Type: XML Data Sheet Workbook -->
<!-- Generator: Application Name and Version -->
<xds xmlns="http://www.novaworkssoftware.com/schemas/xds">
  <i>
    <e n="_title">My Main Document</e>
    <e n="_subject">Important Document</e>
    <e n="_sheet000">Sheet 1</e>
  </i>
  <s p="0" cols="5" rows="5" size="166" name="Sheet 1">
    <i>
      <e n="_name">Sheet 1</e>
      </i>
    <sd>
      <e p="7" n="_name" s="font-family: Arial"/>
      <e p="8" n="_name" s="font-family: Arial; font-weight: bold"/>
      </sd>
    <l>
      <e f="b0" cw="15"/>
      <e f="30" cw="41"/>
      <e f="30" cw="88"/>
      <e f="30" cw="88"/>
      <e f="30" cw="40"/>
      </l>
    <r p="0" f="8414">
      <c p="0" f="0" sx="8">a</c>
      <c p="1" f="0" sx="7"><a>,,"b1"</a></c>
      <c p="2" f="0" sx="7"><a>,,"c1"</a></c>
      <c p="3" f="0" sx="7"><a>,,"d1"</a></c>
      <c p="4" f="0" sx="7"><a>,,"e1"</a></c></r>
   </s>
</xds>

For detection, the application signature tester expects an XML opening header and the root to be <xds> with the appropriate name space. The XML comments in the header are not required.
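A comparable check can be sketched in Python for programs that want to pre-screen files. This is not the application's signature tester; it only mirrors the two stated conditions (an XML opening header and an <xds> root in the XDS namespace), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

XDS_NS = "http://www.novaworkssoftware.com/schemas/xds"


def looks_like_xds(data: bytes) -> bool:
    """True if data has an XML header and an <xds> root in the XDS namespace."""
    if not data.startswith(b"<?xml"):
        return False
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return False
    return root.tag == "{%s}xds" % XDS_NS
```

This sketch parses the whole document, so it also rejects malformed files; a byte order mark before the header would make it return False.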

 

2.2 Information Table

2.2.1 Information Section

Description

An optional information table can be provided for the entire workbook. The table contains named entries, some of which are reserved names. In addition, when written by the application all sheet names will be listed in the information table.

Note that each sheet may also have an information table containing sheet-specific properties, which are separate and distinct from the workbook table.

The <i> tag starts the information data, which ends with the </i> tag. Nested within are entry data tags <e> for each named entry.

Code Structure

<i>
  <e ... />
  <e ... />
  <e ... />
  </i>

Tag Structure

<i>

Parameters

(None)

 

2.2.2 Information Entry

Description

Defines an information entry in the form of a name and data.

Tag Structure

<i> ... <e n=s> data </e> ... </i>

Attributes

n  Name of information data. Reserved keywords will start with an underbar character.

 

2.2.3 Predefined Named Data

The following names are reserved:

  _title Document title.  
  _subject Document subject.  
  _author Author.  
  _manager Manager.  
  _company Company.  
  _category Category.  
  _keywords Keywords.  
  _comments Comments.  
  _script Specifies a script filename to be loaded by the view as appropriate. When operating as the basic Data View, the application does not load this script.  
  _version View or application version information and control.†  

 

 

Field sizes are restricted to 1,048,575 bytes. The workbook information table supports ad hoc field names.
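Reading the workbook-level table from a parsed file might look like the following Python sketch. The function name is hypothetical; the lookup is limited to the <i> element that is a direct child of <xds>, so sheet-level tables are not mixed in.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.novaworkssoftware.com/schemas/xds"}


def read_workbook_info(root):
    """Collect the <e n="..."> entries of the workbook-level <i> table."""
    info = {}
    table = root.find("x:i", NS)       # direct child of <xds>, not a sheet's <i>
    if table is not None:
        for entry in table.findall("x:e", NS):
            name = entry.get("n")
            if name is not None:
                info[name] = entry.text or ""
    return info
```

Names beginning with an underbar, such as _title, come back alongside any other entries; the caller decides which ones it understands.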

 

3.0 Data Sheet Tags and Attributes

3.1 Sheet Group

Description

Each sheet is defined with a sheet group which in turn can define a sheet or tab. A sheet group can contain sheet information <i>, style data <sd>, column layout <l>, and sheet data in the form of rows <r> and cells <c>. When used within a workbook, the sheets are loaded in the order in which they appear in the stream. The position attribute is ignored.

Various attributes within the sheet grouping tag define properties about the sheet and help preallocate memory for the cell matrix.

Coding Structure

<s>
  <i>  ... </i>     (information group)
  <sd> ... </sd>    (style data group)
  <l>  ... </l>     (layout group)
  <r>  ... </r>     (sheet data, rows, cells)
  </s>

Tag Structure

<s p=n cols=n rows=n size=n name=s version=s f=x>

Attributes

p        Optional sheet position (same as sheet ID).
cols     Optional total columns in the sheet. If not specified or in error, the value is set to 10 columns. The columns specified here or in the layout section specify the width of the sheet. Incoming cells must be within the specified column boundary.
rows     Optional total rows in the sheet. Setting this value will reduce auto reallocation on sheet open. If the data contains fewer rows than requested, the sheet depth will be this value.
size     Optional pool size in bytes (for preallocation).
name     Optional name of sheet.
version  Optional version of sheet data.
f        Optional caller flags in short hexadecimal. These are not used by the underlying datasheet but can be used by parents such as Forms View. The values are defined on an implementation basis.†

†  Note that this field will not be carried through in versions prior to GoFiler 5.21b or Legato 1.4h.
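The defaulting rule for cols can be sketched as follows. This is a hypothetical Python helper, not the loader's actual logic; treating a non-numeric or non-positive value as "in error" is an assumption.

```python
def read_sheet_attributes(attrs):
    """Normalize the attributes of an <s> tag given as a dict of strings."""
    try:
        cols = int(attrs.get("cols", ""))
        if cols < 1:                   # assumption: non-positive counts are errors
            raise ValueError
    except ValueError:
        cols = 10                      # documented fallback when missing or in error
    return {
        "cols": cols,
        "rows": int(attrs["rows"]) if "rows" in attrs else None,
        "name": attrs.get("name"),
        "flags": int(attrs.get("f", "0"), 16),   # caller flags are short hex
    }
```

The dict can come straight from an ElementTree element's attrib mapping.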

3.2 Sheet Information

3.2.1 Information Section

Description

An optional information table can be provided for each sheet. The table contains named entries, some of which are reserved names.

Note that within an XDS the workbook may have an information table containing specific properties. These are separate and distinct from sheet properties.

Finally, there is no requirement that a sheet support an information section. If the receiving object does not support storing the data, the items will be ignored.

Code Structure

<i>
  <e ... />
  <e ... />
  <e ... />
  </i>

Tag Structure

<i>

Parameters

(None)

 

3.2.2 Information Entry

Description

Defines an information entry in the form of a name and data.

Tag Structure

<i> ... <e n=s> data </e> ... </i>

Attributes

n  Name of information data. Reserved keywords will start with an underbar character.

 

3.2.3 Predefined Named Data

The following names are reserved:

  _name Name of the sheet. This is shared with and is the same as the sheet name (as specified within the <s> tag) within a sheet object. This value is always stored regardless of whether the sheet information is stored. The name can be up to 128 characters.  
  _description Description of the sheet. Up to 1024 characters.  
  _comment Comments. Up to 1024 characters.  
  _tip Popup tip language. Up to 1024 characters.†  
  _refid Reference ID. Up to 256 characters.  
  _cdw1 32-bit user data word.  
  _cdw2 32-bit user data word.  
  _cdw3 32-bit user data word.  
  _cdw4 32-bit user data word.  
  _userdata General user data. Up to 1024 characters.  

 

The sheet information does not support ad hoc field names.

 

3.3 Style Data

3.3.1 Style Section

Description

The <sd> tag starts the style data group, which ends with the </sd> tag. The group contains an entry <e> for each style entry. The style entries can be referenced by index for any cell to define its appearance. There must be only a single style data group.

When a sheet is exported by the application, the style usage is reviewed. Styles that are abandoned or deleted are not written. Therefore, it is not uncommon to have missing indices for certain entries.

Coding Structure

<sd>
  <e ... />
  <e ... />
  <e ... />
  </sd>

Tag Structure

<sd>

Parameters

(None)

 

3.3.2 Style Entries

Description

Each style entry is defined by an <e> tag in the form of CSS-like style data. Each entry has a style index position which in turn is referenced by individual cells.

Tag Structure

<sd> ... <e sx=x n=s s=s />

Attributes

sx  Zero-based style position in hex (same as style index). Zero is default style.
n   Optional style name as string. This is presently not used within Data View.
s   CSS-like style information as text. (See next section.)

If a font name is not specified, the default font or theme font will be used by the display renderer.

3.3.3 Style Elements

General

Style data, as an attribute, is in the form of CSS type “parameter: value;” pairs. For example:

<e p="0" s=""/>
<e p="1" s="font-size: 8pt"/>
<e p="2" s="font-size: 10pt"/>
<e p="3" s="font-size: 12pt"/>
<e p="4" s="font-weight: bold"/>
<e p="5" s="font-style: italic"/>
<e p="6" s="text-decoration: underline"/>
<e p="7" s="text-decoration: double-underline"/>
<e p="8" s="text-decoration: underline accounting"/>
<e p="9" s="text-decoration: double-underline accounting"/>
<e p="10" s="font-weight: bold; font-style: italic"/>
<e p="15" s="font-family: Times New Roman"/>
<e p="16" s="font-family: Arial"/>
<e p="17" s="font-family: Courier New"/>

Note that the first element is the default style for the sheet.

Shorthand CSS is not allowed.
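Because each style string is a flat list of "parameter: value" pairs, splitting it is straightforward. The following Python sketch (the function name is hypothetical) turns the s attribute into a dictionary and makes no attempt to validate property names or values.

```python
def parse_style(style: str) -> dict:
    """Split 'name: value; name: value' style text into a dict."""
    props = {}
    for part in style.split(";"):
        if ":" not in part:
            continue                   # skips empty pieces such as s=""
        name, value = part.split(":", 1)
        props[name.strip().lower()] = value.strip()
    return props


print(parse_style("font-weight: bold; font-style: italic"))
```

An empty style string, as used for the default style, simply yields an empty dictionary.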

Properties

 

background-color  Background color. The value will be ‘transparent’ if omitted.
border-top, border-right, border-bottom, border-left  Border specification in the form of color thickness style. The thickness and style follow the CSS border values.
color  Text color. The value will be ‘auto’ if omitted (normally black).
font-family  Specified font family name. The value must be a Windows name and not use CSS font groupings.
font-size  Font size.
font-style  Font style. Valid value is ‘italic’. Other values are not allowed.
font-weight  Font weight. Valid value is ‘bold’. Other values are not allowed.
text-align  Cell alignment, both horizontal and vertical. The value can be a combination of horizontal (‘left’, ‘center’, ‘right’, ‘justify’) and vertical (‘top’, ‘middle’, ‘bottom’) values. The default values, if not specified, are ‘left’ and ‘middle’, respectively. The value ‘wrap’† can be appended to allow the cell to wrap text.
text-decoration  Same as CSS underline with added accounting styles. Values are ‘underline’, ‘double-underline’, ‘underline accounting’ and ‘double-underline accounting’. Only a single value is allowed.
vertical-align  Value as ‘sub’ or ‘super’.†

 

Values

Values can follow the basic CSS types. Font sizes are generally in points (pt). Colors can be web names, rgb style as ‘rgb(r, g, b)’ or hex ‘#rrggbb’.
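The two numeric color forms can be read with a small Python sketch such as the one below. The function name is hypothetical, and web color names are deliberately left out because they would need a lookup table that this specification does not list.

```python
import re


def parse_color(value: str):
    """Return (r, g, b) for '#rrggbb' or 'rgb(r, g, b)', else None."""
    value = value.strip()
    m = re.fullmatch(r"#([0-9a-fA-F]{6})", value)
    if m:
        n = int(m.group(1), 16)
        return (n >> 16, (n >> 8) & 0xFF, n & 0xFF)
    m = re.fullmatch(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)", value)
    if m:
        rgb = tuple(int(g) for g in m.groups())
        return rgb if all(c <= 255 for c in rgb) else None
    return None  # web color names would need a lookup table, not shown here
```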

 

3.4 Column Layout

3.4.1 Overview

Description

Starts a layout group. Contains an entry for each column. For a data sheet or data view, this defines the column array. The layout group will not be present if the Data Sheet does not have a column array. There must be only a single column layout group.

Coding Structure

<l>
  <e ... />
  <e ... />
  <e ... />
  </l>

Tag Structure

<l>

Attributes

(none)

 

3.4.2 Layout Entries

Description

Each entry defines a column for the layout. There is an entry for each column. If the column is not named, the data value will not be present and the entry will be self-closing.

Tag Structure

<l> ... <e p=n n=s f=x sw=x cw=n > data </e>

Attributes

p   Zero-based column position (same as column index). There will be an entry for each column.
n   Name of column. If omitted, the column is lettered when displayed.
f   Column flags.
sw  Specified width as a pvalue. The default value is zero.
cw  Column width in pixels. The default value is zero, or hidden.

 

3.5 Data Matrix

3.5.1 Rows

Description

Defines a row group of cells. Ends with </r> tag. Contains cell data <c>.

Code Structure

<r ...>
  <c ...> cell data </c>
  <c ...> cell data <a> attributes </a></c>
  <c ...> </c>
  </r>

Tag Structure

<r p=n n=s f=x sh=x rh=n c=x sx=n cf=x>

Attributes

p   Zero-based row position.
n   Row name. If omitted the row is numbered. Not presently used.
f   Row flags in hex. Omitted if zero.
sh  Specified height as a pvalue. The default value is zero. Omitted if zero.
rh  Row height in pixels.
c   32-bit caller data in hex (aka, row data or row data word). Omitted if zero.
sx  Default style index. Cells not specifying a style index get this value.
cf  Default cell flags.

3.5.2 Cells

Description

Defines one or more cells. By default, empty or null cells are not exported. When reading, if a cell is not present a map position is not filled (-1) or the cell record is filled with zeros.

Coding Structure

<c ...> cell data </c>
<c ...> cell data <a> attributes </a></c>

Tag Structure

<r> ... <c p=n r=n f=x d=x cs=n rs=n sx=x > data </c>

Attributes

p   Zero-based column position.
r   Repeat value. This is only present if the next cell or cells exactly match the properties of this cell and none of the cells contain display or attribute data. This is different than a null cell.
f   Cell flags and type data in hex. This can default to the default row cell flags as specified by <r cf=x>.
d   Cell extended data in hex. Default is zero.
cs  Column span value. If the column is not spanned, the value is omitted. If spanned, the value is the number of cells spanned in addition to this cell (for example, cs=1 means two cells joined). The following cells will be null cells.
rs  Row span value. If the row is not spanned, the value is omitted. If spanned, the value is the number of rows spanned in addition to this row (for example, rs=1 means two cells vertically joined). The adjacent spanned cells in the following row will be null cells.
sx  Style index in hex. If present, references the style position in the table specified above. If omitted, the default style or 0 is used.
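How explicit positions and row defaults interact can be sketched in Python. Several points here are assumptions rather than documented behavior: a cell without p is taken to follow the previous cell (as in the minimal example, where no cell carries a position), both style indices are read as hex even though the row tag structure lists sx=n, and the repeat (r) and span (cs, rs) attributes are not handled. The function name is hypothetical.

```python
def place_cells(row_attrs, cells):
    """Map column index -> (text, style_index, flags) for one row.

    row_attrs: dict of <r> attributes; cells: list of (attrs, text) pairs.
    """
    row_style = int(row_attrs.get("sx", "0"), 16)   # assumption: hex like cell sx
    row_flags = int(row_attrs.get("cf", "0"), 16)
    placed = {}
    col = 0
    for attrs, text in cells:
        if "p" in attrs:
            col = int(attrs["p"])                   # explicit column position
        style = int(attrs["sx"], 16) if "sx" in attrs else row_style
        flags = int(attrs["f"], 16) if "f" in attrs else row_flags
        placed[col] = (text, style, flags)
        col += 1                                    # assumption: next cell follows
    return placed
```

Columns that never appear in the result correspond to the null cells that the format allows a writer to leave out.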

Data

Cell data is the display portion of the cell’s data. Depending on the cell type, the display data will vary. For text type, the line feed character is used to indicate soft line break positions for word wrapping. As such, the code 0x0A or &#10; should not be used since it will be treated as a word space.

3.5.3 Cell Attributes

Description

If the cell has attributes, they will trail the content data (if present) and be surrounded by the <a> </a> tag group. The content is a CSV list of the attribute fields. Note that ‘cell attributes’ are separate and distinct from XML attributes.

Attributes must follow the cell data, if any, and precede the cell end.

Coding Structure

<a> attributes </a>

Tag Structure

<r> ... <c> ... <a> data </a></c> ...

Attributes

(none)

 

3.5.4 Cell Attribute Data

Fields are referenced in position order in the CSV list. If the field is not present, the string should be empty. Depending on the version of the generating software, the field may or may not be quoted.

How these fields are used depends on the operating environment. 

  Field Type Data   This field is used to support the edit function. For example, it will contain a button’s or check box’s text.
  Edit/Native Data  Original data. Used for numeric data that has been formatted.
  Name              Name of cell.
  Description       A description for the cell.
  Tip               A popup tip. (Not presently operable in Data View.)
  Comment           General comment.
  Review            General review notes.
  Private A         Used by various views or available for user use.
  Private B         Used by various views or available for user use.
  Events            Events for Legato Script.
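Mapping the CSV content of an <a> element onto these fields can be sketched with Python's csv module. The dictionary keys are hypothetical labels derived from the field list, and standard CSV quoting rules are assumed since the specification does not define an escaping convention.

```python
import csv

FIELD_NAMES = [
    "field_type_data", "edit_native_data", "name", "description", "tip",
    "comment", "review", "private_a", "private_b", "events",
]


def parse_cell_attributes(text: str) -> dict:
    """Map the CSV content of an <a> element onto the documented field order."""
    values = next(csv.reader([text]), [])
    values += [""] * (len(FIELD_NAMES) - len(values))   # missing fields are empty
    return dict(zip(FIELD_NAMES, values))


print(parse_cell_attributes(',,"b1"')["name"])
```

Applied to the saved example earlier in this appendix, the content ,,"b1" leaves the first two fields empty and sets the third field, Name, to b1.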

 

The Description field can contain a formatted directive to set placeholder text into the cell if the cell is empty. The cell type must be text. If the string starts with “Placeholder:” the data control will interpret that as display text if the cell is empty. Optionally, a color can be added by specifying the “Color:” property. For example:

Placeholder: Federal Tax ID; Color: Gray

The default color depends on the Data Control implementation but is normally light gray, #C0C0C0.
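A reader of the Description field might separate the directive as in the following Python sketch. The function name is hypothetical, and splitting on semicolons is an assumption based on the single example above, so placeholder text that itself contains a semicolon would not survive.

```python
def parse_placeholder(description: str):
    """Return (placeholder_text, color) or None if no directive is present."""
    if not description.startswith("Placeholder:"):
        return None
    text, color = None, None
    for part in description.split(";"):
        key, _, value = part.partition(":")
        key = key.strip().lower()
        if key == "placeholder":
            text = value.strip()
        elif key == "color":
            color = value.strip()
    return (text, color)   # color None means the control's default applies


print(parse_placeholder("Placeholder: Federal Tax ID; Color: Gray"))
```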