Legato

GoFiler Legato Script Reference

 

Legato v 1.4j

Application v 5.22b

  

 

Chapter Five: General Functions (continued)

5.23 Encoding Functions

5.23.1 Overview 

This section covers specific data encoding functions within Legato.

5.23.2 Base32 Encoding

Base32 encoding schemes represent binary data in an ASCII string format by translating it into a radix-32 representation. Each scheme uses a set of 32 digits, each of which can be represented by 5 bits. One way to represent Base32 numbers in a human-readable manner is to use a 32-character set consisting of the digits 0-9 and the twenty-two upper-case letters A–V. However, many other variations are used in different contexts. Legato supports the RFC 4648 base alphabet (the letters A–Z and the digits 2-7).

Base32 support is provided via the DecodeString and EncodeString functions for either text or binary data.
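As a rough illustration of the RFC 4648 base alphabet, the sketch below uses Python's standard base64 module rather than Legato's own DecodeString and EncodeString functions, whose parameters are not described in this section. The sample string is arbitrary.

```python
import base64

# Arbitrary sample data; any bytes value works.
data = b"Legato"

# RFC 4648 base alphabet: letters A-Z and digits 2-7, padded with '='.
encoded = base64.b32encode(data)

# Decoding restores the original bytes.
decoded = base64.b32decode(encoded)

print(encoded.decode("ascii"))
print(decoded == data)
```

Every 5 source bytes become 8 encoded characters, so the 6-byte sample is padded out to 16 characters.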

5.23.3 Base64 Encoding

Base64 encoding schemes represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding. Each Base64 digit represents exactly 6 bits of data. Three 8-bit bytes (i.e., a total of 24 bits) can therefore be represented by four 6-bit Base64 digits.
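The grouping of three bytes into four 6-bit digits can be sketched as follows. This is an illustration in Python, not Legato code; the helper name encode_triplet is hypothetical, and the alphabet shown is the standard one from RFC 4648.

```python
import base64

# Standard Base64 alphabet (RFC 4648): index 0-63 maps to one character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_triplet(three_bytes: bytes) -> str:
    """Encode exactly three bytes (24 bits) as four Base64 digits."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    # Join the three bytes into one 24-bit integer.
    value = int.from_bytes(three_bytes, "big")
    # Slice the 24 bits into four 6-bit groups, most significant first.
    return "".join(ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0))

digits = encode_triplet(b"Man")
print(digits)  # TWFu
print(digits == base64.b64encode(b"Man").decode("ascii"))  # True
```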

Encoding data in Base64 allows it to be transported in a text medium. The encoding expands the data at a ratio of approximately 3:4, since every three source bytes become four encoded characters. Because line endings and padding increase the encoded size, the ratio can be slightly larger.
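The size overhead can be checked with a short calculation. The sketch below uses Python's standard base64 module for illustration. The 76-character line wrap is the behavior of Python's encodebytes function; the line length Legato uses is not stated here. The helper name encoded_size is hypothetical.

```python
import base64

def encoded_size(source_bytes: int) -> int:
    """Base64 length with padding and no line endings: 4 characters per 3 bytes, rounded up."""
    return 4 * ((source_bytes + 2) // 3)

payload = bytes(100)  # 100 zero bytes as sample data

plain = base64.b64encode(payload)      # padded, no line endings
wrapped = base64.encodebytes(payload)  # newline after every 76 characters

print(encoded_size(len(payload)))  # 136
print(len(plain))                  # 136
print(len(wrapped))                # 138
```

For 100 source bytes, the padded output is 136 characters (a ratio of 1.36 instead of the ideal 1.33), and the two line endings raise it to 138.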

Data can be encoded and decoded using a series of functions that operate on either strings or files. The file encoding functions DecodeFile and EncodeFile are covered in the file section.

The decode and encode string functions allow for the use of strings or objects. Target data can be binary, but binary data should generally be processed inside a Data Object.

5.23.4 Run Length Encoding

Run Length Encoding (RLE) is a simple form of lossless data compression where repeated runs of data are encoded and stored as a single data value and count. It is extremely effective and fast for specific types of data with large areas of repeated information. For information that is varied in content, other methods of compression should be used.

For Legato, there are two key parameters that the user can adjust: the value size and the escape character. The value size (1, 2, or 4 bytes) determines how repeated data is detected. For example, if the source data consists largely of 32-bit values, then a 4-byte value size should be used. Byte alignment is not critical; what matters is that the framed values repeat. The escape character is determined either by external compatibility requirements or by the source data. Ideally, the escape value should be one that does not appear frequently in the source, since each escape value encountered must itself be encoded, which increases overhead.
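The effect of the value size can be seen by framing the same data at 1, 2, and 4 bytes and measuring the longest run of identical frames. The sketch below is a Python illustration only; the helper name longest_run and the sample value are hypothetical and not part of Legato.

```python
import struct

def longest_run(data: bytes, value_size: int) -> int:
    """Return the longest run of identical consecutive frames of value_size bytes."""
    best = run = 0
    previous = None
    # Walk the data one whole frame at a time; a trailing partial frame is ignored.
    for i in range(0, len(data) - value_size + 1, value_size):
        frame = data[i:i + value_size]
        run = run + 1 if frame == previous else 1
        best = max(best, run)
        previous = frame
    return best

# Eight identical 32-bit values, stored little-endian as 04 03 02 01 repeated.
data = struct.pack("<8I", *([0x01020304] * 8))

runs = {size: longest_run(data, size) for size in (1, 2, 4)}
print(runs)  # {1: 1, 2: 1, 4: 8}
```

With 1-byte or 2-byte framing no frame repeats, so nothing would compress; with 4-byte framing the whole buffer is a single run of eight values.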

An example of encoding a string:

'Line Item................ 123          345'

Would be encoded as:

'Line Item\0xEE\0x10. 123\0xEE\0x0A 345'

The first series of period characters is replaced with the escape, the count, and the character, and the second series, of spaces, is replaced in the same way. In the example, 20 bytes are removed: the 16 periods and the 10 spaces each shrink to 3 bytes.
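The example above can be reproduced with a minimal sketch of escape-based run length encoding for 1-byte values. This is written in Python for illustration and is not Legato's implementation. Several details are assumptions: runs shorter than four bytes are left unencoded, a run is capped at 255 so the count fits in one byte, and a literal escape byte in the source is always written as an escape triple. The function names are hypothetical.

```python
ESCAPE = 0xEE  # escape byte taken from the example above
MIN_RUN = 4    # assumption: a 3-byte triple only pays off for runs of 4 or more

def rle_encode(data: bytes, escape: int = ESCAPE) -> bytes:
    """Replace long runs with the triple: escape, count, value."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Measure the run, capped at 255 so the count fits in one byte.
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        if run >= MIN_RUN or value == escape:
            # A literal escape byte must always be encoded (assumed form).
            out += bytes((escape, run, value))
        else:
            out += data[i:i + run]
        i += run
    return bytes(out)

def rle_decode(data: bytes, escape: int = ESCAPE) -> bytes:
    """Expand each escape, count, value triple back into a run."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == escape:
            count, value = data[i + 1], data[i + 2]
            out += bytes([value]) * count
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

source = b"Line Item" + b"." * 16 + b" 123" + b" " * 10 + b"345"
packed = rle_encode(source)

print(packed)                     # b'Line Item\xee\x10. 123\xee\n 345'
print(len(source) - len(packed))  # 20
print(rle_decode(packed) == source)  # True
```

Note that the count byte 0x0A prints as a newline escape in Python's bytes representation.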

5.23.5 Functions