Friday, July 27. 2018
LDC #95: Endnote to Footnote Mapper
This week, we’re going to take a closer look at a script that has been in GoFiler for a while as a base script and has been through several revisions. This script, the Endnotes to Footnotes converter, takes the endnotes in an HTML file and moves each one to the page that references it. Endnotes like these are common when a Word file containing footnotes has been converted to HTML. The footnotes often come through as endnotes, and putting them back where they belong is tedious to do by hand.
The script then needs to scan through our document, identify all of the endnotes, all of the references to the endnotes, and then move the endnotes to the correct page. We need to make a few assumptions here about what our endnotes and footnotes will look like, otherwise it’s not going to work very well:
1) We assume that all of the endnotes are on the last page only.
2) We assume that all footnotes are converted to endnotes during the conversion process, and there aren’t any footnotes left in the document.
3) We assume that all endnote and footnote reference numbers are in superscript, either using SUP tags or FONT tags with a superscript style.
4) We assume that footnotes are sequential, and don’t skip numbers. You can’t have footnote 1, then 3, then 2, and you also can’t have footnote 3 without a footnote 2.
These assumptions, in my experience, are pretty safe, but keep in mind that our script assumes these things, so if they aren’t true then it obviously isn’t going to work very well. Let’s take a closer look at some of the global variables we’ll be using, because they are referenced a lot in the other functions:
/****************************************/
int     n_count;                  /* Actual Added Count */
string  n_text[MAX_NOTES];        /* Note Text */
int     n_ox[MAX_NOTES];          /* Outline Object Index */
int     n_px[MAX_NOTES];          /* Page Index to Apply */
int     p_ox[MAX_NOTES];          /* Page Object Index */
int     num_pages;                /* number of pages in document */
The n_count variable is the number of footnotes found so far, and it is incremented as more footnotes are found to move. The next four variables, n_text, n_ox, n_px, and p_ox, are all arrays describing attributes of a footnote or page break. Each one is indexed by the object it describes (so index 0 is footnote 1 or page break 1, depending on the array). The n_text variable is the text of the footnote. The n_ox variable is the footnote’s position in the outline. The n_px array holds the page index where the footnote is referenced. The p_ox array keeps track of the outline position of each page break, which we’ll need when writing out the footnotes at the bottom of each page. Our script has to scan the document, fill all four of those arrays, and then use that information to move the endnotes onto the appropriate pages. Let’s start with our run function, which is the main function for this script.
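To make the parallel-array layout concrete, here is a minimal Python sketch of the same idea. It is not Legato and not part of the script; the record_note helper is a hypothetical name used only for illustration.

```python
MAX_NOTES = 500  # same cap the script defines

# Parallel arrays: slot i describes footnote i + 1 (or page break i + 1 for p_ox).
n_text = [""] * MAX_NOTES   # HTML of each endnote
n_ox = [0] * MAX_NOTES      # outline index of each endnote
n_px = [0] * MAX_NOTES      # page index where the note is referenced
p_ox = []                   # outline index of each page break


def record_note(number, html, outline_index):
    """Store endnote `number` (1-based) in slot number - 1 (hypothetical helper)."""
    slot = number - 1
    n_text[slot] = html
    n_ox[slot] = outline_index
    return slot
```

The point of the layout is that a footnote number alone is enough to look up everything the script knows about that note.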
if (mode != "preprocess") {  /* Filter out all but preprocess */
  return ERROR_NONE;  /* Just leave */
}  /* end not preprocess */
/* * Get Active Window */
if (IsWindowHandleValid(hView)==false){
  hView = GetActiveEditWindow();  /* Get the active window */
  if (hView == NULL_HANDLE) {  /* No handle */
    MessageBox('x', "Active window required.");  /* Display error */
    return ERROR_CANCEL_AUTO;  /* Exit w/error */
  }  /* end error */
  w_type = GetEditWindowType(hView);  /* Get the window type */
  if (IsError()) {  /* Error, window may be a child */
    hView = GetParentWindow(hView);  /* Get the parent */
    w_type = GetEditWindowType(hView);  /* Try this type */
  }  /* end parent window */
  w_type &= EDX_TYPE_ID_MASK;  /* Rip off type */
  if ((w_type != EDX_TYPE_PSG_PAGE_VIEW) &&  /* Must be Page View */
      (w_type != EDX_TYPE_PSG_TEXT_VIEW)) {  /* or Code View */
    MessageBox('x', "Wrong window type. Must be Page View or Code View.");  /* Display error */
    return ERROR_CANCEL_AUTO;  /* Exit w/error */
  }  /* end error */
}
/* o Get the Edit Object */
hEdit = GetEditObject(hView);  /* Get the edit object */
if (hEdit == NULL_HANDLE) {  /* Problem with handle */
  MessageBox('x', "Could not find source window");  /* Display error */
  return ERROR_CANCEL_AUTO;  /* Exit w/error */
}  /* end error */
The first thing our run function does is check the mode. For anything other than preprocess, we can exit right away. Then it checks whether we’ve passed it a valid window handle. If so, we’re going to execute the script on that window. If not, we get the active edit window and determine whether it’s a Page View or Code View window. If it isn’t, we display an appropriate error message and quit. Otherwise, we get the edit object with GetEditObject and test to make sure we actually got the object. If not, we display an error and quit.
/* o Check the File Type */
/* * Perform the Outline */
hOutline = OutlineCreateObject();  /* Create class for outline */
rc = OutlineSetObject(hOutline, hEdit);  /* Outline entire document, high level */
if (IsError(rc)) {  /* Error on map */
  MessageBox('s', "Internal Error %08X mapping outline", rc);  /* Display error */
  return rc;  /* Exit w/error */
}  /* end error test */
/* * Perform the Mapping */
/* o Clear Array (Script Persists) */
num_pages = get_num_pages(hOutline);  /* get the number of pages */
n_count = 0;  /* Actual Added Count */
ArrayClear(n_text);  /* Note Text */
ArrayClear(n_ox);  /* Outline Object Index */
ArrayClear(n_px);  /* Page Index to Apply */
ArrayClear(p_ox);  /* Page Object Index */
/* o Scan Document */
map_end_notes(hOutline);  /* Get the end note positions */
map_note_references(hOutline);  /* Map the references */
Now that we have an edit object to work with, we can create our outline object with OutlineCreateObject, and set its contents with OutlineSetObject. The outline object maps the paragraphs, tables, and divisions in the document, and helps us iterate over them and figure out what they are. For example, it identifies page breaks as page breaks, which is really useful here.
Then we can get the number of pages with our get_num_pages function, which we’ll discuss in a bit. At this point we also want to clear all of our global variables to ensure they’re reset back to default states, in case they’re currently set to something from a previous run. Next, we can run the map_end_notes and map_note_references functions. These functions map the positions of the end notes in the document (which should all be on the last page) and the positions of the references to them throughout the document.
/* * Look for Errors */
/* o Obvious Errors */
/* > Pages */
size = ArrayGetAxisDepth(p_ox);  /* Get the pages */
if (size == 0) {  /* Nothing to do */
  MessageBox('x', "There are no page markers to place footnotes.");  /* Display error */
  return rc;  /* Exit w/error */
}  /* end error test */
/* > No Notes */
size = ArrayGetAxisDepth(n_text);  /* Get the notes */
if (size == 0) {  /* Nothing to do */
  MessageBox('x', "Could not locate any endnotes.");  /* Display error */
  return rc;  /* Exit w/error */
}  /* end error test */
/* o Set Up Log */
s1 = GetEditObjectFilename(hEdit);  /* Get the name of the file */
hLog = LogCreate("Endnote Errors", s1);  /* Create log for errors */
LogSetMessageType(LOG_NONE);  /* Set the message type */
s1 = GetFilename(s1);  /* Get just the filename */
AddMessage(hLog, "Endnotes to Footnotes for: %s", s1);  /* Add Header */
LogIndent(hLog);  /* Indent log */
Now we need to do some basic tests on our document. First, we check whether there are any page break markers. The p_ox array is filled in by the two mapping functions we just called, so if it’s empty, there are no page breaks and we can go no further, so we display an error and return. Then we check whether we actually have endnotes to place by getting the number of notes in n_text. If it’s zero, we display an error and return. Next, we get the filename of our edit window and start a log that the rest of the run can report into.
/* o Scan for Gaps in Endnotes */
ix = 0;  /* Starting position */
size = ArrayGetAxisDepth(n_text) - 1;  /* Get the number of notes */
LogSetMessageType(LOG_ERROR);  /* Set the message type */
/* > Look for Missing Items */
while (ix < size) {  /* Loop to the end */
  /* - No Data */
  if (n_text[ix] == "") {  /* No data */
    s_x = -1; s_y = -1;  /* Reset position */
    LogSetPosition(s_x, s_y);  /* Set position (default) */
    LogSetMessageType(LOG_ERROR);  /* Set the message type */
    AddMessage(hLog, "Note: %d -- No text found for footnote",ix+1);  /* Add in error */
  }  /* end no data */
  else {  /* Has data */
    /* - Ordering of Numbers */
    s1 = OutlineGetItemText(hOutline, n_ox[ix]);  /* Get the text */
    s1 = TrailStringAfter(s1, 100);  /* Trail off */
    s_x = OutlineGetItemPosSX(hOutline, n_ox[ix]);  /* Get Start X */
    s_y = OutlineGetItemPosSY(hOutline, n_ox[ix]);  /* Get Start Y */
    LogSetPosition(s_x, s_y);  /* Set position (default) */
With our log set up and our basic validation done, we can start looping through the footnotes, doing yet more validation to ensure our footnote mapper can work correctly. For each footnote, we check whether the text is blank. If so, we add an error to the log saying no text was found for this footnote. If it has text, we get its position and its first 100 characters, so the log entry can show the note and point to it in the document.
    if (ix > 0) {  /* Check the order */
      if (n_ox[ix] <= n_ox[ix-1]) {  /* Ordering issue */
        LogSetMessageType(LOG_FATAL);  /* Set the message type */
        AddMessage(hLog, "Note: %d -- Out of order: %s" , ix+1, s1);  /* Add in error */
      }  /* end order of data */
      else {  /* Not zero */
        if (n_ox[ix] != n_ox[ix-1] + 1) {  /* - Serial */
          s_x = OutlineGetItemPosSX(hOutline, n_ox[ix-1]);  /* Get Start X */
          s_y = OutlineGetItemPosSY(hOutline, n_ox[ix-1]);  /* Get Start Y */
          LogSetPosition(s_x, s_y);  /* Set position (default) */
          LogSetMessageType(LOG_FATAL);  /* Set the message type */
          AddMessage(hLog, "Note: %d -- Gap in text: %s" , ix+1,s1);  /* Add in error */
        }  /* end error */
        else {  /* - OK */
          LogSetMessageType(LOG_INFO);  /* Set the message type */
          AddMessage(hLog, "Note: %d -- %s" , ix+1, s1);  /* Add in message */
        }  /* end ok */
      }  /* end field zero */
    }  /* end not first */
    else {  /* - OK (first, zero) */
      LogSetMessageType(LOG_INFO);  /* Set the message type */
      AddMessage(hLog, "Note: %d -- %s" , ix+1, s1);  /* Add in message */
    }  /* end field zero */
  }  /* end has data */
  ix++;  /* Next Entry */
}  /* end missing loop */
First, we need to check the order of the footnotes to ensure they’re not out of order, like if we have footnotes 1, then 3, then 2. That would be a fatal error. So if we’re not on the first footnote (we’re going to assume the first footnote is fine for ordering), we check whether the outline index of the current footnote is less than or equal to the outline index of the one before it, and if so we log a fatal error. If that’s not the case, we check that there is no gap between the two notes. They need to sit in consecutive outline positions. We check this by comparing the outline index of the current footnote with the outline index of the previous footnote + 1. They should be equal, and if not, we know we have a gap. If there’s no gap, we can assume the numbering is OK, and just log the note as an informational message.
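The two checks are easy to mix up, so here is a small Python sketch of the same comparison logic, separate from the Legato script. The function name check_note_order is hypothetical, and the input is simply a list of outline indexes, one per footnote, in footnote order.

```python
def check_note_order(note_outline_indexes):
    """Return (note_number, problem) pairs for notes that break the rules.

    note_outline_indexes[i] is the outline index of footnote i + 1.
    """
    problems = []
    for ix in range(1, len(note_outline_indexes)):
        current = note_outline_indexes[ix]
        previous = note_outline_indexes[ix - 1]
        if current <= previous:
            # The note appears at or before the previous note in the outline.
            problems.append((ix + 1, "out of order"))
        elif current != previous + 1:
            # Something sits between this note and the previous one.
            problems.append((ix + 1, "gap in text"))
    return problems
```

For example, notes at outline positions 10, 12, and 11 would report a gap at note 2 and an ordering problem at note 3, while 10, 11, 12 would report nothing.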
/* * Display Errors */
rc = LogGetMessageCount(hLog);  /* Get messages */
if (rc != 0) {  /* Had errors */
  LogOutdent(hLog);  /* Outdent log */
  LogAddErrorSummary(hLog);  /* Add Summary Data */
  LogDisplay(hLog);  /* Display the log */
  MessageBox('x', "Unable to map notes, see log for errors.");  /* Display message */
  return ERROR_SYNTAX;  /* Exit w/error */
}  /* end messages */
/* * Perform Changes */
/* o Delete End Notes */
ix = ArrayGetAxisDepth(n_text) - 1;  /* Walk Up File (keep valid positions) */
/* > Loop Backward Through */
while (ix >= 0) {  /* Loop and delete */
  /* - Get Position */
  if (n_px[ix] == num_pages){  /* if this note is already on last page */
    ix--;  /* decrement counter */
    continue;  /* go back for next */
  }
  s_x = OutlineGetItemPosSX(hOutline, n_ox[ix]);  /* Get Start X */
  s_y = OutlineGetItemPosSY(hOutline, n_ox[ix]);  /* Get Start Y */
  e_x = OutlineGetItemPosEX(hOutline, n_ox[ix]);  /* Get End X */
  e_y = OutlineGetItemPosEY(hOutline, n_ox[ix]);  /* Get End Y */
  /* - Delete Main */
  WriteSegment(hEdit, "", s_x, s_y, e_x, e_y);  /* Delete the area */
  /* - Delete Blanks */
  if (IsBlankLine(hEdit, s_y)) {  /* Resulted in blank line */
    WriteSegment(hEdit, "", 0, s_y, 0, s_y + 1);  /* Delete the line */
  }  /* end blank line */
  if (IsBlankLine(hEdit, s_y)) {  /* Resulted in blank line */
    WriteSegment(hEdit, "", 0, s_y, 0, s_y + 1);  /* Delete the line */
  }  /* end blank line */
  ix--;  /* Back up */
}  /* end delete loop */
After we’ve checked the ordering of our footnotes, we need to see if we have any errors in the log. If so, we display the log and return an error, because we cannot go further with errors. Otherwise, we need to delete the endnotes themselves from the last page. We set our counter to the last footnote and work our way backwards through the document, so that deleting one note never shifts the positions of the notes we haven’t handled yet. For each footnote, we first check whether it belongs on the last page. If so, we decrement the index and continue to the next footnote, because we can leave it where it is, at the end of the last page. If not, we get the start and end position of the endnote and overwrite that range with an empty string to delete it. After we’ve erased a footnote, we check (twice) whether that left a blank line behind, and if so we delete the blank line as well.
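Why walk backwards? A short Python sketch, independent of the Legato code, shows the general idea with a plain list of lines. The delete_spans name and the (start, end) span format are assumptions made for this illustration only.

```python
def delete_spans(lines, spans):
    """Delete (start, end) line ranges, last span first.

    Working from the bottom up means removing one span never changes
    the line numbers of the spans that are still waiting to be removed.
    """
    result = list(lines)
    for start, end in sorted(spans, reverse=True):
        del result[start:end]
    return result
```

If the spans were removed top-down instead, every deletion would shift the remaining spans, and their recorded positions would have to be recalculated after each step.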
/* o Spool Notes to Pages */
ix = ArrayGetAxisDepth(n_text) - 1;  /* Walk Up File (keep valid positions) */
lp_x = -1;  /* Set the last page */
/* > Loop Backward Through */
while (ix >= 0) {  /* Loop and insert */
  /* - Get Position */
  if (n_px[ix] == num_pages){  /* if this note is already on last page */
    ix--;  /* decrement counter */
    continue;  /* go back for next */
  }
  /* - Page Change */
  if (n_px[ix] != lp_x) {  /* Page Changed */
    ox = n_px[ix];  /* Page Index */
    s_x = OutlineGetItemPosSX(hOutline, p_ox[ox]);  /* Get Start X for page */
    s_y = OutlineGetItemPosSY(hOutline, p_ox[ox]);  /* Get Start Y */
    s1 = "<HR ALIGN=\"LEFT\" SIZE=\"1\" STYLE=\"width: 5pc; margin-top: 12pt\">";  /* Rule */
    s1 += "\r\r";  /* Spacer */
    WriteSegment(hEdit, s1, s_x, s_y, s_x, s_y);  /* Add in divider */
    s_y += 2;  /* Move to position for notes */
    lp_x = n_px[ix];  /* Set the new page */
  }  /* end page change */
  /* - Insert Footnote */
  WriteSegment(hEdit, n_text[ix], 0, s_y, 0, s_y);  /* Add in note */
  ix--;  /* Back up */
}  /* end insert loop */
/* ** Result */
/* * Result */
size = ArrayGetAxisDepth(n_text);  /* Get the number of notes */
MessageBox('i', "Relocated %d endnotes to footnotes.", size);  /* Display result */
return ERROR_NONE;  /* Exit, OK */
Finally, we’re ready to write our footnotes out on the pages. We reset our iterator back to the last footnote, and again test whether the note belongs on the last page, because if so we skip to the next footnote as we’re ignoring ones on the last page. If the footnote we’re working on isn’t on the same page as the previously written footnote, it means we’ve changed pages and are putting this footnote on a new page, so we need to write out an HR tag to act as our divider. After we have added our divider to the page, we write the footnote text to the document just below the divider, right above the page break code. Because we’re walking backwards and always inserting at the same spot, the notes on each page end up in ascending order.
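The backward insertion can be modeled in a few lines of Python. This is only a sketch of the idea using a list of lines, not the Legato implementation: spool_notes, the RULE placeholder, and the break_line_by_page mapping are all hypothetical names, and the real script works with x/y positions from the outline rather than list indexes.

```python
RULE = "<HR>"  # stand-in for the divider the script writes


def spool_notes(lines, break_line_by_page, notes):
    """Insert notes above their page break, walking the notes backwards.

    notes: list of (page_index, text) in footnote order.
    break_line_by_page: page_index -> line index of that page's break.
    """
    result = list(lines)
    last_page = None
    insert_at = None
    for page, text in reversed(notes):
        if page != last_page:
            # First note seen for this page: write the divider once.
            insert_at = break_line_by_page[page]
            result.insert(insert_at, RULE)
            insert_at += 1
            last_page = page
        # Always insert right below the divider; since we walk backwards,
        # earlier notes land above later ones.
        result.insert(insert_at, text)
    return result
```

Processing the last pages first also means the recorded positions of earlier page breaks are still correct when we get to them.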
That takes us through our run function, but we’ve got a few important subroutines in there that need to be discussed, especially the ones that map the document for us. Let’s look at a simple one first, get_num_pages, which does exactly what it sounds like, gets the number of pages in our document.
/****************************************/
int get_num_pages(handle hOutline){  /* get the number of pages in the doc */
/****************************************/
  int size;  /* the number of outline items */
  int ox;  /* the current item num */
  int px;  /* the current page number */
  dword type;  /* the type of outline item */

  size = OutlineGetItemCount(hOutline);  /* Get the overall items */
  ox = 0;  /* At beginning */
  /* * Scan for First Item */
  while (ox < size) {  /* Loop until end of list */
    /* o Get Text and Test */
    type = OutlineGetItemType(hOutline, ox);  /* Get the type */
    type &= HO_TYPE_MASK;  /* Remove other bits */
    /* o Page Break */
    if (type == HO_TYPE_PAGE_BREAK) {  /* On a page break */
      px++;  /* Add Page Index */
    }  /* end is a page break */
    ox++;  /* increment item number */
  }
  return px;  /* return the page number */
}
This function uses our Outline object to count how many page breaks we have in the document. It starts by getting the number of items in the outline, then iterates over each one. If the type of that outline item is a page break, it increments the page break counter. Either way, it then increments the outline item counter to look at the next one. Once it’s finished iterating over the outline, it returns the number of page breaks it encountered while looping.
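The masking step is worth a quick illustration. In this Python sketch the constant values are invented for the example and are not Legato’s real HO_TYPE_MASK or HO_TYPE_PAGE_BREAK values. The idea is only that the type code carries extra flag bits that have to be masked off before comparing.

```python
# Hypothetical values, chosen only for this example.
TYPE_MASK = 0x00FF
TYPE_PAGE_BREAK = 0x0004


def count_page_breaks(item_types):
    """Count items whose masked type code is a page break."""
    count = 0
    for raw_type in item_types:
        if (raw_type & TYPE_MASK) == TYPE_PAGE_BREAK:
            count += 1
    return count
```

Without the mask, an item typed 0x0104 (a page break with an extra flag bit set) would fail a direct equality test against 0x0004.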
The next two functions we’re going to look at are pretty similar, in that they’re both looking for similar-looking HTML code, but one is looking for the endnotes on the final page, and one is looking for references to those endnotes in the body of the document. Let’s look at map_end_notes first.
/****************************************/
int map_end_notes(handle hOutline) {  /* Map the End Notes */
/****************************************/
  ... omitted variable declarations ...
  /* ** Find Possible End Notes */
  /* * Initialize */
  hWP = WordParseCreate(WP_SGML_TAG);  /* Make a word parser */
  size = OutlineGetItemCount(hOutline);  /* Get the overall items */
  ox = 0;  /* At beginning */
  /* * Scan for First Item */
  while (ox < size) {  /* Loop until end of list */
    /* o Get Text and Test */
    s2 = OutlineGetItemHTML(hOutline, ox);  /* Get the text */
    type = OutlineGetItemType(hOutline, ox);  /* Get the type */
    type &= HO_TYPE_MASK;  /* Remove other bits */
    /* o Page Break */
    if (type == HO_TYPE_PAGE_BREAK) {  /* On a page break */
      p_ox[px] = ox;  /* Page Index */
      px++;  /* Add Page Index */
      ox++;  /* Next Outline Item */
      continue;  /* Go back for more */
    }  /* end is a page break */
We start out by creating our Word Parse object, which we’ll use to analyze outline items while looking for endnotes. We then need to get the number of items in the outline, and iterate over each one. For each outline item, we can get the text and type, and check to see if it’s a page break. If it’s a page break, we need to set the outline index of that break into our global p_ox for the run function to use later, then just increment our local page and outline counters, then go back for the next item in the outline.
    if (px != num_pages){  /* if we're not on the last page */
      ox++;  /* increment item count */
      continue;
    }
    WordParseSetData(hWP, s2);  /* Parse it */
    s1 = "init";  /* init loop */
    first_word = false;  /* reset word parse first word flag */
    while (s1 != ""){  /* while word parse has words */
      nn = 0;  /* Footnote number */
      s1 = WordParseGetWord(hWP);  /* Get first word */
      parse_pos = WordParseGetPosition(hWP);  /* get position of word parser */
      if (FindInString(s1,"<")<0){  /* if not an SGML tag */
        if (first_word == false){  /* if first word isn't set */
          first_word = true;  /* set first_word flag */
        }
        else{  /* if this is not the first word */
          break;  /* break the loop */
        }
      }
We next need to check whether we’re on the last page. If not, we can just ignore this outline item, since we’re looking for endnotes, and they should only be on the final page. So we just increment the outline index and continue on. If we’re on the last page, we set the outline item’s HTML into our Word Parse object, so we can take a closer look at it to decide if it’s an endnote or not. For each word in the Word Parser, we get the word and check whether it’s an SGML tag (an SGML tag contains a '<' character). If it’s not an SGML tag, then it’s a word of some sort, so we check whether it’s the first word. If so, we set a flag to mark that we’ve seen the first word, and keep processing it. If it’s not the first word, we break, because we’re really only interested in the first word with this loop.
      if (FindInString(s1,"</")>(-1)){  /* if word is a close tag */
        continue;  /* keep going */
      }
      if (FindInString(s1,"&")>(-1)){  /* if word is a char entity */
        continue;  /* keep going */
      }
      if (in_tag == true){  /* if we're in tag */
        s2 = WordParseGetWord(hWP);  /* get the next word */
        WordParseSetPosition(hWP,parse_pos);  /* reset parser position */
        if(FindInString(MakeLowerCase(s2),"</sup") < 0 &&  /* if next word is not a close sup tag */
           FindInString(MakeLowerCase(s2),"</font") < 0 ){  /* and not a close font tag */
          continue;  /* keep going */
        }
      }
If our current word is a close tag, we can keep going, because we don’t particularly care about close tags. Same for character entities, we’re not looking at them, so we can just continue if we see one. If we’re still processing, and we’re inside a valid HTML tag (sup or font) then we want to take a closer look at the next word. If the next word is anything but a close sup or close font, then it means there’s more than just a single word inside that tag, so it’s not a valid endnote label as far as we’re concerned, because a valid endnote label is going to be a single number, not a string of words, so we can just continue onto the next word.
      if(FindInString(MakeLowerCase(s1),"<sup") == 0 ||  /* if we're in a font or sup tag */
         FindInString(MakeLowerCase(s1),"<font") == 0 ){
        in_tag = true;  /* we're in a tag */
        continue;  /* go back for next word */
      }
      if(FindInString(MakeLowerCase(s1),"</sup") == 0 ||  /* if we're in a close font or sup tag */
         FindInString(MakeLowerCase(s1),"</font") == 0 ){
        in_tag = false;  /* not in tag anymore */
        continue;  /* go back for next word */
      }
      if (s1[0] == '[') {  /* Footnote as [1] */
        s1 = GetStringSegment(s1, 1);  /* Skip the bracket */
        s1 = ReplaceInString(s1, "]", "");  /* Remove closing */
        nn = DecimalToInteger(s1);  /* Perform conversion */
      }  /* end footnote number as [] */
      else {  /* As a plain number */
        nn = DecimalToInteger(s1);  /* Perform conversion */
      }  /* end footnote as number */
      if ((nn < 1) || (nn >= MAX_NOTES) || in_tag == false){  /* Not a valid number */
        continue;  /* Go back for more */
      }  /* end not valid */
      /* o Add Item */
      dx = nn - 1;  /* Zero Offset Adjust */
      n_text[dx] = OutlineGetItemHTML(hOutline, ox) + "\r\r";  /* Retrieve the HTML */
      n_ox[dx] = ox;  /* Index for Reference */
      n_count++;  /* Actual Added Count */
    }
    ox++;  /* Next Outline Item */
  }  /* end scan loop */
  return ERROR_NONE;  /* Exit, no error */
}  /* end function */
If we haven’t continued onto the next word yet, we check whether it’s an open sup or open font tag. If so, we set our in_tag flag to true, and continue. Similarly, if it’s a close sup or close font, we set our in_tag flag to false, and continue. If we’re still going at this point, we test whether we have a number wrapped in '[' and ']' brackets. If so, we remove the brackets and use DecimalToInteger to try to convert it into a number. If it’s not wrapped in brackets, we just use DecimalToInteger to try to convert our word to an integer. Then, finally, we check whether it’s an invalid footnote number by testing if it’s less than 1 (anything less than one here indicates DecimalToInteger couldn’t recognize the text as a number), if it’s at or above the maximum allowed footnote number, or if it’s not inside a sup or font tag. If we haven’t been kicked out of our processing loop yet, we can assume this is an endnote, so we store the text and outline index position in our global arrays, and then go back for the next outline object to keep the processing going.
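The label test at the end of that loop can be sketched in Python as follows. This is an approximation, not the script’s code: parse_note_number is a hypothetical name, and Python’s int() is stricter than Legato’s DecimalToInteger may be (for instance, int() rejects trailing letters outright), so treat the conversion step as an assumption.

```python
MAX_NOTES = 500  # same cap the script defines


def parse_note_number(word, in_tag):
    """Return the footnote number for a label like "3" or "[3]", else None."""
    if word.startswith("["):
        # Footnote written as [1]: drop the opening and closing brackets.
        word = word[1:].replace("]", "")
    try:
        number = int(word)
    except ValueError:
        number = 0  # not a number at all
    if number < 1 or number >= MAX_NOTES or not in_tag:
        return None
    return number
```

Requiring in_tag is what keeps ordinary numbers in the body text, such as years or dollar amounts, from being mistaken for footnote labels.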
This isn’t a 100% foolproof method of detecting all endnotes, but it’s fast and the results are pretty good from the documents I have tested it on so far. As new types of endnotes are seen, this function will need to be modified to detect their positions correctly. The next function, map_note_references, uses almost exactly the same logic as the function above, but it’s looking for references to the endnotes instead of endnotes themselves.
/****************************************/
int map_note_references(handle hOutline) {  /* Map the Note References */
/****************************************/
  ... omitted declarations ...
  /* ** Find Possible End Notes */
  /* * Initialize */
  hWP = WordParseCreate(WP_SGML_TAG);  /* Make a word parser */
  size = OutlineGetItemCount(hOutline);  /* Get the overall items */
  ox = 0;  /* At beginning */
  /* * Scan for First Item */
  while (ox < size) {  /* Loop until end of list */
    /* o Get Text and Test */
    s2 = OutlineGetItemHTML(hOutline, ox);  /* Get the text */
    type = OutlineGetItemType(hOutline, ox);  /* Get the type */
    type &= HO_TYPE_MASK;  /* Remove other bits */
    /* o Page Break */
    if (type == HO_TYPE_PAGE_BREAK) {  /* On a page break */
      p_ox[px] = ox;  /* Page Index */
      px++;  /* Add Page Index */
      ox++;  /* Next Outline Item */
      continue;  /* Go back for more */
    }  /* end is a page break */
    s2 = OutlineGetItemHTML(hOutline, ox);  /* Get the text */
    WordParseSetData(hWP, s2);  /* Parse it */
    s1 = "init";  /* init loop */
Like our other function, we’re going to iterate over our outline items. If it’s a page break, we save it into our page break array, increment our index values, and then go back for another item. If it’s not a page break, we can get the HTML of our outline item and set it into our Word Parser.
    while (s1 != ""){  /* while word parse has words */
      nn = 0;  /* Footnote number */
      s1 = WordParseGetWord(hWP);  /* Get first word */
      parse_pos = WordParseGetPosition(hWP);  /* get position of word parser */
      if (FindInString(s1,"&")>(-1)){  /* if word is a char entity */
        continue;  /* keep going */
      }
      if (in_tag == true){  /* if we're in tag */
        s2 = WordParseGetWord(hWP);  /* get the next word */
        WordParseSetPosition(hWP,parse_pos);  /* reset parser position */
        if(FindInString(MakeLowerCase(s2),"</sup") < 0 &&  /* if next word is not a close sup tag */
           FindInString(MakeLowerCase(s2),"</font") < 0 ){  /* and not a close font tag */
          continue;  /* keep going */
        }
      }
      if(FindInString(MakeLowerCase(s1),"<sup") == 0 ||  /* if we're in a font or sup tag */
         FindInString(MakeLowerCase(s1),"<font") == 0 ){
        in_tag = true;  /* we're in a tag */
        continue;  /* go back for next word */
      }
      if(FindInString(MakeLowerCase(s1),"</sup") == 0 ||  /* if we're in a close font or sup tag */
         FindInString(MakeLowerCase(s1),"</font") == 0 ){
        in_tag = false;  /* not in tag anymore */
        continue;  /* go back for next word */
      }
      if (FindInString(s1,"<")>(-1)){  /* if word is any other tag */
        continue;  /* keep going */
      }
This code should look pretty familiar as well, since it’s almost exactly the same as mapping the endnotes. We iterate over each word, and test to see if it is a character entity. If so we can continue to the next word because this one isn’t a reference. If we’re in an SGML tag, we can get the next word, and if it’s not a close tag we can continue to the next word, because this isn’t a reference either. If it’s an open sup or open font tag, we need to set the in_tag flag to true, and if it’s a close sup or close font tag, we can set the in_tag flag to false. Lastly, if it’s just a normal SGML tag, we can also skip it, because it’s not a word that might be a footnote reference unless it’s in a sup or font tag.
      if (s1[0] == '[') {  /* Footnote as [1] */
        s1 = GetStringSegment(s1, 1);  /* Skip the bracket */
        s1 = ReplaceInString(s1, "]", "");  /* Remove closing */
        nn = DecimalToInteger(s1);  /* Perform conversion */
      }  /* end footnote number as [] */
      else {  /* As a plain number */
        nn = DecimalToInteger(s1);  /* Perform conversion */
      }  /* end footnote as number */
      if ((nn < 1) || (nn >= MAX_NOTES) || in_tag == false){  /* Not a valid number */
        continue;  /* Go back for more */
      }  /* end not valid */
      /* o Add Item */
      same_note = false;  /* reset same_note flag */
      for(ix=0;ix<n_count;ix++){  /* for each footnote we found */
        if (n_ox[ix]==ox){  /* if this outline item is a footnote */
          same_note = true;  /* flag as already tagged */
        }
      }
      if (same_note){  /* if already a footnote def */
        break;  /* go for next outline object */
      }
      n_px[nn-1] = px;  /* store page number of reference */
    }
    ox++;  /* Next Outline Item */
  }
  return ERROR_NONE;  /* Exit, no error */
}  /* end function */
If we’re still processing, we use the same logic as in the other function to remove the brackets from the number and test if it’s a valid footnote number. If so, we check that the outline item we’re in isn’t one of the endnotes we already mapped (in that case the number is the endnote’s own label, not a reference, so we move on to the next outline item). Otherwise, we store the page index of the reference in n_px for our run function to use later.
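Stripped of the tag parsing, the page bookkeeping in this function comes down to the following Python sketch. It is a simplified model with hypothetical names: each outline item is represented either by the string "BREAK" or by the list of superscript numbers already found in it, which the real script works out word by word.

```python
def map_reference_pages(items, endnote_items):
    """Map each referenced note number to the page index it appears on.

    items: one entry per outline item, either "BREAK" or a list of
           note numbers found in superscript within that item.
    endnote_items: set of item indexes already identified as endnotes.
    """
    page = 0
    page_of_note = {}
    for index, item in enumerate(items):
        if item == "BREAK":
            page += 1  # everything after this belongs to the next page
            continue
        if index in endnote_items:
            continue  # the number here is the endnote's own label
        for number in item:
            page_of_note[number] = page
    return page_of_note
```

Skipping the endnote items matters because each endnote starts with its own number in superscript, which would otherwise overwrite the real reference’s page with the last page.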
All in all, this is probably one of the more complicated scripts we’ve discussed on this blog, but it just takes a lot of little things and combines them into a larger whole. It’s gone through many revisions: as we found files it didn’t work on, it was modified to handle them. I’m sure it’s not 100% complete, so it will probably need further additions when we encounter other documents it doesn’t work very well on, but that’s how these things are done. The script does the best it can, and it just needs to evolve as more possible variations on footnotes are found.
//
//      GoFiler Legato Script - End Notes to Footnotes
//      ----------------------------------------------
//
//      Rev     06/15/2015  SAT  Initial Commit
//              03/09/2018  SCH  Modified script to ensure font tag references are superscript.
//              07/26/2018  SCH  Modified script to use HTML mode WordParse object.
//
//      (c) 2018 Novaworks, LLC -- All rights reserved.
//
//      Notes:
//
//         - Requires GoFiler 4.23a or later.
//         - Can run from IDE, runs on all open HTML files if so.
//
//      DO NOT EDIT THIS FILE - SEE INSTRUCTIONS IN EXTENSIONS FOLDER

/********************************************************/
int run (int, string, handle);
int get_num_pages (handle hOutline);
int map_end_notes (handle hOutline);
int map_note_references (handle hOutline);
/********************************************************/

/****************************************/
#define MAX_NOTES 500  /* Maximum (highest) Footnote Number */
/****************************************/
int     n_count;                  /* Actual Added Count */
string  n_text[MAX_NOTES];        /* Note Text */
int     n_ox[MAX_NOTES];          /* Outline Object Index */
int     n_px[MAX_NOTES];          /* Page Index to Apply */
int     p_ox[MAX_NOTES];          /* Page Object Index */
int     num_pages;                /* number of pages in document */

/********************************************************/
/* Menu Setup and Patch                                 */
/* --------------------                                 */
/********************************************************/

/************************************************/
/* Set Hook                                     */
/* --------                                     */
/* All globals are thrown away after this func- */
/* tion is run.                                 */
/************************************************/
/****************************************/
int setup() {  /* Called from Application Startup */
/****************************************/
  string fnScript;  /* Us */
  string item[10];  /* Menu Item */
  int rc;  /* Return Code */

  /* ** Add Menu Item */
  /* * Define Function */
  item["Code"] = "EXTENSION_ENDNOTES_REMAP";  /* Function Code */
  item["MenuText"] = "&Endnotes to Footnotes";  /* Menu Text */
  item["Description"] = "<B>Endnotes to Footnotes</B>\r\rFind endnotes and move each to the appropriate page.";
  item["Class"] = "DocumentExtension";  /* Add to Document Ribbon */
  /* * Check for Existing */
  rc = MenuFindFunctionID(item["Code"]);  /* Look for existing */
  if (IsNotError(rc)) {  /* Was already added */
    return ERROR_NONE;  /* Exit */
  }  /* end error */
  /* * Registration */
  rc = MenuAddFunction(item);  /* Add the item */
  if (IsError(rc)) {  /* Could not add */
    return ERROR_NONE;  /* Exit */
  }  /* end error */
  fnScript = GetScriptFilename();  /* Get the script filename */
  MenuSetHook(item["Code"], fnScript, "run");  /* Set the Hook */
  return ERROR_NONE;  /* Return value (does not matter) */
}  /* end setup */

/********************************************************/
/************************************************/
/****************************************/
int main() {  /* Default Entry */
/****************************************/
  string windows[][];
  int size,ix;
  string s1;  /* General */
  int rc;  /* Return Code */

  s1 = GetScriptParent();  /* Get the parent */
  if (s1 == "LegatoIDE") {  /* Is run from the IDE (debug) */
    windows = EnumerateEditWindows();  /* enumerate windows */
    size = ArrayGetAxisDepth(windows);  /* get number of windows */
    for (ix = 0 ; ix < size; ix++){  /* for each window */
      if (windows[ix]["FileTypeToken"] == "FT_HTML"){  /* if it's an HTML window */
        run (0,"preprocess", MakeHandle(windows[ix]  /* run the script on it */
             ["ClientHandle"]));
      }
    }
    setup();  /* run setup */
  }  /* end IDE run */
  return ERROR_NONE;  /* Return value (does not matter) */
}  /* end main */

/****************************************/
int run(int f_id, string mode, handle hView) {  /* Call from Hook Processor */
/****************************************/
  handle hEdit;  /* Window and Edit Object */
  handle hOutline;  /* Outline Handle */
  handle hMT;  /* Source Mapped Text */
  handle hLog;  /* Error */
  string fn,fp,test;  /* test variables */
  dword w_type;  /* Window Type */
  string s1;  /* General */
  int s_x, s_y, e_x, e_y;  /* Position */
  int lp_x;  /* Past Page Index */
  int ox;  /* Outline Index */
  int ix, size;  /* General */
  int rc;  /* Return Code */

  /* ** End Notes */
  /* * Only on Frontside Hook */
  if (mode != "preprocess") {  /* Filter out all but preprocess */
    return ERROR_NONE;  /* Just leave */
  }  /* end not preprocess */
  /* * Get Active Window */
  if (IsWindowHandleValid(hView)==false){
    hView = GetActiveEditWindow();  /* Get the active window */
    if (hView == NULL_HANDLE) {  /* No handle */
      MessageBox('x', "Active window required.");  /* Display error */
      return ERROR_CANCEL_AUTO;  /* Exit w/error */
    }  /* end error */
    w_type = GetEditWindowType(hView);  /* Get the window type */
    if (IsError()) {  /* Error, window may be a child */
      hView = GetParentWindow(hView);  /* Get the parent */
      w_type = GetEditWindowType(hView);  /* Try this type */
    }  /* end parent window */
    w_type &= EDX_TYPE_ID_MASK;  /* Rip off type */
    if ((w_type != EDX_TYPE_PSG_PAGE_VIEW) &&  /* Must be Page View */
        (w_type != EDX_TYPE_PSG_TEXT_VIEW)) {  /* or Code View */
      MessageBox('x', "Wrong window type. Must be Page View or Code View.");  /* Display error */
      return ERROR_CANCEL_AUTO;  /* Exit w/error */
    }  /* end error */
  }
  /* o Get the Edit Object */
  hEdit = GetEditObject(hView);  /* Get the edit object */
  if (hEdit == NULL_HANDLE) {  /* Problem with handle */
    MessageBox('x', "Could not find source window");  /* Display error */
    return ERROR_CANCEL_AUTO;  /* Exit w/error */
  }  /* end error */
  /* o Check the File Type */
  /* * Perform the Outline */
  hOutline = OutlineCreateObject();  /* Create class for outline */
  rc = OutlineSetObject(hOutline, hEdit);  /* Outline entire document, high level */
  if (IsError(rc)) {  /* Error on map */
    MessageBox('s', "Internal Error %08X mapping outline", rc);  /* Display error */
    return rc;  /* Exit w/error */
  }  /* end error test */
  /* * Perform the Mapping */
  /* o Clear Array (Script Persists) */
  num_pages = get_num_pages(hOutline);  /* get the number of pages */
  n_count = 0;  /* Actual Added Count */
  ArrayClear(n_text);  /* Note Text */
  ArrayClear(n_ox);  /* Outline Object Index */
  ArrayClear(n_px);  /* Page Index to Apply */
  ArrayClear(p_ox);  /* Page Object Index */
  /* o Scan Document */
  map_end_notes(hOutline);  /* Get the end note positions */
  map_note_references(hOutline);  /* Map the references */
  /* * Look for Errors */
  /* o Obvious Errors */
  /* > Pages */
  size = ArrayGetAxisDepth(p_ox);  /* Get the pages */
  if (size == 0) {  /* Nothing to do */
    MessageBox('x', "There are no page markers to place footnotes.");  /* Display error */
    return rc;  /* Exit w/error */
  }  /* end error test */
  /* > No Notes */
  size = ArrayGetAxisDepth(n_text);  /* Get the notes */
  if (size == 0) {  /* Nothing to do */
    MessageBox('x', "Could not locate any endnotes.");  /* Display error */
    return rc;  /* Exit w/error */
  }  /* end error test */
  size = ArrayGetAxisDepth(n_text) - 1;  /* Get the number of notes */
  /* o Set Up Log */
  s1 = GetEditObjectFilename(hEdit);  /* Get the name of the file */
  hLog = LogCreate("Endnote Errors", s1);  /* Create log for errors */
LogSetMessageType(LOG_NONE); /* Set the message type */ s1 = GetFilename(s1); /* Get just the filename */ AddMessage(hLog, "Endnotes to Footnotes for: %s", s1); /* Add Header */ LogIndent(hLog); /* Indent log */ /* o Scan for Gaps in Endnotes */ ix = 0; /* Starting position */ size = ArrayGetAxisDepth(n_text) - 1; /* Get the number of notes */ LogSetMessageType(LOG_ERROR); /* Set the message type */ /* > Look for Missing Items */ while (ix < size) { /* Loop to the end */ /* - No Data */ if (n_text[ix] == "") { /* No data */ s_x = -1; s_y = -1; /* Reset position */ LogSetPosition(s_x, s_y); /* Set position (default) */ LogSetMessageType(LOG_ERROR); /* Set the message type */ AddMessage(hLog, "Note: %d -- No text found for footnote",ix+1);/* Add in error */ } /* end no data */ else { /* Has data */ /* - Ordering of Numbers */ s1 = OutlineGetItemText(hOutline, n_ox[ix]); /* Get the text */ s1 = TrailStringAfter(s1, 100); /* Trail off */ s_x = OutlineGetItemPosSX(hOutline, n_ox[ix]); /* Get Start X */ s_y = OutlineGetItemPosSY(hOutline, n_ox[ix]); /* Get Start Y */ LogSetPosition(s_x, s_y); /* Set position (default) */ if (ix > 0) { /* Check the order */ if (n_ox[ix] <= n_ox[ix-1]) { /* Ordering issue */ LogSetMessageType(LOG_FATAL); /* Set the message type */ AddMessage(hLog, "Note: %d -- Out of order: %s" , ix+1, s1);/* Add in error */ } /* end order of data */ else { /* Not zero */ if (n_ox[ix] != n_ox[ix-1] + 1) { /* - Serial */ s_x = OutlineGetItemPosSX(hOutline, n_ox[ix-1]); /* Get Start X */ s_y = OutlineGetItemPosSY(hOutline, n_ox[ix-1]); /* Get Start Y */ LogSetPosition(s_x, s_y); /* Set position (default) */ LogSetMessageType(LOG_FATAL); /* Set the message type */ AddMessage(hLog, "Note: %d -- Gap in text: %s" , ix+1,s1);/* Add in error */ } /* end error */ else { /* - OK */ LogSetMessageType(LOG_INFO); /* Set the message type */ AddMessage(hLog, "Note: %d -- %s" , ix+1, s1); /* Add in message */ } /* end ok */ } /* end field zero */ } /* end not first */ 
else { /* - OK (first, zero) */ LogSetMessageType(LOG_INFO); /* Set the message type */ AddMessage(hLog, "Note: %d -- %s" , ix+1, s1); /* Add in message */ } /* end field zero */ } /* end has data */ ix++; /* Next Entry */ } /* end missing loop */ /* * Display Errors */ rc = LogGetMessageCount(hLog); /* Get messages */ if (rc != 0) { /* Had errors */ LogOutdent(hLog); /* Indent log */ LogAddErrorSummary(hLog); /* Add Summary Data */ LogDisplay(hLog); /* Display the log */ MessageBox('x', "Unable to map notes, see log for errors."); /* Display message */ return ERROR_SYNTAX; /* Exit w/error */ } /* end messages */ /* * Perform Changes */ /* o Delete End Notes */ ix = ArrayGetAxisDepth(n_text) - 1; /* Walk Up File (keep valid positions) */ /* > Loop Backward Though */ while (ix >= 0) { /* Loop and delete */ /* - Get Position */ if (n_px[ix] == num_pages){ /* if this note is already on last page */ ix--; /* decrement counter */ continue; /* go back for next */ } /* */ s_x = OutlineGetItemPosSX(hOutline, n_ox[ix]); /* Get Start X */ s_y = OutlineGetItemPosSY(hOutline, n_ox[ix]); /* Get Start Y */ e_x = OutlineGetItemPosEX(hOutline, n_ox[ix]); /* Get End X */ e_y = OutlineGetItemPosEY(hOutline, n_ox[ix]); /* Get End Y */ /* - Delete Main */ WriteSegment(hEdit, "", s_x, s_y, e_x, e_y); /* Delete the area */ /* - Delete Blanks */ if (IsBlankLine(hEdit, s_y)) { /* Resulted in blank line */ WriteSegment(hEdit, "", 0, s_y, 0, s_y + 1); /* Delete the line */ } /* end blank line */ if (IsBlankLine(hEdit, s_y)) { /* Resulted in blank line */ WriteSegment(hEdit, "", 0, s_y, 0, s_y + 1); /* Delete the line */ } /* end blank line */ ix--; /* Back up */ } /* end delete loop */ /* o Spool Notes to Pages */ ix = ArrayGetAxisDepth(n_text) - 1; /* Walk Up File (keep valid positions) */ lp_x = -1; /* Set the last page */ /* > Loop Backward Though */ while (ix >= 0) { /* Loop and delete */ /* - Get Position */ if (n_px[ix] == num_pages){ /* if this note is already on last page */ ix--; 
/* decrement counter */ continue; /* go back for next */ } /* */ /* - Page Change */ if (n_px[ix] != lp_x) { /* Page Changed */ ox = n_px[ix]; /* Page Index */ s_x = OutlineGetItemPosSX(hOutline, p_ox[ox]); /* Get Start X for page */ s_y = OutlineGetItemPosSY(hOutline, p_ox[ox]); /* Get Start Y */ s1 = "<HR ALIGN=\"LEFT\" SIZE=\"1\" STYLE=\"width: 5pc; margin-top: 12pt\">"; /* Rule */ s1 += "\r\r"; /* Spacer */ WriteSegment(hEdit, s1, s_x, s_y, s_x, s_y); /* Add in divider */ s_y += 2; /* Move to position for notes */ lp_x = n_px[ix]; /* Set the new page */ } /* end page change */ /* - Insert Footnote */ WriteSegment(hEdit, n_text[ix], 0, s_y, 0, s_y); /* Add in note */ ix--; /* Back up */ } /* end delete loop */ /* */ /* ** Result */ /* * Result */ size = ArrayGetAxisDepth(n_text); /* Get the number of notes */ MessageBox('i', "Relocated %d endnotes to footnotes.", size); /* Display result */ return ERROR_NONE; /* Exit, OK */ } /* end run function */ /****************************************/ int get_num_pages(handle hOutline){ /* get the number of pages in the doc */ /****************************************/ int size; /* the number of outline items */ int ox; /* the current item num */ int px; /* the current page number */ dword type; /* the type of outline item */ string s2; /* temp string */ /* */ size = OutlineGetItemCount(hOutline); /* Get the overall items */ ox = 0; /* At begining */ /* * Scan for First Item */ while (ox < size) { /* Loop until end of list */ /* o Get Text and Test */ s2 = OutlineGetItemText(hOutline,ox); /* get item text */ type = OutlineGetItemType(hOutline, ox); /* Get the type */ type &= HO_TYPE_MASK; /* Remove other bits */ /* o Page Break */ if (type == HO_TYPE_PAGE_BREAK) { /* On a page break */ px++; /* Add Page Index */ ox++; /* Next Outline Item */ continue; /* Go back for more */ } /* end is a page break */ ox++; /* increment item number */ } /* */ return px; /* return the page number */ } 
/****************************************/ int map_end_notes(handle hOutline) { /* Map the End Notes */ /****************************************/ handle hWP; /* Word Parser */ string s1; /* General */ string s2; /* * */ dword type; /* the type of outline object */ int px; /* page number */ int ox, size; /* Outline Index */ int dx; /* Destination Index */ int words; /* number of words in tag */ int nn; /* Note Number */ int parse_pos; /* position of word parser */ boolean first_word; /* first word flag for word parser */ boolean in_tag; /* true if inside a font or sup tag */ /* */ /* ** Find Possible End Notes */ /* * Initialize */ hWP = WordParseCreate(WP_SGML_TAG); /* Make a word parser */ size = OutlineGetItemCount(hOutline); /* Get the overall items */ ox = 0; /* At begining */ /* * Scan for First Item */ while (ox < size) { /* Loop until end of list */ /* o Get Text and Test */ s2 = OutlineGetItemHTML(hOutline, ox); /* Get the text */ type = OutlineGetItemType(hOutline, ox); /* Get the type */ type &= HO_TYPE_MASK; /* Remove other bits */ /* o Page Break */ if (type == HO_TYPE_PAGE_BREAK) { /* On a page break */ p_ox[px] = ox; /* Page Index */ px++; /* Add Page Index */ ox++; /* Next Outline Item */ continue; /* Go back for more */ } /* end is a page break */ if (px != num_pages){ /* if we're not on the last page */ ox++; /* increment item count */ continue; /* */ } /* */ WordParseSetData(hWP, s2); /* Parse it */ s1 = "init"; /* init loop */ first_word = false; /* reset word parse first word flag */ while (s1 != ""){ /* while word parse has words */ nn = 0; /* Footnote number */ s1 = WordParseGetWord(hWP); /* Get first word */ parse_pos = WordParseGetPosition(hWP); /* get position of word parser */ if (FindInString(s1,"<")<0){ /* if not an SGML tag */ if (first_word == false){ /* if first word isn't set */ first_word = true; /* set first_word flag */ } /* */ else{ /* if this is not the first word */ break; /* break the loop */ } /* */ } /* */ if 
(FindInString(s1,"</")>(-1)){ /* if word is a close tag */ continue; /* keep going */ } /* */ if (FindInString(s1,"&")>(-1)){ /* if word is a char entity */ continue; /* keep going */ } /* */ if (in_tag == true){ /* if we're in tag */ s2 = WordParseGetWord(hWP); /* get the next word */ WordParseSetPosition(hWP,parse_pos); /* reset parser position */ if(FindInString(MakeLowerCase(s2),"</sup") < 0 && /* if we're not a close font or sup tag */ FindInString(MakeLowerCase(s2),"</font") < 0 ){ /* if we're not a close font or sup tag */ continue; /* keep going */ } /* */ } /* */ if(FindInString(MakeLowerCase(s1),"<sup") == 0 || /* if we're in a font or sup tag */ FindInString(MakeLowerCase(s1),"<font") == 0 ){ /* if we're in a font or sup tag */ in_tag = true; /* we're in a tag */ continue; /* go back for next word */ } /* */ if(FindInString(MakeLowerCase(s1),"</sup") == 0 || /* if we're in a close font or sup tag */ FindInString(MakeLowerCase(s1),"</font") == 0 ){ /* if we're in a close font or sup tag */ in_tag = false; /* not in tag anymore */ continue; /* go back for next word */ } /* */ if (s1[0] == '[') { /* Footnote as [1] */ s1 = GetStringSegment(s1, 1); /* Skip the brace */ s1 = ReplaceInString(s1, "]", ""); /* Remove closing */ nn = DecimalToInteger(s1); /* Perform conversion */ } /* end footnote number as[] */ else { /* As a plain number */ nn = DecimalToInteger(s1); /* Perform conversion */ } /* end footnote as number */ if ((nn < 1) || (nn >= MAX_NOTES) || in_tag == false){ /* Not a valid number */ continue; /* Go back for more */ } /* end not valida */ /* o Add Item */ dx = nn - 1; /* Zero Offset Adjust */ n_text[dx] = OutlineGetItemHTML(hOutline, ox) + "\r\r"; /* Retrieve the HTML */ n_ox[dx] = ox; /* Index for Reference */ n_count++; /* Actual Added Count */ } /* */ ox++; /* Next Outline Item */ } /* end scan loop */ return ERROR_NONE; /* Exit, no error */ } /* end function */ /****************************************/ int map_note_references(handle 
hOutline) { /* Map the Note References */ /****************************************/ handle hWP; /* Word Parser */ string s1, s2; /* General */ dword type; /* Outline Item Type */ boolean first_word; /* first word */ boolean same_note; /* same note */ boolean in_tag; /* in tag */ int ix; /* counter */ int parse_pos; /* parser position */ int fnr_flag; /* Footnote Reference Flag */ int ox, size; /* Outline Index */ int dx; /* Destination Index */ int px, /* Page Index (bottom of page) */ nn, /* Note Number */ wc; /* Word Count (in outline text) */ /* */ /* ** Find Possible End Notes */ /* * Initialize */ hWP = WordParseCreate(WP_SGML_TAG); /* Make a word parser */ size = OutlineGetItemCount(hOutline); /* Get the overall items */ ox = 0; /* At begining */ /* * Scan for First Item */ while (ox < size) { /* Loop until end of list */ /* o Get Text and Test */ s2 = OutlineGetItemHTML(hOutline, ox); /* Get the text */ type = OutlineGetItemType(hOutline, ox); /* Get the type */ type &= HO_TYPE_MASK; /* Remove other bits */ /* o Page Break */ if (type == HO_TYPE_PAGE_BREAK) { /* On a page break */ p_ox[px] = ox; /* Page Index */ px++; /* Add Page Index */ ox++; /* Next Outline Item */ continue; /* Go back for more */ } /* end is a page break */ s2 = OutlineGetItemHTML(hOutline, ox); /* Get the text */ WordParseSetData(hWP, s2); /* Parse it */ s1 = "init"; /* init loop */ first_word = false; /* reset word parse first word flag */ while (s1 != ""){ /* while word parse has words */ nn = 0; /* Footnote number */ s1 = WordParseGetWord(hWP); /* Get first word */ parse_pos = WordParseGetPosition(hWP); /* get position of word parser */ if (FindInString(s1,"&")>(-1)){ /* if word is a char entity */ continue; /* keep going */ } /* */ if (in_tag == true){ /* if we're in tag */ s2 = WordParseGetWord(hWP); /* get the next word */ WordParseSetPosition(hWP,parse_pos); /* reset parser position */ if(FindInString(MakeLowerCase(s2),"</sup") < 0 && /* if we're not a close font or sup tag */ 
FindInString(MakeLowerCase(s2),"</font") < 0 ){ /* if we're not a close font or sup tag */ continue; /* keep going */ } /* */ } /* */ if(FindInString(MakeLowerCase(s1),"<sup") == 0 || /* if we're in a font or sup tag */ FindInString(MakeLowerCase(s1),"<font") == 0 ){ /* if we're in a font or sup tag */ in_tag = true; /* we're in a tag */ continue; /* go back for next word */ } /* */ if(FindInString(MakeLowerCase(s1),"</sup") == 0 || /* if we're in a close font or sup tag */ FindInString(MakeLowerCase(s1),"</font") == 0 ){ /* if we're in a close font or sup tag */ in_tag = false; /* not in tag anymore */ continue; /* go back for next word */ } /* */ if (FindInString(s1,"</")>(-1)){ /* if word is a close tag */ continue; /* keep going */ } /* */ if (s1[0] == '[') { /* Footnote as [1] */ s1 = GetStringSegment(s1, 1); /* Skip the brace */ s1 = ReplaceInString(s1, "]", ""); /* Remove closing */ nn = DecimalToInteger(s1); /* Perform conversion */ } /* end footnote number as[] */ else { /* As a plain number */ nn = DecimalToInteger(s1); /* Perform conversion */ } /* end footnote as number */ if ((nn < 1) || (nn >= MAX_NOTES) || in_tag == false){ /* Not a valid number */ continue; /* Go back for more */ } /* end not valida */ /* o Add Item */ same_note = false; /* reset same_note flag */ for(ix=0;ix<n_count;ix++){ /* for each footnote we found */ if (n_ox[ix]==ox){ /* if this outline item is a footnote */ same_note = true; /* flag as already tagged */ } /* */ } /* */ if (same_note){ /* if already a footnote def */ break; /* go for next outline object */ } /* */ n_px[nn-1] = px; /* store page number of reference */ } /* */ ox++; /* Next Outline Item */ } /* */ return ERROR_NONE; /* Exit, no error */ } /* end function */
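Both map_end_notes and map_note_references repeat the same few lines to turn a word such as [3] or 3 into a note number. As a minimal sketch, that logic could be pulled into one helper. The name parse_note_number is hypothetical and is not part of the script above. It uses only calls that already appear in the script (GetStringSegment, ReplaceInString, DecimalToInteger) and assumes MAX_NOTES is defined as it is at the top of the file.

```
int parse_note_number(string word);             /* hypothetical prototype, listed with the others */

int parse_note_number(string word) {            /* hypothetical helper, not in the original */
    string s1;                                  /* working copy of the word */
    int    nn;                                  /* note number */

    s1 = word;                                  /* leave the caller's string alone */
    if (s1[0] == '[') {                         /* reference written as [1] */
        s1 = GetStringSegment(s1, 1);           /* skip the opening bracket */
        s1 = ReplaceInString(s1, "]", "");      /* remove the closing bracket */
    }
    nn = DecimalToInteger(s1);                  /* convert what is left */
    if ((nn < 1) || (nn >= MAX_NOTES)) {        /* same range test as the script */
        return 0;                               /* 0 means "not a usable note number" */
    }
    return nn;                                  /* valid note number, 1-based */
}
```

With a helper like this, each scanning loop would call nn = parse_note_number(s1); and then skip the word when nn is 0 or in_tag is false, which is the same test the script performs inline.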
Steven Horowitz has been working for Novaworks for over five years as a technical expert with a focus on EDGAR HTML and XBRL. Since the creation of the Legato language in 2015, Steven has been developing scripts to improve the GoFiler user experience. He is currently working toward a Bachelor of Science in Software Engineering at RIT and MCC.