It’s time to finish up our base script for our Section 16 data scraper. In the first two parts of this series we started a script that would go out to the SEC’s website, download all of the filings from a particular CIK, cache all of the Section 16 filings, find all of the reported transactions, and put those transactions into a CSV file. This week we are going to finish up the base functionality by reading through all of the transactions and finding the current holdings. Additionally we are going to perform a large amount of refactoring as well as add in the functionality to look at reported holdings.
Friday, March 29. 2019
LDC #129: Building a Section 16 Data Scraper, Part 3: Analysis
Let’s first talk about some of the refactoring that we will be doing this week. To refactor our code we are going to look ahead to the future. What are our plans for the scraper? My plans for this script is to use it to populate a dialog control so that a user can specify a CIK and behind the scenes our script retrieves all the information. That way we can dynamically show the user the information that they want. Using the script in this way makes it even easier for a user to get someone’s latest holdings.
The easiest way to prepare for something like that is to split out each part of our large main function into its own functions. This way later on we can have the dialog call each function as needed - as we need more information we can retrieve more information. That means our final script today has a main function, but the main function is little more than a test jig now that calls each function and then prints out our findings.
Here’s this week’s final script:
/***************************************************************************************************************** Section 16 Data Collector ------------------------- Revision: 01-04-19 JCK Initial creation - get filings from SEC 02-08-19 JCK Part 2 - Store information 03-29-19 JCK Part 3 - Refactoring and getting holdings Notes: -Stage 3 (c) 2019 Novaworks, LLC. All Rights Reserved. *****************************************************************************************************************/ #define TEST_CIK "0001091423" //Robert Kotick #define COL_TORH 0 #define COL_AORD 1 #define COL_TDATE 2 #define COL_TTYPE 3 #define COL_DORI 4 #define COL_NUMTRANS 5 #define COL_NUMOWNED 6 #define COL_CIK 7 #define COL_NAME 8 handle hSheet; handle file; string s16filings[]; //Table of all filings string s16holdings[][]; //Table of current holdings boolean is_section_16 (string type); void add_transaction (string issuer); void add_holding (string issuer, string date); int populate_filings (string cache, string c_cik); void get_current_holdings (string cache); void parse_filings (string cache, string c_cik); void main() { //creation variables string cache; //Location of cache string c_cik; //CIK of Reporting Owner int count; //Counting Variable int max; c_cik = TEST_CIK; //Find the location for our cache cache = GetScriptFolder() + "Cache\\" + c_cik; //create cache CreateFolders(cache); if (IsFolder(cache) == FALSE) { AddMessage('x', "Unable to create document cache."); return; } //Populate filings array based on SEC's data populate_filings(cache, c_cik); //Parse through filings to find all transactions parse_filings(cache, c_cik); //Parse through transactions and holdings to find current number get_current_holdings(AddPaths(cache, c_cik + ".csv")); count = 0; max = ArrayGetAxisDepth(s16holdings, AXIS_ROW); int printed; printed = 1; //Print out all of our current holdings while (count < max) { if (s16holdings[count][COL_NUMOWNED] != "0") { AddMessage("%-11s %02d: CIK: %08s Name: %-50s Securities Owned: %-10s Last Reported: %10s", s16holdings[count][COL_TORH], printed++, s16holdings[count][COL_CIK], s16holdings[count][COL_NAME], s16holdings[count][COL_NUMOWNED], s16holdings[count][COL_TDATE]); } count++; } } //Returns true if S16 file type, false if anything else boolean is_section_16(string type) { switch (type) { case "3": return true; case "4": return true; case "5": return true; case "3/A": return true; case "4/A": return true; case "5/A": return true; } return false; } void parse_filings(string cache, string c_cik) { int count, max, rc; string date, cik; string parser; //Get number of filings count = ArrayGetAxisDepth(s16filings) - 1; //Create data sheet hSheet = DataSheetCreate(2, 9); //Set the header row DataSheetSetCellText(hSheet, 0, COL_TORH, "Transaction or Holding"); DataSheetSetCellText(hSheet, 0, COL_AORD, "Acquisition or Disposition"); DataSheetSetCellText(hSheet, 0, COL_TDATE, "Transaction Date"); DataSheetSetCellText(hSheet, 0, COL_TTYPE, "Transaction Type"); DataSheetSetCellText(hSheet, 0, COL_DORI, "Direct or Indirect Ownership"); DataSheetSetCellText(hSheet, 0, COL_NUMTRANS, "Number of Securities Transacted"); DataSheetSetCellText(hSheet, 0, COL_NUMOWNED, "Number of Securities Owned"); DataSheetSetCellText(hSheet, 0, COL_CIK, "Issuer CIK"); DataSheetSetCellText(hSheet, 0, COL_NAME, "Security Name"); //Open progress bar ProgressOpen("Reading Filings"); ProgressSetPhaseCount(count); max = count; //Go through filings while (count >= 0) { //Update progress ProgressSetPhase(max-count); ProgressSetStatus(FormatString("Reading file %d of %d", max-count, max)); ProgressUpdate(max-count, max); //AddMessage("Parsing file %s", s16filings[count]); date = ""; //Create SGML parser file = SGMLCreate(); //Set parse to this file rc = SGMLSetFile(file, s16filings[count]); //Check for errors if (rc != ERROR_NONE) { AddMessage("Error on Setting File %s: 0x%08X", s16filings[count], rc); } //Get first element parser = SGMLNextElement(file); rc = GetLastError(); //Check for issuer CIK while (parser != "<periodOfReport>" && rc == ERROR_NONE) { parser = SGMLNextElement(file); rc = GetLastError(); } //Get CIK date = SGMLNextItem(file); //Check for issuer CIK while (parser != "<issuerCik>" && rc == ERROR_NONE) { parser = SGMLNextElement(file); rc = GetLastError(); } //Get CIK cik = SGMLNextItem(file); if (rc != ERROR_NONE) { AddMessage("Error on retrieving CIK: 0x%08X", rc); } //Until end of file while (rc == ERROR_NONE) { //check if element is beginning of transaction if (parser == "<nonDerivativeTransaction>" || parser == "<derivativeTransaction>") { //Add transaction add_transaction(cik); } if (parser == "<nonDerivativeHolding>" || parser == "<derivativeHolding>") { //Add transaction add_holding(cik, date); } //Next element parser = SGMLNextElement(file); rc = GetLastError(); } //Done with SGML parser CloseHandle(file); //Next file count--; } //Export data to CSV rc = DataSheetExport(hSheet, AddPaths(cache, c_cik + ".csv")); if (rc != ERROR_NONE) { AddMessage("Error on Creating CSV: 0x%08X", rc); } CloseHandle(hSheet); //AddMessage("Finished Parsing files from CIK: %s", TEST_CIK); } void add_transaction(string issuer) { string aord, date, ttype, dori, numtrans, numowned, secname; string parser; int rc; //Initialize variables aord = ""; date = ""; ttype = ""; dori = ""; numtrans = ""; numowned = ""; secname = ""; //Get next element parser = SGMLNextElement(file); //Check for end of the transaction while (parser != "</nonDerivativeTransaction>" && parser != "</derivativeTransaction>" && rc == ERROR_NONE){ if (parser == "<securityTitle>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } secname = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ secname += SGMLNextItem(file); } secname = ReplaceInString(secname, "</value>", ""); } if (parser == "<transactionDate>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } date = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ date += SGMLNextItem(file); } date = ReplaceInString(date, "</value>", ""); } if (parser == "<transactionCode>"){ ttype = SGMLNextItem(file); } if (parser == "<transactionShares>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } numtrans = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ numtrans += SGMLNextItem(file); } numtrans = ReplaceInString(numtrans, "</value>", ""); } if (parser == "<transactionAcquiredDisposedCode>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } aord = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ aord += SGMLNextItem(file); } aord = ReplaceInString(aord, "</value>", ""); } if (parser == "<sharesOwnedFollowingTransaction>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } numowned = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ numowned += SGMLNextItem(file); } numowned = ReplaceInString(numowned, "</value>", ""); } if (parser == "<directOrIndirectOwnership>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } dori = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ dori += SGMLNextItem(file); } dori = ReplaceInString(dori, "</value>", ""); } parser = SGMLNextElement(file); rc = GetLastError(); } //Insert a row underneath the header row DataSheetRowInsert(hSheet, 1); //Insert all of the data DataSheetSetCellText(hSheet, 1, COL_TORH, "Transaction"); DataSheetSetCellText(hSheet, 1, COL_AORD, aord); DataSheetSetCellText(hSheet, 1, COL_TDATE, date); DataSheetSetCellText(hSheet, 1, COL_TTYPE, ttype); DataSheetSetCellText(hSheet, 1, COL_DORI, dori); DataSheetSetCellText(hSheet, 1, COL_NUMTRANS, numtrans); DataSheetSetCellText(hSheet, 1, COL_NUMOWNED, numowned); DataSheetSetCellText(hSheet, 1, COL_CIK, issuer); DataSheetSetCellText(hSheet, 1, COL_NAME, secname); } void add_holding(string issuer, string date) { string aord, ttype, dori, numtrans, numowned, secname; string parser; int rc; //Initialize variables dori = ""; numowned = ""; secname = ""; //Get next element parser = SGMLNextElement(file); //Check for end of the transaction while (parser != "</nonDerivativeHolding>" && parser != "</derivativeHolding>" && rc == ERROR_NONE){ if (parser == "<securityTitle>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } secname = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ secname += SGMLNextItem(file); } secname = ReplaceInString(secname, "</value>", ""); } if (parser == "<sharesOwnedFollowingTransaction>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } numowned = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ numowned += SGMLNextItem(file); } numowned = ReplaceInString(numowned, "</value>", ""); } if (parser == "<directOrIndirectOwnership>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } dori = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ dori += SGMLNextItem(file); } dori = ReplaceInString(dori, "</value>", ""); } parser = SGMLNextElement(file); rc = GetLastError(); } //Insert a row underneath the header row DataSheetRowInsert(hSheet, 1); //Insert all of the data DataSheetSetCellText(hSheet, 1, COL_TORH, "Holding"); DataSheetSetCellText(hSheet, 1, COL_TDATE, date); DataSheetSetCellText(hSheet, 1, COL_DORI, dori); DataSheetSetCellText(hSheet, 1, COL_NUMOWNED, numowned); DataSheetSetCellText(hSheet, 1, COL_CIK, issuer); DataSheetSetCellText(hSheet, 1, COL_NAME, secname); } int populate_filings(string cache, string c_cik) { //creation variables string files[]; //Files from SEC string name; //Name of file boolean s16Check; //Placeholder boolean int rc; //Error Checker string base; //String for name storage int count; int numfiles; int max; //Get the archive list files = EDGARFetchArchiveList(c_cik); rc = GetLastError(); //Check is CIK exists if (rc == (ERROR_REMOTE | ERROR_FILE_NOT_FOUND)) { AddMessage("CIK Does Not Exist"); return; } //If too many filings if (rc == ERROR_OVERFLOW) { files = EDGARFetchArchiveList(c_cik, FALSE, 0, 0, TRUE); } //Initialize needed variables max = ArrayGetAxisDepth(files); count = 0; numfiles = 0; //Open progress bar ProgressOpen("Getting Files"); ProgressSetPhaseCount(max-1); //While there are still files to go through while (count < max) { //Update progress ProgressSetPhase(count); ProgressSetStatus(FormatString("Getting file %d of %d", count, max)); ProgressUpdate(count, max); //Get filename base = ReplaceInString(GetFilename(files[count]), ".txt", ".xml"); //Check if already downloaded if (DoesFileExist(AddPaths(cache, base)) == FALSE) { //Open archive file = EDGARArchiveOpen(files[count]); //Make sure a S16 filing s16Check = is_section_16(EDGARArchiveGetDocType(file, 0)); //If s16 if (s16Check) { if (EDGARArchiveGetDocFileType(file, 0) == FT_XML) { //Set name of XML file name = AddPaths(cache, EDGARArchiveGetProperty(file, "accession_number") + ".xml"); //Get XML file and save it locally EDGARArchiveGetDocFile(file, 0, name); //Add to the list s16filings[numfiles++] = name; } } CloseHandle(file); } else { //Add cached file to list s16filings[numfiles++] = AddPaths(cache, base); } count++; } //Close progress ProgressClose(); return ERROR_NONE; } //Get current holdings void get_current_holdings(string cache) { int count; int numholdings; int countholdings; int max; int intcount; string aord, date, ttype, dori, numtrans, numowned, secname, cik, type; string filings[][]; boolean found; //Clear the array ArrayClear(s16holdings); //Get all transactions from filings filings = CSVReadTable(cache); //Set a counter max = ArrayGetAxisDepth(filings, AXIS_ROW); count = 1; //Currently at 0 holdings numholdings = 0; //Go through all transactions while (count < max-1) { //get all of the values type = filings[count][COL_TORH]; aord = filings[count][COL_AORD]; date = filings[count][COL_TDATE]; ttype = filings[count][COL_TTYPE]; dori = filings[count][COL_DORI]; numtrans = filings[count][COL_NUMTRANS]; numowned = filings[count][COL_NUMOWNED]; cik = filings[count][COL_CIK]; secname = filings[count][COL_NAME]; //set counter to 0 countholdings = 0; //Not found in current holdings found = false; //Go through current holdings while (countholdings < numholdings && found == false) { //Same CIK? if (s16holdings[countholdings][COL_CIK] == cik) { //Same security name? if (CompareStringsNoCase(s16holdings[countholdings][COL_NAME], secname) == 0 ) { //Found already reported found = true; } } countholdings++; } //If not reported yet if (found == false) { //Set holding s16holdings[numholdings][COL_TORH] = type; s16holdings[numholdings][COL_AORD] = aord; s16holdings[numholdings][COL_TDATE] = date; s16holdings[numholdings][COL_TTYPE] = ttype; s16holdings[numholdings][COL_DORI] = dori; s16holdings[numholdings][COL_NUMTRANS] = numtrans; s16holdings[numholdings][COL_NUMOWNED] = numowned; s16holdings[numholdings][COL_CIK] = cik; s16holdings[numholdings][COL_NAME] = secname; numholdings++; } count++; } }
Note that this script is getting to be rather large (almost 500 lines of code now). We’re not going to go through every single line, but I’m going to highlight some of the changes that have been made and I’ll talk in detail about the brand new code in here.
void main() { //creation variables string cache; //Location of cache string c_cik; //CIK of Reporting Owner int count; //Counting Variable int max; c_cik = TEST_CIK; //Find the location for our cache cache = GetScriptFolder() + "Cache\\" + c_cik; //create cache CreateFolders(cache); if (IsFolder(cache) == FALSE) { AddMessage('x', "Unable to create document cache."); return; } //Populate filings array based on SEC's data populate_filings(cache, c_cik); //Parse through filings to find all transactions parse_filings(cache, c_cik); //Parse through transactions and holdings to find current number get_current_holdings(AddPaths(cache, c_cik + ".csv")); count = 0; max = ArrayGetAxisDepth(s16holdings, AXIS_ROW); int printed; printed = 1; //Print out all of our current holdings while (count < max) { if (s16holdings[count][COL_NUMOWNED] != "0") { AddMessage("%-11s %02d: CIK: %08s Name: %-50s Securities Owned: %-10s Last Reported: %10s", s16holdings[count][COL_TORH], printed++, s16holdings[count][COL_CIK], s16holdings[count][COL_NAME], s16holdings[count][COL_NUMOWNED], s16holdings[count][COL_TDATE]); } count++; } }
Our main() function is now very simple. We take our testing CIK and create a cache folder. We then call our three main action functions: populate_filings(), parse_filings(), and get_current_holdings(). Finally we go through each holding and print them out to the screen. If you were to run this script as-is you would get this:
Holding 01: CIK: 0000718877 Name: Common Stock, par value $0.000001 per share Securities Owned: 221288 Last Reported: 2018-12-28 Transaction 02: CIK: 0000021344 Name: Phantom Stock Units Securities Owned: 43003.2605 Last Reported: 2018-03-29 Holding 03: CIK: 0000021344 Name: Common Stock, $.25 Par Value Securities Owned: 18 Last Reported: 2018-03-29 Transaction 04: CIK: 0000718877 Name: Employee Stock Options Securities Owned: 190712 Last Reported: 2017-08-07 Holding 05: CIK: 0000718877 Name: Common Stock, par value $0.000001 Securities Owned: 9600 Last Reported: 2010-06-02 Holding 06: CIK: 0000718877 Name: Common Stock, par value $0.0000001 per share Securities Owned: 160610 Last Reported: 2009-11-13 Transaction 07: CIK: 0000718877 Name: Common Stock, par value $0.00001 per share Securities Owned: 4475288 Last Reported: 2007-12-24 Transaction 08: CIK: 0001011006 Name: Director Stock Option Securities Owned: 322 Last Reported: 2007-12-31 Transaction 09: CIK: 0001011006 Name: Common Stock Securities Owned: 10000 Last Reported: 2007-06-12 Transaction 10: CIK: 0001011006 Name: Stock Option ( right to buy ) Securities Owned: 119741 Last Reported: 2006-04-21 Transaction 11: CIK: 0000718877 Name: Common Stock, par value $.000001 per share Securities Owned: 3397246 Last Reported: 2005-09-15 Transaction 12: CIK: 0000913949 Name: Non-Qualified Stock Option (right to buy Securities Owned: 33332 Last Reported: 2003-09-04 Transaction 13: CIK: 0000913949 Name: Non-Qualified Stock Option (right to buy) Securities Owned: 7500 Last Reported: 2003-07-24
This is the last reported value from each combination of issuer CIK and security name in the entire history of the reporting owner’s CIK. The data can be a little messy, and we will look at cleaning this up in the next part. Likely, we could filter out any holdings older than a year as holdings are expected to be reported once a year on a form 5. However, as I discovered in my testing of this script, not everyone is filing all of their required filings, so I kept this unfiltered for now. Now let’s quickly look at some of the major highlights of each function, starting with populate_filings().
int populate_filings(string cache, string c_cik) { //creation variables string files[]; //Files from SEC string name; //Name of file boolean s16Check; //Placeholder boolean int rc; //Error Checker string base; //String for name storage int count; int numfiles; int max; //Get the archive list files = EDGARFetchArchiveList(c_cik); rc = GetLastError(); //Check is CIK exists if (rc == (ERROR_REMOTE | ERROR_FILE_NOT_FOUND)) { AddMessage("CIK Does Not Exist"); return; } //If too many filings if (rc == ERROR_OVERFLOW) { files = EDGARFetchArchiveList(c_cik, FALSE, 0, 0, TRUE); } //Initialize needed variables max = ArrayGetAxisDepth(files); count = 0; numfiles = 0; //Open progress bar ProgressOpen("Getting Files"); ProgressSetPhaseCount(max-1); //While there are still files to go through while (count < max) { //Update progress ProgressSetPhase(count); ProgressSetStatus(FormatString("Getting file %d of %d", count, max)); ProgressUpdate(count, max); //Get filename base = ReplaceInString(GetFilename(files[count]), ".txt", ".xml"); //Check if already downloaded if (DoesFileExist(AddPaths(cache, base)) == FALSE) { //Open archive file = EDGARArchiveOpen(files[count]); //Make sure a S16 filing s16Check = is_section_16(EDGARArchiveGetDocType(file, 0)); //If s16 if (s16Check) { if (EDGARArchiveGetDocFileType(file, 0) == FT_XML) { //Set name of XML file name = AddPaths(cache, EDGARArchiveGetProperty(file, "accession_number") + ".xml"); //Get XML file and save it locally EDGARArchiveGetDocFile(file, 0, name); //Add to the list s16filings[numfiles++] = name; } } CloseHandle(file); } else { //Add cached file to list s16filings[numfiles++] = AddPaths(cache, base); } count++; } //Close progress ProgressClose(); return ERROR_NONE; }
This code is basically unchanged from last week, other than the fact that we have put it into a function. There’s one change I would like to call your attention to:
//If s16 if (s16Check) { if (EDGARArchiveGetDocFileType(file, 0) == FT_XML) { //Set name of XML file name = AddPaths(cache, EDGARArchiveGetProperty(file, "accession_number") + ".xml"); //Get XML file and save it locally EDGARArchiveGetDocFile(file, 0, name); //Add to the list s16filings[numfiles++] = name; }
The changed line in here is the addition of EDGARArchiveGetDocFileType(). We use this to check to see if the Section 16 document is XML or not. In my initial version of the script I never tested a CIK that had filings from before Section 16 filings were XML based. Since the switch occurred around the year 2003 I decided that if the filing was not XML I did not care about the information inside of it. However, without this line there was a plethora of parsing errors later on that were introduced because older filings did not have the expected XML data inside of them.
void parse_filings(string cache, string c_cik) { int count, max, rc; string date, cik; string parser; //Get number of filings count = ArrayGetAxisDepth(s16filings) - 1; //Create data sheet hSheet = DataSheetCreate(2, 9); //Set the header row DataSheetSetCellText(hSheet, 0, COL_TORH, "Transaction or Holding"); DataSheetSetCellText(hSheet, 0, COL_AORD, "Acquisition or Disposition"); DataSheetSetCellText(hSheet, 0, COL_TDATE, "Transaction Date"); DataSheetSetCellText(hSheet, 0, COL_TTYPE, "Transaction Type"); DataSheetSetCellText(hSheet, 0, COL_DORI, "Direct or Indirect Ownership"); DataSheetSetCellText(hSheet, 0, COL_NUMTRANS, "Number of Securities Transacted"); DataSheetSetCellText(hSheet, 0, COL_NUMOWNED, "Number of Securities Owned"); DataSheetSetCellText(hSheet, 0, COL_CIK, "Issuer CIK"); DataSheetSetCellText(hSheet, 0, COL_NAME, "Security Name"); //Open progress bar ProgressOpen("Reading Filings"); ProgressSetPhaseCount(count); max = count; //Go through filings while (count >= 0) { //Update progress ProgressSetPhase(max-count); ProgressSetStatus(FormatString("Reading file %d of %d", max-count, max)); ProgressUpdate(max-count, max); //AddMessage("Parsing file %s", s16filings[count]); date = ""; //Create SGML parser file = SGMLCreate(); //Set parse to this file rc = SGMLSetFile(file, s16filings[count]); //Check for errors if (rc != ERROR_NONE) { AddMessage("Error on Setting File %s: 0x%08X", s16filings[count], rc); } //Get first element parser = SGMLNextElement(file); rc = GetLastError(); //Check for issuer CIK while (parser != "<periodOfReport>" && rc == ERROR_NONE) { parser = SGMLNextElement(file); rc = GetLastError(); } //Get CIK date = SGMLNextItem(file); //Check for issuer CIK while (parser != "<issuerCik>" && rc == ERROR_NONE) { parser = SGMLNextElement(file); rc = GetLastError(); } //Get CIK cik = SGMLNextItem(file); if (rc != ERROR_NONE) { AddMessage("Error on retrieving CIK: 0x%08X", rc); } //Until end of file while (rc == ERROR_NONE) { //check if element is beginning of transaction if (parser == "<nonDerivativeTransaction>" || parser == "<derivativeTransaction>") { //Add transaction add_transaction(cik); } if (parser == "<nonDerivativeHolding>" || parser == "<derivativeHolding>") { //Add transaction add_holding(cik, date); } //Next element parser = SGMLNextElement(file); rc = GetLastError(); } //Done with SGML parser CloseHandle(file); //Next file count--; } //Export data to CSV rc = DataSheetExport(hSheet, AddPaths(cache, c_cik + ".csv")); if (rc != ERROR_NONE) { AddMessage("Error on Creating CSV: 0x%08X", rc); } CloseHandle(hSheet); //AddMessage("Finished Parsing files from CIK: %s", TEST_CIK); }
The function parse_filings() has some new functionality from last time. Our previous version of the script only looked for transactions in filings. Now we are looking for and recording holding rows as well as transactions. As a result we now have two functions we can call from parse_filings(): add_transaction() and add_holding(). These functions are similar to each other, but the tags they look for are different as different information is contained within transactions and holdings. Another difference from last time is that we now have another column to store whether rows are transactions or holdings. This column in the CSV is the first column in the list, and so we have incremented the number on all of the columns that existed previously. Finally, since holdings are not guaranteed to have a date listed with them we find the period of the report, and if we are adding a holding we use that date to record the holding.
void add_holding(string issuer, string date) { string aord, ttype, dori, numtrans, numowned, secname; string parser; int rc; //Initialize variables dori = ""; numowned = ""; secname = ""; //Get next element parser = SGMLNextElement(file); //Check for end of the transaction while (parser != "</nonDerivativeHolding>" && parser != "</derivativeHolding>" && rc == ERROR_NONE){ if (parser == "<securityTitle>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } secname = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ secname += SGMLNextItem(file); } secname = ReplaceInString(secname, "</value>", ""); } if (parser == "<sharesOwnedFollowingTransaction>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } numowned = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ numowned += SGMLNextItem(file); } numowned = ReplaceInString(numowned, "</value>", ""); } if (parser == "<directOrIndirectOwnership>"){ while (parser != "<value>") { parser = SGMLNextItem(file); } dori = SGMLNextItem(file); while (SGMLGetItemType(file) != SPI_TYPE_TAG){ dori += SGMLNextItem(file); } dori = ReplaceInString(dori, "</value>", ""); } parser = SGMLNextElement(file); rc = GetLastError(); } //Insert a row underneath the header row DataSheetRowInsert(hSheet, 1); //Insert all of the data DataSheetSetCellText(hSheet, 1, COL_TORH, "Holding"); DataSheetSetCellText(hSheet, 1, COL_TDATE, date); DataSheetSetCellText(hSheet, 1, COL_DORI, dori); DataSheetSetCellText(hSheet, 1, COL_NUMOWNED, numowned); DataSheetSetCellText(hSheet, 1, COL_CIK, issuer); DataSheetSetCellText(hSheet, 1, COL_NAME, secname); }
This code looks almost identical to the add_transaction() code from last time, and that is because it uses the same logic. However, the items that it looks for are different. Holdings contain a great deal less information that transactions, so we instead only care about the number of securities owned, the issuer, and the name of the security. The date is not listed in a row, so we get the date out of the period of report, passed to the function when it is called in parse_filings(). The add_transaction() function is unchanged from last week.
//Get current holdings void get_current_holdings(string cache) { int count; int numholdings; int countholdings; int max; int intcount; string aord, date, ttype, dori, numtrans, numowned, secname, cik, type; string filings[][]; boolean found; //Clear the array ArrayClear(s16holdings); //Get all transactions from filings filings = CSVReadTable(cache); //Set a counter max = ArrayGetAxisDepth(filings, AXIS_ROW); count = 1; //Currently at 0 holdings numholdings = 0;
Here is the beginning of our new function. We get passed the location of the cache, and then we get everything ready to find our current holdings. We initialize a bunch of variables. I decided to read all of the values from each row into separate variables. I want to quickly touch on why I did it this way and that is because it is my preference to make code as readable as possible. Since we end up comparing multiple table entries to each other I decided this coding method was easier to read and made it easier to verify I was doing the correct thing. While this ends up being less efficient, the peace of mind of being able to easily understand what is going on makes this decision worth it.
//Go through all transactions while (count < max-1) { //get all of the values type = filings[count][COL_TORH]; aord = filings[count][COL_AORD]; date = filings[count][COL_TDATE]; ttype = filings[count][COL_TTYPE]; dori = filings[count][COL_DORI]; numtrans = filings[count][COL_NUMTRANS]; numowned = filings[count][COL_NUMOWNED]; cik = filings[count][COL_CIK]; secname = filings[count][COL_NAME]; //set counter to 0 countholdings = 0; //Not found in current holdings found = false; //Go through current holdings while (countholdings < numholdings && found == false) { //Same CIK? if (s16holdings[countholdings][COL_CIK] == cik) { //Same security name? if (CompareStringsNoCase(s16holdings[countholdings][COL_NAME], secname) == 0 ) { //Found already reported found = true; } } countholdings++; }
Now we go through each reported row from all of our filings. These rows are in chronological order from newest to oldest. This means that we can take the first row and we know that it is a current holding. We then take each row after that and compare it to each current holding. If the issuer CIK is the same we check the security name, and if that is the same we know it is an older transaction for a current holding.
//If not reported yet if (found == false) { //Set holding s16holdings[numholdings][COL_TORH] = type; s16holdings[numholdings][COL_AORD] = aord; s16holdings[numholdings][COL_TDATE] = date; s16holdings[numholdings][COL_TTYPE] = ttype; s16holdings[numholdings][COL_DORI] = dori; s16holdings[numholdings][COL_NUMTRANS] = numtrans; s16holdings[numholdings][COL_NUMOWNED] = numowned; s16holdings[numholdings][COL_CIK] = cik; s16holdings[numholdings][COL_NAME] = secname; numholdings++; } count++; } }
If the current row makes it through all of the holdings and is not matched, it is a current holding and it gets added to the list. We then continue onward and check all of the rows in our filings the same way. This leaves us with the list of newest unique holdings that gets printed out by our main function.
And here we are. At this point all of our main functionality is complete: we take a CIK and turn it into a list of holdings. The next step from here will be to add everything into an interface that an end user can use, rather than forcing us to put a CIK into a script and run it manually. That will be our goal for the next part in this series.
Joshua Kwiatkowski is a developer at Novaworks, primarily working on Novaworks’ cloud-based solution, GoFiler Online. He is a graduate of the Rochester Institute of Technology with a Bachelor of Science degree in Game Design and Development. He has been with the company since 2013. |
Additional Resources