Friday, March 30, 2018
LDC #78: A Snapshot of the Clipboard
Today we’re going to talk about one of the most commonly used but least commonly understood features of your computer: the clipboard. Everyone who regularly uses a computer knows how much of a time-saver copy and paste can be, and the value only goes up the more you use it. As a programmer, I can’t even measure the amount of time that I save by moving code around, and I’m sure I’m not alone when I say that. However, the inner workings of the clipboard are often misunderstood. Today I will explain what the clipboard is and how to interact with it using Legato.
The clipboard is a global object that is used to transfer data between programs and windows. When you cut or copy data, you are placing that data into the global clipboard object. What is hidden underneath the surface, however, is that a lot more data than you might expect is being stored. That data is also being stored in many different formats. It is up to the program into which you are pasting your information to decipher the data that is on the clipboard and decide which formats, if any, it can use.
Before we dive into examples of what gets placed onto the clipboard, we need to first talk about the different kinds of formats that exist. Clipboard formats fall into two categories: conventional and de facto. No matter the category, a format code will start with “cf_” followed by the name of the format. Conventional formats are defined by Windows and are referenced in all caps. These are the formats that are specifically defined in the Windows SDK. Every program built for Windows should place things into conventional formats. These formats include CF_TEXT and CF_UNICODETEXT. De facto formats are defined by an application. They will have a name if given one. Otherwise they will be named “cf_” followed by the registered token in hexadecimal. Registered format tokens reset each time the operating system restarts, which means that they are unlikely to be the same from computer to computer. More information, including a list of standard formats, can be found in the Legato Documentation.
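To make the fallback naming concrete, here is a minimal sketch of how a code such as cf_0000c355 (which shows up in the Word example later in this post) can be derived from a registered token. It is written in Python rather than Legato so that it can be run anywhere, and registered_format_code is a hypothetical helper, not a Legato function. The eight-digit lowercase form is inferred from that one example.

```python
# Hypothetical helper (not part of Legato): build the fallback "cf_" code
# for a registered clipboard format from its numeric token.
def registered_format_code(token: int) -> str:
    # Windows hands out registered format tokens in the range 0xC000-0xFFFF.
    if not 0xC000 <= token <= 0xFFFF:
        raise ValueError("not a registered clipboard format token")
    # Eight lowercase hex digits, matching codes such as cf_0000c355.
    return "cf_%08x" % token


print(registered_format_code(0xC355))  # cf_0000c355
```

Because the token is assigned at registration time, the same application format can end up with a different code after a restart or on another machine, which is why the format name is the more reliable thing to match on.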
Now let’s take a look at the data that gets added to the clipboard in two quick examples. These examples will use the output from our script below. First let’s examine what happens in GoFiler when you copy code from the Legato IDE:
00: Format Code: CF_TEXT Format Name: ANSI Text Description: ANSI Text Size: 259
01: Format Code: cf_control Format Name: NWS Control Description: Novaworks Application Size: 126
02: Format Code: CF_LOCALE Format Name: Locale (language) Description: Locale (language) Size: 004
03: Format Code: CF_OEMTEXT Format Name: OEM Text Description: OEM Text Size: 259
04: Format Code: CF_UNICODETEXT Format Name: Unicode Text Description: Unicode Text Size: 518
GoFiler adds two items to the clipboard: ANSI text and control data. Windows then adds three more items to the clipboard: OEM text, Unicode text, and Locale information. Windows does this with all textual data so that OEM, ANSI, and Unicode formats are always available together.
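The sizes in the listing hint at how these conversions relate to one another: the Unicode entry is exactly twice the size of the ANSI and OEM entries. The following Python sketch (not Legato) reproduces that arithmetic. It rests on two assumptions that the listing does not confirm: that each reported size counts a terminating null character, and that the ANSI and OEM code pages are Windows-1252 and 437. The real code pages depend on the system locale.

```python
# Sketch: estimate the byte size of each text format for a given string.
# Assumptions: sizes include a terminating null; ANSI = cp1252, OEM = cp437.
def text_format_sizes(text: str) -> dict:
    return {
        "CF_TEXT": len(text.encode("cp1252")) + 1,            # 1-byte null
        "CF_OEMTEXT": len(text.encode("cp437")) + 1,          # 1-byte null
        "CF_UNICODETEXT": len(text.encode("utf-16-le")) + 2,  # 2-byte null
    }


# 258 plain ASCII characters stand in for the copied code.
sizes = text_format_sizes("x" * 258)
print(sizes)
```

For plain ASCII text every character takes one byte in the ANSI and OEM formats and two bytes in UTF-16, so 258 characters give 259, 259, and 518 bytes, the same figures as in the listing.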
Next we’re going to change things up by looking at a much different example, Microsoft Word. At first you may wonder how different the clipboard could be when both examples are just text.
00: Format Code: cf_data_object Format Name: DataObject Description: Data Object Size: 004
01: Format Code: cf_object_desc Format Name: Object Descriptor Description: Object Descriptor Size: 140
02: Format Code: cf_rtf Format Name: Rich Text Format Description: Rich Text Format (RTF) Size: 41822
03: Format Code: cf_html Format Name: HTML Format Description: HTML Structured Data Size: 39057
04: Format Code: CF_TEXT Format Name: ANSI Text Description: ANSI Text Size: 033
05: Format Code: CF_UNICODETEXT Format Name: Unicode Text Description: Unicode Text Size: 066
06: Format Code: CF_ENHMETAFILE Format Name: Enhanced Meta File (image) Description: Enhanced Meta File (image) Size: 000
07: Format Code: CF_METAFILEPICT Format Name: Meta File (image) Description: Meta File (image) Size: 016
08: Format Code: cf_embed_src Format Name: Embed Source Description: Embedded Source Size: 13271
09: Format Code: cf_native Format Name: Native Description: Application Native Size: 13270
10: Format Code: cf_owner_link Format Name: OwnerLink Description: Ownerlink Size: 039
11: Format Code: cf_link_src Format Name: Link Source Description: Link Source Size: 132
12: Format Code: cf_link_src_desc Format Name: Link Source Descriptor Description: Link Source Descriptor Size: 140
13: Format Code: cf_object_link Format Name: ObjectLink Description: Object Link Size: 037
14: Format Code: cf_0000c355 Format Name: HyperlinkWordBkmk Description: Not Known Size: 040
15: Format Code: cf_ole Format Name: Ole Private Data Description: OLE Data Size: 440
16: Format Code: CF_LOCALE Format Name: Locale (language) Description: Locale (language) Size: 004
17: Format Code: CF_OEMTEXT Format Name: OEM Text Description: OEM Text Size: 033
It turns out that when you copy text from Word you end up with 18 different formats on the clipboard! How could that be? Well, let’s dig in a little bit and see. Four are easy to explain; they’re the same ANSI text, Unicode text, OEM text, and locale formats that are placed on the clipboard when copying Legato code. In addition there are two images on the clipboard, a Meta File and an Enhanced Meta File. The data is then also stored as HTML and RTF. Finally, a bunch of underlying metadata comprises the rest of the formats. All of these metadata formats fall into the de facto category we discussed earlier, and presumably they are used by Word (and other Microsoft programs) when you paste elsewhere.
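The HTML entry deserves a closer look, because the Windows “HTML Format” is not bare HTML. The data begins with a short plain-text header of byte offsets (StartHTML, EndHTML, StartFragment, EndFragment) that tell the receiving program where the copied fragment sits inside the surrounding markup. The Python sketch below (not Legato) builds such a payload and reads the fragment back out. The function names are hypothetical, and how Legato’s own HTML functions expose these pieces is not covered here.

```python
import re

START_MARK = "<!--StartFragment-->"
END_MARK = "<!--EndFragment-->"

HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
)


def build_cf_html(fragment: str) -> bytes:
    """Wrap an HTML fragment in the Windows "HTML Format" layout."""
    prefix = "<html><body>" + START_MARK
    suffix = END_MARK + "</body></html>"
    # Offsets are byte positions in the UTF-8 payload, counted from byte 0.
    # The fixed-width fields keep the header length constant.
    start_html = len(HEADER.format(0, 0, 0, 0).encode("utf-8"))
    start_frag = start_html + len(prefix.encode("utf-8"))
    end_frag = start_frag + len(fragment.encode("utf-8"))
    end_html = end_frag + len(suffix.encode("utf-8"))
    header = HEADER.format(start_html, end_html, start_frag, end_frag)
    return (header + prefix + fragment + suffix).encode("utf-8")


def extract_fragment(payload: bytes) -> str:
    """Read the fragment offsets from the header and slice the payload."""
    head = payload[:200].decode("ascii", errors="replace")
    start = int(re.search(r"StartFragment:(\d+)", head).group(1))
    end = int(re.search(r"EndFragment:(\d+)", head).group(1))
    return payload[start:end].decode("utf-8")


payload = build_cf_html("<b>caf\u00e9</b>")
print(extract_fragment(payload))
```

The header and wrapper markup add some overhead to every copy, although most of the size of Word’s HTML entry presumably comes from the styling it writes around the text.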
So now that we understand more about what the clipboard is, let’s take a look at how we can edit it. Legato gives you full control over the clipboard object, including reading from it and writing to it. In order to read off of the clipboard, we first have to get the handle to the clipboard. There are two different ways of doing this: the ClipboardCreate and the ClipboardOpen functions. Both return a handle to the clipboard. However, the ClipboardCreate function will clear the clipboard while the ClipboardOpen function will not. Whether you are looking at editing the information on the clipboard or just putting new information on the clipboard will influence which function you choose.
The next four sections can be broken down into: Reading Format Data, Checking Data, Getting Data, and Setting Data.
There are a number of functions that relate to retrieving clipboard format data:
string = ClipboardGetApplication ( [handle hClipboard] );
dword = ClipboardGetFormatCode ( [handle hClipboard], string name );
string = ClipboardGetFormatDescription ( [handle hClipboard], dword format | string code );
string = ClipboardGetFormatName ( [handle hClipboard], dword format | string code );
int = ClipboardGetFormatSize ( [handle hClipboard], dword format | string code );
string[] = ClipboardGetFormats ( [handle hClipboard] );
Using these functions we can retrieve what is essentially metadata about the data that is currently stored on the clipboard. We can use these in conjunction with our Checking Data functions (below) to get a clear picture of what we can do with the data on the clipboard:
boolean = ClipboardIsCSVAvailable ( );
boolean = ClipboardIsDIBAvailable ( );
boolean = ClipboardIsGIFAvailable ( );
boolean = ClipboardIsHTMLAvailable ( );
boolean = ClipboardIsImageAvailable ( );
boolean = ClipboardIsJPGAvailable ( );
boolean = ClipboardIsPNGAvailable ( );
boolean = ClipboardIsRTFAvailable ( );
boolean = ClipboardIsTextAvailable ( );
boolean = ClipboardIsUnicodeAvailable ( );
Once we have a clear picture of what is on the clipboard, we can retrieve any of the data from the clipboard using the Getting Data functions:
handle = ClipboardGetData ( [handle hClipboard], dword format | string code );
string[][] = ClipboardGetCSVData ( [handle hClipboard] );
string = ClipboardGetCSVText ( [handle hClipboard] );
handle = ClipboardGetDIB ( [handle hClipboard] );
handle = ClipboardGetGIF ( [handle hClipboard] );
string = ClipboardGetHTML ( [handle hClipboard], [int mode] );
string[] = ClipboardGetHTMLComponents ( string data );
handle = ClipboardGetJPG ( [handle hClipboard] );
handle = ClipboardGetPNG ( [handle hClipboard] );
string = ClipboardGetRTF ( [handle hClipboard] );
string = ClipboardGetText ( [handle hClipboard], [boolean utf] );
string = ClipboardGetUnicode ( [handle hClipboard] );
Finally we have the functions that we can use to change the data on the clipboard, our Setting Data functions:
int = ClipboardSetCSV ( handle hClipboard, string data | string [][] data | handle hPool );
int = ClipboardSetHTML ( handle hClipboard, string data | handle hPool, [boolean raw] );
int = ClipboardSetHTML ( handle hClipboard, string data | handle hPool, [string header], [string footer] );
int = ClipboardSetRTF ( handle hClipboard, string data | handle hPool );
int = ClipboardSetText ( handle hClipboard, string data | handle hPool );
int = ClipboardSetUnicode ( handle hClipboard, wstring data | handle hPool );
Let’s take a look at a couple of quick examples of these functions in action. I wrote the first example to show as many of these clipboard operations as possible. Here’s the code:
void setup() {
    // Run check_copy whenever the Paste menu function is used.
    MenuSetHook("EDIT_PASTE", GetScriptFilename(), "check_copy");
}

void main() {
    setup();
}

int check_copy(int f_id, string mode) {
    handle hBoard;
    string sClip;
    string list[];
    int ix;
    int size;
    string s1;
    handle hLog;
    handle hData;

    // Only act during the preprocess call of the hook.
    if (mode != "preprocess") {
        return ERROR_NONE;
    }

    // Open the clipboard without clearing it.
    hBoard = ClipboardOpen();
    sClip = ClipboardGetApplication(hBoard);
    hLog = LogCreate("Clipboard Data");
    list = ClipboardGetFormats(hBoard);
    size = ArrayGetAxisDepth(list);
    ix = 0;

    // Report every format that is currently on the clipboard.
    s1 = "Clipboard Formats";
    AddMessage(hLog, s1);
    LogIndent(hLog);
    while (ix < size) {
        s1 = FormatString("%02d: Format Code: %-15s Format Name: %-20s Description: %-20s Size: %03d",
                          ix, ArrayGetKeyName(list, ix), list[ix],
                          ClipboardGetFormatDescription(hBoard, ArrayGetKeyName(list, ix)),
                          ClipboardGetFormatSize(hBoard, ArrayGetKeyName(list, ix)));
        AddMessage(hLog, s1);
        ix++;
    }
    LogOutdent(hLog);

    // Log the first 500 characters of each textual format that is available.
    s1 = "Clipboard Data";
    AddMessage(hLog, s1);
    LogIndent(hLog);
    if (ClipboardIsCSVAvailable()) {
        s1 = "CSV";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetCSVText(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsHTMLAvailable()) {
        s1 = "HTML";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetHTML(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsRTFAvailable()) {
        s1 = "RTF";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetRTF(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsTextAvailable()) {
        s1 = "Text";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetText(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsUnicodeAvailable()) {
        s1 = "Unicode";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(UnicodeToAnsi(ClipboardGetUnicode(hBoard)), 0, 500));
        LogOutdent(hLog);
    }
    LogDisplay(hLog, "Clipboard Data");

    // Release the clipboard so other programs can use it.
    CloseHandle(hBoard);
    return ERROR_NONE;
}
We can break this down pretty easily into the sections that I laid out above, and we’ll explore that shortly. First, though, this is a script that we’re hooking into the Paste function, so any time a user pastes after this script has been run, the hook function will run. Because this is an example, I didn’t add a way to unhook the function, so you will have to restart GoFiler when you’re done. The first portion of the script creates the hook and sets up the log:
void setup() {
    MenuSetHook("EDIT_PASTE", GetScriptFilename(), "check_copy");
}

void main() {
    setup();
}

int check_copy(int f_id, string mode) {
    handle hBoard;
    string sClip;
    string list[];
    int ix;
    int size;
    string s1;
    handle hLog;
    handle hData;

    if (mode != "preprocess") {
        return ERROR_NONE;
    }

    hBoard = ClipboardOpen();
    sClip = ClipboardGetApplication(hBoard);
    hLog = LogCreate("Clipboard Data");
    list = ClipboardGetFormats(hBoard);
    size = ArrayGetAxisDepth(list);
    ix = 0;
    s1 = "Clipboard Formats";
    AddMessage(hLog, s1);
    LogIndent(hLog);
After creating the hook we get the current clipboard. We don’t want to hold the handle forever, so we get the current clipboard each time that the hook is run, and we’ll close that handle before we leave the function. This allows other programs to access the clipboard. We then get the application that put the data onto the clipboard. Next we set up to report all of the data that we can about the clipboard to the user. We create a Log object, get the formats from the clipboard, and obtain the depth of the array so we have the information needed to run our loop. Next we add a heading to the log and prepare the log for more information by indenting it.
    while (ix < size) {
        s1 = FormatString("%02d: Format Code: %-15s Format Name: %-20s Description: %-20s Size: %03d",
                          ix, ArrayGetKeyName(list, ix), list[ix],
                          ClipboardGetFormatDescription(hBoard, ArrayGetKeyName(list, ix)),
                          ClipboardGetFormatSize(hBoard, ArrayGetKeyName(list, ix)));
        AddMessage(hLog, s1);
        ix++;
    }
    LogOutdent(hLog);
We start a loop to go through all of the formats on the clipboard. For each format, we gather the number, Format Code, Format Name, Format Description, and size. Then we put that information into the log. By the end of this, our log will look something like this, depending on what data is on the clipboard and where it came from:
Clipboard Formats
00: Format Code: CF_TEXT Format Name: ANSI Text Description: ANSI Text Size: 415
01: Format Code: cf_control Format Name: NWS Control Description: Novaworks Application Size: 126
02: Format Code: CF_LOCALE Format Name: Locale (language) Description: Locale (language) Size: 004
03: Format Code: CF_OEMTEXT Format Name: OEM Text Description: OEM Text Size: 415
04: Format Code: CF_UNICODETEXT Format Name: Unicode Text Description: Unicode Text Size: 830
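The zero-filled numbers and fixed-width columns in each log line come from the printf-style specifiers passed to FormatString. Assuming FormatString follows the usual C conventions for these specifiers (the log output is consistent with that, but it is an assumption), the same format string can be tried with Python’s % operator. This is a Python sketch, not Legato, and the two sample rows are taken from the listings earlier in this post.

```python
# The format string from the script, used unchanged with Python's % operator.
LINE = "%02d: Format Code: %-15s Format Name: %-20s Description: %-20s Size: %03d"

rows = [
    (2, "CF_LOCALE", "Locale (language)", "Locale (language)", 4),
    (2, "cf_rtf", "Rich Text Format", "Rich Text Format (RTF)", 41822),
]

for row in rows:
    # %-15s and %-20s pad on the right to a minimum width;
    # %02d and %03d zero-fill on the left to a minimum width.
    print(LINE % row)
```

The widths are minimums, not limits. A size of 4 is printed as 004, while 41822 is printed in full, and a description longer than 20 characters simply pushes the rest of the line to the right.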
Let’s take a look at the next block of code:
    s1 = "Clipboard Data";
    AddMessage(hLog, s1);
    LogIndent(hLog);
    if (ClipboardIsCSVAvailable()) {
        s1 = "CSV";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetCSVText(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsHTMLAvailable()) {
        s1 = "HTML";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetHTML(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsRTFAvailable()) {
        s1 = "RTF";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetRTF(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsTextAvailable()) {
        s1 = "Text";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(ClipboardGetText(hBoard), 0, 500));
        LogOutdent(hLog);
    }
    if (ClipboardIsUnicodeAvailable()) {
        s1 = "Unicode";
        AddMessage(hLog, s1);
        LogIndent(hLog);
        AddMessage(hLog, "%s", GetStringSegment(UnicodeToAnsi(ClipboardGetUnicode(hBoard)), 0, 500));
        LogOutdent(hLog);
    }
    LogDisplay(hLog, "Clipboard Data");
Here we ask the clipboard if data is available, and if it is, the function adds the data to the log along with what kind of data it is. We go through all of the textual data and not the image data because GoFiler’s logs do not support having images added to them. Most of these checks look the same, except you’ll notice there is an extra step using the UnicodeToAnsi function on our Unicode check. This is because Unicode support is currently very limited in GoFiler, and the logs do not yet support having Unicode added to them. Rather than cause errors, we do a quick conversion to show the user that there is Unicode data available on the clipboard. Our last step is to tell GoFiler to display the log in the Information View.
    CloseHandle(hBoard);
    return ERROR_NONE;
}
Finally we do some very quick cleanup work by closing the handle to the clipboard and returning without an error. This function never returns an error code because we don’t care about stopping the paste function in this script; we only care about gaining information about what is on the clipboard.
Now that we have looked through all of the different options with the clipboard, I’ll walk you through another script I wrote, this time for a practical security application. Some organizations have extremely strict rules about what you can do on their network computers. Of these, some even restrict what you can do with the clipboard. So I wrote a quick script to help an organization secure the data that they put into GoFiler by restricting the paste function to only allow data coming from known programs (Microsoft Office, Notepad, Google Chrome, Mozilla Firefox, Adobe Acrobat, GoFiler, and a few others). To do this, I rely upon the information we get from the ClipboardGetApplication function. This function returns a string that includes the name of the application if the application is in the known list but otherwise returns “Unknown Source (program.exe)” if the application is not in the known list. Let’s go through the script:
void setup() {
    // Run check_copy whenever the Paste menu function is used.
    MenuSetHook("EDIT_PASTE", GetScriptFilename(), "check_copy");
}

void main() {
    setup();
}

int check_copy(int f_id, string mode) {
    handle hBoard;
    string sClip;

    // Only act during the preprocess call of the hook.
    if (mode != "preprocess") {
        return ERROR_NONE;
    }

    hBoard = ClipboardOpen();
    sClip = ClipboardGetApplication(hBoard);

    // Block the paste when the data came from an unrecognized program.
    if (IsInString(sClip, "Unknown Source")) {
        MessageBox("You cannot paste from an unauthorized application.\r\n%s", sClip);
        CloseHandle(hBoard);
        return ERROR_EXIT;
    }

    CloseHandle(hBoard);
    return ERROR_NONE;
}
Wow, doesn’t that look familiar? Sure enough, I used the same base as the previous script but tweaked the ending. We get the application information from the clipboard and this time, rather than just printing it out, we check to see if the clipboard data is from an unknown source. If it is, we inform the user that they cannot paste and return an ERROR_EXIT code, stopping GoFiler from finishing the paste function.
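The decision itself is just a substring test on the application string. Here is a small Python sketch (not Legato) of the same logic, with one extra step that pulls the executable name out of the parentheses, which could be handy for an audit message. The helper names are hypothetical, “Microsoft Word” is only a stand-in for whatever a known application actually reports, and the pattern assumes the “Unknown Source (program.exe)” shape described above.

```python
import re

# Assumed shape for unrecognized programs: "Unknown Source (program.exe)".
UNKNOWN = re.compile(r"Unknown Source \((?P<exe>[^)]*)\)")


def paste_allowed(source: str) -> bool:
    """Mirror the script's check: block anything reported as unknown."""
    return "Unknown Source" not in source


def unknown_program(source: str):
    """Return the executable name from an unknown-source string, else None."""
    match = UNKNOWN.search(source)
    return match.group("exe") if match else None


for source in ("Microsoft Word", "Unknown Source (program.exe)"):
    print(source, paste_allowed(source), unknown_program(source))
```

Note that this approach trusts the known list: any program that is recognized is allowed, so the list itself is what defines the policy.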
Hopefully I’ve been able to give you a new appreciation for all the work that goes on behind the scenes while you are using your computer, even when doing something as simple as copying and pasting information. We have talked about what information is stored on the clipboard and how that information is represented in the clipboard object. We have also talked about all the different functions that you can use in Legato to read and change the information on the clipboard. After getting through all of this, I have one more important piece of advice: if you are going to change the data on the clipboard, make sure that it is obvious to the user what you are doing. The last thing you want to do is leave the user feeling frustrated and confused by the data on their clipboard changing unexpectedly.
Now, my friends, go forth and clip!
Joshua Kwiatkowski is a developer at Novaworks, primarily working on Novaworks’ cloud-based solution, GoFiler Online. He is a graduate of the Rochester Institute of Technology with a Bachelor of Science degree in Game Design and Development. He has been with the company since 2013.