March 2012

Convert clipboard contents into HTML

(and preserve any formatting)

I wrote a bit of AppleScript (to Find Safari Tabs) and got a bit obsessive-compulsive formatting the HTML to match the AppleScript Editor window. Ridiculous by hand. So when I wrote another script (Find Terminal Sessions), I decided to find an easier way.

First, I just copied the text from AppleScript Editor, pasted it into TextEdit (which preserves all the bolding and pretty colors), and saved it as HTML. Having just been writing AppleScript, I naturally turned this into an automated AppleScript program, which is below, but not immediately. AppleScript is cute and fun, but it's kind of an ugly hack to use a GUI app (e.g. TextEdit) for command-line stuff. So I poked around and realized it could be done very simply and effectively as a shell script.

#!/bin/sh

osascript -e 'the clipboard as «class RTF »' | \
    perl -ne 'print chr foreach unpack("C*",pack("H*",substr($_,11,-3)))' | \
    textutil -stdin -stdout -convert html -format rtf
(Thanks to an answer on Stack Overflow for a big piece of this.)

It uses command-line AppleScript (osascript) to grab the clipboard in hex-encoded rich text (RTF) format, uses perl to decode the hex encoding, and then uses textutil to convert the RTF to HTML. If I were really industrious I'd throw in a command-line option that would push the HTML back into the clipboard. But for my purposes I generally just send the output to a file (e.g. "clipboard2html > savefile.html"), and then edit the file.

Of course, some people may just want to see the AppleScript, or even to use it. Ideally if you wanted a more clickable way to run this, you'd rewrite my very short shell script into AppleScript. But the curious might want to see what my original AppleScript looked like. It's below.

-- remember the frontmost app so we can switch back to it at the end
set currapp to path to frontmost application as Unicode text

-- was TextEdit already running before we started?
tell application "System Events" to set TErunning to exists (processes where name is "TextEdit")

tell application "TextEdit"
    activate
    delay 0.3
    -- if TextEdit was already running, open a fresh document to paste into
    if TErunning then
        make new document
    end if
    delay 0.3
    -- paste the clipboard via UI scripting (Command-V)
    tell application "System Events"
        keystroke "v" using {command down}
    end tell
    set name of document of first window to "tempclip"
    save document of first window in "Macintosh HD:tmp:tempclip.html"
    close document of first window saving no
    --and put it back in the clipboard as raw html?
    try
        set q to display dialog "HTML is in /tmp/tempclip.html.  Load into clipboard?"
        if button returned of q is equal to "OK" then
            set fh to (open for access (POSIX file "/tmp/tempclip.html"))
            set txt to (read fh for (get eof fh))
            close access fh
            set the clipboard to txt
        end if
    end try
    -- only quit TextEdit if this script was what launched it
    if not TErunning then
        quit
    end if
end tell

-- switch back to whatever app was in front
tell application currapp
    activate
end tell




Reader Comments (Experimental. Moderated, expect delays. Posts may be edited or ignored. I reserve the right to remove any or all comments, at any time.)

1 comment:

At 1969/12/31 19:00
Ken wrote:

Belatedly, you could use the following to make the TextEdit document:

make new document with properties {text:the clipboard as «class RTF »}

It assumes that the formatted text is on the clipboard (i.e. with an AppleScript, select all and copy), and that TextEdit is in rich text mode. I try to avoid using any UI scripting if I can.

I like the shell script and will have to try and figure out how the perl component works. Thanks.

End Comments


Tom Fine's Home Send Me Email