27 February 2008


Ah, the wonders of an accessible command line! To think that some meat-head boasted about removing it from his OS? Anyway...

Was given the task of preparing some photos to be inspected & maybe used by an Eastern States company, one of the requirements being a photo release form, which was only available in PDF format.

So... fired up pdftohtml & now had an editable (not pretty, but close enough) HTML version of it. Easiest way to edit it was OpenOffice Writer, so done.

Difficulty was going to be getting mentions of each photo written into the form, so I left it saved as HTML, with a keyword plonked into the doc at the point where photo refs were going to appear, in a paragraph by itself. Now I can use gawk to search for that keyword & replace it with a table of image files (system “cat”).
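
A minimal sketch of that keyword swap, for the curious. The keyword PHOTO_TABLE_HERE, the function name and the file names are all made up for illustration, not the ones actually used. It's written for gawk but uses nothing gawk-specific, so plain awk should do as well.

```shell
# fill_form TEMPLATE TABLE_FRAGMENT
# Prints TEMPLATE, replacing any line that contains the (hypothetical)
# keyword PHOTO_TABLE_HERE with the contents of TABLE_FRAGMENT.
# Assumes the fragment path contains no double quotes.
fill_form() {
    awk -v table="$2" '
        /PHOTO_TABLE_HERE/ {
            fflush()                      # keep earlier output ahead of cat
            system("cat \"" table "\"")   # drop the table in where the keyword was
            next                          # and swallow the keyword line itself
        }
        { print }                         # every other line passes through
    ' "$1"
}
```

Usage would be something like `fill_form release.html photo_table.html > release_filled.html`, with the table fragment generated beforehand.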

So far, so good, but the image names weren’t regarded as distinctive enough by themselves, so out came identify (an ImageMagick tool) & md5sum so now each image has distinctive identification.
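
Roughly how one table row per image could be built, as a sketch only. The function names and the row layout are my own invention here; the `identify -format` escapes (%w and %h for width and height) and md5sum are the real tools.

```shell
# image_digest FILE: just the MD5 hex digest of the file's contents.
image_digest() {
    md5sum < "$1" | cut -d' ' -f1
}

# image_size FILE: pixel dimensions as WIDTHxHEIGHT (needs ImageMagick).
image_size() {
    identify -format '%wx%h' "$1"
}

# image_row FILE: one HTML table row with name, dimensions and digest.
# The column layout is an assumption, not the original form's.
image_row() {
    printf '<tr><td>%s</td><td>%s</td><td>%s</td></tr>\n' \
        "$(basename "$1")" "$(image_size "$1")" "$(image_digest "$1")"
}
```

Looping `image_row` over a directory of JPEGs and wrapping the result in `<table>` tags gives the fragment that replaces the keyword.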

The next problem is HTML: while it displays just fine in someone’s browser, it also includes my signature as a raw image file. Not safe. So I dug around & found a PHP script (dompdf) which churns out PDFs from HTML. Fine idea, but it depended on a couple of libraries I don’t have handy, the easiest correction being to break out the scalpel & remove the related sections of code. Done. Yes, I know that it’s still feasible to extract the image from a PDF (how else would it print out?) but it’s hard enough to stop the vast majority of readers.

Then I added a simple little script (32 sparse lines) to make thumbnails & display each section in a very simple index page, & another to churn out archives of the primary image files so the crew from the “largely unexplored” sections of Oz could grab a whole section at a time. tar plus gzip doesn’t beat proprietary-format zip noticeably for JPEG files, but bzip2 won by a few percent. BASH variable-processing features make selection & the like dead simple.
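
The flavour of those two scripts, heavily cut down. This is a sketch under assumptions: the thumbs/ subdirectory, the _thumb suffix, the 160x160 size and the function names are all placeholders, and I'm assuming ImageMagick's convert for the thumbnails. The `${var##*/}` and `${var%.*}` expansions are the variable-processing features that do the selection work.

```shell
# thumb_name FILE: where FILE's thumbnail goes, relative to its section.
thumb_name() {
    base=${1##*/}                                # drop the directory part
    printf 'thumbs/%s_thumb.jpg\n' "${base%.*}"  # drop the extension
}

# make_thumbs SECTION_DIR: one thumbnail per JPEG (needs ImageMagick).
make_thumbs() {
    mkdir -p "$1/thumbs"
    for f in "$1"/*.jpg; do
        [ -e "$f" ] || continue                  # no JPEGs: skip the literal pattern
        convert "$f" -thumbnail 160x160 "$1/$(thumb_name "$f")"
    done
}

# archive_section SECTION_DIR: bzip2-compressed tar of the primary JPEGs
# only, written beside the section directory as SECTION.tar.bz2.
archive_section() (
    dir=${1%/}                                   # tolerate a trailing slash
    cd "$dir" && tar -cjf "../${dir##*/}.tar.bz2" ./*.jpg
)
```

Running the archive step in a subshell keeps the `cd` from leaking into the rest of the script.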

Then I reclined in the luxury of mod-cons: pushing 140MB to the web-server took only a few minutes (at about 1.5 real megabits per second) over cheapo ADSL2 connections (wireless this end, wired that) via WAIX, a local exchange point which made the traffic effectively free.

I can sit down, type one short command, & the whole lot can be re-done from that mysterious engineering substance, “scratch.”
