Hacking the DITA-OT to Print Multiple Pages from an HTML Help File

Here’s one that took a little poking around to figure out. One of the many (nay, countless) drawbacks of using HTML Help (.CHM files) is that printing from them is awful. Ideally, a user could print from the Help Viewer to get a hard copy (or at least a .PDF copy) of the manual. This would spare writers like myself the burden of continuing to produce .PDF deliverables, and it would keep all of the help in one place for the user, allowing them to print what they need [1].

The Problem

So, as a writer, I’d like to be able to style differently for the screen and for print. There are a whole host of reasons why (different readability issues, scale issues, etc.), but the most common one is that web browsers just don’t print everything they render for the screen. Background colors and images, for instance, don’t get sent to the printer by default. This can quickly go from a style issue to a readability issue.

While CSS gives the writer a fair bit of control over the display medium, good ol’ HTML Help is right there to block you. It all works just fine when you print only the current topic. However, HTML Help offers this wonderful little feature which allows you to print the current topic and all child topics (for example, the chapter heading and all contents of that chapter). Sounds great, right? Except that the way it is implemented breaks all links and references between files.

Figure: The Print Topic dialog in HTML Help Viewer, the cause of all this trouble.

That’s right. Hyperlinks? Broken. JavaScript? Not only broken, but it will now present big, scary error warnings to your user [2]! And CSS? Completely busted.

You see, when you select this option in HTML Help, Windows copies all of your files (conveniently renaming them, thus breaking links between your topics) into some temporary folder and then concatenates them into one long HTML file, which it then prints just as it would have the single topic file (minus all of the CSS, scripting, and other things I, as the writer, spent weeks on). Relative paths to style sheets and scripts simply no longer resolve from that temporary location.

The Solution

Fortunately, we can use one of Windows’ other oddities to combat this one: the very strange behavior of .CHM files in the file system. For some insanely odd reason which I cannot fathom, Windows simply doesn’t care about the folder where a .CHM file was placed. Once a .CHM file has been opened, Windows tracks it in some other place where directories and folders don’t exist; you only need to call for the name of the file and Windows will locate it no matter where it is on your machine [3].

So, where we would have put a relative file path within the .CHM file’s internal folders, we will use the MS-ITS syntax call to bring it forth! No matter where Windows puts its temporary print files, they can still reach the style sheet, thanks to Windows’ own absurd behavior.

Now, obviously, this all applies to any HTML Help file, not just one created using the DITA Open Toolkit. However, I’ll show you some of the extra know-how it takes to pull this off from within your authoring tools if you’re using DITA. Some other help authoring tools have options to correct for this [4].

The Hack

  1. Create a new style sheet for use with print display or simply use an @media print { } block to add print-related styles to an existing style sheet. You can reference the same style sheet multiple times in the same HTML file with (apparently, in HTML Help Viewer, at least) no ill effects.
  2. Locate the XSL file responsible for adding the <head> contents into your HTML files: dita2htmlImpl.xsl

    For example, for XMetaL, this file is located in C:\Users\<user.name>\AppData\Roaming\SoftQuad\XMetaL Shared\DITA_OT\xsl\xslhtml\, where <user.name> is your Windows user name. On older versions of Windows, AppData is Application Data, and it is usually a hidden system folder.

  3. Near the end of this long XSL file, you’ll find a number of <xsl:template> elements, one of which is used to generate links to CSS files.

    Hint: Just do a search for the string “text/css”.

  4. Go to the end of this template (just before the </xsl:template> line) and add the following link:

    <link rel="stylesheet" type="text/css" href="MS-ITS:<your.filename>.chm::/<file.path>/<stylesheet.name>.css" media="print" />

    Where: <your.filename>.chm is the name of the HTML Help file you’re generating. Note that you’ll need to update the XSL file for any different output filenames you generate, unless you want Windows opening up some random .CHM file every time the user clicks Print.

    <file.path>/<stylesheet.name>.css is the relative file path and style sheet name inside the HTML Help file. If you really just aren’t sure, grab a copy of 7-Zip and use it to peek inside your .CHM (it can read them just like a .ZIP file; awesome thing to have in your toolkit).
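
For step 1, a print block might look something like the sketch below. The class names (.note and .navheader) are made up for illustration and are not classes the DITA-OT is guaranteed to emit, so substitute whatever your own style sheet uses. The idea is simply to stop relying on backgrounds, since those usually never reach the printer.

```css
/* Print-only overrides; these rules are ignored on screen. */
@media print {
  /* Plain dark text on white, since backgrounds usually are not printed. */
  body {
    background: #fff;
    color: #000;
    font-size: 11pt;
  }

  /* Hypothetical class: a callout that relies on a background color on
     screen gets a border in print so it still stands out. */
  .note {
    background: none;
    border: 1pt solid #000;
    padding: 6pt;
  }

  /* Hypothetical class: hide on-screen navigation in the hard copy. */
  .navheader {
    display: none;
  }
}
```

If you go with a separate print style sheet instead, the media="print" attribute on the link from step 4 already does the filtering, so the rules can sit in that file without the @media wrapper.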

I’m fairly certain this concept can be applied to the issue of <script> references as well (though I haven’t gotten it to work thus far). However, it will never fix the issue of hyperlinks between topics, as this system of concatenating files into a temporary file irrevocably breaks those links. There is no one-time “read this file back from the source .CHM” fix for that issue.

A huge credit goes to Yuko Ishida who sent the key to this over to Helpware.net. I should also point out that this hack was tested in HTML Help Workshop v4.74.8702, Windows 7 64-bit, XMetaL Author v5.5, and DITA-OT v1.2.

  1. In engineering, it is occasionally necessary to print off some of the technical reference or methodology sections of design software documentation for clients.
  2. God only knows we’ll get calls about viruses on this one…
  3. Truly, this has the potential to wreak havoc should you have two or more .CHM files of the same name on your local drive. However, for the most part, it is completely invisible. I can assure you, I have oodles of copies of various .CHM files with the same name and I only recently learned about this Windows weirdness.
  4. MadCap Flare, for instance, has an option to correct the appearance of multi-page printing for .CHM files which I’m fairly certain does the same thing as I describe here.