2014-05-30

RTF - Where's the FM?

I mean a hands-on manual on how to create a Rich Text Format file from scratch, not the friggin' 200-page spec! Plus, only Microsoft would provide a 200-page Word document as an executable... Oh well, it's not like I never saw IBM (or was it Intel?) providing some source code as a PDF file with page numbering.

Man, what a struggle to figure out how to get Arabic RTF content to properly display in an app's Rich Edit control.

If you try to be smart and have WordPad produce your RTF for you, then even if you set your Arabic text to use a Unicode font, you end up with something like:

{\rtf1 ... {\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fnil\fcharset178 @Arial Unicode MS;}}
\pard\ltrpar\f0 Some blurb\f1\rtlch\lang1025\'da\'e3\'d1 \'c7\'e1\'d5\'e3\'cf\b0\f0\ltrch\lang6153\par
}
...which results in UTTER GARBAGE on screen in place of the Arabic!
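
For the curious, here is a minimal Python sketch of what those \'xx escapes are supposed to mean. It assumes that \fcharset178 selects the Arabic charset, i.e. Windows codepage 1256 ("cp1256" in Python). The decode_hex_escapes helper is a name made up for this illustration, not part of any RTF library:

```python
import re

# The Arabic run from the WordPad output above, copied verbatim.
wordpad_run = r"\'da\'e3\'d1 \'c7\'e1\'d5\'e3\'cf"

def decode_hex_escapes(rtf_run, codepage="cp1256"):
    # Replace each backslash-quote-hex-hex escape with the character
    # that this single byte maps to in the given codepage.
    # Assumption: fcharset178 means Windows codepage 1256.
    return re.sub(
        r"\\'([0-9a-fA-F]{2})",
        lambda m: bytes.fromhex(m.group(1)).decode(codepage),
        rtf_run,
    )

decoded = decode_hex_escapes(wordpad_run)
print([f"U+{ord(c):04X}" for c in decoded if c != " "])
# → ['U+0639', 'U+0645', 'U+0631', 'U+0627', 'U+0644', 'U+0635', 'U+0645', 'U+062F']
```

So every byte does map to a perfectly valid Arabic letter, but only if whatever reads the file honours the codepage switch, which is exactly the part that can go wrong.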

I can't help but ask: what is the point of using a Unicode font, really, if that insanely dumb word processor that is WordPad still insists on living in the 1980s, and switches codepages to insert 8-bit codepage bytes instead of Unicode codepoints?

So here's what you actually want to do, manually:
  • remove the \lang switch
  • insert pure Unicode codepoints using \u
But of course, it wouldn't be as backwards as possible if Microsoft didn't also force you to specify Unicode codepoints in decimal, with no means whatsoever of specifying hex instead. So even if you know the Arabic UTF-16 sequence you want to insert, you will have to spend some time doing your decimal conversions (e.g. U+0627 becomes \u1575, followed by a ? as the fallback character for readers that don't understand \u), to, at last, get the properly working:

{\rtf1 ... {\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fnil\fcharset178 @Arial Unicode MS;}}
\pard\ltrpar\f0 Some blurb\f1\rtlch\u1575?\u1604?\u1589?\u1605?\u1583? \u1593?\u1605?\u1585?\ltrch\f0\
}
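
If you'd rather not do the decimal conversions by hand, the job is easy enough to script. Below is a rough Python sketch, not a complete RTF writer: rtf_unicode_escape is a hypothetical helper name, and it assumes the default \uc1 setting, where each \u is followed by exactly one fallback character (the ? here).

```python
def rtf_unicode_escape(text):
    """Turn a Python string into RTF body text using decimal \\uN? escapes."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            # Plain ASCII can go in as-is.
            out.append(ch)
        else:
            # \u takes a signed 16-bit decimal number, so work on UTF-16
            # code units and write anything above 32767 as a negative value.
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                unit = int.from_bytes(units[i:i + 2], "big")
                if unit > 32767:
                    unit -= 65536
                # "?" is the fallback shown by readers that ignore \u.
                out.append(f"\\u{unit}?")
    return "".join(out)

# The Arabic text from the working example, given as Python escapes.
print(rtf_unicode_escape("\u0627\u0644\u0635\u0645\u062f \u0639\u0645\u0631"))
# → \u1575?\u1604?\u1589?\u1605?\u1583? \u1593?\u1605?\u1585?
```

The output can be pasted straight after \rtlch in the example above. Characters outside the Basic Multilingual Plane come out as two \u escapes, one per surrogate, both negative.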

Heed my advice: If you design your format around the idea that no human will ever need to edit its data by hand, in a hurry, you're designing it all wrong...

As an aside, the above is also the reason why little-endian is an utter abomination that should be banned from the face of this earth: If I'm in a computer-controlled commercial airplane, that's lost all input, and, on account of the ground approaching fast, I'm in a bit of a hurry to figure out from a memory dump where the automatic pilot might store its altitude, to manually alter it, you bet that I'm gonna hope that whoever designed that plane picked a big-endian CPU, to slightly increase the probability of myself and all the other passengers not ending up as a pancake...

First rule of designing anything is to design with the idea that humans will always need to interact with your stuff, in ways that you'll never be able to devise.

So, Microsoft, next time you want to design something like RTF, please RTFM of Design rules and try to make it just a bit easier on people who need to manually interact with your stuff...

1 comment:

  1. Amen to that, have to assume the Devs did it on purpose lol. Also thank you for your efforts and for providing what I consider essential software!
