While writing a short topic map today I created the topic:
[ru1-1-2-2 : morph = "YM\u0022Y03";
"ru1:1,2.2"
@"http://www.grovescenter.org/GC/hb/ru1-1-2-2"]
The basename is based on:
YM"Y03
in a standard Hebrew transliteration system.
I substituted the Unicode escape sequence \u0022 for the double-quote mark, which displayed as:
YM"Y03
So far, so good.
But I also had:
[ru1-1-13-2 : morph = "&:D\u002274Y";
"ru1:1,13.2"
@"http://www.grovescenter.org/GC/hb/ru1-1-13-2"]
This basename is derived from:
&:D"74Y
Can you guess the character that was displayed without looking?
In case you are wondering, I tried to introduce a space between the escape sequence and “74” (an accent reference) to see if it made any difference:
&:D\u0022 74Y
Same result.
(spoiler space)
The result:
Not exactly what I was hoping for.
The real answer is to obtain a UTF-8 version of the file in Hebrew, so I don't have to worry about ASCII transliteration.
Still, you may encounter a case where you need to use Unicode escape sequences.
Take this as a cautionary tale.
This is actually easy to solve, although the fix is perhaps not entirely intuitive.
"&:D\u00002274Y"
Comment by larsga@garshol.priv.no — November 16, 2012 @ 2:36 am
Thanks!
And since it wasn't intuitive, at least to me, I will add it to the LTM Cheat-Sheet.
Comment by Patrick Durusau — November 16, 2012 @ 3:26 pm
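Why the six-digit form helps can be sketched in a few lines of Python. This is only an illustration, not the code of any actual LTM parser: it assumes the parser reads "\u" followed by the longest run of four to six hexadecimal digits, which is consistent with the fix above but is not confirmed in the post. The function name decode_ltm_escapes is made up for the example, and the sketch does not account for the variant with a space after the escape.

```python
import re

# Assumption (not confirmed by the post): an escape is "\u" followed by
# the longest available run of 4 to 6 hexadecimal digits.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4,6})")


def decode_ltm_escapes(text: str) -> str:
    """Replace each \\uXXXX(XX) escape with the character it names.

    Hypothetical helper for illustration only. Values above U+10FFFF
    would make chr() raise ValueError; that case is not handled here.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# "Y" is not a hex digit, so only the four digits 0022 are consumed.
first = decode_ltm_escapes(r"YM\u0022Y03")

# "7" and "4" are hex digits, so all six digits 002274 are consumed
# and the escape names U+2274 instead of U+0022.
second = decode_ltm_escapes(r"&:D\u002274Y")

# Padding to six digits (000022) leaves "74Y" outside the escape.
fixed = decode_ltm_escapes(r"&:D\u00002274Y")

print(first, second, fixed)
```

Under that assumption, the first basename decodes as intended, the second swallows the accent reference "74" into the code point, and the padded form keeps the two apart.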