Discussion:
UTF8 Verbatim?
Geoffrey Alan Washburn
2006-12-06 18:10:52 UTC
Hi, I would like to typeset some program fragments encoded in UTF8. Is
there a reasonable way to do this with the LaTeX verbatim environment or
an equivalent? I believe I can create an encoding for my font that will
fit all the glyphs I will need within the standard 256-glyph limit of a
TeX font metric file.
Sebastian Busch
2006-12-06 18:45:14 UTC
... typeset some program fragments encoded in UTF8. ...
hi geoffrey,

i am not sure if this is what you are looking for... but you can feed
utf8 encoded text into latex by using

\usepackage[utf8]{inputenc}

hope this helps,
sebastian.
Geoffrey Alan Washburn
2006-12-06 19:04:33 UTC
Post by Sebastian Busch
... typeset some program fragments encoded in UTF8. ...
hi geoffrey,
i am not sure if this is what you are looking for... but you can feed
utf8 encoded text into latex by using
\usepackage[utf8]{inputenc}
I'm looking for something that will correctly handle something like

\begin{verbatim}
datatype List α =
Nil : [|α l|] List@l α
| Cons : [|α l|] α -> List@l α -> List@((info α)⊔l) α
\end{verbatim}

I've used the inputenc package in the past to define TeX "expansions" of
UTF8 characters, but it is not obvious that this would interact with the
verbatim environment correctly.
Ulrich Diez
2006-12-07 00:26:46 UTC
Post by Geoffrey Alan Washburn
Hi, I would like to typeset some program fragments encoded in UTF8. Is
there reasonable way to do this with the LaTeX verbatim environment or
equivalent?
Just another silly idea:

You could try the CJK-package.
On my system the following example works by accident:

\documentclass{article}
\usepackage{CJK}
\usepackage{verbatim}

\begin{document}

\begin{CJK}{UTF8}{song}%
\begin{verbatim}
datatype List α =
Nil : [|α l|] List@l α
| Cons : [|α l|] α -> List@l α -> List@((info α)⊔l) α
\end{verbatim}
\end{CJK}

\end{document}


Of course there is no syntax highlighting etc. as there is with the
listings package.
CJK internally treats a UTF8 character as a sequence
of (active) 8-bit characters. So I doubt that this yields
"copy-'n-paste"-ready UTF8 text within your PDF file, even
if the resulting document looks as expected.


Sincerely

Ulrich
Geoffrey Alan Washburn
2006-12-07 02:06:55 UTC
Post by Ulrich Diez
Post by Geoffrey Alan Washburn
Hi, I would like to typeset some program fragments encoded in UTF8. Is
there reasonable way to do this with the LaTeX verbatim environment or
equivalent?
You could try the CJK-package.
\documentclass{article}
\usepackage{CJK}
\usepackage{verbatim}
\begin{document}
\begin{CJK}{UTF8}{song}%
\begin{verbatim}
datatype List α =
\end{verbatim}
\end{CJK}
\end{document}
Of course there is no syntax-highlighting etc like with the
listings-package.
Indeed, I would like to work on something like that eventually, but for
now just being able to typeset my code without having to modify the
source-code significantly by hand will be good enough.
Post by Ulrich Diez
CJK internally treats an UTF8-char like a sequence
of (active) ASCII-characters. So I doubt that this yields
"copy-'n-paste"-ready utf8-stuff within your pdf-file even
if the resulting document looks as expected.
I tried experimenting with this some, but it looks like I need to go
install the Bitstream Cyberbit font, which will take a little work.
However, just for the heck of it, I tried just doing

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{verbatim}

\font\textgreek=cmmi10
\newcommand{\textalpha}{{\textgreek\char'13}}

\DeclareUnicodeCharacter{03B1}{\textalpha}

\begin{document}

\begin{verbatim}
datatype List α = Nil : [|α l|] List@l α
\end{verbatim}

\end{document}

and it worked just fine. Of course, it did not use a monospaced font
for the α, but once I have an encoding for my monospace font worked out,
that should be easily rectified.

So kudos to the "inputenc" developers, as I would have expected verbatim
to barf on this kind of thing.
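
As a stopgap until a proper monospaced encoding exists, the alignment problem could be worked around by putting each substituted symbol in a box that is exactly one typewriter character wide. The following is only a sketch under that assumption: \monosym and \monowidth are hypothetical names made up for this example, and the math-mode glyphs are slightly wider than a cmtt character, so they may overhang their box a little.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{verbatim}

% Hypothetical helper (not part of any package): centre the argument
% in a box as wide as one character of the current font. Inside
% verbatim the current font is the typewriter font, so the box is
% one column wide and the surrounding text stays aligned.
\newlength{\monowidth}
\newcommand{\monosym}[1]{%
  \settowidth{\monowidth}{x}%
  \makebox[\monowidth][c]{#1}}

% Map the two non-ASCII characters of the sample program to math glyphs.
\DeclareUnicodeCharacter{03B1}{\monosym{$\alpha$}}% U+03B1 GREEK SMALL LETTER ALPHA
\DeclareUnicodeCharacter{2294}{\monosym{$\sqcup$}}% U+2294 SQUARE CUP

\begin{document}

\begin{verbatim}
datatype List α =
    Nil  : [|α l|] List@l α
  | Cons : [|α l|] α -> List@l α -> List@((info α)⊔l) α
\end{verbatim}

\end{document}
```

Measuring the width at the point of use rather than in the preamble means the box follows whatever font is active inside the verbatim environment.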
Danai SAE-HAN (韓達耐)
2006-12-07 18:57:49 UTC
Post by Ulrich Diez
CJK internally treats an UTF8-char like a sequence
of (active) ASCII-characters. So I doubt that this yields
"copy-'n-paste"-ready utf8-stuff within your pdf-file even
if the resulting document looks as expected.
If you use \usepackage[T1]{CJKutf8} and have CJK4.7.0 installed, then you
should have no problem at all to copy and paste, as long as you have one
or more Unicode fonts.

IMHO Werner Lemberg's CJK package offers superior results in comparison
with inputenc.

To get the Bitstream Cyberbit fonts working for TeX, I've written the
following text for Debian GNU/Linux users. I suggest you use the NEW way
using Fontforge: you'll get nicer results. If you also want Unicode
glyphs higher than U+FFFF (not in Cyberbit but e.g. in Sun Haifeng's
SunExt{A,B}.ttf fonts), I suggest you also update your TeX's "Unicode.sfd"
file (on Debian it's part of the freetype1-tools package) with the most
recent one: http://lists.ffii.org/pipermail/cjk/2006-February/001355.html


Cyberbit
--------

(Install this only if you agree to the following license at
http://ftp.netscape.com/pub/communicator/extras/fonts/windows/License.wri)

To install Bitstream's Cyberbit TrueType Font, get
ftp://ftp.netscape.com/pub/communicator/extras/fonts/windows/Cyberbit.ZIP
and unzip it in "/usr/local/share/fonts/truetype/bitstream/".
Rename the file to "cyberbit.ttf", and make a symlink:
ln -s /usr/local/share/fonts/truetype/bitstream/cyberbit.ttf \
/usr/local/share/texmf/fonts/truetype/bitstream/cyberbit/cyberbit.ttf
(or better yet, use a relative path).

Don't forget to make the directories with "mkdir -p" first if they don't
exist yet!

1. OLD way (not recommended):
Now let's make those TeX Font Metric files:
$ cd /usr/local/share/texmf/fonts/truetype/bitstream/cyberbit
$ ttf2tfm cyberbit.ttf ***@Unicode.sfd@ > cyberbit.log

Move all the .tfm files to
/usr/local/share/texmf/fonts/tfm/bitstream/cyberbit
and run "mktexlsr" or "texhash" to update the TEXMF tree.
You can safely delete cyberbit.log.

Voilà, now you can try out /usr/share/doc/latex-cjk/examples/UTF8.tex!

2. NEW way (longer but much better):
The modern way of adding fonts is to use the Fontforge scripts. For
Cyberbit it's pretty easy: it is already a Unicode font and you don't
need vertical glyphs (unless you're as crazy as me). You will need a
Fontforge installation that is more recent than 2005-07-17. You also
must have "Unicode.sfd" installed somewhere: use either (s)locate or
find to get the exact location on your computer. It can be found
in /usr/share/texmf/fonts/sfd/ if you have the freetype1-tools package
installed on Debian.

Put cyberbit.ttf in /usr/local/share/fonts/truetype/bitstream/ and
make a soft link to your working directory, let's say
/usr/src/cyberbit-fonts/. Optionally, you can also link
/usr/local/share/fonts/truetype/bitstream/cyberbit.ttf to
/usr/local/share/texmf/fonts/truetype/bitstream, but that's not really
necessary.

Go to your build directory, copy "subfonts.pe" from the CJK
utils/subfonts directory into it, and execute the following
commands:

$ fontforge -script subfonts.pe cyberbit.ttf cyberbit \
/usr/share/texmf/fonts/sfd/Unicode.sfd

This will take a very long time, so make yourself a cup of tea.

$ for filename in *.pfb;
do echo "$(basename $filename .pfb) $(basename $filename .pfb) <$filename" >> cyberbit.map;
done

$ mkdir -p /usr/local/share/texmf/fonts/map/dvips/cyberbit/ /usr/local/share/texmf/fonts/{afm,type1,tfm}/cyberbit

You can write the "for" loop above all on one line, or just copy and
paste its three lines into your terminal.

Put cyberbit.map in /usr/local/share/texmf/fonts/map/dvips/cyberbit/
and move *.afm, *.pfb and *.tfm to
/usr/local/share/texmf/fonts/{afm,type1,tfm}/cyberbit respectively.

Run "texhash" or "mktexlsr".
Now add a file called /etc/texmf/updmap.d/10cyberbit.cfg with the
following four lines:

######
# 10cyberbit.cfg
Map cyberbit.map
######

and then run "cd ..", "update-updmap" and "updmap-sys".
You need to go to another directory, or updmap-sys will use cyberbit.map
from the building directory; that's why you have to change directory
first.

If c70song.fd already exists on your computer, make sure it's deleted
first. Now make a file /usr/local/share/texmf/tex/latex/CJK/UTF8/c70song.fd
and use the following content:

%%%%%%
% This is the file c70song.fd of the CJK package
% for using Asian logographs (Chinese/Japanese/Korean) with LaTeX2e
%
% created by Werner Lemberg <***@gnu.org>
%
% Version 4.6.0 (11-Aug-2005)

\def\fileversion{4.6.0}
\def\filedate{2005/08/11}
\ProvidesFile{c70song.fd}[\filedate\space\fileversion]


% character set: Unicode U+0080 - U+FFFD
% font encoding: Unicode

\DeclareFontFamily{C70}{song}{\hyphenchar \font\m@ne}

\DeclareFontShape{C70}{song}{m}{n}{<-> CJK * cyberbit}{}
\DeclareFontShape{C70}{song}{bx}{n}{<-> CJKb * cyberbit}{\CJKbold}

\endinput
%%%%%%

and run "texhash" again.
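
To check that the installation worked, a minimal test document along the following lines should be enough. This is only a sketch: it assumes the c70song.fd shown above and the generated cyberbit subfonts are visible to TeX, and the sample text is an arbitrary line of Chinese verse.

```latex
\documentclass{article}
\usepackage{CJK}

\begin{document}

% "UTF8" selects the C70 (Unicode) font encoding of the CJK package;
% "song" is the font family declared in c70song.fd.
\begin{CJK}{UTF8}{song}
細草穿沙雪半銷
\end{CJK}

\end{document}
```

If the run stops because a cyberbit subfont or its metric file cannot be found, the usual suspects are a missing "texhash" run or a map file that was not picked up by "updmap-sys".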


HTH


Danai SAE-HAN
--
題目:《除夜自石湖歸苕溪》其一
作者:姜夔(1155-1221)

細草穿沙雪半銷,吳宮煙冷水迢迢。
梅花竹里無人見,一夜吹香過石橋。
Werner Lemberg
2006-12-08 09:51:52 UTC
Post by Danai SAE-HAN (韓達耐)
$ fontforge -script subfonts.pe cyberbit.ttf cyberbit \
/usr/share/texmf/fonts/sfd/Unicode.sfd
This will take a very long time, so make yourself a cup of tea.
You should rather write:

This will take a very long time, so make yourself a nice day.


Werner
