ponysay/pages/ponysay/Universal-Character-Set.html
Mattias Andrée afeec9cc24 first scratch
2012-10-25 06:28:13 +02:00

<html lang="en">
<head>
<title>Universal Character Set - Ponysay</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="description" content="Ponysay">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Inner-workings.html#Inner-workings" title="Inner workings">
<link rel="prev" href="Shell-auto_002dcompletion.html#Shell-auto_002dcompletion" title="Shell auto-completion">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
This manual is for ponysay
(version 2.9),
Copyright (C) 2012 Mattias Andrée
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts,
and with no Back-Cover Texts. A copy of the license is included in
the section entitled ``GNU Free Documentation License''.
-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
pre.display { font-family:inherit }
pre.format { font-family:inherit }
pre.smalldisplay { font-family:inherit; font-size:smaller }
pre.smallformat { font-family:inherit; font-size:smaller }
pre.smallexample { font-size:smaller }
pre.smalllisp { font-size:smaller }
span.sc { font-variant:small-caps }
span.roman { font-family:serif; font-weight:normal; }
span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
</head>
<body>
<div class="node">
<a name="Universal-Character-Set"></a>
<p>
Previous:&nbsp;<a rel="previous" accesskey="p" href="Shell-auto_002dcompletion.html#Shell-auto_002dcompletion">Shell auto-completion</a>,
Up:&nbsp;<a rel="up" accesskey="u" href="Inner-workings.html#Inner-workings">Inner workings</a>
<hr>
</div>
<h3 class="section">10.8 Universal Character Set</h3>
<p><a name="index-universal-character-set-404"></a><a name="index-ucs-405"></a><a name="index-unicode-406"></a><a name="index-pony-names-407"></a>
In earlier versions of <samp><span class="command">ponysay</span></samp>, only the output truncation supported
the Universal Character Set, through hand-coded UTF-8 character counting. Now
<samp><span class="command">ponysay</span></samp> lets Python decode the data. Python stores all 31 bits of a
character as one character, not as UTF-16 code units as some other languages do, which
means that the code is agnostic to the character encoding. However, in Unicode
6.1 there are four ranges of combining characters, and these do not take up any
width in a proper terminal. We therefore have a class in the code named <code>UCS</code>
that helps us take them into consideration when determining the length of a string.
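<p>The length calculation described above could be sketched roughly as follows.
This is a minimal illustration, not the actual <code>UCS</code> class: the names
<code>COMBINING_RANGES</code>, <code>is_combining</code> and <code>display_length</code> are
hypothetical, and the four ranges are assumed to be the four Unicode blocks of
combining marks.

```python
# Assumption: the four ranges are the Unicode blocks of combining marks.
COMBINING_RANGES = (
    (0x0300, 0x036F),  # Combining Diacritical Marks
    (0x1DC0, 0x1DFF),  # Combining Diacritical Marks Supplement
    (0x20D0, 0x20FF),  # Combining Diacritical Marks for Symbols
    (0xFE20, 0xFE2F),  # Combining Half Marks
)


def is_combining(char):
    """Return True if char lies in one of the assumed combining ranges."""
    code = ord(char)
    return any(low <= code <= high for low, high in COMBINING_RANGES)


def display_length(text):
    """Count the characters in text that take up a terminal column."""
    return sum(1 for char in text if not is_combining(char))
```

<p>With this sketch, a string made of <code>e</code> followed by U+0301 has a Python
length of 2 but a display length of 1. The sketch only handles zero-width
combining characters; it makes no attempt to handle characters that are wider
than one column.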
<p>Some ponies have names that contain non-ASCII characters; read about it in
<a href="Environment-variables.html#Environment-variables">Environment variables</a>. The UCS names are stored in the file <samp><span class="file">share/ucsmap</span></samp>.
In it, lines that are not empty and do not start with a hash (<code>#</code>) are
parsed, and each contains a UCS name and an ASCII-ised name. The UCS name comes first,
followed by the ASCII-ised name that the UCS name should replace or link to.
The two names are separated by a simple left-to-right arrow character [U+2192],
optionally with surrounding white space.
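<p>The file format described above could be read along these lines. This is a
sketch under stated assumptions, not the parser that <samp><span class="command">ponysay</span></samp>
itself uses: the function name <code>parse_ucsmap</code> is hypothetical, leading
white space before a hash is assumed to be ignored, and lines without an arrow
are assumed to be skipped.

```python
ARROW = "\u2192"  # the arrow character that separates the two names


def parse_ucsmap(lines):
    """Return a dict that maps each UCS name to its ASCII-ised name."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip empty lines and comment lines.
        if not line or line.startswith("#"):
            continue
        ucs_name, arrow, ascii_name = line.partition(ARROW)
        if not arrow:
            continue  # assumption: a line without an arrow is ignored
        # White space around the arrow is optional, so strip both names.
        mapping[ucs_name.strip()] = ascii_name.strip()
    return mapping
```

<p>The function takes any iterable of lines, so it can be given a file opened
with UTF-8 decoding, for example <code>open(path, encoding="utf-8")</code>, where
<code>path</code> points to the <samp><span class="file">share/ucsmap</span></samp> file of the installation.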
</body></html>