I was wondering how to insert a
Unicode character into Vim the
other day. Here's what I found.
First off, you have to know how
to enter a regular ASCII character
into Vim using its number on the
ASCII table. Here's how you do
that:
- Enter insert mode
- Type ctrl-v
- Type the decimal equivalent
for the ASCII character
- To actually see the ASCII
character, type the
ESC key
Say, for example, you wish to
enter a capital-A. Here's how
you would do it following the
above steps:
- Type i for insert
- Type ctrl-v to enter
the ASCII code for a capital-A
- Type 65 to represent the
letter A
- Hit the ESC key to end
the insertion of text
Of course it is easier to type a
capital-A simply by hitting the
A-key while in insert mode. I'm
taking the long way around the barn
here in order to explain things.
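If you are unsure which number to type after ctrl-v, you can look it up outside Vim. Here is a minimal Python sketch of that lookup. Python is only my choice for illustration and has nothing to do with Vim itself, and the helper name decimal_code is made up for this example.

```python
def decimal_code(ch: str) -> int:
    """Return the decimal number to type after ctrl-v for an ASCII character."""
    code = ord(ch)
    if code > 127:
        # ASCII only covers the values 0 through 127.
        raise ValueError(f"{ch!r} is not an ASCII character")
    return code


print(decimal_code("A"))  # 65
print(chr(65))            # A
```

Going the other way, chr turns the number back into the character, which is what Vim does with the digits you type.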
What does this have to do with Unicode?
For Unicode characters, you take an extra
step. After typing ctrl-v, you follow
the ctrl-v with a lowercase u.
Here are the steps again, slightly adjusted
for Unicode:
- Enter insert mode
- Type ctrl-v u
- Type the hexadecimal equivalent
for the Unicode character
- To see what effect the
insertion of a Unicode character
has had on your document, type
the ESC key. Hitting
ESC ends text insertion.
Here's how to enter a Capital-A
character in Unicode.
- Enter insert mode
- Type ctrl-v u
to insert a Unicode character
- Type 0041 which
is the four-digit hexadecimal code point
for a capital-A
- To actually see the Capital-A,
type the ESC key. Hitting
ESC ends text insertion.
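The digits typed after ctrl-v u are simply the character's Unicode number written in hexadecimal. The Python sketch below shows the conversion in both directions. It is an illustration outside Vim, and hex_for_ctrl_v_u is a hypothetical helper name, not something Vim or Python provides.

```python
def hex_for_ctrl_v_u(ch: str) -> str:
    """Return the four hex digits to type after ctrl-v u for one character."""
    code = ord(ch)
    if code > 0xFFFF:
        # Four hex digits can only express values up to FFFF.
        raise ValueError("code point needs more than four hex digits")
    return format(code, "04X")


print(hex_for_ctrl_v_u("A"))       # 0041
print(hex_for_ctrl_v_u("\u20ac"))  # 20AC (the euro sign)
print(chr(int("0041", 16)))        # A
```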
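What does this have to do with
UTF-8? So far, I've only mentioned
Unicode. UTF-8 is a specific
encoding of Unicode: Unicode gives
each character a number, and UTF-8
is one way of storing those numbers
as bytes. Wikipedia describes the
relationship in its article on
Unicode.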
Remember these distinctions when entering
ASCII versus entering Unicode in Vim:
- An ASCII code fits in one byte
- A Unicode code point typed after
ctrl-v u is four hexadecimal digits
- ASCII is expressed in
decimal notation
- Unicode is expressed in
hexadecimal notation
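If you are entering an ASCII
character in Vim, you will type
its code as a decimal number.
If you are entering a Unicode
character in Vim, you will type
four hexadecimal digits after ctrl-v u.
Code points above FFFF do not fit in
four digits; for those, Vim accepts
ctrl-v followed by a capital U and
up to eight hexadecimal digits.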
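To the best of my knowledge, the
notation is chosen by what you type
after ctrl-v: plain digits are read
as decimal, u or U switches to
hexadecimal, and Vim also accepts x
for a two-digit hexadecimal value and
o for octal. Please post if
I'm wrong and I'll correct myself.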
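A note on bytes: the digits typed in Vim name a character's Unicode number, and how many bytes that character takes up in a file depends on the encoding. The Python sketch below assumes UTF-8 and prints the byte count for a few sample characters that I picked for illustration.

```python
# Sample characters: A, e-acute, the euro sign, and an emoji above U+FFFF.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Show the code point in hex and the number of bytes UTF-8 needs for it.
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s)")
```

In UTF-8 the capital-A still takes a single byte, while the other samples take two, three, and four bytes.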
One more thing I find helpful
to know is what kind of encoding
Vim is using. Here's how I find
out:
:set encoding
This also seems to work:
:set enc
The answer that comes back in
my current session of Vim is:
encoding=utf-8
Seeing UTF-8 on my screen
makes me feel secure in the knowledge
that I'll be able to enter Unicode
into Vim.
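Why the encoding matters can be shown with a rough analogy outside Vim: an encoding can only store the characters it has a representation for. The Python sketch below uses latin-1 as an arbitrary example of a non-Unicode encoding. It models the general idea only, not Vim's internals, and can_encode is a hypothetical helper name.

```python
def can_encode(ch: str, encoding: str) -> bool:
    """Report whether the given encoding can represent the character."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode("\u20ac", "utf-8"))    # True
print(can_encode("\u20ac", "latin-1"))  # False
```

The euro sign has no slot in latin-1, so only the UTF-8 check succeeds.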
There are many uses for a flexible
tool. That's the lesson I learn
over and over again when using Vim.
Ed Abbott