
Tuesday, October 14, 2008

Unicode, UTF-8, UTF-16, UCS-2 - In a Nutshell

I have noticed that many people are unsure what exactly "Unicode", "UTF-8", "UTF-16" and "UCS-2" are, aside from the fact that they are somehow related to the display of foreign characters. This post briefly explains them and dispels some of the myths associated with them.

Unicode is a coding system used to represent characters from many languages (including Japanese and Chinese) without the need to change your language locale. If you've tried writing Kanji in Medusa, you know what I'm talking about. In Unicode, each character is given a unique number, called a code point. For example, the capital letter "A" is U+0041 (65 in decimal), and the Hiragana "ふ" is U+3075 (12405 in decimal). Code points are divided into planes of 65536 for convenience. Almost all common characters are in plane 0 (also known as the Basic Multilingual Plane, or BMP), which goes from code points U+0000 to U+FFFF. Most kanji are in plane 0, and rarer ones are in plane 2.
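To make the numbering concrete, here is a minimal Python sketch that prints a character's code point and the plane it falls in. The helper name `describe` is made up for this example, and U+2000B is just one arbitrary plane 2 ideograph picked for illustration.

```python
def describe(ch: str) -> str:
    """Return the code point of a single character and the plane it belongs to."""
    cp = ord(ch)        # the character's Unicode number as an integer
    plane = cp >> 16    # each plane holds 0x10000 (65536) code points
    return f"U+{cp:04X} (decimal {cp}), plane {plane}"

print(describe("A"))           # U+0041 (decimal 65), plane 0
print(describe("ふ"))          # U+3075 (decimal 12405), plane 0
print(describe("\U0002000B"))  # U+2000B (decimal 131083), plane 2
```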

UTF-8, UTF-16 and UCS-2 are simply different ways of encoding those numbers as bytes, for example in text files. Windows helped create a myth that Unicode is UTF-16 by calling UTF-16 "Unicode" in applications such as Notepad - but the fact is that UTF-16 is as much Unicode as UTF-8 is.
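As a small illustration that the code point and its encoding are separate things, the following Python sketch encodes the same character in three ways. The helper `encode_all` is a name invented for this example; the codec names are the standard Python ones.

```python
def encode_all(text: str) -> dict:
    """Encode the same text with several Unicode encodings (no byte order mark)."""
    return {name: text.encode(name) for name in ("utf-8", "utf-16-le", "utf-16-be")}

# One code point (U+3075), three different byte sequences.
for name, raw in encode_all("ふ").items():
    print(f"{name:10} {raw.hex()}")
# utf-8      e381b5
# utf-16-le  7530
# utf-16-be  3075
```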

UCS-2 (UCS = Universal Character Set) is an old encoding system that can store characters from the BMP by simply writing them as 16-bit values. The advantage of this system is that it's simple and covers most characters, but it has no way to represent anything outside the BMP. That's why some Kanji have issues with some programs. An interesting consequence of UCS-2 is that it allows the mapping of characters that don't exist, such as the code points reserved for surrogate pairs, U+D800 to U+DFFF (see the next paragraph).
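Python has no built-in UCS-2 codec, so the sketch below hand-rolls one purely for illustration: `encode_ucs2` is a hypothetical helper, and little-endian byte order is an assumption made for the example. It shows both properties described above: anything outside the BMP cannot be written, while a reserved surrogate value goes through without complaint.

```python
import struct

def encode_ucs2(text: str) -> bytes:
    """Write each character as one little-endian 16-bit value (illustrative only)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # UCS-2 has no mechanism for code points beyond the BMP.
            raise ValueError(f"U+{cp:04X} is outside the BMP and has no UCS-2 form")
        out += struct.pack("<H", cp)
    return bytes(out)

print(encode_ucs2("Aふ").hex())      # 41007530
print(encode_ucs2("\ud800").hex())   # 00d8 (a reserved surrogate value, not a real character)

try:
    encode_ucs2("\U0002000B")        # a plane 2 ideograph
except ValueError as err:
    print(err)
```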

UTF-16 (UTF = Unicode Transformation Format) builds on UCS-2. Indeed, for characters in the BMP, UTF-16 is identical to UCS-2; the difference lies in the planes above the BMP. UTF-16 can represent characters in planes 1 through 16 (even though most of those planes are still unassigned) with a surrogate pair, that is, two 16-bit values that together store a single character. This means that you can't measure the length of a UTF-16 string in characters by counting how many 16-bit values it has!
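The surrogate pair arithmetic can be sketched in a few lines of Python. The function names `to_surrogates` and `utf16_units` are invented for this example; the constants 0xD800 and 0xDC00 are the start of the high and low surrogate ranges.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a code point from planes 1-16 into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above the BMP use a surrogate pair")
    offset = cp - 0x10000             # a 20-bit value
    high = 0xD800 + (offset >> 10)    # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits go into the low surrogate
    return high, low

def utf16_units(text: str) -> int:
    """Count the 16-bit values needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2

high, low = to_surrogates(0x2000B)
print(f"{high:04X} {low:04X}")   # D840 DC0B

text = "a\U0002000B"
print(len(text))                 # 2 characters
print(utf16_units(text))         # 3 sixteen-bit values
```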

UTF-8 is similar to UTF-16 in that characters have variable length, but it works in single bytes: the original design allowed a character to take anywhere from 1 to 6 bytes, although no Unicode character needs more than 4 bytes in UTF-8. Similarly to how UTF-16 is identical to UCS-2 for the BMP, UTF-8 is identical to ASCII for the ASCII range (U+0000 to U+007F), making it "backwards compatible" with software that isn't Unicode-aware. It also means that text that is mostly composed of ASCII characters (such as, say, ASS subtitles) will be much shorter as UTF-8 than as other Unicode formats. That's why Aegisub uses UTF-8 as its standard format.
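A quick Python sketch of the variable length and of the ASCII compatibility. The helper name `utf8_len` and the sample characters are chosen only for illustration.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

for ch in ["A", "\u00e9", "ふ", "\U0002000B"]:
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s)")
# U+0041: 1 byte(s)
# U+00E9: 2 byte(s)
# U+3075: 3 byte(s)
# U+2000B: 4 byte(s)

# Pure ASCII text produces exactly the same bytes in both encodings.
print("Dialogue".encode("utf-8") == "Dialogue".encode("ascii"))  # True
```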

Regardless of encoding differences, UTF-8 and UTF-16 can both represent ANY Unicode character. UTF-16 can sometimes be shorter than UTF-8, but that will, in practice, never be the case for ASS subtitles, even if they are entirely written in Japanese/Chinese, due to all the ASCII text involved in the format syntax.
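The size claim is easy to check with a rough Python sketch. The subtitle line below is a made-up example in the general shape of an ASS dialogue line, not taken from a real script, and `sizes` is a helper name invented here.

```python
def sizes(text: str) -> tuple:
    """Return (UTF-8 size, UTF-16 size) in bytes, without a byte order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

japanese = "こんにちは"
# Hypothetical subtitle line: ASCII syntax followed by Japanese text.
line = "Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0000,0000,0000,," + japanese

print(sizes(japanese))  # (15, 10): on its own, the Japanese text is shorter in UTF-16

utf8_size, utf16_size = sizes(line)
print(utf8_size < utf16_size)  # True: the ASCII syntax tips the balance towards UTF-8
```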

  1. I am using Aegisub to translate to Hebrew;
    when I open the file in Subtitle Workshop it turns into gibberish. *sigh*

    I use "Export Subtitle... [as Local]" so Subtitle Workshop will be able to read the ASS file.

  2. Why don't the latest versions include the Spanish language?

    I would also like to say thank you and good luck.

  3. @acro: That's because Subtitle Workshop doesn't support Unicode, so it misinterprets the UTF-8 that Aegisub generates as your local charset.

    @yANyZx: Because nobody has updated the Spanish translation to support the new version.

  4. Are you talking about the Aegisub 2 Spanish translation? A long time ago I sent it to one of the programmers on IRC (can't remember if it was jfs...)

    Anyway, I can post it on my site if anyone needs it.

  5. Can I add subtitles in the Khmer (Cambodian) language? Can Aegisub read Khmer characters? I am working on converting Khmer subtitles from SRT to SUB/IDX, and this program is new to me.

  29. To allow the additional characters to be represented on platforms that had used UCS-2, the UTF-16 encoding was introduced.



If you need help with Aegisub or have a bug report please use our forum instead of leaving a comment here. If you have a feature request, please go to our UserVoice page.

You will get better help on our forum than in the blog comments.