
Sunday, December 21, 2008

Bug tracker back!

Yep, we got the bug tracker back online!

Unfortunately, since we only had an old back-up, about 6 weeks of activity was lost. The URL is the same as before and should work everywhere already.


Don't use \fad on fades to black

Dear typesetters,

I have seen this countless times. Every time, it has bothered me. Please, don't do it anymore.

When the screen fades to black (or white, or any other solid colour, for that matter), DON'T use the \fad tag to fade the text along with it. \fad makes the text translucent, not darker. As a result, the text blends with the background - complete with all the usual bugs associated with borders and shadows - and only gets somewhat darker because the background itself is darkening. It never gets as dark as it SHOULD, so the visual effect is that the text appears to get brighter relative to the background.

With \fad
With \t(\c)

As you can see above, the CORRECT way to deal with this is by using \t to animate the colour (all relevant ones) to black (or whichever colour the screen is fading to). Also note that the background can be seen through the text in the \fad() case, which is not the correct behavior.

Keep in mind, though, that \t(\c) is significantly slower than \fad() in VSFilter, so you might run into softsub performance issues, depending on how complex your text is.
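To make this concrete, here's a small sketch in Python (a hypothetical helper, not part of Aegisub) that builds the override block for such a fade by animating the primary fill, border and shadow colours together with \t:

```python
def fade_to_black(start_ms, end_ms, target="&H000000&"):
    # Animate all the relevant colour tags to the target colour:
    # \c is the primary fill, \3c the border, \4c the shadow.
    # \fad would only change the alpha, which is what this post warns against.
    tags = "".join(t + target for t in (r"\c", r"\3c", r"\4c"))
    return r"{\t(%d,%d,%s)}" % (start_ms, end_ms, tags)

print(fade_to_black(0, 500))
# → {\t(0,500,\c&H000000&\3c&H000000&\4c&H000000&)}
```

Prepending the resulting block to a Dialogue line's text gives the \t(\c)-style fade described above; pass a different target colour for fades to white or any other solid colour.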


Saturday, December 20, 2008

If programming languages were religions: the aftermath

Until December 15, 2008, this blog would typically get ~350 hits per day. Between December 16 and 19, it got a total of 266 thousand hits, thanks to the "If programming languages were religions..." post. So, first of all: thanks, everybody! All feedback, positive and negative, was greatly appreciated.

First it was put on Digg. Then on reddit. Then, to my great surprise, on Slashdot... and on Stumbleupon. And even on one of my favourite blogs, Pharyngula. It was linked on many IRC channels and blogs around the Internet... Several of my friends wrote to tell me that they had seen it linked in some internal forum or mail list. I never expected this kind of reaction!

So here's the basic rundown on all the comments that I saw: most Muslims (that manifested their opinion regarding it, at least) thought that it was funny, and didn't think that it was insulting. Several others thought that it was very offensive. Lots of people thought that I was a Python fanboy (C++ is my actual favourite language, although it's possible that Haskell will take its place as I learn more about it), or a Microsoft hater (hey, I use Vista and Visual C++). Since I got accused of being a Jew by at least 3 different people, let me get this straight: I'm an atheist.

There were many complaints about "missing" languages, or "stereotyping". The list was never meant to be exhaustive, nor was it meant to be perfectly accurate - it's meant to be a JOKE. It's SATIRE. I KNOW that Satanism isn't really about selling souls, and that some of the matches aren't perfect. Also, the reason why languages such as Pascal, Fortran and Smalltalk didn't make the list is that I couldn't think of anything funny to say about them. The single most common observation was that assembly should be atheism - I actually WROTE that at one point, but I removed it from the final post because I felt that it would end up being biased, one way or the other, and because atheism isn't a religion (or the lack of one, for that matter).

If you thought the original article was funny, I suggest you read through the comments, both on the article and on the links above - there are some very funny suggestions there, both for languages that I covered and for many that I didn't mention.

P.S.: Some people seemed to take issue with me calling C restrictive. I obviously didn't mean that it's restrictive in a "what can you implement with it" sense, but rather in a "to what level can you abstract with it" sense.


Monday, December 15, 2008

If programming languages were religions...

And now, for some off-topic:

"If programming languages were religions"
(Inspired by "If programming languages were cars")

C would be Judaism - it's old and restrictive, but most of the world is familiar with its laws and respects them. The catch is, you can't convert into it - you're either into it from the start, or you will think that it's insanity. Also, when things go wrong, many people are willing to blame the problems of the world on it.

Java would be Fundamentalist Christianity - it's theoretically based on C, but it voids so many of the old laws that it doesn't feel like the original at all. Instead, it adds its own set of rigid rules, which its followers believe to be far superior to the original. Not only are they certain that it's the best language in the world, but they're willing to burn those who disagree at the stake.

PHP would be Cafeteria Christianity - Fights with Java for the web market. It draws a few concepts from C and Java, but only those that it really likes. Maybe it's not as coherent as other languages, but at least it leaves you with much more freedom and ostensibly keeps the core idea of the whole thing. Also, the whole concept of "goto hell" was abandoned.

C++ would be Islam - It takes C and not only keeps all its laws, but adds a very complex new set of laws on top of it. It's so versatile that it can be used to be the foundation of anything, from great atrocities to beautiful works of art. Its followers are convinced that it is the ultimate universal language, and may be angered by those who disagree. Also, if you insult it or its founder, you'll probably be threatened with death by more radical followers.

C# would be Mormonism - At first glance, it's the same as Java, but at a closer look you realize that it's controlled by a single corporation (which many Java followers believe to be evil), and that many theological concepts are quite different. You suspect that it'd probably be nice, if only all the followers of Java wouldn't discriminate so much against you for following it.

Lisp would be Zen Buddhism - There is no syntax, there is no centralization of dogma, there are no deities to worship. The entire universe is there at your reach - if only you are enlightened enough to grasp it. Some say that it's not a language at all; others say that it's the only language that makes sense.

Haskell would be Taoism - It is so different from other languages that many people don't understand how anyone can use it to produce anything useful. Its followers believe that it's the true path to wisdom, but that wisdom is beyond the grasp of most mortals.

Erlang would be Hinduism - It's another strange language that doesn't look like it could be used for anything, but unlike most other modern languages, it's built around the concept of multiple simultaneous deities.

Perl would be Voodoo - An incomprehensible series of arcane incantations that involve the blood of goats and permanently corrupt your soul. Often used when your boss requires you to do an urgent task at 21:00 on a Friday night.

Lua would be Wicca - A pantheistic language that can easily be adapted for different cultures and locations. Its code is very liberal, and allows for the use of techniques that might be described as magical by those used to more traditional languages. It has a strong connection to the moon.

Ruby would be Neo-Paganism - A mixture of different languages and ideas that was beaten together into something that might be identified as a language. Its adherents are growing fast, and although most people look at them suspiciously, they are mostly well-meaning people with no intention of harming anyone.

Python would be Humanism: It's simple, unrestrictive, and all you need to follow it is common sense. Many of the followers claim to feel relieved from all the burden imposed by other languages, and that they have rediscovered the joy of programming. There are some who say that it is a form of pseudo-code.

COBOL would be Ancient Paganism - There was once a time when it ruled over a vast region and was important, but nowadays it's almost dead, for the good of us all. Although many were scarred by the rituals demanded by its deities, there are some who insist on keeping it alive even today.

APL would be Scientology - There are many people who claim to follow it, but you've always suspected that it's a huge and elaborate prank that got out of control.

LOLCODE would be Pastafarianism - An esoteric, Internet-born belief that nobody really takes seriously, despite all the efforts to develop and spread it.

Visual Basic would be Satanism - Except that you don't REALLY need to sell your soul to be a Satanist...

Thanks to jfs and other people on #aegisub for the suggestions. Keep in mind, this list is a joke, and is not meant to offend anyone. Also, if you're a Muslim, please don't kill me. ;)

Note: I wrote a follow-up to this article, regarding the overwhelming reaction that it received.

Note 2: This article is also hosted at my official blog here.

Comments closed as of April 2011. This post has started attracting more spam than useful comments, so commenting has been closed now. Thank you for the many responses.


Saturday, November 29, 2008

Font hinting and you

Image showing the difference between Arial size 8 scaled 40x and Arial size 320... or why you should not use animation on the \fs override tag.

Do you know what font hinting is? If you haven't worked with digital typography you might not, but it's a technique used by (almost) all font rendering systems to make vector fonts (such as TrueType, OpenType and Adobe Type 1) look better on low-resolution media like computer monitors. (CRT TV screens are actually even worse.)

Usually glyphs ("characters") in outline fonts have quite a lot of detail in them, but if you only have 7x13 pixels to render a character in, you're going to have a hard time fitting all that detail in, even if you use sub-pixel rendering such as anti-aliasing and ClearType. That's where font hinting comes in. It's a technique for intelligently modifying the outlines of characters so they look better without completely losing the characteristics that make the font face special. The basic idea in font hinting is to snap the outlines to the edges of pixels, so that stems and vertical lines take up a whole number of pixels instead of disappearing in quantisation or becoming a smudge of sub-pixel noise.

So what does that have to do with subtitles? Well, the amount of hinting applied to a glyph depends on its point size: the bigger the point size, the less aggressive the hinting needs to be. For example, here's some text in Verdana at different sizes:

Verdana 8 pixels
Verdana 12 pixels
Verdana 30 pixels

Depending on your font rendering system it might look different (e.g. Windows and Mac OS X render quite differently), but at least if you're on Windows you should be able to see that the shapes of the letters actually change a bit. The smaller the text size, the more drastic the change.

It's this change of glyph shapes that's interesting in subtitle context. If you've ever needed to have some text change size on screen in ASS subtitles you might have considered whether you should use \t(\fs) or \t(\fscx\fscy). It's the latter that's correct. When you animate the \fs tag you're changing the actual font size requested of the font subsystem, and this also affects the hinting applied to the text.

This leads me to the image at the start of this post: Both of the top two lines with "Test" are rendered in what should have been Arial size 40. But the upper has been given its size with \fs1\fscx4000\fscy4000 while the lower has been given its size with \fs40\fscx100\fscy100. Because VSFilter internally works at 8 times resolution, the upper line is requested as Arial with a font-height of 8 pixels, so it's hinted to look best when rendered just 8 pixels tall, while the other line is requested as 320 pixels tall Arial. The red/blue at the bottom are the same two lines with the border removed, then laid over each other.

Do you now see why you shouldn't animate the font size, but rather the font scale?
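As a concrete illustration, here are the two ways of writing a grow animation as ASS override tags (the sizes and timings here are arbitrary; both lines end at an effective height of 40):

```
{\fs20\t(0,500,\fs40)}Wrong: re-hinted at every intermediate font size
{\fs40\fscx50\fscy50\t(0,500,\fscx100\fscy100)}Right: one set of glyphs, scaled up
```

The first line asks the font subsystem for freshly hinted glyphs at every intermediate size; the second hints once at size 40 and only scales the resulting outlines.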


Wednesday, November 26, 2008

Aegisub 2.1.6 released


Apparently the fix in 2.1.5 caused audio selection to become much slower. This release will hopefully fix all of those issues. Since this is a very minor change, you can download a RAR with only the new executable (plus its pdb) and extract it over the 2.1.5 install (typically "C:\Program Files\Aegisub"). The RAR "patch" is available here. [Exias' mirror]

If you want to download the complete 2.1.6 installer, you can download it here. [Exias' mirror]

Also, a cookie to the first one who can tell me what Hollywood movie is related to this particular version number. ;)


Tuesday, November 25, 2008

Mac progress...?

We get asked "what about the mac version?" once in a while and yes, it's the eternal problem.

Aegisub can build on Mac and it can run, but unfortunately it's quite hard to make it useful, not to mention the numerous GUI bugs and glitches.

Tonight I managed to make my first running build of Aegisub for PPC (G4/G5) architecture which was (of course) the first thing people started screaming for when I put out my first Intel builds some time last year!

So far this build is essentially useless! It can't load video, can't load audio, has no Automation support, the libass build can't render anything (because Fontconfig is misconfigured) and even the PCM WAV audio provider that otherwise always works, won't. (The PCM WAV problem is due to endianness.)

If you really want to try it anyway, here's a download link:

Remember: You have no right to complain about this build. I know it's horrible and useless and I'll try to make something better. This is just a proof of concept, it is (still) possible to get Aegisub on Mac and PPC.

Also, while it does run on Intel machines, it's quite slow, especially during startup, as it's not a Universal binary but a PPC-only one. Of course it's also possible to make an Intel version and it shouldn't be a major problem to lipo them together, I just haven't bothered to try yet.

There's no ETA for the next useful version. (But oh, by the way, I have added a new script to the SVN repository that can create Application bundles. Quite useful if you're building for Mac yourself! Yes, I used it for this build.)


Sunday, November 23, 2008

Aegisub 2.1.5 released

Due to a fairly serious bug introduced in 2.1.4, here's 2.1.5. Everyone is advised to update to it.

  • Fixed a bug in audio display that caused it not to update properly (introduced on 2.1.4)
  • Fixed a bug that caused Aegisub to crash if you attempted to load any ASS subtitles with malformed embedded fonts
  • Tweaked the layout of the visual typesetting bar
The win32 binary is available here.

BTW, a good way to keep up with Aegisub updates is to subscribe to the Atom feed - I recommend Google Reader.


Friday, November 14, 2008

Aegisub 2.1.4 Released

Version 2.1.4 is now out for Windows. The highlights are:

  • Hopefully removed the dependency on Visual C++ 2005 SP1 runtimes
  • Greatly improved the draw speed of audio display (should make committing on spectrum mode much faster, depending on your settings) - please let me know if any instabilities are caused by this
  • Fixed the aspect ratio of video when the audio display is too tall
  • Added Hungarian translation
  • Fixed a styling glitch in the fonts collector and translation assistant
  • Made Aegisub capable of running if ffms2.dll isn't found
It can be downloaded HERE.


Saturday, November 8, 2008

Three things that Java could learn from C++

This is a mostly off-topic rant that is being posted because I know that a good portion of our user base is composed of programmers.

Ever since I started working with Java, a few months ago, there have been many things that I have felt SHOULD be there, but aren't. Three, in particular, have always bothered me: stack building, operator overloading, and const-correctness. For the rest of this post, I will write snippets in both C++ and in Java to illustrate the differences.

1. Stack building
Creating objects on the heap is slow. Not only does it involve complex algorithms for determining WHERE to allocate the memory, but it also means that you're giving more work for the garbage collector when you're done with the object. In many situations, you will have to allocate many small temporary objects and then never use them again. Stack building makes this much faster, not to mention EASIER. Since Java lacks this feature, you are forced to use annoying workarounds (such as keeping pools of objects) if performance becomes critical.


// C++
bool isNormalPointingAt(Vector3f a, Vector3f b, Vector3f dir) {
    return (a.cross(b).dot(dir) > 0);
}

// Java
boolean isNormalPointingAt(Vector3f a, Vector3f b, Vector3f dir) {
    Vector3f normal = new Vector3f();
    normal.cross(a, b);
    return (normal.dot(dir) > 0);
}

In the above C++ example, the result of a.cross(b) is stored in a temporary variable on the stack, which is then dot()ed with dir. The same style could work in Java if every such method allocated a new instance, but that would quickly become prohibitively slow.

2. Operator Overloading
This is a hotly debated topic. Many people insist that operator overloading can lead to unreadable code if you start overloading operators to do things that are illogical - but the same is true for method names: you can have an "add()" method that performs a multiplication. However, especially when working with mathematical vectors, operator overloading makes your code far easier to understand. Below is a method that, given 4 control points and a "t" parameter in the range [0,1], finds the corresponding point along a cubic Bézier curve:

// C++
Vector3f getBezierCubicPoint(Vector3f a, Vector3f b,
        Vector3f c, Vector3f d, float t) {
    float u = 1-t;
    return (u*u*u)*a + (3*u*u*t)*b + (3*u*t*t)*c + (t*t*t)*d;
}

// Java
Vector3f getBezierCubicPoint(Vector3f a, Vector3f b,
        Vector3f c, Vector3f d, float t) {
    float u = 1-t;
    Vector3f result = new Vector3f();
    Vector3f tmp = new Vector3f();
    tmp.scale(u*u*u, a);
    result.add(tmp);
    tmp.scale(3*u*u*t, b);
    result.add(tmp);
    tmp.scale(3*u*t*t, c);
    result.add(tmp);
    tmp.scale(t*t*t, d);
    result.add(tmp);
    return result;
}
<sarcasm>Yeah, operator overloading really made that code a lot harder to read...</sarcasm> Operator overloading is a tool. It can be very useful, as the example above demonstrates. I don't think that the language designers should remove that power from their users just because some won't know how to use it wisely.

3. Const-correctness
This one is possibly even more infuriating than the above two points, because it makes proper encapsulation of data much more complicated (not to mention slower). Consider this class in C++ and Java:

// C++
class Body {
public:
    const Vector3f& getPosition() const {
        return position;
    }
    void setPosition(const Vector3f& pos) {
        position = pos;
    }

private:
    Vector3f position;
};

// Java
public class Body {
    public Vector3f getPosition() {
        return position;
    }
    public void setPosition(Vector3f pos) {
        position = pos;
    }
    private Vector3f position = new Vector3f();
}

What's wrong with the above example? Here, let me illustrate it with some code:

// C++
Body body;
body.getPosition().x = 5.0f; // compile error

// Java
Body body = new Body();
body.getPosition().x = 5.0f; // works

Basically, Java allows you to modify an object's private member without using its "set" method! There is no way to declare that a given object is "read-only" (final only means that the reference can't be reassigned), so you can't prevent that code from working. If you really must be sure that position can't be modified like that, then you have to change your Java class to this:

public class Body {
    public Vector3f getPosition() {
        return (Vector3f)position.clone();
    }
    public void setPosition(Vector3f pos) {
        position = (Vector3f)pos.clone();
    }
    private Vector3f position = new Vector3f();
}

In other words, you're now returning a full copy of the object. There are two major problems with this: First, the previous code STILL COMPILES. Since the user has no way to tell whether he's getting the actual position or a copy of it (unless he reads the original source or looks in the documentation), he might try to modify the position he got, and then be surprised that it doesn't work. The second problem is that returning an entire copy might be SLOW. What if, instead of a vector, it was an image? And what if it was accessed many times per second? This could quickly become impossibly slow. Java offers no solution to this problem.

(In case you're wondering, I also had to add a clone() to the set method, because otherwise the caller might call the set method, but keep the reference that it passed to it and modify it later.)

While many programs don't suffer much from these problems, there are certain applications that become a true nightmare to write and maintain - physics simulations, or anything to do with vector math, are the obvious examples (which is why I kept using "Vector3f" classes above). If Java is supposed to be a cleaner and easier version of C++, then why is writing that sort of code in Java so much harder and so much less robust?

I am aware that C# supports some (all?) of the above, but it doesn't yet have as much portability as Java does. Indeed, it seems that C# is what Java SHOULD HAVE BEEN. If Sun doesn't start fixing this sort of thing in Java, then Microsoft just might take the lead. Meanwhile, I'll have to stick to writing that kind of abomination in Java. To think that it's much easier to code this sort of thing in C++...


Monday, November 3, 2008

Bug tracker lost...


We lost the bug tracker, and the latest back-up of it is from 2008-09-17. The chances to restore any newer dataset are very slim.

The server the tracker was hosted on has been shut down, and we don't have a replacement yet. We're trying to find a solution for the hosting.

Until further notice, expect the bug tracker to time out.


Wednesday, October 29, 2008

Aegisub 2.1.3a Released

Yes, it took a while, but here's a new release preview build. The biggest changes from 2.1.2 are:

  • FFmpegSource2 is the new default audio and video provider, replacing Avisynth. This should provide frame-exact seeking (with keyframe support) on AVI, MKV and MP4 files, as well as other benefits. This is still a bit experimental, however. If you have any issues, just switch back to Avisynth in options.
  • The DirectSound audio player was reverted to what it was in 2.1.1, since 2.1.2 seems to have critical issues related to it. [Edit: jfs says that the issue was after 2.1.2]
  • Many small issues around the program were fixed.
  • VSFilter has been updated to the MPC-HC 2.39 version, which includes jfs's new patches (see this post)
  • Aegisub is now built against Visual C++ 2008 SP1 runtimes. Hopefully there will be no issues related to this (ASSDraw is still built against 2005 SP1 runtimes, due to library issues). If you can't run Aegisub, try installing this and reporting how it goes.
A listing of other fixes can be found on TheFluff's builds page. This release includes the Brazilian Portuguese (100%), Catalan (99%) and Spanish (99%) translations. All other translations are too outdated, and were left out. If you are willing to update any of them, please let us know.

The download link to the installer is:

As usual, feature requests and bug reports go in the bug tracker. Please leave your feedback!

Note: the previous (non-a) release got nuked due to lack of FreeType2 support, which crippled the Fonts Collector.


Thursday, October 16, 2008

Two Firefox extensions that you'll want

If you do any sort of Japanese reading on the Internet, you'll want these two Firefox extensions:

XHTML Ruby Support
Furigana Injector

The first one adds proper ruby (a.k.a. furigana, in the case of Japanese) support to Firefox. The second one is more interesting: it adds furigana to kanji on websites that don't have it, making it much easier to read Japanese text.

An interesting side-effect of having proper ruby support in Firefox is that the times in blogger look quite odd, with the full date above them.


Tuesday, October 14, 2008

Unicode, UTF-8, UTF-16, UCS-2 - In a Nutshell

I have noticed that lots of people have no idea what exactly the whole "Unicode", "UTF-8", "UTF-16" and "UCS-2" business is, aside from the fact that it's somehow related to the display of foreign characters. The objective of this post is to briefly explain these terms and dispel some of the myths associated with them.

Unicode is a coding system used to represent characters from many languages (including Japanese and Chinese) without the need to change your language locale. If you've tried writing kanji in Medusa, you know what I'm talking about. In Unicode, each character is given a unique number. For example, the capital letter "A" is U+0041 (65 in decimal), and the hiragana "ふ" is U+3075 (12405 in decimal). Characters are divided into planes of 65536 characters for convenience. Almost all common characters are in plane 0 (also known as the Basic Multilingual Plane, or BMP), which goes from code points U+0000 to U+FFFF. All kanji are in planes 0 and 2.

UTF-8, UTF-16 and UCS-2 are simply techniques used to encode those values into text files. Windows helped create a myth that Unicode is UTF-16 by calling UTF-16 "Unicode" in applications such as Notepad - but the fact is that UTF-16 is as much Unicode as UTF-8 is.

UCS-2 (UCS = Universal Character Set) is an old encoding system that can store characters from the BMP by simply writing them as 16-bit values. The advantage of this system is that it's simple and covers most characters, but anything outside the BMP will fail catastrophically. That's why some Kanji have issues with some programs. An interesting consequence of UCS-2 is that it allows the mapping of characters that don't exist, such as the ones reserved for surrogate pairs (see the next paragraph).

UTF-16 (UTF = Unicode Transformation Format) builds on UCS-2. Indeed, for characters on the BMP, UTF-16 is identical to UCS-2; the difference lies in planes above the BMP. UTF-16 is capable of representing characters in planes 1 through 16 (even though no planes above 3 are specified yet) with a surrogate pair, that is, it uses two 16-bit values to store a character. This means that you can't measure the length of a UTF-16 string by counting how many 16-bit values it has!

UTF-8 is similar to UTF-16, but a character can be encoded as anything from 1 to 6 bytes (although no Unicode character needs more than 4 bytes in UTF-8). Just as UTF-16 is identical to UCS-2 inside the BMP, UTF-8 is identical to ASCII (and to most Western encodings) in the range U+0000 to U+007F, making it "backwards compatible" with software that isn't Unicode-aware. It also means that text that is mostly composed of ASCII characters (such as, say, ASS subtitles) will be much shorter in UTF-8 than in other Unicode formats. That's why Aegisub uses UTF-8 as its standard format.

Regardless of encoding differences, UTF-8 and UTF-16 can both represent ANY Unicode character. UTF-16 can sometimes be shorter than UTF-8, but that will, in practice, never be the case for ASS subtitles, even if they are entirely written in Japanese/Chinese, due to all the ASCII text involved in the format syntax.
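These properties are easy to verify yourself; here's a minimal sketch in Python, whose str type stores Unicode code points:

```python
# "A" is ASCII, "ふ" is in the BMP, "𠀋" is a kanji in plane 2 (U+2000B).
assert ord("A") == 0x41 and ord("ふ") == 0x3075

# UTF-8: 1 byte for ASCII, 3 bytes for most of the BMP, 4 above it.
assert len("A".encode("utf-8")) == 1
assert len("ふ".encode("utf-8")) == 3
assert len("𠀋".encode("utf-8")) == 4

# UTF-16: 2 bytes inside the BMP, a 4-byte surrogate pair above it -
# so counting 16-bit units does not always count characters.
assert len("ふ".encode("utf-16-le")) == 2
assert len("𠀋".encode("utf-16-le")) == 4

# Mostly-ASCII text, like an ASS Dialogue line, is shorter in UTF-8.
line = "Dialogue: 0,0:00:01.00,0:00:05.00,Default,,0,0,0,,こんにちは"
assert len(line.encode("utf-8")) < len(line.encode("utf-16-le"))
```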


Friday, October 10, 2008

Kanamemo on Linux

Tomman has figured out how to build Kanamemo on Linux... So, in case you're interested:


Kanjimemo brainstorming and input request


[Text above added to try to attract some attention to this post]

Ever since I've posted about Kanamemo, there have been quite a few requests for a "Kanjimemo", a tool based on the same idea, but for Kanji.

Even before, I had considered writing something like that... But I never started because I couldn't quite figure out all the details on how it'd work. On this post, I'll talk about some of the ideas that I had for it. If you're interested in a "Kanjimemo", please leave your feedback and suggestions in the comments!

Programming Language

First of all, I'm not sure which programming language to write it in. At first, I considered C++, since that would be the easiest for me and allow the maximum flexibility, at least as far as PCs are concerned. The problem is that I'm already fairly experienced with C++, and so it wouldn't be much of a learning experience (which is always a plus :)) unless I went for Direct3D.

Then I pondered about Java: with all the cell phones supporting J2ME, it seemed like a good idea - Kanjimemo on the go? Great! The real problem came when I realized that J2ME is *REALLY* limited - you often have less than 1 MB of heap memory available (!) for your application, which makes a program like Kanjimemo almost impossible to implement. I also lack a J2ME-enabled cell phone, so I couldn't even work on a J2ME port right away.

A few other languages crossed my mind. C# is something that I've always wanted to learn, but its cross-platform support is quite bad (I'm looking at you, Mono). It's also much slower than Java. Python is another "to learn" language, but I question the sanity of doing complicated data analysis in such a high-level and slow language... Plus all the horrible dependencies. The same goes for Ruby.

So, any thoughts on the "language barrier" might be useful.


On to how the program would ACTUALLY work... Learning kanji is nowhere as easy as learning kana. The problem with kanji is that most of them have multiple (typically two) readings, depending on the word... but some (like 日, one of the most basic kanji) can have many more. So my idea is to have an algorithm that works like this:

  1. Select a group of five or so kanji for each level (like Kanamemo)
  2. Mine EDICT for all words marked as [Common] that use that kanji
  3. Perhaps attempt to extract the pronunciation of your kanji on that word? If that doesn't work, just go with individual words.
  4. Create a list of all the different unique pronunciations and associated words.
  5. Have the user learn all the unique pronunciations, preferably by using words that contain nothing but that kanji and kana.
  6. If there's no word with that kanji by itself, make sure that the user already "learned" all the other kanji in the word displayed.
Of course, steps 3 and 6 might be very tricky to code. All of this will require mining data from EDICT and possibly KANJIDIC. If it becomes necessary, I might use a SQLite database to store this information.
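As a rough sketch of step 2 (the function name and the simplified line format are mine; real EDICT parsing has more edge cases, such as entries with multiple headwords):

```python
def common_words_with(kanji, edict_lines):
    # Find entries marked as common - "(P)" - whose headword
    # contains the given kanji. Simplified EDICT line format assumed:
    #   HEADWORD [READING] /gloss 1/gloss 2/(P)/
    found = []
    for line in edict_lines:
        headword = line.split(" ", 1)[0]
        if kanji in headword and "/(P)/" in line:
            reading = (line.split("[", 1)[1].split("]", 1)[0]
                       if "[" in line else headword)
            found.append((headword, reading))
    return found

# Illustrative sample entries (not taken from the real EDICT file):
sample = [
    "日本語 [にほんご] /(n) Japanese language/(P)/",
    "日 [ひ] /(n) day/sun/(P)/",
    "昨日 [きのう] /(n-t) yesterday/(P)/",
    "旭日 [きょくじつ] /(n) rising sun/",
]
```

Here common_words_with("日", sample) would keep the first three entries and skip 旭日, which isn't marked common; step 5 could then group the resulting word/reading pairs by the pronunciation of the target kanji.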

Progression would work similarly to Kanamemo, with a new set of 5 kanji unlocked with each memorized set. Ideally, the user could choose profiles to control the new kanji: perhaps follow the JLPT progression, or the Japanese school system progression, or how common a given kanji is, or a combination of them (i.e. start with all JLPT4 kanji sorted by frequency, then all JLPT3 sorted by frequency, etc.). The user should also be able to customize a list of kanji that he wants to learn.

Given this system, it'd be possible to simply treat kana as kanji and have the program work the same way for them, so you'd be entering actual Japanese words when learning kana. This has the advantage of advancing your Japanese reading skill.

Multiple fonts
One problem that I noticed with Kanamemo is that it was easy to just memorize the font glyph, as opposed to the more abstract shape of the kana. This could prove to be an issue with kana that look very different depending on how they're written (such as さ and ふ). This program would fix that problem by using different types of fonts (cursive, brush, type) randomly, or perhaps by forcing you to learn all the different variations before progressing.

Since the concept of the program is word-focused, it might feel strange to be learning how to read words without learning what they mean. If you're an anime watcher, then perhaps you already have a relatively big vocabulary of words, but you won't know all of them, and not everyone is an anime watcher. EDICT provides translations, but I'm not sure if just slapping the translations there will do any good... Thoughts on this?

Finally, it might be useful to have someone read the words out loud for you whenever you get them right. I'm not sure how hard it would be to add support for some third-party voice synthesizer, but it might be worth the trouble.

Other ideas
Perhaps the program should be designed to look more like a game? A little mascot cheering for you, a scrolling background, some background music? Perhaps this game could have multiple "stages" that you would do in alternating order: First learn to read the kanji, then what the word means, then perhaps a speed typing test? Maybe even a grammar test mode?

Of course, what this needs the most right now are IDEAS! If you have any, please share them with us. If you know of somebody who might be interested in this sort of thing, link them to this page! If you want to help with the development itself, drop by IRC and let us know. The idea is that this should be an open, free project.


Wednesday, September 10, 2008

Undocumented control

First, let me apologise for the bad pun in the title up front.

There are a few undocumented/"secret" functions in Aegisub that can be activated by holding the Ctrl key (Command on Mac) while clicking on things. These aren't easter-eggs, just little things you may or may not need at times.

First thing is "play video without ensuring audio synchronisation". This is mainly a debugging function. If you hold the Ctrl key while clicking a Play Video button, a flag is set such that, during that playback, Aegisub doesn't attempt to keep the audio synchronised with the video by regularly seeking the audio if it gets too far away from the actual displayed video frame. This may be useful if you have a slow system that can't handle decoding and displaying some complex video in real-time.

Second is the Automation Manager toolbar button. If you hold Ctrl while clicking that, the Automation Manager will not open, but instead all scripts in the Autoload directories are reloaded instantly. The Autoload directories are also scanned for new scripts. If you hold both Ctrl and Shift while clicking the button, then every Automation script is reloaded. This can be really useful if you're working on a macro script and don't want to open Automation Manager every time you make a small change.

I think there might be some more hidden functions like these I'm forgetting, but those can be saved for another post another day ;)


Thursday, August 28, 2008

Subtitles on Youtube?

Recently Youtube got a new feature: subtitles and captions. You can upload a subtitle file to your own videos, so the viewers can get subs. Softsubs, that is. And you can add as many as you want.

So what format are those subtitles in? "Anything that reminds of SRT and SUB goes."

So: Yes, Aegisub supports Youtube subtitles. Just export to SRT when you are done.

Remember that while Aegisub doesn't show SRT (and other formats) as an option in the regular Save As dialogue box, you can still save directly to those formats. Just add the appropriate file extension yourself to the filename, and Aegisub will pick the right format automagically.
You can also just use regular "Save" if the filename (as displayed in the window title bar) already ends with e.g. .srt.
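In other words, the file extension alone picks the output format. A toy sketch of that kind of extension dispatch (the mapping below is illustrative, not Aegisub's actual table):

```python
# Sketch of extension-driven format selection as described above.
# The format names and the mapping are my illustration, not Aegisub's code.
import os

FORMATS = {".ass": "ASS", ".ssa": "SSA", ".srt": "SRT", ".sub": "MicroDVD"}

def format_for(filename):
    ext = os.path.splitext(filename)[1].lower()
    return FORMATS.get(ext, "ASS")  # fall back to the native format

print(format_for("episode01.srt"))  # SRT
```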


Tuesday, August 19, 2008

Aegisub for Windows Me and 98? Never.

You may wonder why there is no version of Aegisub that works on Windows 95, 98, 98 SE or Me, together known as Windows 9x. The reason is partly technical-practical and partly ideological.

The ideological part is short and simple: All of those operating systems are outdated, no longer supported by Microsoft and generally regarded as far inferior to the Windows NT line of systems (2000, XP, Vista). We would rather see people stop using outdated technology.

The technical part mostly lies in Unicode support. All Windows NT versions are natively Unicode from the kernel all the way to the user interface. Windows 9x is not. The only way to get some kind of Unicode support on Windows 9x is to use the Microsoft Layer for Unicode library, but even that only gives flaky support at best. We did try building Aegisub against MSLU (unicows.dll) once, a long time ago, and failed. Then we decided that it wasn't worth the trouble.

True, Aegisub's support for Unicode isn't perfect. There are problems with right-to-left/bidirectional text, and probably with several other complex scripts too, but it's far better than only supporting one codepage at a time.

Finally, of course, there's the question of the number of potential users. In the time this blog has been up (around a month) we've had one single hit from a user searching for a Windows Me version. There just isn't a user base large enough to justify the amount of work it would take to create an inferior version.


Wednesday, August 13, 2008

TrayDict: EDICT on your systray

A while ago, I posted about Kanamemo, a tool that I made to help me learn Hiragana and Katakana. Another tool that I've made to help me in the learning of the Japanese language was TrayDict:


TrayDict is a very simple application that sits on your systray until you bring it up with WinKey+Z. Then it lets you type a word in Kanji, Kana, Rōmaji or English, and it will search for it in EDICT (or any of the other supported dictionaries), returning every match, ordered by relevance.

The catch is: it's a complete hack. I made it in just a few days, without worrying at all about how well it'd work - I just wanted a tool for personal use. Even though I'm posting the link to it here, here are some things that you'll have to keep in mind:
  1. It has a bug that prevents Windows from shutting down as long as it's running. Right click its systray icon and choose "Exit" to terminate it.
  2. There are no options at all. The shortcut is WinKey+Z, and that can't be changed. You can also bring it up/down via systray.
  3. This is no longer maintained, so don't bother sending bug reports and/or suggestions.
 For a long time, I've been meaning to rewrite it, with fewer bugs, more customization, support for regular expressions, rendering with HTML+CSS, kanji information, etc... But, alas, I wanted to use HTML to render it, and my experiences with Trident, Gecko and WebKit were all catastrophic, so I've lost some interest. But maybe, if there is demand for it, I'll pick this project up again.

Anyway, here is the Win32 binary: Link
The source is available here, but the HEAD is completely broken (the result of my experimentation with Gecko), so you'll have to dig through the repository for an older revision.


Sunday, August 3, 2008

SVN server moved

Some time ago our SVN server (source code management) stopped working. We aren't sure what went wrong, and since we haven't been able to reach the administrator of it we've had to move the repository to a new server.

Fortunately we have had access to full repository backups, so no data was lost. However, we have had to re-create the user database, so if you had commit access before but don't have it now, please contact one of the project administrators.

The new SVN server is at


Tuesday, July 29, 2008

Aegisub crack + keygen

Damn, it seems people started pirating Aegisub.

Screenshot of a warez site

Though I wonder where the keys generated by that keygen should be entered... and what would they crack? And I'd love to know how they managed to compress Aegisub that much, it could really help our own distribution!

SPOILER: The site is probably fake, just making up torrents that don't exist, so I'll assume those numbers are fake too. The site also just links to a different website whose URLs look like they'd want you to sign up just to perform a search of their archive.


Saturday, July 26, 2008

The epic of Aegisub-tan

A while ago, jfs and I thought that it would be interesting to have a mascot for Aegisub, similar to the OS-tans from Futaba. She would be a chibi-style mascot, wearing light plate armor and holding a staff, and she would be seen in places like the splash screen, or in "loading" and "Aegisub crashed" dialogs, with different expressions to show her "mood" (e.g. crying when Aegisub crashes).

So we tried to contact several different artists, and many of them seemed to be quite interested in drawing at least a concept for her - but all of them eventually gave up. Is there any hope for an Aegisub-tan?

This is the description that I posted on the Wiki article (a little modified):

  • "Chibi" style, 2 to 3 heads tall.
  • Long white hair.
  • On her left hand, she holds Aegis, the program's name giver (this).
  • On her right hand, she holds a magic baton (Mahou Shoujo Lyrical Nanoha style) with a cog in it. This represents Automation.
  • Below her knee, on her chest, shoulders and hands, she wears red light armor plating, matching the shield. These should be simple, but curvy and shiny. Above all, they should look very light (as in, opposite of heavy).
  • Below her armor, she wears grey (black?) tights. They cover her body and legs, but not her arms or neck.
  • Also on her head, she wears a black/grey (same colour as underlying clothes?) beret.

(Keep in mind that the above description is NOT set in stone, and is just how I visualized her).

I even attempted to draw her myself. This is my pathetic attempt:

At this moment, many people lost hope in an Aegisub-tan. We even considered paying an artist to draw her, but nobody was really interested in donating money to the cause.

In conclusion: does anyone feel like drawing her? What we really need at first, I believe, is a high-resolution concept art, to go on splash, logo, t-shirts, etc. After that, we would need several small vectors of her doing many different things, but it'll be much easier to find artists to draw THOSE after we have a final concept drawn.

If you're interested, please leave a comment!


Friday, July 25, 2008

How VSFilter renders border and shadow

If you've ever used the \be or the new \blur tag, you might have wondered: why do they blur just the edges and not everything?

Example of how VSFilter blurs just edges and not everything
The answer lies in how VSFilter internally handles fill, border and shadow, and the relationship between them.

The basic component of a subtitle rendering is the fill. The fill is the main shape of the text, ie. what you see if you disable border and shadow. (When I write "text" here it can just as well be a vector drawing made with \p1.)

I am keeping things simple here; there are some technical details in the actual implementation that I'm skipping over because they aren't relevant to the discussion, even though they greatly affect the algorithm used. I might discuss the detailed algorithm later.

When I talk about bitmaps in this post, they are single-channel bitmaps, ie. black/white bitmaps. Colour is applied during the painting step, which is described in detail below.

When a subtitle is to be rendered, VSFilter first creates a bitmap of the text fill. It then sees if the text should have a "wide outline", ie. if a \bord tag is in effect. If there is a wide outline, it allocates an additional bitmap that will contain the widened region.

The widened region is the fill bitmap modified in a way so it's effectively "emboldened", ie. the outline is expanded outwards, but the entire fill is still kept.

Visualisation of how the widened region is the fill expanded outwards by border size
You might notice that the widened region looks a lot like the shadow. This is entirely correct, because it is used for rendering not just the border but also the shadow.

When the line is to be rendered, the fill bitmap is used as-is for fill, the widened region is used as-is for shadow, and the border is generated by subtracting the fill from the widened region, ie. the border is the part of the widened region that is not also in the fill.

Things work a little differently when there is no border. In that case, the fill is painted as-is, and the shadow is also painted using the fill.

I wrote above that the fill and widened region bitmaps are black/white bitmaps, so what about colour? The bitmaps are simply re-coloured with the selected colour during painting, or more correctly, used as alpha masks to paint a frame full of the colour.

Okay, so on to the blur effects.

The blur effects are applied to the bitmaps, either to the fill or to the widened region bitmap. If there is a widened region bitmap, it is applied to that one only and the fill is left alone. If there is no widened region (ie. no border) the blur is applied to only the fill bitmap.

This is why only the border blurs when you use \be or \blur along with a border, and why the fill does blur when you use blur on lines with no border: The fill bitmap is rendered on top of the blurred border, even though the border blur extends below the fill.
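The relationships described above can be modeled with sets of pixel coordinates standing in for the 1-bit bitmaps. A toy sketch, not VSFilter's actual code (the square dilation stands in for the real widening step):

```python
# Toy model of the pipeline: fill bitmap -> widened region -> border/shadow.
# Sets of (x, y) coordinates play the role of single-channel bitmaps.

def widen(fill, bord):
    """Expand the fill outwards by `bord` pixels (the fill itself is kept)."""
    return {(x + dx, y + dy)
            for (x, y) in fill
            for dx in range(-bord, bord + 1)
            for dy in range(-bord, bord + 1)}

fill = {(2, 2), (3, 2)}      # the text fill bitmap
widened = widen(fill, 1)     # the widened region (\bord 1)
border = widened - fill      # border: in the widened region but not the fill
shadow = widened             # shadow reuses the widened region as-is

# The fill is always inside the widened region, and border never overlaps it:
assert fill <= widened and not (border & fill)
```

Blur would then be applied to `widened` only (or to `fill` when there is no border), which is exactly why the fill stays sharp on bordered lines.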


Thursday, July 24, 2008

VSFilter hacks

There are lots of versions of Gabest's VSFilter subtitle renderer around; some people are even still using versions that are several years old and missing features and bugfixes. There are cases of people still distributing these old versions, sometimes as part of a package with other software such as VirtualDub.

Now it happens that some time in 2006, Gabest seems to have lost interest in working on VSFilter, so nothing has happened from his side, not even bugfixes. Fortunately VSFilter is open source, so we have picked it up and made a fork. We have included both some existing bugfixes and improvements that existed as patches, and also made our own bugfixes and further improvements.

I'm going to talk about some of the improvements we have made throughout this post.


First, I'm going to assume you're familiar with most regular ASS override tags, if not you can get a refresher in our manual. Second, not all of these improvements are available in the version of VSFilter we ship with the current release (2.1.2) of Aegisub, but they should all be in the next one (which will be 2.1.3 or 2.2.0, depending on circumstances.)

Four lines of text, each positioned 0.2 pixels further to the right, showing the effect of sub-pixel positioning
Sometimes you need sub-pixel precision when positioning text and drawings. Normally you've only been able to get this by setting the script resolution (PlayResX and PlayResY) to something larger than the video resolution, but the "float \pos" patch changes this. It allows you to use decimal/floating point numbers for positioning lines and gives you up to 1/8th pixel precision.

The image to the left shows four lines of text positioned at different X coordinates. You can see how they move ever so slightly, although not a full pixel.
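1/8th-pixel precision amounts to snapping coordinates to the nearest multiple of 0.125. A one-line model of my reading of the patch's precision (not the patch itself):

```python
# Snap a coordinate to the nearest 1/8 pixel - a model of the precision
# granted by the float \pos patch, not its actual implementation.

def snap_eighth(v):
    return round(v * 8) / 8

print(snap_eighth(100.27))  # 100.25
```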


The tags \fax and \fay allow you to do shearing operations on your text. This is a bit like rotations, except that it's not. They are especially useful for doing perspective correction when you are typesetting signs rotated in 3D.

Demonstration of \fax and \fay tags
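A shear maps each point by displacing one axis proportionally to the other. A sketch of the transform's shape (the sign conventions and the reference point VSFilter shears around are my assumptions, not taken from the patch):

```python
# Sketch of a shear transform: \fax displaces x by a multiple of y,
# \fay displaces y by a multiple of x. Conventions here are assumed.

def shear(x, y, fax=0.0, fay=0.0):
    return (x + fax * y, y + fay * x)

print(shear(0, 100, fax=0.5))  # (50.0, 100.0) - x shifts more as y grows
```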

Shortly before Gabest dropped development of VSFilter, he introduced the "ASS2" format. This isn't very well known, and it only has minor changes over the original ASS format. One of the things was a new karaoke timing tag, \kt. While Aegisub can read ASS2 files it can't write them, so if you edit them in Aegisub you will lose some information. It doesn't have support for timing with the \kt tag either, but you can of course still use that and use all the additional features of ASS2 if you edit the file with a text editor.

So what is \kt? It sets absolute timing: it allows you to move the highlight start time both backwards and forwards without affecting the timing of other syllables. It is probably best illustrated with an example:
{\kf10}ABC {\kt20\kf10}DEF {\kt10\kf10}GHI

When you render this example, you will first see ABC highlight. Then GHI will follow, because its highlight is set to start at time 10 by \kt. Finally, DEF will highlight because its start time was set to 20 by \kt. You can also use \kt to make karaoke syllables overlap in timing.
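The timing rule can be modeled with a tiny parser. This is a sketch of the behaviour described above, not VSFilter's parser; it only handles \k, \kf, \ko and \kt, all in centiseconds:

```python
import re

# Model of \kt: \k/\kf/\ko advance an implicit clock by their duration,
# while \kt resets the clock to an absolute time (centiseconds).

def karaoke_starts(line):
    t = 0
    starts = []
    for tags, text in re.findall(r"\{([^}]*)\}([^{]*)", line):
        kt = re.search(r"\\kt(\d+)", tags)
        if kt:
            t = int(kt.group(1))        # jump the clock, forwards or backwards
        start = t
        t += sum(int(d) for d in re.findall(r"\\k[fo]?(\d+)", tags))
        starts.append((text.strip(), start))
    return starts

print(karaoke_starts(r"{\kf10}ABC {\kt20\kf10}DEF {\kt10\kf10}GHI"))
# [('ABC', 0), ('DEF', 20), ('GHI', 10)]
```

Sorting the result by start time reproduces the render order given above: ABC, then GHI, then DEF.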


Originally the \be tag (Blur Edges) only allowed blurring to be turned on or off, and was very "weak": the blur effect was hard to notice at all. We have updated it so that it now supports variable-strength blur, and you can also use it with \t to animate the blur strength.

Because \be doesn't look very good at high values, we have also introduced the alternate \blur tag, which performs the blur effect using a two-pass gaussian blur algorithm. This gives a much nicer and wider blur, especially at high values.

Demonstration and comparison of \be and \blur tags with different parameters
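A "two-pass gaussian" works because a 2D gaussian kernel is separable: blur every row with a 1D kernel, then every column. A generic sketch of that idea (not VSFilter's implementation):

```python
import math

# Separable gaussian blur: one horizontal pass, one vertical pass.
# Edges are handled by clamping indices; a generic sketch only.

def gauss_kernel(sigma, radius):
    k = [math.exp(-(i * i) / (2 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]  # normalize so total brightness is preserved

def blur1d(row, kernel):
    r = len(kernel) // 2
    return [sum(kernel[j + r] * row[min(max(i + j, 0), len(row) - 1)]
                for j in range(-r, r + 1))
            for i in range(len(row))]

def blur2d(img, sigma=1.0, radius=2):
    k = gauss_kernel(sigma, radius)
    rows = [blur1d(row, k) for row in img]   # horizontal pass
    cols = list(map(list, zip(*rows)))       # transpose
    cols = [blur1d(col, k) for col in cols]  # vertical pass
    return list(map(list, zip(*cols)))       # transpose back
```

Two 1D passes cost O(n) per pixel for a kernel of width n, instead of O(n²) for a direct 2D convolution, which is why renderers do it this way.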

I will discuss why \be and \blur are "blur edges" effects and not "blur everything" effects tomorrow.


Demonstration of \xshad and \yshad tags
If you've ever wanted the shadow to be positioned differently than just "X down and X right", the \xshad and \yshad tags are probably just what you want. They work like \shad, except that they set only the X or Y displacement. You can even use negative values with them!


Demonstration of \xbord and \ybord tags
Finally, there are the \xbord and \ybord tags. They can be useful for various things, but one of the intentions was to better support anamorphic video: \xbord and \ybord allow you to control the border width in the X and Y directions separately. You can even disable the border in one direction entirely!

You can also combine this with strong \be or \blur and maybe some shadow for other interesting effects.


Again, remember that many of these tags require a very new version of VSFilter. For example, the \blur tag was only added yesterday! The next release of Aegisub will ship with a VSFilter version that supports all of them, so if you want to be safe you can wait until then. If you're impatient, you can follow development on the Aegisub forum.


Kanamemo: a tool for the apprentice weeaboo

Back in 2006, when I decided to learn Hiragana and Katakana, I looked around for flashcard programs to help me in my task. After finding that none of them actually worked as I thought that they SHOULD, I decided to roll my own. The result is Kanamemo:

It works by teaching you Hiragana and/or Katakana (at your choice) in "levels". Each level typically contains 5 different kana. It basically shows you a kana and asks you to enter its Hepburn roomaji transliteration. If you get it right, you get +1 point on it. If you get it wrong, you get -10 on it AND -10 on the one that you confused it with. Once all kana of a given level reach a score of at least 5 (or so - I don't remember the exact rules), you've learned them and the next level unlocks.

It also never stops flashing old kana to you, but the probability of a given kana being picked is inversely proportional to how good you are at it - that way, it makes sure that you don't forget the ones that you learned earlier, while focusing on the ones that you struggle with.
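The scheme described above can be sketched like this. The offset in the weighting and the exact constants are my guesses, not Kanamemo's real ones:

```python
import random

# Sketch of Kanamemo's scoring: +1 for a right answer, -10 to both the
# shown kana and the one it was confused with on a wrong answer, and a
# pick probability inversely related to the score. Constants are guessed.

def answer(scores, shown, typed_as):
    if typed_as == shown:
        scores[shown] += 1
    else:
        scores[shown] -= 10
        scores[typed_as] -= 10   # punish the kana it was confused with too

def pick(scores, rng=random):
    # Weight each kana inversely to how well you know it; the offset
    # keeps every weight positive even for negative scores.
    low = min(scores.values())
    weights = [1.0 / (s - low + 1) for s in scores.values()]
    return rng.choices(list(scores), weights=weights, k=1)[0]

scores = {"a": 0, "ka": 0}
answer(scores, "a", "ka")   # wrong answer: confused あ with か
print(scores)               # {'a': -10, 'ka': -10}
```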

I found that it works exceptionally well: I learned to read all of Hiragana and Katakana in 2 days, and if you're particularly diligent, you can probably do it in one.

You can download a Win32 binary for it here. Please post a comment if you have any trouble with the runtimes (i.e. missing DLL errors, or "application hasn't been installed properly").

The source code for the program has been available at the Aegisub repository for a while. Here's a link.

I have been meaning to write a similar tool for kanji+words (by mining data from EDICT and KANJIDIC), but my sloth has been preventing me from doing so.

[EDIT] If you want to build it natively on Linux, see this.


Wednesday, July 23, 2008

Five cool things in Aegisub that people aren't aware of

One thing that I've noticed while talking to users of Aegisub is that there are a number of features that people just don't realize Aegisub has. Here are five of them:

(Note: these tips apply to version 2.1.2.)

1. Loading audio from a video file
If you have a video raw that you're working on, and you're just going to be performing audio timing on it, you don't need to demux the audio from it. Provided that the video is in a format that Aegisub can work with (that is: most files on Windows), you can just go to Audio->Open File and pick the VIDEO file as the audio file to open (you will need to change the open dialog's default filter to show those files). Note that this is different from "Open Audio From Video", that just loads the audio from the currently open video file.

2. Saving to non-ASS formats
Aegisub's primary format is ASS (this is due to a few technical reasons, such as avoiding accidental loss of formatting information), so you can't easily save to other formats. But it's not impossible! If you go to File->Export..., you can tell Aegisub to save in other formats, such as SRT, SSA, or Adobe Encore. It's also possible to force the "Save As" dialog to save directly to those formats, if you tell it to use the right extension (i.e., tell it to save as "" and it will save as SRT).

3. Converting framerate with "Export"
The Export dialog has a "Transform Framerate" filter. If you enable it, it can perform a VFR->CFR conversion on your file (useful for hardsubbing to decimated VFR files). However, it can also do CFR->CFR conversion, that is, "ramp" the file. This can be useful if you have e.g. NTSC subtitles and want to speed them up by 4% for a PAL video. In that case, you would set Input to 23.976 and output to 25.
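The CFR-to-CFR "ramp" is just a linear rescaling of every timestamp by input over output framerate. A sketch of the NTSC-to-PAL case (my arithmetic, not the actual Transform Framerate code):

```python
# Sketch of CFR->CFR framerate "ramping": a subtitle stays on the same
# frame number, so its time scales by fps_in / fps_out.

def transform_time(seconds, fps_in, fps_out):
    return seconds * fps_in / fps_out

# A line ending at 25.0s in the 23.976fps version ends at ~23.976s
# once the video is sped up ~4% to 25fps:
print(round(transform_time(25.0, 23.976, 25), 3))  # 23.976
```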

4. Saving screenshots
Often, you might want to share a screenshot of a subtitle with somebody else. Aegisub has tools to help you do just that - right click the video display, and you will have an option to save the current frame as a PNG file, or copy it as an image to the clipboard.

5. Copying to/from clipboard in plain-text
If you want to share some lines with somebody over IRC or an IM program, or you want to copy from another file, it might be useful to know that Aegisub performs copy and pasting of lines as raw plain-text. So if you copy lines from Aegisub, you can paste them in any text medium, and vice-versa. The same is valid for Style lines.


Universal Subtitle Format: a post-mortem?

The Universal Subtitle Format (USF) was an ambitious project. It was an XML-based subtitle format, intended to replace the old and problematic community standard, Advanced Substation Alpha (ASS). It was chosen as the default subtitle format for the Matroska multimedia container, and was the primary format of Medusa's unfortunate successor, ChronoSub.

It failed to achieve that goal.

This is what its page on CoreForge has to say:

The format is based on XML for many reasons: flexibility, human readability, portability, Unicode support, hierarchical system and easier management.
While I can certainly understand "flexibility" and "portability", I don't see why you need XML to have a portable format, or Unicode support. But they go ahead and claim human readability and easier management. Is that supposed to be a joke? It sure is human readable - compared to binary formats. But it's still an incredibly verbose format that no sane person would try to edit by hand. And how is it easier to manage? Only if they mean that it's easier to avoid horribly misshapen subtitle files (you know, the kind of file that's always floating around the community and that VSFilter will happily eat). And what's with "hierarchical system"? It is true, but isn't it also completely irrelevant? Subtitles are NOT intrinsically hierarchical - forcing them to be only complicates matters.

But let's have a look at the format itself. This is a simple "Hello World" in USF:
<?xml version="1.0" encoding="UTF-8"?>
<!-- DOCTYPE USFSubtitles SYSTEM "USFV100.dtd" -->
<?xml-stylesheet type="text/xsl" href="USFV100.xsl"?>

<USFSubtitles version="1.0">

<style name="NarratorSpeaking">
<fontstyle italic="yes" color="#FFEEEE" outline-color="#FF4444"/>
</style>

<subtitle start="6.100" duration="4.900">
<text style="NarratorSpeaking">This is a demo of<br/>The Core Media Player<br/>subtitle format</text>
</subtitle>
<subtitle start="00:00:11.000" stop="00:00:15.000">
<text style="NarratorSpeaking">What can be done ?</text>
</subtitle>
</USFSubtitles>

The above sample is the "official" sample included with the specs, stripped down to make a suitable "hello world". For comparison, I've re-created the script in ASS:
[Script Info]
ScriptType: v4.00+
PlayResX: 640
PlayResY: 480

[V4+ Styles]
Style: NarratorSpeaking,Arial,20,&H00EEEEFF,&H000000FF,&H004444FF,&H00000000,0,-1,0,0,100,100,0,0,1,2,0,2,10,10,10,0

Dialogue: 0,0:00:06.10,0:00:11.00,NarratorSpeaking,,0000,0000,0000,,This is a demo of\NThe Core Media Player\Nsubtitle format
Dialogue: 0,0:00:11.00,0:00:15.00,NarratorSpeaking,,0000,0000,0000,,What can be done ?
Note: I removed the "Format:" lines from the above file. This is because, to the best of my knowledge, Sabbu is the only program that actually cares about those lines. Neither VSFilter nor Aegisub care if it's there or not, and both will, in fact, ignore it.
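Incidentally, the two time notations differ only in formatting: USF's second-based start/duration attributes map mechanically onto ASS timestamps. A sketch of the conversion (the helper name is mine):

```python
# Sketch: convert USF's seconds-based times to ASS h:mm:ss.cs timestamps.

def ass_time(seconds):
    cs = round(seconds * 100)            # ASS stores centiseconds
    h, cs = divmod(cs, 360000)
    m, cs = divmod(cs, 6000)
    s, cs = divmod(cs, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"

start, duration = 6.100, 4.900
print(ass_time(start), ass_time(start + duration))  # 0:00:06.10 0:00:11.00
```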

For further comparison, this is what the same script would look like in the current draft of AS5:
ScriptType: AS5
Resolution: 640x480

Style: NarratorSpeaking,,\i1\1c#FFEEEE\3c#FF4444

Line: 0:00:06.10,0:00:11.00,NarratorSpeaking,,This is a demo of\NThe Core Media Player\Nsubtitle format
Line: 0:00:11.00,0:00:15.00,NarratorSpeaking,,What can be done ?
The first thing that we can notice there is that ASS is a much more "compact" format, while USF is more "readable" in the sense that you'll easily know what each thing does even if you aren't familiar with the format - whereas, unless you are very familiar with the ASS format, the "Style" line is incomprehensible. ASS is also more "horizontal" - that is, unless you cram things onto the same line in USF, ASS will take fewer lines, but those lines will be longer.

Here's the important point: USF is NOT designed to be written by hand. It's just too much effort to write all of that, and if you forget to close some tag somewhere, you'll break the entire file, which isn't an issue in an ASS-like format. And this is exactly where the problem is: there is no good editor that supports USF!

Lately, there has been a trend for XML-based subtitle formats. This is probably because XML is relatively easy to parse by a machine, and also because of the "buzz" associated with XML. But let's face it: subtitles are not best modeled by XML. The ASS format is a strange hybrid of an INI file, a CSV list, and TeX, and that works astonishingly well - that's why we have decided to base AS5 on the same combination, although that is a subject for another post.

So here's the situation that USF faced: there was no real editor that could deal with it and nobody wants to write or maintain USF files by hand. Because of that, nobody actually uses USF, so there is no renderer that accepts it. Finally, USF offers very few real benefits over ASS, feature-wise. All in all, it just wasn't interesting to support it, and it faded into oblivion.

But here's an idea: Athenasub (the library that will be the backend of Aegisub 3.x series) will be completely format-agnostic. That would make it feasible to make Aegisub fully support USF, even its fanciest features, except that there is no renderer to display it. Should we bother? Is there still any interest in this format? At the moment, I have little interest in attempting to resurrect it (especially since we have our own plans with AS5), but if there is popular demand for it, I might reconsider. Last time I checked, not even the Matroska team seemed to care much for it anymore.

AS5. USF. And let's not forget Gabest's SSF. Do any of those formats have a future in the community? Or will fansubbers cling to ASS for the rest of their days?


Tuesday, July 22, 2008

Random code-snippets from VSFilter

I've been reading and hacking on the VSFilter code more than is probably healthy, and over time I have found a lot of funny/strange snippets of code.

For example, this line in GFN.cpp (Get File Name):

CString filename = title + _T(".nooneexpectsthespanishinquisition");

In VSFilter.cpp you can find this gem:

Do you know the "opaque box" background style? The one supported as an alternative to wide outlines? Well, here's how it's created:
CStringW str;
str.Format(L"m %d %d l %d %d %d %d %d %d",
-w, -w,
m_width+w, -w,
m_width+w, m_ascent+m_descent+w,
-w, m_ascent+m_descent+w);

m_pOpaqueBox = new CPolygon(style, str, 0, 0, 0, 1.0/8, 1.0/8, 0);

Yup, it creates a drawing object from a string. While it is a bit clever (the alternative would be much more code), it has some bad problems which you may have seen if you've used it yourself: if for any reason it needs to create multiple boxes, such as when you have multiple lines, the boxes will overlap, and a non-zero alpha will make that look really bad.

I wonder how long this line has been sitting there:
// TODO: handle collisions == 1 (reversed collisions)

Maybe pre-buffering could be more useful if this was actually implemented:
STDMETHODIMP_(bool) CRenderedTextSubtitle::IsAnimated(POSITION pos)

I still find this the weirdest part... there is a CPP file with a strange name. It contains, among other things, this function:
#define LEN1 (countof(str1))
#define LEN11 (countof(str1[0]))
#define LEN2 (countof(str2))
#define LEN3 (countof(str3))

static void dencode()
{
	int i, j;
	for(i = 0; i < LEN1; i++) for(j = 0; j < LEN11; j++) str1[i][j] ^= 0x12;
	for(i = 0; i < LEN2; i++) str2[i] ^= 0x34;
	for(i = 0; i < LEN3; i++) str3[i] ^= 0x56;
}

If you think it looks like XOR en/decryption you're right. That's exactly what it is.

It's from the file containing the void JajDeGonoszVagyok() function, and it's called "valami.cpp". This file also contains one other strangely named function: bool HmGyanusVagyTeNekem(IPin* pPin)

Both of those functions decrypt some strings written as arrays of numbers. These strings are the names of registry keys of other DirectShow filters. The JajDeGonoszVagyok function then detects the highest merit of all those filters and makes sure that DirectVobSub itself gets a merit higher than any of those... I think this is the DirectShow version of the "law of the jungle".


So, what happened to the competition?

On July 9 2006, the last actively maintained competitor that Aegisub had in the field of ASS-Based General-Purpose Subtitling Software (henceforth ABGPSS) - Sabbu - was dropped by its creator, kryptolus. Sabbu was an important program in the sense that it was the first ABGPSS to support Unicode and to be cross-platform. With Medusa and SubStation Alpha long dead, Subtitles Workshop being far from usable for anime fansubbing purposes, and SSATool designed for very specific purposes, Aegisub obtained a monopoly on the ABGPSS business.

But what really happened? Was that a good thing? Let's take a quick look at all the related software.

Substation Alpha started it all. Written in Visual Basic, it had many advanced features for its time, and many timers still think that it's the best timing tool ever made (I was recently shocked to learn that some old-school fansubbing groups have been using Aegisub even for timing!). As revolutionary as it was, it was essentially useless for typesetting and had too many quirks for most users.

  • What happened to it? It was discontinued many years ago by its creator, Kotus.
  • Who still uses it? Many old-school timers still do, apparently, and won't replace it with anything else.
  • Why was it important? It supported genlocks, and it helped ignite the digisubs revolution. The current standard subtitle format is a direct descendant of SSA's own version 4 format, which is the source of many oddities in the format.

Medusa is the tool that I actually used when I was a "fansubber" (it's worth pointing out that I was also a fansubber [sans quotes] for a brief while). Medusa is infamous for its instability: not only was it also written in Visual Basic, like its predecessor, but it managed to exploit that fact in unique new ways, giving it a remarkable propensity for misbehavior. It was such a marvelous tool that I (and many other typesetters) decided that it was better to simply typeset with good old Notepad+VirtualDub. This technique would later inspire Aegisub's video mode.
  • What happened to it? kaiousama, its creator, apparently attempted to rewrite it from scratch into a greater abomination known as "ChronoSub", which would use the dreadful USF format as its primary format. He vanished after that.
  • Who still uses it? Masochists. Aegisub was designed to replace Medusa specifically, so there is no real reason to use it, unless you are on Windows 9x.
  • Why was it important? It was the first ABGPSS to support the Advanced Substation Alpha (ASS) format, and the first to include a video display for typesetting.

Sabbu was an important step in the right direction. This was the only program still in active development when Aegisub started, and that competition probably helped both programs grow faster - I know that Aegisub did benefit from it! This program made fansubbing on UNIX systems a possibility, and solved many of the problems of the older tools. However, it suffered from an unusual GUI that many people could not get used to.
  • What happened to it? It was discontinued 3 years ago.
  • Who still uses it? Many timers believe that Sabbu's audio timing mode is as good as audio timing can get, and so they stick to it. Because of that, the Aegisub 2.x series basically copied Sabbu's timing mode, so now both programs are almost identical in that respect. (Except that Aegisub supports a few extra tools.)
  • Why was it important? It was the first time that an ABGPSS was developed following modern trends and it was, for a while, the only option that UNIX fansubbers had.

So the situation now is that Aegisub has nothing to compete against. I do not deny that this is somewhat frustrating - many people claim that the entire fansubbing community is driven by fierce competition between groups, and the same holds true of its tools.

Sure, Subtitles Workshop does many of the things that Aegisub does - but it does many essential things very poorly, and has horrible support for ASS. Certainly, there are specific tools (many kept "in house" by paranoid fansubbers who actually believe that they have much to gain from that practice) to do many tasks, especially karaoke. Even SSATool has been gradually incorporated into Aegisub since its developer joined our staff. But I miss the thrill of having a real, actively-developed tool to compete against.

Since the dawn of time (since before I started Aegisub in June 2005, that is), there have been rumors that a certain fansubber has been working on a certain fansubbing tool whose ultimate goal would be to replace Medusa (even the name implies that). Well, Medusa has, I believe, been replaced. Perhaps there is still hope for some fun games behind the scenes of the community?

Maybe it's only natural that such projects would eventually die out - Sabbu was the only open-source one amongst them, and, even then, kryptolus was its only developer. I hope that Aegisub survives for as long as subtitles and fansubbers are around, but I have to keep in mind that, statistically speaking, the odds aren't in my favor...

That said, remember that Aegisub is a free project - if you develop tools for the fansubbing community and would like to join our staff, we will always welcome developers who prove themselves capable of helping us. Ultimately, the goal of the Aegisub project is to be THE tool for all subtitling needs in the anime community.


Sunday, July 20, 2008

Why rendering \k and \kf effects is fast in VSFilter

This is a repost of something I wrote earlier on the AnimeSuki forums, in relation to a discussion of how much CPU time various kinds of karaoke effects take to render.

This discussion only covers TextSub (VSFilter); I don't know what other renderers do, and their use is still very limited. Also, everything that goes for \k also goes for \kf, \K and \ko: they all use the same rendering technique.
This will also explain a funny "artifact" some karaokers might have seen when using \kf with vertical karaoke.

First, while TextSub does have a function that should tell whether a line is animated or not (presumably so it could avoid re-rendering static lines for every frame), that function is a stub: it just says "return true;", so every line is always considered animated, no matter what's in it.

Next, the way \k effects are handled is using a "switchpoints" algorithm.
TextSub renders (up to) three different single-channel 6-bit bitmaps for each line: fill, border and shadow. (Shadow is Fill "expanded" to give an outline; Border is Shadow minus Fill.)
When the subtitle is to be painted onto the video, TextSub builds a list of switchpoints for each line component. A switchpoint has two parts: a colour (which includes alpha) and an end-coordinate. The end-coordinate is the pixel index on the scanline up to which the colour is valid.
(When a line has a vector-\clip, the vector drawing is rendered as a fourth 6-bit image which is used to mask the other layers while painting.)

When there is no \k effect, there is only one switchpoint for each component, which has the colour of it and the end-coordinate set to infinity (actually 0xFFFFFFFF).
When there is a \k effect, the current position of the highlight is calculated for the frame, and a switchpoint is added at the right coordinate. This is very fast to calculate. The pixel size of every syllable is already known (because the rasteriser breaks the line into "words" at every change in formatting - \k tags are formatting) and for \kf effects, getting the position within the syllable is a matter of simple linear interpolation between the endpoints of the syllable.
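As a rough sketch of the idea (with hypothetical names - VSFilter's actual code is structured quite differently), building the switchpoint list for a \kf syllable on one component might look like this:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical illustration of a switchpoint: a colour (alpha included)
// and the pixel index on the scanline up to which that colour is valid.
struct SwitchPoint {
    uint32_t colour;
    uint32_t end_x;
};

// Build the switchpoints for a line with a single \kf syllable.
// syl_start/syl_end are the syllable's pixel extents (already known from
// rasterisation), and t is how far (0.0 to 1.0) the current frame's time
// is into the syllable.
std::vector<SwitchPoint> BuildSwitchPoints(uint32_t pre_colour,
                                           uint32_t post_colour,
                                           uint32_t syl_start,
                                           uint32_t syl_end,
                                           double t)
{
    std::vector<SwitchPoint> sp;
    // Simple linear interpolation between the syllable's endpoints.
    uint32_t edge = syl_start +
        static_cast<uint32_t>(t * (syl_end - syl_start));
    sp.push_back({post_colour, edge});        // already-highlighted part
    sp.push_back({pre_colour, 0xFFFFFFFFu});  // rest of the scanline
    return sp;
}
```

A plain \k highlight is the same, except the edge jumps straight from one end of the syllable to the other instead of being interpolated.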

Now for painting an actual component.
For every scanline of the component, loop over each switchpoint. For each switchpoint, paint its colour to the video frame, using the component as mask and optionally also masking with a vector-\clip mask. When the endpoint of a switchpoint is reached, do the same for the next switchpoint, continuing where the previous one left off.
This is repeated for every scanline of the component. Also very fast.
(The case of just a single switchpoint, i.e. no \k effect, is questionably optimised by removing the switchpoints loop. I think this in practice only saves a few hundred, or maybe a few thousand, machine instructions in total for each component, but I haven't checked the actual code.)
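The painting loop above can be sketched like this (again with invented names and a deliberately naive per-channel blend - this is not VSFilter's actual code, which works on separate colour planes with heavily optimised routines):

```cpp
#include <cstdint>
#include <vector>

// Same hypothetical switchpoint as before: colour + end-coordinate.
struct SwitchPoint { uint32_t colour; uint32_t end_x; };

// Paint one scanline of a component onto the frame, walking the
// switchpoint list left to right. 'mask' holds the component's 6-bit
// coverage values (0..64), which act as the blend weight.
void PaintScanline(uint32_t* frame, const uint8_t* mask, uint32_t width,
                   const std::vector<SwitchPoint>& sps)
{
    uint32_t x = 0;
    for (const SwitchPoint& sp : sps) {
        // Clamp the "infinite" end-coordinate to the scanline width.
        uint32_t end = sp.end_x < width ? sp.end_x : width;
        for (; x < end; ++x) {
            uint32_t a = mask[x];  // 6-bit coverage, 64 == fully opaque
            // Naive channel-by-channel blend, for illustration only.
            uint32_t src = sp.colour, dst = frame[x], out = 0;
            for (int shift = 0; shift < 32; shift += 8) {
                uint32_t sc = (src >> shift) & 0xFF;
                uint32_t dc = (dst >> shift) & 0xFF;
                out |= (((sc * a + dc * (64 - a)) / 64) & 0xFF) << shift;
            }
            frame[x] = out;
        }
        if (x >= width) break;  // rest of the switchpoints are off-screen
    }
}
```

Note that the switchpoint list is built once per component per frame, while this loop runs for every scanline - which is exactly why the switchpoints are assumed to sit at the same x-coordinate on every scanline.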

As for why \k effects don't rotate when you use \frz (and family): they are scanline-based, and the switchpoints are assumed to always be at the same coordinate on every scanline. The switchpoints can't change between scanlines for the same component.

This should explain why \k effects are fast to render, unlike many \t-based effects. Using purely \k-based karaoke effects is safe when softsubbing; any modern CPU should be able to render them, since they don't really take any more CPU than rendering static lines.


The future of Aegisub

Greetings to all readers, and welcome to our new blog!

I'm Rodrigo Monteiro (a.k.a. amz), and I founded the Aegisub project together with Niels Hansen (jfs). Although I've written a good portion of all the code, lately real life has decided to get in my way and I haven't been contributing much - which is part of the reason why development has been slow.

But, to get to the point, this is what we're planning for the future of Aegisub:

  1. We want a stable 2.2.0 release ASAP. Nobody should be using 1.10 anymore.
  2. We want proper Linux, *BSD and OS X support. Although those three platforms work to varying degrees, Aegisub still works better in Windows.
  3. A major infrastructure review, which will decouple all the subtitle parsing and manipulation into an external library tentatively named Athenasub.
  4. Implement even more features!
  5. AS5.

I think that we're very close to point #1, and that depends mostly on jfs finishing the manual. On the UNIX front, we have verm porting the program to accomplish #2, but we still need more C++ developers to work on the actual features that don't work too well there - TheFluff has been trying to fix LAVC support, which is very problematic.

Point #3 is largely my responsibility. Athenasub will be a standalone C++ library that will load, manipulate and write subtitle files in many formats (all that Aegisub supports now, plus new formats, including image-based ones). It will probably also support some form of scripting similar to Avisynth's, which could be used to edit individual subtitles from the command line or to process whole batches at once. While the library itself is coming along nicely, integrating it into Aegisub will be extremely difficult, but will hopefully make the program more stable and easier to understand (source-wise). It will also warrant a major version change, so look forward to that in 3.1.x.

Point #4 includes all those features that we've always wanted but never got around to implementing... gradient and blur visual typesetting tools, a bleed checker, a script analyzer (which will search for potential issues and display them all in a list, with support for automation plugins), a character counter, and a few others.

Point #5 is probably the farthest in the future. AS5 is a subtitles format that is intended to replace the Advanced Substation Alpha (ASS) format, by adding many critical new features while overall simplifying the format. A draft specification is available here, but beware that it will certainly change much before it sees the light of day.

This is all that I can think of now. Perhaps jfs will have some more to say regarding his plans for the future of the program. Either way, I intend to detail those points more carefully in posts to come, so stay tuned.


Aegisub development blog now open

This is a bit of an experiment, something we (at the Aegisub team) have talked about a few times: A development blog.
The idea behind this blog is simply to have somewhere to post longer texts about Aegisub development and subtitling/video/technology in general.

I hope it'll become a success, but let's see :)