Software by Steven

Sometimes a square peg is a round hole

Subtitle Parsing: Planning for Maintainability

While other commitments have kept me from contributing a lot of code to Popcorn.js in recent months, I’ve had plenty of time to think about how to tackle a group of related tickets assigned to me. Approximately a year ago, for Popcorn versions 0.3 and 0.4, I was responsible for incorporating some earlier subtitle parsing into Popcorn. That grew into Popcorn supporting text display for seven standardized subtitle formats. Five of these formats also include their own in-source formatting, each with its own syntax. They fall into two main classifications:


  • TTXT Format
  • TTML Format


  • WebVTT
  • SubStation Alpha (SSA)
  • Advanced SubStation Alpha (ASS)

With five different ways to represent the same information, I plan to avoid duplication as much as possible. When done properly, this not only makes initial development easier, but also simplifies future maintenance and improvements. A real example of why this matters: in May/June 2011 I had a nearly complete initial version of a WebVTT parser, with just a few CSS-related quirks to work out. After I put it down for a while, the standard evolved such that much of what I had done was obsolete. Since it wasn’t modular, most of it has been discarded into a file on my desktop.

Looking forward, I see two independent cases for code maintenance: format evolution and CSS evolution. Similar domain problems have been solved by translating the source (here, raw subtitles) into a universal intermediate language, or machine interpretation, which then gets processed and output. Human language translation has been approached like this, as have cross-platform programming languages. Both Java and .NET compile to an intermediate language (bytecode and MSIL, respectively), which is then translated to the desired platform-specific output at runtime. While the second translation adds a very small overhead, a Microsoft engineer has said that maintainability increases drastically.
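To make the intermediate-language idea concrete, here is a minimal sketch, written in Python for brevity and not in Popcorn's own JavaScript. Everything in it is an assumption for illustration: the `Cue` type, the two deliberately simplified input syntaxes (they only resemble WebVTT and SSA, and are not spec-compliant), and the function names. The point is only that each parser targets one shared structure, and a single output stage consumes it.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Cue:
    """Hypothetical format-neutral cue: the intermediate representation."""
    start: float  # seconds
    end: float    # seconds
    text: str


def to_seconds(stamp: str) -> float:
    """Convert 'H:MM:SS.fff' (or 'MM:SS.fff') into seconds."""
    seconds = 0.0
    for part in stamp.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_vtt_like(block: str) -> Cue:
    """Simplified WebVTT-style cue: a 'start --> end' line, then text lines."""
    timing, *lines = block.strip().splitlines()
    start, end = timing.split("-->")
    return Cue(to_seconds(start), to_seconds(end), "\n".join(lines))


def parse_ssa_like(line: str) -> Cue:
    """Simplified SSA-style line: 'Dialogue: start,end,text'."""
    _, fields = line.split(":", 1)
    start, end, text = fields.strip().split(",", 2)
    return Cue(to_seconds(start), to_seconds(end), text)


def render_html(cue: Cue) -> str:
    """The single output stage, shared by every parser."""
    return (
        f'<span data-start="{cue.start}" data-end="{cue.end}">'
        f"{escape(cue.text)}</span>"
    )
```

Both parsers produce the same `Cue` for equivalent input, so a change to the output side (say, different markup or CSS handling) touches `render_html` only, and a change to one format's syntax touches only that format's parser.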

Essentially, given n input formats and m output formats, an intermediate level means there are n + m + 1 possible combinations, rather than n × m.

[Figure: Translation Count]
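As a quick check of the arithmetic, the sketch below takes the post's n + m + 1 figure at face value and compares it with the n × m direct case. The function names are made up for this example, and the sample values of n and m are illustrative rather than a statement about how many output targets Popcorn has.

```python
def direct_count(n: int, m: int) -> int:
    """Every input format translated straight to every output format."""
    return n * m


def intermediate_count(n: int, m: int) -> int:
    """The post's count when an intermediate level sits in the middle."""
    return n + m + 1


if __name__ == "__main__":
    # Five input formats (as in this post) against a growing number of outputs.
    for m in (1, 2, 3, 4):
        print(f"n=5, m={m}: direct={direct_count(5, m)}, "
              f"intermediate={intermediate_count(5, m)}")
```

By this count, the intermediate level costs more when there is a single output (7 against 5) and pulls ahead once there are two or more (8 against 10 at m = 2). The gap widens as either side grows, which is the situation where both the formats and the CSS side keep evolving.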

Strengthening the argument for this approach is that certain common display functionality (CSS class lookups, creation, etc.) will be required by multiple parsers, and changes must be reflected in all parsers. It is with this knowledge that I plan to stand on the shoulders of giants.
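One way to picture the shared display functionality is a single helper that every parser calls for CSS class lookup and creation. The following is only a sketch under assumptions: `StyleRegistry`, its method names, and the `sub-style-` prefix are hypothetical and not part of Popcorn, and it is written in Python rather than JavaScript to keep it short.

```python
from typing import Dict, Tuple

Declarations = Tuple[Tuple[str, str], ...]


class StyleRegistry:
    """Hypothetical shared helper: one CSS class per distinct declaration set."""

    def __init__(self, prefix: str = "sub-style-") -> None:
        self._prefix = prefix
        self._classes: Dict[Declarations, str] = {}

    def class_for(self, declarations: Dict[str, str]) -> str:
        """Look up the class for these declarations, creating it if needed."""
        key = tuple(sorted(declarations.items()))  # order-insensitive key
        if key not in self._classes:
            self._classes[key] = f"{self._prefix}{len(self._classes)}"
        return self._classes[key]

    def stylesheet(self) -> str:
        """Emit one CSS rule per registered class, in creation order."""
        rules = []
        for key, name in self._classes.items():
            body = "; ".join(f"{prop}: {value}" for prop, value in key)
            rules.append(f".{name} {{ {body} }}")
        return "\n".join(rules)
```

If each parser asks the same registry for a class name instead of building its own CSS, a fix to class creation lands in one place and every format picks it up.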

