OpenAI ChatGPT Subtitles Translation

By Akbar

I’m sure you have heard of, and most likely also used, OpenAI’s ChatGPT by now. And in case you don’t know what it is, why not ask it yourself? It can answer this and many other questions in remarkable detail. Here is its response when I asked the chatbot, “What is Chat-GPT?”:

Chat-GPT (Generative Pre-trained Transformer) is a type of natural language processing (NLP) model developed by OpenAI, based on the GPT architecture. It is a large language model trained on a vast amount of data to generate human-like responses to natural language inputs. Chat-GPT can perform a variety of NLP tasks such as language translation, text summarization, question-answering, and conversational agent…

And while all its language-processing capabilities are amazing (sometimes unbelievable), here we are mostly interested in its “translation” function and how we can use it to translate subtitles. As many of you know, I have been developing Online Subtitle Translator for a few years now, and it supports various machine translation engines and APIs such as Google, Azure and DeepL. These engines do a great job considering that no human is involved, but machine translation is still far from perfect. The gap is shrinking fast, though, and large language models (LLMs) are probably the future of machine translation. ChatGPT by OpenAI is one of these LLMs and is the main subject of today’s post.

As a user of various translation APIs, I have been keeping an eye on GPT’s translation capabilities for a few months now, and I must say I’m very impressed with the initial results. GPT translation is on par with, and sometimes better than, long-standing translation services like Google or DeepL (as discussed here and here), which are designed for the single task of language translation. What’s even more interesting is that these models were never specifically built for translation; rather, it comes as a side effect of their general natural language processing capabilities, perhaps closer to the way a human brain translates.

I have been playing with ChatGPT’s translation capabilities for subtitles, and I found that it often generates far better translations than the Google and Azure translators for the English <-> Urdu language pair. I can’t personally say how it compares with DeepL, as DeepL does not currently support my native language. Therefore, I will let my readers answer this for me.

The integration of ChatGPT in the Online Subtitle Translator is not that different from other paid translators like Microsoft Azure or Google, as shown below:

After you select the Translator and Target Language for your subtitles, the user interface shows the following parameters:

From Language: This one is optional, as in most cases GPT can detect the correct source language from the given content. However, providing the correct language may improve the overall translation quality.

OpenAI Model: The choice of the OpenAI language model affects the overall translation quality and cost. You can read more about it here. For now, only a single model is supported, but support for other language models will be added soon.

Translation Type: The Free mode is for demo or evaluation and is limited to 30 lines. It is a good choice if you just want to get an idea of the overall translation quality and experience of this new translator before paying for it.

As you may notice, the OpenAI translator is substantially cheaper (sometimes 5-10 times) than other paid translators. But then it’s also painfully slow. More on this below.

Why is it so slow?

So you tried ChatGPT and liked the translation quality, but I’m sure you noticed that it’s extremely slow. Even a 20-30 line translation sometimes takes 30-40 seconds, and reasonably large subtitle files for full videos and movies take ages. This is mainly due to the following two factors:

  1. The first and most important one is overall infrastructure overload. ChatGPT went from thousands of users to millions in a very short period. OpenAI is working hard on this, but it may take some time to scale its infrastructure to handle all this demand.
  2. The second is how LLMs handle text completion tasks (in our case, translation) in general. These models mostly generate their output word-by-word (or token-by-token, to be exact), and each new token depends on the tokens produced before it, which makes it hard to parallelise the generation of a single response.
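
To make the second point concrete, here is a toy sketch of a token-by-token generation loop. It is not how any real model is implemented: `generate` and `toy_next_token` are hypothetical names, and the "model" simply uppercases the source words one at a time. What it does show is the sequential dependency: every step needs the output of all previous steps, so the steps for one response cannot run side by side.

```python
from typing import Callable, List


def generate(
    prompt_tokens: List[str],
    next_token: Callable[[List[str]], str],
    max_new_tokens: int = 50,
    stop_token: str = "<eos>",
) -> List[str]:
    """Produce tokens one at a time; each step sees everything produced so far."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        token = next_token(tokens)  # depends on ALL previous tokens
        if token == stop_token:
            break
        tokens.append(token)
    # Return only the newly generated part.
    return tokens[len(prompt_tokens):]


def toy_next_token(context: List[str]) -> str:
    """Hypothetical stand-in for a real model: 'translates' by uppercasing.

    The context looks like: source words, "<sep>", words generated so far.
    """
    sep = context.index("<sep>")
    source = context[:sep]
    produced = context[sep + 1:]
    if len(produced) >= len(source):
        return "<eos>"
    return source[len(produced)].upper()


if __name__ == "__main__":
    print(generate(["hello", "world", "<sep>"], toy_next_token))
```

In a real LLM, each call to the equivalent of `toy_next_token` is a full pass through a very large network, which is why long subtitle files take a while.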

But the good thing is that work is under way on both the hardware and the algorithm side, and I’m sure things will improve substantially in the near future.

What’s Next?

Integrating the first LLM-based translator for subtitle translation was an interesting and challenging task. Unlike other translation APIs, where you simply pass the source and target languages as parameters, here I had to build a translation task prompt/command to get the correct translation. I also had to split and join the subtitle lines to minimise token usage while keeping context between lines for improved quality. There is probably still room for improvement here, so I will keep reviewing and improving these parts.
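
The general idea of building a prompt and splitting and joining lines can be sketched roughly as below. This is only an illustration under my own assumptions, not the code used in Online Subtitle Translator: the function names (`chunk_lines`, `build_prompt`, `parse_response`), the prompt wording, the numbered-line format and the batch size of 20 are all hypothetical, and each subtitle line is assumed to fit on a single line of text.

```python
import re
from typing import Dict, List, Optional


def chunk_lines(lines: List[str], max_lines: int = 20) -> List[List[str]]:
    """Split subtitle lines into batches so each request stays small
    while neighbouring lines still share context inside a batch."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


def build_prompt(
    lines: List[str],
    target_language: str,
    source_language: Optional[str] = None,
) -> str:
    """Build a translation instruction followed by numbered subtitle lines.

    The source language is optional; when omitted, the model is left to detect it.
    """
    source = f"from {source_language} " if source_language else ""
    header = (
        f"Translate the following subtitle lines {source}to {target_language}. "
        "Keep the numbering and return exactly one translated line per number."
    )
    body = "\n".join(f"{i}. {line}" for i, line in enumerate(lines, start=1))
    return f"{header}\n\n{body}"


# Matches lines such as "3. text" or "3) text".
_NUMBERED = re.compile(r"^\s*(\d+)[.)]\s*(.*)$")


def parse_response(text: str, expected_count: int) -> List[str]:
    """Join the model's numbered reply back into an ordered list of lines."""
    found: Dict[int, str] = {}
    for raw in text.splitlines():
        match = _NUMBERED.match(raw)
        if match:
            index = int(match.group(1))
            if 1 <= index <= expected_count:
                found[index] = match.group(2).strip()
    missing = [i for i in range(1, expected_count + 1) if i not in found]
    if missing:
        raise ValueError(f"missing translations for lines: {missing}")
    return [found[i] for i in range(1, expected_count + 1)]
```

Numbering the lines is one simple way to map the reply back onto the original subtitle timings, and raising an error on a missing number lets the caller retry a batch instead of silently shifting lines.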

Another interesting point of discussion is whether I should expose the various AI model tuning parameters (e.g. to control creativity vs predictability) in the translation user interface. Open to suggestions here.
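
For readers wondering what a “creativity vs predictability” control does, the usual knob is a sampling temperature. The sketch below shows the standard softmax-with-temperature calculation on made-up scores; the function name and the numbers are mine, purely for illustration, and a real translator would simply pass such a value through to the model API.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Turn raw model scores into probabilities.

    A low temperature sharpens the distribution (more predictable output);
    a high temperature flattens it (more varied, 'creative' output).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # made-up scores for three candidate words
    print(softmax_with_temperature(scores, temperature=0.2))
    print(softmax_with_temperature(scores, temperature=2.0))
```

For translation, a low value is usually the safer default, since the same subtitle line should ideally translate the same way every time.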

Also, there are other interesting LLMs besides OpenAI’s ChatGPT (from Google and Facebook, to name a few). Probably time to explore a few of those, too? If you have any suggestions or requests, please feel free to let me know as well.
