VLC media player, the popular open-source software developed by nonprofit VideoLAN, has topped 6 billion downloads worldwide and teased an AI-powered subtitle system.
The new feature automatically generates real-time subtitles — which can then also be translated in many languages — for any video using open-source AI models that run locally on users’ devices, eliminating the need for internet connectivity or cloud services, VideoLAN demoed at CES.
↫ Manish Singh at TechCrunch
VLC is choosing to throw users who rely on subtitles for accessibility or translation reasons under the bus. Using speech-to-text and even “AI” as a starting point for a proper accessibility expert or translator is fine, and can greatly reduce the workload. However, as anyone who works with STT and “AI” translation software knows, their output is highly variable and wildly unreliable, especially once English isn’t involved. Dumping the raw output of these tools onto people who rely on closed captions and subtitles to even be able to view videos is not only lazy, it’s deeply irresponsible and demonstrates a complete lack of respect and understanding.
I was a translator for almost 15 years, with two university degrees on the subject to show for it. This is obviously a subject close to my heart, and the complete and utter lack of respect and understanding from Silicon Valley and the wider technology world for proper localisation and translation has been a thorn in my side for decades. We all know about bad translations, but it goes much deeper than that – with Silicon Valley’s utter disregard for multilingual people drawing most of my ire. Despite about 60 million people in the US alone using both English and Spanish daily, software still almost universally assumes you speak only one language at all times, often forcing fresh installs for something as simple as changing a single application’s language, or not even allowing autocorrect on a touch keyboard to work with multiple languages simultaneously.
I can’t even imagine how bad things are for people who, for instance, require closed-captions for accessibility reasons. Imagine just how bad the “AI”-translated Croation closed-captions on an Italian video are going to be – that’s two levels of “AI” brainrot between the source and the ears of the Croation user.
It seems subtitles and closed captions are going to be the next area where technology companies are going to slash costs, without realising – or, more likely, without giving a shit – that this will hurt users who require accessibility or translations more than anything. Seeing even an open source project like VLC jump onto this bandwagon is disheartening, but not entirely unexpected – the hype bubble is inescapable, and a lot more respected projects are going to throw their users under the bus before this bubble pops.
…wait a second. Why is VLC at CES in the first place?
I’m not sure I see a scenario where someone would lose their job because their client thinks “I’ll tell everyone to watch the video with VLC’s local transcript/translation feature instead of paying a translator”. This is not intended for companies but for end users, and it’s not meant to run as a service but as a local feature on a case-by-case basis. Even an obscure video content producer will keep advertising Google’s version of this feature by keeping their videos on YouTube before telling their audience to use VLC.
I know and understand your views on this matter, but they may have inflated the importance of this feature release compared to its real value.
On the contrary, I think accessibility-wise, it’s an (imperfect, for sure) good step, if only for transcripts.
I might of course be wrong, and the future will tell.
Don’t worry, society will renormalize after the Great Reset.
“Croation” ? or “Croatian” as in, the subdialect of new shtokavian ? 🙂
Honestly, it’s merely meant to be a tool, to fill a gap for those who don’t have a real/official subtitle on hand… as you say, STT is less than good, AI or not, and it’s obvious enough… I’d say, as long as it’s clearly marked as automated and states that it’s far from precise, we’re fine… there will be enough of the public at large, for content that needs it, to ask for a real translation with quality and finesse
But yeah, VideoLAN at CES is kinda weird 🙂 (go go Epitech ! I do remember getting drunk in the parking lot they squatted under the school to hang out… it was the birthplace of a few amazing projects!)
“VLC is choosing to throw users who rely on subtitles for accessibility or translation reasons under the bus.”
What is this utter nonsense you wrote? If subtitles exist in the media, it’s not like VLC is going to not display them. Stop trying to shoot the messenger…
So, the introduction of “AI” has improved the quality of content on the internet? What makes you think the introduction of “AI” in subtitling/closed captioning is going to be any different?
I swear to god, pattern recognition is a lost art.
Absolutely. Subtitles actually reflect the intent of the original content now instead of being platforms for translators to inject bias. If translations aren’t good enough right now… give it a few more months. Translation as a job is a decade obsolete anyway in all but very demanding scenarios.
People can get better, faster, cheaper subs and translations today than ever.
When real subtitles are absent, it’s better than nothing, Thom. I often use the automatic translation of X/Twitter for posts written in languages that are completely foreign to me, and most of the time the translation is good. Things have moved on from the days of Babelfish, Thom.
I’ve used tools far, far more advanced than Google Translate or ChatGPT. They are not even remotely capable of doing what is needed for proper accessibility, but they will still replace actually educated specialists. If you don’t think this is going to make subtitling and closed captioning worse for everyone, I would once again like to point you to the web. Has the introduction of “AI” improved the quality of web content?
If no, why do you think it’s going to be different for accessibility?
An educated specialist who requires a 70k payroll and still makes mistakes and injects biases vs nearly free AI translation… 99.9999% of people and companies are gonna pick the cheaper option and deal with the usually very minor flaws.
AI models can also generate translations at much lower latency. Why wait a month or more for a bad translation that fails to convey meaning when I can get a good translation in milliseconds in a dozen languages?
I think many people will just have to change jobs… teaching languages isn’t going away but translation is dead.
I don’t have a dog in this fight, but I will say that AI is absolutely vulnerable to bias, and this has been proven time and again with ChatGPT and other content-scraping AI tools. It’s literally garbage in/garbage out, it can only learn from what it scrapes and the Internet at large is biased, period. Anything built by humans with emotions and opinions is subject to bias, including AI. The difference between a human translator and an AI translator is that the human is perceptive enough to realize they might have projected their own bias or opinion and is capable of correcting themselves, or submitting their work to another human editor for review and correction. Currently there is no AI out there that can self-review like that, because they don’t have the ability to reason, they just copy and paste in an efficient way. It’s why you see warnings everywhere AI is used that implore you to fact check and review the results, because they are often wrong.
https://www.ohchr.org/en/stories/2024/07/racism-and-ai-bias-past-leads-bias-future
https://time.com/5520558/artificial-intelligence-racial-gender-bias/
Quality releases will still use professional translators. Non-quality releases (and Google services) moved to automatic translation years ago, when Google Translate became good enough. I have the language in my Google Account set to Greek, so I know this happened years before ChatGPT.
Thom, web content and movie subtitles are completely different things. With AI and web content, I’m with you without a moment of hesitation.
But VLC and movie subtitles… just imagine a recent example (I’m European, not American, just for background): I’m learning Spanish, and I’m good enough to listen to podcasts or radio, where people tend to speak more clearly and distinctly. But with *some* movies where absolutely no subtitles are provided on the DVD (speaking of older movies) and the speech is fast, I’m just struggling, so this kind of automatic AI subtitles would really be helpful. Even if lousy – my mind is flexible enough to compensate 😀
And mind you – even before you say anything – we are talking here about cases where I don’t expect anyone (even a non-professional!) to create subtitles for some old niche movies.
On Windows I’ve never had a reason to use anything other than MPC-HC (actively maintained by clsid2). Old school interface, intuitive controls, can fetch subtitles from Podnapisi or OpenSubtitles, no plugins required. The GOAT.
Interest rates (the price of money) have been too low/cheap for too long. A lot of businesses were created that can’t exist in a higher interest rate environment. We are now seeing those businesses get caught out with no revenue, and they are rushing to automate and cut costs, or jack up prices and try to gouge their way out of it. This is another symptom of this. Enshittification is another symptom of this. The market is broken and it will reset to a new equilibrium, but the process is going to be painful; I expect ~50% of software companies are going to go bust. AI isn’t living up to the hype, and at some point the market will catch up. Engineers already know.
I see this as positive news tbh. A lot of content isn’t, and never will be, translated. And this gives a “good enough” option for those who need it.
An example I might use this for is sports streams, where the commentary isn’t always in my language of choice.
I recently watched a rugby game between Georgia and Japan, but it wasn’t available with commentary from any of the 3 languages I can speak/understand. This would allow me to get an idea of what the commentary team were saying!
This is the dumbest article I’ve read this year.
This seems to me to be the precise thing that technology is good for. Taking something that was once super expensive and making it accessible to all. I don’t think anyone suggests that it should be used to translate Harry Potter, but it can certainly be useful to transcribe and translate the odd video that no one would ever pay to have translated.
Heck, it can even be a good tool for a professional transcriber and translator to generate a reasonable first draft, making even professional transcription and translation cheaper.
What’s the world coming to if a tech site can no longer celebrate a new technology?