It seems like every piece of modern software now comes with AI to provide its customers with features of questionable quality, practicality, and security. Mozilla and its Firefox browser are not immune to this widespread use of AI, and the company wants to implement AI for better accessibility.
In a recent post on Mozilla Hacks, Tarek Ziadé explained how Firefox will use artificial intelligence to improve accessibility, namely by offering AI-generated image captions for people who rely on assistive technologies, such as screen readers.
Image captions, or "alt text," give readers the context they need, but sadly, many writers ignore alt text, leaving nearly half of all images without proper descriptions. With the latest AI advancements, it is now possible to run a machine-learning model locally to auto-generate captions without sending potentially sensitive information to servers.
Firefox 130 will ship in the Nightly Channel with a new feature for the PDF editor that will generate alt text using small open-source Transformer-based machine-learning models. Mozilla claims they are good at describing images without a heavy resource tax. Therefore, Firefox users should get image descriptions (first in PDFs) even on less powerful devices.
According to the blog post, small models with over 200 million parameters can generate alt text while taking up less than 200MB of disk space and producing output in a matter of seconds. Their descriptions are less detailed and less accurate than those of the giants among modern LLMs, like the latest GPT-4o, but developers do not want to overwhelm users with too much information. Therefore, Firefox will focus on producing a one-sentence description, such as this:
"A group of people in an office celebrates with a lit birthday cake in the foreground and a smiling woman in the background."
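For readers who want to picture the flow, here is a minimal Python sketch of the idea described above: keep any alt text the author already wrote, otherwise ask a local captioning model for a description and trim it to a single sentence. This is not Firefox's implementation. The names `describe_image`, `first_sentence`, and `stub_captioner` are hypothetical, and the stub simply stands in for whatever on-device model would actually produce the caption.

```python
from typing import Callable, Optional


def first_sentence(text: str) -> str:
    """Return only the first sentence of a caption (or the whole text if none ends)."""
    text = " ".join(text.split())  # collapse runs of whitespace
    for i, ch in enumerate(text):
        # A sentence ends at . ! or ? followed by a space or the end of the text.
        if ch in ".!?" and (i + 1 == len(text) or text[i + 1] == " "):
            return text[: i + 1]
    return text


def describe_image(
    image_bytes: bytes,
    existing_alt: Optional[str],
    captioner: Callable[[bytes], str],
) -> str:
    """Prefer author-written alt text; fall back to a locally generated caption."""
    if existing_alt and existing_alt.strip():
        return existing_alt.strip()
    # The image is only passed to the local callable; nothing is sent to a server.
    return first_sentence(captioner(image_bytes))


def stub_captioner(image_bytes: bytes) -> str:
    """Hypothetical stand-in for a small on-device captioning model."""
    return (
        "A group of people in an office celebrates with a lit birthday cake. "
        "Extra detail that a short description would drop."
    )


if __name__ == "__main__":
    print(describe_image(b"", None, stub_captioner))
    print(describe_image(b"", "Team photo", stub_captioner))
```

In a real setup the stub would be replaced by a call into a locally loaded captioning model; the rest of the flow stays the same.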
There are several benefits to using a local model. Besides improved privacy (images do not go anywhere for processing), users get better resource efficiency, more transparency, lower CO2 emissions (training large models generates a lot of carbon emissions), and frequent updates with regular enhancements.
You can find more technical information in the official post.