It is called Simba and has a clear mission: to make the internet easier to understand for everyone. The tool, which simplifies German-language texts, offers two AI-supported solutions: a web app for simplifying your own texts and a browser extension that automatically summarises texts on websites. The aim is to break down digital language barriers.
Participation instead of digital language barriers
Texts on websites and in online articles are a barrier not only for people with learning difficulties or those learning German. “Our research shows that the complicated language used on public administration websites and in the education and science sector in particular excludes a significant proportion of the population from important information,” explains Freya Hewett, a researcher at the Alexander von Humboldt Institute for Internet and Society (HIIG).
Even experienced native speakers can find it difficult to understand the content behind complex sentences and technical terms. However, participation in society depends on being able to use information and services on the internet. “Simplified language can help close these gaps,” says Hewett.
Simba’s free solution is aimed specifically at end users, allowing them to simplify everyday texts themselves. The AI-supported applications replace long words with shorter terms that have a similar or identical meaning. They shorten sentences and add information to make contexts clearer. This principle not only helps users of the application to understand texts better, but also helps media professionals to formulate texts that are easier to understand.
There are already comparable solutions that simplify language automatically. However, most of them are fee-based and are mainly used by institutions and companies. Simba, on the other hand, wants to make the service accessible to as many people as possible without a payment barrier.
“Our goal is for Simba to become an everyday tool that helps everyone who wants to use text simplification in their daily lives,” says Dr Theresa Züger, head of the “Public Interest AI” research group, in which the AI application was developed.
The technology behind Simba
Both Simba applications are based on a so-called “text generation model”. Models of this kind are also known as large language models or foundation models. Prominent examples are GPT-4, Mistral 7B and Llama.
After being trained on large amounts of text data, the models calculate which word is most likely to come next in a sequence. Simba is based on the foundation model Llama-3-8B-Instruct, which was fine-tuned using German-language newspaper articles.
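The idea of calculating the most likely next word can be illustrated with a deliberately tiny example. The sketch below is not how Simba or Llama-3-8B-Instruct works internally: it swaps the neural network for a simple word-pair count, and the function names and the sample sentence are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follower counts for `word` into probabilities."""
    followers = counts.get(word.lower())
    if not followers:
        return {}  # the word never appeared with a follower in the training text
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}


def most_likely_next_word(counts, word):
    """Return the follower with the highest probability, or None if unknown."""
    probabilities = next_word_probabilities(counts, word)
    if not probabilities:
        return None
    return max(probabilities, key=probabilities.get)


# Hypothetical toy "training data"; real models are trained on vastly more text.
corpus = "the cat sat on the mat the cat ate the fish"
counts = train_bigram_counts(corpus)

print(next_word_probabilities(counts, "the"))
print(most_likely_next_word(counts, "the"))
```

In this toy text, “the” is followed by “cat” twice and by “mat” and “fish” once each, so “cat” is the most likely next word. A large language model makes the same kind of prediction, but takes the entire preceding text into account rather than just one word.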
AI application for the common good
The “Public Interest AI” research group focuses on the question of which principles artificial intelligence must fulfil in order to benefit society as a whole. In this context, the team also develops its own AI prototypes, such as Simba, with which it tests these principles in practice.
One of the principles is that the AI application is operated without commercial interests. In addition, the source code and the underlying models are freely accessible. This enables transparent collaboration in which a community of researchers, inclusion experts and users can continuously develop and improve Simba. “The target groups we address, such as people with learning difficulties or people who don’t speak German as their first language, are very heterogeneous. Our aim is to improve the language model through continuous feedback and thus create simplifications that benefit many people,” emphasises Freya Hewett.
Simba is looking for a partner
Freya Hewett points out that with Simba, as with all text generation models, there is of course the possibility that automatically generated summaries contain incorrect information. “Nevertheless, we are convinced that Simba offers valuable support.” To ensure that the facts are correct, Hewett recommends carefully comparing the input and output text of the AI application.
The beta version of Simba was funded by the German Federal Ministry of Education and Research (BMBF) and is available free of charge until further notice. However, the running costs of operating such an AI application are considerable. To ensure the continuous availability and further development of Simba, HIIG is looking for further cooperation partners. After all, Simba should remain freely available in order to promote a fairer and more inclusive digital future by overcoming digital language barriers.