
110 #apertus #instruere

Learn about my initial experiences working with the Apertus Large Language Model during the Swiss {ai} Weeks, with advice on getting started yourself.
Still image from a generative animation - Aperture 1

This continues from my previous post, where some history of the development of open source AI projects was followed by a discussion of the new Apertus frontier model from Switzerland. Here is a summary of this article, generated by Apertus 8B (*all spelling as generated!):

The article covers the aperture model (built by the Swiss AI Initiative with an open-source ethos), its capabilities, setup options (cloud or local with hardware considerations), and the importance of community input for trust and governance.
It also touches on broader questions about open data, model transparency, and the Swiss cultural/national context of the model's launch. (see Thoughts on Open & Closed)
Practical tips for developers or researchers on downloading, using, and setting up the model are provided along with resources (e.g., in the cloud, Ollama, Hugging Face, and vLLM) and explanation of costs.

From the slides of the Apertus Tech Sessions prepared for the Swiss {ai} Weeks, we are clearly reminded that the goal of the project is to 1) "Develop capabilities, know-how, and talent to build trustworthy, aligned, and transparent AI." and 2) "Make these resources available for the benefit of Swiss society and global actors" – i.e. nowhere does it say that we should expect a production-ready service. The sessions note that our goal here is to help create an open development ecosystem – especially as open models approach closed models in performance over time.

We have seen charts like these, and want to join the fray:

Massive multitask language understanding performance of open-source and private AI models. Source: ARK Invest

What we know is that the Apertus model was trained on the Alps supercomputer, operational at CSCS since September 2024: a data center of over 10'000 top-of-the-line NVIDIA Grace Hopper chips, with a computing power of 270-435 PFLOPS, reportedly ranked 6th globally (June 2024).

Here is how Apertus compares 'on paper' with similar models:

| Model | Parameters | Openness | Language Coverage | Training Hardware | Strengths |
|---|---|---|---|---|---|
| Apertus | 8B / 70B | Open source, weights, data | >1,500 | Alps: 10,752 GH200 GPUs | Linguistic diversity, data privacy, transparency |
| GPT-4.5 | ~2T (estimated) | Proprietary | ~80-120 | Azure: ~25,000 A100 GPUs | Creativity, natural conversation, agentic planning |
| Claude 4 | Not published | Proprietary | ? | Anthropic: internal clusters | Adaptive reasoning, coding |
| Llama 4 | 109B / 400B | Open weight | 12, with 200+ in training | Meta: ~20,000 H100 GPUs | Multimodality, large community, agentic tasks |
| Grok 4 | ~1.8T MoE | Proprietary | ? | Colossus: 200,000 H100 GPUs | Reasoning, real-time data, humor... |

With a basis of approximately 15 trillion tokens, a LOT of data has gone into its preparation. Particularly noteworthy are the high proportion of non-English data (40%) and the coverage of over 1,500 languages, including rare ones like Romansh or Zulu. The data was ethically sourced – without illegal scraping, respecting robots.txt and copyright requirements. While this limits access to certain specialized information, CSCS emphasizes: «For general tasks, this doesn't lead to measurable performance losses.»

The Evaluation section of the Apertus Model Card and Section 5 of the Tech Report contain various evaluation data, and I recommend the blog posts at effektiv.ch for a good overview. In the following sections, I will focus on getting the model up and running for your own testing.

Apertus in the cloud

There are already several good options for playing with Apertus with a minimum of setup, as outlined in the Resources section of our Bern {ai} Hackathon last week:

PublicAI
The main chat app for the Public AI Inference Utility, based on OpenWebUI

Direct link: https://chat.publicai.co/

Swiss AI Platform
Enterprise-level support for AI services from Swisscom

Direct link: https://digital.swisscom.com/products/swiss-ai-platform

Brandbot
AI-Platform - powered by BEGASOFT

Direct link: https://www.begasoft.ch/brandbot

Hugging Face
Share your Spaces, Datasets and Models in the world’s largest model zoo, official partner of Swiss {ai} Weeks

Direct link: https://huggingface.co/Swiss-AI-Weeks

With kind thanks to the three providers above, we managed to have a good start last week. Our hackathon platform swissai.dribdat.cc (Dribdat) is connected to the PublicAI API – our thanks for the free service – generating evaluations for all the teams, like this:

We also added a RunLLM widget (kind thanks for sponsoring a free agent) for user support, and potentially comparison with Apertus:

Would you like to run your own LLM, on premise or in the cloud? Then we need to have a quick talk about the money.

If I were GPU-rich ...

A famous musical scene from the movie Fiddler on the Roof (1971)

On my Hugging Face profile, you can see the hardware that I brought to our Bern {ai} Hackathon last week: two rather average workstations, at current market prices around 700 CHF each. They are representative of what most enthusiasts could afford. 23.24 TFLOPS is actually not that bad, as far as value for money goes.

The workstation we borrowed for the Data Hackdays in Uri (shown above) was significantly more powerful, costing 2-3× more than the builds above. Nevertheless, it comes in at less than half the TFLOPS – reminding us that these values are only a rough approximation of true performance. See the project report of our local AI installation in Uri here (in German).

In my tests, since you will need to reserve some memory for your operating system and programs, at least 16 GB of VRAM, and ideally 20 GB, should be available to run the smaller Apertus 8B model as it was provided on launch day. In other words, you would need to get hold of a top-of-the-line graphics card.
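As a rough rule of thumb, the memory you need is parameter count × bytes per weight, plus some headroom for the KV cache and runtime buffers. Here is a back-of-the-envelope sketch; the 20% overhead factor is my own ballpark assumption, not an official Apertus figure:

```python
# Rough VRAM estimate for running an LLM locally.
# The 20% overhead for KV cache and runtime buffers is a ballpark
# assumption, not an official figure from the Apertus team.

def vram_estimate_gb(params_billion: float, bytes_per_weight: float,
                     overhead: float = 0.2) -> float:
    """Estimated VRAM in GB: weight memory plus a fixed overhead fraction."""
    weights_gb = params_billion * bytes_per_weight  # 1B params at 1 byte ~ 1 GB
    return round(weights_gb * (1 + overhead), 1)

# Apertus 8B in 16-bit (2 bytes/weight) vs. 4-bit quantized (0.5 bytes/weight)
print(vram_estimate_gb(8, 2.0))   # ~19.2 GB -> you want a 20+ GB card
print(vram_estimate_gb(8, 0.5))   # ~4.8 GB -> fits consumer GPUs
```

The 16-bit estimate lines up with the 16-20 GB observed in my tests, and shows why quantized builds (more on those below) are so attractive on consumer hardware.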

Image of an RTX 4000 graphics card courtesy of NVIDIA

This could be the NVIDIA RTX 4000 SFF Ada pictured above, currently retailing at around 1150 CHF, or the RADEON RX 7900 XTX at about 800 CHF – though note that support for Radeon chips can be a bit patchy. Oh, and even then, don't expect to get much performance out of it: you probably want two or three such cards in a multi-GPU setup 🤑

Comparing Apples to oranges?

Given the situation above, it is understandable that people feel the most cost-effective way to get the Swiss LLM in-house today is Apple hardware. The current Mac Mini with an M4 chip and 24 GB unified memory should be enough to run Apertus, and retails for under 900 CHF at the moment – going all the way up to 12'000 CHF for the behemoth 512 GB version.

Clearly we are not comparing apples to apples here: NVIDIA, AMD and Apple measure their GPU cores and ALU units differently. Your performance may vary considerably depending on how your platform is set up, and models themselves need to be optimized to run decently on Mac hardware in the first place.

And you have to ask yourself: just how hot do these things get? The performance of a powerful Mac Studio (192GB Unified Memory) was evaluated at our hackathon in the Measure footprint of open LLMs project. Here are two charts excerpted from their report:

Power Consumption and Energy use of Llama 4
The same prompt and system running Apertus

There are plenty of installation guides online for Llama, that you can also use to install Apertus on your Mac. I have particularly heard good things about the combination of LM Studio with the MLX quantizations. Cool.

There is a lot of debate out there about what constitutes AI-level hardware, and certainly the push to sell new computers is one of the major factors in the global race to build capacity. Just for fun, here is how my Hugging Face profile would look if I had 4000 Grace Hopper units of the kind used to train the Apertus model 🤗

Whether you are GPU-poor, or GPU-rich, what software do you need to work with Apertus? I will discuss Ollama and vLLM in the following section. Others have reported good performance with LM Studio (Macs) or Lemonade (Ryzen). Another client I have been recommended is Jan.

Open source ChatGPT alternative that runs offline - Jan
Jan is building Open Superintelligence. It’s the open-source ChatGPT alternative that leverages the best of open-source AI.

Using Ollama

Models downloaded from the Ollama library can be configured and managed most easily with the elegant chat interface of Open WebUI. This is my default option, as I have run a shared server at the office and at home for over a year with open-weight models like Llama, OLMo and Qwen.

What is Ollama? Understanding how it works, main features and models
Ollama is a tool for running large language models locally on your system. Check out this article to learn more about its features and use cases.

As of writing this, we are still a little way away from being able to run Apertus on Ollama: due to some bleeding-edge methods used to engineer and parametrize the model, we need to wait for a new release of the llama.cpp library. Once this is distributed through an Ollama release (usually quite a speedy process), we can get started.

The quantized versions of Apertus in GGUF, FP8 or MLX formats can give you more mileage, in particular if you lack the RAM for the full model. For Ollama, you may need to write a Modelfile, including the chat template and other bits for a complete model specification. But your video card may still not support it, as is currently the case for me.
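For illustration, here is roughly what such a Modelfile could look like. The GGUF file name, parameters and template below are hypothetical placeholders – check the model card for the real chat template before using anything like this:

```
# Hypothetical Modelfile sketch for a GGUF build of Apertus.
# File name, parameters and template are placeholders only.
FROM ./apertus-8b-instruct.gguf

# Sampling and context defaults
PARAMETER temperature 0.8
PARAMETER num_ctx 4096

# The chat template must match the one the model was trained with
TEMPLATE """{{ .System }}
{{ .Prompt }}"""
```

The model would then be registered locally with `ollama create apertus-8b -f Modelfile`.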

I am involved in the discussion and am testing nightly releases: stay tuned! Give a thumbs-up and watch this issue in the Ollama repository to stay on top of progress:

[Model Request] Support new Apertus model · Issue #12149 · ollama/ollama
This is a new model from the Swiss AI initiative. It currently does not load due to Error: unsupported architecture “ApertusForCausalLM” Some tips on getting Transformers updated on the Hugging Fac…

See also the CH Open & BFH workshop described in my previous blog, with links to a video and slides where the ecosystem around these tools is discussed:

109 #swissaiweeks #siliconlovefield
A whirlwind of activities generated (no pun intended) by the Swiss {ai} Weeks in Bern.

Using vLLM

This is the route recommended by the Apertus team, and was available from day 1. Based on this, PublicAI and other providers have launched their inference services.

vLLM is a fast and easy-to-use library for LLM inference and serving. Originally developed in the Sky Computing Lab at UC Berkeley, vLLM has evolved into a community-driven project with contributions from both academia and industry.

For a software developer, the library may be easy to use, but deployment is not simple. Nevertheless, for an IT team the multiplatform deployment and integration with DevOps tools like Kubernetes make a lot of sense. Once you have the NVIDIA libraries and CUDA tools set up, a relatively simple way to start vLLM is with Docker.
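For illustration, a bare-bones invocation could look like this – assuming the official vllm/vllm-openai image and an NVIDIA GPU with the Container Toolkit installed; the flags shown are illustrative, not my exact script:

```shell
# Bare-bones vLLM OpenAI-compatible server in Docker.
# Illustrative flags - adjust image tag and limits to your setup.
docker run --gpus all --rm -p 8000:8000 \
  -e HF_TOKEN="$HF_TOKEN" \
  vllm/vllm-openai:latest \
  --model swiss-ai/Apertus-8B-Instruct-2509 \
  --max-model-len 4096
```

This serves the model on port 8000 with the same OpenAI-style API that PublicAI and other providers expose.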

Apertus running in Open WebUI via vLLM

I have prepared a script, tested with a basic 16GB GPU machine on Linode: the RTX4000 Ada Small, which costs around 350 CHF per month ($0.52 per hour) in my region (Frankfurt, Germany). If there's interest from the community, I will put the recipe into a StackScript. Note that you need to agree to NVIDIA's licensing conditions to use their proprietary libraries. My scripts can be downloaded on GitHub, or here:

You should also create a .env file, or otherwise pass in a couple of environment variables:

  • Set HF_TOKEN to a token you generated on your Hugging Face settings, making sure to allow "Read access to contents of all public gated repos you can access".
  • The HF_MODEL parameter should be set to swiss-ai/Apertus-8B-Instruct-2509 - or to any other model, or remix of Apertus.
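Put together, a minimal .env file might look like this (the token value is a placeholder – generate your own):

```
# .env for the vLLM startup script - token value is a placeholder
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
HF_MODEL=swiss-ai/Apertus-8B-Instruct-2509
```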

Note that I'm using the 'nightly' version to make sure the latest Transformers library is used. I've also set max-model-len to a low 4K (the default is 64K; you probably want at least 8K), which you can increase if your system allows it.
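Once the server is up, it exposes an OpenAI-compatible API. A minimal sketch of querying it from Python, assuming the default port 8000 on localhost; the helper names here are my own, not part of vLLM:

```python
# Minimal sketch of querying a local vLLM OpenAI-compatible endpoint.
# URL and model name assume the Docker/Linode setup described in the text;
# build_chat_request and ask are my own helpers, not part of vLLM.
import json
from urllib import request

API_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "swiss-ai/Apertus-8B-Instruct-2509"

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """Send the prompt to the server and return the model's reply."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = request.Request(API_URL, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# ask("What is Apertus?")  # requires a running vLLM server
```

Because the API is OpenAI-compatible, the official openai Python client pointed at this base URL should work just as well.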

Screenshot of btop showing system load during inference

Using Hugging Face

Most of my initial experiments are in a Hugging Face Space, mirrored in a Codeberg repository, which can be used by developers during our hackathons. It has a typical completions API with the recommended System Prompt, as well as some optimizations and other ways to query the model using the Transformers library.

fastapi-apertus
Apertus (Swiss LLM) hosted on FastAPI with Hugging Face transformers

In addition to the Space, I've prepared an Apertus 8B Instruct 2509 endpoint which you can use with a Hugging Face token. Thanks to support from the HF team, this can easily be deployed on your own by entering the model's name at endpoints.huggingface.co:

Screenshot from Hugging Face

Make sure to configure the auto-sleep options to your liking. Too short, and you'll be frustrated by the long startup times. Too long, and you may be surprised by a large credit card bill (though you can also set spending limits in the Billing section).

Screenshot of Apertus responding to questions in the Hugging Face chat interface

Many thanks and kudos to Leandro and team for their tech support, and giving out free credits, stickers, badges and other goodies to participants of our hackathon. I really encourage everyone to check out their learning resources online:

Hugging Face - Learn
We’re on a journey to advance and democratize artificial intelligence through open source and open science.

What's in a name?

If I may diverge from strictly technical topics for a moment: using a Latin word in the name is a quaint choice, but not especially original. There are several companies with it in their trademarks, so some legal discussion will surely be necessary. The relatively inactive apertus.org project is an open hardware camera – alluded to by the generative image at the top of this blog post.

Excerpt from the PONS Latin-German dictionary definition of Apertus

"Professor Nümmerli" made a quite humorous take on the subject (a fondness for Swiss German humor is recommended) in a video posted by comedian Mike Casa on Saturday.

#ai #apertus #apertus #apertus #geneva #funonsaturday #swissai #swissaiweeks #generationgenerative #dontlie2urai | Daniel Dobos | 14 comments
🇨🇭📰🤖 BREAKING: Prof. Nümmerli elected first Swiss #AI Model #Apertus Ambassador? With new voice interface: Apertus? :: ***HOI*** Following his great performance at the ‘inofficial’ launch of the #Apertus Swiss AI Model press conference, the nomination process for Prof. Nümmerli started immediately and is about to conclude soon ... right after: - the collection of internal signatures - the launch of the referendum - then the vote again - followed by the counter-proposal - and the final blah, blah, blah so practically he is already almost fully approved yet, weischt? Pollings show that already 3️⃣0️⃣%, so *more* then the 3️⃣9️⃣% necessary for approval, of #Apertus researchers, developers, trainers and deployers were laughing - so an overwhelming absolute majority. Thanks Imanol, Martin, Antoine, Marcel, Alex, Melanie, Martin, Sarah, Joost, Oliver, Giuseppe, Bettina, Maria-Grazia, Ido, Barna, Angelika, Eduard, Nikodem, David, Adriano, Andrei, Sabine, Katka, Christoph, Jürg, Claudio, Marc, Thilo, Anna, Allen, Dino, Joshua and many others, please consider supporting Prof. Nümmerli’s nomination. Thanks Mike Casa & Mike Casa Comedy for the good laugh - it made my Friday evening after an overwhelming week and looking forward to see you on 19th Sept, in #Geneva. Can we please get the ***HOI*** sound-bit for a voice chatbot interface to be developed during the Swiss {ai} Weeks? Will you join? #FunOnSaturday #SwissAI #SwissAIWeeks #GenerationGenerative #DontLie2UrAI | 14 comments on LinkedIn

One can get used to it.

Thoughts on Open and Closed

Upgrading from existing tools and systems to new ones, especially when it involves innovative AI models like Apertus, often requires a careful reconsideration of old and new. I understand the feelings of responsibility, the sense of stewardship over projects embodying the values of openness and community empowerment.

The choice of the Apache 2.0 license suggests transparency and intention to foster open & global collaboration. However, we also need active, public discussion around the choice of licenses and governance models. The rhetoric around openness can sometimes mask real concerns, so understanding the logic behind this choice and whether it aligns with community principles is key.

The decision to publish Apertus initially on the Hugging Face site, for which there is currently no direct equivalent in Switzerland – plus the absence of any mention of SUPSI, CERN, or other universities – has been somewhat conspicuous. It seems to me that the gated initial publication of the Apertus model (registration required: note the "You have been granted access" in my screenshot at the top) has provoked the most skepticism.

Logicians Find a Genie
A philosophy webcomic about the inevitable anguish of living a brief life in an absurd world. Also Jokes

I chatted about this with Apertus, and got some sensible action items to foster trust:

1) Involve the Community in Key Decisions: Engage with open data advocates early and often. Open forums or public consultations on model development, governance, and licensing can address concerns proactively.

2) Transparency in Data and Governance: Publish more documentation on data sources and training processes. Include explanations of the decision-making process around the choice of license and motivation for the current deployment strategy.

3) Strategic Partnerships and Affiliations: Explore and communicate whether there are less visible partnerships or affiliations that would explain the current institutional landscape.

4) Open Data Interoperability: Consider discussing integrations or complementary strategies with Opendata.swiss, Zenodo, and similar open data platforms to enhance accessibility and visibility, while also valuing governed, controlled access.

5) Highlight Accessibility and Fairness: Ensure that while the main model is accessible through a controlled portal, there exists a pathway for researchers and developers to understand, audit, or build upon the model with clear guidelines and support for responsible use.

Connecting to Apertus

It's reasonable to take a step back to ensure that the model is indeed what it promises to be. Legitimate questions about the dataset's origins and its compliance with legal and ethical standards should be addressed. Understanding the model's capabilities, and the intentions and process behind its creation, is fundamental not only to trust, but also to meaningfully engaging with and potentially contributing to its roadmap. Here are some places where this is happening:

GitHub - swiss-ai/Apertus-Generation-Issues-Reports: This is the repository for reporting issues with the SwissAI Apertus Model family generation
This is the repository for reporting issues with the SwissAI Apertus Model family generation - swiss-ai/Apertus-Generation-Issues-Reports

GitHub repo from the Swiss AI team

swiss-ai/Apertus-8B-Instruct-2509 · Discussions
We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Hugging Face community

Apertus
Learn about the new foundation model from the Swiss AI Initiative.

Our hackathon wiki

Further helpful references

Apertus: 4 ways to try out Switzerland’s New AI Model · Blog · Liip
Liip is a Swiss digital agency developing web and mobile applications, designing user experiences, and crafting content.
Apertus is here: What can the Swiss LLM really do? - effektiv
In July we asked: «Can the Swiss LLM keep up?» – now the first generation is here with Apertus. Time for a sober reality check: Where does Swiss AI really stand, and who is it interesting for today?
Why choose #Apertus over Llama 3.1 for AI applications | Marcel Salathé posted on the topic | LinkedIn
Still running your AI application on models like Llama 3.1? Try #Apertus - same muscle, no legal hassle. Every week, I see AI demos using models like Llama 3.1. Why are people still building with legally questionable “open” models? “Well - there aren’t really any alternatives”, some of you might say. I ask you to reconsider. The new #Apertus models are both highly performant, legally compliant, and have Apache 2.0 licensing. If you’re building AI applications, this matters more than you might think. If you’re building with such a model, you’ll get: ✅ Simplified legal review and procurement ✅ Freedom to use outputs for training other models ✅ No usage restrictions or special permissions needed ✅ Standard OSI-approved license your legal team already knows 💪 But is it strong enough? Not only is #Apertus highly competitive, it even edges slightly ahead of Llama-3.1! Real-world performance will vary by task, but it’s clearly competitive with leading “open” models. And note that many of those so called “open” models are not actually open in the broad sense. Here’s what you get with #Apertus: 1️⃣ Full training pipeline transparency - scripts, data, intermediate checkpoints all public 2️⃣ Respects opt-out signals - including retroactively 3️⃣ EU AI Act documentation included for compliance workflows 4️⃣ Auditable from start to finish - rare at this scale 🫣 The alternative: Do I really need to list all the legal cases brought against some of the model providers? The dirty (not-so-)secret everybody knows: the leading models have trained on data obtained illegally. It’s one of the reasons their models are so strong (the so-called compliance gap). You may argue that this is not your problem as a user, and I fully understand you. But if you are building, and you are building in a legally sensitive setting, you will definitely need permissive licensing and full auditability for compliance/risk teams. With Apertus, you get that, along with its strong performance. 
So, if you currently have an application running with something like Llama 3.1, then definitely trial Apertus - developed by EPFL, ETH Zurich & CSCS. Get the models: https://lnkd.in/eZxGHJvN | 19 comments on LinkedIn
Apertus tested: How the multilingual AI model performs
With Apertus, Swiss researchers have released an open-source and transparent large language model that cannot catch up with the frontrunners, however.
Swiss AI’s Apertus 70B and 8B: A Complete Deep Dive into Switzerland’s Revolutionary Open Language…
In the rapidly evolving landscape of artificial intelligence, Switzerland has made a groundbreaking contribution with the release of…
The works on this blog are licensed under a Creative Commons Attribution 4.0 International License.