

Notes from the Open Source AI Conference, organized by CH Open and BFH IPST on May 7, 2025.
Demo of OpenShift AI from Red Hat, our sponsors today, featuring a graph designer for RAG pipelines.

Quick and raw notes from the Open Source AI Conference, organized by CH Open and BFH/IPST on May 7. Open source AI is a research area for the university, which is working to clarify the still somewhat volatile definition and to develop recommendations for the public sector.

Open Source AI
Numerous technology companies, large and small, release AI models under the label «Open Source AI». At the same time, the definition is hotly debated in the expert community. This article explains the essential aspects of the topic and shows practical applications of openly accessible AI models.

Oh, and by the way, Maintainer Month is happening right now as well, a chance to contribute some documentation or pen testing to your favorite FOSS project:

May is Maintainer Month: Celebrating those who secure Open Source
Maintainer Month returns this May, and the Open Source Initiative (OSI) is proud to join GitHub and a global community of contributors in honoring the individuals who steward and sustain Open Source projects. In 2025, Maintainer Month enters its fourth year with a clear and urgent theme: Securing Open Source.

Just avoid AI slop in your patches, plz:

Open source project curl is sick of users submitting “AI slop” vulnerabilities
“One way you can tell is it’s always such a nice report,” founder tells Ars.

There was related discussion at yesterday's TRANSFORM event; I invite you to browse the posts from my live coverage:

Post by @loleg@hachyderm.io

Getting back to the Wednesday afternoon conference, here is what we heard:

13:00 Welcome

Matthias Stürmer and Markus Danhel

  • The intro makes it clear that we are in risky, murky waters, and urgently need to respond to the topic - but actually everyone in the room knows this.
  • What I would like to hear is how CH Open and Red Hat support a community of practice beyond today's event. How many member organisations are AI-ready and run a sovereign data automation pipeline? What standards and legal guidelines exist? Hopefully this will become clearer in the near future; all eyes on the board.
Post by @loleg@hachyderm.io

13:15 AI is not a dream, it is a choice

Andy Fitze, Co-Founder and Managing Partner, SwissCognitive

AI everywhere. There's an AI tool for every task. You might dislike AI, it… | Miro Dietiker
AI everywhere. There's an AI tool for every task. You might dislike AI, it triggers fears, it challenges, disrupts every single status quo, and is a security risk. But banning its use will not make it go away and complicates the situation. It will still be used: 1 in 3 workers use it as a "secret advantage". The Open Source AI Conference was all about asking the right questions to find answers.
  • We predicted much of what is happening, the rise of interest in automation, the war of chips.
  • Agentic AI will be understood and adopted by 80% of the C-suite by the end of this year.
  • There will be no reason to discuss security or data protection. It will be "solved" by integrations into the multi-agent pipeline.
  • Today we train people to understand technology, tomorrow we will build technology to understand people.
  • We need to think about the bigger scale of AI competitiveness, every use case is a small step forward.
  • Leadership! . . .
A brief interaction with my daughter, who looks forward (or maybe not haha) to a serious discussion with a parent this evening.

13:50 Confidential Compute AI meets digital sovereignty

Thomas Taroni, Executive Chairman, Phoenix Technologies AG

  • We need to trust our AI a lot just to book a table at a restaurant. Your email, credit card, calendar: ready to give them up to the autopilot?
  • Earlier ventures were driven forward by process and people power; now we use innovation as a motor in the same way. Without (open source) sovereignty there is no real innovation, and we are progressing on borrowed time.
  • Hallucinations? The data we use to correct them are worth protecting at any cost. On the GPU, the datasets are available for milliseconds, unencrypted. For a public cloud it is grossly negligent (grob fahrlässig) to let that information out of our control.
  • We developed our own cloud. If I had known everything this would require, including our own power supply, AI cluster and cluster design, we would not have done it. It's incredible to see all these moving parts and people coming together in Switzerland. We created the Switch edu-cloud based on the same architecture, and hope to win more federal cloud contracts.
  • Can you imagine your whole org team being made up of agents? We're not quite there yet, but think about it. A year ago, I pulled a coder agent into my IDE, and the result was shit. I am a coder; I need transparency and to see what steps the agent is taking to arrive at a result.
  • Shows a demo of an agent generating a report on global agriculture, where the sources are mostly well known international and US organisations. Even if we can see the sources, how does the agent help us check their authenticity and bias?
  • The code demo is a textbook case of a factorial program, which defaults to Python and suggests unsurprising use cases (a minimal sketch of that kind of output follows after this list). Why this language, why this code, why these bugs and patterns? The creativity and elegance of programming and logical thought seem extremely shallow. The agentic documentation spews out a very verbose overview followed by minimal bullet points.
  • DeepSeek is not sovereign, and the biases implicit in this AI model are not shown anywhere. I don't see any model cards or semantic references in the log.
  • The future for kvant looks bright.
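For illustration, here is a minimal sketch of the kind of textbook output such a coding agent tends to produce. The actual demo code was not published, so this is only an assumption about its shape, not what was shown on stage:

```python
# Hypothetical reconstruction of a "textbook" agent-generated factorial:
# recursive definition, obligatory docstring, unsurprising usage example.

def factorial(n: int) -> int:
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    return 1 if n in (0, 1) else n * factorial(n - 1)

if __name__ == "__main__":
    print(factorial(5))  # 120
```

It is exactly this kind of safe, pattern-matched output that makes the "why this language, why this code" question worth asking.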

14:20 Coffee break

  • I spoke to several fellow participants. We agreed that so far there has been little mention of the issues that creative people, coders and citizens are facing. We are determined to stick around and hope to hear more practical advice next.
  • Have you talked about job security and AI with your local tech union?
Wir unternehmen

14:40 The Impact of DeepSeek-R1 on Open Source AI and the Rise of Large Reasoning Models

Lewis Tunstall, Hugging Face

  • OpenAI discovered the scaling laws that allow predictable returns: through smaller-scale experiments, benchmarks and a series of measurements, you can predict the loss of a model before committing to refining it at full scale (see the formula sketch below). See https://openai.com/index/gpt-4-research
  • But it seems that, as with Moore's law, the laws of physics get in the way. NVIDIA can only produce so many chips. Epoch AI assumes that if we invest hundreds of billions of dollars, we will keep scaling up.
  • There are crazy stories around the deprecation of GPT-4.5, probably the largest model ever deployed in public, involving replication across multiple data centers.
  • For example, the PyTorch sum function apparently had a bug - just one illustration of the issues that appear at this scale. People wonder whether tech companies can keep up their progress.
  • Autoregressive decoding: feed in text and let the Transformer architecture spend a fixed amount of compute per token, no matter what the problem is. These models have no way to determine how complex the question is. Are o1 and DeepSeek-R1 an actually new approach to machine learning, or just a hack (i.e. a user experience paradigm) designed to bust benchmarks?
  • The science was available for a year and nobody noticed, until some great engineering and new data were applied to prove the hunch. Basically, it sounds like the science behind all this is still emerging, and open source is leading to more chaotic and disruptive development. Keep your eyes on arXiv.
  • Assuming it's not just a hack, how has it changed the practice of prompt engineering? To what extent do system prompts allow the engineering of faster reasoning pipelines? To be prompt-researched later.
Open LLM Progress Tracker - a Hugging Face Space by andrewrreed
This app shows the progress of open-source versus proprietary large language models over time based on scores from the LMSYS Chatbot Arena. Users can filter by category, ELO score, and organization…
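As background to the scaling-law point above, here is a sketch of the commonly cited power-law form from Kaplan et al. (2020); the talk did not show this exact formula, and the constants are empirically fitted, so treat it as illustrative:

```latex
% Loss L falls as a power law in parameters N, dataset size D and compute C;
% N_c, D_c, C_c and the exponents \alpha are empirically fitted constants.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
```

This is what makes the smaller-scale experiments predictive: fit the curve at small N, D or C and extrapolate.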
  • Reinforcement learning with verifiable rewards (RLVR) has a surprising feature: response token lengths increase during training from a couple of hundred up to 10,000.
  • The *aha!* moments are very eerie and akin to human insights. The Distill models were a really good gift to the community. But how do we do this ourselves and adapt it to our use case and datasets? Can we train fully open (weights and code) models? Yes, we can.
  • Hugging Face started with mathematics, as it's easier to test. You start with a source of hard problems, like the IMO (International Mathematical Olympiad).
GitHub - huggingface/Math-Verify
Contribute to huggingface/Math-Verify development by creating an account on GitHub.
  • We used a library to verify equivalence between mathematical expressions; see the reward-function sketch after this list. (Mathematicians, sorry, your work is also just open data for the choppAIng block now!)
  • OlympicCoder-7B runs on your phone and does very well on the AI math olympiad, heralding a new age of AI tooling.
  • Generating the models is very slow; we ran into a lot of scaling issues involving GPU management.
  • Code verifiability crisis (skipped)
  • DeepSeek-R1 had a huge impact on open source AI: better tools for reinforcement learning, and an explosion of interest in reusing datasets. Just search for "reasoning" in HF datasets.
  • How far can you compress? Phi-4, recently released by Microsoft, has very impressive performance.
  • Where is the moat, in terms of the ability of open source to truly close the gap in all sectors? It sounds to me like compute capability, open or closed, is really the determining factor in the AI wars. Instead of getting too wrapped up in the daily battles, we should focus on the transformational changes and on ethically tolerable, professionally defensible use cases.
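As a concrete illustration of the verifiable-reward idea, here is a minimal sketch of a reward function built on the Math-Verify library linked above. The parse/verify calls follow the project's documented usage, but the surrounding reward function and its integration into an RL loop are my own assumption, not the exact recipe Hugging Face used:

```python
# Minimal sketch of a verifiable reward for RLVR on math problems,
# using huggingface/Math-Verify (pip install math-verify).
from math_verify import parse, verify

def math_reward(model_answer: str, gold_answer: str) -> float:
    """Return 1.0 if the model's final answer is mathematically
    equivalent to the gold answer, else 0.0."""
    gold = parse(gold_answer)   # e.g. "$\\frac{1}{2}$"
    pred = parse(model_answer)  # e.g. "$1/2$"
    return 1.0 if verify(gold, pred) else 0.0

# During RL training, every sampled completion gets this binary reward;
# no learned reward model is involved, which is what makes it "verifiable".
print(math_reward("$1/2$", "$\\frac{1}{2}$"))  # 1.0 (equivalent expressions)
```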
🚀 Can you beat an LLM in a race across Wikipedia? Let’s find out. We… | Jason Stillerman | 13 comments
🚀 Can you beat an LLM in a race across Wikipedia? Let’s find out. We just launched a new project: WikiRacing with LLMs 🏁 Race from Pokémon to Jennifer Aniston (or anywhere else), using only internal Wikipedia links. Compete head-to-head with models like Qwen3, Gemma, and DeepSeek in a classic test of reasoning, planning, and world knowledge. 🧠 It’s a fun way to explore how language models handle real-time decision-making over open-world data. 🎮 The Space is now live (link in comments) We’re also open-sourcing the code that supports massively parallel races, enabling large-scale evaluations with ease. Huge thanks to Hugging Face for making this experiment possible 🤗 Give it a spin! | 13 comments on LinkedIn

15:15 DevOps for AI: Bring AI to the data, not the data to AI

Aarno Aukia, VSHN and Manuel Schindler, DevX Team, Red Hat

  • DevOps has come of age and we are rocking it.
  • With RAG, the LLM becomes part of the software workflow.
  • Agents are a useful software construct: they bring lots of different information sources together.
  • Aarno still seems to conflate the open weight and open source terms a bit, distinguishing only by what runs where. The infra guys want to run your APIs, but not train your model.
  • Ollama is what we use in development, KubeFlow with KServe for production (a minimal dev sketch follows after the demo notes below).
  • Demo of the OpenShift workbench with Red Hat.
Demo effect!
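To make the "bring the AI to the data" idea concrete, here is a minimal local-development sketch in the spirit of the talk: retrieve a few relevant documents, then ask a locally running Ollama model to answer from that context only. This is not the VSHN/Red Hat pipeline; the toy retriever, the sample documents and the model name are all assumptions for illustration:

```python
# Minimal RAG sketch for local development against Ollama's REST API.
# Assumes Ollama is running locally and the named model has been pulled.
import requests

DOCS = [
    "VSHN operates managed services on Kubernetes and OpenShift.",
    "KServe serves models in production; Ollama is handy for local development.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    # Toy keyword-overlap "retriever" standing in for a real vector store.
    def overlap(d: str) -> int:
        return len(set(question.lower().split()) & set(d.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:k]

def rag_answer(question: str) -> str:
    context = "\n".join(retrieve(question, DOCS))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's local endpoint
        json={"model": "llama3.2", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(rag_answer("What do we use to serve models in production?"))
```

In production the same prompt assembly would sit behind KServe-served models instead of the local Ollama endpoint.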

15:45 Coffee break

  • I had to take off, family duty calls. Will try to fill in the gaps l8r.

16:05 Panel Discussion

Innovation, Transparency and Digital Sovereignty

  • Erica Dubach, Digital Transformation and ICT Steering (DTI) at Swiss Federal Chancellery (Bundeskanzlei)
  • Jacqueline Kucera, Head of the Parliamentary Library, Research, Data (The Swiss Parliament)
  • Ornella Vaccarelli, Lead Scientist at SCAI - Swiss Centre for Augmented Intelligence

#chopen #bfh #opensourceai #aicommunity #digitalsovereignty #aiforgood… | Dr. Jacqueline Kucera
What an inspiring day at the Open Source AI Conference 2025! 🎉Innovation, Transparency & Digital Sovereignty! I was honored to be an active contributor and panelist, joining forces with Erica Dubach Spiegler (Swiss Federal Chancellery) and Ornella Vaccarelli (Swiss Center for Augmented Intelligence), with excellent moderation by Markus Danhel (CH Open Board). Together, we explored: ✅ Switzerland’s AI strategy and the importance of AI sovereignty ✅ How open-source AI accelerates innovation ✅ Using AI - experiences, lessons learned, and how to adopt AI within the federal administration and Parliament. 🎉The engaging discussions during the panel and the connecting apéro, where participants from diverse fields exchanged ideas, was a highlight of the conference. It felt like a community! 💡 The outcome was clear: we are thriving together in a challenging world. Another highlight was Lewis Tunstall (Hugging Face) sharing insights on The Impact of DeepSeek-R1 on Open Source AI and the Rise of Large Reasoning Models — a fascinating look into the next frontier of AI capabilities. ✨Switzerland is small, innovative and creative, always finding niches to excel. Is AI Switzerland’s next field of excellence? Let’s go for it! 🇨🇭🚀 A heartfelt thank you to Kateryna Schuetz, Matthias Stürmer, Markus Danhel, the amazing #CHOpen team and the #BFH for the flawless organization. And thanks to the photographer 📸 who perfectly captured the day’s energy! 👥 Over 70 participants gathered at the BFH in Bern - a strong signal that the Swiss AI and open-source community is vibrant and ready to step into the future! #OpenSourceAI #AICommunity #DigitalSovereignty #AIforGood #CHOpen #AIInnovation #SwissParliament #SwitzerlandExcellence #AI #Innovation #Parliament #HuggingFace
#opensourceai #digitalesouveränität #ki #llm #chopen #opensource | CH Open
🚀 Open Source AI Conference 2025 – Innovation, Transparency & Digital Sovereignty. Over 70 participants and 10 speakers met on May 7, 2025 at the Bern University of Applied Sciences BFH to discuss the opportunities and challenges of Open Source AI. An afternoon full of exciting insights into artificial intelligence, digital sovereignty and open source, with inspiring contributions from the experts Lewis Tunstall (Hugging Face), Thomas Taroni (Phoenix Technologies AG), Manuel Schindler (Red Hat), Aarno Aukia (VSHN AG - The DevOps Company), Dalith Steiger-Gablinger and Andy Fitze (SwissCognitive | AI Ventures, Advisory & Research). The conference was moderated by CH Open board member Markus Danhel. A highlight was the panel discussion with 🎤 Dr. Jacqueline Kucera, Head of the Parliamentary Library, Research, Data (The Swiss Parliament), 🎤 Erica Dubach Spiegler, Digital Transformation and ICT Steering (DTI) at the Swiss Federal Chancellery (Bundeskanzlei), and 🎤 Ornella Vaccarelli, Lead Scientist at SCAI – Swiss Center for Augmented Intelligence, which focused on central questions of transparency, innovation and practical experience with AI in the administration. A networking apéro rounded off the successful event. 🥂 🔍 Anyone who wants to dive deeper has the opportunity this week! 📅 Open Source AI Workshops – May 8 & 9, 2025 📍 BFH, Brückenstrasse 73, Bern 💡 Contents and programme of the Open Source AI Workshops: https://lnkd.in/emCgcKFu #OpenSourceAI #DigitaleSouveränität #KI #LLM #CHOpen #OpenSource

Thanks very much to the team, speakers, sponsors! 🫶

I'll make good use of the takeaways from today, and I'm looking forward to the workshops tomorrow and Friday!

Creative Commons Licence: The works on this blog are licensed under a Creative Commons Attribution 4.0 International License.