
For the past 500 years, the National Library of Sweden has collected virtually every word published in Swedish, from priceless medieval manuscripts to present-day pizza menus.
Thanks to a centuries-old law that requires a copy of everything published in Swedish to be submitted to the library - also known as Kungliga biblioteket, or KB - its collections span from the obvious to the obscure: books, newspapers, radio and TV broadcasts, internet content, Ph.D. dissertations, postcards, menus and video games. It's a wildly diverse collection of nearly 26 petabytes of data, ideal for training state-of-the-art AI.
"We can build state-of-the-art AI models for the Swedish language since we have the best data," said Love Börjeson, director of KBLab, the library's data lab.
Using NVIDIA DGX systems, the group has developed more than two dozen open-source transformer models, available on Hugging Face. The models, downloaded by up to 200,000 developers per month, enable research at the library and other academic institutions.
"Before our lab was created, researchers couldn't access a dataset at the library - they'd have to look at a single object at a time," Börjeson said. "There was a need for the library to create datasets that enabled researchers to conduct quantity-oriented research."
With this, researchers will soon be able to create hyper-specialized datasets - for example, pulling up every Swedish postcard that depicts a church, every text written in a particular style or every mention of a historical figure across books, newspaper articles and TV broadcasts.
Turning Library Archives Into AI Training Data
The library's datasets represent the full diversity of the Swedish language - including its formal and informal variations, regional dialects and changes over time.
"Our inflow is continuous and growing - every month, we see more than 50 terabytes of new data," said Börjeson. "Between the exponential growth of digital data and ongoing work digitizing physical collections that date back hundreds of years, we'll never be finished adding to our collections."
The library's archives include audio, text and video. Soon after KBLab was established in 2019, Börjeson saw the potential for training transformer language models on the library's vast archives. He was inspired by an early multilingual natural language processing model by Google that included 5GB of Swedish text.
KBLab's first model used four times as much Swedish text - and the team now aims to train its models on at least a terabyte of it. The lab began experimenting by adding Dutch, German and Norwegian content to its datasets after finding that a multilingual dataset may improve the AI's performance.
NVIDIA AI, GPUs Accelerate Model Development
The lab started out using consumer-grade NVIDIA GPUs, but Börjeson soon discovered his team needed data-center-scale compute to train larger models.
"We realized we can't keep up if we try to do this on small workstations," said Börjeson. "It was a no-brainer to go for NVIDIA DGX. There's a lot we wouldn't be able to do at all without the DGX systems."
The lab has two NVIDIA DGX systems from Swedish provider AddPro for on-premises AI development. The systems are used to handle sensitive data, conduct large-scale experiments and fine-tune models. They're also used to prepare for even larger runs on massive, GPU-based supercomputers across the European Union - including the MeluXina system in Luxembourg.
"Our work on the DGX systems is critically important, because once we're in a high-performance computing environment, we want to hit the ground running," said Börjeson. "We have to use the supercomputer to its fullest extent."
The team has also adopted NVIDIA NeMo Megatron, a PyTorch-based framework for training large language models, with NVIDIA CUDA and the NVIDIA NCCL library under the hood to optimize GPU usage in multi-node systems.
"We rely to a large extent on the NVIDIA frameworks," Börjeson said. "It's one of the big advantages of NVIDIA for us, as a small lab that doesn't have 50 engineers available to optimize AI training for every project."
Harnessing Multimodal Data for Humanities Research
In addition to transformer models that understand Swedish text, KBLab has an AI tool that transcribes sound to text, enabling the library to transcribe its vast collection of radio broadcasts so that researchers can search the audio records for specific content.
AI-enhanced databases are the latest evolution of library records, which were long stored in physical card catalogs. KBLab is also starting to develop generative text models and is working on an AI model that could process videos and create automatic descriptions of their content.
"We also want to link all the different modalities," Börjeson said. "When you search the library's databases for a specific term, we should be able to return results that include text, audio and video."
KBLab has partnered with researchers at the University of Gothenburg, who are developing downstream apps using the lab's models to conduct linguistic research - including a project supporting the Swedish Academy's work to modernize its data-driven techniques for creating Swedish dictionaries.
"The societal benefits of these models are much larger than we initially expected," Börjeson said.
Images courtesy of Kungliga biblioteket