👋 Hi there
I'm currently working on the awesome Flair
library and love contributing to various open source projects.
📰 Latest news
Latest news of new language models, PRs and many more!
- 16.02.2026: GOLMo - Small German OLMo models are in the making, stay tuned!
- 28.10.2025: I've started a new project: the first German nanochat model!
- 17.10.2025: Check out our new German Commons paper!
- 18.09.2025: New project: Baivaria - a strong Bavarian encoder-only model is released! Available on the Model Hub here and more details can be found in this GitHub repo!
- 09.09.2025: Check out our new Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian paper!
- 08.07.2025: New project started: the Bavarian NLP organization on the Model Hub with a lot of new resources for Bavarian.
- 03.03.2025: BERT5urk, a new T5 language model for Turkish is out now!
- 21.12.2024: The Turkish Model Zoo got new evaluations - performed with the awesome Flair library - see here.
- 09.12.2024: Public announcement of the TensorFlow Model Garden LMs, including first FineWeb-LM releases on the Model Hub.
- 02.10.2024: Zeitungs-LM, a new language model trained on Historical German Newspapers is out now!
- 04.07.2024: Flair fine-tuned NER models on the awesome CleanCoNLL dataset are now available on the Model Hub.
- 28.03.2024: New project: NER models on the recently released CO-Fun NER dataset. Repo is here with a lot of fine-tuned models on the Model Hub.
- 23.12.2023: New project: NER Datasets for Historical German (HisGermaNER) is out and available on the Model Hub here.
- 11.10.2023: New launch of hmBench project: it benchmarks Historical Multilingual Language Models such as hmBERT, hmTEAMS and hmByT5, see here.
- 25.05.2023: New project: Historical Multilingual and Monolingual ELECTRA Models is released here.
- 25.05.2023: Several ByT5 Historical Language Models are released under hmByT5 Preliminary and hmByT5 on the Hugging Face Model Hub.
- 06.03.2023: Updated Ukrainian ELECTRA repository, see here.
- 05.02.2023: New repository on experiments for XLM-V 🤗 Transformers Integration, see here.
- 03.02.2023: New repository for on-going evaluation of German T5 models on the GermEval 2014 NER task is up now! See here.
- 28.01.2023: Start of new language models trained on the British Library corpus (model size ranges from 110M to 1B!), repository is here.
- 23.01.2023: New German T5 models are released (trained on the head and middle of GC4 corpus) and are available here.
- 09.06.2022: Preprint of our upcoming HIPE-2022 Working Notes paper is now available here: hmBERT: Historical Multilingual Language Models for Named Entity Recognition.
- 20.02.2022: Check out our new GermanT5 organization - expect new T5 models for German soon!
- 14.12.2021: New badge: Member of Hugging Face Supporter org now 🎉
- 13.12.2021: Release of Historical Language Model for Dutch (trained on Delpher corpus) - see repo here.
- 06.12.2021: Release of smaller multilingual Historical Language Models (ranging from 2-8 layers) - see repo here.
- 18.11.2021: Release of new multilingual and monolingual Historical Language Models - as preparation for upcoming CLEF-HIPE 2022 - see repo here.
- 23.09.2021: Release of ConvBERTurk (cased and uncased) and ELECTRA (uncased) trained on the Turkish part of the mC4 corpus - see repo here.
- 07.09.2021: Release of new larger German GPT-2 model - see model hub card here.
- 17.08.2021: Release of new re-trained German GPT-2 model - see repo here.
- 05.07.2021: Preprint of the ICDAR 2021 paper "Data Centric Domain Adaptation for Historical Text with OCR Errors" together with Luisa März, Nina Poerner, Benjamin Roth and Hinrich Schütze is out now!
- 24.06.2021: Turkish Language Model Zoo repo got a new logo from Merve Noyan, please follow her! Additionally, a new Turkish ELECTRA model, trained on the Turkish part of the multilingual C4 dataset, was released. More details here.
- 03.05.2021: GC4LM: A Colossal (Biased) language model for German was released. Repo with more details here.
- 27.04.2021: Our paper "Data Centric Domain Adaptation for Historical Text with OCR Errors" was accepted at ICDAR 2021. More details soon!
- 16.03.2021: Turkish model zoo is still growing! Public release of ConvBERTurk - see repo here.
- 07.02.2021: Public release of German Europeana DistilBERT and ConvBERT models. Repo with more information is here.
- 28.01.2021: Expect a new German Europeana ELECTRA Large model incl. a distilled German Europeana BERT model soon 🤗
- 16.11.2020: Public release of French Europeana BERT and ELECTRA models - see repository here.
- 16.11.2020: Public release of a German GPT-2 model (incl. fine-tuned model on Faust I and II). Repo with more information is available here.
- 11.11.2020: Public release of Ukrainian ELECTRA model. Repo is now available here.
- 11.11.2020: New workstation build (RTX 3090 and Ryzen 9 5900X) has completed! Expect a lot of new Flair/Transformers models in the near future!
- 02.11.2020: Public release of Italian XXL ELECTRA model. New repo for Italian BERT and ELECTRA models is now available here 🎉
- 22.10.2020: Preprint of "German's Next Language Model" is now available here. Models are also available on the Hugging Face model hub 🎉
- 22.10.2020: Our shared task paper Triple E - Effective Ensembling of Embeddings and Language Models for NER of Historical German together with Luisa März is released 🎉
- 30.09.2020: "German's Next Language Model" together with Branden Chan and Timo Möller was accepted at COLING 2020!
- 23.09.2020: Flair in version 0.6.1 is out now!
- 02.09.2020: Slow response time - I'm currently focussing on EACL 2021. Expect great new things 😎
- 18.08.2020: French BERT model, trained on Historical newspapers from Europeana:
📃 Publications
- Raphael Schmitt and Stefan Schweter. SindBERT, the Sailor: Charting the Seas of Turkish NLP. Accepted at the SIGTURK 2026 Workshop.
- Lukas Thoma, Ivonne Weyers, Erion Çano, Stefan Schweter, Jutta L Mueller and Benjamin Roth. CogMemLM: Human-Like Memory Mechanisms Improve Performance and Cognitive Plausibility of LLMs. In Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning (CoNLL 2023).
- Stefan Schweter, Luisa März, Katharina Schmid and Erion Çano. hmBERT: Historical Multilingual Language Models for Named Entity Recognition. In Experimental IR Meets Multilinguality, Multimodality, and Interaction - Proceedings of the Eleventh International Conference of the CLEF Association (CLEF 2022).
- Francesco De Toni, Christopher Akiki, Javier de la Rosa, Clémentine Fourrier, Enrique Manjavacas, Stefan Schweter and Daniel Van Strien. Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0. Accepted at the "Challenges & Perspectives in Creating Large Language Models" Workshop at ACL 2022.
- Luisa März, Stefan Schweter, Nina Poerner, Benjamin Roth and Hinrich Schütze. Data Centric Domain Adaptation for Historical Text with OCR Errors. In International Conference on Document Analysis and Recognition, ICDAR 2021.
- Branden Chan, Stefan Schweter and Timo Möller. German's Next Language Model. In Proceedings of the 28th International Conference on Computational Linguistics.
- Stefan Schweter and Luisa März. Triple E - Effective Ensembling of Embeddings and Language Models for NER of Historical German. In Experimental IR Meets Multilinguality, Multimodality, and Interaction - Proceedings of the Eleventh International Conference of the CLEF Association (CLEF 2020).
- Stefan Schweter and Sajawel Ahmed. Deep-EOS: General-Purpose Neural Networks for Sentence Boundary Detection. In Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019).
- Alan Akbik, Tanja Bergmann, Duncan Blythe, Kashif Rasul, Stefan Schweter and Roland Vollgraf. FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations).
- Stefan Schweter and Johannes Baiter. Towards Robust Named Entity Recognition for Historic German. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019).
📃 Preprints
- Lukas Gienapp, Christopher Schröder, Stefan Schweter, Christopher Akiki, Ferdinand Schlatt, Arden Zimmermann, Phillipe Genêt and Martin Potthast. The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models.
- Michael Hoffmann, Jophin John, Stefan Schweter, Gokul Ramakrishnan, Hoi-Fong Mak, Alice Zhang, Dmitry Gaynullin and Nicolay J. Hammer. Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian.
- Part of BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
- Stefan Schweter and Alan Akbik. FLERT: Document-Level Features for Named Entity Recognition.
💬 Contact
Please open an issue in the corresponding repository, tag me (@stefan-it) in
issues/PRs/commits on GitHub or connect with me on LinkedIn :)
- turkish-bert ★ PINNED: Turkish BERT/DistilBERT, ELECTRA, ConvBERT and T5 models (Python, ★ 576, 1y ago)
- hmByT5 ★ PINNED: Upcoming Historical Multilingual and Monolingual ByT5 Models (Python, ★ 8, 2y ago)
- europeana-bert ★ PINNED: BERT and ELECTRA models trained on Europeana Newspapers (Python, ★ 39, 4y ago)
- ukrainian-electra ★ PINNED: Ukrainian ELECTRA model (Python, ★ 12, 3y ago)
- gc4lm ★ PINNED: GC4LM: A Colossal (Biased) language model for German (★ 13, 5y ago)
- xlm-v-experiments ★ PINNED: Experiments for XLM-V Transformers Integration (Python, ★ 13, 3y ago)
- capsnet-nlp: CapsNet for NLP (Python, ★ 66, 7y ago)
- nmt-en-vi ▣: Neural Machine Translation system for English to Vietnamese (IWSLT'15 English-Vietnamese data) (★ 64, 7y ago)
- fine-tuned-berts-seq ▣: Fine-tuned Transformers compatible BERT models for Sequence Tagging (Python, ★ 40, 6y ago)
- flair-experiments: Experiments with Zalando's flair library (Python, ★ 34, 3y ago)
- german-gpt2: German GPT-2 model (★ 32, 4y ago)
- modern-bert-ner: My NER Experiments with ModernBERT and Ettin (Python, ★ 28, 11mo ago)
- italian-bertelectra: 🇮🇹 Italian BERT and ELECTRA models (incl. evaluation) (Shell, ★ 18, 3y ago)
- dl-benchmarks: Deep Learning Benchmark Results (RTX 2080 TI vs. RTX 2070) (★ 16, 7y ago)
- hetzner-gpu-server ▣: My cheatsheet for Hetzner GPU Server Setup (★ 9, 1y ago)
- lrz-gpu-tutorial: Useful tips for LRZ GPU usage (★ 8, 7y ago)
- nanochat-german: The best German ChatGPT that $100 can buy. (Python, ★ 7, 7mo ago)
- delpher-lm: Language Model for Historic Dutch (Delpher Corpus) (Jupyter Notebook, ★ 7, 3y ago)
- hmBench: hmBench: Fine-Tuning, Evaluating & Benchmarking of Historic Language Models on NER Datasets (Python, ★ 6, 2y ago)
- flair-pos-tagging: Flair Embeddings for PoS Tagging: A Multilingual Evaluation (Python, ★ 6, 6y ago)
- deep-wittgenstein: Classification of Wittgenstein's remarks (Python, ★ 5, 8y ago)
- plur: Pre-trained Language Models for Under-represented Languages in NLP (★ 5, 6y ago)
- llms-meet-ner: LLMs Meet NER (Python, ★ 4, 3mo ago)
- nmt-mk-en: Neural Machine Translation system for Macedonian to English (Shell, ★ 4, 8y ago)
- model-garden-lms: Language Model Pretraining with TensorFlow Model Garden (Python, ★ 3, 9mo ago)
- hmTEAMS: Historical Multilingual TEAMS Models (★ 3, 1y ago)
- germeval-ner-t5: Evaluating German T5 Models on GermEval 2014 (NER) (Python, ★ 3, 3y ago)
- georgian-ner: Resources about Named Entity Recognition for Georgian (Jupyter Notebook, ★ 2, 2y ago)
- stefan-it: No description. (★ 2, 4mo ago)
- gerturax-fine-tuner: GERTuraX fine-tuning experiments (Python, ★ 2, 8mo ago)
- transformers ⑂: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. (Python, ★ 2, 2y ago)
- blbooks-lms: Pretrained Language Models on British Library Corpus (Python, ★ 2, 2y ago)
- haystef: My personal RAG System based on Haystack (★ 1, 3mo ago)
- modalities ⑂: Modalities, a PyTorch-native framework for distributed and reproducible foundation model training. (★ 1, 4mo ago)
- german-tokenizer-benchmark ⑂: German Tokenizer Benchmark (Python, ★ 1, 7mo ago)
- co-funer: Experiments on CO-Fun NER Dataset (Jupyter Notebook, ★ 1, 2y ago)
- transformer-smaller-training-vocab ⑂: Temporarily remove unused tokens during training to save RAM and improve speed. (★ 1, 1y ago)
- tecb-de ⑂: German Text Embedding Clustering Benchmark (★ 1, 2y ago)
- adapters ⑂: A Unified Library for Parameter-Efficient and Modular Transfer Learning (★ 1, 2y ago)
- awesome-huggingface ⑂: 🤗 A list of wonderful open-source projects & applications integrated with Hugging Face libraries. (★ 1, 2y ago)
- historic-domain-adaptation-icdar: Data Centric Domain Adaptation for Historical Text with OCR Errors (Python, ★ 1, 3y ago)
- FARM ⑂: 🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. (★ 1, 6y ago)
- maxtext ⑂: A simple, performant and scalable Jax LLM! (Python, ★ 0, 6mo ago)
- xlstm-jax ⑂: Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.co/NX-AI/xLSTM-7b. (★ 0, 7mo ago)
- llmdata ⑂: No description. (★ 0, 8mo ago)
- baivaria: LM for Bavarian (★ 0, 9mo ago)
- haystack-integrations ⑂: 🚀 A list of Haystack Integrations, maintained by the community or deepset. (★ 0, 1y ago)
- cisnlp.github.io ⑂: Homepage of cisnlp (★ 0, 1y ago)
- spring-into-haystack ⑂: 🌱 Sprout an Agent with Haystack & MCP (Python, ★ 0, 1y ago)
- helibrunna ⑂: A HuggingFace compatible xLSTM trainer. (Python, ★ 0, 1y ago)
- api-inference-community ⑂: No description. (Python, ★ 0, 1y ago)
- sipi ⑂: Simple Image Presentation Interface (★ 0, 11mo ago)
- DeBERTa ⑂: The implementation of DeBERTa with own modifications ;) (Python, ★ 0, 2y ago)
- CleanCoNLL ⑂: The CleanCoNLL dataset from our EMNLP 2023 paper where we corrected annotation errors and inconsistencies in CoNLL-03. (★ 0, 2y ago)
- bpemb ⑂: Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) (★ 0, 2y ago)
- TencentPretrain ⑂: Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo (★ 0, 2y ago)
- autotrain-flair-mobie: Example Repository for using Auto Train with Flair Library on MobIE NER Dataset (Jupyter Notebook, ★ 0, 2y ago)
- hub-docs ⑂: Frontend components, documentation and information hosted on the Hugging Face website. (Svelte, ★ 0, 2y ago)
- autotrain-advanced ⑂: 🤗 AutoTrain Advanced (★ 0, 2y ago)
- binder ⑂: No description. (★ 0, 2y ago)
- charmen-electra ⑂: No description. (★ 0, 3y ago)
- tanl ⑂: Structured Prediction as Translation between Augmented Natural Languages (★ 0, 2y ago)
- ASP ⑂: PyTorch implementation and pre-trained models for ASP - Autoregressive Structured Prediction with Language Models, EMNLP 22. https://arxiv.org/pdf/2210.14698.pdf (★ 0, 3y ago)
- lam ⑂: Libraries, Archives and Museums (LAM) (★ 0, 3y ago)
- albert ⑂: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (★ 0, 4y ago)
- german-t5-eval ⑂: German T5 Model Evaluation (★ 0, 4y ago)
- git-hash: https://twitter.com/julien_c/status/1489607892298870786 (★ 0, 4y ago)
- apis ⑂: Apis list. (★ 0, 4y ago)
- KnowMAN ⑂: KnowMAN: Weakly Supervised Multinomial Adversarial Networks (★ 0, 4y ago)
- stacked-ner ⑂: Stacked-Transformers Named Entity Recognition (★ 0, 5y ago)
- hgiyt ⑂: Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" (★ 0, 5y ago)
- oscar-website ⑂: The website of one humongous corpus. (★ 0, 6y ago)
- datasets ⑂: 🤗 Fast, efficient, open-access datasets and evaluation metrics in PyTorch, TensorFlow, NumPy and Pandas (★ 0, 5y ago)
- bertram ⑂: This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". (Python, ★ 0, 5y ago)
- xtreme ⑂: XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 typologically diverse languages and includes nine tasks. (★ 0, 6y ago)
- electra ⑂: ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (★ 0, 6y ago)
- fairseq ⑂: Facebook AI Research Sequence-to-Sequence Toolkit written in Python. (★ 0, 6y ago)
- google-research ⑂: Google AI Research (Python, ★ 0, 6y ago)
- deep-docker: No description. (Dockerfile, ★ 0, 7y ago)
- demetsiiify ⑂: Web service for creating and hosting IIIF manifests from METS/MODS documents (Python, ★ 0, 7y ago)
- nmt-en-et: Neural Machine Translation system for English to Estonian (WMT18 data) (Shell, ★ 0, 8y ago)
- nmt-en-mk: Neural Machine Translation system for English to Macedonian (★ 0, 7y ago)