Research Group
Linguistic Engineering Group
The Linguistic Engineering Group advances the state of the art in Natural Language Processing by building practical systems that bridge academic research and real-world deployment. The team develops multilingual resources, high-performance pipelines, and evaluation benchmarks that allow organisations to understand and act on unstructured language data at scale.
Working closely with partners in the public and private sectors, the group leads projects in information extraction, corpus linguistics, and semantic text processing. Its applied perspective ensures that research outputs are robust, well-documented, and ready to integrate into mission-critical workflows.
Group Head
Łukasz Kobyliński, PhD
Łukasz Kobyliński coordinates collaborations across computational linguistics, machine learning, and software engineering, ensuring that the group’s research translates into resilient tools and datasets for the broader community.
Faculty Members
Meet the faculty members contributing to research in the Linguistic Engineering Group.
Publications
Browse publications authored by faculty members affiliated with the Linguistic Engineering Group.
2026
AI-Powered Knowledge Discovery in the Digital Library of Old Ephemeral Prints: A Case Study
Intelligence and Equity: Shaping the Future of Knowledge
Authors
- Maciej Ogrodniczuk
- Dariusz Czerski
2026
The Role of Gender in the Evaluation of Politicians in an Online Debate: Evidence from a Natural Experiment
Journal of Language and Politics
Authors
- Natalia Zawadzka-Paluektau
2026
Rethinking the Evaluation of Alignment Methods: Insights into Diversity, Generalisation, and Safety
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Authors
- Denis Janiak
- Julia Moska
- Dawid Motyka
- Karolina Seweryn
- Paweł Walkowiak
- Bartosz Żuk
- Arkadiusz Janz
Authors
- Kristina Maiti Tekavčič
- Ksenija Bogetić
- Petya Osenova
- Maciej Ogrodniczuk
- Darja Fišer
2025
ParlaMint II: Advancing Comparable Parliamentary Corpora across Europe
Language Resources and Evaluation
Authors
- Tomaž Erjavec
- Matyáš Kopp
- Nikola Ljubešić
- Taja Kuzman
- Paul Rayson
- Petya Osenova
- Maciej Ogrodniczuk
- Çağrı Çöltekin
- Danijel Koržinek
- Katja Meden
- Jure Skubic
- Peter Rupnik
- Tommaso Agnoloni
- José Aires
- Starkaður Barkarson
- Roberto Bartolini
- Núria Bel
- María Calzada Pérez
- Roberts Darģis
- Sascha Diwersy
- Maria Gavriilidou
- Ruben van Heusden
- Mikel Iruskieta
- Neeme Kahusk
- Anna Kryvenko
- Noémi Ligeti-Nagy
- Carmen Magariños
- Martin Mölder
- Costanza Navarretta
- Kiril Simov
- Lars Magne Tungland
- Jouni Tuominen
- John Vidler
- Adina Ioana Vladu
- Tanja Wissik
- Väinö Yrjänäinen
- Darja Fišer
Authors
- Ewa Kozioł-Chrzanowska
- Włodzimierz Gruszczyński
- Anna Niepytalska-Osiecka
- Maciej Ogrodniczuk
- Monika Buraczyńska
- Bartłomiej Nitoń
2025
Corpus Studies in 2024: Emerging Trends and Applications
Contemporary Linguistics: Integrating Languages, Communities, and Technologies
Authors
- Aleksandra Tomaszewska
- Maciej Ogrodniczuk
Authors
- Alina Wróblewska
- Martyna Lewandowska
- Aleksandra Tomaszewska
- Karol Saputa
- Maciej Ogrodniczuk
Authors
- Witold Kieraś
- Małgorzata Marciniak
- Marek Łaziński
- Marcin Woliński
- Krystyna Bojałkowska
- Wiktor Eźlakowski
- Łukasz Kobyliński
- Dorota Komosińska
- Katarzyna Krasnowska-Kieraś
- Michał Rudolf
- Aleksandra Tomaszewska
- Joanna Wołoszyn
- Natalia Zawadzka-Paluektau
2025
NeoN: A Tool for Automated Detection, Linguistic and LLM-Driven Analysis of Neologisms in Polish
Computational Science – ICCS 2025
Authors
- Aleksandra Tomaszewska
- Dariusz Czerski
- Bartosz Żuk
- Maciej Ogrodniczuk
Selected Projects
Explore current initiatives led by the Linguistic Engineering Group. Each project demonstrates how cutting-edge language technology is deployed to solve complex analytical challenges.
PLLuM – Polish Large Language Model
The consortium’s goal is to develop the first open Polish LLM and an associated smart assistant. The project will adhere to ethical and responsible best practices in AI, incorporating data representativeness, transparency, and fairness.
Lead: Maciej Ogrodniczuk
Universal Discourse: a multilingual model of discourse relations
The project aims to create a unified description of discourse relations (at the level of discourse markers, relation arguments, and relation types) in a multilingual setting by harmonising current corpus-based discourse representation models.
Lead: Maciej Ogrodniczuk