### Domagoj Ševerdija

Head of Computer Science and Machine Learning Research Group

School of Applied Mathematics and Informatics

Josip Juraj Strossmayer University of Osijek

Josip Juraj Strossmayer University of Osijek

### Research Interests

- Combinatorial Optimization
- Computational Linguistics
- Natural Language Processing
- Machine Learning

### Degrees

- PhD in Electrical Engineering, FERIT, University in Osijek, Croatia, 2013. [dissertation]
- BSs in Mathematics and Computer Science, School of Applied Mathematics and Computer Science, University of Osijek, Croatia, 2007.

### Publications

Journal Publications

- D. Ševerdija, R. Čorić, M. Orešković, L. Šošić, Detecting inflectional patterns for Croatian verb stems using class activation mappings, Journal of Language Modelling
**12**/1 (2024), 43-68All verbal forms in the Croatian language can be derived from two basic forms: the infinitive and the present stems. In this paper, we present a neural computation model which takes a verb in an infinitive form and finds a mapping to a present form. The same model can be applied vice-versa, i.e.~map a verb from present form to its infinitive form. Knowing the present form of a given verb one can deduce its inflections using grammatical rules. We experiment with our model on the Croatian language which belongs to the Slavic group of languages. In both classification tasks, the model learns a classifier and uses a class activation mapping to find characters in verbs that contribute to classification. We see that the model detects patterns that follow established grammatical rules for deriving present stem form from infinitive stem form and vice-versa. - D. Ševerdija, T. Prusina, L. Borozan, D. Matijević, Efficient Sentence Representation Learning via Knowledge Distillation with Maximum Coding Rate Reduction, CIT. Journal of Computing and Information Technology
**31**/4 (2023)Addressing the demand for effective sentence representation in natural language inference problems, this paper explores the utility of pre-trained large language models in computing such representations. Although these models generate high-dimensional sentence embeddings, a noticeable performance disparity arises when they are compared to smaller models. The hardware limitations concerning space and time necessitate the use of smaller, distilled versions of large language models. In this study, we investigate the knowledge distillation of Sentence-BERT, a sentence representation model, by introducing an additional projection layer trained on the novel Maximum Coding Rate Reduction (MCR2) objective designed for general-purpose manifold clustering. Our experiments demonstrate that the distilled language model, with reduced complexity and sentence embedding size, can achieve comparable results on semantic retrieval benchmarks, providing a promising solution for practical applications. - S. Jelić, D. Ševerdija, Government Formation Problem, Central European Journal of Operations Research (2017), prihvaćen za objavljivanjeIn addition to the same political and ideological attitudes, members of political parties can be interconnected at private and/or professional levels. They are considered as a part of one large social network. After democratic elections, the total effectiveness and stability of a government may depend on expertness and cooperability of its members. Our main goal is not to give a mechanism for pre-elective government formation, but to propose a model that decides what can be a good subset of candidates for positions in the new government. The decision is based on expertness of candidates and their interconnections in the social network. Inspired by the team formation problem in a social network, we present a Government Formation Problem (GFP). We prove that this problem is NP-hard and give an algorithm based on integer linear programming formulation. In the experimental part, we test our algorithm on simulated data using the GUROBI MILP solver.
- D. Matijević, D. Ševerdija, G. Martinović, Efficient Implementations of Guarding 1.5D Terrains, Croatian Operational Research Review
**6**/1 (2015), 79-89In 1.5D Terrain Guarding Problem we are given an $x$-monotone polygonal line defined by $k$ vertices and a set $G$ of points from the terrain, i.e. guards, and a set $N$ of points from the terrain which are to be seen (guarded) by guards. We deal with a weighted version of the guarding problem where guards $G$ have weights and the goal is to find a minimum weight subset of $G$ to cover, and a version of the problem where points from $N$ have demands, and the goal is to find the smallest subset from $G$ such that every point in $N$ is seen by the demanded number of guards. Both problems are NP-hard and have a factor $5$ approximation (cite{journals/alg/Elbassioni08}, cite{journal/Elbassioni12}). We show that if $(1+epsilon)$-approximate solver to the corresponding linear program is a computer, for any $epsilon>0$, an extra $1+epsilon$ factor will appear in the final approximation factor for both problems. We compare our parallel implementation based on GPU and CPU threads with textsc{Gurobi} solver and conclude that our algorithm outperforms Gurobi solver on large and dense inputs typically by one order of magnitude. - K. Elbassioni, D. Matijević, D. Ševerdija, Guarding 1.5D Terrains with Demands, International Journal of Computer Mathematics
**89**/16 (2012), 2143-2151In this paper, we consider the 1.5D terrain guarding problem in which every point on the terrain that is to be covered has an integer demand associated with it. The goal is to find a minimum cardinality set of guards such that each point is guarded by a number of guards satisfying its demand. We present a first constant-factor approximation algorithm for the problem, that is, a $(1+1/d_min)$-approximation algorithm, where $d_min$ is a minimum demand. The algorithm is based on solving appropriate subproblems established by a decomposition of the main problem due to linear programming relaxation of the corresponding covering problem. - K. Elbassioni, E. Krohn, D. Matijević, J. Mestre, D. Ševerdija, Improved Approximations for Guarding 1.5-Dimensional Terrains, Algorithmica
**60**/2 (2011), 451-463We present a 4-approximation algorithm for the problem of placing the fewest guards on a 1.5D terrain so that every point of the terrain is seen by at least one guard. This improves on the currently best approximation factor of 5. Unlike most of the previous techniques, our method is based on rounding the linear programming relaxation of the corresponding covering problem. Besides the simplicity of the analysis, which mainly relies on decomposing the constraint matrix of the LP into totally balanced matrices, our algorithm, unlike previous work, generalizes to the weighted and partial versions of the basic problem. - F. Belić, Ž. Hocenski, D. Ševerdija, Naive Matrix Multiplication versus Strassen Algorithm in Multi-thread Environment, Tehnički vjesnik
**18**/3 (2011), 309-314n last few decades computational power of computers has greatly increased. Highest speeds and power are still reserved for super-computers, but high-speed computers have been available for home and amateur users for some time. Normal user most of the time uses only a small amount of computational resources available ; even in cases of high-strain, a good part of these resources stays unused. This is partly a result of poor programming. Most of programmers still use single- threaded programming although platforms for parallel programming have been widely available for long time. This article describes using one such platform (.NET Framework) to decrease time needed for multiplication of matrices. This article tries to present what results can be achieved using common equipment and easily acquirable software.

Refereed Proceedings

- D. Ševerdija, T. Prusina, A. Jovanović, L. Borozan, J. Maltar, D. Matijević, Compressing Sentence Representation with Maximum Coding Rate Reduction (Best paper award in AIS - Artificial Intelligence Systems track), ICT and Electronics Convention (MIPRO), 2023 46th MIPRO, Opatija, Hrvatska, 2023In most natural language inference problems, sentence representation is needed for semantic retrieval tasks. In recent years, pre-trained large language models have been quite effective for computing such representations. These models produce high-dimensional sentence embeddings. An evident performance gap between large and small models exists in practice. Hence, due to space and time hardware limitations, there is a need to attain comparable results when using the smaller model, which is usually a distilled version of the large language model. In this paper, we assess the model distillation of the sentence representation model Sentence-BERT by augmenting the pre-trained distilled model with a projection layer additionally learned on the Maximum Coding Rate Reduction (MCR2) objective, a novel approach developed for general purpose manifold clustering. We demonstrate that the new language model with reduced complexity and sentence embedding size can achieve comparable results on semantic retrieval benchmarks.
- B. Borozan, L. Borozan, D. Ševerdija, D. Matijević, S. Canzar, Fortuna Detects Novel Splicing in Drosophila scRNASeq Data, ICT and Electronics Convention (MIPRO), 2023 46th MIPRO, Opatija, Hrvatska, 2023, 410-415Recent developments in single-cell RNA sequencing techniques (scRNASeq) have made large quantities of sequenced data available across numerous species and tissues. Alternative splicing (AS) of pre-mRNA introns varies between tissues and even between cell-types and can be altered in disease. The study of novel AS, using standard RNASeq data, has been extensively studied for many years, while similar work on scRNASeq data has been scarce, despite its potential to offer a broader insight into cell-type specific processes. In this paper, we propose a novel pipeline that uses fortuna, a method that efficiently classifies and quantifies novel AS events, to process scRNASeq samples. Due to its short lifespan, high number of progeny, low maintenance cost, and intricate alternative splicing patterns similar in complexity to those of mammals, Drosophila Melanogaster (fruit fly) is a species of particular interest to researchers. Therefore, we experimentally evaluate our pipeline on real-world Drosophila single-cell data samples from the Fly Cell Atlas.
- K. Elbassioni, D. Matijević, D. Ševerdija, I. Ali, Guarding 1.5D Terrains with Demands, 26th European Workshop on Computational Geometry (EuroCG), Dortmund, 2010
- K. Elbassioni, D. Matijević, J. Mestre, D. Ševerdija, Improved approximations for guarding 1.5-dimensional terrains, Symposium on Theoretical Aspects of Computer Science (STACS), Freiburg, 2009We present a 4-approximation algorithm for the problem of placing a fewest guards on a 1.5D terrain so that every point of the terrain is seen by at least one guard. This improves on the currently best approximation factor of 5. Our method is based on rounding the linear programming relaxation of the corresponding covering problem. Besides the simplicity of the analysis, which mainly relies on decomposing the constraint matrix of the LP into totally balanced matrices, our algorithm, unlike previous work, generalizes to the weighted and partial versions of the basic problem.

Others

- D. Matijević, D. Ševerdija, S. Jelić, L. Borozan, Uparena optimizacijska metoda, Math.e : hrvatski matematički elektronski časopis
**30**(2016)U ovom članku analiziramo metode gradijentnog i zrcalnog spusta u području konveksne optimizacije s danim naglaskom na njihove brzine konvergencije. Nadalje, uparujući dvije spomenute metode dobivamo takozvanu uparenu metodu čija analiza konvergencije pokazuje ubrzanje u odnosu na gradijentnu i zrcalnu metodu, te bilo koju drugu nama poznatu metodu prvoga reda. - D. Ševerdija, I. Matić, Grčko - kineski stil u teoriji brojeva, Osječki matematički list
**10**/1 (2010), 43-58 - D. Ševerdija, I. Matić, Metodički aspekti abakusa II, Matematika i škola
**53**(2010), 106-111 - D. Matijević, D. Ševerdija, Problem vidljivosti, Osječki matematički list (2010), prihvaćen za objavljivanjeZa dvije točke kažemo da vide jedna drugu ukoliko ne postoji prepreka koja bi presjecala segment koji ih spaja. Na temelju geometrijskog modela predstaviti ćemo klasične probleme vidljivosti kao što su problem galerije, problem utvrde i problem čuvanja terena. Iznosimo osnovne rezultate vezane uz spomenute probleme i neke od varijacija tih problema.
- D. Ševerdija, I. Matić, Metodički aspekti abakusa I, Matematika i škola
**52**(2009), 57-62 - I. Matić, D. Ševerdija, S. Škorvaga, Numerička ograničenja kineskog abakusa, Osječki matematički list
**9**(2009), 75-91We present some aspects of numerical constraints using chinese abacus in standard arithmetic operations with natural numbers.

Books

- I. Kuzmanović Ivičić, D. Ševerdija, Znanstveno-stručni skup Računalno jezikoslovlje i Zlatna formula hrvatskoga jezika - Knjiga sažetaka, Sveučilište J. J. Strossmayera u Osijeku – Odjel za matematiku, Osijek, 2021.

### Projects

**Računalom upravljano korpusno jezikoslovlje**UNIOS – ZUP 2018 – Interdisciplinarni istraživački projekti”, (Odjel za matematiku, Sveučilište u Osijeku – Sveučilište Josipa Jurja Strossmayera u Osijeku), 2019-2020

Project Lead: Domagoj Ševerdija,**Mrežni okvir „Ča-Kaj-Što“ sastavnice hrvatskoga identiteta**Stvaralaštvo, ekologija, baština i dobrota – Zaklada Adris”, (Odjel za matematiku, Sveučilište u Osijeku – Zaklada Adris), 2020 – 2021

Project Lead: Domagoj Ševerdija**Hrvatski jezik u mrežnom oblaku svjetskih jezika**

Stvaralaštvo, ekologija, baština i dobrota – Zaklada Adris”, (Odjel za matematiku, Sveučilište u Osijeku – Zaklada Adris), 2019-2020

Project Lead: Domagoj Ševerdija

### Teaching

- I051 Računalno jezikoslovlje (Computational Linguistics)
- I069 Obrada prirodnog jezika metodama dubokog učenja (Natural Language Processing with Deep Learning)
- Io59 3D računalna grafika (3D Computer Graphics)
- I054 Strukture podataka i algoritmi II (Data structures and algorithms 2)