Tomislav Prusina
PhD student, Department of Mathematics, Josip Juraj Strossmayer University of Osijek, Trg Ljudevita Gaja 6, Osijek, HR-31000, Croatia
Research Interests
- Machine learning
Degrees
- MSc in mathematics, Mathematics and Computer Science, Department of Mathematics, University of Osijek, Croatia, 2022.
- BSc in Computer Science, Department of Mathematics, University of Osijek, Croatia, 2020.
Publications
Technical Reports
- T. Prusina, D. Matijević, L. Borozan, J. Maltar, A. Jovanović, Compressing Sentence Representation with maximum Coding Rate Reduction (2023). In most natural language inference problems, sentence representation is needed for semantic retrieval tasks. In recent years, pre-trained large language models have been quite effective for computing such representations. These models produce high-dimensional sentence embeddings. An evident performance gap between large and small models exists in practice. Hence, due to space and time hardware limitations, there is a need to attain comparable results when using the smaller model, which is usually a distilled version of the large language model. In this paper, we assess the model distillation of the sentence representation model Sentence-BERT by augmenting the pre-trained distilled model with a projection layer additionally learned on the Maximum Coding Rate Reduction (MCR2) objective, a novel approach developed for general-purpose manifold clustering. We demonstrate that the new language model with reduced complexity and sentence embedding size can achieve comparable results on semantic retrieval benchmarks.
Teaching
Office Hours: Consultations are also available by appointment.
Personal
Birthplace: Osijek.