Taja Kuzman
Title
Cited by
Year
Automatic genre identification: a survey
T Kuzman, N Ljubešić
Language Resources and Evaluation, 1-34, 2023
Cited by: 72; Year: 2023
Neural machine translation of literary texts from English to Slovene
T Kuzman, Š Vintar, M Arcan
Proceedings of the Qualities of Literary Machine Translation, 1-9, 2019
Cited by: 33; Year: 2019
ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification
T Kuzman, I Mozetič, N Ljubešić
arXiv preprint arXiv:2303.03953, 2023
Cited by: 28*; Year: 2023
Multilingual comparable corpora of parliamentary debates ParlaMint 2.1
T Erjavec, M Ogrodniczuk, P Osenova, N Ljubešić, K Simov, V Grigorova, ...
CLARIN ERIC, 2021
Cited by: 22; Year: 2021
Training corpus ssj500k 1.3. Slovenian language resource repository CLARIN.SI
S Krek, T Erjavec, K Dobrovoljc, S Može, N Ledinek, N Holz
Cited by: 18; Year: 2013
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages
M Bañón, M Esplà-Gomis, ML Forcada, C García-Romero, T Kuzman, ...
23rd Annual Conference of the European Association for Machine Translation …, 2022
Cited by: 16; Year: 2022
The GINCO training dataset for web genre identification of documents out in the wild
T Kuzman, P Rupnik, N Ljubešić
arXiv preprint arXiv:2201.03857, 2022
Cited by: 11; Year: 2022
Verbal multiword expressions in Slovene
P Gantar, S Krek, T Kuzman
Computational and Corpus-Based Phraseology: Second International Conference …, 2017
Cited by: 6; Year: 2017
Assessing comparability of genre datasets via cross-lingual and cross-dataset experiments
T Kuzman, N Ljubešić, S Pollak
Jezikovne tehnologije in digitalna humanistika: zbornik konference …, 2022
Cited by: 5; Year: 2022
Training corpus ssj500k 2.2
S Krek, K Dobrovoljc, T Erjavec, S Može, N Ledinek, N Holz, K Zupan, ...
Centre for Language Resources and Technologies, University of Ljubljana, 2019
Cited by: 5; Year: 2019
Automatic Genre Identification for Robust Enrichment of Massive Text Collections: Investigation of Classification Methods in the Era of Large Language Models
T Kuzman, I Mozetič, N Ljubešić
Machine Learning and Knowledge Extraction 5 (3), 1149-1175, 2023
Cited by: 4; Year: 2023
BENCHić-lang: A Benchmark for Discriminating between Bosnian, Croatian, Montenegrin and Serbian
P Rupnik, T Kuzman, N Ljubešić
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial …, 2023
Cited by: 4; Year: 2023
Get to Know Your Parallel Data: Performing English Variety and Genre Classification over MaCoCu Corpora
T Kuzman, P Rupnik, N Ljubešić
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial …, 2023
Cited by: 4; Year: 2023
Slovene-English parallel corpus MaCoCu-sl-en 2.0
M Bañón, M Chichirau, M Esplà-Gomis, ML Forcada, A Galiano-Jiménez, ...
Jožef Stefan Institute, 2023
Cited by: 4; Year: 2023
Exploring the Impact of Lexical and Grammatical Features on Automatic Genre Identification
T Kuzman, N Ljubešić
Proceedings of the Odkrivanje Znanja in Podatkovna Skladišča—SiKDD …, 2022
Cited by: 4; Year: 2022
Glagolske večbesedne enote v učnem korpusu ssj500k 2.1
P Gantar, ŠA Holdt, J Čibej, T Kuzman, T Kavčič
Proceedings of the conference on Language Technologies & Digital Humanities …, 2018
Cited by: 3; Year: 2018
The ParlaMint Project: Ever-growing Family of Comparable and Interoperable Parliamentary Corpora
M Ogrodniczuk, P Osenova, T Erjavec, D Fišer, N Ljubešić, Ç Çöltekin, ...
CLARIN Annual Conference Proceedings, 62, 2023
Cited by: 2; Year: 2023
Serbian-English parallel corpus MaCoCu-sr-en 1.0
M Bañón, M Chichirau, M Esplà-Gomis, ML Forcada, A Galiano-Jiménez, ...
Jožef Stefan Institute, 2023
Cited by: 2*; Year: 2023
Croatian-English parallel corpus MaCoCu-hr-en 2.0
M Bañón, M Chichirau, M Esplà-Gomis, ML Forcada, A Galiano-Jiménez, ...
Jožef Stefan Institute, 2023
Cited by: 2; Year: 2023
Training corpus SUK 1.0
Š Arhar Holdt, S Krek, K Dobrovoljc, T Erjavec, P Gantar, J Čibej, E Pori, ...
Centre for Language Resources and Technologies, University of Ljubljana, 2022
Cited by: 2; Year: 2022