Joint MTS-Skoltech Laboratory
The NLP group is privileged to lead the collaboration with MTS AI in the field of Natural Language Processing (NLP), a subfield of Artificial Intelligence (AI). The joint laboratory runs projects that develop cutting-edge language processing technologies using modern statistical and neural approaches, as well as datasets for contemporary tasks such as active learning for NLP, neural textual style transfer, and others.
This page contains a list of scientific publications resulting from this collaboration:
Journal articles
- Sevgili, O., Shelmanov, A., Arkhipov, M., Panchenko, A., and Biemann, C. (2022): Neural Entity Linking: A Survey of Models Based on Deep Learning. In Semantic Web journal. IOS Press.
- Dementieva, D., Moskovskiy, D., Logacheva, V., Dale, D., Kozlova, O., Semenov, N., and Panchenko, A. (2021): Methods for Detoxification of Texts for the Russian Language. Multimodal Technologies and Interaction, 5, 54.
Conference proceedings
- Logacheva, V., Dementieva, D., Krotova, I., Fenogenova, A., Nikishina, I., Shavrina, T., and Panchenko, A. (2022): RUSSE-2022: Findings of the First Russian Detoxification Shared Task Based on Parallel Corpora. In Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2022”. Moscow, Russia (Online)
- Babakov, N., Dale, D., Logacheva, V., and Panchenko, A. (2022): Studying the role of named entities for content preservation in text style transfer. In Proceedings of the 27th International Conference on Natural Language & Information Systems (NLDB-22). Valencia, Spain. Springer Lecture Notes in Computer Science (LNCS).
- Babakov, N., Dale, D., Logacheva, V., and Panchenko, A. (2022): A large-scale computational study of content preservation measures for text style transfer and paraphrase generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 300–321, Dublin, Ireland. Association for Computational Linguistics.
- Moskovskiy, D., Dementieva, D., and Panchenko, A. (2022): Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, pages 346–354, Dublin, Ireland. Association for Computational Linguistics.
- Logacheva, V., Dementieva, D., Ustyantsev, S., Moskovskiy, D., Dale, D., Krotova, I., Semenov, N., and Panchenko, A. (2022): ParaDetox: Detoxification with Parallel Data. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6804–6818, Dublin, Ireland. Association for Computational Linguistics.
- Dale, D., Voronov, A., Dementieva, D., Logacheva, V., Kozlova, O., Semenov, N., and Panchenko, A. (2021): Text Detoxification using Large Pre-trained Neural Models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP-2021). Punta Cana, Dominican Republic.
- Vorona, I., Phan, A.-H., Panchenko, A., Cichocki, A. (2021): Documents Representation via Generalized Coupled Tensor Chain with the Rotation Group Constraint. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Association for Computational Linguistics. Bangkok, Thailand (Online)
- Dementieva, D. and Panchenko, A. (2021): Cross-lingual Evidence Improves Monolingual Fake News Detection. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP): Student Research Workshop. Association for Computational Linguistics. Bangkok, Thailand (Online)
- Dementieva, D., Moskovskiy, D., Logacheva, V., Dale, D., Kozlova, O., Semenov, N., and Panchenko, A. (2021): Methods for Detoxification of Texts for the Russian Language. Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2021”. Moscow, Russia (Online)
- Shelmanov, A., Puzyrev, D., Kupriyanova, L., Belyakov, D., Larionov, D., Khromov, N., Kozlova, O., Artemova, E., Dylov, D. V., and Panchenko, A. (2021): Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Kiev, Ukraine (online).
- Shelmanov, A., Tsymbalov, E., Puzyrev, D., Fedyanin, K., Panchenko, A., and Panov, M. (2021): How Certain is Your Transformer? In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Kiev, Ukraine (online).
Workshop proceedings
- Logacheva, V., Dementieva, D., Krotova, I., Fenogenova, A., Nikishina, I., Shavrina, T., and Panchenko, A. (2022): A Study on Manual and Automatic Evaluation for Text Style Transfer: The Case of Detoxification. In Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval), pages 90–101, Dublin, Ireland. Association for Computational Linguistics.
- Kuimov, M., Dementieva, D., and Panchenko, A. (2022): SkoltechNLP at SemEval-2022 Task 8: Multilingual News Article Similarity via Exploration of News Texts to Vector Representations. In Proceedings of SemEval-22. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). Seattle, Washington, USA.
- Dementieva, D., Ustyantsev, S., Dale, D., Kozlova, O., Semenov, N., Panchenko, A., and Logacheva, V. (2021): Crowdsourcing of Parallel Corpora: the Case of Style Transfer for Detoxification. In Proceedings of the 2nd Crowd Science Workshop: Trust, Ethics, and Excellence in Crowdsourced Data Management at Scale, co-located with the 47th International Conference on Very Large Data Bases (VLDB 2021). Copenhagen, Denmark.
- Dale, D., Markov, I., Logacheva, V., Kozlova, O., Semenov, N., Panchenko, A. (2021): SkoltechNLP at SemEval-2021 Task 5: Leveraging Sentence-level Pre-training for Toxic Span Detection. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics. Bangkok, Thailand (Online)
- Dementieva, D., Moskovskiy, D., Logacheva, V., Dale, D., Kozlova, O., Semenov, N., and Panchenko, A. (2021): Methods for Detoxification of Texts for the Russian Language. In Proceedings of the 1st Workshop on NLP for Positive Impact (non-archival). The Association for Computational Linguistics and The Asian Federation of Natural Language Processing. Bangkok, Thailand (online)
- Babakov, N., Logacheva, V., Kozlova, O., Semenov, N., and Panchenko, A. (2021): Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company’s Reputation. In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing (BSNLP 2021). The 2021 Conference of the European Chapter of the Association for Computational Linguistics. Kyiv, Ukraine (Online)
Presentations
- David Dale, a research engineer at the joint lab, spoke at the conference Conversations AI on dialogue systems, winning an award for the best presentation.
- Nikolay Babakov, Daryna Dementieva, and Varvara Logacheva delivered three talks at the Trustworthy AI conference on, respectively, fake news and propaganda detection, inappropriate message detection, and text detoxification.
- An invited talk by Nikolay Babakov at the Crowd Science Seminar on detection of inappropriate messages. Video.
Press Releases