
Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

Improving Domain-Specific Retrieval by NLI Fine-Tuning

DOI: http://dx.doi.org/10.15439/2023F7569

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 949–953 (2023)

Abstract. This article investigates the potential of fine-tuning on natural language inference (NLI) data to improve information retrieval and ranking. We demonstrate this for both English and Polish, using data from one of the largest Polish e-commerce sites and selected open-domain datasets. We employ monolingual and multilingual sentence encoders fine-tuned with a supervised method that uses contrastive loss and NLI data. Our results show that NLI fine-tuning improves model performance on both tasks and in both languages, and that it can benefit both mono- and multilingual models. Finally, we examine the uniformity and alignment of the embeddings to explain the effect of NLI-based fine-tuning in the out-of-domain use case.
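The supervised objective sketched in the abstract can be illustrated with the sentence-transformers library: NLI entailment pairs act as positives and contradiction hypotheses as hard negatives, the recipe popularized by supervised SimCSE. This is a minimal sketch under assumed defaults; the base checkpoint, hyperparameters, and toy triplet are placeholders, not the authors' exact configuration.

```python
# Hedged sketch: supervised contrastive fine-tuning on NLI triplets.
# Placeholders: base checkpoint, batch size, and the single toy example.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("xlm-roberta-base")  # any mono- or multilingual encoder

# Each NLI triplet: (premise, entailed hypothesis, contradicted hypothesis).
train_examples = [
    InputExample(texts=[
        "A man is playing a guitar.",        # anchor (premise)
        "A person is making music.",         # positive (entailment)
        "Nobody is playing an instrument.",  # hard negative (contradiction)
    ]),
]

loader = DataLoader(train_examples, shuffle=True, batch_size=64)
# Contrasts each anchor and its positive against the explicit hard
# negative and all other in-batch examples.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)
```

The uniformity and alignment diagnostics mentioned at the end of the abstract follow Wang and Isola's (2020) definitions; a direct PyTorch transcription, assuming L2-normalized embeddings, might look like this:

```python
import torch

def alignment(x: torch.Tensor, y: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """E ||f(x) - f(y)||^alpha over positive pairs; x, y are (n, d), L2-normalized."""
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniformity(x: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """log E exp(-t ||f(x) - f(y)||^2) over all embedding pairs; x is (n, d), L2-normalized."""
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()
```

Lower values are better for both: alignment measures how close positive pairs remain, while uniformity measures how evenly the embeddings spread over the unit hypersphere.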
