UNLPS at TextGraphs-16 Natural Language Premise Selection Task: Unsupervised Natural Language Premise Selection in Mathematical Text using Sentence-MPNet

Research output: Contribution to journal › Article › peer-review

Abstract

This paper describes our submission to the TextGraphs 2022 shared task at COLING 2022: Natural Language Premise Selection (NLPS) from mathematical texts. NLPS is the task of selecting, from a knowledge base written in natural language and mathematical formulae, the mathematical statements (called premises) that are most likely to be used in a particular mathematical proof. We formulated the task as an unsupervised semantic similarity problem: we first obtained contextualized embeddings of both the premises and the mathematical proofs using sentence transformers, then computed the cosine similarity between these embeddings and selected the premises with the highest cosine scores as the most probable. In terms of mean average precision (MAP), our system improves by about 23.5% in relative terms (0.1516 versus 0.1228) over the baseline system, which uses bag-of-words models based on term frequency-inverse document frequency (TF-IDF).
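The ranking and scoring steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper obtains embeddings from a Sentence-MPNet model, whereas the vectors below are hand-written toy values standing in for those embeddings so that the example runs with the Python standard library alone. The function names (`cosine_similarity`, `rank_premises`, `average_precision`) and the premise identifiers are hypothetical, and the average-precision definition shown is the common untruncated one, which may differ from the shared task's exact evaluation script.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_premises(
    proof_vec: Vector, premise_vecs: Dict[str, Vector], top_k: int
) -> List[Tuple[str, float]]:
    """Score every premise against the proof and return the top_k by cosine."""
    scored = [
        (premise_id, cosine_similarity(proof_vec, vec))
        for premise_id, vec in premise_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def average_precision(ranked_ids: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@rank over the ranks where a relevant premise appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, premise_id in enumerate(ranked_ids, start=1):
        if premise_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


# Toy stand-ins for sentence embeddings (assumption: real ones come from a model).
proof = [1.0, 0.0, 1.0]
premises = {
    "p1": [1.0, 0.0, 1.0],
    "p2": [0.0, 1.0, 0.0],
    "p3": [1.0, 1.0, 0.0],
}

ranked = rank_premises(proof, premises, top_k=3)
ranked_ids = [premise_id for premise_id, _ in ranked]
ap = average_precision(ranked_ids, relevant={"p1", "p2"})

# Relative improvement reported in the abstract: (0.1516 - 0.1228) / 0.1228.
relative_gain = (0.1516 - 0.1228) / 0.1228
```

MAP is the mean of `average_precision` over all query proofs. With the figures quoted in the abstract, `relative_gain` evaluates to roughly 0.235, which matches the stated improvement of about 23.5%.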

Original language: English
Pages (from-to): 119-123
Number of pages: 5
Journal: Proceedings - International Conference on Computational Linguistics, COLING
Volume: 29
Issue number: 16
Publication status: Published - 2022
Event: 16th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2022, in conjunction with the 29th International Conference on Computational Linguistics, COLING 2022 - Gyeongju, Korea, Republic of
Duration: 12 Oct 2022 to 17 Oct 2022
