Detecting Uninformative Research Job Titles via Classifier Failures—Zero Shot Approach
Martin Víta
DOI: http://dx.doi.org/10.15439/2022F285
Citation: Position Papers of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 31, pages 35–39 (2022)
Abstract. The aim of this paper is to introduce a novel approach to detecting "uninformative" job titles in the research domain, i.e., titles that convey little or no information about the focus or content of a particular job, such as "Academic staff member AP/2" or "PhD student position". Such job titles decrease the success rate of job advertisements. The proposed approach is a zero-shot approach: it exploits only an existing, easily accessible classification of jobs into research fields and does not require any additional (manual) annotations. This work introduces an experimental corpus and provides preliminary results of our approach.
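To make the idea of "classifier failures" concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify the classifier, the features, or the exact failure criterion, so everything here is an assumption for illustration: a small multinomial naive Bayes model over title tokens, leave-one-out prediction, and the rule "a title is flagged when its research field is mispredicted". The function names (`tokenize`, `train`, `predict`, `flag_uninformative`) and the toy postings are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(title):
    # Lowercase, split on whitespace and "/", keep purely alphabetic tokens.
    return [t for t in title.lower().replace("/", " ").split() if t.isalpha()]


def train(examples):
    # Collect per-field token counts and field frequencies (naive Bayes statistics).
    token_counts = defaultdict(Counter)
    field_counts = Counter()
    for title, field in examples:
        field_counts[field] += 1
        token_counts[field].update(tokenize(title))
    return token_counts, field_counts


def predict(model, title):
    # Return the field with the highest log-posterior under add-one smoothing.
    token_counts, field_counts = model
    vocab = {tok for counts in token_counts.values() for tok in counts}
    total = sum(field_counts.values())
    best_field, best_score = None, -math.inf
    for field in sorted(field_counts):
        counts = token_counts[field]
        # The extra +1 reserves probability mass for tokens never seen in training.
        denom = sum(counts.values()) + len(vocab) + 1
        score = math.log(field_counts[field] / total)
        for tok in tokenize(title):
            score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_field, best_score = field, score
    return best_field


def flag_uninformative(examples):
    # Assumed failure criterion: a title is flagged when a model trained on all
    # other postings predicts a field different from the posting's own label.
    flagged = []
    for i, (title, field) in enumerate(examples):
        model = train(examples[:i] + examples[i + 1:])
        if predict(model, title) != field:
            flagged.append((title, field))
    return flagged


# Hypothetical toy postings: (job title, research field label).
examples = [
    ("Postdoc in organic chemistry", "Chemistry"),
    ("Researcher in polymer chemistry", "Chemistry"),
    ("PhD student position", "Chemistry"),
    ("Postdoc in machine learning", "Computer science"),
    ("Researcher in machine learning systems", "Computer science"),
    ("PhD student position", "Computer science"),
]

flagged = flag_uninformative(examples)
print(flagged)
```

On this toy data only the two "PhD student position" postings are flagged: the same wording appears under both fields, so the title carries no signal about the field and the held-out prediction fails. No manual "informative/uninformative" labels are used, only the existing field labels, which is the zero-shot aspect the abstract describes.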