
Position Papers of the 17th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 31

Detecting Uninformative Research Job Titles via Classifier Failures—Zero Shot Approach

DOI: http://dx.doi.org/10.15439/2022F285

Citation: Position Papers of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 31, pages 35–39.


Abstract. The aim of this paper is to introduce a novel approach to detecting "uninformative" job titles in the research domain, i.e., titles that convey little or no information about the focus and/or content of a particular job, such as "Academic staff member AP/2" or "PhD student position". Such job titles decrease the success rate of job advertisements. The proposed approach belongs to the family of zero-shot approaches: it exploits only an existing, easily accessible classification of jobs into research fields and requires no additional (manual) annotation. This work introduces an experimental corpus and provides preliminary results of our approach.
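The core idea of "detection via classifier failure" can be illustrated with a deliberately simple sketch: train or define a classifier that maps job titles to research fields, and flag a title as uninformative when the classifier cannot assign it to any field with sufficient confidence. The keyword lists, threshold, and scoring below are hypothetical stand-ins, not the paper's actual model:

```python
# Illustrative sketch (not the authors' implementation): a toy field
# "classifier" based on keyword overlap. A title on which the classifier
# fails, i.e. no field reaches the score threshold, is flagged as
# uninformative. All keywords and the threshold are invented examples.

FIELD_KEYWORDS = {
    "computer science": {"software", "machine", "learning", "data", "algorithms"},
    "biology": {"cell", "genome", "molecular", "protein", "ecology"},
    "physics": {"quantum", "particle", "optics", "plasma", "materials"},
}

def field_scores(title: str) -> dict:
    """Score each research field by keyword overlap with the title's words."""
    words = set(title.lower().replace("/", " ").split())
    return {field: len(words & kw) for field, kw in FIELD_KEYWORDS.items()}

def is_uninformative(title: str, threshold: int = 1) -> bool:
    """A title is uninformative if no field reaches the threshold,
    i.e. the field classifier 'fails' on it."""
    return max(field_scores(title).values()) < threshold

print(is_uninformative("PhD student position"))         # True
print(is_uninformative("Machine learning researcher"))  # False
```

A real zero-shot variant would replace the keyword overlap with, e.g., embedding similarity between the title and field descriptions, but the failure-based decision rule stays the same: low confidence across all fields signals an uninformative title.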
