
Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

Filtering Decision Rules Driven by Sequential Forward and Backward Selection of Attributes: An Illustrative Example in Stylometric Domain


DOI: http://dx.doi.org/10.15439/2023F7295

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 833–842 (2023)


Abstract. The paper presents investigations concerning the decision rule filtering process controlled by the estimated relevance of available attributes. In the conducted study, two search directions were used, sequential forward selection and sequential backward elimination, applied after the knowledge discovery step to the rule sets inferred from a dataset. The steps of sequential search, along with two different strategies of rule selection, were governed by three rankings obtained for variables, all related to characteristics of the data and of the rules that can be induced: (i) a ranking based on a weighting factor referring to the occurrence of attributes in generated decision reducts, (ii) the OneR ranking exploiting the properties of short rules, and (iii) the proposed ranking defined through the operation of a greedy algorithm for rule induction. The three rankings were compared from the perspective of their usefulness for the selection of rules performed in the two directions. The resulting sets of rules were analysed with respect to the properties of the constituent decision rules and with respect to the performance of all constructed rule-based classifiers. Substantial experiments were carried out in the stylometric domain, treating the task of authorship attribution as classification. The results obtained indicate that, for all three rankings and search paths, it was possible to obtain a noticeable reduction of attributes while at least maintaining the predictive power of the inducers and at the same time improving the characteristics of the rule sets.
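The following is a minimal illustrative sketch, not taken from the paper, of how a ranking-driven sequential forward selection of attributes could be used to filter a set of decision rules: attributes are admitted in ranking order, rules conditioned only on the admitted attributes are kept, and the smallest attribute subset whose filtered rules match the baseline performance is returned. All identifiers (Rule, filter_rules, forward_selection, evaluate) are hypothetical; evaluate stands in for any quality estimate of a rule-based classifier, for example cross-validated accuracy.

    # Illustrative sketch only; names and structure are assumptions, not the paper's code.
    from dataclasses import dataclass

    @dataclass
    class Rule:
        conditions: dict   # attribute -> required value in the rule premise
        decision: str      # class label assigned by the rule

    def filter_rules(rules, allowed):
        """Keep only rules whose conditions use exclusively the allowed attributes."""
        return [r for r in rules if set(r.conditions) <= allowed]

    def forward_selection(ranking, rules, evaluate):
        """Admit attributes in ranking order (best-ranked first); return the first
        attribute subset whose filtered rule set performs at least as well as the
        complete rule set."""
        baseline = evaluate(rules)
        selected = set()
        for attribute in ranking:
            selected.add(attribute)
            candidate = filter_rules(rules, selected)
            if candidate and evaluate(candidate) >= baseline:
                return selected, candidate   # fewer attributes, performance maintained
        return selected, rules               # fall back to the complete rule set

Sequential backward elimination would traverse the ranking in the opposite direction, starting from the full attribute set and discarding the lowest-ranked attributes as long as the filtered rule set does not lose performance.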
