Ahkyar Khadafi
Muhammad Iqbal

Abstract

This research investigates the effectiveness of combining SIFT-based feature extraction with Random Forest classification for motif classification, a task that arises in fields such as bioinformatics, image recognition, and pattern analysis. The approach begins by extracting features from motifs with the Scale-Invariant Feature Transform (SIFT) algorithm, whose descriptors are invariant to scale and rotation and partially robust to affine distortion. These features are then reduced in dimensionality using Principal Component Analysis (PCA) to lower the computational cost. The reduced feature vectors are classified by a Random Forest model, which aggregates the predictions of multiple decision trees through majority voting. Experimental results indicate that this combination achieves high accuracy, with the Random Forest classifier handling variations in motif appearance well. The model's performance is evaluated using accuracy, precision, and recall. The results provide a foundation for further work on motif classification, including extensions to more complex datasets and optimization techniques.
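The last two steps in the abstract, majority voting across trees and evaluation by accuracy, precision, and recall, can be illustrated with a minimal sketch. It is not the authors' implementation: the class labels (`floral`, `geometric`), the per-tree votes, and the helper names `majority_vote` and `evaluate` are all hypothetical, chosen only to make the arithmetic concrete. SIFT extraction and PCA are omitted.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label chosen by the most trees for one sample.

    Assumption: ties go to the label that appears first in the list.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


def evaluate(y_true, y_pred, positive):
    """Compute accuracy, plus precision and recall for one 'positive' class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical votes from three decision trees for four motif samples.
votes_per_sample = [
    ["floral", "floral", "geometric"],
    ["geometric", "geometric", "geometric"],
    ["floral", "geometric", "geometric"],
    ["floral", "floral", "floral"],
]
y_true = ["floral", "geometric", "floral", "floral"]  # hypothetical ground truth

# The forest's prediction for each sample is the majority label of its trees.
y_pred = [majority_vote(votes) for votes in votes_per_sample]
accuracy, precision, recall = evaluate(y_true, y_pred, positive="floral")

print(y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the third sample is outvoted into the wrong class, which lowers recall for `floral` while leaving its precision unaffected. That is why the abstract reports all three metrics and not accuracy alone.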

Article Details

How to Cite
Khadafi, A. and Iqbal, M. (2022) “Implementation of Random Forest for Motif Classification Based on Sift”, Jurnal Mantik, 5(4), pp. 2660-2666. doi: 10.35335/mantik.v5i4.2060.