EMOTION DETECTION IN INDONESIAN-LANGUAGE TEXT USING AN ENSEMBLE APPROACH

Syafrial Fachri Pane, Faisal Abdullah, Roni Habibi

Abstract


Emotion in written text is often difficult to recognize because text lacks the nonverbal cues, such as facial expressions and vocal intonation, that usually help convey a person's feelings. This research addresses that challenge by developing an emotion detection model for Indonesian text. The approach is Ensemble Learning, which combines three Machine Learning models, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost, to improve emotion detection results. The main contribution of this research is the implementation of the Ensemble method for detecting emotions in Indonesian text, with performance evaluated using accuracy, precision, recall, F1 score, and ROC AUC. The evaluation results show that the Ensemble model outperforms previous models, achieving 87.14% on each of accuracy, precision, recall, and F1 score, and a ROC AUC of 97.90%. To further enhance performance, the study tunes the hyperparameters of the SVM and XGBoost models with GridSearchCV and uses the Automated Machine Learning (AutoML) tool TPOT to generate the KNN model.
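As a rough illustration of how three classifiers can be combined and scored, the sketch below averages per-class probabilities (soft voting) and computes accuracy with macro-averaged precision, recall, and F1. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which combination rule or averaging scheme was used, and the label set, the probability values, and the function names `soft_vote` and `macro_metrics` are all hypothetical.

```python
# Hypothetical label set; the abstract does not list the emotion classes.
LABELS = ["anger", "joy", "sadness"]


def soft_vote(model_probs):
    """Average class probabilities across models; return (winning index, averages)."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    avg = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__), avg


def macro_metrics(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


# Invented probabilities for four texts, ordered as (SVM, KNN, XGBoost).
# In practice these would come from each fitted model's probability output.
samples = [
    ([0.6, 0.3, 0.1], [0.4, 0.4, 0.2], [0.5, 0.2, 0.3]),
    ([0.2, 0.5, 0.3], [0.6, 0.2, 0.2], [0.1, 0.7, 0.2]),
    ([0.5, 0.1, 0.4], [0.2, 0.2, 0.6], [0.2, 0.1, 0.7]),
    ([0.3, 0.4, 0.3], [0.2, 0.6, 0.2], [0.4, 0.3, 0.3]),
]
y_true = [0, 1, 2, 0]  # invented gold labels (indices into LABELS)

y_pred = [soft_vote(probs)[0] for probs in samples]
scores = macro_metrics(y_true, y_pred, len(LABELS))

print([LABELS[i] for i in y_pred])
print(scores)
```

In the second sample, KNN alone would pick "anger", but the averaged probabilities favor "joy", which is the kind of correction an ensemble is meant to provide. For single-label classification, micro-averaged precision, recall, and F1 all equal accuracy, which would be consistent with the identical 87.14% figures reported above, although the abstract does not state the averaging scheme.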

DOI: https://doi.org/10.31884/jtt.v10i2.551


Copyright (c) 2024 JTT (Jurnal Teknologi Terapan)

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)