The Comparison of Activation Functions in Feature Extraction Layer using Sharpen Filter
DOI: https://doi.org/10.37385/jaets.v6i2.5895

Keywords: Convolutional Neural Networks, Activation Function, Feature Extraction, Sharpen Filter, Image Processing, Deep Learning

Abstract
Activation functions are a critical component in the feature extraction layer of deep learning models, influencing their ability to identify patterns and extract meaningful features from input data. This study investigates the impact of five widely used activation functions—ReLU, SELU, ELU, sigmoid, and tanh—on convolutional neural network (CNN) performance when combined with sharpening filters for feature extraction. Using a custom-built CNN program module within the researchers’ machine learning library, Analytical Libraries for Intelligent-computing (ALI), the performance of each activation function was evaluated by analyzing mean squared error (MSE) values obtained during the training process. The findings revealed that ReLU consistently outperformed other activation functions by achieving the lowest MSE values, making it the most effective choice for feature extraction tasks using sharpening filters. This study provides practical and theoretical insights, highlighting the significance of selecting suitable activation functions to enhance CNN performance. These findings contribute to optimizing CNN architectures, offering a valuable reference for future work in image processing and other machine-learning applications that rely on feature extraction layers. Additionally, this research underscores the importance of activation function selection as a fundamental consideration in deep learning model design.
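The ALI library code used in the study is not shown here, but the comparison idea can be sketched in a few lines: convolve an input image with a sharpening kernel (the feature extraction step), pass the result through each candidate activation function, and score the output with MSE. The snippet below is a minimal, illustrative sketch only; the 3×3 sharpen kernel, the synthetic input image, and the clipped "target" feature map are assumptions standing in for the paper's actual data and training pipeline.

```python
import numpy as np

# Classic 3x3 sharpening kernel (an assumption; the paper's exact filter
# coefficients are not given in the abstract).
SHARPEN_KERNEL = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=float)

# Standard definitions of the five activation functions compared in the study.
ACTIVATIONS = {
    "relu":    lambda x: np.maximum(0.0, x),
    "elu":     lambda x: np.where(x > 0, x, np.expm1(x)),                    # alpha = 1
    "selu":    lambda x: 1.0507 * np.where(x > 0, x, 1.67326 * np.expm1(x)),
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "tanh":    np.tanh,
}

def convolve2d(image, kernel):
    """Valid 2-D convolution of a single-channel image with a small (symmetric) kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def mse(pred, target):
    """Mean squared error: mean((pred - target)^2)."""
    return float(np.mean((pred - target) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32))                             # stand-in for a normalised input image
    target = convolve2d(image, SHARPEN_KERNEL).clip(0, 1)    # hypothetical reference feature map

    for name, act in ACTIVATIONS.items():
        features = act(convolve2d(image, SHARPEN_KERNEL))    # sharpen-filter feature extraction layer
        print(f"{name:8s} MSE = {mse(features, target):.4f}")
```

In the actual study the MSE is tracked over CNN training epochs rather than computed from a single forward pass, but the sketch shows where the activation function sits relative to the sharpening filter and how the error metric is formed.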