Improve Robustness and Accuracy of Deep Neural Network with $L_{2,\infty}$ Normalization

YU Lijia$^{1,2}$, GAO Xiao-Shan$^{1,2}$

  1. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China;
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-08-30 Revised: 2022-06-22 Online: 2023-01-25 Published: 2023-02-09
  • Supported by:
    This work is partially supported by NKRDP under Grant No. 2018YFA0704705 and the National Natural Science Foundation of China under Grant No. 12288201.

YU Lijia, GAO Xiao-Shan. Improve Robustness and Accuracy of Deep Neural Network with $L_{2,\infty}$ Normalization[J]. Journal of Systems Science and Complexity, 2023, 36(1): 3-28.

In this paper, the $L_{2,\infty}$ normalization of the weight matrices is used to enhance the robustness and accuracy of deep neural networks (DNNs) with ReLU activation functions. It is shown that the $L_{2,\infty}$ normalization leads to large dihedral angles between adjacent faces of the DNN function graph and hence to smoother DNN functions, which reduces over-fitting. A global measure is proposed for the robustness of a classification DNN: the average radius of the maximal robust spheres centered at the training samples. A lower bound for this robustness measure in terms of the $L_{2,\infty}$ norm is given. An upper bound for the Rademacher complexity of DNNs with $L_{2,\infty}$ normalization is also given. Finally, an algorithm is given to train DNNs with the $L_{2,\infty}$ normalization, and numerical experiments show that the $L_{2,\infty}$ normalization is effective in improving both robustness and accuracy.
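
The abstract does not spell out the norm or the training algorithm, so the following is only a minimal sketch of what constraining a weight matrix in the $L_{2,\infty}$ norm could look like. It assumes the convention that $\|W\|_{2,\infty}$ is the largest Euclidean norm among the rows of $W$ (each row holding the incoming weights of one neuron), and it uses a simple row-wise rescaling step. The function names `l2_inf_norm` and `project_l2_inf` and the bound `c` are hypothetical and introduced for illustration; they are not taken from the paper, whose algorithm may differ.

```python
import math


def l2_inf_norm(W):
    """Largest Euclidean norm among the rows of W (assumed convention)."""
    return max(math.sqrt(sum(w * w for w in row)) for row in W)


def project_l2_inf(W, c):
    """Rescale every row whose L2 norm exceeds c so that its norm becomes c.

    Rows that already satisfy the bound are left unchanged, so the result
    has L_{2,inf} norm at most c (up to floating-point rounding).
    """
    projected = []
    for row in W:
        norm = math.sqrt(sum(w * w for w in row))
        scale = 1.0 if norm <= c else c / norm
        projected.append([w * scale for w in row])
    return projected


# Hypothetical 2x2 weight matrix: the first row violates the bound c = 1.
W = [[3.0, 4.0], [0.3, 0.4]]
W_proj = project_l2_inf(W, 1.0)
```

In a training loop, a step like this would typically be applied to each layer's weight matrix after every gradient update, which keeps every neuron's weight vector inside a ball of radius `c`.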