
Journal of Library and Information Science in Agriculture, 2023, Vol. 35, Issue (4): 90-99. doi: 10.13998/j.cnki.issn1002-1248.23-0154


Application of EfficientNet-based Transfer Learning in Image Classification of Modern Documents: Taking Shanghai Library's "Picture Gallery of Modern Chinese Literature" as an Example

YANG Min, GUO Limin   

  1. Shanghai Library, Shanghai 200031
  • Received: 2023-03-14  Online: 2023-04-05  Published: 2023-07-12

Abstract: [Purpose/Significance] Images in modern literature are important historical sources and are increasingly valued by humanities researchers. Deep annotation of large-scale image resources has become an important part of building image data infrastructure, and analyzing the content of massive image collections with technologies such as deep learning is a new direction in image studies. This paper addresses the challenges of automatically classifying large-scale modern document images. It aims to improve accuracy and efficiency in practical applications through an empirical study of transfer learning based on a simplified EfficientNet network optimized for modern document image classification. [Method/Process] The study uses selected images from Shanghai Library's "Illustrated Century - Modern Chinese Literature Library", a resource in which the "National Newspaper Index" explores in depth the image content of modern Chinese literature. Based on an analysis of the characteristics of modern literature images, we increased the diversity of the 7,645 selected sample images by chaining image augmentation techniques such as cropping, white balance adjustment, posterization (tone separation), and affine transformation. Drawing on a study of deep learning algorithms, we then carried out transfer learning by fine-tuning a simplified EfficientNet deep convolutional neural network model, and identified an optimized model that performs well in modern literature image classification. The simplified model achieved an average Top-1 classification accuracy of 90.97% and an average F1 value of 91.00%, which supports its simplicity, efficiency, and good generalization ability for this task. The experiments also showed that some images were highly similar in morphology, which led to unsatisfactory classification results for those images. This finding nonetheless provides valuable insight for further optimizing and simplifying the EfficientNet network model. A performance comparison with the ResNet50-vd network also shows that the simplified EfficientNet network can support incremental, iterative training of subsequent models more economically and efficiently, toward high-precision artificial intelligence classification of modern literature image databases. [Results/Conclusions] The experimental results indicate that the model improves both the efficiency and the accuracy of image classification, and so has practical value for addressing the automatic classification of large-scale images in modern literature. In future work, we will continue to explore its application to extracting semantic information from digital images. By pre-processing digital images and extracting their content and features, we aim to provide technical support for automatic semantic information extraction, reducing the manual workload and enabling semantic description of millions of images.
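
The abstract reports an average Top-1 accuracy and an average F1 value without stating how the average is taken. As an illustration only, the sketch below computes Top-1 accuracy and a macro-averaged F1 (an unweighted mean of per-class F1 scores) from true and predicted labels. The macro averaging, the function names, and the class labels in the example are assumptions made for this sketch and are not taken from the paper.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose top-ranked prediction equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # true class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only (not the paper's categories).
    y_true = ["portrait", "portrait", "map", "map", "cartoon", "cartoon"]
    y_pred = ["portrait", "map", "map", "map", "cartoon", "portrait"]
    print(f"Top-1 accuracy: {top1_accuracy(y_true, y_pred):.4f}")
    print(f"Macro F1:       {macro_f1(y_true, y_pred):.4f}")
```

If the reported F1 were instead weighted by class frequency, the per-class scores would be combined with class-size weights rather than a plain mean, which can give a different value on an imbalanced data set.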

Key words: ResNet, EfficientNet, transfer learning, image classification, modern literature

CLC Number: 

  • TP391.41