Journal of Library and Information Science in Agriculture ›› 2023, Vol. 35 ›› Issue (4): 90-99. doi: 10.13998/j.cnki.issn1002-1248.23-0154

• Application Practice •

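Application of EfficientNet-based Transfer Learning in Image Classification of Modern Documents: Taking Shanghai Library's "Picture Gallery of Modern Chinese Literature" as an Example

YANG Min (杨敏), GUO Limin (郭利敏)

  1. Shanghai Library, Shanghai 200031
  • Received: 2023-03-14 Online: 2023-04-05 Published: 2023-07-12
  • About the authors: YANG Min, female, associate research librarian, Shanghai Library (Institute of Scientific and Technical Information of Shanghai); research interests: digital humanities and the digitization of historical document resources. GUO Limin, male, associate senior engineer, Shanghai Library (Institute of Scientific and Technical Information of Shanghai); research interests: convolutional neural networks and smart libraries.
  • Funding:
    Social Science Fund project "Research and Application of a Semantic Description Framework for Digital Images in Modern Periodicals" (19BTQ044)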

Abstract: [Purpose/Significance] Images in modern documents are important historical sources and are increasingly valued by humanities researchers. Deep annotation of large-scale image resources has accordingly become an important part of building image data infrastructure, and using deep learning to analyze the content of massive image collections is a new direction in image research. This paper presents an empirical study of transfer learning based on a simplified EfficientNet network, with the aim of solving the automatic classification of large-scale modern document images and improving accuracy and efficiency in practical applications. [Method/Process] The data are selected images from Shanghai Library's "Illustrated Century - Modern Chinese Literature Library", a resource built by the "National Newspaper Index" through in-depth exploration of the image content of modern Chinese documents. Based on an analysis of the characteristics of modern document images, the diversity of a data set of 7,645 images was increased by serially stacking image augmentation operations: cropping, white balance, posterization (tone separation), and affine transformation. A simplified EfficientNet deep convolutional neural network was then fine-tuned for transfer learning, yielding a model that performs well on modern document image classification. The simplified model achieved an average Top-1 classification accuracy of 90.97% and an average F1 score of 91.00%, which validates its simplicity, efficiency, and good generalization ability for this task. The experiments also showed that some images had high similarity and phototropism in morphology, which led to unsatisfactory classification results for them; this nevertheless provides valuable insight for further optimizing and simplifying the EfficientNet model. A performance comparison with the ResNet50-vd network further shows that the simplified EfficientNet network can support incremental, iterative training of subsequent models more economically and efficiently, toward high-precision automatic classification of modern document image databases. [Results/Conclusions] The experimental results indicate that the model effectively improves the efficiency and accuracy of image classification and is therefore of practical value for the automatic classification of large-scale images in modern documents. Future work will explore its application to extracting semantic information from digital images: through image pre-processing and the extraction of image content and features, it is intended to provide technical support for automatic semantic extraction, reducing manual intervention and enabling semantic description of millions of images.
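The serial stacking of augmentation steps mentioned in the abstract can be illustrated with a minimal sketch. It assumes an image is a nested list of (R, G, B) tuples, and every function name below (`center_crop`, `gray_world_white_balance`, `posterize`, `affine`, `compose`) is hypothetical. The abstract does not give the authors' actual operators or parameters, so gray-world white balance and nearest-neighbour affine sampling are stand-ins chosen for brevity.

```python
from functools import reduce

# Assumption: an image is a list of rows; each row is a list of (R, G, B) tuples in 0-255.

def center_crop(img, height, width):
    """Keep the central height x width window."""
    top = (len(img) - height) // 2
    left = (len(img[0]) - width) // 2
    return [row[left:left + width] for row in img[top:top + height]]

def gray_world_white_balance(img):
    """Scale each channel so its mean matches the overall mean (gray-world assumption)."""
    pixels = [px for row in img for px in row]
    means = [sum(px[c] for px in pixels) / len(pixels) for c in range(3)]
    gray = sum(means) / 3
    gains = [gray / m if m else 1.0 for m in means]
    return [[tuple(min(255, round(px[c] * gains[c])) for c in range(3)) for px in row]
            for row in img]

def posterize(img, bits=2):
    """Keep only the top `bits` bits of each channel (tone separation)."""
    mask = (0xFF << (8 - bits)) & 0xFF
    return [[tuple(v & mask for v in px) for px in row] for row in img]

def affine(img, coeffs, fill=(0, 0, 0)):
    """Map each output pixel (x, y) back to source (a*x + b*y + tx, c*x + d*y + ty)."""
    a, b, tx, c, d, ty = coeffs
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sx, sy = round(a * x + b * y + tx), round(c * x + d * y + ty)
            row.append(img[sy][sx] if 0 <= sx < w and 0 <= sy < h else fill)
        out.append(row)
    return out

def compose(*steps):
    """Chain augmentation steps so each one consumes the previous step's output."""
    return lambda img: reduce(lambda acc, step: step(acc), steps, img)

# Hypothetical pipeline: crop, then white balance, then posterize, then shift one pixel left.
augment = compose(
    lambda im: center_crop(im, 2, 2),
    gray_world_white_balance,
    lambda im: posterize(im, bits=2),
    lambda im: affine(im, (1, 0, 1, 0, 1, 0)),
)
```

Because each step takes and returns an image in the same representation, steps can be reordered or randomly sampled per training example; in practice an image library would replace these pure-Python loops.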

Key words: ResNet, EfficientNet, transfer learning, image classification, modern literature
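The abstract reports an average Top-1 accuracy and an average F1 value without stating how the averages were taken. The following minimal sketch assumes macro-averaging over classes; the function names and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose single highest-scoring prediction equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical labels for four images.
y_true = ["portrait", "portrait", "map", "cartoon"]
y_pred = ["portrait", "map", "map", "cartoon"]
accuracy = top1_accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Macro-averaging weights every class equally, so a rare image category that is classified poorly lowers the score as much as a common one; a sample-weighted average would behave differently on an imbalanced data set.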

CLC Number:

  • TP391.41

Cite this article

YANG Min, GUO Limin. Application of EfficientNet-based Transfer Learning in Image Classification of Modern Documents: Taking Shanghai Library's "Picture Gallery of Modern Chinese Literature" as an Example[J]. Journal of Library and Information Science in Agriculture, 2023, 35(4): 90-99.