Journal of Library and Information Science in Agriculture, 2024, Vol. 36, Issue (2): 51-60. doi: 10.13998/j.cnki.issn1002-1248.24-0109

Framework for the Semantic Description of Images with Integrated Events and Emotions

HU Shoumin1, DONG Huanqing2   

  1. Central China Normal University Library, Wuhan 430079;
    2. School of Information Management, Central China Normal University, Wuhan 430079
  • Received: 2024-01-11 Online: 2024-02-05 Published: 2024-04-30

Abstract: [Purpose/Significance] To address missing and incomplete semantics in image organization and retrieval, this paper proposes a framework for the semantic description of social media images. The framework aims to enrich the existing theory of image description, improve the efficiency and utilization of image retrieval, and provide a reference for the automatic semantic annotation of images. [Method/Process] First, we surveyed and analyzed domestic and international research on image description, summarizing existing theories of image description and annotation, metadata specifications, and related technical methods. Second, drawing on image metadata standards and the theory of hierarchical and categorical description of image features, we constructed a semantic description framework for social media images with seven layers: external feature layer, content layer, object layer, relationship layer, scene layer, event layer, and emotional layer. We also describe each semantic layer and the interrelationships among the layers in detail. Finally, we verified the feasibility of the framework by describing example character images and landscape images. [Results/Conclusions] The descriptive examples of character images and landscape images indicate that the framework can eliminate the "semantic gap" in image description through semantic associations between layers, and that it achieves a multi-faceted, multi-dimensional, multi-level, structured semantic description of both the external and content features of images. The framework is also highly portable and flexible.
However, this paper has limitations and leaves room for improvement: 1) a prototype image annotation system based on the proposed framework still needs to be developed; 2) images posted by users on social media are closely tied to their context and are more likely to express emotions, so future research on the semantic layers of images could draw on the text that users post with them; 3) future research could further explore deep learning for image-text fusion to achieve more accurate event and emotion recognition, since more complex neural network structures would allow the event and emotion information in an image to be mined and fused in depth; 4) image description should attend not only to static visual features but also to the dynamic course of events, and future frameworks could combine static and dynamic information to provide richer, more vivid descriptions of images.
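The seven-layer structure described in the abstract can be pictured as a single record per image. The sketch below is only an illustration under stated assumptions: the seven layer names come from the abstract, while the class name `ImageDescription`, every field inside a layer (`id`, `label`, `subject`, `target`, `participants`, and so on), and the `dangling_references` check are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ImageDescription:
    """Hypothetical record: one attribute per layer named in the abstract."""
    external: dict = field(default_factory=dict)       # external feature layer
    content: dict = field(default_factory=dict)        # content layer
    objects: list = field(default_factory=list)        # object layer
    relationships: list = field(default_factory=list)  # relationship layer
    scene: dict = field(default_factory=dict)          # scene layer
    events: list = field(default_factory=list)         # event layer
    emotions: list = field(default_factory=list)       # emotional layer

    def dangling_references(self):
        """Return object ids used by the relationship or event layer
        that are not declared in the object layer (assumed convention)."""
        known = {obj["id"] for obj in self.objects}
        used = set()
        for rel in self.relationships:
            used.update((rel["subject"], rel["target"]))
        for event in self.events:
            used.update(event["participants"])
        return sorted(used - known)


# Illustrative values only; none of them come from the paper's examples.
example = ImageDescription(
    external={"format": "JPEG", "source": "social media post"},
    content={"dominant_color": "green"},
    objects=[{"id": "o1", "label": "person"}, {"id": "o2", "label": "tree"}],
    relationships=[{"subject": "o1", "predicate": "stands under", "target": "o2"}],
    scene={"label": "park"},
    events=[{"label": "picnic", "participants": ["o1"]}],
    emotions=[{"label": "relaxed", "evoked_by": "picnic"}],
)

# Flatten to a plain dict, e.g. for export as JSON metadata.
record = asdict(example)
```

Keeping object identifiers in one layer and referring to them from the relationship and event layers is one possible way to express the cross-layer semantic associations the abstract mentions; the paper itself may model them differently.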

Key words: semantic description framework, image feature, semantic annotation, Sora

CLC Number: 

  • G251.3