Towards Known Unknowns: GPT Large Language Models Empower Human-Centered Information Retrieval

doi:10.13998/j.cnki.issn1002-1248.23-0386

Abstract

[Purpose/Significance] Information retrieval (IR) is the foundation of public library services and has a profound societal impact through activities such as digital resource integration and the advancement of social equity. Current methods rely mainly on either classical keyword-based, top-down retrieval in the style of the Online Public Access Catalog (OPAC) or point-to-point retrieval based on large language models (LLMs). Neither approach on its own balances flexibility and reliability, which hinders the evolution towards user-centric IR systems. An innovative retrieval strategy that fosters a human-centered IR paradigm is therefore urgently needed.

[Method/Process] Contrary to the prevalent view that LLM methods such as GPT should completely replace the classical OPAC-like approach, we propose combining the merits of both strategies. This is the first effort of its kind in the scholarly community of public information service. We introduce the adaptive literature retrieval framework (ALRF), an approach grounded in the principles of cognitive science that addresses a critical user challenge in retrieval: the pursuit of known unknown knowledge (KUK). KUK arises when a user clearly understands the desired outcome but does not know the associated domain-specific terminology, and therefore lacks the entry point needed for a keyword-based search. ALRF's two-stage workflow caters specifically to such situations: (i) users enter descriptions in natural language to identify target keywords, or keywords at a more abstract level, thus implementing a bottom-up strategy; (ii) using these extracted keywords, users then conduct a top-down search. ALRF accommodates LLMs such as ChatGPT, GPT-4, and ERNIE Bot. The platform's effectiveness in retrieving literature from diverse fields such as science and engineering, biology and medicine, and literature and sociology was carefully evaluated.

[Results/Conclusions] ALRF significantly outperforms the standard methods, i.e., the LLM-based retrieval service and the OPAC-like retrieval service, in terms of both flexibility and reliability. This holds for tasks involving keyword abstraction (i.e., identifying keywords at a higher level of abstraction in the target domain) and property extraction (i.e., locating keywords with specific attributes but at the same abstraction level as the target domain). ALRF thus addresses the pressing need for KUK retrieval and shows initial potential to meet the diverse and personalized retrieval requirements of users. This suggests that ALRF could transform public information services by placing humans at the center of their operation. A current hindrance to the wider adoption of ALRF in public IR in China is the pace at which Chinese corporations are developing powerful LLMs. We recommend that researchers keep abreast of such advances so as to remain aware of the realistic possibilities and limitations in real-world applications.
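The two-stage workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the catalog contents, the cue-word stub that stands in for an LLM call, and every function name below are assumptions made for illustration only. Stage (i) maps a natural-language description to candidate keywords (bottom-up); stage (ii) looks those keywords up in a catalog index (top-down).

```python
from typing import Callable, Dict, List

# Hypothetical toy catalog: subject keyword -> record identifiers.
# It stands in for an OPAC-like index; the entries are placeholders.
CATALOG: Dict[str, List[str]] = {
    "photosynthesis": ["record-001", "record-002"],
    "plant physiology": ["record-003"],
}


def stub_suggest_keywords(description: str) -> List[str]:
    """Stage (i), bottom-up: description -> candidate keywords.

    Assumption: a real system would prompt an LLM here. This stub uses
    fixed cue words so that the sketch runs offline and deterministically.
    """
    cues = {"sunlight": ["photosynthesis", "plant physiology"]}
    text = description.lower()
    found: List[str] = []
    for cue, keywords in cues.items():
        if cue in text:
            for keyword in keywords:
                if keyword not in found:
                    found.append(keyword)
    return found


def catalog_search(keyword: str, catalog: Dict[str, List[str]]) -> List[str]:
    """Stage (ii), top-down: exact keyword lookup in the catalog index."""
    return list(catalog.get(keyword.lower(), []))


def two_stage_retrieve(
    description: str,
    suggest: Callable[[str], List[str]],
    catalog: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Run both stages and keep only keywords that match catalog records."""
    results: Dict[str, List[str]] = {}
    for keyword in suggest(description):
        hits = catalog_search(keyword, catalog)
        if hits:
            results[keyword] = hits
    return results


if __name__ == "__main__":
    query = "how do plants turn sunlight into energy"
    print(two_stage_retrieve(query, stub_suggest_keywords, CATALOG))
```

In this sketch the second stage returns only records that exist in the catalog, so a suggested keyword with no catalog entry yields no results. In the workflow the abstract describes, the user selects among the extracted keywords before searching; the sketch searches all of them for brevity.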

Key words: GPT, large language model, information retrieval, knowledge management, AIGC

CLC Number: G254.9

SHOU Jianqi. Towards Known Unknowns: GPT Large Language Models Empower Human-Centered Information Retrieval[J]. Journal of library and information science in agriculture, 2023, 35(5): 16-26.

