
Journal of library and information science in agriculture

   

The Changing Landscape of US Technology Think Tanks Reports on the Electronic Information Research and Industry: A Topic Mining Perspective

XUE Qian1, ZHAO Hong2, REN Fubing1,3,4

  1. School of Business, East China University of Science and Technology, Shanghai 200237
  2. School of Foreign Languages, East China University of Science and Technology, Shanghai 200237
  3. School of Marxism, East China University of Science and Technology, Shanghai 200237
  4. Institute of Marxism, East China University of Science and Technology, Shanghai 200237
  • Received:2025-07-07 Online:2025-10-21
  • Contact: ZHAO Hong

Abstract:

[Purpose/Significance] Science and technology have become pivotal domains of competition between China and the United States. This article quantitatively analyzes reports by US technology think tanks on electronic information research and industry, focusing on how their themes and topics have evolved over the past decade. The analysis reflects the think tanks' technological priorities and maps their analytical focus on China, providing decision-making support for the development of China's think tanks and for strategic responses.

[Method/Process] Think tanks were selected from the "2020 Global Go to Think Tank Index Report" released by the Think Tanks and Civil Societies Program (TTCSP) at the University of Pennsylvania, considering think tank authority, research topic relevance, and research continuity. A total of 1,360 reports on electronic information research and industry, published between 2015 and 2024 by 8 leading US technology think tanks, were collected. Topic analysis was conducted with BERTopic, a topic modeling tool based on Transformer embeddings, in four steps. First, the text was cleaned with NLTK. Second, the all-MiniLM-L6-v2 model generated high-dimensional document embedding vectors. Third, the UMAP algorithm reduced their dimensionality, and the HDBSCAN algorithm performed density-based clustering. Finally, topic words were extracted with the c-TF-IDF algorithm.

[Results/Conclusions] The analysis identified 31 distinct research themes, 6 of which were directly related to China: global semiconductor industry competition, Sino-US digital policies and cloud computing competition, 5G network and technology competition, Chinese AI investment, Sino-US science and innovation policies, and Sino-US military technology competition. The 31 themes were hierarchically clustered using HDBSCAN and grouped into 11 major research directions. The US technology think tanks focused persistently on these 11 directions, which were largely concentrated in key areas of electronic information research and industry such as semiconductors and microelectronics, artificial intelligence, wireless communication, quantum information technology, cybersecurity, and big data. The evolutionary trends across the research directions were generally consistent, with military technology and cybersecurity receiving the most attention. Attention to China has undergone a significant strategic shift over the years, with a sharp increase in coverage of semiconductor export controls, AI technology, and Sino-US digital competition. Based on the identified key themes and topic words, the authors recommend establishing an evolutionary map of China-related topics and developing a dynamic monitoring and early-warning mechanism for technology issues concerning China. Future research could incorporate larger corpora and more advanced large language models to further improve topic modeling.
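The final step of the pipeline, c-TF-IDF topic word extraction, can be illustrated with a minimal pure-Python sketch. It assumes the commonly cited class-based formulation (term frequency within a topic, L1-normalized, multiplied by log(1 + A / f_t), where A is the average token count per topic and f_t is the term's frequency over all topics). The toy corpus, the topic labels, and the helper names `c_tf_idf` and `top_terms` are hypothetical and are not taken from the paper; the authors' actual configuration is not given in the abstract.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_topic):
    """Score terms per topic with a class-based TF-IDF.

    docs_by_topic maps a topic label to a list of tokenized documents.
    All documents of one topic are merged into a single "class document".
    """
    # Term counts per topic after merging that topic's documents.
    class_counts = {
        topic: Counter(tok for doc in docs for tok in doc)
        for topic, docs in docs_by_topic.items()
    }
    # Frequency of each term across all topics.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    # Average number of tokens per topic (A in the formula).
    avg_tokens = sum(total_counts.values()) / len(class_counts)

    scores = {}
    for topic, counts in class_counts.items():
        n_tokens = sum(counts.values())
        scores[topic] = {
            term: (freq / n_tokens) * math.log(1 + avg_tokens / total_counts[term])
            for term, freq in counts.items()
        }
    return scores


def top_terms(scores, topic, k=3):
    """Return the k highest-scoring terms of a topic (ties broken alphabetically)."""
    ranked = sorted(scores[topic].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus: two topics, two tokenized documents each.
toy = {
    "semiconductor": [
        ["chip", "export", "control", "technology"],
        ["chip", "manufacturing", "technology"],
    ],
    "quantum": [
        ["quantum", "computing", "technology"],
        ["quantum", "algorithm"],
    ],
}
scores = c_tf_idf(toy)
print(top_terms(scores, "semiconductor", k=1))
print(top_terms(scores, "quantum", k=1))
```

In this toy example the word "technology" appears in both topics, so it is weighted lower within the quantum topic than words unique to it, which is the behavior that makes c-TF-IDF useful for labeling clusters.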

Key words: US technology think tanks, electronic information research and industry, BERTopic, topic mining, Sino-US tech race

CLC Number: 

  • G353

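Table 1 Comparison of advantages and limitations of different topic models

| Model | Advantages | Limitations |
|---|---|---|
| LDA | Models the probabilistic relationship between words and topics; well suited to static corpora | Struggles with dynamic text; shallow semantic representation |
| DTM | Supports time-series analysis and can reveal topic evolution | Relies on the bag-of-words model; lacks contextual semantic understanding |
| LLMs | Strong semantic understanding and generation; adapt to complex contexts | Insufficient controllability, stability, and interpretability; direct use for topic modeling is not yet mature |
| BERTopic | Combines deep semantic representations with dynamic topic modeling; can reveal the underlying logic and evolution of specialized texts | High computational cost on very large datasets; model selection requires tuning |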

Fig.1 Technical roadmap

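Table 2 Distribution of data sources

| Think tank | Reports | Think tank | Reports |
|---|---|---|---|
| Center for Security and Emerging Technology (CSET) | 128 | National Institute of Standards and Technology (NIST) | 134 |
| Belfer Center for Science and International Affairs (BC) | 52 | National Academy of Sciences (NAS) | 104 |
| RAND Corporation (RAND) | 456 | Information Technology and Innovation Foundation (ITIF) | 194 |
| Brookings Institution (BI) | 143 | Center for Strategic and International Studies (CSIS) | 149 |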

Fig.2 Distribution of thematic feature words for think tank reports

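Table 3 Topic identification results and brief description

| Topic | Theme | Brief description |
|---|---|---|
| T0 | US military technology and AI applications | Command systems, intelligence analysis, wargaming, unmanned systems, military training, AI assistance, technology ethics, etc. |
| T1 | Cybersecurity governance and technology | Threat assessment, defensive technologies, policy governance, social impact, security of emerging technologies, etc. |
| T2 | Governance of social media disinformation | Disinformation propagation mechanisms, social impact, governance measures, etc. |
| T3 | AI, education, and the workforce | AI's impact on the labor market; applications, challenges, and opportunities in education |
| T4 | Global semiconductor industry competition | Semiconductor industry development, supply chain resilience, effectiveness of export controls, etc. |
| T5 | Digital privacy and consumer protection | Privacy regulations, platform responsibility, technical countermeasures, cross-border data governance, etc. |
| T6 | Sino-US digital policies and cloud computing competition | Sino-US competition and cooperation in digital policy, cloud computing, and related technologies |
| T7 | Algorithmic bias and governance | AI bias and governance measures |
| T8 | Cyber-physical systems | System architecture, performance optimization, data management, standardization, security, and talent development for cyber-physical systems (CPS) |
| T9 | Telemedicine technology | Integration of information technology and healthcare services |
| T10 | Big data and intelligent transportation | Applications of big data in traffic management and autonomous driving |
| T11 | Technology in criminal justice | Technological innovation, information sharing, and operational challenges in the criminal justice system |
| T12 | 5G network and technology competition | Infrastructure construction, spectrum management, security and privacy, and the state of Sino-US competition |
| T13 | Military deterrence and space warfare | The impact of China's deterrence capabilities in cyberspace, hypersonic weapons, unmanned systems, etc. on future warfare |
| T14 | Chinese AI investment | China's AI investment landscape and its impact on global technology competition |
| T15 | Cyber threats in weapon systems | Cyberattack risks and mission assurance in modern weapon systems |
| T16 | AI in cybersecurity | The role of AI and machine learning (ML) in attack detection, defense, and system self-protection |
| T17 | Public safety technology applications | Technology enablement in identity verification, communication security, and emergency scenarios |
| T18 | Digital government and public services | Digital means of improving the efficiency and transparency of government services |
| T19 | Wireless communication and spectrum management | Development of wireless communication; spectrum allocation and management |
| T20 | IT-enabled education | Personalized learning, blended learning models, educational equity and quality improvement |
| T21 | Data sharing and data policy | Data sharing mechanisms, privacy protection, data flows and cooperation |
| T22 | Sino-US science and innovation policies | Sino-US policies and development strategies in key fields such as AI and semiconductors |
| T23 | Augmented and virtual reality | User experience and application prospects of immersive technologies |
| T24 | Quantum information science | Principles and algorithms of quantum computing, and applications in cryptography and other fields |
| T25 | Broadband network development | Broadband deployment, social impact, and urban-rural coverage |
| T26 | Internet of Things and smart cities | IoT applications in urban governance and services |
| T27 | Blockchain and cryptocurrency | Blockchain applications in finance and the impact of cryptocurrencies on the traditional financial system |
| T28 | Sino-US military technology competition | The state of Sino-US competition in frontier military technologies |
| T29 | Governance challenges of biometric technology | Privacy and ethical challenges of biometrics, and applications in law enforcement, commerce, and public safety |
| T30 | AI + biomedicine | The role of AI in drug development, medical applications, and protein structure prediction |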

Fig.3 Two-dimensional visual distribution of literature on various themes

Fig.4 Topic hierarchical clustering

Table 4 Distribution of research directions

| Research direction | Topic (share) | Top 5 feature words |
|---|---|---|
| (1) Sino-US technology competition | 4 (4.71%) | semiconductor/chip/export/manufacturing/control |
| | 14 (2.28%) | china/investment/Chinese/company/united |
| | 22 (2.21%) | china/innovation/united/policy/technology |
| | 6 (3.68%) | cloud/digital/china/policy/company |
| | 12 (2.57%) | network/wireless/spectrum/technology/united |
| | 28 (1.18%) | military/competition/united/china/state |
| (2) Quantum information science | 24 (1.40%) | quantum/computing/computer/application/science |
| (3) Biometric technology | 29 (1.10%) | facial/recognition/image/technology/frt |
| (4) Multidimensional AI governance | 7 (3.38%) | algorithm/system/algorithmic/bias/artificial |
| | 3 (5.44%) | learning/workforce/artificial/llm/intelligence |
| | 16 (2.06%) | learning/machine/attack/cyber/code |
| (5) Digital healthcare | 30 (1.10%) | open/drug/source/development/research |
| | 9 (3.24%) | health/care/telehealth/patient/telemedicine |
| (6) IT-enabled education | 20 (1.76%) | student/school/teacher/instruction/learning |
| (7) Military technology and cybersecurity | 13 (3.31%) | military/deterrence/space/warfare/defense |
| | 0 (10.51%) | force/air/dod/defense/military |
| | 1 (11.69%) | cyber/cybersecurity/security/threat/risk |
| | 15 (2.21%) | cyber/mission/operation/system/weapon |
| | 2 (6.84%) | medium/social/disinformation/information/russia |
| (8) Metaverse and digital economy | 23 (1.40%) | arvr/immersive/virtual/user/ar |
| | 27 (1.25%) | blockchain/financial/cryptocurrencies/sro/crypto |
| (9) Digital technology and social change | 18 (1.99%) | digital/government/website/service/state |
| | 5 (4.04%) | privacy/online/consumer/digital/data |
| | 21 (1.62%) | data/divide/sharing/datadriven/country |
| | 26 (1.47%) | iot/thing/internet/city/smart |
| (10) Intelligent technology applications | 10 (2.94%) | big/vehicle/transportation/data/car |
| | 8 (3.31%) | cps/system/manufacturing/smart/grid |
| | 17 (2.06%) | safety/public/responder/fire/identity |
| | 11 (2.50%) | enforcement/justice/law/need/criminal |
| (11) Wireless communication technology | 19 (1.91%) | spectrum/wireless/radio/band/mobile |
| | 25 (1.25%) | broadband/network/speed/fixed/rural |
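As a worked check on Table 4, the per-topic shares can be summed within each of the 11 research directions. The sketch below only re-adds the percentages printed in the table; the dictionary layout and variable names are illustrative assumptions, and `Decimal` is used so the two-decimal figures add exactly.

```python
from decimal import Decimal

# Topic shares (percent of reports) transcribed from Table 4,
# keyed by research direction number (1)-(11).
shares = {
    1: ["4.71", "2.28", "2.21", "3.68", "2.57", "1.18"],   # Sino-US technology competition
    2: ["1.40"],                                           # Quantum information science
    3: ["1.10"],                                           # Biometric technology
    4: ["3.38", "5.44", "2.06"],                           # Multidimensional AI governance
    5: ["1.10", "3.24"],                                   # Digital healthcare
    6: ["1.76"],                                           # IT-enabled education
    7: ["3.31", "10.51", "11.69", "2.21", "6.84"],         # Military technology and cybersecurity
    8: ["1.40", "1.25"],                                   # Metaverse and digital economy
    9: ["1.99", "4.04", "1.62", "1.47"],                   # Digital technology and social change
    10: ["2.94", "3.31", "2.06", "2.50"],                  # Intelligent technology applications
    11: ["1.91", "1.25"],                                  # Wireless communication technology
}

# Sum the topic shares inside each direction.
totals = {d: sum(Decimal(v) for v in vals) for d, vals in shares.items()}

# Rank directions from largest to smallest combined share.
ranked = sorted(totals, key=totals.get, reverse=True)

for d in ranked:
    print(f"direction ({d}): {totals[d]}%")
print(f"all listed topics: {sum(totals.values())}%")
```

Direction (7), military technology and cybersecurity, sums to 34.56%, followed by direction (1), Sino-US technology competition, at 16.63%, which agrees with the abstract's statement about where attention is highest. The listed topics add up to 96.41%; the table does not itemize the remaining 3.59%.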

Fig.5 Research direction changes over time

Fig.6 Distribution of topics related to China
