
2015, Vol. 27, Issue (2): 57-59. doi: 10.13998/j.cnki.issn1002-1248.2015.02.015

• Network technology •

Automatic Extraction of Metadata Information for Dissertation based on Feature and Rule Pattern

CHEN Shu-ping   

  1. Library of Yanshan University, Yanshan University, Hebei 066004, China
  • Received: 2014-07-01  Online: 2015-02-05  Published: 2015-03-04

Abstract: The dissertation database is one of the important digital resources in our digital library. However, metadata entry has so far been completed manually, which is inefficient and costs a great deal of manpower. To address this problem, we apply document features and pattern matching, using regular expressions, to study the automatic extraction of dissertation metadata. The algorithm consists of two modules: information field location and metadata extraction. The experimental data show that the algorithm achieves high precision and recall, as well as a high overall performance index F.
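The two modules named in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the field labels (`Title`, `Author`, `Supervisor`, `Date`), the function names, and the sample cover page are all hypothetical, and the F index is assumed here to be the usual harmonic mean of precision and recall.

```python
import re

# Module 1 rules (hypothetical): a line "feature" is a known label at line start.
LABEL_RULES = {
    "title": re.compile(r"^\s*Title\s*[:：]"),
    "author": re.compile(r"^\s*Author\s*[:：]"),
    "supervisor": re.compile(r"^\s*Supervisor\s*[:：]"),
    "date": re.compile(r"^\s*Date\s*[:：]"),
}

# Module 2 rules (hypothetical): how to pull the value out of a located line.
VALUE_RULES = {
    "title": re.compile(r"[:：]\s*(.+?)\s*$"),
    "author": re.compile(r"[:：]\s*(.+?)\s*$"),
    "supervisor": re.compile(r"[:：]\s*(.+?)\s*$"),
    "date": re.compile(r"[:：]\s*(\d{4}-\d{2})\s*$"),
}


def locate_fields(lines):
    """Field location: map each field to the index of the first line carrying its label."""
    located = {}
    for index, line in enumerate(lines):
        for name, rule in LABEL_RULES.items():
            if name not in located and rule.match(line):
                located[name] = index
    return located


def extract_metadata(lines, located):
    """Metadata extraction: apply the value rule of each field to its located line."""
    metadata = {}
    for name, index in located.items():
        match = VALUE_RULES[name].search(lines[index])
        if match:
            metadata[name] = match.group(1)
    return metadata


def f_measure(precision, recall):
    """Assumed definition of the F index: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up cover page text, for illustration only.
cover = [
    "Title: Automatic Metadata Extraction",
    "Author: Zhang San",
    "Supervisor: Li Si",
    "Date: 2014-06",
]
located = locate_fields(cover)
metadata = extract_metadata(cover, located)
```

Separating location from extraction means a field whose label is found but whose value does not match its rule is simply left out of `metadata`, which makes such misses easy to count when estimating recall.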

Key words: Dissertation

CLC Number: 

  • G203