Agricultural Library and Information, 2019, Vol. 31, Issue (7): 4-13. doi: 10.13998/j.cnki.issn1002-1248.2019.07.19-0421

• Special review •

Review of Data Quality Research: Comparative Perspective

SUN Lili1, YUAN Qinjian2   

  1. Information Service Department, Nanjing Tech University, Nanjing 210023, China;
  2. School of Information Management, Nanjing University, Nanjing 210023, China
  • Received: 2019-05-13 Online: 2019-07-05 Published: 2019-09-12

Abstract: Data quality is a precondition for data application and a core topic of big data research. This study comprehensively reviews related topics, including the concept and connotation of data quality, its influencing factors, data quality evaluation, and optimization strategies. Using comparative analysis, the paper contrasts data quality with information quality, and traditional data quality with big data quality, in order to clarify research progress on key issues in the field and to identify the focus of data quality research.

Key words: data quality, comparative perspective, research status

CLC Number: 

  • G250