Data science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured,[1][2] similar to data mining.
Data science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data.[3] It employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.
Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.[4][5]
In 2012, when Harvard Business Review called it "The Sexiest Job of the 21st Century",[6] the term "data science" became a buzzword. It is now often used interchangeably with earlier concepts like business analytics,[7] business intelligence, predictive modeling, and statistics. Even the suggestion that data science is sexy paraphrased Hans Rosling, who was featured in a 2011 BBC documentary with the quote, "Statistics is now the sexiest subject around."[8] Nate Silver referred to data science as a sexed-up term for statistics.[9] In many cases, earlier approaches and solutions are now simply rebranded as "data science" to be more attractive, which can cause the term to become "dilute[d] beyond usefulness."[10] While many university programs now offer a data science degree, there is no consensus on a definition or on suitable curriculum contents.[7] To the field's discredit, many data-science and big-data projects fail to deliver useful results, often because of poor management and utilization of resources.[11][12][13][14]
1 History
The term "data science" has appeared in various contexts over the past thirty years but did not become an established term until recently. In an early usage, Peter Naur employed it in 1960 as a substitute for computer science. Naur later introduced the term "datalogy".[15] In 1974, Naur published Concise Survey of Computer Methods, which freely used the term data science in its survey of the contemporary data processing methods used in a wide range of applications.
In 1996, members of the International Federation of Classification Societies (IFCS) met in Kobe for their biennial conference. Here, for the first time, the term data science was included in the title of the conference ("Data Science, classification, and related methods"),[16] after the term was introduced in a roundtable discussion by Chikio Hayashi.[3]
In November 1997, C.F. Jeff Wu gave the inaugural lecture entitled "Statistics = Data Science?"[17] for his appointment to the H. C. Carver Professorship at the University of Michigan.[18] In this lecture, he characterized statistical work as a trilogy of data collection, data modeling and analysis, and decision making. In his conclusion, he initiated the modern, non-computer-science usage of the term "data science" and advocated that statistics be renamed data science and statisticians be renamed data scientists.[17] Later, he presented the same lecture as the first of his 1998 P.C. Mahalanobis Memorial Lectures.[19] These lectures honor Prasanta Chandra Mahalanobis, an Indian scientist and statistician and founder of the Indian Statistical Institute.
In 2001, William S. Cleveland introduced data science as an independent discipline, extending the field of statistics to incorporate "advances in computing with data" in his article "Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics," which was published in Volume 69, No. 1, of the April 2001 edition of the International Statistical Review / Revue Internationale de Statistique.[20] In his report, Cleveland established six technical areas that he believed encompassed the field of data science: multidisciplinary investigations, models and methods for data, computing with data, pedagogy, tool evaluation, and theory.
In April 2002, the International Council for Science (ICSU): Committee on Data for Science and Technology (CODATA)[21] started the Data Science Journal,[22] a publication focused on issues such as the description of data systems, their publication on the internet, applications and legal issues.[23] Shortly thereafter, in January 2003, Columbia University began publishing The Journal of Data Science,[24] which provided a platform for all data workers to present their views and exchange ideas. The journal was largely devoted to the application of statistical methods and quantitative research. In 2005, the National Science Board published "Long-lived Digital Data Collections: Enabling Research and Education in the 21st Century", defining data scientists as "the information and computer scientists, database and software and programmers, disciplinary experts, curators and expert annotators, librarians, archivists, and others, who are crucial to the successful management of a digital data collection" whose primary activity is to "conduct creative inquiry and analysis."[25]
Around 2007, Turing Award winner Jim Gray envisioned "data-driven science" as a "fourth paradigm" of science that uses the computational analysis of large data as its primary scientific method.[4][5] His aim was "to have a world in which all of the science literature is online, all of the science data is online, and they interoperate with each other."[26]
In the 2012 Harvard Business Review article "Data Scientist: The Sexiest Job of the 21st Century",[6] DJ Patil claims to have coined the term "data scientist" in 2008 with Jeff Hammerbacher to define their jobs at LinkedIn and Facebook, respectively. He asserts that a data scientist is "a new breed", and that a "shortage of data scientists is becoming a serious constraint in some sectors", but describes a much more business-oriented role.
In 2013, the IEEE Task Force on Data Science and Advanced Analytics[27] was launched. The same year, the first "European Conference on Data Analysis (ECDA)" was organised in Luxembourg, establishing the European Association for Data Science (EuADS). The first international conference, the IEEE International Conference on Data Science and Advanced Analytics, was launched in 2014.[28] Also in 2014, General Assembly launched a student-paid bootcamp and The Data Incubator launched a competitive free data science fellowship.[29] In 2014, the American Statistical Association section on Statistical Learning and Data Mining renamed its journal to "Statistical Analysis and Data Mining: The ASA Data Science Journal" and in 2016 changed its section name to "Statistical Learning and Data Science".[30] In 2015, the International Journal on Data Science and Analytics[31] was launched by Springer to publish original work on data science and big data analytics. In September 2015, the Gesellschaft für Klassifikation (GfKl) added "Data Science Society" to the name of the society at the third ECDA conference at the University of Essex, Colchester, UK.
2 Relationship to statistics
The popularity of the term "data science" has exploded in business environments and academia, as indicated by a jump in job openings.[32] However, many critical academics and journalists see no distinction between data science and statistics. Writing in Forbes, Gil Press argues that data science is a buzzword without a clear definition and has simply replaced "business analytics" in contexts such as graduate degree programs.[7] In the question-and-answer section of his keynote address at the Joint Statistical Meetings of the American Statistical Association, noted applied statistician Nate Silver said, "I think data-scientist is a sexed up term for a statistician.... Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn't berate the term statistician."[9] Similarly, in the business sector, multiple researchers and analysts state that data scientists alone are far from sufficient to grant companies a real competitive advantage[33] and consider data scientists to be only one of the four broader job families that companies require to leverage big data effectively, namely: data analysts, data scientists, big data developers and big data engineers.[34]
On the other hand, responses to this criticism are just as numerous. In a 2014 Wall Street Journal article, Irving Wladawsky-Berger compares the enthusiasm for data science with the dawn of computer science. He argues that data science, like any other interdisciplinary field, employs methodologies and practices from across academia and industry, but will then morph them into a new discipline. He draws attention to the sharp criticisms that computer science, now a well-respected academic discipline, once had to face.[35] Likewise, NYU Stern's Vasant Dhar, like many other academic proponents of data science,[35] argued more specifically in December 2013 that data science is different from the existing practice of data analysis across all disciplines, which focuses only on explaining data sets. Data science seeks actionable and consistent patterns for predictive uses.[1] This practical engineering goal takes data science beyond traditional analytics. Data in disciplines and applied fields that lacked solid theories, like health science and social science, can now be sought out and used to generate powerful predictive models.[1]
In an effort similar to Dhar's, Stanford professor David Donoho, in September 2015, took the proposition further by rejecting three simplistic and misleading definitions of data science in lieu of criticisms.[36] First, for Donoho, data science does not equate to big data, in that the size of the data set is not a criterion that distinguishes data science from statistics.[36] Second, data science is not defined by the computing skills of sorting big data sets, in that these skills are already generally used for analyses across all disciplines.[36] Third, data science is a heavily applied field where academic programs do not yet sufficiently prepare data scientists for the jobs, in that many graduate programs misleadingly advertise their analytics and statistics training as the essence of a data science program.[36][37] As a statistician, Donoho, following many in his field, champions the broadening of learning scope in the form of data science,[36] like John Chambers, who urges statisticians to adopt an inclusive concept of learning from data,[38] or William Cleveland, who urges prioritizing the extraction of applicable predictive tools from data over explanatory theories.[20] Together, these statisticians envision an increasingly inclusive applied field that grows out of traditional statistics and beyond.
For the future of data science, Donoho projects an ever-growing environment for open science where data sets used for academic publications are accessible to all researchers.[36] The US National Institutes of Health has already announced plans to enhance reproducibility and transparency of research data.[39] Other big journals are likewise following suit.[40][41] In this way, data science will not only exceed the boundaries of statistical theory in scale and methodology but also revolutionize current academic and research paradigms.[36] As Donoho concludes, "the scope and impact of data science will continue to expand enormously in coming decades as scientific data and data about science itself become ubiquitously available."[36]
3 See also
- Information engineering
4 References
[1] Dhar, V. (2013). "Data science and prediction". Communications of the ACM. 56 (12): 64. doi:10.1145/2500499.
[2] Leek, Jeff (2013-12-12). "The key word in "Data Science" is not Data, it is Science". Simply Statistics.
[3] Hayashi, Chikio (1998-01-01). "What is Data Science? Fundamental Concepts and a Heuristic Example". In Hayashi, Chikio; Yajima, Keiji; Bock, Hans-Hermann; Ohsumi, Noboru; Tanaka, Yutaka; Baba, Yasumasa. Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer Japan. pp. 40–51. doi:10.1007/978-4-431-65950-1_3. ISBN 9784431702085.
[4] Tansley, Stewart; Tolle, Kristin Michele (2009). The Fourth Paradigm: Data-intensive Scientific Discovery. Microsoft Research. ISBN 978-0-9825442-0-4.
[5] Bell, G.; Hey, T.; Szalay, A. (2009). "COMPUTER SCIENCE: Beyond the Data Deluge". Science. 323 (5919): 1297–1298. doi:10.1126/science.1170411. ISSN 0036-8075.
[6] Davenport, Thomas H.; Patil, DJ (Oct 2012). "Data Scientist: The Sexiest Job of the 21st Century". Harvard Business Review.
[7] "Data Science: What's The Half-Life Of A Buzzword?". Forbes. 2013-08-19.
[8] Singer, Natasha (2011-04-02). "When the Data Struts Its Stuff". Retrieved 2018-09-01.
[9] "Nate Silver: What I need from statisticians". 23 Aug 2013.
[10] Warden, Pete (2011-05-09). "Why the term "data science" is flawed but useful". O'Reilly Radar. Retrieved 2018-05-20.
[11] "Are You Setting Your Data Scientists Up to Fail?". Harvard Business Review. 2018-01-25. Retrieved 2018-05-26.
[12] "70% of Big Data projects in UK fail to realise full potential". www.consultancy.uk. Retrieved 2018-05-26.
[13] "The Data Economy: Why do so many analytics projects fail? – Analytics Magazine". Analytics Magazine. 2014-07-07. Retrieved 2018-05-26.
[14] "Data Science: 4 Reasons Why Most Are Failing to Deliver". www.kdnuggets.com. Retrieved 2018-05-26.
[15] Naur, Peter (1 July 1966). "The science of datalogy". Communications of the ACM. 9 (7): 485. doi:10.1145/365719.366510.
[16] Press, Gil. "A Very Short History Of Data Science".
[17] Wu, C. F. J. (1997). "Statistics = Data Science?" (PDF). Retrieved 9 October 2014.
[18] "Identity of statistics in science examined". The University Records, 9 November 1997, The University of Michigan. Retrieved 12 August 2013.
[19] "P.C. Mahalanobis Memorial Lectures, 7th series". P.C. Mahalanobis Memorial Lectures, Indian Statistical Institute. Archived from the original on 26 Feb 2017. Retrieved 18 Jul 2017.
[20] Cleveland, W. S. (2001). "Data science: an action plan for expanding the technical areas of the field of statistics". International Statistical Review / Revue Internationale de Statistique, 21–26.
[21] International Council for Science: Committee on Data for Science and Technology. (2012, April). CODATA, The Committee on Data for Science and Technology. Retrieved from http://www.codata.org/
[22] Data Science Journal. (2012, April). Available Volumes. Retrieved from Japan Science and Technology Information Aggregator, Electronic: http://www.jstage.jst.go.jp/browse/dsj/_vols Archived 3 April 2012 at the Wayback Machine.
[23] Data Science Journal. (2002, April). Contents of Volume 1, Issue 1, April 2002. Retrieved from Japan Science and Technology Information Aggregator, Electronic: http://www.jstage.jst.go.jp/browse/dsj/1/0/_contents
[24] The Journal of Data Science. (2003, January). Contents of Volume 1, Issue 1, January 2003. Retrieved from http://www.jds-online.com/v1-1
[25] National Science Board. "Long-Lived Digital Data Collections Enabling Research and Education in the 21st Century". National Science Foundation. Retrieved 30 June 2013.
[26] Markoff, John (2009-12-14). "Essays Inspired by Microsoft's Jim Gray, Who Saw Science Paradigm Shift". The New York Times. ISSN 0362-4331. Retrieved 2018-04-26.
[27] "IEEE Task Force on Data Science and Advanced Analytics".
[28] "2014 IEEE International Conference on Data Science and Advanced Analytics". Archived from the original on 29 March 2017.
[29] "NY gets new bootcamp for data scientists: It's free, but harder to get into than Harvard". Venture Beat. Retrieved 2016-02-22.
[30] Talley, Jill (2016-06-01). "ASA Expands Scope, Outreach to Foster Growth, Collaboration in Data Science". AMSTATNEWS. American Statistical Association. Retrieved 2017-02-04.
[31] "Journal on Data Science and Analytics".
[32] Darrow, Barb (May 21, 2015). "Data science is still white hot, but nothing lasts forever". Fortune. Retrieved November 20, 2017.
[33] Miller, Steven (2014-04-10). "Collaborative Approaches Needed to Close the Big Data Skills Gap". Journal of Organization Design. 3 (1): 26–30. doi:10.7146/jod.9823. ISSN 2245-408X.
[34] De Mauro, Andrea; Greco, Marco; Grimaldi, Michele; Ritala, Paavo. "Human resources for Big Data professions: A systematic classification of job roles and required skill sets". Information Processing & Management. doi:10.1016/j.ipm.2017.05.004.
[35] Wladawsky-Berger, Irving (May 2, 2014). "Why Do We Need Data Science When We've Had Statistics for Centuries?". The Wall Street Journal. Retrieved November 20, 2017.
[36] Donoho, David (September 2015). "50 Years of Data Science" (PDF). Based on a talk at Tukey Centennial workshop, Princeton NJ Sept 18 2015.
[37] Barlow, Mike (2013). The Culture of Big Data. O'Reilly Media, Inc.
[38] Chambers, John M. (1993-12-01). "Greater or lesser statistics: a choice for future research". Statistics and Computing. 3 (4): 182–184. doi:10.1007/BF00141776. ISSN 0960-3174.
[39] Collins, Francis S.; Tabak, Lawrence A. (2014-01-30). "NIH plans to enhance reproducibility". Nature. 505 (7485): 612–613. doi:10.1038/505612a. ISSN 0028-0836. PMC 4058759. PMID 24482835.
[40] McNutt, Marcia (2014-01-17). "Reproducibility". Science. 343 (6168): 229–229. doi:10.1126/science.1250475. ISSN 0036-8075. PMID 24436391.
[41] Peng, Roger D. (2009-07-01). "Reproducible research and Biostatistics". Biostatistics. 10 (3): 405–408. doi:10.1093/biostatistics/kxp014. ISSN 1465-4644.