Big Data: Thick Mediation and Representational Opacity
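  • English title: Big Data, Thick Mediation, and Representational Opacity
  • Authors: Rafael Alvarado; Paul Humphreys; Xue Yonghong
  • Authors (English): Rafael Alvarado; Paul Humphreys
  • Keywords (translated from Chinese): big data; thick mediation; representational opacity; epistemology
  • Keywords (English): datasphere; thick mediation; representational opacity; epistemology
  • Chinese journal name: ZXFX
  • English journal name: Philosophical Analysis
  • Affiliations: Data Science Institute, University of Virginia; Department of Philosophy, University of Virginia; School of Philosophy, Beijing Normal University; School of Science, North China Institute of Science and Technology
  • Publication date: 2018-06-25
  • Publisher: Philosophical Analysis (哲学分析)
  • Year: 2018
  • Issue: v.9; No.49
  • Funding: Supported by the Beijing Social Science Fund General Project "The Characteristics of 'Big Data' and Their Philosophical Implications from the Perspective of Complex Systems Science" (Grant No. 14ZXB010) and by the Fundamental Research Funds for the Central Universities, namely the Beijing Normal University 2015 Independent Research Fund project "Contemporary Scientific and Philosophical Research on the Problem of the Direction of Time" (Grant No. SKZZB2015042)
  • Language: Chinese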
  • Record ID: ZXFX201803011
  • Page count: 18
  • Issue number: 03
  • CN: 31-2054/C
  • Pages: 117-133+200
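Abstract
Over the course of its evolution, the term "big data" has acquired two meanings: lowercase big data and uppercase Big Data. Lowercase "big data" refers to the activities and methods associated with data science; as these activities and methods permeated every area of society and developed rapidly, uppercase Big Data emerged. Uppercase Big Data can therefore be seen as the economic and cultural turn of lowercase big data, one that brings a historical transformation in the knowledge structure of social organization. To explore the connection between the two kinds of big data and the associated epistemological consequences, we introduce three core concepts (the datasphere, thick mediation, and representational opacity) as a theoretical framework for understanding how the economic and cultural dimensions of big data, the one local and generative and the other global and emergent, interact, and what effects, problems, and opportunities that interaction produces. The "epistemology of thick mediation", built on a view of knowledge grounded in belief and reliability, offers useful ideas and a theoretical basis for understanding and exploring the epistemological problems of machine learning and artificial intelligence.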
There are two broad meanings associated with the phrase "big data", which we designate by big data and Big Data. In lowercase, the phrase refers to the activities and methods associated with data science when applied to data sets that are too large to yield to the traditional methods of a given field. In its uppercase form, "Big Data" refers to these methods and activities as they have been embedded and developed in society, both economically and culturally. Big Data thus refers to an economic and cultural shift in the nature of big data; it also indexes a historical transformation in the social organization of knowledge. In this article, we develop three central concepts: the datasphere, thick mediation, and representational opacity. These concepts provide a theoretical framework for making sense of how the economic and cultural dimensions (the one local and generative, the other global and emergent) interact to produce a set of effects, problems, and opportunities. Both the research approach and the epistemology of thick mediation provide useful ideas and a theoretical basis for understanding and exploring the epistemological problems of machine learning and artificial intelligence.
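Citations
(1) This article is translated from R. Alvarado and P. Humphreys, “Big Data, Thick Mediation, and Representational Opacity”, New Literary History, Vol. 48, No. 4, 2017, pp. 729-749. Thanks to Krishan Kumar and Chip Tucker for their valuable comments on an earlier draft, and to Qiu Shi, a doctoral student at the School of Philosophy, Beijing Normal University, for proofreading the translation.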
    (1) C. Anderson, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”, Wired, Vol. 16, No. 7, 2008, p. 17.
    (2) We treat “Internet” as a proper noun; its predecessor was the ARPANET.
    (1) K. Polanyi, The Great Transformation: The Political and Economic Origins of our Time, Boston: Beacon, 1957.
    (1) W. Gibson, “Burning Chrome”, Omni, Vol. 4, No. 10, 1982, pp. 72-77; M. Castells, “The Space of Flows”, The Information Age: Economy, Society, and Culture (Vol. 1), Cambridge, MA: Wiley-Blackwell, 1996, pp. 376-423; S. Zuboff, “Big Other: Surveillance Capitalism and the Prospects of an Information Civilization”, in Journal of Information Technology, Vol. 30, No. 1, 2015, pp. 75-89.
    (2) D. Rushkoff, Media Virus!: Hidden Agendas in Popular Culture, New York: Ballantine Books, 1994; S. Garfinkel, Database Nation: The Death of Privacy in the 21st Century, Beijing: O’Reilly Media, 2000.
    (3) B. Latour, Reassembling the Social: An Introduction to Actor-Network-Theory, Oxford: Oxford University Press, 2005.
    (1) N. L. Whitehead and M. Wesch, Human No More: Digital Subjectivities, Unhuman Subjects, and the End of Anthropology, Boulder: Univ. Press of Colorado, 2012; H. A. Innis, Empire and Communications, Oxford: Clarendon, 1950.
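    (2) It should be stressed that, although we focus on the human element, the datasphere is not limited to interactions between humans or between humans and machines. The operation of autonomous taxis, the gathering of information by military drones, video footage of bear intrusions: all of these are part of the datasphere.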
    (1) J. S. Brown and P. Duguid, “Mysteries of the Region: Knowledge Dynamics in Silicon Valley”, in The Silicon Valley Edge: Habitat for Innovation and Entrepreneurship, edited by Chong-Moon Lee, W. F. Miller, M. G. Hancock and H. S. Rowen, Stanford, CA: Stanford University Press, 2000, pp. 16-45.
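    (1) Zuboff argues that databases function as a kind of text within organizations. In In the Age of the Smart Machine: The Future of Work and Power (New York: Basic Books, 1988), she describes the database as an electronic text that performs an “informating” function, analogous to the “automating” process of industrial machines. In the same book, Zuboff also gives a detailed account of the function and role of databases, drawing on theories of orality and literacy.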
    (2) C. E. Shannon, “A Mathematical Theory of Communication”, ACM SIGMOBILE Mobile Computing and Communications Review, Vol. 5, No. 1, 2001, pp. 3-55.
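    (3) This refers to the claims about media made by the noted Canadian communication theorist Marshall McLuhan (1911-1980), the best known of which include “the medium is the message” and “media are extensions of man”. (Translator's note)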
    (1) I. Gershon, “Language and the Newness of Media”, Annual Review of Anthropology, Vol. 46, No. 1, 2017, pp. 15-31.
    (2) L. Manovich, “Database as Symbolic Form”, Convergence, Vol. 5, No. 2, 1999, pp. 80-99.
    (3) J. F. Lyotard, The Postmodern Condition: A Report on Knowledge, translated by G. Bennington and B. Massumi, Minneapolis: University of Minnesota Press, 1984.
    (1) Lyotard, The Postmodern Condition: A Report on Knowledge, p. xiii.
    (2) N. Carr, “Is Google Making Us Stupid?” in The Atlantic Monthly, Jul/Aug 2008. https://www.theatlantic.com/magazine/archive/2008/07/is-google-making-us-stupid/306868/.
    (3) C. Shirky, “Ontology Is Overrated: Categories, Links, and Tags”, in Clay Shirky’s Writings About the Internet (blog), 2005, shirky.com/writings/herecomeseverybody/ontology_overrated.html.
    (4) F. Moretti, Distant Reading, London: Verso Books, 2013.
    (5) T. Underwood, Why Literary Periods Mattered: Historical Contrast and the Prestige of English Studies, Stanford, CA: Stanford University Press, 2013.
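    (1) These types could be further subdivided into syntactic transparency (opacity) and semantic transparency (opacity); we do not elaborate on that distinction here. For the purposes of this article, a representation counts as opaque if it is either syntactically or semantically opaque.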
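    (1) Whether the term “topic”, as applied to texts, is appropriate for statistical models is highly controversial. We use it only as a term of art and do not endorse other uses.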
    (2) On how the “topics” output by statistical models can be evaluated, see J. Chang, J. Boyd-Graber, C. Wang, S. Gerrish and D. M. Blei, “Reading Tea Leaves: How Humans Interpret Topic Models”, Advances in Neural Information Processing Systems, Vol. 32, 2009, pp. 288-296.
    (1) J. L. Borges, Labyrinths: Selected Stories and Other Writings, New York: New Directions, 1964, p. 65.
    (2) For a detailed explanation of CAT scans, see Humphreys, “X-ray Data and Empirical Content”, in Logic, Methodology and Philosophy of Science XIV: Logic and Science Facing the New Technologies, edited by P. Schroeder-Heister, G. Heinzmann, W. Hodges, and P. E. Bour, London: College Publications, 2014.
    (1) For representative discussions, see S. Leonelli, “What Difference Does Quantity Make? On the Epistemology of Big Data in Biology”, Big Data and Society, Vol. 1, No. 1, 2014, pp. 1-11; F. Mazzocchi, “Could Big Data Be the End of Theory in Science?” EMBO Reports, Vol. 16, No. 10, 2015, pp. 1250-1255.
    (2) See R. P. Feynman, R. B. Leighton and M. L. Sands, The Feynman Lectures on Physics (Vol. 2), Reading, MA: Addison-Wesley, 1963, chapter 42.
    (3) For an overview of these methods, see Y. LeCun, Y. Bengio, and G. Hinton, “Deep Learning”, Nature, Vol. 521, No. 7553, 2015, pp. 436-444.
    (1) For the detailed reasons, see Humphreys, Extending Ourselves: Computational Science, Empiricism, and Scientific Method, Oxford: Oxford Univ. Press, 2004, and J. Bogen, “Empiricism and After”, in Oxford Handbook of Philosophy of Science, edited by Humphreys, Oxford: Oxford University Press, 2016.
    (1) T. Burge, “Computer Proof, A Priori Knowledge, and Other Minds: The Sixth Philosophical Perspectives Lecture”, Nous, Vol. 32, No. 12, 1998, pp. 1-37.
    (2) This term was coined by R. Bellman; see Adaptive Control Processes: A Guided Tour, Princeton, NJ: Princeton University Press, 1961.
