There is growing consensus in the open data community that the mere release of open data (that is, data that can be freely accessed, remixed, and redistributed) is not enough to realize the full potential of openness. Successful open data initiatives don’t simply tick the ‘open’ box but produce data that actually gets used. Open data portals, in particular, are at risk of becoming “data dumps”, where the number of published datasets counts more than their quality or utility.
This is why, when Sheldon.Studio was hired by the Matera European Capital of Culture 2019 foundation to design their open-data portal, we felt we were facing a unique challenge. How do we create an open data portal that empowers the audience, and how do we avoid an open data dump 🤪? One month into the project, here is what we have learned along the way 😎.
Knowing the audience is the first step to an audience-centered data portal.
Last year was a big one for the city of Matera, a city in Southern Italy of 60K souls whose history dates back to the Palaeolithic, as it became European Capital of Culture 2019 and witnessed the arrival of more than half a million visitors. Not only tourists but also artists, cultural workers, and social operators swarmed through the city and actively participated in more than 2400 events, many of which spanned multiple days.
Can you imagine the amount of data visitors and citizens generated during the year? We can tell you what we received: dozens and dozens of spreadsheets, some handcrafted, some software-generated; textual reports; photo galleries and video interviews. We could have simply uploaded it all to some repo and been done with it. Yet, from the beginning of the collaboration, we embraced the idea of creating something beyond the usual. We wanted to give the data back to the people who helped produce it. This meant focusing on what the audience needed to understand.
Richard Saul Wurman, who coined the term “information architecture” in the mid-1970s, often said: “You only understand something relative to something you already understand.” This simple yet timeless statement is an essential lens through which we design information experiences at Sheldon.studio. In practice, it means that we should build on the past experiences of our audience in order to explain something novel. So, knowing our audience was the first building block of our design process.
Another key ingredient of our human-centered design approach is the preference for simple visualizations over flamboyant charts, especially when the fancier design would entail a compromise on clarity. Rather than adding complexity to the data visualizations, we leveraged colors and animations to keep our chart designs fresh and engaging, helping the audience comprehend hidden data patterns.
From a design perspective, we built our visualizations around the central theme of showing the liveliness and the humanity that characterized the cultural programme of Matera. For this reason, we favored rounded shapes and a profusion of dots swarming everywhere, a metaphor of humanity as seen from a bird’s-eye view. We also decided to present some visualizations as a pack of many separate units, or bubbles, forming bigger clusters. We feel that this makes the numbers interesting and more intuitive to grasp, even for audiences with lower data-viz literacy.
In line with our endeavour to keep the data visualization accessible and easy to parse, we devised a way to integrate legends, charts and text efficiently. Readers usually struggle as their eyes ping-pong back and forth between the chart and its legend to understand what’s what, so we tried to weave the legend into the descriptive text above the chart, highlighting keywords with the corresponding colours in the chart. The idea is to spark curiosity in readers as they notice that some words in the text are highlighted, or, the other way around, to prompt somebody to read the text while looking for the legend.
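To make the legend-in-text idea concrete, here is a minimal sketch in Python. It is not the portal's actual code: the function name `highlight_legend`, the keywords, and the colour values are all assumptions made up for illustration. It wraps each legend keyword found in a description in a `<span>` styled with the colour of the matching chart series, and assumes the keywords are plain words.

```python
import html
import re

# Hypothetical legend: keyword -> colour of the matching chart series.
LEGEND_COLOURS = {
    "tourists": "#e4572e",
    "artists": "#17bebb",
}


def highlight_legend(text: str, colours: dict[str, str]) -> str:
    """Return `text` as HTML, with each legend keyword coloured like its series."""
    escaped = html.escape(text)
    if not colours:
        return escaped

    # Longest keywords first, so a longer keyword wins over a shorter prefix.
    keywords = sorted(colours, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): c for k, c in colours.items()}

    def wrap(match: re.Match) -> str:
        word = match.group(0)
        colour = html.escape(lookup[word.lower()], quote=True)
        return f'<span style="color: {colour}; font-weight: bold">{word}</span>'

    return pattern.sub(wrap, escaped)
```

Calling `highlight_legend("Tourists and artists met.", LEGEND_COLOURS)` returns the sentence with both keywords wrapped in coloured spans, so the text above a chart can double as its legend.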
Next? Plan for different data literacy levels.
The co-design sessions with our client, the Matera Foundation, surfaced the need to plan for multiple entry points and different levels of data literacy, to suit the needs of the different types of people that would visit the portal.
A first step in this direction was to include qualitative data alongside the numbers and statistics. We strongly believe that quantitative data is just one possible ingredient of the story, especially when we are discussing social issues, and even more so when it is important to include a broader audience. For this reason, we combined traditional data visualizations with original texts, and we intertwined the data stories with photos and statements by the participants.
In its final version, the project unfolds across 8 thematic sections and 6 in-depth micro-stories. We opted for these two content formats, sections and stories, to offer two different ways of looking at the data. The thematic sections stand as metaphorical chapters that disclose the main narrative of what Matera 2019 has represented, providing a bird’s-eye view of the core values of its organization. The micro-stories, on the other hand, drill down into specific events or issues of particular importance. So, for instance, while the Cultural vibrancy section introduces and visualizes the amount and diversity of the cultural program, the connected Open Design School micro-story unveils how the project brought talented youngsters from all over Europe to the city during the year (see pic below).
The way we decided to publish the open data in the portal is itself an attempt to suit the different data literacy levels and needs that the website’s visitors may have. All the data is published in three places, each designed with a specific type of audience in mind.
🤓🤓🤓 A dedicated GitHub repo that provides the CSV and JSON files (as a data geek would expect them).
🤓🤓 An “Open Data Centre” on the website, which is essentially a traditional open data portal, listing all the raw data files along with their metadata.
🤓 An “Open Data Corner” at the end of each thematic section or micro-story, which includes only the data relating to that specific section or story. In each open data corner, we decided to publish not only the raw data but also the aggregated and processed data files that we used to produce each visualization on the page.
We believe that the last of these, the “Open Data Corner”, is a core innovation in the way we designed the portal, as it empowers people who might have a lower data literacy than a typical open data user, such as concerned citizens, activists, and journalists, to access and play with the data in a beginner-friendly manner.
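As an illustration of what publishing the processed data next to the raw data could involve, here is a minimal Python sketch using only the standard library. It is a sketch under stated assumptions, not the pipeline used for the portal: the helper names `aggregate_by` and `publish`, and any column names passed to them, are hypothetical. It sums one numeric column per category, then writes the resulting records as both CSV and JSON, the two formats offered in the repo.

```python
import csv
import json
from collections import defaultdict
from pathlib import Path


def aggregate_by(rows, key, value):
    """Sum the numeric `value` column for each distinct `key` (hypothetical helper)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row[value])
    # Sorted output keeps the published file stable between runs.
    return [{key: k, value: v} for k, v in sorted(totals.items())]


def publish(rows, name, out_dir):
    """Write the same records as <name>.csv and <name>.json; return both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    csv_path = out / f"{name}.csv"
    json_path = out / f"{name}.json"

    # Column order follows the first record; an empty list yields an empty header.
    fieldnames = list(rows[0].keys()) if rows else []
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    json_path.write_text(
        json.dumps(rows, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return csv_path, json_path
```

Given raw rows with, say, a `theme` and an `attendees` column, `publish(aggregate_by(raw, "theme", "attendees"), "by_theme", "corner/")` would produce the small, chart-ready files a beginner can open directly, while the raw rows can be published the same way alongside them.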
Most of all, think of data as a means to an end, not the end in itself 😏.
The more we figured out how to translate the principles of human-centered design into the practice of creating the Matera 2019 data portal, the more we realized we were shifting the role data has in a traditional open data portal.
Open data portals are typically all about the data: how many datasets, how many formats, which open licenses. In Matera 2019 this hierarchy is flipped: the stories, which narrate the data and illustrate what can be done with it, come first; then we provide the downloadable open datasets.
In addition, a standard open data portal will include mainly quantitative, machine-readable datasets. In the Matera 2019 open data portal, CSVs and machine-readable datasets are just one of the many components of a multi-modal narration, together with texts, videos, pictures, etc. The datasets are not stand-alone elements, but parts of an informative ecosystem covering the many facets and the complexity of what Matera European Capital of Culture 2019 has represented.
Finally, our hope is to give rise to a recursive process that sees data as a means to an end, not an end in itself. Publishing the data online was not the ultimate goal of the Matera 2019 open data portal. It is humans and their actions that generated the datasets behind the stories of the portal. And now that the data is published, it should serve this community. We want to see the data used as a tool to foster new human interactions and to inform new processes aimed at improving the conditions of Matera’s society.
With this goal in mind, an integral part of designing the Open Data Portal has been planning for its legacy. In the autumn 🍂, we are supporting the organization of a DataSchool, together with the Matera Foundation and with the participation of the open-data guru Maurizio Napolitano. The School will bring to the city a colourful variety of data-people, from data-activists to design students, journalists, and social scientists, to design new forms of communication and services based on the data.
Through the design of the platform, we aimed to turn open data into a commons: public goods generated and maintained by the community for its own wealth and awareness. Our hope, indeed, is to contribute to a more inclusive data practice, one that embraces a broader audience, provides diverse and multifaceted entry points for personal exploration, and constitutes a stepping stone towards new forms of information, knowledge, awareness, and social care.
Matteo Moretti is a designer and cofounder of Sheldon.studio. Alice Corona is a data journalist and founder of Batjo.eu.