Thinking with Joins

Say you’re making a basic scatterplot using D3, and you need to create some SVG circle elements to visualize your data. You may be surprised to discover that D3 has no primitive for creating multiple DOM elements. Wait, WAT?

Sure, there’s the append method, which you can use to create a single element.

Here svg refers to a single-element selection containing an <svg> element created previously (or selected from the current page, say).

// Assumes d is a single data point with x and y properties.
svg.append("circle")
    .attr("cx", d.x)
    .attr("cy", d.y)
    .attr("r", 2.5);

But that’s just a single circle, and you want many circles: one for each data point. Before you bust out a for loop and brute-force it, consider this mystifying sequence from one of D3’s examples.

Here data is an array of JSON objects with x and y properties, such as: [{"x": 1.0, "y": 1.1}, {"x": 2.0, "y": 2.5}, …].

svg.selectAll("circle")
  .data(data)
  .enter().append("circle")
    .attr("cx", function(d) { return d.x; })
    .attr("cy", function(d) { return d.y; })
    .attr("r", 2.5);

This code does exactly what you need: it creates a circle element for each data point, using the x and y data properties for positioning. But what’s with the selectAll("circle")? Why do you have to select elements that you know don’t exist in order to create new ones? WAT.

Here’s the deal. Instead of telling D3 how to do something, tell D3 what you want. You want the circle elements to correspond to data. You want one circle per datum. Instead of instructing D3 to create circles, then, tell D3 that the selection "circle" should correspond to data. This concept is called the data join:

Data points joined to existing elements produce the update (inner) selection. Leftover unbound data produce the enter selection (left), which represents missing elements. Likewise, any remaining unbound elements produce the exit selection (right), which represents elements to be removed.
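To make the three selections concrete, here is a minimal sketch of an index-based join in plain JavaScript. It is not D3's implementation: joinByIndex is a hypothetical helper, and plain arrays stand in for DOM elements and selections.

```javascript
// Simplified index-based data join: an illustration, NOT D3's implementation.
// `elements` and `data` are plain arrays; elements[i] is paired with data[i].
function joinByIndex(elements, data) {
  const shared = Math.min(elements.length, data.length);
  return {
    // Data paired with an existing element.
    update: elements.slice(0, shared).map((element, i) => ({ element, datum: data[i] })),
    // Leftover data with no element yet: placeholders for missing elements.
    enter: data.slice(shared).map((datum) => ({ datum })),
    // Leftover elements with no datum: candidates for removal.
    exit: elements.slice(shared),
  };
}

const result = joinByIndex(
  ["circleA", "circleB"],
  [{ x: 1.0, y: 1.1 }, { x: 2.0, y: 2.5 }, { x: 3.0, y: 3.2 }]
);
console.log(result.update.length, result.enter.length, result.exit.length); // prints "2 1 0"
```

Two existing elements and three data points give two updates, one enter placeholder, and nothing to exit.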

Now we can unravel the mysterious enter-append sequence through the data join:

  1. First, svg.selectAll("circle") returns a new empty selection, since the SVG container was empty. The parent node of this selection is the SVG container.

  2. This selection is then joined to an array of data, resulting in three new selections that represent the three possible states: enter, update, and exit. Since the selection was empty, the update and exit selections are empty, while the enter selection contains a placeholder for each new datum.

  3. The update selection is returned by selection.data, while the enter and exit selections hang off the update selection; selection.enter thus returns the enter selection.

  4. The missing elements are added to the SVG container by calling selection.append on the enter selection. This appends a new circle for each data point to the SVG container.
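The four steps above can be imitated without D3 at all. The sketch below uses plain objects as stand-ins for DOM nodes; makeNode and enterAppend are hypothetical names, and the join is by index only. It also suggests why selecting circles that don't exist yet is useful: running the same code a second time finds the existing circles and appends nothing.

```javascript
// Hypothetical stand-in for a DOM node: a tag name, attributes, and children.
function makeNode(tag) {
  return { tag, attrs: {}, children: [] };
}

// Mimics svg.selectAll(tag).data(data).enter().append(tag) with an index-based join.
// Returns the nodes that were newly appended.
function enterAppend(container, tag, data) {
  // Step 1: select existing children with this tag (empty on the first run).
  const existing = container.children.filter((child) => child.tag === tag);
  // Steps 2-3: data beyond the existing elements forms the enter selection.
  const entering = data.slice(existing.length);
  // Step 4: append one new node per entering datum to the parent container.
  return entering.map((datum) => {
    const node = makeNode(tag);
    node.datum = datum;
    node.attrs.cx = datum.x;
    node.attrs.cy = datum.y;
    node.attrs.r = 2.5;
    container.children.push(node);
    return node;
  });
}

const svgNode = makeNode("svg");
const points = [{ x: 1.0, y: 1.1 }, { x: 2.0, y: 2.5 }];
const firstRun = enterAppend(svgNode, "circle", points);
const secondRun = enterAppend(svgNode, "circle", points);
console.log(firstRun.length, secondRun.length, svgNode.children.length); // prints "2 0 2"
```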

Thinking with joins means declaring a relationship between a selection (such as "circle") and data, and then implementing this relationship through the three enter, update, and exit states.

But why all the trouble? Why not just a primitive to create multiple elements? The beauty of the data join is that it generalizes. While the above code only handles the enter selection, which is sufficient for static visualizations, you can extend it to support dynamic visualizations with only minor modifications for update and exit. And that means you can visualize realtime data, allow interactive exploration, and transition smoothly between datasets!

Here’s an example of handling all three states:

var circle = svg.selectAll("circle")
  .data(data);

circle.exit().remove();

circle.enter().append("circle")
    .attr("r", 2.5)
  .merge(circle)
    .attr("cx", function(d) { return d.x; })
    .attr("cy", function(d) { return d.y; });

To control how data is assigned to elements, you can provide a key function.

Whenever this code is run, it recomputes the data join and maintains the desired correspondence between elements and data. If the new dataset is smaller than the old one, the surplus elements end up in the exit selection and get removed. If the new dataset is larger, the surplus data ends up in the enter selection and new nodes are added. If the new dataset is exactly the same size, then all the elements are simply updated with new positions, and no elements are added or removed.
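Here is a rough simulation of rerunning the join on larger, smaller, and same-sized datasets, this time matching by key rather than by index. In D3 the key function is passed as the second argument to selection.data; everything else here is an assumption made for illustration: updateNodes is a hypothetical helper, the id property is invented, and plain objects stand in for circle elements.

```javascript
// Simplified keyed join over plain objects; an illustration, not D3's implementation.
// `nodes` is an array of { datum, attrs }; `key` maps a datum to a unique string.
function updateNodes(nodes, data, key) {
  const byKey = new Map(nodes.map((node) => [key(node.datum), node]));
  let entered = 0;
  const next = data.map((datum) => {
    let node = byKey.get(key(datum));
    if (node) {
      byKey.delete(key(datum)); // matched: part of the update selection
    } else {
      node = { datum: null, attrs: { r: 2.5 } }; // unmatched datum: enter
      entered += 1;
    }
    // Entering and updating nodes both receive the latest datum and position.
    node.datum = datum;
    node.attrs.cx = datum.x;
    node.attrs.cy = datum.y;
    return node;
  });
  // Nodes still left in the map had no matching datum: exit.
  return { nodes: next, entered, exited: byKey.size };
}

const byId = (d) => d.id; // hypothetical key: assumes each datum has a unique id
const larger = updateNodes([], [{ id: "a", x: 1, y: 1 }, { id: "b", x: 2, y: 2 }], byId);
const smaller = updateNodes(larger.nodes, [{ id: "b", x: 5, y: 5 }], byId);
const same = updateNodes(smaller.nodes, [{ id: "b", x: 6, y: 6 }], byId);
console.log(larger.entered, larger.exited);   // prints "2 0"
console.log(smaller.entered, smaller.exited); // prints "0 1"
console.log(same.entered, same.exited);       // prints "0 0"
```

Because the match is by id, the node for "b" survives all three runs and only its position changes, even though its index in the data array moves from 1 to 0.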

Thinking with joins means your code is more declarative: you handle these three states without any branching (if) or iteration (for). Instead you describe how elements should correspond to data. If a given enter, update, or exit selection happens to be empty, the corresponding code is a no-op.

Joins also let you target operations to specific states, if needed. For example, you can set constant attributes (such as the circle’s radius, defined by the "r" attribute) on enter rather than update. By reselecting elements and minimizing DOM changes, you vastly improve rendering performance! Similarly, you can target animated transitions to specific states. For example, for entering circles to expand-in:

circle.enter().append("circle")
    .attr("r", 0)
  .transition()
    .attr("r", 2.5);

Likewise, to shrink-out:

circle.exit().transition()
    .attr("r", 0)
    .remove();

Now you’re thinking with joins!

Comments or questions? Discuss on HN.

Addendum

I’ve written a series of examples on the general update pattern as a followup to this post.
