Searching for rivers in Unterfranken: how to use Elasticsearch to find features on a map

by 24ma13wg

One of the great things about working remotely is that I can work from wherever I want to. So, this month I have swapped my city desk in London for one in the spa town of Bad Kissingen, Germany.

I’ve also had fun building search engines with Elasticsearch. In this post, I’m going to explore how it can be used to search for features on a map.

Search indices

Sitting here at my new desk, I’m leafing through an old textbook. At the back, there is an index. It tells me on which pages certain keywords appear. So, if I want to read about something specific I can find the relevant page numbers quickly. Without the index, I would have to scan through all the pages of the book to find what I’m interested in.

Similarly, when we search for things on the internet, we are likely also using a (more sophisticated) index to make our search fast, although we may not be aware of it. We put questions to the index and get answers back. More precisely, with Elasticsearch we query the index by sending RESTful API requests whose bodies are JSON, and the results are returned as JSON too.

The JSON family

JSON is a commonly used format for giving structure to data. Put simply, it expresses data as groups of name/value pairs, in a text string. For example:

[
  {
    "city": "Erlangen",
    "country": "Germany"
  },
  {
    "city": "Würzburg",
    "country": "Germany"
  }
]

Minified, our example looks like this:

[{"city":"Erlangen","country":"Germany"},{"city":"Würzburg","country":"Germany"}]
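
To make the relationship between the two forms concrete, here is a minimal Python sketch using only the standard `json` module. It holds the two records in a list (an assumption on my part, so that the text is a single valid JSON document) and shows that pretty-printing and minifying change only the whitespace, not the data.

```python
import json

# The two records from the example, held in a list so the text is one valid JSON document.
cities = [
    {"city": "Erlangen", "country": "Germany"},
    {"city": "Würzburg", "country": "Germany"},
]

# Pretty-printed: indentation makes the name/value pairs easy to read.
pretty = json.dumps(cities, indent=2, ensure_ascii=False)

# Minified: no whitespace between tokens; the data itself is unchanged.
minified = json.dumps(cities, separators=(",", ":"), ensure_ascii=False)

print(pretty)
print(minified)

# Both strings parse back to the same data.
assert json.loads(pretty) == json.loads(minified) == cities
```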

Often we are concerned with indexing fields of data, like a product record, or full text, like a blog post. Elasticsearch handles these very well. It can also index spatial data: map features, such as locations and boundaries. We use a special kind of JSON to describe map features, called GeoJSON. It looks like this:

{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [9.939119, 49.792762]
  },
  "properties": {
    "city": "Würzburg",
    "country": "Germany"
  }
}

Note that GeoJSON lists longitude before latitude.

A geometry type may be a Point, LineString, or Polygon. Each has a multi-part counterpart: MultiPoint, MultiLineString, and MultiPolygon. Several features, like the location above, can be grouped in a FeatureCollection.
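
As a rough illustration of how features are grouped, the sketch below builds a small FeatureCollection in Python. The helper name `point_feature` and the Erlangen coordinates are my own additions for the example; GeoJSON positions are written longitude first, then latitude.

```python
import json

def point_feature(lon, lat, **properties):
    """Build a GeoJSON Feature with a Point geometry (longitude first, then latitude)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }

# A FeatureCollection is an object whose "features" member is a list of Features.
collection = {
    "type": "FeatureCollection",
    "features": [
        point_feature(9.939119, 49.792762, city="Würzburg", country="Germany"),
        # Approximate coordinates for Erlangen, added only for illustration.
        point_feature(11.0078, 49.5897, city="Erlangen", country="Germany"),
    ],
}

print(json.dumps(collection, indent=2, ensure_ascii=False))
```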

Bad Kissingen is one of many communities in the Lower Franconia region (Unterfranken in German). As with many of its neighbors, a river runs through it: the Fränkische Saale. The boundary of the community forms a single shape; it maps to the Polygon geometry type. The water courses that make up the river can be imagined as a series of lines joined together. They map to the MultiLineString type.

I’ve found some maps of Lower Franconia online. I can process all the region’s rivers and communities into NDJSON (newline-delimited JSON, another variation of JSON). I create an Elasticsearch index and load the data into it. Now I’m ready to search. Gut, wir machen einen Test! (Right, let's run a test!)
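
The processing step is not shown in this post, so the following is only a sketch of one way it might look. It assumes each document has a `feature` field and a `geometry` field (both appear in the query later on), plus a `name` field that I have invented for the example. The mapping uses the `default` type name to match that query, and the output follows the bulk format of an action line followed by a document line. The two geometries are made-up placeholders, not real map data.

```python
import json

# Assumed mapping: `geometry` as geo_shape so spatial queries work,
# `feature` as keyword so exact term queries work. `name` is hypothetical.
mapping = {
    "mappings": {
        "default": {
            "properties": {
                "feature": {"type": "keyword"},
                "name": {"type": "keyword"},
                "geometry": {"type": "geo_shape"},
            }
        }
    }
}

def to_bulk_ndjson(features):
    """Turn GeoJSON features into bulk-style NDJSON: an action line, then the document."""
    lines = []
    for doc_id, feature in enumerate(features, start=1):
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        document = {
            "feature": feature["properties"]["feature"],
            "name": feature["properties"]["name"],
            "geometry": feature["geometry"],
        }
        lines.append(json.dumps(document, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # a bulk body ends with a newline

# Placeholder features with invented coordinates.
features = [
    {
        "type": "Feature",
        "properties": {"feature": "river", "name": "Example River"},
        "geometry": {
            "type": "MultiLineString",
            "coordinates": [[[10.0, 50.1], [10.1, 50.2]], [[10.1, 50.2], [10.2, 50.3]]],
        },
    },
    {
        "type": "Feature",
        "properties": {"feature": "community", "name": "Example Community"},
        "geometry": {
            "type": "Polygon",
            # A polygon ring is closed: the first and last positions are identical.
            "coordinates": [[[10.0, 50.1], [10.2, 50.1], [10.2, 50.3], [10.0, 50.3], [10.0, 50.1]]],
        },
    },
]

ndjson = to_bulk_ndjson(features)
print(ndjson)
```

The resulting text could then be sent to the index's bulk endpoint with a content type of `application/x-ndjson`.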

Searching for rivers

A simple term query tells me that there are 22 rivers and 360 communities in Lower Franconia. There are many more water courses in the downloaded data, but only 22 are defined as rivers. Time to try some more complex queries. I’ll begin with the region’s principal river, the river Main, which sounds like Mine in German. I wonder how many communities it flows through? The query I send to my index looks like this:

GET lower_franconia/default/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "feature": "community"
          }
        },
        {
          "geo_shape": {
            "geometry": {
              "indexed_shape": {
                "index": "lower_franconia",
                "type": "default",
                "id": "12",
                "path": "geometry"
              },
              "relation": "intersects"
            }
          }
        }
      ]
    }
  }
}

This query is being run in a filter context. This means that relevance scores are not calculated. I'm not concerned with how well things match, but rather whether a match exists or does not. In this context, I specify an array of two items.

In the first item, I use a term query to restrict the results to community features: only documents in my index whose feature field has the value community will be returned.

In the second item of the array, I have a geo_shape query that references the shape already stored in the document with id 12 (this document describes the river Main) and uses the intersects relation as the constraint.

Put simply, match all community shapes that intersect with a particular river line.

I get 91 hits. A quarter of all communities are on the Main. The result is formatted in — yes you guessed it — JSON. Although JSON is quite readable, it’s not easy to understand at a glance. Better to create a data visualization with d3.js so that the results can be understood instantly.

For more details about how this is done, see my previous post on web page cartography.

Could a brown bear, a black bear, and a polar bear meet? Web page cartography can show us where (towardsdatascience.com)

Next up, how many rivers are close by? If I want to stroll by a river this evening, but don’t want to travel more than, say, ten kilometers, what are my options?

Four hits come back: the rivers Aschach, Fränkische Saale (of course), Thulba, and Premich. This query is slightly different from the previous one. This time I only want rivers to be returned. Also, I am specifying a new shape that does not exist in the index: a circle centered on my current location with a radius of ten kilometers.
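
The query itself is not shown here, so this is only a sketch of what it could look like, built as a Python dictionary and printed as JSON. It assumes that rivers are stored with the value `river` in the `feature` field, that the cluster accepts the `circle` shape type inline (the 6.x documentation describes it; other versions may not accept it), and that the center is roughly Bad Kissingen. The function name and the coordinates are my own placeholders.

```python
import json

# Approximate longitude/latitude of Bad Kissingen; an assumption for illustration.
HERE = [10.08, 50.20]

def rivers_within(center, radius):
    """Query body: river features whose geometry intersects a circle around `center`."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Assumed field value; the post only shows "community".
                    {"term": {"feature": "river"}},
                    {
                        "geo_shape": {
                            "geometry": {
                                # An inline shape, rather than one already in the index.
                                "shape": {
                                    "type": "circle",
                                    "coordinates": center,
                                    "radius": radius,
                                },
                                "relation": "intersects",
                            }
                        }
                    },
                ]
            }
        }
    }

query = rivers_within(HERE, "10km")
print(json.dumps(query, indent=2))
```

The printed body would be sent to the same `_search` endpoint as the earlier query.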

One more. Where shouldn’t I go if I want to walk by a river? For this query, I use a must_not key to filter out the communities that intersect with any of the 22 rivers. I get 199 hits – just over half of the communities in Lower Franconia are without a river.
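
Again the query is not shown, so here is one possible shape for it, sketched in Python. It keeps the term filter on communities and adds one `geo_shape` clause per river under `must_not`, so a community is excluded as soon as it intersects any of them. The river ids below are hypothetical placeholders (the post only tells us that document 12 is the Main), and the helper names are mine.

```python
import json

# Hypothetical ids for the 22 river documents; the real ids are not given in the post.
RIVER_IDS = [str(n) for n in range(1, 23)]

def intersects_indexed(doc_id, index="lower_franconia", doc_type="default"):
    """geo_shape clause matching shapes that intersect a shape already in the index."""
    return {
        "geo_shape": {
            "geometry": {
                "indexed_shape": {
                    "index": index,
                    "type": doc_type,
                    "id": doc_id,
                    "path": "geometry",
                },
                "relation": "intersects",
            }
        }
    }

def communities_without_river(river_ids):
    """Query body: communities that intersect none of the given rivers."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {"feature": "community"}}],
                # A document matching any must_not clause is dropped from the results.
                "must_not": [intersects_indexed(river_id) for river_id in river_ids],
            }
        }
    }

query = communities_without_river(RIVER_IDS)
print(json.dumps(query, indent=2)[:400])  # print only the start; the full body is long
```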

Real world application

I have used the rivers and communities of Lower Franconia as a simple example to illustrate how map features can be indexed with Elasticsearch, and the query results visualized with d3.js.

Could it have a practical application? Well, the index could be used, for example, to find out which communities to warn if a flood alert was issued for a particular river. Perhaps it could be used, in a drier region, to predict where droughts might cause problems for agriculture.
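
As a rough sketch of the flood alert idea, the Python below wraps the earlier intersects query in a function that takes a river's document id, and then pulls community names out of a search response. The `name` field, both function names, and the hand-written sample response are assumptions made for illustration; no real query results are shown.

```python
def communities_on_river(river_id, index="lower_franconia", doc_type="default"):
    """Query body: community shapes that intersect one indexed river."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"feature": "community"}},
                    {
                        "geo_shape": {
                            "geometry": {
                                "indexed_shape": {
                                    "index": index,
                                    "type": doc_type,
                                    "id": river_id,
                                    "path": "geometry",
                                },
                                "relation": "intersects",
                            }
                        }
                    },
                ]
            }
        }
    }

def names_to_warn(response):
    """Collect the (hypothetical) `name` field from each hit, sorted alphabetically."""
    return sorted(hit["_source"]["name"] for hit in response["hits"]["hits"])

# Hand-written stand-in for a search response; the entries are illustrative only.
sample_response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_id": "101", "_source": {"feature": "community", "name": "Würzburg"}},
            {"_id": "102", "_source": {"feature": "community", "name": "Schweinfurt"}},
        ],
    }
}

alert_query = communities_on_river("12")  # 12 is the river Main in this index
warn_list = names_to_warn(sample_response)
print(warn_list)
```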

Of course, we are not limited to river courses and community boundaries. Any combination of map features can be mapped and indexed and, therefore, there are many possible applications.

Data: OpenStreetMap + Open Data Portal des Freistaats Bayern

Originally published at 24ma13wg.github.io.

Translated from: https://www.freecodecamp.org/news/searching-for-rivers-in-unterfranken-how-to-use-elasticsearch-to-find-features-on-a-map-756017ff28c7/
