Elasticsearch mapping

wandy0211

Category: ETL

Article link: https://blog.csdn.net/wjandy0211/article/details/109802389

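Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. For example, use mappings to define:

Which string fields should be treated as full-text fields.
Which fields contain numbers, dates, or geolocations.
The format of date values.
Custom rules to control the mapping of dynamically added fields.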

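A mapping definition has:

Metadata fields

Metadata fields are used to customize how a document's associated metadata is treated. Examples of metadata fields include the document's _index, _id, and _source fields.

Fields

A mapping contains a list of fields, or properties, pertinent to the document. Each field has its own data type.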
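As a minimal sketch of the two parts described above, a mapping body can be written as a plain Python dictionary that mirrors the JSON sent to Elasticsearch. The field names (`title`, `price`, `created_at`, `location`) and the date format are illustrative assumptions, not taken from a real index.

```python
import json

# Hypothetical mapping body; all field names are illustrative only.
mapping = {
    "mappings": {
        # Metadata field: keep the original JSON body of each document.
        "_source": {"enabled": True},
        # Fields: each property declares its own data type.
        "properties": {
            "title": {"type": "text"},            # full-text field
            "price": {"type": "double"},          # numeric field
            "created_at": {                       # date with an explicit format
                "type": "date",
                "format": "yyyy-MM-dd HH:mm:ss",
            },
            "location": {"type": "geo_point"},    # geographic location
        },
    }
}

# Serialized form, suitable as the body of a create-index request.
body = json.dumps(mapping, indent=2)
print(body)
```

The dictionary only describes the mapping; sending it to a cluster is left out so the sketch stays independent of any particular client library.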

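Settings to prevent mapping explosion

Defining too many fields in an index can lead to a mapping explosion, which can cause out-of-memory errors and situations that are difficult to recover from.

Consider a situation where every new document inserted introduces new fields, such as with dynamic mapping. Each new field is added to the index mapping, which can become a problem as the mapping grows.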

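index.mapping.total_fields.limit: The maximum number of fields in an index. Field and object mappings, as well as field aliases, count towards this limit. The default value is 1000.

indices.query.bool.max_clause_count: Limits the maximum number of Boolean clauses in a query. If you raise index.mapping.total_fields.limit, consider raising this setting as well.

index.mapping.depth.limit: The maximum depth of a field, measured as the number of inner objects. The default value is 20.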

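index.mapping.nested_fields.limit: The maximum number of distinct nested mappings in an index. The default value is 50.

index.mapping.nested_objects.limit: The maximum number of nested JSON objects that a single document can contain across all nested types. The default value is 10000.

index.mapping.field_name_length.limit: The maximum length of a field name. By default, there is no limit.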
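To get a feel for how close a mapping is to these limits, the following sketch walks a `properties` dictionary and counts fields and nesting depth. It only approximates how Elasticsearch counts (for example, it ignores field aliases defined elsewhere), and the sample mapping is hypothetical.

```python
# Two of the index-level settings discussed above, at their default values.
settings = {
    "index.mapping.total_fields.limit": 1000,
    "index.mapping.depth.limit": 20,
}

def count_fields(properties):
    """Count fields, object mappings, and multi-fields in a `properties` dict."""
    total = 0
    for field in properties.values():
        total += 1                                           # the field or object itself
        total += count_fields(field.get("properties", {}))   # fields of inner objects
        total += len(field.get("fields", {}))                # multi-fields
    return total

def max_depth(properties, depth=1):
    """Depth is 1 for root-level fields, plus 1 for each level of inner objects."""
    deepest = depth
    for field in properties.values():
        inner = field.get("properties")
        if inner:
            deepest = max(deepest, max_depth(inner, depth + 1))
    return deepest

# Hypothetical mapping used only for illustration.
properties = {
    "name": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
    "address": {
        "properties": {
            "city": {"type": "keyword"},
            "geo": {"properties": {"lat": {"type": "float"}}},
        }
    },
}

n_fields = count_fields(properties)
depth = max_depth(properties)
within_limits = (
    n_fields <= settings["index.mapping.total_fields.limit"]
    and depth <= settings["index.mapping.depth.limit"]
)
print(n_fields, depth, within_limits)
```

A check like this could run in a test suite before a mapping change is deployed, so that runaway field growth is noticed before the cluster rejects documents.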

Field data types

Each field has a field data type, or field type. This type indicates the kind of data the field contains, such as strings or boolean values, and its intended use.

Field types are grouped by family. Types in the same family support the same search functionality but may have different space usage or performance characteristics.

Common types

binary

Binary value encoded as a Base64 string.
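A short sketch of preparing a value for a binary field: the raw bytes are Base64-encoded into a string before being placed in the document. The field name `blob` is a hypothetical example.

```python
import base64

raw = b"\x00\x01 binary payload"

# Encode the bytes as a Base64 string; b64encode does not insert newlines.
encoded = base64.b64encode(raw).decode("ascii")

# Hypothetical document with a binary field named `blob`.
doc = {"blob": encoded}

# Decoding the stored string recovers the original bytes.
decoded = base64.b64decode(doc["blob"])
print(doc, decoded == raw)
```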

boolean

true and false values.

Keywords

The keyword family, including keyword, constant_keyword, and wildcard.

Numbers

Numeric types, such as long and double, used to express amounts.

Dates

Date types, including date and date_nanos.

alias

Defines an alias for an existing field.

Objects and relational types

object

A JSON object.

flattened

An entire JSON object as a single field value.

nested

A JSON object that preserves the relationship between its subfields.

join

Defines a parent/child relationship for documents in the same index.
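The difference between object and nested is easiest to see with an array of objects. The sketch below mimics, in plain Python, how an array of object values is flattened into multi-valued fields, which loses the association between subfields of the same object. The field name `user` and the helper functions are illustrative assumptions, not Elasticsearch APIs.

```python
from collections import defaultdict

def flatten_object_array(field, objects):
    """Mimic indexing an array of plain objects: one multi-valued entry per
    leaf field, with no record of which values came from the same object."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)

def flat_match(flat, first, last):
    """Match against the flattened form: each condition is checked separately."""
    return first in flat["user.first"] and last in flat["user.last"]

def nested_match(objects, first, last):
    """Match against nested objects: both conditions must hold in one object."""
    return any(o["first"] == first and o["last"] == last for o in objects)

users = [
    {"first": "John", "last": "Smith"},
    {"first": "Alice", "last": "White"},
]

flat = flatten_object_array("user", users)
print(flat)
print(flat_match(flat, "Alice", "Smith"))     # cross-object match slips through
print(nested_match(users, "Alice", "Smith"))  # relationship preserved
```

This is why a query for a first and last name pair can return unexpected hits on an object field, and why nested is the type to use when each object in the array must be queried independently.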

Structured data types

Range

Range types, such as long_range, double_range, date_range, and ip_range.

ip

IPv4 and IPv6 addresses.

version

Software versions. Supports Semantic Versioning precedence rules.

murmur3

Computes and stores hashes of values.

Aggregate data types

histogram

Pre-aggregated numerical values.

Text search types

text

Analyzed, unstructured text.

annotated-text

Text containing special markup. Used for identifying named entities.

completion

Used for auto-complete suggestions.

search_as_you_type

Text-like type for as-you-type completion.

token_count

A count of tokens in a text.

Document ranking types

dense_vector

Records dense vectors of float values.

sparse_vector

Records sparse vectors of float values.

rank_feature

Records a numeric feature to boost hits at query time.

rank_features

Records numeric features to boost hits at query time.

Spatial data types

geo_point

Latitude and longitude points.
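A geo_point value can be written in a document in several forms. The sketch below builds three common ones; the helper names are hypothetical. The detail worth noticing is that the array form uses longitude first, while the string form uses latitude first.

```python
def as_object(lat, lon):
    """Object form with explicit keys."""
    return {"lat": lat, "lon": lon}

def as_string(lat, lon):
    """String form: "lat,lon"."""
    return f"{lat},{lon}"

def as_array(lat, lon):
    """Array form: [lon, lat], following the GeoJSON ordering."""
    return [lon, lat]

lat, lon = 40.7, -74.0

# Hypothetical documents, each with a geo_point field named `location`.
docs = [
    {"location": as_object(lat, lon)},
    {"location": as_string(lat, lon)},
    {"location": as_array(lat, lon)},
]
for d in docs:
    print(d)
```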

geo_shape

Complex shapes, such as polygons.

point

Arbitrary cartesian points.

shape

Arbitrary cartesian geometries.

Other types

percolator

Indexes queries written in Query DSL.

Arrays

In Elasticsearch, arrays do not require a dedicated field data type. Any field can contain zero or more values by default; however, all values in the array must be of the same field type.
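As a rough illustration, the sketch below shows documents whose fields hold arrays and a client-side check that all values share one type. The check is an assumption for illustration only: it compares Python types, which is stricter than Elasticsearch itself (for example, it treats a mix of integers and floats as inconsistent).

```python
def is_homogeneous(values):
    """Return True if every non-null value in the list has the same Python type."""
    kinds = {type(v) for v in values if v is not None}
    return len(kinds) <= 1

# Hypothetical document: every field holds zero or more values.
doc = {
    "tags": ["elasticsearch", "mapping"],   # array of strings
    "scores": [1, 2, 3],                    # array of integers
    "empty": [],                            # an empty array is also valid
}

# An array that mixes a number and a string.
mixed = [10, "some string"]

for name, values in doc.items():
    print(name, is_homogeneous(values))
print("mixed", is_homogeneous(mixed))
```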

Multi-fields

It is often useful to index the same field in different ways for different purposes. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations. Alternatively, you could index a text field with the standard analyzer, the english analyzer, and the french analyzer.

This is the purpose of multi-fields. Most field types support multi-fields via the fields parameter.
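A minimal sketch of a multi-field, written as Python dictionaries that mirror the request JSON. The `city` field is indexed as text for full-text search, and its `city.raw` sub-field as keyword for sorting and aggregations. The field and sub-field names are illustrative.

```python
# Mapping: `city` is a text field with a keyword sub-field named `raw`.
mapping = {
    "mappings": {
        "properties": {
            "city": {
                "type": "text",
                "fields": {
                    "raw": {"type": "keyword"},
                },
            }
        }
    }
}

# Search body: match on the analyzed field, sort and aggregate on the sub-field.
search_body = {
    "query": {"match": {"city": "york"}},
    "sort": {"city.raw": "asc"},
    "aggs": {"cities": {"terms": {"field": "city.raw"}}},
}

print(mapping)
print(search_body)
```

The sub-field is addressed with dot notation (`city.raw`), so the same source value serves both purposes without being sent twice.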

Metadata fields

Each document has metadata associated with it, such as the _index, mapping _type, and _id metadata fields. The behavior of some of these metadata fields can be customized when a mapping type is created.

Identity metadata fields

_index

The index to which the document belongs.

_type

The document's mapping type.

_id

The document's ID.

Document source metadata fields

_source

The original JSON representing the body of the document.

_size

The size of the _source field in bytes, provided by the mapper-size plugin.

Indexing metadata fields

_field_names

All fields in the document which contain non-null values.

_ignored

All fields in the document that have been ignored at index time because of ignore_malformed.

Routing metadata field

_routing

A custom routing value which routes a document to a particular shard.
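The idea behind routing can be sketched as a hash of the routing value reduced to a shard number. The code below is an approximation: it uses `zlib.crc32` as a stand-in hash, whereas Elasticsearch uses a Murmur3 hash, so the shard numbers it produces will not match a real cluster. The function name and parameters are hypothetical.

```python
import zlib

def pick_shard(routing, num_primary_shards, num_routing_shards=None):
    """Map a routing value to a shard number.

    Assumes num_routing_shards is a multiple of num_primary_shards.
    """
    if num_routing_shards is None:
        num_routing_shards = num_primary_shards
    routing_factor = num_routing_shards // num_primary_shards
    # Stand-in hash; a real cluster uses Murmur3 here.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return (hashed % num_routing_shards) // routing_factor

# Documents that share a routing value always land on the same shard.
shard_a = pick_shard("user-1", num_primary_shards=5)
shard_b = pick_shard("user-1", num_primary_shards=5)
print(shard_a, shard_b)
```

Because the result depends only on the routing value and the shard counts, a search that supplies the same routing value can be limited to one shard instead of being broadcast to all of them.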

Other metadata field

_meta

Application specific metadata.

Mapping parameters

Field mappings are configured through mapping parameters. Some parameters are common to all field data types, while others apply only to certain types.

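Dynamic mapping: fields and their types do not need to be defined before use; new fields are added to the mapping automatically when a document is indexed.

Explicit mapping: you define the fields and their types yourself, either when creating an index or when adding fields to an existing index.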
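To contrast the two approaches, the sketch below approximates the default dynamic type detection for JSON values and shows an explicit mapping that rejects unknown fields. The detection function is a simplification: it skips date and numeric detection for strings, and its name is hypothetical.

```python
def guess_dynamic_type(value):
    """Approximate the field mapping that dynamic mapping would create."""
    if value is None:
        return None                        # null values add no field
    if isinstance(value, bool):            # check bool before int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, dict):
        return {"type": "object"}
    if isinstance(value, list):
        # Arrays are typed by their first non-null value.
        first = next((v for v in value if v is not None), None)
        return guess_dynamic_type(first)
    if isinstance(value, str):
        # Default for strings (date detection is omitted in this sketch).
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    raise TypeError(f"unsupported JSON value: {value!r}")

# Explicit mapping: "dynamic": "strict" rejects documents with unmapped fields.
explicit = {
    "mappings": {
        "dynamic": "strict",
        "properties": {"title": {"type": "text"}},
    }
}

for sample in (True, 42, 1.5, "hello", [None, 3], {"a": 1}):
    print(repr(sample), guess_dynamic_type(sample))
```

Setting `dynamic` to `strict` in an explicit mapping is one way to avoid the mapping explosion described earlier, since unexpected fields cause the document to be rejected instead of growing the mapping.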
