Vertica的这些事(二) vertica建表的一些规则

Anatomy of a Projection

The CREATE PROJECTION statement defines the individual elements of a projection, as the following graphic shows.

 

The previous example contains the following significant elements:

Column List and Encoding

Lists every column in the projection and defines the encoding for each column. Unlike traditional database architectures, HP Vertica operates on encoded data representations. Therefore, HP recommends that you use data encoding because it results in less disk I/O.

Base Query

Identifies all the columns to incorporate in the projection through column name and table name references. The base query for large table projections can contain PK/FK joins to smaller tables.

Sort Order

The sort order optimizes for a specific query or commonalities in a class of queries based on the query predicate. The best sort orders are determined by the WHERE clauses. For example, if a projection's sort order is (x, y), and the query's WHERE clause specifies (x=1 AND y=2), all of the needed data is found together in the sort order, so the query runs almost instantaneously.

You can also optimize a query by matching the projection's sort order to the query's GROUP BY clause. If you do not specify a sort order, HP Vertica uses the order in which columns are specified in the column definition as the projection's sort order.

The ORDER BY clause specifies a projection's sort order, which localizes logically grouped values so that a disk read can pick up many results at once. For maximum performance, do not sort projections on LONG VARBINARY and LONG VARCHAR columns.

Segmentation

The segmentation clause determines whether a projection is segmented across nodes within the database. Segmentation distributes contiguous pieces of projections, calledsegments, for large and medium tables across database nodes. Segmentation maximizes database performance by distributing the load. Use SEGMENTED BY HASH to segment large table projections.

For small tables, use the UNSEGMENTED keyword to direct HP Vertica to replicate these tables, rather than segment them. Replication creates and stores identical copies of projections for small tables across all nodes in the cluster. Replication ensures high availability and recovery.

For maximum performance, do not segment projections on LONG VARBINARY and LONG VARCHAR columns.

以上来自官网,理解如下:

Projection的解析

Sort Order

在建表时如何选择order by 的列:

1、              order by 后表中插入的数据是有序的,所以order by 的列就源自于你在查询语句时使用的where 字句的内容。例如,如果字句查询中有where x=1 and y=2,那么建立projection时order by (x, y)查询的时候就会迅速定位到符合条件的数据

2、              group by 后面的字段,出现在order by 中也可以优化查询。

3、              order by 不要建立在LONG VARBINARY and LONG VARCHAR的列

 

Segmentation

1、              Segmentation by hash()就是按照某一列,打散数据,把数据均匀的分布在各个节点上,对于大表,要记得使用。所以 hash里的列是主键最好,也就是说该列数据不重复的值越多,越适合做hash.

2、              Segmentation by 的列不要用LONG VARBINARY and LONG VARCHAR columns.

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

数据社

码字不易,谢谢支持!

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值