Database Notes

Latest recommended article published on 2024-09-08 16:10:12

Frankiehp

Category column: Database. Article tags: database

Article link: https://blog.csdn.net/Frankiehp/article/details/102993040

Day1
With separate indexes on user id and country code, 10,000 records is fine,

but 1,000,000 records is not.

select count(*) from User where country_code = 'KR' needs an index on country_code.

select country_code, id, type from User where country_code = 'KR' order by id SKIP 20 limit 10 needs an index on id.

So we create a composite index on (country_code, id) and get the count

with
select count(*) from INDEX:Userx where key between ["KR"] and ["KR"]
Why do we use this?
a. If we use only the country index, select country_code, id, type from User where country_code = 'KR' order by id SKIP 20 limit 10 gets slower and slower as the SKIP value grows.
b. If we do not use the country index at all, we cannot run the count command.

As a result, we use the (country_code, id) index.

It reduces the query time, and the count then becomes
select count(*) from INDEX:Userx where key between ["KR"] and ["KR"]

But with this form we can no longer add the extra WHERE conditions used at search time. T T
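
To make the composite-index idea concrete, here is a minimal sketch. It is not OrientDB: it assumes an in-memory SQLite database (Python's standard `sqlite3` module) as a stand-in, and the table name `users` and index name `idx_users_country_id` are made up for illustration. It only shows that one (country_code, id) index can serve both the count and the ordered, paginated listing.

```python
import sqlite3

# Assumption: SQLite stands in for OrientDB; table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, country_code TEXT, type TEXT)")
rows = [(f"u{i:05d}", "KR" if i % 2 == 0 else "JP", "normal") for i in range(1000)]
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)

# One composite index: filter on country_code, then rows come back ordered by id.
conn.execute("CREATE INDEX idx_users_country_id ON users (country_code, id)")


def plan(sql, params=()):
    """Return the plan text SQLite reports for a statement."""
    cur = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(row[3] for row in cur.fetchall())


count_sql = "SELECT count(*) FROM users WHERE country_code = ?"
page_sql = (
    "SELECT country_code, id, type FROM users "
    "WHERE country_code = ? ORDER BY id LIMIT 10 OFFSET 20"
)

kr_count = conn.execute(count_sql, ("KR",)).fetchone()[0]
page = conn.execute(page_sql, ("KR",)).fetchall()

print(kr_count)                  # number of KR users
print(page[0])                   # first row of the third page
print(plan(count_sql, ("KR",)))  # plan for the count
print(plan(page_sql, ("KR",)))   # plan for the paginated query
```

In this stand-in, both plans should mention the composite index. Note that a large OFFSET still has to walk past the skipped index entries, which matches the observation that the query slows down as SKIP grows.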

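Problem:
Conditions: we have 200 countries with 2 million users each, and every user has two properties, id and country code.
Requirement: find the users of each country, sorted by id.
SQL statements that need to be executed: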

Introduction to indexes:
https://www.cnblogs.com/aspwebchh/p/6652855.html
https://blog.csdn.net/weixin_34388207/article/details/88717200

SQL statement optimization:
http://itindex.net/detail/58564-mysql-%E5%8D%83%E4%B8%87-%E5%A4%A7%E6%95%B0%E6%8D%AE

Balanced trees:
https://www.imooc.com/article/73156
https://blog.csdn.net/yiye2017zhangmu/article/details/81516337

Day2

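What is an index? What kinds of indexes are there?

Ordered indexes and hash indexes. An ordered index is built on a sorted sequence of the key values; a hash index distributes the values evenly across a number of hash buckets.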
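
The difference can be sketched in a few lines of Python. This is only an illustration with made-up keys, not how any particular database stores its indexes: the ordered index is a sorted list searched with `bisect`, and the hash index is a fixed number of buckets (`NUM_BUCKETS` is an arbitrary choice).

```python
import bisect

user_ids = ["u07", "u02", "u11", "u05", "u09"]  # hypothetical keys

# Ordered index: keys kept sorted, so equality AND range lookups use binary search.
ordered_index = sorted(user_ids)


def range_lookup(lo, hi):
    """Return all keys k with lo <= k <= hi from the ordered index."""
    start = bisect.bisect_left(ordered_index, lo)
    end = bisect.bisect_right(ordered_index, hi)
    return ordered_index[start:end]


# Hash index: keys scattered across buckets by hash value; suited to equality only.
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]
for key in user_ids:
    buckets[hash(key) % NUM_BUCKETS].append(key)


def hash_lookup(key):
    """Check one bucket for an exact key match."""
    return key in buckets[hash(key) % NUM_BUCKETS]


print(range_lookup("u05", "u09"))
print(hash_lookup("u11"))
```

A range query such as "all keys between u05 and u09" is natural on the ordered index, while the hash index would have to scan every bucket to answer it.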

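Ordered indexes: clustered and non-clustered indexes
A clustered index is the index built on the primary key; looking up through it leads directly to each complete row.
A non-clustered index is built on a key that is not the primary key but that we search by frequently. Looking up through it returns the primary key, and the primary key is then used to fetch the actual row.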
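
The two-step lookup through a non-clustered index can be sketched with plain dictionaries. This is a toy model with hypothetical rows, assuming the clustered index maps the primary key to the full row and the secondary index stores primary keys only.

```python
# Clustered index (toy model): primary key -> full row.
rows_by_id = {
    "u1": {"id": "u1", "country_code": "KR", "type": "normal"},
    "u2": {"id": "u2", "country_code": "JP", "type": "normal"},
    "u3": {"id": "u3", "country_code": "KR", "type": "admin"},
}

# Non-clustered index (toy model): country_code -> list of primary keys only.
country_index = {}
for pk, row in rows_by_id.items():
    country_index.setdefault(row["country_code"], []).append(pk)


def find_by_country(code):
    """Look up rows by country in two steps."""
    pks = country_index.get(code, [])      # step 1: secondary index yields primary keys
    return [rows_by_id[pk] for pk in pks]  # step 2: each primary key fetches the full row


print(find_by_country("KR"))
```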

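Hash indexes
Mainly hashing.

Most indexes rely on balanced trees:

A brief note on balanced binary trees:
a. left < root < right; this ordering is what makes searching possible
b. the heights of the left and right subtrees must never differ by more than 1
c. rotations are performed to maintain b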
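
Rules a to c can be sketched as a small AVL-style insert. This is a minimal illustration rather than a production tree: it handles insertion only, ignores duplicate keys, and all names (`Node`, `insert`, `rotate_left`, and so on) are chosen here for the example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height of the subtree rooted at this node


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance(node):
    # Rule b: this value must stay within -1..1.
    return height(node.left) - height(node.right)


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def insert(node, key):
    # Rule a: smaller keys go left, larger keys go right.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicate key: nothing to do
    update(node)
    b = balance(node)
    # Rule c: rotate when the height difference exceeds 1.
    if b > 1:
        if key > node.left.key:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if b < -1:
        if key < node.right.key:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


root = None
for k in range(1, 8):  # ascending inserts would degrade a plain BST into a list
    root = insert(root, k)

print(root.key, root.height)
print(in_order(root))
```

Without the rotations, inserting 1 through 7 in order would produce a chain of height 7; with them the tree stays at height 3.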

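Balanced trees
1. Red-black tree 2. AVL tree 3. SBT 4. Treap 5. Splay tree
https://blog.csdn.net/weixin_34388207/article/details/88717200
There is a lot in this article that I do not fully understand yet; it needs further study.

[Remember: every index you create takes a lot of extra space, because the tree has to hold every value of the indexed key. The advice is to keep it to no more than 6 indexes.]
[Adding an index can slow down writes, because every newly inserted row has to keep the balanced binary tree balanced.]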

Why does an index speed things up?
A lookup goes from O(n) to O(lg(n)), so it is bound to be faster.
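
To get a feel for the gap, the sketch below counts comparisons for a linear scan versus a binary search over one million sorted keys. The function names are made up for this example, and binary search over a sorted list is used as a simple stand-in for a tree lookup.

```python
def linear_search_steps(sorted_keys, target):
    """Count comparisons made by a front-to-back scan."""
    steps = 0
    for key in sorted_keys:
        steps += 1
        if key == target:
            break
    return steps


def binary_search_steps(sorted_keys, target):
    """Count comparisons made by binary search."""
    lo, hi, steps = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            break
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


keys = list(range(1_000_000))
linear_steps = linear_search_steps(keys, 999_999)
binary_steps = binary_search_steps(keys, 999_999)
print(linear_steps, binary_steps)
```

For one million keys the scan needs up to a million comparisons, while binary search needs at most about 20, since 2^20 is slightly more than one million.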

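Do indexes slow down updates and inserts? If so, what can we do?
The approach I can think of for now is to simply drop the index, since I currently do not need to insert data in real time.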
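
One way to picture the drop-the-index approach is the sketch below. It again assumes SQLite as a stand-in for OrientDB, with hypothetical table and index names, and it adds one step that is my assumption rather than something stated in the notes: the index is recreated after the bulk load so that later reads can still use it.

```python
import sqlite3

# Assumption: SQLite stands in for OrientDB; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, country_code TEXT)")
conn.execute("CREATE INDEX idx_users_country ON users (country_code)")


def index_names():
    """List the indexes currently defined in the database."""
    cur = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return [row[0] for row in cur.fetchall()]


def bulk_load(rows):
    # Drop the index so the inserts do not pay for index maintenance row by row.
    conn.execute("DROP INDEX IF EXISTS idx_users_country")
    with conn:  # commit all inserts as one transaction
        conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    # Assumed extra step: rebuild the index once, after all rows are in.
    conn.execute("CREATE INDEX idx_users_country ON users (country_code)")


bulk_load([(f"u{i}", "KR") for i in range(10_000)])
total = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(total, index_names())
```

This only makes sense when, as in these notes, inserts do not have to be visible to indexed queries in real time.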

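Could our scenario do without the known methods?

Index types supported by OrientDB:
=> http://www.vue5.com/orientdb/orientdb_indexes.html
They are basically the same as in relational databases.

Final solution:
Our current Vertex types (the counterpart of tables in a relational database) leave basically no room for optimization, so for now we have added indexes. In the future we may take the approach of splitting tables.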

CREATE INDEX A IF NOT EXISTS ON User (country) NOTUNIQUE
CREATE INDEX B IF NOT EXISTS ON User (country, id) UNIQUE

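The country property contains duplicate values, so its index is NOTUNIQUE. The purpose of this index is to count the users that have a given value of this property.

select count(*) from User where country_code = 'KR'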

FETCH FROM INDEX UserCountry
key = 'KR'
CALCULATE AGGREGATE PROJECTIONS
count(*) AS _$$$OALIAS$$_0
GUARANTEE FOR ZERO COUNT
CALCULATE PROJECTIONS
_$$$OALIAS$$_0 AS count(*)

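Judging from how this statement is explained in the execution plan,
it is equivalent to
select count(*) from INDEX:UserCountry where key = 'KR'

select * from User where country_code = 'KR' and id like '%a%' order by id SKIP 100 limit 10
This statement is planned as a lookup on the composite index.

In my tests, queries that use * ran faster in OrientDB.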

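Supplement:
group by resembles where in that it works on column values, but instead of filtering rows it partitions them into groups and returns one aggregated row per group.

select sum(age), gender group by gender

returns the total age of the men
returns the total age of the women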
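
The group by example can be run as follows. This is a small sketch using SQLite through Python's `sqlite3` module; the `person` table and its four rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, gender TEXT, age INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("A", "male", 20), ("B", "female", 30), ("C", "male", 25), ("D", "female", 35)],
)

# One output row per distinct gender, each carrying the sum of ages in that group.
totals = dict(
    conn.execute("SELECT gender, sum(age) FROM person GROUP BY gender").fetchall()
)
print(totals)
```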

Joining multiple tables
When some of the data is not in the main table, an auxiliary table is needed for the query.
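
A minimal sketch of that idea, again with SQLite as a stand-in and invented tables: `users` has only the country code, so the country name is taken from an auxiliary `countries` table through a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, country_code TEXT)")  # hypothetical main table
conn.execute("CREATE TABLE countries (code TEXT, name TEXT)")    # hypothetical auxiliary table
conn.executemany("INSERT INTO users VALUES (?, ?)", [("u1", "KR"), ("u2", "JP")])
conn.executemany(
    "INSERT INTO countries VALUES (?, ?)", [("KR", "South Korea"), ("JP", "Japan")]
)

# The users table has no country name, so it is looked up in the auxiliary table.
joined = conn.execute(
    "SELECT u.id, c.name FROM users AS u "
    "JOIN countries AS c ON c.code = u.country_code "
    "ORDER BY u.id"
).fetchall()
print(joined)
```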
