NoSQL 数据库索引总结

最新推荐文章于 2024-07-28 18:18:03 发布

逍遥绝情

最新推荐文章于 2024-07-28 18:18:03 发布

阅读量4.2k

点赞数

分类专栏： AWS 数据库优化 DynamoDB 文章标签：索引 nosql 数据库 sql

AWS 同时被 3 个专栏收录

17 篇文章 2 订阅

订阅专栏

DynamoDB

6 篇文章 0 订阅

订阅专栏

数据库优化

5 篇文章 0 订阅

订阅专栏

SQL数据库索引

深度解析地址：
http://www.cnblogs.com/hustcat/archive/2009/10/28/1591648.html
- 索引分为聚簇索引和非聚簇索引两种，聚簇索引是按照数据存放的物理位置为顺序的，而非聚簇索引就不一样了；聚簇索引能提高多行检索的速度，而非聚簇索引对于单行的检索很快。
- 1.普通索引
- 这是最基本的索引，它没有任何限制，比如上文中为title字段创建的索引就是一个普通索引，MyIASM中默认的BTREE类型的索引，也是我们大多数情况下用到的索引。

//直接创建索引
CREATE INDEX index_name ON table(column(length))

//修改表结构的方式添加索引
ALTER TABLE table_name ADD INDEX index_name ON (column(length))

//创建表的时候同时创建索引
CREATE TABLE `table` (
`id` int(11) NOT NULL AUTO_INCREMENT ,
`title` char(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
`content` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL ,
`time` int(10) NULL DEFAULT NULL ,
PRIMARY KEY (`id`),
INDEX index_name (title(length))
)

//删除索引
DROP INDEX index_name ON table

2.唯一索引
- 与普通索引类似，不同的就是：索引列的值必须唯一，但允许有空值（注意和主键不同）。如果是组合索引，则列值的组合必须唯一，创建方法和普通索引类似。

//创建唯一索引
CREATE UNIQUE INDEX indexName ON table(column(length))

//修改表结构
ALTER TABLE table_name ADD UNIQUE indexName ON (column(length))

//创建表的时候直接指定
CREATE TABLE `table` (
    `id` int(11) NOT NULL AUTO_INCREMENT ,
    `title` char(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
    `content` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL ,
    `time` int(10) NULL DEFAULT NULL ,
    PRIMARY KEY (`id`),
    UNIQUE indexName (title(length))
);

3.全局索引（FULLTEXT）
- MySQL从3.23.23版开始支持全文索引和全文检索，FULLTEXT索引仅可用于 MyISAM 表；他们可以从CHAR、VARCHAR或TEXT列中作为CREATE TABLE语句的一部分被创建，或是随后使用ALTER TABLE 或CREATE INDEX被添加。对于较大的数据集，将你的资料输入一个没有FULLTEXT索引的表中，然后创建索引，其速度比把资料输入现有FULLTEXT索引的速度更为快。不过切记对于大容量的数据表，生成全文索引是一个非常消耗时间非常消耗硬盘空间的做法

//修改表结构添加全文索引
ALTER TABLE article ADD FULLTEXT index_content(content)
//直接创建索引
CREATE FULLTEXT INDEX index_content ON article(content)
//创建表的适合添加全文索引
CREATE TABLE `table` (
    `id` int(11) NOT NULL AUTO_INCREMENT ,
    `title` char(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL ,
    `content` text CHARACTER SET utf8 COLLATE utf8_general_ci NULL ,
    `time` int(10) NULL DEFAULT NULL ,
    PRIMARY KEY (`id`),
    FULLTEXT (content)
);

5.组合索引（最左前缀）
- 平时用的SQL查询语句一般都有比较多的限制条件，所以为了进一步榨取MySQL的效率，就要考虑建立组合索引。例如上表中针对title和time建立一个组合索引：ALTER TABLE article ADD INDEX index_titme_time (title(50),time(10))。建立这样的组合索引，其实是相当于分别建立了下面两组组合索引：

//使用到上面的索引
SELECT * FROM article WHREE title='测试' AND time=1234567890;

SELECT * FROM article WHREE utitle='测试';
//不使用上面的索引
SELECT * FROM article WHREE time=1234567890;

MySQL索引的优化
- 索引的缺点：
  - 虽然索引大大提高了查询速度，同时却会降低更新表的速度，如对表进行INSERT、UPDATE和DELETE。因为更新表时，MySQL不仅要保存数据，还要保存一下索引文件。
  - 建立索引会占用磁盘空间的索引文件。一般情况这个问题不太严重，但如果你在一个大表上创建了多种组合索引，索引文件的会膨胀很快。索引只是提高效率的一个因素，如果你的MySQL有大数据量的表，就需要花时间研究建立最优秀的索引，或优化查询语句。
- MySQL索引的注意事项和优化方法:
  - 1.索引不会包含有NULL值的列
  - 2.使用短索引
  - 3.索引列排序
    - MySQL查询只使用一个索引，因此如果where子句中已经使用了索引的话，那么order by中的列是不会使用索引的。因此数据库默认排序可以符合要求的情况下不要使用排序操作；尽量不要包含多个列的排序，如果需要最好给这些列创建复合索引。
  - 4.like语句操作
    - 一般情况下不鼓励使用like操作，如果非使用不可，如何使用也是一个问题。like “%aaa%” 不会使用索引而like “aaa%”可以使用索引。
  - 5.不要在列上进行运算
    - 例如：select * from users where YEAR(adddate)<2007，将在每个行上进行运算，这将导致索引失效而进行全表扫描，因此我们可以改成：select * from users where adddate<’2007-01-01′
最后总结一下，MySQL只对一下操作符才使用索引：<,<=,=,>,>=,between,in,以及某些时候的like(不以通配符%或_开头的情形)。而理论上每张表里面最多可创建16个索引，不过除非是数据量真的很多，否则过多的使用索引也不是那么好玩的，比如我刚才针对text类型的字段创建索引的时候，系统差点就卡死了

NoSQL数据库索引

MongoDB数据库

MongoDB默认会在所包含的所有集合的_id属性上创建一个索引。
注意解释函数的调用，这是一个查询性能的测量函数，输出也是一个文档：
- db.jihe.find().explain();
- Cursor: 游标用来返回查询结果集。
- Nscanned：被扫描的尸体总数，使用索引时对应索引实体的总数。
- nscannedObject：扫描的文档总数。
- N：返回的文档总数。
- Millis：查询的耗时，单位是毫秒。
- indexBounds:表示查新匹配的索引范围的最小建和最大键。
在MongoDB中，关键字ensureIndex完成了大部分创建索引的工作。如：
- db.集合名.ensureIndex( { 列名 : 1 (-1) } )，表示升序或者降序。
- 获得所有索引的列表使用getIndexs()函数。
- 重新建立索引使用reIndex()，也可以使用runCommand()，例如db.runCommand( { reIndex : ‘集合名’ } )。
- 可以使用dropIndex命令删除索引。
如果需要，可以使用hint方法强制查询使用特定的索引，
- 如强制使用降序：db.集合名.find( { movie_id : 1 } ).hint( { movie_id : -1} )。
可以创建组合索引.
- 如：db.集合名.ensureIndex( { 列名 : 1 (-1) , 列名 : 1 (-1) …..} )。创建后可以单用。
组合索引也可以包含嵌套字段。
可以通过显示声明来创建一个稀疏索引：
- db.集合名.ensureIndex( { 列名 : 1 (-1) , { sparse : true } } )。如果字段无值就会被忽略，所以稀疏索引可能没有引用集合中的所有文档。
可以创建唯一索引：
- db.集合名.ensureIndex( { 列名 : 1 (-1) , { unique : true } } )。
- 但是列名必须各不相同，否则不会创建，除非显式声明除第一个条目外，所有重复都可以丢弃。
- 这样的显示声明：db.集合名.ensureIndex( { 列名 : 1 (-1) , { unique : true , dropDups : true } } )。
- 如果列名为空，则保存null，但是只保存第一个。
组合索引:db.collection.ensureIndex({field:val1,field1:val2});
索引移除:db.collection.dropIndexes({});如果不设置将会remove掉所有的索引。_id索引例外.
dropDups：主要是用来和unique索引结合使用,如果在创建唯一索引时数据中已存在重复的数据,则可以通过dropDups:true来强制创建唯一索引.并且会remove重复的数据.

DynamoDB数据库

创建DynamoDB时，一定要指定一个hash键。如果不指定表的hash键，那么DynamoDB是不允许创建这张表的
DynamoDB的全局二级索引同样是由一个hash键和多个range键所组成
DynamoDB 支持两种不同的索引：
- 全局二级索引
  - 索引的主键可以是其表中的任意两个属性（可以在创建表时创建，也可以向现有表添加新全局二级索引，或者删除现有的全局二级索引）。
- 本地二级索引
  - 索引的分区键必须与其表的分区键相同。不过，排序键可以是任何其他属性（是在创建表的同时创建的。不能向现有表添加本地二级索引，也不能删除已存在的任何本地二级索引）。
DynamoDB 确保secondary index中的数据最终与其表保持一致。我们可以请求对表或local secondary index执行强一致性 Query 或 Scan 操作。但是，全局二级索引仅支持最终一致性。

创建含有全局二级索引的表
- 以下 Java 代码段创建一个保存天气数据信息的表。分区键为 Location，排序键为 Date。通过一个名为 PrecipIndex 的全局二级索引，可以快速访问各个位置的降水数据。

AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);


// Attribute definitions
ArrayList<AttributeDefinition> attributeDefinitions = new ArrayList<AttributeDefinition>();

attributeDefinitions.add(new AttributeDefinition()
    .withAttributeName("Location")
    .withAttributeType("S"));
attributeDefinitions.add(new AttributeDefinition()
    .withAttributeName("Date")
    .withAttributeType("S"));
attributeDefinitions.add(new AttributeDefinition()
    .withAttributeName("Precipitation")
    .withAttributeType("N"));

// Table key schema
ArrayList<KeySchemaElement> tableKeySchema = new ArrayList<KeySchemaElement>();
tableKeySchema.add(new KeySchemaElement()
    .withAttributeName("Location")
    .withKeyType(KeyType.HASH));  //Partition key
tableKeySchema.add(new KeySchemaElement()
    .withAttributeName("Date")
    .withKeyType(KeyType.RANGE));  //Sort key

// PrecipIndex
GlobalSecondaryIndex precipIndex = new GlobalSecondaryIndex()
    .withIndexName("PrecipIndex")
    .withProvisionedThroughput(new ProvisionedThroughput()
        .withReadCapacityUnits((long) 10)
        .withWriteCapacityUnits((long) 1))
        .withProjection(new Projection().withProjectionType(ProjectionType.ALL));

ArrayList<KeySchemaElement> indexKeySchema = new ArrayList<KeySchemaElement>();

indexKeySchema.add(new KeySchemaElement()
    .withAttributeName("Date")
    .withKeyType(KeyType.HASH));  //Partition key
indexKeySchema.add(new KeySchemaElement()
    .withAttributeName("Precipitation")
    .withKeyType(KeyType.RANGE));  //Sort key

precipIndex.setKeySchema(indexKeySchema);

CreateTableRequest createTableRequest = new CreateTableRequest()
    .withTableName("WeatherData")
    .withProvisionedThroughput(new ProvisionedThroughput()
        .withReadCapacityUnits((long) 5)
        .withWriteCapacityUnits((long) 1))
    .withAttributeDefinitions(attributeDefinitions)
    .withKeySchema(tableKeySchema)
    .withGlobalSecondaryIndexes(precipIndex);

Table table = dynamoDB.createTable(createTableRequest);
System.out.println(table.getDescription());

描述含有全局二级索引的表
- 要获取有关表中的全局二级索引的信息，请使用 DescribeTable。对于每个索引，您都可以查看其名称、键架构和投影的属性。

AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);

Table table = dynamoDB.getTable("WeatherData");
TableDescription tableDesc = table.describe();


Iterator<GlobalSecondaryIndexDescription> gsiIter = tableDesc.getGlobalSecondaryIndexes().iterator();
while (gsiIter.hasNext()) {
    GlobalSecondaryIndexDescription gsiDesc = gsiIter.next();
    System.out.println("Info for index "
         + gsiDesc.getIndexName() + ":");

    Iterator<KeySchemaElement> kseIter = gsiDesc.getKeySchema().iterator();
    while (kseIter.hasNext()) {
        KeySchemaElement kse = kseIter.next();
        System.out.printf("\t%s: %s\n", kse.getAttributeName(), kse.getKeyType());
    }
    Projection projection = gsiDesc.getProjection();
    System.out.println("\tThe projection type is: "
        + projection.getProjectionType());
    if (projection.getProjectionType().toString().equals("INCLUDE")) {
        System.out.println("\t\tThe non-key projected attributes are: "
            + projection.getNonKeyAttributes());
    }
}

查询全局二级索引
- 您可以对全局二级索引使用 Query（基本上与对表执行 Query 操作相同）。您需要指定索引名称、索引分区键和排序键（如果有）的查询条件以及要返回的属性。在此示例中，索引为 PrecipIndex，它具有分区键 Date 和排序键 Precipitation。该索引查询返回降水量大于零的特定日期的所有天气数据。

AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);

Table table = dynamoDB.getTable("WeatherData");
Index index = table.getIndex("PrecipIndex");

QuerySpec spec = new QuerySpec()
    .withKeyConditionExpression("#d = :v_date and Precipitation = :v_precip")
    .withNameMap(new NameMap()
        .with("#d", "Date"))
    .withValueMap(new ValueMap()
        .withString(":v_date","2013-08-10")
        .withNumber(":v_precip",0));

ItemCollection<QueryOutcome> items = index.query(spec);
Iterator<Item> iter = items.iterator(); 
while (iter.hasNext()) {
    System.out.println(iter.next().toJSONPretty());
}

以下 Java 代码示例说明如何使用全局二级索引。该示例创建了一个名为 Issues 的表，它可以在用于软件开发的简单错误跟踪系统中使用。分区键为 IssueId，排序键为 Title。此表中有三个全局二级索引：
- CreateDateIndex – 分区键为 CreateDate，排序键为 IssueId。除表键外，还会将属性 Description 和 Status 投影到索引中。
- TitleIndex – 分区键为 IssueId，排序键为 Title。不会将表键外的其他属性投影到索引中。
- DueDateIndex – 分区键为 DueDate，没有排序键。所有表属性都投影到索引中。
- 创建 Issues 表后，程序为该表加载表示软件错误报告的数据，然后使用全局二级索引查询这些数据。最后，程序会删除 Issues 表。

public class DocumentAPIGlobalSecondaryIndexExample {

    static AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
    static DynamoDB dynamoDB = new DynamoDB(client);

    public static String tableName = "Issues";

    public static void main(String[] args) throws Exception {

        createTable();
        loadData();

        queryIndex("CreateDateIndex");
        queryIndex("TitleIndex");
        queryIndex("DueDateIndex");

        deleteTable(tableName);

    }

    public static void createTable() {

        // Attribute definitions
        ArrayList<AttributeDefinition> attributeDefinitions = new ArrayList<AttributeDefinition>();

        attributeDefinitions.add(new AttributeDefinition().withAttributeName("IssueId").withAttributeType("S"));
        attributeDefinitions.add(new AttributeDefinition().withAttributeName("Title").withAttributeType("S"));
        attributeDefinitions.add(new AttributeDefinition().withAttributeName("CreateDate").withAttributeType("S"));
        attributeDefinitions.add(new AttributeDefinition().withAttributeName("DueDate").withAttributeType("S"));

        // Key schema for table
        ArrayList<KeySchemaElement> tableKeySchema = new ArrayList<KeySchemaElement>();
        tableKeySchema.add(new KeySchemaElement().withAttributeName("IssueId").withKeyType(KeyType.HASH)); // Partition
                                                                                                           // key
        tableKeySchema.add(new KeySchemaElement().withAttributeName("Title").withKeyType(KeyType.RANGE)); // Sort
                                                                                                          // key

        // Initial provisioned throughput settings for the indexes
        ProvisionedThroughput ptIndex = new ProvisionedThroughput().withReadCapacityUnits(1L)
            .withWriteCapacityUnits(1L);

        // CreateDateIndex
        GlobalSecondaryIndex createDateIndex = new GlobalSecondaryIndex().withIndexName("CreateDateIndex")
            .withProvisionedThroughput(ptIndex)
            .withKeySchema(new KeySchemaElement().withAttributeName("CreateDate").withKeyType(KeyType.HASH), // Partition
                                                                                                             // key
                new KeySchemaElement().withAttributeName("IssueId").withKeyType(KeyType.RANGE)) // Sort
                                                                                                // key
            .withProjection(
                new Projection().withProjectionType("INCLUDE").withNonKeyAttributes("Description", "Status"));

        // TitleIndex
        GlobalSecondaryIndex titleIndex = new GlobalSecondaryIndex().withIndexName("TitleIndex")
            .withProvisionedThroughput(ptIndex)
            .withKeySchema(new KeySchemaElement().withAttributeName("Title").withKeyType(KeyType.HASH), // Partition
                                                                                                        // key
                new KeySchemaElement().withAttributeName("IssueId").withKeyType(KeyType.RANGE)) // Sort
                                                                                                // key
            .withProjection(new Projection().withProjectionType("KEYS_ONLY"));

        // DueDateIndex
        GlobalSecondaryIndex dueDateIndex = new GlobalSecondaryIndex().withIndexName("DueDateIndex")
            .withProvisionedThroughput(ptIndex)
            .withKeySchema(new KeySchemaElement().withAttributeName("DueDate").withKeyType(KeyType.HASH)) // Partition
                                                                                                          // key
            .withProjection(new Projection().withProjectionType("ALL"));

        CreateTableRequest createTableRequest = new CreateTableRequest().withTableName(tableName)
            .withProvisionedThroughput(
                new ProvisionedThroughput().withReadCapacityUnits((long) 1).withWriteCapacityUnits((long) 1))
            .withAttributeDefinitions(attributeDefinitions).withKeySchema(tableKeySchema)
            .withGlobalSecondaryIndexes(createDateIndex, titleIndex, dueDateIndex);

        System.out.println("Creating table " + tableName + "...");
        dynamoDB.createTable(createTableRequest);

        // Wait for table to become active
        System.out.println("Waiting for " + tableName + " to become ACTIVE...");
        try {
            Table table = dynamoDB.getTable(tableName);
            table.waitForActive();
        }
        catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public static void queryIndex(String indexName) {

        Table table = dynamoDB.getTable(tableName);

        System.out.println("\n***********************************************************\n");
        System.out.print("Querying index " + indexName + "...");

        Index index = table.getIndex(indexName);

        ItemCollection<QueryOutcome> items = null;

        QuerySpec querySpec = new QuerySpec();

        if (indexName == "CreateDateIndex") {
            System.out.println("Issues filed on 2013-11-01");
            querySpec.withKeyConditionExpression("CreateDate = :v_date and begins_with(IssueId, :v_issue)")
                .withValueMap(new ValueMap().withString(":v_date", "2013-11-01").withString(":v_issue", "A-"));
            items = index.query(querySpec);
        }
        else if (indexName == "TitleIndex") {
            System.out.println("Compilation errors");
            querySpec.withKeyConditionExpression("Title = :v_title and begins_with(IssueId, :v_issue)")
                .withValueMap(new ValueMap().withString(":v_title", "Compilation error").withString(":v_issue", "A-"));
            items = index.query(querySpec);
        }
        else if (indexName == "DueDateIndex") {
            System.out.println("Items that are due on 2013-11-30");
            querySpec.withKeyConditionExpression("DueDate = :v_date")
                .withValueMap(new ValueMap().withString(":v_date", "2013-11-30"));
            items = index.query(querySpec);
        }
        else {
            System.out.println("\nNo valid index name provided");
            return;
        }

        Iterator<Item> iterator = items.iterator();

        System.out.println("Query: printing results...");

        while (iterator.hasNext()) {
            System.out.println(iterator.next().toJSONPretty());
        }

    }

    public static void deleteTable(String tableName) {

        System.out.println("Deleting table " + tableName + "...");

        Table table = dynamoDB.getTable(tableName);
        table.delete();

        // Wait for table to be deleted
        System.out.println("Waiting for " + tableName + " to be deleted...");
        try {
            table.waitForDelete();
        }
        catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public static void loadData() {

        System.out.println("Loading data into table " + tableName + "...");

        // IssueId, Title,
        // Description,
        // CreateDate, LastUpdateDate, DueDate,
        // Priority, Status

        putItem("A-101", "Compilation error", "Can't compile Project X - bad version number. What does this mean?",
            "2013-11-01", "2013-11-02", "2013-11-10", 1, "Assigned");

        putItem("A-102", "Can't read data file", "The main data file is missing, or the permissions are incorrect",
            "2013-11-01", "2013-11-04", "2013-11-30", 2, "In progress");

        putItem("A-103", "Test failure", "Functional test of Project X produces errors", "2013-11-01", "2013-11-02",
            "2013-11-10", 1, "In progress");

        putItem("A-104", "Compilation error", "Variable 'messageCount' was not initialized.", "2013-11-15",
            "2013-11-16", "2013-11-30", 3, "Assigned");

        putItem("A-105", "Network issue", "Can't ping IP address 127.0.0.1. Please fix this.", "2013-11-15",
            "2013-11-16", "2013-11-19", 5, "Assigned");

    }

    public static void putItem(

        String issueId, String title, String description, String createDate, String lastUpdateDate, String dueDate,
        Integer priority, String status) {

        Table table = dynamoDB.getTable(tableName);

        Item item = new Item().withPrimaryKey("IssueId", issueId).withString("Title", title)
            .withString("Description", description).withString("CreateDate", createDate)
            .withString("LastUpdateDate", lastUpdateDate).withString("DueDate", dueDate)
            .withNumber("Priority", priority).withString("Status", status);

        table.putItem(item);
    }

}