Cassandra Notes 13: Simple Use of the Cassandra API

贾诩是也

Category: Cassandra

Article link: https://blog.csdn.net/xiangxizhishi/article/details/78941503

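This article focuses on two things:
1. How to write the simplest possible Cassandra sample
2. How to analyze what that simple sample implies about the underlying model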

Step 1:
First, create a project and add the jars under cassandra/lib to its classpath.
Step 2:
Create a class with the following content:

    // Imports are omitted in this short listing. The Thrift classes (TSocket, TBinaryProtocol)
    // and the generated Cassandra classes (Cassandra.Client, ColumnPath, Column) must be on the classpath.
    public class SampleOne {
        static Cassandra.Client cassandraClient;

        private static void init() throws TTransportException {
            String server = "localhost";
            int port = 9160;

            /* Point a Thrift socket at the Cassandra server. */
            TTransport socket = new TSocket(server, port);

            /* Use the Thrift binary protocol over that socket. */
            TBinaryProtocol binaryProtocol = new TBinaryProtocol(socket, false, false);
            cassandraClient = new Cassandra.Client(binaryProtocol);

            /* Open the connection; report success only once open() has returned. */
            socket.open();
            System.out.println(" connected to " + server + ":" + port + ".");
        }

        public static void main(String[] args) throws TException, TimedOutException, InvalidRequestException, UnavailableException, NotFoundException {
            /* Initialize the connection. */
            init();

            /* Choose the keyspace to work in (roughly a database) and the row key. */
            String keyspace = "Keyspace1";
            String row = "row007";

            /* Build a column path: column family "Standard1", no super column, column "ahuaxuan". */
            ColumnPath col = new ColumnPath("Standard1", null, "ahuaxuan".getBytes());

            /* Insert, giving the keyspace, row, column path, and value.
             * The last two arguments are the timestamp and the consistency_level.
             * The timestamp decides which write wins when versions conflict (an idea from Bigtable);
             * the consistency_level says how many replicas must acknowledge the operation (an idea from Dynamo).
             */
            cassandraClient.insert(keyspace, row, col, "val1".getBytes(), 1, 1);

            /* Read the value back. The call mirrors insert: keyspace, row, column path,
             * and a consistency_level as the last argument. */
            Column column = cassandraClient.get(keyspace, row, col, 1).column;

            System.out.println("read row " + row);
            System.out.println("column name" + new String(column.name));
            System.out.println("column value" + ":" + new String(column.value));
            System.out.println("column timestamp" + ":" + (column.timestamp));
        }
    }


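That is all the code. The flow is very simple, and ahuaxuan has commented it heavily, so you should now have a rough idea of how Cassandra performs an insert and a get.
You can now run the code. If nothing goes wrong, you will see the following output:
 connected to localhost:9160.
read row row007
column nameahuaxuan
column value:val1
column timestamp:1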
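
The timestamp printed in that output is simply the value the client passed to insert. Cassandra uses column timestamps to choose between competing versions of the same column: the version with the larger timestamp wins. The following is a minimal sketch of that rule in plain Java. `TimestampedColumn` and `reconcile` are hypothetical names invented for this illustration and are not part of the Thrift API. How the sketch handles equal timestamps is a simplification, not a statement about Cassandra's behavior.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a stored column; not the Thrift-generated Column class. */
final class TimestampedColumn {
    final String name;
    final byte[] value;
    final long timestamp;

    TimestampedColumn(String name, String value, long timestamp) {
        this.name = name;
        this.value = value.getBytes(StandardCharsets.UTF_8);
        this.timestamp = timestamp;
    }

    String valueAsString() {
        return new String(value, StandardCharsets.UTF_8);
    }

    /** Keeps the version with the larger timestamp; on a tie this sketch keeps the first argument. */
    static TimestampedColumn reconcile(TimestampedColumn a, TimestampedColumn b) {
        return b.timestamp > a.timestamp ? b : a;
    }

    public static void main(String[] args) {
        TimestampedColumn first = new TimestampedColumn("ahuaxuan", "val1", 1L);
        TimestampedColumn second = new TimestampedColumn("ahuaxuan", "val2", 2L);
        TimestampedColumn winner = reconcile(first, second);
        System.out.println(winner.valueAsString() + " @ " + winner.timestamp); // val2 @ 2
    }
}
```

One practical consequence: a second write that reuses the constant timestamp 1 is not guaranteed to supersede the first, which is a good reason to pass something like System.currentTimeMillis() in real code.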

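As noted, the flow is simple and the code is short, but ahuaxuan did not write this example out of boredom. The real goal is to understand the model behind it.
The things in this code most likely to puzzle readers are keyspace, row, column, timestamp, and consistency_level.
The first four (keyspace, row, column, timestamp) concern how data is stored; as those familiar with it will know, these concepts come from Bigtable.
consistency_level, by contrast, controls how many replicas must respond to a read or write, an idea that comes from Dynamo. Because this example is the simplest possible one and does not run against a cluster, we set consistency_level aside for later and first look at what keyspace, row, column, and timestamp are.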

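1. First, what is a keyspace?
Open storage-conf.xml and find the <Keyspaces> XML node. It carries a description of keyspaces along these lines:
A ColumnFamily is the Cassandra concept closest to a table in a relational database, and a keyspace is a collection of ColumnFamilies. If a ColumnFamily is a table, then a keyspace can be thought of as a database.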

Here is a simple piece of configuration:

    <Keyspaces>
        <Keyspace Name="Keyspace1">
          <ColumnFamily CompareWith="BytesType" Name="Standard1"/>
          <ColumnFamily CompareWith="UTF8Type" Name="Standard2"/>
          <ColumnFamily CompareWith="TimeUUIDType" Name="StandardByUUID1"/>
          <ColumnFamily ColumnType="Super"
                        CompareWith="UTF8Type"
                        CompareSubcolumnsWith="UTF8Type"
                        Name="Super1"
                        Comment="A column family with supercolumns, whose column and subcolumn names are UTF8 strings"/>
        </Keyspace>
        <Keyspace Name="ahuaxuan">
          <ColumnFamily CompareWith="BytesType" Name="test1"/>
          <ColumnFamily CompareWith="UTF8Type" Name="test2"/>
          <ColumnFamily ColumnType="Super"
                        CompareWith="UTF8Type"
                        CompareSubcolumnsWith="UTF8Type"
                        Name="Super1"
                        Comment="A column family with supercolumns, whose column and subcolumn names are UTF8 strings"/>
        </Keyspace>
    </Keyspaces>


This configuration declares several keyspaces in our Cassandra instance, each of which contains several ColumnFamilies.
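
To make the containment relationship concrete, here is a small sketch that mirrors the configuration above as a map from keyspace name to its set of ColumnFamily names. `KeyspaceCatalog` and its methods are hypothetical helpers written for this illustration. Cassandra itself reads storage-conf.xml, not a structure like this.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory mirror of the <Keyspaces> section; not part of Cassandra. */
final class KeyspaceCatalog {
    private final Map<String, Set<String>> columnFamiliesByKeyspace = new LinkedHashMap<>();

    void define(String keyspace, String... columnFamilies) {
        columnFamiliesByKeyspace
                .computeIfAbsent(keyspace, k -> new LinkedHashSet<>())
                .addAll(Arrays.asList(columnFamilies));
    }

    boolean hasColumnFamily(String keyspace, String columnFamily) {
        Set<String> families = columnFamiliesByKeyspace.get(keyspace);
        return families != null && families.contains(columnFamily);
    }

    /** Builds the catalog described by the sample configuration. */
    static KeyspaceCatalog fromSampleConfig() {
        KeyspaceCatalog catalog = new KeyspaceCatalog();
        catalog.define("Keyspace1", "Standard1", "Standard2", "StandardByUUID1", "Super1");
        catalog.define("ahuaxuan", "test1", "test2", "Super1");
        return catalog;
    }

    public static void main(String[] args) {
        KeyspaceCatalog catalog = fromSampleConfig();
        System.out.println(catalog.hasColumnFamily("Keyspace1", "Standard1")); // true
        System.out.println(catalog.hasColumnFamily("ahuaxuan", "Standard1"));  // false
        System.out.println(catalog.hasColumnFamily("ahuaxuan", "Super1"));     // true
    }
}
```

Note that Super1 appears in both keyspaces without conflict: a ColumnFamily name only has to be unique within its own keyspace, just as two databases can each have a table with the same name.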
Looking back at our code, how does it use a ColumnFamily?
    /* Create a column path */
    ColumnPath col = new ColumnPath("Standard1", null, "ahuaxuan".getBytes());


This line specifies which ColumnFamily the data is stored in; Standard1 here is the first ColumnFamily in Keyspace1.
But one question remains:
    cassandraClient.insert(keyspace, row, col, "val1".getBytes(), 1, 1);
In this call the ColumnPath is only the third argument and the keyspace is the first, with a row sitting between them. So what role does the row play?
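
One way to picture the answer is as a chain of lookups: the keyspace selects a set of ColumnFamilies, the ColumnFamily selects a set of rows, the row key selects one row, and the column name selects one value inside that row. The sketch below models that path with nested maps. `AddressingSketch` is a hypothetical class for illustration only. It does not talk to Cassandra, and it leaves out timestamps and super columns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of the lookup path; it does not talk to Cassandra. */
final class AddressingSketch {
    // keyspace -> column family -> row key -> (column name -> value, sorted by column name)
    private final Map<String, Map<String, Map<String, TreeMap<String, String>>>> store = new HashMap<>();

    void insert(String keyspace, String row, String columnFamily, String column, String value) {
        store.computeIfAbsent(keyspace, k -> new HashMap<>())
             .computeIfAbsent(columnFamily, k -> new HashMap<>())
             .computeIfAbsent(row, k -> new TreeMap<>())
             .put(column, value);
    }

    /** Returns the stored value, or null when any step of the path is missing. */
    String get(String keyspace, String row, String columnFamily, String column) {
        Map<String, Map<String, TreeMap<String, String>>> families = store.get(keyspace);
        if (families == null) {
            return null;
        }
        Map<String, TreeMap<String, String>> rows = families.get(columnFamily);
        if (rows == null) {
            return null;
        }
        TreeMap<String, String> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    public static void main(String[] args) {
        AddressingSketch sketch = new AddressingSketch();
        sketch.insert("Keyspace1", "row007", "Standard1", "ahuaxuan", "val1");
        System.out.println(sketch.get("Keyspace1", "row007", "Standard1", "ahuaxuan")); // val1
        System.out.println(sketch.get("Keyspace1", "row008", "Standard1", "ahuaxuan")); // null
    }
}
```

In this picture the row key plays the part a primary key plays in a relational table: it identifies which record inside the ColumnFamily the column belongs to.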

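Continuing with a fuller example.
Step 1:
As before, create a project and add the jars under cassandra/lib to its classpath.
Step 2:
Create a class with the following content: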


    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.Column;
    import org.apache.cassandra.thrift.ColumnPath;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.InvalidRequestException;
    import org.apache.cassandra.thrift.NotFoundException;
    import org.apache.cassandra.thrift.TimedOutException;
    import org.apache.cassandra.thrift.UnavailableException;
    import org.apache.thrift.TException;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;
    import org.apache.thrift.transport.TTransportException;

    public class SampleOne {
        static Cassandra.Client cassandraClient;
        static TTransport socket;

        private static void init() throws TTransportException {
            String server = "192.168.1.129";
            // String server = "localhost";
            int port = 9160;

            /* Point a Thrift socket at the Cassandra server. */
            socket = new TSocket(server, port);

            /* Use the Thrift binary protocol over that socket. */
            TBinaryProtocol binaryProtocol = new TBinaryProtocol(socket, false, false);
            cassandraClient = new Cassandra.Client(binaryProtocol);

            /* Open the connection; report success only once open() has returned. */
            socket.open();
            System.out.println(" connected to " + server + ":" + port + ".");
        }

        public static void main(String[] args) throws TException, TimedOutException, InvalidRequestException, UnavailableException, NotFoundException {
            /* Initialize the connection. */
            init();
            try {
                /* Choose the keyspace to work in (roughly a database) and the row key. */
                String keyspace = "Keyspace1";
                String row = "employee";

                /* The ColumnFamily (the table-like container) to use. */
                String tableName = "Standard2";

                /* Insert a record. */
                insertOrUpdate(keyspace, tableName, row, "name", "happy birthday!", System.currentTimeMillis());
                /* Delete a record. */
                //delete(keyspace, tableName, row, "name", System.currentTimeMillis());
                /* Fetch a record. Insert and delete target the same column, so if the delete
                 * above is enabled this lookup may find nothing. */
                Column column = getByColumn(keyspace, tableName, row, "name");
                System.out.println("read row " + row);
                System.out.println("column name " + ":" + new String(column.name));
                System.out.println("column value" + ":" + new String(column.value));
                System.out.println("column timestamp" + ":" + (column.timestamp));
            } finally {
                close();
            }
        }

        /**
         * Inserts a column, or overwrites it if it already exists.
         */
        public static void insertOrUpdate(String keyspace, String tableName, String row, String columnName, String columnValue, long timeStamp)
                throws TException, TimedOutException, InvalidRequestException, UnavailableException, NotFoundException {
            /* Build a column path: the ColumnFamily plus the column name. */
            ColumnPath col = new ColumnPath(tableName);
            col.setColumn(columnName.getBytes());

            /* Insert, giving the keyspace, row, column path, and value, followed by the
             * timestamp and the consistency_level. The timestamp decides which write wins
             * when versions conflict (an idea from Bigtable); the consistency_level says how
             * many replicas must acknowledge the operation (an idea from Dynamo).
             * The original listing ignored columnValue and timeStamp; they are used here. */
            cassandraClient.insert(keyspace, row, col, columnValue.getBytes(), timeStamp, ConsistencyLevel.ONE);
        }

        /**
         * Deletes a column.
         */
        public static void delete(String keyspace, String tableName, String row, String columnName, long timeStamp)
                throws TException, TimedOutException, InvalidRequestException, UnavailableException, NotFoundException {
            /* Build a column path: the ColumnFamily plus the column name. */
            ColumnPath col = new ColumnPath(tableName);
            col.setColumn(columnName.getBytes());

            /* Delete, giving the keyspace, row, and column path, followed by the
             * timestamp and the consistency_level. */
            cassandraClient.remove(keyspace, row, col, timeStamp, ConsistencyLevel.ONE);
        }

        /**
         * Reads a single column. A read takes no timestamp, only a consistency_level.
         */
        public static Column getByColumn(String keyspace, String tableName, String row, String columnName)
                throws TException, TimedOutException, InvalidRequestException, UnavailableException, NotFoundException {
            /* Build a column path: the ColumnFamily plus the column name. */
            ColumnPath col = new ColumnPath(tableName);
            col.setColumn(columnName.getBytes());

            /* Query, giving the keyspace, row, column path, and consistency_level. */
            return cassandraClient.get(keyspace, row, col, ConsistencyLevel.ONE).column;
        }

        /**
         * Closes the current remote connection.
         */
        public static void close() {
            socket.close();
        }
    }
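
The warning in main about inserting and deleting the same record comes down to timestamps: a delete is itself a timestamped write, and it hides any value whose timestamp is not newer than the delete. The sketch below models that for a single row in plain Java. `TombstoneSketch` is a hypothetical class for illustration and is not how Cassandra stores data. Letting a delete win over a value with the same timestamp is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical single-row model of timestamped writes and deletes; not Cassandra code. */
final class TombstoneSketch {
    private static final class Cell {
        final String value;   // null marks a delete
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Cell> cells = new HashMap<>();

    void insert(String column, String value, long timestamp) {
        apply(column, new Cell(value, timestamp));
    }

    void remove(String column, long timestamp) {
        apply(column, new Cell(null, timestamp));
    }

    private void apply(String column, Cell incoming) {
        Cell current = cells.get(column);
        boolean newer = current == null || incoming.timestamp > current.timestamp;
        // Assumption of this sketch: on equal timestamps a delete wins over a value.
        boolean deleteWinsTie = current != null
                && incoming.timestamp == current.timestamp
                && incoming.value == null;
        if (newer || deleteWinsTie) {
            cells.put(column, incoming);
        }
    }

    /** Returns the live value, or null if the column is absent or deleted. */
    String get(String column) {
        Cell cell = cells.get(column);
        return cell == null ? null : cell.value;
    }

    public static void main(String[] args) {
        TombstoneSketch row = new TombstoneSketch();
        row.insert("name", "happy birthday!", 100L);
        row.remove("name", 200L);
        System.out.println(row.get("name")); // null: the delete is newer than the insert
        row.insert("name", "late write", 150L);
        System.out.println(row.get("name")); // null: the write is older than the delete
        row.insert("name", "back again", 300L);
        System.out.println(row.get("name")); // back again
    }
}
```

In the real client the "nothing found" case surfaces as a NotFoundException from get, which is why main declares it in its throws clause.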

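To make these terms easier to understand, let's look at Cassandra's data model.

Basic concepts of the Cassandra data model:
keyspace:
A container for ColumnFamilies, comparable to a schema or database in a relational database.
ColumnFamily:
A container for Columns, similar to a table in a relational database.

SuperColumn:
A special kind of Column whose value can contain multiple Columns.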


    { // This is a SuperColumn
        name: "李明杰",
        // It contains a series of Columns
        value: {
            street: {name: "street", value: "1234 x street", timestamp: 123456789},
            city: {name: "city", value: "san francisco", timestamp: 123456789},
            zip: {name: "zip", value: "94107", timestamp: 123456789},
        }
    }
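
The same structure can be sketched in Java as a named container holding a map of ordinary columns. `SimpleColumn` and `SimpleSuperColumn` are hypothetical classes written for this illustration, not the Thrift-generated Column and SuperColumn types. Values are kept as strings for readability, whereas the real API works with byte arrays.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of a super column; not the Thrift-generated classes. */
final class SuperColumnSketch {
    static final class SimpleColumn {
        final String name;
        final String value;
        final long timestamp;

        SimpleColumn(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    static final class SimpleSuperColumn {
        final String name;
        final Map<String, SimpleColumn> columns = new LinkedHashMap<>();

        SimpleSuperColumn(String name) {
            this.name = name;
        }

        SimpleSuperColumn add(String columnName, String value, long timestamp) {
            columns.put(columnName, new SimpleColumn(columnName, value, timestamp));
            return this;
        }

        /** Returns the value of a contained column, or null if it is not present. */
        String valueOf(String columnName) {
            SimpleColumn column = columns.get(columnName);
            return column == null ? null : column.value;
        }
    }

    /** Rebuilds the address example shown above. */
    static SimpleSuperColumn sampleAddress() {
        return new SimpleSuperColumn("李明杰")
                .add("street", "1234 x street", 123456789L)
                .add("city", "san francisco", 123456789L)
                .add("zip", "94107", 123456789L);
    }

    public static void main(String[] args) {
        SimpleSuperColumn address = sampleAddress();
        System.out.println(address.columns.size());   // 3
        System.out.println(address.valueOf("city"));  // san francisco
    }
}
```

As in the example, the super column itself carries only a name; the timestamps live on the columns it contains.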

Column:
The most basic unit in Cassandra, made up of a name, a value, and a timestamp.


    { // This is a Column
        name: "李明杰",
        value: "mydream.limj@gmali.com",
        timestamp: 123456789
    }

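Cassandra's data model is built mainly from the structures described above. Simple, isn't it? It really is, and the biggest benefit is that the API for reading and writing data is very simple too.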
