Interacting with Cassandra using the Datastax Java Driver

dnc8371

Original link: https://www.javacodegeeks.com/2018/04/interacting-with-cassandra-using-the-datastax-java-driver.html

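Today I'm back with more Cassandra and Java integration, this time focusing on the Datastax Java driver rather than Spring Data Cassandra, which I have already written about quite a lot. Spring Data actually uses the Datastax driver to interact with Cassandra, but adds some extra goodies on top of it. We don't want any of those today! We are going to use the Datastax driver directly, and at the end of the post, once we have seen how to use it, we will compare it against Spring Data.

This post assumes that you are already familiar with Cassandra and possibly Spring Data Cassandra. Since I have already written quite a few posts on this subject, I only touch on how Cassandra works where context is needed. If you don't have this background, I recommend reading Getting started with Spring Data Cassandra, where I obviously talk about using Spring Data Cassandra but also explain how Cassandra works in more depth than I do here. There is also the Datastax Academy, which provides some very useful resources for learning Cassandra yourself.

First things first, dependencies.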

<dependencies>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
  </dependency>

  <dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>3.4.0</version>
  </dependency>

  <dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-mapping</artifactId>
    <version>3.4.0</version>
  </dependency>

  <dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.4</version>
  </dependency>

  <dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.7</version>
  </dependency>
</dependencies>

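As always I am using Spring Boot; just because we are depriving ourselves of Spring Data doesn't mean we need to go completely cold turkey on all Spring libraries. The Datastax-related dependencies here are cassandra-driver-core and cassandra-driver-mapping. As the name suggests, cassandra-driver-core provides the core functionality for interacting with Cassandra, such as setting up a session and writing queries. cassandra-driver-mapping is not required to query Cassandra but does provide object mapping; together with the core driver it serves as an ORM rather than only allowing us to execute CQL statements.

Now that the dependencies are sorted, the next step is connecting to Cassandra so that we can actually start querying it.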

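import static com.datastax.driver.core.schemabuilder.SchemaBuilder.createKeyspace;
import static com.datastax.driver.mapping.NamingConventions.LOWER_CAMEL_CASE;
import static com.datastax.driver.mapping.NamingConventions.LOWER_SNAKE_CASE;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.mapping.DefaultNamingStrategy;
import com.datastax.driver.mapping.DefaultPropertyMapper;
import com.datastax.driver.mapping.MappingConfiguration;
import com.datastax.driver.mapping.MappingManager;
import com.datastax.driver.mapping.PropertyMapper;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration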
public class CassandraConfig {

  @Bean
  public Cluster cluster(
      @Value("${cassandra.host:127.0.0.1}") String host,
      @Value("${cassandra.cluster.name:cluster}") String clusterName,
      @Value("${cassandra.port:9042}") int port) {
    return Cluster.builder()
        .addContactPoint(host)
        .withPort(port)
        .withClusterName(clusterName)
        .build();
  }
  
  @Bean
  public Session session(Cluster cluster, @Value("${cassandra.keyspace}") String keyspace)
      throws IOException {
    final Session session = cluster.connect();
    setupKeyspace(session, keyspace);
    return session;
  }

  private void setupKeyspace(Session session, String keyspace) throws IOException {
    final Map<String, Object> replication = new HashMap<>();
    replication.put("class", "SimpleStrategy");
    replication.put("replication_factor", 1);
    session.execute(createKeyspace(keyspace).ifNotExists().with().replication(replication));
    session.execute("USE " + keyspace);
    //    String[] statements = split(IOUtils.toString(getClass().getResourceAsStream("/cql/setup.cql")), ";");
    //    Arrays.stream(statements).map(statement -> normalizeSpace(statement) + ";").forEach(session::execute);
  }

  @Bean
  public MappingManager mappingManager(Session session) {
    final PropertyMapper propertyMapper =
        new DefaultPropertyMapper()
            .setNamingStrategy(new DefaultNamingStrategy(LOWER_CAMEL_CASE, LOWER_SNAKE_CASE));
    final MappingConfiguration configuration =
        MappingConfiguration.builder().withPropertyMapper(propertyMapper).build();
    return new MappingManager(session, configuration);
  }
}

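There is a bit more setup code here compared to a similar setup using Spring Data (that class isn't even needed when combined with Spring Boot's auto-configuration), but the class itself is pretty simple. The basic setup of the Cluster and Session beans shown here is the bare minimum required for the application to work and will likely remain the same for any application you write. More methods are provided so you can add any extra configuration to suit your use case.

Using values from application.properties we set the host address, cluster name and port of the Cluster. The Cluster is then used to create a Session. There are two options to choose from when doing this: setting the default keyspace or not. If you want to set the default keyspace, all you need is the code below.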

@Bean
public Session session(Cluster cluster, @Value("${cassandra.keyspace}") String keyspace) throws IOException {
  final Session session = cluster.connect(keyspace);
  // any other setup
  return session;
}

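The keyspace is passed into the connect method, which creates a Session and then executes USE <keyspace>, thus setting the default keyspace. This relies on the keyspace existing before the session is created; if it does not, it will fail when executing the USE statement.

If you do not know whether the keyspace exists at startup, or you definitely want to create it dynamically based on the keyspace value from the properties file, then you need to call connect without specifying the keyspace. You will then need to create it yourself so you actually have something to use. To do this, make use of the createKeyspace method provided by SchemaBuilder. Below is the CQL statement that creates the keyspace.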

CREATE KEYSPACE IF NOT EXISTS <keyspace> WITH REPLICATION = { 'class':'SimpleStrategy', 'replication_factor':1 };

I have also added the keyspace code below again, as it is a bit far away now.

private void setupKeyspace(Session session, String keyspace) throws IOException {
  final Map<String, Object> replication = new HashMap<>();
  replication.put("class", "SimpleStrategy");
  replication.put("replication_factor", 1);
  session.execute(createKeyspace(keyspace).ifNotExists().with().replication(replication));
  session.execute("USE " + keyspace);
}

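The SchemaBuilder is nice and easy to use and looks very similar to the CQL as you go through it. We add an ifNotExists clause and set the replication factor by first calling with and then passing a Map<String, Object> into the replication method. This map needs to contain the class and replication factor; basically use the keys shown here but change the mapped values to whatever you need them to be. Don't forget to execute the statement, and then tell the session to use the keyspace that was just created. Unfortunately there isn't a nicer way to set the default keyspace manually; executing a USE statement is the only option.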
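To make the link between the replication map and the CQL shown earlier concrete, here is a small, driver-free sketch that renders the same map into the CREATE KEYSPACE text. It is only an illustration of what the resulting statement looks like; the KeyspaceCql class and its createKeyspaceCql method are hypothetical names and say nothing about how SchemaBuilder is implemented internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only; not part of the Datastax driver.
final class KeyspaceCql {

  // Renders a replication map into the CREATE KEYSPACE statement shown above.
  static String createKeyspaceCql(String keyspace, Map<String, Object> replication) {
    String options = replication.entrySet().stream()
        .map(e -> "'" + e.getKey() + "':" + literal(e.getValue()))
        .collect(Collectors.joining(", "));
    return "CREATE KEYSPACE IF NOT EXISTS " + keyspace
        + " WITH REPLICATION = { " + options + " };";
  }

  // Strings are single-quoted in CQL; numbers are written as-is.
  private static String literal(Object value) {
    return value instanceof Number ? value.toString() : "'" + value + "'";
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the insertion order of the options.
    Map<String, Object> replication = new LinkedHashMap<>();
    replication.put("class", "SimpleStrategy");
    replication.put("replication_factor", 1);
    System.out.println(createKeyspaceCql("my_keyspace", replication));
  }
}
```

In the real application you would still let SchemaBuilder produce and execute the statement; the sketch is just a way to see which map entry ends up where.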

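Following on from the two options regarding the default keyspace: if we choose not to set a default keyspace at all, then we need to prepend the keyspace onto each table we create and each query we execute. It isn't too hard to do, as Datastax provides ways to add keyspace names to queries as well as to entities for mapping. I won't go into this subject any further, but know that not setting the keyspace will not prevent your application from working if you have set everything else up correctly.

Once the keyspace is set we can get around to creating the tables. There are two possible ways to do this. One, execute some CQL statements, whether they are strings in your Java code or read from an external CQL script. Two, use the SchemaBuilder to create them.

Let's look at executing CQL statements first, or more precisely executing them from a CQL file. You might have noticed that I left some commented-out code in the original example; when uncommented, that code finds a file named setup.cql, reads out a single CQL statement, executes it and then moves on to the next statement. Here it is again.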

String[] statements = split(IOUtils.toString(getClass().getResourceAsStream("/cql/setup.cql")), ";");
Arrays.stream(statements).map(statement -> normalizeSpace(statement) + ";").forEach(session::execute);

Below is the CQL contained in the file to create the Cassandra table.

CREATE TABLE IF NOT EXISTS people_by_country(
  country TEXT,
  first_name TEXT,
  last_name TEXT,
  id UUID,
  age INT,
  profession TEXT,
  salary INT,
  PRIMARY KEY((country), first_name, last_name, id)
);

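The primary key consists of the country, first_name, last_name and id fields. The partition key consists of just the country field and the clustering columns are the remaining columns in the key; id is only included for uniqueness, as you can obviously have people with the same names. I go into the topic of primary keys in much more depth in my earlier post, Getting started with Spring Data Cassandra.

This code makes use of the commons-io and commons-lang3 dependencies. If we are not executing CQL in this way, then these dependencies can be removed (within the context of this post).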
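If you would rather keep the script approach but drop the two commons libraries, the split-and-normalize step can be approximated with the JDK alone. The sketch below is one possible way to do it, not the code from the original project: CqlScript and statements are hypothetical names, the script is passed in as a String to keep the example self-contained, and it assumes no ';' appears inside string literals or comments.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stdlib-only stand-in for the split/normalizeSpace calls above.
final class CqlScript {

  // Splits a script on ';', collapses whitespace runs and re-appends the ';'.
  // Blank fragments (for example a trailing newline) are dropped.
  static List<String> statements(String script) {
    return Arrays.stream(script.split(";"))
        .map(s -> s.trim().replaceAll("\\s+", " "))
        .filter(s -> !s.isEmpty())
        .map(s -> s + ";")
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    String script =
        "CREATE TABLE IF NOT EXISTS a(\n  id UUID PRIMARY KEY\n);\n\n"
            + "CREATE TABLE IF NOT EXISTS b(\n  id UUID PRIMARY KEY\n);\n";
    // Each element is one single-line statement, ready for session.execute.
    statements(script).forEach(System.out::println);
  }
}
```

Each returned string could then be handed to session::execute, exactly as in the commented-out snippet.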

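What about using the SchemaBuilder? I haven't included any code to create a table in the original snippet because I was playing around and trying to figure out the nicest place to put it; for now I have stuck it in the repository, but I'm still not convinced that is the perfect place for it. Anyway, I will paste the code here so we can look at it now, and then we can skip over it later when it makes a reappearance.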

private void createTable(Session session) {
  session.execute(
      SchemaBuilder.createTable(TABLE)
          .ifNotExists()
          .addPartitionKey("country", text())
          .addClusteringColumn("first_name", text())
          .addClusteringColumn("last_name", text())
          .addClusteringColumn("id", uuid())
          .addColumn("age", cint())
          .addColumn("profession", text())
          .addColumn("salary", cint()));
}

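This matches up quite nicely with the CQL shown above. We define the different column types using addPartitionKey and addClusteringColumn to create our primary key, and addColumn for the standard fields. There are plenty of other methods, such as addStaticColumn, and withOptions, which then allows you to call clusteringOrder to define the sorting direction of your clustering columns. The order in which you call these methods is very important, as the partition key and clustering columns will be created in the order in which their respective methods are called. Datastax also provides the DataType class to make defining the column types easier; for example, text matches TEXT and cint matches INT. As with the last time we used SchemaBuilder, once we are happy with the table design we need to execute it.

Onto the MappingManager; the snippet to create the bean is below.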

@Bean
public MappingManager mappingManager(Session session) {
  final PropertyMapper propertyMapper =
      new DefaultPropertyMapper()
          .setNamingStrategy(new DefaultNamingStrategy(LOWER_CAMEL_CASE, LOWER_SNAKE_CASE));
  final MappingConfiguration configuration =
      MappingConfiguration.builder().withPropertyMapper(propertyMapper).build();
  return new MappingManager(session, configuration);
}

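The MappingManager bean comes from the cassandra-driver-mapping dependency and will map a ResultSet to an entity (which we will look at later). For now we just need to create the bean. If we aren't happy with the default naming strategy of converting Java camel case to all lowercase with no separators in Cassandra, we will need to set our own. To do this we can pass in a DefaultNamingStrategy to define the case that we are using within our Java classes and what we are using in Cassandra. Since in Java it is typical to use camel case we pass in LOWER_CAMEL_CASE, and since I like to use snake case in Cassandra we can use LOWER_SNAKE_CASE (these are found in the NamingConventions class). The reference to lower specifies the case of the first character in a string, so LOWER_CAMEL_CASE represents firstName and UPPER_CAMEL_CASE represents FirstName. DefaultPropertyMapper comes with extra methods for more specific configuration, but MappingConfiguration only has one job: taking in a PropertyMapper to be passed to a MappingManager.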
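To show what the lower camel case to lower snake case translation amounts to, here is a minimal, driver-free sketch. NamingSketch and lowerCamelToLowerSnake are hypothetical names, and the driver's own DefaultNamingStrategy may well treat edge cases (acronyms, digits and so on) differently; the point is only to show how a field such as firstName lines up with a column such as first_name.

```java
// Hypothetical illustration of the lower camel case -> lower snake case mapping;
// this is not the driver's implementation.
final class NamingSketch {

  static String lowerCamelToLowerSnake(String fieldName) {
    StringBuilder column = new StringBuilder();
    for (char c : fieldName.toCharArray()) {
      if (Character.isUpperCase(c)) {
        // An upper-case letter starts a new word: add a separator, then lower-case it.
        if (column.length() > 0) {
          column.append('_');
        }
        column.append(Character.toLowerCase(c));
      } else {
        column.append(c);
      }
    }
    return column.toString();
  }

  public static void main(String[] args) {
    System.out.println(lowerCamelToLowerSnake("firstName"));
    System.out.println(lowerCamelToLowerSnake("country"));
  }
}
```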

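The next thing we should look at is the entity that will be persisted to and retrieved from Cassandra, saving us the effort of manually setting values for inserts and converting results from reads. The Datastax driver gives us a relatively simple way to do just that, by using annotations to mark properties such as the name of the table it maps to, which field matches which Cassandra column, and which fields the primary key consists of.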

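import com.datastax.driver.mapping.annotations.ClusteringColumn;
import com.datastax.driver.mapping.annotations.PartitionKey;
import com.datastax.driver.mapping.annotations.Table;
import java.util.UUID;

@Table(name = "people_by_country")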
public class Person {

  @PartitionKey
  private String country;

  @ClusteringColumn
  private String firstName;

  @ClusteringColumn(1)
  private String lastName;

  @ClusteringColumn(2)
  private UUID id;

  private int age;
  private String profession;
  private int salary;

  private Person() {

  }

  public Person(String country, String firstName, String lastName, UUID id, int age, String profession, int salary) {
    this.country = country;
    this.firstName = firstName;
    this.lastName = lastName;
    this.id = id;
    this.age = age;
    this.profession = profession;
    this.salary = salary;
  }

  // getters and setters for each property

  // equals, hashCode, toString
}

This entity represents the people_by_country table, as denoted by @Table. I have put the CQL of the table below again for reference.

CREATE TABLE IF NOT EXISTS people_by_country(
  country TEXT,
  first_name TEXT,
  last_name TEXT,
  id UUID,
  age INT,
  profession TEXT,
  salary INT,
  PRIMARY KEY((country), first_name, last_name, id)
);

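The @Table annotation must specify the name of the table the entity represents. It also comes with various other options depending on your requirements, such as keyspace, if you don't want to use the default keyspace the Session bean is configured with, and caseSensitiveTable, which is self-explanatory.

What about the primary key? As touched on above, a primary key consists of a partition key, which itself contains one or more columns, and/or clustering column(s). To match up to the Cassandra table defined above, we added the @PartitionKey and @ClusteringColumn annotations to the required fields. Both annotations have one property, value, which specifies the order in which the column appears in the primary key. The default value is 0, which is why some annotations do not include a value.

The last requirements to get this entity to work are getters, setters and a default constructor so that the mapper can do its thing. The default constructor can be private if you don't want anyone accessing it, as the mapper uses reflection to retrieve it. You might not want to have setters on your entity because you would like the object to be immutable; unfortunately there isn't anything you can really do about this and you will just have to concede this fight. Personally I think this is fine, as you could (and maybe should) convert the entity into another object that can be passed around your application without any of the entity annotations and therefore without any knowledge of the database itself. The entity can then be left mutable and the other object that you pass around can work exactly as you wish.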
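As a rough sketch of that last idea, the mutable entity can be copied into an immutable object before it leaves the persistence layer. Everything below is an assumption made for illustration: PersonEntity is a cut-down stand-in for the mapped entity (driver annotations and most fields are left out so it compiles on its own), and PersonView with its from method is a hypothetical name for the object the rest of the application would see.

```java
import java.util.UUID;

// Stand-in for the mapped entity: mutable, with a no-arg constructor and setters.
final class PersonEntity {
  private String firstName;
  private UUID id;
  private int salary;

  PersonEntity() {}

  String getFirstName() { return firstName; }
  void setFirstName(String firstName) { this.firstName = firstName; }
  UUID getId() { return id; }
  void setId(UUID id) { this.id = id; }
  int getSalary() { return salary; }
  void setSalary(int salary) { this.salary = salary; }
}

// Hypothetical immutable object handed to the rest of the application.
final class PersonView {
  private final String firstName;
  private final UUID id;
  private final int salary;

  private PersonView(String firstName, UUID id, int salary) {
    this.firstName = firstName;
    this.id = id;
    this.salary = salary;
  }

  // Copies the entity's current state; later changes to the entity are not seen.
  static PersonView from(PersonEntity entity) {
    return new PersonView(entity.getFirstName(), entity.getId(), entity.getSalary());
  }

  String getFirstName() { return firstName; }
  UUID getId() { return id; }
  int getSalary() { return salary; }

  public static void main(String[] args) {
    PersonEntity entity = new PersonEntity();
    entity.setFirstName("John");
    entity.setId(UUID.randomUUID());
    entity.setSalary(100000);

    PersonView view = PersonView.from(entity);
    entity.setSalary(0); // mutating the entity does not affect the view
    System.out.println(view.getFirstName() + " " + view.getSalary());
  }
}
```

Whether the conversion lives in the repository or in a separate mapper class is a design choice; the sketch only shows that the entity can stay mutable for the driver while callers get something they cannot change.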

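One last thing I want to mention before we move on. Remember the DefaultNamingStrategy we defined earlier? It means that our fields are matched to the correct columns without any extra work in the entity. If you didn't do this, or wanted to provide a field name different from your column name, then you could use the @Column annotation and specify it there.

We nearly have all the components we need to build our example application. The penultimate component is a repository that will contain all the logic for persisting data to and reading data from Cassandra. We will make use of the MappingManager bean that we created earlier and the annotations that we put onto the entity to convert a ResultSet into an entity without needing to do anything else ourselves.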

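import static com.datastax.driver.core.querybuilder.QueryBuilder.eq;
import static com.datastax.driver.core.querybuilder.QueryBuilder.select;

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.mapping.Mapper;
import com.datastax.driver.mapping.MappingManager;
import java.util.List;
import java.util.UUID;
import org.springframework.stereotype.Repository;

@Repository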
public class PersonRepository {

  private Mapper<Person> mapper;
  private Session session;

  private static final String TABLE = "people_by_country";

  public PersonRepository(MappingManager mappingManager) {
    createTable(mappingManager.getSession());
    this.mapper = mappingManager.mapper(Person.class);
    this.session = mappingManager.getSession();
  }

  private void createTable(Session session) {
    // use SchemaBuilder to create table
  }

  public Person find(String country, String firstName, String secondName, UUID id) {
    return mapper.get(country, firstName, secondName, id);
  }

  public List<Person> findAll() {
    final ResultSet result = session.execute(select().all().from(TABLE));
    return mapper.map(result).all();
  }

  public List<Person> findAllByCountry(String country) {
    final ResultSet result = session.execute(select().all().from(TABLE).where(eq("country", country)));
    return mapper.map(result).all();
  }

  public void delete(String country, String firstName, String secondName, UUID id) {
    mapper.delete(country, firstName, secondName, id);
  }

  public Person save(Person person) {
    mapper.save(person);
    return person;
  }
}

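By injecting the MappingManager via the constructor and calling the mapper method for the Person class, we are returned a Mapper<Person> that will personally handle all our mapping needs. We also need to retrieve the Session to be able to execute queries, which is nicely contained within the MappingManager we injected.

For three of the queries we rely directly on the mapper to interact with Cassandra, but this only works for a single record. get and delete each work by accepting the values that make up the Person entity's primary key, and they must be entered in the correct order or you will experience unexpected results or exceptions will be thrown; save takes the entity itself.

The other situations require a query to be executed before the mapper can be called to convert the returned ResultSet into an entity or a collection of entities. I have made use of QueryBuilder to write the queries, and I have also chosen for this post not to write prepared statements. In most cases you should be using prepared statements, but I thought I would cover these in a separate post in the future; they are similar enough, and QueryBuilder can still be used, so I am sure you could figure it out on your own if needed.

QueryBuilder provides static methods to create select, insert, update and delete statements, which can then be chained together (I know this sounds obvious) to build the query. The QueryBuilder used here is the same one that you can use in Spring Data Cassandra when you need to manually create your own queries rather than relying on the inferred queries from the Cassandra repositories.

The final step in creating this little application is actually running it. Since we are using Spring Boot, we just add the standard @SpringBootApplication and run the class. I have done just that below, as well as using CommandLineRunner to execute the methods within the repository so we can check that they are doing what we expect.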

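import java.util.UUID;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication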
public class Application implements CommandLineRunner {

  @Autowired
  private PersonRepository personRepository;

  public static void main(String[] args) {
    // Pass the command-line arguments through to Spring Boot.
    SpringApplication.run(Application.class, args);
  }

  @Override
  public void run(String... args) {

    final Person bob = new Person("UK", "Bob", "Bobbington", UUID.randomUUID(), 50, "Software Developer", 50000);

    final Person john = new Person("UK", "John", "Doe", UUID.randomUUID(), 30, "Doctor", 100000);

    personRepository.save(bob);
    personRepository.save(john);

    System.out.println("Find all");
    personRepository.findAll().forEach(System.out::println);

    System.out.println("Find one record");
    System.out.println(personRepository.find(john.getCountry(), john.getFirstName(), john.getLastName(), john.getId()));

    System.out.println("Find all by country");
    personRepository.findAllByCountry("UK").forEach(System.out::println);

    john.setProfession("Unemployed");
    john.setSalary(0);
    personRepository.save(john);
    System.out.println("Demonstrating updating a record");
    System.out.println(personRepository.find(john.getCountry(), john.getFirstName(), john.getLastName(), john.getId()));

    personRepository.delete(john.getCountry(), john.getFirstName(), john.getLastName(), john.getId());
    System.out.println("Demonstrating deleting a record");
    System.out.println(personRepository.find(john.getCountry(), john.getFirstName(), john.getLastName(), john.getId()));
  }
}

The run method contains some print lines so we can see what is happening; below is what they output.

Find all
Person{country='US', firstName='Alice', lastName='Cooper', id=e113b6c2-5041-4575-9b0b-a0726710e82d, age=45, profession='Engineer', salary=1000000}
Person{country='UK', firstName='Bob', lastName='Bobbington', id=d6af6b9a-341c-4023-acb5-8c22e0174da7, age=50, profession='Software Developer', salary=50000}
Person{country='UK', firstName='John', lastName='Doe', id=f7015e45-34d7-4f25-ab25-ca3727df7759, age=30, profession='Doctor', salary=100000}

Find one record
Person{country='UK', firstName='John', lastName='Doe', id=f7015e45-34d7-4f25-ab25-ca3727df7759, age=30, profession='Doctor', salary=100000}

Find all by country
Person{country='UK', firstName='Bob', lastName='Bobbington', id=d6af6b9a-341c-4023-acb5-8c22e0174da7, age=50, profession='Software Developer', salary=50000}
Person{country='UK', firstName='John', lastName='Doe', id=f7015e45-34d7-4f25-ab25-ca3727df7759, age=30, profession='Doctor', salary=100000}

Demonstrating updating a record
Person{country='UK', firstName='John', lastName='Doe', id=f7015e45-34d7-4f25-ab25-ca3727df7759, age=30, profession='Unemployed', salary=0}

Demonstrating deleting a record
null

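We can see that findAll has returned all records and find has only retrieved the record matching the input primary key values. findAllByCountry has excluded Alice and only found the records from the UK. Calling save again on an existing record updates the record rather than inserting a new one. Finally, delete removes the person's data from the database (like deleting facebook?!?!).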
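The update-on-save behaviour follows from Cassandra writes being upserts: a row is identified by its full primary key, so writing the same key again overwrites the existing row. The sketch below mimics that with a plain in-memory map keyed by the primary key values. It is not driver code; UpsertSketch and its methods are hypothetical names, the id is a plain String and only the profession is stored, to keep the example short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a table keyed by its full primary key; not driver code.
final class UpsertSketch {
  private final Map<List<Object>, String> professionByKey = new HashMap<>();

  private static List<Object> key(String country, String firstName, String lastName, String id) {
    return Arrays.<Object>asList(country, firstName, lastName, id);
  }

  // Writing a row whose primary key already exists overwrites it (an upsert).
  void save(String country, String firstName, String lastName, String id, String profession) {
    professionByKey.put(key(country, firstName, lastName, id), profession);
  }

  // Returns null when no row has this primary key.
  String find(String country, String firstName, String lastName, String id) {
    return professionByKey.get(key(country, firstName, lastName, id));
  }

  void delete(String country, String firstName, String lastName, String id) {
    professionByKey.remove(key(country, firstName, lastName, id));
  }

  int size() {
    return professionByKey.size();
  }

  public static void main(String[] args) {
    UpsertSketch table = new UpsertSketch();
    table.save("UK", "John", "Doe", "id-1", "Doctor");
    table.save("UK", "John", "Doe", "id-1", "Unemployed"); // same key: row is replaced
    System.out.println(table.size() + " row, profession=" + table.find("UK", "John", "Doe", "id-1"));
    table.delete("UK", "John", "Doe", "id-1");
    System.out.println(table.find("UK", "John", "Doe", "id-1"));
  }
}
```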

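And that's a wrap.

I will try to write some follow-up posts in the future, as there are a few more interesting things we can do with the Datastax driver that we haven't covered in this post. What we have covered here should be enough to take your first steps with the driver and start querying Cassandra from your application.

Before we go, I would like to make a few comparisons between the Datastax driver and Spring Data Cassandra.

Support for creating tables is lacking in the Datastax driver (in my opinion) compared to Spring Data Cassandra. The fact that Spring Data is able to create your tables solely based on your entities removes all this extra effort of basically rewriting what you have already written. Obviously, if you don't want to use entity annotations then the difference goes away, as you will need to manually create the tables in both Datastax and Spring Data.

The way the entities are designed and the annotations used are also quite different. This point is tied closely to the previous one. Because Spring Data can create your tables for you, it has a greater need for more precise annotations that allow you to specify the design of your tables, such as the sorting order of clustering columns. This obviously can clutter up the class with a load of annotations, which is normally frowned upon.

Spring Data also provides better support for standard queries such as findAll and the inserting of a collection of entities. Obviously this is not exactly the end of the world and implementing these takes very little effort, but this pretty much sums up the main difference between the Datastax driver and Spring Data Cassandra.

Spring Data is just easier to use. I don't think there is really anything else to say on the subject. Since Spring Data Cassandra is built upon the Datastax driver, it can obviously do everything that the driver can, and if anything that you require is missing, you can just access the Datastax classes directly and do what you need. But the convenience that Spring Data provides shouldn't be overlooked, and I don't think I have even covered some of the more helpful parts that it provides, since this post only covers the basics. Don't even get me started on how much easier it is once you make use of Spring Boot's auto-configuration and the inferred queries that Cassandra repositories generate for you.

I should stop... This is turning into a rant.

In conclusion, using the Datastax driver to connect to and query a Cassandra database is relatively straightforward. Establish a connection to Cassandra, create the entities that you need and write the repositories that make use of the former, and you have everything you need to get going. We also compared the Datastax driver to Spring Data Cassandra, which pretty much comes down to this: Datastax will do what you need, but Spring Data makes it easier.

The code used in this post can be found on my GitHub.

If you found this post helpful and wish to keep up to date with my latest posts, you can follow me on Twitter at @LankyDanDev.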
