Configuration Notes for Connecting Spark to Cassandra

Cassandra Authentication Parameters

All parameters should be prefixed with spark.cassandra.

| Property Name | Default | Description |
| --- | --- | --- |
| auth.conf.factory | DefaultAuthConfFactory | Name of a Scala module or class implementing AuthConfFactory that provides custom authentication configuration |
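
Like every other option in this guide, the property is passed through SparkConf with the spark.cassandra. prefix. Below is a minimal sketch; the factory class name is a hypothetical placeholder and would have to be an implementation of the AuthConfFactory trait mentioned above, available on the classpath.

```scala
import org.apache.spark.SparkConf

// Minimal sketch: wiring in a custom authentication factory.
// "com.example.MyAuthConfFactory" is a hypothetical class that would
// implement the connector's AuthConfFactory trait.
val conf = new SparkConf()
  .setAppName("cassandra-auth-example")
  .set("spark.cassandra.auth.conf.factory", "com.example.MyAuthConfFactory")
```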

Cassandra Connection Parameters

All parameters should be prefixed with spark.cassandra.

| Property Name | Default | Description |
| --- | --- | --- |
| connection.compression | | Compression to use (LZ4, SNAPPY or NONE) |
| connection.factory | DefaultConnectionFactory | Name of a Scala module or class implementing CassandraConnectionFactory that provides connections to the Cassandra cluster |
| connection.host | localhost | Contact point to connect to the Cassandra cluster |
| connection.keep_alive_ms | 250 | Period of time to keep unused connections open |
| connection.local_dc | None | The local DC to connect to (other nodes will be ignored) |
| connection.port | 9042 | Cassandra native connection port |
| connection.reconnection_delay_ms.max | 60000 | Maximum period of time to wait before reconnecting to a dead node |
| connection.reconnection_delay_ms.min | 1000 | Minimum period of time to wait before reconnecting to a dead node |
| connection.timeout_ms | 5000 | Maximum period of time to attempt connecting to a node |
| query.retry.count | 10 | Number of times to retry a timed-out query |
| query.retry.delay | 4 * 1.5 | The delay between subsequent retries (can be constant, like 1000; linearly increasing, like 1000+100; or exponential, like 1000*2) |
| read.timeout_ms | 120000 | Maximum period of time to wait for a read to return |
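
A minimal sketch of overriding a few of these connection settings, assuming the spark-cassandra-connector package is on the classpath; the host, data center, keyspace and table names are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Minimal sketch: point the connector at a specific contact point and DC,
// raise the connection timeout, and enable LZ4 compression.
// "10.0.0.1", "DC1", "my_keyspace" and "my_table" are placeholders.
val conf = new SparkConf()
  .setAppName("cassandra-connection-example")
  .set("spark.cassandra.connection.host", "10.0.0.1")
  .set("spark.cassandra.connection.port", "9042")
  .set("spark.cassandra.connection.local_dc", "DC1")
  .set("spark.cassandra.connection.timeout_ms", "10000")
  .set("spark.cassandra.connection.compression", "LZ4")

val sc = new SparkContext(conf)
println(sc.cassandraTable("my_keyspace", "my_table").count())
```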

Cassandra DataFrame Source Parameters

All parameters should be prefixed with spark.cassandra.

| Property Name | Default | Description |
| --- | --- | --- |
| table.size.in.bytes | None | Used internally by DataFrames; will be updated in a future release to retrieve the size from C*. Can be set manually for now |
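
A minimal sketch of reading a Cassandra table through the DataFrame source, assuming Spark SQL and the connector are available; the host, keyspace, table name and size hint are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Minimal sketch: load a Cassandra table as a DataFrame.
// Until the connector retrieves the table size from C* itself, a manual size
// hint can be set on the SparkConf (the value and names below are placeholders).
val conf = new SparkConf()
  .setAppName("cassandra-dataframe-example")
  .set("spark.cassandra.connection.host", "10.0.0.1")
  .set("spark.cassandra.table.size.in.bytes", "1000000000")

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

val df = sqlContext.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table"))
  .load()
df.show()
```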

Cassandra SQL Context Options

All parameters should be prefixed with spark.cassandra.

| Property Name | Default | Description |
| --- | --- | --- |
| sql.cluster | default | Sets the default cluster to inherit configuration from |
| sql.keyspace | None | Sets the default keyspace |
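
A minimal sketch using the connector's CassandraSQLContext (connector 1.x line); the keyspace and table names are hypothetical, and unqualified table names in the query resolve against the default keyspace set here.

```scala
import org.apache.spark.sql.cassandra.CassandraSQLContext

// Minimal sketch: set the default keyspace so unqualified table names resolve.
// "my_keyspace" and "my_table" are placeholders; csc wraps an existing SparkContext sc.
val csc = new CassandraSQLContext(sc)
csc.setConf("spark.cassandra.sql.keyspace", "my_keyspace")
val rows = csc.sql("SELECT * FROM my_table LIMIT 10")
rows.show()
```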

Cassandra SSL Connection Options

All parameters should be prefixed with spark.cassandra.

| Property Name | Default | Description |
| --- | --- | --- |
| connection.ssl.enabled | false | Enable secure connection to the Cassandra cluster |
| connection.ssl.enabledAlgorithms | Set(TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA) | SSL cipher suites |
| connection.ssl.protocol | TLS | SSL protocol |
| connection.ssl.trustStore.password | None | Trust store password |
| connection.ssl.trustStore.path | None | Path to the trust store being used |
| connection.ssl.trustStore.type | JKS | Trust store type |
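
A minimal sketch enabling SSL; the trust store path and password are placeholders for your own values.

```scala
import org.apache.spark.SparkConf

// Minimal sketch: enable an encrypted connection backed by a JKS trust store.
// The path and password below are placeholders.
val conf = new SparkConf()
  .setAppName("cassandra-ssl-example")
  .set("spark.cassandra.connection.ssl.enabled", "true")
  .set("spark.cassandra.connection.ssl.protocol", "TLS")
  .set("spark.cassandra.connection.ssl.trustStore.type", "JKS")
  .set("spark.cassandra.connection.ssl.trustStore.path", "/etc/cassandra/truststore.jks")
  .set("spark.cassandra.connection.ssl.trustStore.password", "changeit")
```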

Read Tuning Parameters

All parameters should be prefixed with spark.cassandra.

| Property Name | Default | Description |
| --- | --- | --- |
| input.consistency.level | LOCAL_ONE | Consistency level to use when reading |
| input.fetch.size_in_rows | 1000 | Number of CQL rows fetched per driver request |
| input.metrics | true | Sets whether to record connector-specific metrics on read |
| input.split.size_in_mb | 64 | Approximate amount of data to be fetched into a single Spark partition |
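
A minimal sketch of tuning a full-table scan toward smaller Spark partitions and larger driver pages, assuming the connector's RDD extensions are imported; the keyspace and table names are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Minimal sketch: smaller Spark partitions and larger driver pages for a scan.
// "my_keyspace" and "my_table" are placeholders.
val conf = new SparkConf()
  .setAppName("cassandra-read-tuning")
  .set("spark.cassandra.input.split.size_in_mb", "32")       // ~32 MB of data per Spark partition
  .set("spark.cassandra.input.fetch.size_in_rows", "5000")   // CQL rows per driver request
  .set("spark.cassandra.input.consistency.level", "LOCAL_QUORUM")

val sc = new SparkContext(conf)
println(sc.cassandraTable("my_keyspace", "my_table").count())
```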

Write Tuning Parameters

All parameters should be prefixed with spark.cassandra.

| Property Name | Default | Description |
| --- | --- | --- |
| output.batch.grouping.buffer.size | 1000 | How many batches per single Spark task can be stored in memory before sending to Cassandra |
| output.batch.grouping.key | Partition | Determines how insert statements are grouped into batches. Available values are: none (a batch may contain any statements); replica_set (a batch may contain only statements to be written to the same replica set); partition (a batch may contain only statements for rows sharing the same partition key value) |
| output.batch.size.bytes | 1024 | Maximum total size of the batch in bytes. Overridden by spark.cassandra.output.batch.size.rows |
| output.batch.size.rows | None | Number of rows per single batch. The default is 'auto', which means the connector adjusts the number of rows based on the amount of data in each row |
| output.concurrent.writes | 5 | Maximum number of batches executed in parallel by a single Spark task |
| output.consistency.level | LOCAL_ONE | Consistency level for writing |
| output.metrics | true | Sets whether to record connector-specific metrics on write |
| output.throughput_mb_per_sec | 2.147483647E9 | (Floating points allowed) Maximum write throughput allowed per single core in MB/s. Limit this on long (8+ hour) runs to 70% of your max throughput as seen on a smaller job, for stability |
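
A minimal sketch of a throttled, partition-grouped write, assuming the connector's RDD extensions are imported; the keyspace, table, column names and throughput cap are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

// Minimal sketch: group batches by partition key, limit parallel batches,
// and cap per-core write throughput for a long-running job.
// "my_keyspace", "my_table" and the column names are placeholders.
val conf = new SparkConf()
  .setAppName("cassandra-write-tuning")
  .set("spark.cassandra.output.batch.grouping.key", "partition")
  .set("spark.cassandra.output.concurrent.writes", "5")
  .set("spark.cassandra.output.consistency.level", "LOCAL_QUORUM")
  .set("spark.cassandra.output.throughput_mb_per_sec", "5")  // e.g. ~70% of the max seen on a shorter run

val sc = new SparkContext(conf)
val data = sc.parallelize(Seq((1, "a"), (2, "b")))
data.saveToCassandra("my_keyspace", "my_table", SomeColumns("id", "value"))
```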