Storm Default Configuration: storm.yaml

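Settings whose names begin with "topology" can be specified when submitting a topology through StormSubmitter.

Storm 0.7.0 and later allow configuration to be overridden on a per-spout/per-bolt basis. The only settings that can be overridden this way are: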

  • "topology.debug"
  • "topology.max.spout.pending"
  • "topology.max.task.parallelism"
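  • "topology.kryo.register": this works a little differently from the others, because the registered serializations are available to all components in the topology.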

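The Java API offers two ways to specify component-specific configuration:

  • Internally: override getComponentConfiguration in any spout or bolt and return the component-specific configuration map.
  • Externally: in TopologyBuilder, setSpout and setBolt return an object with addConfiguration and addConfigurations methods, which can be used to override the component's configuration.

Configuration values are resolved in the following order of precedence, from lowest to highest:

defaults.yaml < storm.yaml < topology-specific configuration < internal component-specific configuration < external component-specific configuration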
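The precedence chain can be pictured as a series of map merges in which each later layer overwrites the keys of the earlier ones. The following is a minimal sketch using plain java.util maps, not Storm's own merge code; the class name ConfigPrecedenceSketch, the method names, and every value other than the defaults.yaml entries are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: mimics the documented precedence with plain maps.
// This is NOT Storm's internal implementation.
class ConfigPrecedenceSketch {

    // Merges layers ordered from lowest to highest precedence;
    // a key set in a later layer overwrites the same key from earlier layers.
    static Map<String, Object> resolve(List<Map<String, Object>> layersLowToHigh) {
        Map<String, Object> effective = new LinkedHashMap<>();
        for (Map<String, Object> layer : layersLowToHigh) {
            effective.putAll(layer);
        }
        return effective;
    }

    static Map<String, Object> demo() {
        // Values taken from the defaults.yaml listing in this article.
        Map<String, Object> defaultsYaml = new LinkedHashMap<>();
        defaultsYaml.put("topology.debug", false);
        defaultsYaml.put("topology.workers", 1);
        defaultsYaml.put("topology.max.spout.pending", null);

        // Hypothetical cluster-wide override in storm.yaml.
        Map<String, Object> stormYaml = new LinkedHashMap<>();
        stormYaml.put("topology.workers", 2);

        // Hypothetical topology-specific configuration passed at submit time.
        Map<String, Object> topologyConf = new LinkedHashMap<>();
        topologyConf.put("topology.max.spout.pending", 1000);

        // Hypothetical map returned by a component's getComponentConfiguration.
        Map<String, Object> internalComponent = new LinkedHashMap<>();
        internalComponent.put("topology.debug", true);
        internalComponent.put("topology.max.spout.pending", 500);

        // Hypothetical value set through addConfiguration on the declarer.
        Map<String, Object> externalComponent = new LinkedHashMap<>();
        externalComponent.put("topology.max.spout.pending", 100);

        return resolve(Arrays.asList(
                defaultsYaml, stormYaml, topologyConf, internalComponent, externalComponent));
    }

    public static void main(String[] args) {
        // prints {topology.debug=true, topology.workers=2, topology.max.spout.pending=100}
        System.out.println(demo());
    }
}
```

In this sketch the external component layer wins for topology.max.spout.pending, the internal component layer wins for topology.debug, and storm.yaml wins for topology.workers because no higher layer sets it.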

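Storm configuration notes

The following items must be modified:

  • storm.zookeeper.servers: the host names of the ZooKeeper cluster. Mind the YAML formatting: list entries need leading whitespace and every ":" must be followed by a space, otherwise Storm reports an error at startup.

  • nimbus.host: the node in the cluster that runs Nimbus. (Storm 1.x deprecates this key in favor of nimbus.seeds, which is the key that appears in the defaults listed below.)

  • supervisor.slots.ports: controls how many worker processes each supervisor node runs. The value is the list of ports that workers listen on, and the number of ports determines how many worker slots the supervisor node has. By default Storm uses ports 6700 to 6703, giving each supervisor node 4 worker slots.

  • storm.local.dir: where Storm stores the working files it produces at runtime. Avoid placing it under /tmp.

  • ui.port: the port the Storm UI listens on (8080 by default).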

Default configuration (defaults.yaml)

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


########### These all have default values as shown
########### Additional configuration goes into storm.yaml

java.library.path: "/usr/local/lib:/opt/local/lib:/usr/lib"

### storm.* configs are general configurations
# the local dir is where jars are kept
storm.local.dir: "storm-local"
storm.log4j2.conf.dir: "log4j2"
storm.zookeeper.servers:
    - "localhost"
storm.zookeeper.port: 2181
storm.zookeeper.root: "/storm"
storm.zookeeper.session.timeout: 20000
storm.zookeeper.connection.timeout: 15000
storm.zookeeper.retry.times: 5
storm.zookeeper.retry.interval: 1000
storm.zookeeper.retry.intervalceiling.millis: 30000
storm.zookeeper.auth.user: null
storm.zookeeper.auth.password: null
storm.exhibitor.port: 8080
storm.exhibitor.poll.uripath: "/exhibitor/v1/cluster/list"
storm.cluster.mode: "distributed" # can be distributed or local
storm.local.mode.zmq: false
storm.thrift.transport: "org.apache.storm.security.auth.SimpleTransportPlugin"
storm.thrift.socket.timeout.ms: 600000
storm.principal.tolocal: "org.apache.storm.security.auth.DefaultPrincipalToLocal"
storm.group.mapping.service: "org.apache.storm.security.auth.ShellBasedGroupsMapping"
storm.group.mapping.service.params: null
storm.messaging.transport: "org.apache.storm.messaging.netty.Context"
storm.nimbus.retry.times: 5
storm.nimbus.retry.interval.millis: 2000
storm.nimbus.retry.intervalceiling.millis: 60000
storm.nimbus.zookeeper.acls.check: true
storm.nimbus.zookeeper.acls.fixup: true
storm.auth.simple-white-list.users: []
storm.auth.simple-acl.users: []
storm.auth.simple-acl.users.commands: []
storm.auth.simple-acl.admins: []
storm.cluster.state.store: "org.apache.storm.cluster_state.zookeeper_state_factory"
storm.meta.serialization.delegate: "org.apache.storm.serialization.GzipThriftSerializationDelegate"
storm.codedistributor.class: "org.apache.storm.codedistributor.LocalFileSystemCodeDistributor"
storm.workers.artifacts.dir: "workers-artifacts"
storm.health.check.dir: "healthchecks"
storm.health.check.timeout.ms: 5000
storm.disable.symlinks: false

### nimbus.* configs are for the master
nimbus.seeds : ["localhost"]
nimbus.thrift.port: 6627
nimbus.thrift.threads: 64
nimbus.thrift.max_buffer_size: 1048576
nimbus.childopts: "-Xmx1024m"
nimbus.task.timeout.secs: 30
nimbus.supervisor.timeout.secs: 60
nimbus.monitor.freq.secs: 10
nimbus.cleanup.inbox.freq.secs: 600
nimbus.inbox.jar.expiration.secs: 3600
nimbus.code.sync.freq.secs: 120
nimbus.task.launch.secs: 120
nimbus.file.copy.expiration.secs: 600
nimbus.topology.validator: "org.apache.storm.nimbus.DefaultTopologyValidator"
topology.min.replication.count: 1
topology.max.replication.wait.time.sec: 60
nimbus.credential.renewers.freq.secs: 600
nimbus.queue.size: 100000
scheduler.display.resource: false

### ui.* configs are for the master's UI
ui.host: 0.0.0.0
ui.port: 8080
ui.childopts: "-Xmx768m"
ui.actions.enabled: true
ui.filter: null
ui.filter.params: null
ui.users: null
ui.header.buffer.bytes: 4096
ui.http.creds.plugin: org.apache.storm.security.auth.DefaultHttpCredentialsPlugin
ui.http.x-frame-options: DENY
ui.pagination: 20

logviewer.port: 8000
logviewer.childopts: "-Xmx128m"
logviewer.cleanup.age.mins: 10080
logviewer.appender.name: "A1"
logviewer.max.sum.worker.logs.size.mb: 4096
logviewer.max.per.worker.logs.size.mb: 2048

logs.users: null

drpc.port: 3772
drpc.worker.threads: 64
drpc.max_buffer_size: 1048576
drpc.queue.size: 128
drpc.invocations.port: 3773
drpc.invocations.threads: 64
drpc.request.timeout.secs: 600
drpc.childopts: "-Xmx768m"
drpc.http.port: 3774
drpc.https.port: -1
drpc.https.keystore.password: ""
drpc.https.keystore.type: "JKS"
drpc.http.creds.plugin: org.apache.storm.security.auth.DefaultHttpCredentialsPlugin
drpc.authorizer.acl.filename: "drpc-auth-acl.yaml"
drpc.authorizer.acl.strict: false

transactional.zookeeper.root: "/transactional"
transactional.zookeeper.servers: null
transactional.zookeeper.port: null

## blobstore configs
supervisor.blobstore.class: "org.apache.storm.blobstore.NimbusBlobStore"
supervisor.blobstore.download.thread.count: 5
supervisor.blobstore.download.max_retries: 3
supervisor.localizer.cache.target.size.mb: 10240
supervisor.localizer.cleanup.interval.ms: 600000

nimbus.blobstore.class: "org.apache.storm.blobstore.LocalFsBlobStore"
nimbus.blobstore.expiration.secs: 600

storm.blobstore.inputstream.buffer.size.bytes: 65536
client.blobstore.class: "org.apache.storm.blobstore.NimbusBlobStore"
storm.blobstore.replication.factor: 3
# For secure mode we would want to change this config to true
storm.blobstore.acl.validation.enabled: false

### supervisor.* configs are for node supervisors
# Define the number of workers that can be run on this machine. Each worker is assigned a port to use for communication
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
supervisor.childopts: "-Xmx256m"
supervisor.run.worker.as.user: false
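# how long the supervisor will wait to ensure that a worker process is started
supervisor.worker.start.timeout.secs: 120
# how long between heartbeats until the supervisor considers the worker dead and tries to restart it
supervisor.worker.timeout.secs: 30
# how many seconds to sleep before shutting down threads on a worker
supervisor.worker.shutdown.sleep.secs: 3
# how frequently the supervisor checks on the status of the processes it's monitoring and restarts them if necessary
supervisor.monitor.frequency.secs: 3
# how frequently the supervisor heartbeats to the cluster state (for nimbus)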
supervisor.heartbeat.frequency.secs: 5
supervisor.enable: true
supervisor.supervisors: []
supervisor.supervisors.commands: []
supervisor.memory.capacity.mb: 3072.0
# By convention 1 CPU core should be about 100, but this can be adjusted if needed
# using 100 makes it simple to set the desired value to the capacity measurement
# for single threaded bolts
supervisor.cpu.capacity: 400.0

### worker.* configs are for task workers
worker.heap.memory.mb: 768
worker.childopts: "-Xmx%HEAP-MEM%m -XX:+PrintGCDetails -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump"
worker.gc.childopts: ""

# Unlocking commercial features requires a special license from Oracle.
# See http://www.oracle.com/technetwork/java/javase/terms/products/index.html
# For this reason, profiler features are disabled by default.
worker.profiler.enabled: false
worker.profiler.childopts: "-XX:+UnlockCommercialFeatures -XX:+FlightRecorder"
worker.profiler.command: "flight.bash"
worker.heartbeat.frequency.secs: 1

# check whether dynamic log levels can be reset from DEBUG to INFO in workers
worker.log.level.reset.poll.secs: 30

# control how many worker receiver threads we need per worker
topology.worker.receiver.thread.count: 1

task.heartbeat.frequency.secs: 3
task.refresh.poll.secs: 10
task.credentials.poll.secs: 30
task.backpressure.poll.secs: 30

# now should be null by default
topology.backpressure.enable: false
backpressure.disruptor.high.watermark: 0.9
backpressure.disruptor.low.watermark: 0.4
backpressure.znode.timeout.secs: 30
backpressure.znode.update.freq.secs: 15

zmq.threads: 1
zmq.linger.millis: 5000
zmq.hwm: 0


storm.messaging.netty.server_worker_threads: 1
storm.messaging.netty.client_worker_threads: 1
storm.messaging.netty.buffer_size: 5242880 #5MB buffer
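# Since nimbus.task.launch.secs and supervisor.worker.start.timeout.secs are both 120, other workers should also wait at least that long before giving up on connecting to another worker. The reconnection period also needs to be longer than storm.zookeeper.session.timeout (default 20s), so that the reconnection can be aborted when the target worker is dead.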
storm.messaging.netty.max_retries: 300
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100

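# If the Netty messaging layer is busy (Netty internal buffer not writable), the Netty client will try to batch as many messages as possible, up to storm.messaging.netty.transfer.batch.size bytes; otherwise it will try to flush messages as soon as possible to reduce latency.
storm.messaging.netty.transfer.batch.size: 262144
# Sets the backlog value to specify when the channel binds to a local address
storm.messaging.netty.socket.backlog: 500

# By default, Netty SASL authentication is set to false. Users can override this and set it to true for a specific topology.
storm.messaging.netty.authentication: false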

# Default plugin to use for automatic network topology discovery
storm.network.topography.plugin: org.apache.storm.networktopography.DefaultRackDNSToSwitchMapping

# default number of seconds the group mapping service will cache user groups
storm.group.mapping.service.cache.duration.secs: 120

### topology.* configs are for specific executing storms
topology.enable.message.timeouts: true
topology.debug: false
topology.workers: 1
topology.acker.executors: null
topology.eventlogger.executors: 0
topology.tasks: null
# maximum amount of time a message has to complete before it's considered failed
topology.message.timeout.secs: 30
topology.multilang.serializer: "org.apache.storm.multilang.JsonSerializer"
topology.shellbolt.max.pending: 100
topology.skip.missing.kryo.registrations: false
topology.max.task.parallelism: null
topology.max.spout.pending: null
topology.state.synchronization.timeout.secs: 60
topology.stats.sample.rate: 0.05
topology.builtin.metrics.bucket.size.secs: 60
topology.fall.back.on.java.serialization: true
topology.worker.childopts: null
topology.worker.logwriter.childopts: "-Xmx64m"
topology.executor.receive.buffer.size: 1024 #batched
topology.executor.send.buffer.size: 1024 #individual messages
topology.transfer.buffer.size: 1024 # batched
topology.tick.tuple.freq.secs: null
topology.worker.shared.thread.pool.size: 4
topology.spout.wait.strategy: "org.apache.storm.spout.SleepSpoutWaitStrategy"
topology.sleep.spout.wait.strategy.time.ms: 1
topology.error.throttle.interval.secs: 10
topology.max.error.report.per.interval: 5
topology.kryo.factory: "org.apache.storm.serialization.DefaultKryoFactory"
topology.tuple.serializer: "org.apache.storm.serialization.types.ListDelegateSerializer"
topology.trident.batch.emit.interval.millis: 500
topology.testing.always.try.serialize: false
topology.classpath: null
topology.environment: null
topology.bolts.outgoing.overflow.buffer.enable: false
topology.disruptor.wait.timeout.millis: 1000
topology.disruptor.batch.size: 100
topology.disruptor.batch.timeout.millis: 1
topology.disable.loadaware.messaging: false
topology.state.checkpoint.interval.ms: 1000

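# Configs for the Resource Aware Scheduler
# topology priority describes the importance of the topology in decreasing importance starting from 0 (i.e. 0 is the highest priority and importance decreases as the priority number increases).
# Recommended range of 0-29, but no hard limit is set.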
topology.priority: 29
topology.component.resources.onheap.memory.mb: 128.0
topology.component.resources.offheap.memory.mb: 0.0
topology.component.cpu.pcore.percent: 10.0
topology.worker.max.heap.size.mb: 768.0
topology.scheduler.strategy: "org.apache.storm.scheduler.resource.strategies.scheduling.DefaultResourceAwareStrategy"
resource.aware.scheduler.eviction.strategy: "org.apache.storm.scheduler.resource.strategies.eviction.DefaultEvictionStrategy"
resource.aware.scheduler.priority.strategy: "org.apache.storm.scheduler.resource.strategies.priority.DefaultSchedulingPriorityStrategy"

dev.zookeeper.path: "/tmp/dev-storm-zookeeper"

pacemaker.host: "localhost"
pacemaker.port: 6699
pacemaker.base.threads: 10
pacemaker.max.threads: 50
pacemaker.thread.timeout: 10
pacemaker.childopts: "-Xmx1024m"
pacemaker.auth.method: "NONE"
pacemaker.kerberos.users: []

# Default storm daemon metrics reporter plugins
storm.daemon.metrics.reporter.plugins:
     - "org.apache.storm.daemon.metrics.reporters.JmxPreparableReporter"

# Configuration of the cluster metrics consumer
storm.cluster.metrics.consumer.publish.interval.secs: 60


# Metrics v2 configuration (optional)
#storm.metrics.reporters:
#  # Graphite Reporter
#  - class: "org.apache.storm.metrics2.reporters.GraphiteStormReporter"
#    daemons:
#        - "supervisor"
#        - "nimbus"
#        - "worker"
#    report.period: 60
#    report.period.units: "SECONDS"
#    graphite.host: "localhost"
#    graphite.port: 2003
#
#  # Console Reporter
#  - class: "org.apache.storm.metrics2.reporters.ConsoleStormReporter"
#    daemons:
#        - "worker"
#    report.period: 10
#    report.period.units: "SECONDS"
#    filter:
#        class: "org.apache.storm.metrics2.filters.RegexFilter"
#        expression: ".*my_component.*emitted.*"
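
As a rough illustration of how the resource-related defaults in the listing relate to one another when the Resource Aware Scheduler is in use, the sketch below does the back-of-the-envelope arithmetic. It assumes every executor uses the default per-component requests and ignores system components such as ackers and any other scheduling constraints, so it is an estimate under those assumptions and not a description of the scheduler's actual algorithm. The class and method names are hypothetical.

```java
// Illustrative arithmetic only; the real scheduler considers more factors.
class ResourceCapacitySketch {
    // Default values copied from the defaults.yaml listing above.
    static final double SUPERVISOR_MEMORY_MB = 3072.0;  // supervisor.memory.capacity.mb
    static final double SUPERVISOR_CPU = 400.0;         // supervisor.cpu.capacity
    static final double COMPONENT_ONHEAP_MB = 128.0;    // topology.component.resources.onheap.memory.mb
    static final double COMPONENT_OFFHEAP_MB = 0.0;     // topology.component.resources.offheap.memory.mb
    static final double COMPONENT_CPU = 10.0;           // topology.component.cpu.pcore.percent
    static final double WORKER_MAX_HEAP_MB = 768.0;     // topology.worker.max.heap.size.mb

    // Upper bound on default-sized executors per supervisor node:
    // whichever of memory or CPU runs out first.
    static int executorsPerNode() {
        int byMemory = (int) Math.floor(
                SUPERVISOR_MEMORY_MB / (COMPONENT_ONHEAP_MB + COMPONENT_OFFHEAP_MB));
        int byCpu = (int) Math.floor(SUPERVISOR_CPU / COMPONENT_CPU);
        return Math.min(byMemory, byCpu);
    }

    // Upper bound on default-sized executors per worker, limited by worker heap.
    static int executorsPerWorker() {
        return (int) Math.floor(WORKER_MAX_HEAP_MB / COMPONENT_ONHEAP_MB);
    }

    public static void main(String[] args) {
        System.out.println("executors per node:   " + executorsPerNode());
        System.out.println("executors per worker: " + executorsPerWorker());
    }
}
```

With these defaults, memory is the limiting resource on a node (3072 / 128 = 24 executors versus 400 / 10 = 40 by CPU), and a single worker's 768 MB heap fits 6 default-sized executors.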

References:

http://storm.apache.org/releases/1.2.3/Configuration.html

https://github.com/apache/storm/blob/v1.2.3/conf/defaults.yaml

 
