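Temporal Deployment

1. Cluster Architecture

Read part 03, "Temporal in Detail", carefully before starting the deployment.

2. Temporal Server Deployment Procedure

How it works: before startup, dockerize renders the configuration that will actually be used, and only then is the Server itself started.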

Self-hosted Temporal Cluster guide | Temporal Documentation
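The render-then-start flow described above can be sketched as a small entrypoint script. This is a minimal sketch, not the deployment's actual script: the template and output paths and the `--env docker` invocation are assumptions modeled on the upstream Temporal Docker image layout, and `DRY_RUN` is a hypothetical switch added here so the steps can be inspected without running them.

```shell
#!/bin/sh
# Sketch of a container entrypoint: render the config with dockerize, then start the server.
# ASSUMPTION: both paths follow the upstream Temporal Docker image layout.
TEMPLATE="${TEMPLATE:-/etc/temporal/config/config_template.yaml}"
RENDERED="${RENDERED:-/etc/temporal/config/docker.yaml}"

# run CMD...: execute the command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

entrypoint() {
  # Step 1: substitute environment variables into the config template.
  run dockerize -template "${TEMPLATE}:${RENDERED}" || return 1
  # Step 2: start the server against the rendered config (env "docker" selects docker.yaml).
  run temporal-server --env docker start
}
```

Calling `DRY_RUN=1 entrypoint` prints the two commands in order; calling `entrypoint` without it runs them.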

Create the application

App: temporal-eco (Yunxiao application)

Databases: temporal_eco and temporal_eco_vis

Database schema creation and data initialization

System environment variables involved

TEMPORAL_STORE_DB_NAME
TEMPORAL_STORE_DB_HOST
TEMPORAL_STORE_DB_PORT
TEMPORAL_STORE_DB_USER
TEMPORAL_STORE_DB_PWD
TEMPORAL_VISIBILITY_STORE_DB_NAME
TEMPORAL_VISIBILITY_STORE_DB_HOST
TEMPORAL_VISIBILITY_STORE_DB_PORT
TEMPORAL_VISIBILITY_STORE_DB_USER
TEMPORAL_VISIBILITY_STORE_DB_PWD

See the linked reference and the v1.12.0 Makefile.

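# Run inside a checkout of https://github.com/temporalio/temporal
# The database names below (temporal, temporal_visibility) are the upstream example names;
# this deployment uses temporal_eco and temporal_eco_vis, so substitute accordingly.
export SQL_PLUGIN=mysql
export SQL_HOST=mysql_host
export SQL_PORT=3306
export SQL_USER=mysql_user
export SQL_PASSWORD=mysql_password

# Main store: create the database, initialize schema versioning at 0.0, then apply the versioned migrations
./temporal-sql-tool create-database -database temporal
SQL_DATABASE=temporal ./temporal-sql-tool setup-schema -v 0.0
SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/mysql/v57/temporal/versioned

# Visibility store: same three steps against the visibility schema
./temporal-sql-tool create-database -database temporal_visibility
SQL_DATABASE=temporal_visibility ./temporal-sql-tool setup-schema -v 0.0
SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/mysql/v57/visibility/versioned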
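The TEMPORAL_STORE_DB_* and TEMPORAL_VISIBILITY_STORE_DB_* variables listed earlier could be mapped onto the SQL_* variables that temporal-sql-tool reads, so the schema commands run against the real databases. The sketch below assumes that mapping (the source does not state how the two sets of variables relate); `require_env` and `use_store` are hypothetical helper names.

```shell
# require_env NAME...: report every unset or empty variable; fail if any is missing.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# use_store PREFIX: export SQL_* from <PREFIX>_DB_* (ASSUMED mapping, not from the source).
use_store() {
  prefix="$1"
  require_env "${prefix}_DB_NAME" "${prefix}_DB_HOST" "${prefix}_DB_PORT" \
              "${prefix}_DB_USER" "${prefix}_DB_PWD" || return 1
  export SQL_PLUGIN=mysql
  eval "export SQL_DATABASE=\"\${${prefix}_DB_NAME}\""
  eval "export SQL_HOST=\"\${${prefix}_DB_HOST}\""
  eval "export SQL_PORT=\"\${${prefix}_DB_PORT}\""
  eval "export SQL_USER=\"\${${prefix}_DB_USER}\""
  eval "export SQL_PASSWORD=\"\${${prefix}_DB_PWD}\""
}
```

For example, `use_store TEMPORAL_STORE` followed by the setup-schema and update commands would target the main store, and `use_store TEMPORAL_VISIBILITY_STORE` the visibility store.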

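Cluster deployment

When creating the services, choose the gRPC service type for frontend, history, and matching, and choose "other" for worker (as of some version it no longer exposes a gRPC port). Client services then connect to frontend over gRPC through the company's service discovery.

Number of shards

Based on the linked reference, the shard count is set to 4k = 4096. Once chosen, the number of shards cannot be changed later (it is the only setting that cannot be changed). From the reference: "Shards are very lightweight. There are no real implications on the cost of clusters. We (Temporal) have tested the system up to 16k shards."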
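Because the shard count cannot change after the cluster is provisioned, a startup guard can catch an accidental edit before the server boots. The following is a sketch under assumptions: it assumes the config is rendered from the upstream template, which reads the shard count from `NUM_HISTORY_SHARDS`, and it records the first value in a hypothetical state file chosen by the caller.

```shell
# check_shards DESIRED STATE_FILE
# Records the shard count on first use and refuses any later value that differs.
# ASSUMPTION: the rendered config takes its shard count from NUM_HISTORY_SHARDS.
check_shards() {
  desired="$1"
  state_file="$2"
  if [ -f "$state_file" ]; then
    recorded="$(cat "$state_file")"
    if [ "$recorded" != "$desired" ]; then
      echo "refusing to start: shard count is $recorded, requested $desired" >&2
      return 1
    fi
  else
    # First provisioning: remember the chosen value.
    echo "$desired" > "$state_file"
  fi
  export NUM_HISTORY_SHARDS="$desired"
}
```

A deployment script would call something like `check_shards 4096 /path/to/state-file` before rendering the config; the state file location is left to the operator.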

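Component counts and resources

Recommended values (v), see the linked reference

Component   Count   Recommended CPU (cores)   Recommended memory
frontend    3       4                         4Gi
history     5       8                         8Gi
matching    3       4                         4Gi
worker      2       4                         4Gi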

Test environment counts and resources

Component   Count = floor(v/2)   CPU (cores) = floor(v/4)   Memory = floor(v/4)
frontend    1                    1                          1Gi
history     2                    2                          2Gi
matching    1                    1                          1Gi
worker      1                    1                          1Gi

Production environment counts and resources

Component   Count = floor(v/2)   CPU (cores) = floor(v/2)   Memory = floor(v/2)
frontend    1                    2                          2Gi
history     2                    4                          4Gi
matching    1                    2                          2Gi
worker      1                    2                          2Gi
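The floor(v/2) and floor(v/4) rules behind the test and production tables can be checked with shell integer arithmetic, which already truncates toward zero for positive values. This is a minimal sketch; `scale` and `size_for` are hypothetical helper names, and the inputs are the recommended values (count, CPU cores, memory in Gi).

```shell
# scale VALUE DIVISOR: floor(VALUE / DIVISOR) for positive integers.
scale() {
  echo $(( $1 / $2 ))
}

# size_for ENV COUNT CPU MEM_GI: derive "count cpu memory" from the recommended values.
# Count is always halved; CPU and memory are divided by 4 (test) or 2 (prod).
size_for() {
  env_name="$1"
  case "$env_name" in
    test) res_div=4 ;;
    prod) res_div=2 ;;
    *) echo "unknown environment: $env_name" >&2; return 1 ;;
  esac
  echo "$(scale "$2" 2) $(scale "$3" "$res_div") $(scale "$4" "$res_div")Gi"
}
```

For the history service (recommended 5 instances, 8 cores, 8Gi), `size_for test 5 8 8` and `size_for prod 5 8 8` reproduce the history rows of the two tables.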

Mesh address for external applications

update_at    ${unit}--master.app.svc
2023.05.04   temporal-frontend--master.temporal-eco.svc.cluster.local:80
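The address pattern above can be assembled from the unit and app names. The helper below is a sketch: `mesh_addr` is a hypothetical name, and the `.svc.cluster.local` suffix and default port 80 are taken from the single example row rather than from a documented rule.

```shell
# mesh_addr UNIT APP [PORT]: build "<unit>--master.<app>.svc.cluster.local:<port>".
# ASSUMPTION: suffix and default port are inferred from the one example in the table.
mesh_addr() {
  echo "$1--master.$2.svc.cluster.local:${3:-80}"
}

# Optional connectivity check, if tctl is installed on the client host:
#   tctl --address "$(mesh_addr temporal-frontend temporal-eco)" cluster health
```

`mesh_addr temporal-frontend temporal-eco` yields the frontend address listed in the table.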

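3. Deployment Details

How to deploy to production: Temporal Platform production deployments | Temporal Documentation

  • Supports Cassandra, MySQL, and PostgreSQL (see the link for versions); compatibility with TiDB still needs to be confirmed
  • Database schema upgrades
  • Use metrics for system performance monitoring and tuning
  • The services in the Server topology have different characteristics, so they are best deployed independently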
    • the Frontend service is more CPU bound
    • the History and Matching services require more memory
  • Some data size limits of the Server
  • what-is-the-recommended-setup-for-running-cadence-temporal-with-cassandra-on-production
    • Number of history shards is a setting which cannot be updated after the cluster is provisioned. For all other parameters you could start small and scale your cluster based on need with time but this one you have to think upfront about your maximum load
    • Temporal server consists of 4 roles. Although you can run all roles within same process but we highly recommend running them separately as they have completely different concerns and scale characteristics. It also makes it operationally much simpler to isolate problems in production. All of the roles are completely stateless and system scales horizontally as you spin up more instances of role once you identify any bottleneck. Here are some recommendations to use as a starting point:

      • Frontend: Responsible for hosting all service api. All client interaction goes through frontend and mostly scales with rps for the cluster. Start with 3 instances of 4 cores and 4GB memory.
      • History: This hosts the workflow state transition logic. Each history host is running a shard controller which is responsible for activating and passivating shards on that host. If you provision a cluster with 4k shards then they are distributed across all available history hosts within the cluster through shard controller. If history hosts are scalability bottleneck, you just add more history hosts to the cluster. All history hosts form its own membership ring and shards are distributed among available nodes in the hash ring. They are quite memory intensive as they host mutable state and event caches. Start with 5 history instances with 8 cores and 8 GB memory.
      • Matching: They are responsible for hosting TaskQueues within the system. Each TaskQueue partition is placed separately on all available matching hosts. They usually scale with the number of workers connecting for workflow or activity task, throughput of workflow/activity/query task, and number of total active TaskQueues in the system. Start with 3 matching instances each with 4 cores and 4 GB memory.
      • Worker: This is needed for various background logic for ElasticSearch kafka processor, CrossDC consumers, and some system workflows (archival, batch processing, etc). You can just start with 2 instances each with 4 cores and 4 GB memory.
  • Integrating SSO
  • Exporting metrics to Prometheus
  • How to configure the Grafana dashboard
  • Discussion of whether to deploy to K8s

4. Temporal UI Deployment Procedure

https://temporal-eco-ui.pek01.in.zhihu.com/namespaces/default/workflows

Steps, details, and problems encountered

Local deployment

Base environment
  • apt install nodejs
  • apt install npm
  • npm install pnpm -g
  • apt-get install -y nodejs
  • /usr/bin/node -v
  • npm install n -g
  • n stable # upgrade to the stable release

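Install dependencies
(produces the front-end environment and code that temporal web needs)

Install the latest temporal server
  • pnpm install
  • pnpm start

Build ui-server
  • pnpm run build:local
  • pnpm run build:cloud

Problems encountered
  • Front-end environment issues (nodejs, npm, pnpm): install and build locally, then package the front-end environment and code as a whole and ship it to production
  • pnpm build:server generates the assets files; without it the build reports that all:assets under ui/assets cannot be found
  • Temporal UI requires Temporal v1.16.0 or later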

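Third-party dependencies
(produces the proto files needed for communication)
  • make install
  • make build-grpc

Problems encountered
  • gRPC depends on the protobuf definitions (the path of the dependency's code is specified explicitly)
  • Configure the submodule to produce the proto files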

Trial start
  • go build -o ui-server ./cmd/server/main.go
  • ./ui-server start
Then switch to the self-hosted temporal server and start again
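Files to pay attention to
  • package.json/scripts # scripts run by pnpm
  • ./.env  # environment configuration mapping
    • Defaults to the dev environment, see package.json/scripts
    • "start": "pnpm run dev:local -- --open"
    • "dev:local": ". ./.env && VITE_TEMPORAL_UI_BUILD_TARGET=local vite dev --port 3000"
  • config/base.yaml # address of the temporal server it depends on
  • config/development.yaml # port the page is exposed on

Problems encountered
  • The temporal server address was hardcoded in the temporal ui repository (it should rarely change)
  • Dynamic config generation has since been implemented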

Production deployment

git clone git@git.in.zhihu.com:xujialong01/temporal-eco-server.git

Configure the ui-server service address

Import external packages into a personal repository to speed up downloads

Align with the UI version (temporal server upgrade)

References

Temporal Platform production deployments | Temporal Documentation
