Superset self-service data analysis tool: installation notes

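Superset is a self-service data analysis tool. Its main goal is to simplify data exploration and analysis, and its strength is that the whole workflow runs in one smooth pass with almost no waiting.
Superset features
Superset gives data analysts a fast way to visualize data by letting users create and share dashboards.
While you analyze your data with these rich visualizations, Superset also provides extensible data formats, a high-granularity data model, fast queries with complex rules, and compatibility with the mainstream authentication modes (database, OpenID, LDAP, OAuth, or REMOTE_USER via Flask AppBuilder).
A simple layer in which you define fields and choose aggregation rules from drop-down lists is enough to present a data source richly in the UI. Superset is also deeply integrated with Druid, so slicing and dicing very large, real-time datasets stays smooth.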

Install the base dependency packages
yum -y install gcc libffi-devel python-devel python-pip python-wheel openssl-devel libsasl2-devel openldap-devel 

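yum reports that no python-pip package is available. Install epel-release first, then rerun the install command above, which now completes normally (the transaction output below is from the rerun).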
yum -y install epel-release
Running transaction
  Installing : python-wheel-0.24.0-2.el7.noarch                                                                                                                             1/2
  Installing : python2-pip-8.1.2-5.el7.noarch                                                                                                                               2/2
  Verifying  : python2-pip-8.1.2-5.el7.noarch                                                                                                                               1/2
  Verifying  : python-wheel-0.24.0-2.el7.noarch                                                                                                                             2/2

Installed:
  python-wheel.noarch 0:0.24.0-2.el7                                                      python2-pip.noarch 0:8.1.2-5.el7

Complete!
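
Before moving on, it can help to confirm that the tools the later steps rely on are actually on PATH. The following is a minimal sketch only; missing_cmds is a hypothetical helper name introduced here, not part of yum or pip.

```shell
#!/bin/sh
# Print every command name given as an argument that is not found on PATH.
# missing_cmds is a hypothetical helper, not part of yum or pip.
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}
```

Calling `missing_cmds gcc python pip` prints nothing when all three are installed, and otherwise lists the names that still need a package.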

Install virtualenv
[root@server01 yum.repos.d]# pip install virtualenv
Collecting virtualenv
  Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'NewConnectionError('<pip._vendor.requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x3397210>: Failed to establish a new connection: [Errno 101] Network is unreachable',)': /simple/virtualenv/
  Downloading virtualenv-15.1.0-py2.py3-none-any.whl (1.8MB)
    100% |████████████████████████████████| 1.8MB 35kB/s
Installing collected packages: virtualenv
Successfully installed virtualenv-15.1.0
You are using pip version 8.1.2, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Create the virtual environment (venv)
[root@server01 yum.repos.d]#  virtualenv venv
New python executable in /etc/yum.repos.d/venv/bin/python2
Also creating executable in /etc/yum.repos.d/venv/bin/python
Installing setuptools, pip, wheel...done.

Activate the venv
[root@server01 yum.repos.d]# . ./venv/bin/activate
(venv) [root@server01 yum.repos.d]#
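
The (venv) prefix in the prompt shows that activation worked. In a script there is no prompt to look at, so a small guard can be used instead. This is a sketch under the assumption that the standard virtualenv activate script was sourced, which exports VIRTUAL_ENV; in_venv is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds only when a virtualenv is active. The activate script exports
# VIRTUAL_ENV, so an empty or unset value means no venv is active.
# in_venv is a hypothetical helper name.
in_venv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}
```

For example, `in_venv && pip install superset` would skip the install rather than put the packages into the system Python.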

Install superset; this takes a long time
(venv) [root@server01 yum.repos.d]# pip install superset
Collecting superset
  Downloading superset-0.22.1.tar.gz (58.3MB)
    100% |████████████████████████████████| 58.3MB 5.8kB/s
Collecting boto3>=1.4.6 (from superset)
  Downloading boto3-1.5.22-py2.py3-none-any.whl (128kB)
    100% |████████████████████████████████| 133kB 17kB/s
Collecting celery==4.1.0 (from superset)
  Downloading celery-4.1.0-py2.py3-none-any.whl (400kB)
    100% |████████████████████████████████| 409kB 16kB/s
Collecting colorama==0.3.9 (from superset)
  Downloading colorama-0.3.9-py2.py3-none-any.whl
Collecting cryptography==1.9 (from superset)
  Downloading cryptography-1.9.tar.gz (409kB)
    100% |████████████████████████████████| 419kB 17kB/s
Collecting flask==0.12.2 (from superset)
  Downloading Flask-0.12.2-py2.py3-none-any.whl (83kB)
    100% |████████████████████████████████| 92kB 32kB/s
Collecting flask-appbuilder==1.9.4 (from superset)
  Downloading Flask-AppBuilder-1.9.4.tar.gz (1.4MB)
    100% |████████████████████████████████| 1.5MB 25kB/s
Collecting flask-cache==0.13.1 (from superset)
  Downloading Flask-Cache-0.13.1.tar.gz (45kB)
    100% |████████████████████████████████| 51kB 22kB/s
Collecting flask-migrate==2.0.3 (from superset)
  Downloading Flask-Migrate-2.0.3.tar.gz
Collecting flask-script==2.0.5 (from superset)
  Downloading Flask-Script-2.0.5.tar.gz (42kB)
    100% |████████████████████████████████| 51kB 18kB/s
Collecting flask-sqlalchemy==2.1 (from superset)
  Downloading Flask-SQLAlchemy-2.1.tar.gz (95kB)
    100% |████████████████████████████████| 102kB 17kB/s
Collecting flask-testing==0.6.2 (from superset)
  Downloading Flask-Testing-0.6.2.tar.gz (129kB)
    100% |████████████████████████████████| 133kB 46kB/s
Collecting flask-wtf==0.14.2 (from superset)
  Downloading Flask_WTF-0.14.2-py2.py3-none-any.whl
Collecting flower==0.9.1 (from superset)
  Downloading flower-0.9.1.tar.gz (3.9MB)
    100% |████████████████████████████████| 3.9MB 26kB/s
Collecting future<0.17,>=0.16.0 (from superset)
  Downloading future-0.16.0.tar.gz (824kB)
    100% |████████████████████████████████| 829kB 35kB/s
Collecting humanize==0.5.1 (from superset)
  Downloading humanize-0.5.1.tar.gz
Collecting gunicorn==19.7.1 (from superset)
  Downloading gunicorn-19.7.1-py2.py3-none-any.whl (111kB)
    100% |████████████████████████████████| 112kB 27kB/s
Collecting idna==2.5 (from superset)
  Downloading idna-2.5-py2.py3-none-any.whl (55kB)
    100% |████████████████████████████████| 61kB 15kB/s
Collecting markdown==2.6.8 (from superset)
  Downloading Markdown-2.6.8.tar.gz (307kB)
    100% |████████████████████████████████| 317kB 27kB/s
Collecting pandas==0.20.3 (from superset)
  Downloading pandas-0.20.3-cp27-cp27mu-manylinux1_x86_64.whl (22.4MB)
    100% |████████████████████████████████| 22.4MB 20kB/s
Collecting parsedatetime==2.0.0 (from superset)
  Downloading parsedatetime-2.0-py2-none-any.whl
Collecting pathlib2==2.3.0 (from superset)
  Downloading pathlib2-2.3.0-py2.py3-none-any.whl
Collecting pydruid==0.3.1 (from superset)
  Downloading pydruid-0.3.1-py2.py3-none-any.whl
Collecting PyHive>=0.4.0 (from superset)
  Downloading PyHive-0.5.0.tar.gz (40kB)
    100% |████████████████████████████████| 40kB 29kB/s
Collecting python-dateutil==2.6.0 (from superset)
  Downloading python_dateutil-2.6.0-py2.py3-none-any.whl (194kB)
    100% |████████████████████████████████| 194kB 38kB/s
Collecting pyyaml>=3.11 (from superset)
  Downloading PyYAML-3.12.tar.gz (253kB)
    100% |████████████████████████████████| 256kB 40kB/s
Collecting requests==2.17.3 (from superset)
  Downloading requests-2.17.3-py2.py3-none-any.whl (87kB)
    100% |████████████████████████████████| 92kB 33kB/s
Collecting simplejson==3.10.0 (from superset)
  Downloading simplejson-3.10.0.tar.gz (77kB)
    100% |████████████████████████████████| 81kB 53kB/s
Collecting six==1.10.0 (from superset)
  Downloading six-1.10.0-py2.py3-none-any.whl
Collecting sqlalchemy==1.1.9 (from superset)
  Downloading SQLAlchemy-1.1.9.tar.gz (5.2MB)
    100% |████████████████████████████████| 5.2MB 74kB/s
Collecting sqlalchemy-utils==0.32.16 (from superset)
  Downloading SQLAlchemy-Utils-0.32.16.tar.gz (120kB)
    100% |████████████████████████████████| 122kB 64kB/s
Collecting sqlparse==0.2.3 (from superset)
  Downloading sqlparse-0.2.3-py2.py3-none-any.whl
Collecting thrift>=0.9.3 (from superset)
  Downloading thrift-0.11.0.tar.gz (52kB)
    100% |████████████████████████████████| 61kB 20kB/s
Collecting thrift-sasl>=0.2.1 (from superset)
  Downloading thrift_sasl-0.3.0.tar.gz
Collecting unidecode>=0.04.21 (from superset)
  Downloading Unidecode-1.0.22-py2.py3-none-any.whl (235kB)
    100% |████████████████████████████████| 235kB 41kB/s
Collecting botocore<1.9.0,>=1.8.36 (from boto3>=1.4.6->superset)
  Downloading botocore-1.8.36-py2.py3-none-any.whl (4.1MB)
    100% |████████████████████████████████| 4.1MB 22kB/s
Collecting jmespath<1.0.0,>=0.7.1 (from boto3>=1.4.6->superset)
  Downloading jmespath-0.9.3-py2.py3-none-any.whl
Collecting s3transfer<0.2.0,>=0.1.10 (from boto3>=1.4.6->superset)
  Downloading s3transfer-0.1.12-py2.py3-none-any.whl (59kB)
    100% |████████████████████████████████| 61kB 34kB/s
Collecting kombu<5.0,>=4.0.2 (from celery==4.1.0->superset)
  Downloading kombu-4.1.0-py2.py3-none-any.whl (181kB)
    100% |████████████████████████████████| 184kB 53kB/s
Collecting pytz>dev (from celery==4.1.0->superset)
  Downloading pytz-2017.3-py2.py3-none-any.whl (511kB)
    100% |████████████████████████████████| 512kB 37kB/s
Collecting billiard<3.6.0,>=3.5.0.2 (from celery==4.1.0->superset)
  Downloading billiard-3.5.0.3.tar.gz (149kB)
    100% |████████████████████████████████| 153kB 32kB/s
Collecting asn1crypto>=0.21.0 (from cryptography==1.9->superset)
  Downloading asn1crypto-0.24.0-py2.py3-none-any.whl (101kB)
    100% |████████████████████████████████| 102kB 25kB/s
Collecting enum34 (from cryptography==1.9->superset)
  Downloading enum34-1.1.6-py2-none-any.whl
Collecting ipaddress (from cryptography==1.9->superset)
  Downloading ipaddress-1.0.19.tar.gz
Collecting cffi>=1.7 (from cryptography==1.9->superset)
  Downloading cffi-1.11.4-cp27-cp27mu-manylinux1_x86_64.whl (406kB)
    100% |████████████████████████████████| 409kB 32kB/s
Collecting Jinja2>=2.4 (from flask==0.12.2->superset)
  Downloading Jinja2-2.10-py2.py3-none-any.whl (126kB)
    100% |████████████████████████████████| 133kB 38kB/s
Collecting Werkzeug>=0.7 (from flask==0.12.2->superset)
  Downloading Werkzeug-0.14.1-py2.py3-none-any.whl (322kB)
    100% |████████████████████████████████| 327kB 28kB/s
Collecting click>=2.0 (from flask==0.12.2->superset)
  Downloading click-6.7-py2.py3-none-any.whl (71kB)
    100% |████████████████████████████████| 71kB 29kB/s
Collecting itsdangerous>=0.21 (from flask==0.12.2->superset)
  Downloading itsdangerous-0.24.tar.gz (46kB)
    100% |████████████████████████████████| 51kB 27kB/s
Collecting Flask-Babel==0.11.1 (from flask-appbuilder==1.9.4->superset)
  Downloading Flask-Babel-0.11.1.tar.gz (40kB)
    100% |████████████████████████████████| 40kB 15kB/s
Collecting Flask-Login==0.2.11 (from flask-appbuilder==1.9.4->superset)
  Downloading Flask-Login-0.2.11.tar.gz
Collecting Flask-OpenID==1.2.5 (from flask-appbuilder==1.9.4->superset)
  Downloading Flask-OpenID-1.2.5.tar.gz (43kB)
    100% |████████████████████████████████| 51kB 48kB/s
Collecting alembic>=0.6 (from flask-migrate==2.0.3->superset)
  Downloading alembic-0.9.7.tar.gz (1.0MB)
    100% |████████████████████████████████| 1.0MB 104kB/s
Collecting WTForms (from flask-wtf==0.14.2->superset)
  Downloading WTForms-2.1.zip (553kB)
    100% |████████████████████████████████| 563kB 53kB/s
Collecting tornado==4.2.0 (from flower==0.9.1->superset)
  Downloading tornado-4.2.tar.gz (433kB)
    100% |████████████████████████████████| 440kB 37kB/s
Collecting babel>=1.0 (from flower==0.9.1->superset)
  Downloading Babel-2.5.3-py2.py3-none-any.whl (6.8MB)
    100% |████████████████████████████████| 6.8MB 30kB/s
Collecting futures (from flower==0.9.1->superset)
  Downloading futures-3.2.0-py2-none-any.whl
Collecting numpy>=1.7.0 (from pandas==0.20.3->superset)
  Downloading numpy-1.14.0-cp27-cp27mu-manylinux1_x86_64.whl (16.9MB)
    100% |████████████████████████████████| 16.9MB 17kB/s
Collecting scandir; python_version < "3.5" (from pathlib2==2.3.0->superset)
  Downloading scandir-1.6.tar.gz
Collecting certifi>=2017.4.17 (from requests==2.17.3->superset)
  Downloading certifi-2018.1.18-py2.py3-none-any.whl (151kB)
    100% |████████████████████████████████| 153kB 85kB/s
Collecting chardet<3.1.0,>=3.0.2 (from requests==2.17.3->superset)
  Downloading chardet-3.0.4-py2.py3-none-any.whl (133kB)
    100% |████████████████████████████████| 143kB 52kB/s
Collecting urllib3<1.22,>=1.21.1 (from requests==2.17.3->superset)
  Downloading urllib3-1.21.1-py2.py3-none-any.whl (131kB)
    100% |████████████████████████████████| 133kB 32kB/s
Collecting sasl>=0.2.1 (from thrift-sasl>=0.2.1->superset)
  Downloading sasl-0.2.1.tar.gz
Collecting docutils>=0.10 (from botocore<1.9.0,>=1.8.36->boto3>=1.4.6->superset)
  Downloading docutils-0.14-py2-none-any.whl (543kB)
    100% |████████████████████████████████| 552kB 50kB/s
Collecting amqp<3.0,>=2.1.4 (from kombu<5.0,>=4.0.2->celery==4.1.0->superset)
  Downloading amqp-2.2.2-py2.py3-none-any.whl (48kB)
    100% |████████████████████████████████| 51kB 50kB/s
Collecting pycparser (from cffi>=1.7->cryptography==1.9->superset)
  Downloading pycparser-2.18.tar.gz (245kB)
    100% |████████████████████████████████| 256kB 94kB/s
Collecting MarkupSafe>=0.23 (from Jinja2>=2.4->flask==0.12.2->superset)
  Downloading MarkupSafe-1.0.tar.gz
Collecting python-openid>=2.0 (from Flask-OpenID==1.2.5->flask-appbuilder==1.9.4->superset)
  Downloading python-openid-2.2.5.tar.gz (301kB)
    100% |████████████████████████████████| 307kB 98kB/s
Collecting Mako (from alembic>=0.6->flask-migrate==2.0.3->superset)
  Downloading Mako-1.0.7.tar.gz (564kB)
    100% |████████████████████████████████| 573kB 75kB/s
Collecting python-editor>=0.3 (from alembic>=0.6->flask-migrate==2.0.3->superset)
  Downloading python-editor-1.0.3.tar.gz
Collecting backports.ssl_match_hostname (from tornado==4.2.0->flower==0.9.1->superset)
  Downloading backports.ssl_match_hostname-3.5.0.1.tar.gz
Collecting vine>=1.1.3 (from amqp<3.0,>=2.1.4->kombu<5.0,>=4.0.2->celery==4.1.0->superset)
  Downloading vine-1.1.4-py2.py3-none-any.whl
Building wheels for collected packages: superset, cryptography, flask-appbuilder, flask-cache, flask-migrate, flask-script, flask-sqlalchemy, flask-testing, flower, future, humanize, markdown, PyHive, pyyaml, simplejson, sqlalchemy, sqlalchemy-utils, thrift, thrift-sasl, billiard, ipaddress, itsdangerous, Flask-Babel, Flask-Login, Flask-OpenID, alembic, WTForms, tornado, scandir, sasl, pycparser, MarkupSafe, python-openid, Mako, python-editor, backports.ssl-match-hostname
  Running setup.py bdist_wheel for superset ... done
  Stored in directory: /root/.cache/pip/wheels/17/53/df/9bc791dd9cd4fae01688d5d134daa6a50def2b7bfd9e2275ea
  Running setup.py bdist_wheel for cryptography ... done
  Stored in directory: /root/.cache/pip/wheels/ff/a5/ef/186bb4f6a89ef0bb8373bf53e5c9884b96722f0857bd3111b8
  Running setup.py bdist_wheel for flask-appbuilder ... done
  Stored in directory: /root/.cache/pip/wheels/76/fa/88/d23864a02913bc4ad1c60ba9054d16e8020f5c6e79c77d753d
  Running setup.py bdist_wheel for flask-cache ... done
  Stored in directory: /root/.cache/pip/wheels/d3/ea/07/db4bcd93163f4ac63974a7ce7aa15df9d45cdc9864c8232f9c
  Running setup.py bdist_wheel for flask-migrate ... done
  Stored in directory: /root/.cache/pip/wheels/4f/1a/cd/241202c77554d1500b47f169a59432c33834f941e90769bf0e
  Running setup.py bdist_wheel for flask-script ... done
  Stored in directory: /root/.cache/pip/wheels/e2/ea/d8/8d114e46cef819f7d9879504a7f9cb2a88a479af2858223d9f
  Running setup.py bdist_wheel for flask-sqlalchemy ... done
  Stored in directory: /root/.cache/pip/wheels/cf/9f/1b/390c152e645c6e300fda9ed9c678c6e22717a3020fd02acb4d
  Running setup.py bdist_wheel for flask-testing ... done
  Stored in directory: /root/.cache/pip/wheels/10/34/47/2378abdc5f5ce79b1d9b26be4a1f14d485f0376e5dc6512822
  Running setup.py bdist_wheel for flower ... done
  Stored in directory: /root/.cache/pip/wheels/a3/0a/36/7c3642bbba1ded7a79c64c5bdc2a0958b88b73c84d60550b26
  Running setup.py bdist_wheel for future ... done
  Stored in directory: /root/.cache/pip/wheels/c2/50/7c/0d83b4baac4f63ff7a765bd16390d2ab43c93587fac9d6017a
  Running setup.py bdist_wheel for humanize ... done
  Stored in directory: /root/.cache/pip/wheels/d4/80/38/cfbfd95752f71f3812505b948b43383ddc99eedf835fc13b09
  Running setup.py bdist_wheel for markdown ... done
  Stored in directory: /root/.cache/pip/wheels/85/a7/08/33ee5cd488d0365d8bed79d1d4e5c28dd3fbfc7f6d0ad4bb09
  Running setup.py bdist_wheel for PyHive ... done
  Stored in directory: /root/.cache/pip/wheels/e7/59/27/943bcc03c98a37876394bc9f902bc9056f00166a9746555311
  Running setup.py bdist_wheel for pyyaml ... done
  Stored in directory: /root/.cache/pip/wheels/2c/f7/79/13f3a12cd723892437c0cfbde1230ab4d82947ff7b3839a4fc
  Running setup.py bdist_wheel for simplejson ... done
  Stored in directory: /root/.cache/pip/wheels/43/c5/ef/edcebbb19becffd2ba75bf219afdbb4ca85198b2d909f1b31b
  Running setup.py bdist_wheel for sqlalchemy ... done
  Stored in directory: /root/.cache/pip/wheels/62/c3/8f/12a643439a7ba36143e21533ac633b99da8537b1deb8d0f0c3
  Running setup.py bdist_wheel for sqlalchemy-utils ... done
  Stored in directory: /root/.cache/pip/wheels/f0/05/32/bf092b262dcb4f4a9eb87e93c1b88c63fb6e345ef88534d65d
  Running setup.py bdist_wheel for thrift ... done
  Stored in directory: /root/.cache/pip/wheels/7c/89/70/14df5740427cacf181649caeac8b673bbaba4698b28bf0bd12
  Running setup.py bdist_wheel for thrift-sasl ... done
  Stored in directory: /root/.cache/pip/wheels/c1/d3/ff/61b8321fd5fb3ec9aebee95e063cd53a48cf880db3513c35f0
  Running setup.py bdist_wheel for billiard ... done
  Stored in directory: /root/.cache/pip/wheels/85/15/e4/11683b23ab74c2a835845811976e664ab33df7d23c3cb23500
  Running setup.py bdist_wheel for ipaddress ... done
  Stored in directory: /root/.cache/pip/wheels/d7/6b/69/666188e8101897abb2e115d408d139a372bdf6bfa7abb5aef5
  Running setup.py bdist_wheel for itsdangerous ... done
  Stored in directory: /root/.cache/pip/wheels/fc/a8/66/24d655233c757e178d45dea2de22a04c6d92766abfb741129a
  Running setup.py bdist_wheel for Flask-Babel ... done
  Stored in directory: /root/.cache/pip/wheels/99/65/6c/927249178edfdc24c9cb2d9fcea27f598a73b323a1b5e3a8fc
  Running setup.py bdist_wheel for Flask-Login ... done
  Stored in directory: /root/.cache/pip/wheels/4b/58/2e/fbba562e845fb419f6157a504055275a4d1783a22ebe3124e8
  Running setup.py bdist_wheel for Flask-OpenID ... done
  Stored in directory: /root/.cache/pip/wheels/3b/36/b4/ab2c592ee3b385f9db7fbcdeacdf766bca3dd4b5270d40690e
  Running setup.py bdist_wheel for alembic ... done
  Stored in directory: /root/.cache/pip/wheels/70/52/76/48b43681474e215f8e581e90f1bbb075a780ecf3c37a4fc4aa
  Running setup.py bdist_wheel for WTForms ... done
  Stored in directory: /root/.cache/pip/wheels/36/35/f3/7452cd24daeeaa5ec5b2ea13755316abc94e4e7702de29ba94
  Running setup.py bdist_wheel for tornado ... done
  Stored in directory: /root/.cache/pip/wheels/61/a8/89/044b56fd7bb4d2d6fd3ff45cc5c98b7b3bb68fed70617ffe13
  Running setup.py bdist_wheel for scandir ... done
  Stored in directory: /root/.cache/pip/wheels/6e/33/69/090d9633efb6fe3c8077c40e1676819ed8d5a59b41cc9a5bea
  Running setup.py bdist_wheel for sasl ... done
  Stored in directory: /root/.cache/pip/wheels/03/97/72/71e18efd8929d907aaf6b33a43b5c463399bee8f59dc530ec2
  Running setup.py bdist_wheel for pycparser ... done
  Stored in directory: /root/.cache/pip/wheels/95/14/9a/5e7b9024459d2a6600aaa64e0ba485325aff7a9ac7489db1b6
  Running setup.py bdist_wheel for MarkupSafe ... done
  Stored in directory: /root/.cache/pip/wheels/88/a7/30/e39a54a87bcbe25308fa3ca64e8ddc75d9b3e5afa21ee32d57
  Running setup.py bdist_wheel for python-openid ... done
  Stored in directory: /root/.cache/pip/wheels/0a/da/67/e9e68f4b5e03732dc17a545b4ce3ce84b4a9bef67253d4ff72
  Running setup.py bdist_wheel for Mako ... done
  Stored in directory: /root/.cache/pip/wheels/33/bf/8f/036f36c35e0e3c63a4685e306bce6b00b6349fec5b0947586e
  Running setup.py bdist_wheel for python-editor ... done
  Stored in directory: /root/.cache/pip/wheels/84/d6/b8/082dc3b5cd7763f17f5500a193b6b248102217cbaa3f0a24ca
  Running setup.py bdist_wheel for backports.ssl-match-hostname ... done
  Stored in directory: /root/.cache/pip/wheels/5d/72/36/b2a31507b613967b728edc33378a5ff2ada0f62855b93c5ae1
Successfully built superset cryptography flask-appbuilder flask-cache flask-migrate flask-script flask-sqlalchemy flask-testing flower future humanize markdown PyHive pyyaml simplejson sqlalchemy sqlalchemy-utils thrift thrift-sasl billiard ipaddress itsdangerous Flask-Babel Flask-Login Flask-OpenID alembic WTForms tornado scandir sasl pycparser MarkupSafe python-openid Mako python-editor backports.ssl-match-hostname
Installing collected packages: jmespath, docutils, six, python-dateutil, botocore, futures, s3transfer, boto3, vine, amqp, kombu, pytz, billiard, celery, colorama, idna, asn1crypto, enum34, ipaddress, pycparser, cffi, cryptography, MarkupSafe, Jinja2, Werkzeug, click, itsdangerous, flask, babel, Flask-Babel, Flask-Login, python-openid, Flask-OpenID, sqlalchemy, flask-sqlalchemy, WTForms, flask-wtf, flask-appbuilder, flask-cache, Mako, python-editor, alembic, flask-script, flask-migrate, flask-testing, backports.ssl-match-hostname, certifi, tornado, flower, future, humanize, gunicorn, markdown, numpy, pandas, parsedatetime, scandir, pathlib2, pydruid, PyHive, pyyaml, chardet, urllib3, requests, simplejson, sqlalchemy-utils, sqlparse, thrift, sasl, thrift-sasl, unidecode, superset
Successfully installed Flask-Babel-0.11.1 Flask-Login-0.2.11 Flask-OpenID-1.2.5 Jinja2-2.10 Mako-1.0.7 MarkupSafe-1.0 PyHive-0.5.0 WTForms-2.1 Werkzeug-0.14.1 alembic-0.9.7 amqp-2.2.2 asn1crypto-0.24.0 babel-2.5.3 backports.ssl-match-hostname-3.5.0.1 billiard-3.5.0.3 boto3-1.5.22 botocore-1.8.36 celery-4.1.0 certifi-2018.1.18 cffi-1.11.4 chardet-3.0.4 click-6.7 colorama-0.3.9 cryptography-1.9 docutils-0.14 enum34-1.1.6 flask-0.12.2 flask-appbuilder-1.9.4 flask-cache-0.13.1 flask-migrate-2.0.3 flask-script-2.0.5 flask-sqlalchemy-2.1 flask-testing-0.6.2 flask-wtf-0.14.2 flower-0.9.1 future-0.16.0 futures-3.2.0 gunicorn-19.7.1 humanize-0.5.1 idna-2.5 ipaddress-1.0.19 itsdangerous-0.24 jmespath-0.9.3 kombu-4.1.0 markdown-2.6.8 numpy-1.14.0 pandas-0.20.3 parsedatetime-2.0 pathlib2-2.3.0 pycparser-2.18 pydruid-0.3.1 python-dateutil-2.6.0 python-editor-1.0.3 python-openid-2.2.5 pytz-2017.3 pyyaml-3.12 requests-2.17.3 s3transfer-0.1.12 sasl-0.2.1 scandir-1.6 simplejson-3.10.0 six-1.10.0 sqlalchemy-1.1.9 sqlalchemy-utils-0.32.16 sqlparse-0.2.3 superset-0.22.1 thrift-0.11.0 thrift-sasl-0.3.0 tornado-4.2 unidecode-1.0.22 urllib3-1.21.1 vine-1.1.4

Create the Superset admin account and password
(venv) [root@server01 yum.repos.d]# fabmanager create-admin --app superset
Username [admin]: admin
User first name [admin]: langfeng
User last name [user]: bo
Email [admin@fab.org]:
Password:
Repeat for confirmation:
Recognized Database Authentications.
Admin User admin created.

Upgrade the database
(venv) [root@server01 yum.repos.d]# superset db upgrade
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 4e6a06bad7a8, Init
INFO  [alembic.runtime.migration] Running upgrade 4e6a06bad7a8 -> 5a7bad26f2a7, empty message
INFO  [alembic.runtime.migration] Running upgrade 5a7bad26f2a7 -> 1e2841a4128, empty message
INFO  [alembic.runtime.migration] Running upgrade 1e2841a4128 -> 2929af7925ed, TZ offsets in data sources
INFO  [alembic.runtime.migration] Running upgrade 2929af7925ed -> 289ce07647b, Add encrypted password field
INFO  [alembic.runtime.migration] Running upgrade 289ce07647b -> 1a48a5411020, adding slug to dash
INFO  [alembic.runtime.migration] Running upgrade 1a48a5411020 -> 315b3f4da9b0, adding log model
INFO  [alembic.runtime.migration] Running upgrade 315b3f4da9b0 -> 55179c7f25c7, sqla_descr
INFO  [alembic.runtime.migration] Running upgrade 55179c7f25c7 -> 12d55656cbca, is_featured
/etc/yum.repos.d/venv/lib/python2.7/site-packages/alembic/util/messaging.py:69: UserWarning: Skipping unsupported ALTER for creation of implicit constraint
  warnings.warn(msg)
INFO  [alembic.runtime.migration] Running upgrade 12d55656cbca -> 2591d77e9831, user_id
INFO  [alembic.runtime.migration] Running upgrade 2591d77e9831 -> 8e80a26a31db, empty message
INFO  [alembic.runtime.migration] Running upgrade 8e80a26a31db -> 7dbf98566af7, empty message
INFO  [alembic.runtime.migration] Running upgrade 7dbf98566af7 -> 43df8de3a5f4, empty message
INFO  [alembic.runtime.migration] Running upgrade 43df8de3a5f4 -> d827694c7555, css templates
INFO  [alembic.runtime.migration] Running upgrade d827694c7555 -> 430039611635, log more
INFO  [alembic.runtime.migration] Running upgrade 430039611635 -> 18e88e1cc004, making audit nullable
INFO  [alembic.runtime.migration] Running upgrade 18e88e1cc004 -> 836c0bf75904, cache_timeouts
INFO  [alembic.runtime.migration] Running upgrade 18e88e1cc004 -> a2d606a761d9, adding favstar model
INFO  [alembic.runtime.migration] Running upgrade a2d606a761d9, 836c0bf75904 -> d2424a248d63, empty message
INFO  [alembic.runtime.migration] Running upgrade d2424a248d63 -> 763d4b211ec9, fixing audit fk
INFO  [alembic.runtime.migration] Running upgrade d2424a248d63 -> 1d2ddd543133, log dt
INFO  [alembic.runtime.migration] Running upgrade 1d2ddd543133, 763d4b211ec9 -> fee7b758c130, empty message
INFO  [alembic.runtime.migration] Running upgrade fee7b758c130 -> 867bf4f117f9, Adding extra field to Database model
INFO  [alembic.runtime.migration] Running upgrade 867bf4f117f9 -> bb51420eaf83, add schema to table model
INFO  [alembic.runtime.migration] Running upgrade bb51420eaf83 -> b4456560d4f3, change_table_unique_constraint
INFO  [alembic.runtime.migration] Running upgrade b4456560d4f3 -> 4fa88fe24e94, owners_many_to_many
INFO  [alembic.runtime.migration] Running upgrade 4fa88fe24e94 -> c3a8f8611885, Materializing permission
INFO  [alembic.runtime.migration] Running upgrade c3a8f8611885 -> f0fbf6129e13, Adding verbose_name to tablecolumn
INFO  [alembic.runtime.migration] Running upgrade f0fbf6129e13 -> 956a063c52b3, adjusting key length
INFO  [alembic.runtime.migration] Running upgrade 956a063c52b3 -> 1226819ee0e3, Fix wrong constraint on table columns
WARNI [root] Could not find or drop constraint on `columns`
INFO  [alembic.runtime.migration] Running upgrade 1226819ee0e3 -> d8bc074f7aad, Add new field 'is_restricted' to SqlMetric and DruidMetric
INFO  [alembic.runtime.migration] Running upgrade d8bc074f7aad -> 27ae655e4247, Make creator owners
INFO  [alembic.runtime.migration] Running upgrade 27ae655e4247 -> 960c69cb1f5b, add dttm_format related fields in table_columns
INFO  [alembic.runtime.migration] Running upgrade 960c69cb1f5b -> f162a1dea4c4, d3format_by_metric
INFO  [alembic.runtime.migration] Running upgrade f162a1dea4c4 -> ad82a75afd82, Update models to support storing the queries.
INFO  [alembic.runtime.migration] Running upgrade ad82a75afd82 -> 3c3ffe173e4f, add_sql_string_to_table
INFO  [alembic.runtime.migration] Running upgrade 3c3ffe173e4f -> 41f6a59a61f2, database options for sql lab
INFO  [alembic.runtime.migration] Running upgrade 41f6a59a61f2 -> 4500485bde7d, allow_run_sync_async
INFO  [alembic.runtime.migration] Running upgrade 4500485bde7d -> 65903709c321, allow_dml
INFO  [alembic.runtime.migration] Running upgrade 41f6a59a61f2 -> 33d996bcc382
INFO  [alembic.runtime.migration] Running upgrade 33d996bcc382, 65903709c321 -> b347b202819b, empty message
INFO  [alembic.runtime.migration] Running upgrade b347b202819b -> 5e4a03ef0bf0, Add access_request table to manage requests to access datastores.
INFO  [alembic.runtime.migration] Running upgrade 5e4a03ef0bf0 -> eca4694defa7, sqllab_setting_defaults
INFO  [alembic.runtime.migration] Running upgrade eca4694defa7 -> ab3d66c4246e, add_cache_timeout_to_druid_cluster
INFO  [alembic.runtime.migration] Running upgrade eca4694defa7 -> 3b626e2a6783, Sync DB with the models.py.
WARNI [root] No such constraint: 'slices_ibfk_1'
WARNI [root] Constraint must have a name
WARNI [root] No such index: 'table_name'
INFO  [alembic.runtime.migration] Running upgrade 3b626e2a6783, ab3d66c4246e -> ef8843b41dac, empty message
INFO  [alembic.runtime.migration] Running upgrade ef8843b41dac -> b46fa1b0b39e, Add json_metadata to the tables table.
INFO  [alembic.runtime.migration] Running upgrade b46fa1b0b39e -> 7e3ddad2a00b, results_key to query
INFO  [alembic.runtime.migration] Running upgrade 7e3ddad2a00b -> ad4d656d92bc, Add avg() to default metrics
INFO  [alembic.runtime.migration] Running upgrade ad4d656d92bc -> c611f2b591b8, dim_spec
INFO  [alembic.runtime.migration] Running upgrade c611f2b591b8 -> e46f2d27a08e, materialize perms
INFO  [alembic.runtime.migration] Running upgrade e46f2d27a08e -> f1f2d4af5b90, Enable Filter Select
INFO  [alembic.runtime.migration] Running upgrade e46f2d27a08e -> 525c854f0005, log_this_plus
INFO  [alembic.runtime.migration] Running upgrade 525c854f0005, f1f2d4af5b90 -> 6414e83d82b7, empty message
INFO  [alembic.runtime.migration] Running upgrade 6414e83d82b7 -> 1296d28ec131, Adds params to the datasource (druid) table
INFO  [alembic.runtime.migration] Running upgrade 1296d28ec131 -> f18570e03440, Add index on the result key to the query table.
INFO  [alembic.runtime.migration] Running upgrade f18570e03440 -> bcf3126872fc, Add keyvalue table
INFO  [alembic.runtime.migration] Running upgrade f18570e03440 -> db0c65b146bd, update_slice_model_json
INFO  [alembic.runtime.migration] Running upgrade db0c65b146bd -> a99f2f7c195a, rewriting url from shortner with new format
INFO  [alembic.runtime.migration] Running upgrade a99f2f7c195a, bcf3126872fc -> d6db5a5cdb5d, empty message
INFO  [alembic.runtime.migration] Running upgrade d6db5a5cdb5d -> b318dfe5fb6c, adding verbose_name to druid column
INFO  [alembic.runtime.migration] Running upgrade d6db5a5cdb5d -> 732f1c06bcbf, add fetch values predicate
INFO  [alembic.runtime.migration] Running upgrade 732f1c06bcbf, b318dfe5fb6c -> ea033256294a, empty message
INFO  [alembic.runtime.migration] Running upgrade b318dfe5fb6c -> db527d8c4c78, Add verbose name to DruidCluster and Database
INFO  [alembic.runtime.migration] Running upgrade db527d8c4c78, ea033256294a -> 979c03af3341, empty message
INFO  [alembic.runtime.migration] Running upgrade 979c03af3341 -> a6c18f869a4e, query.start_running_time
INFO  [alembic.runtime.migration] Running upgrade a6c18f869a4e -> 2fcdcb35e487, saved_queries
INFO  [alembic.runtime.migration] Running upgrade 2fcdcb35e487 -> a65458420354, add_result_backend_time_logging
INFO  [alembic.runtime.migration] Running upgrade a65458420354 -> ca69c70ec99b, tracking_url
INFO  [alembic.runtime.migration] Running upgrade ca69c70ec99b -> a9c47e2c1547, add impersonate_user to dbs
INFO  [alembic.runtime.migration] Running upgrade ca69c70ec99b -> ddd6ebdd853b, annotations
INFO  [alembic.runtime.migration] Running upgrade a9c47e2c1547, ddd6ebdd853b -> d39b1e37131d, empty message
INFO  [alembic.runtime.migration] Running upgrade ca69c70ec99b -> 19a814813610, Adding metric warning_text
INFO  [alembic.runtime.migration] Running upgrade 19a814813610, a9c47e2c1547 -> 472d2f73dfd4, empty message
INFO  [alembic.runtime.migration] Running upgrade 472d2f73dfd4, d39b1e37131d -> f959a6652acd, empty message
INFO  [alembic.runtime.migration] Running upgrade f959a6652acd -> 4736ec66ce19, empty message
/etc/yum.repos.d/venv/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/base.py:1427: SAWarning: WARNING: SQL-parsed foreign key constraint '(u'datasource_name', u'datasources', u'datasource_name')' could not be located in PRAGMA foreign_keys for table metrics
  table_name
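
The SQLiteImpl lines above indicate that the metadata database is a SQLite file. As far as I understand Superset's default configuration, that file is superset.db inside the directory named by SUPERSET_HOME, falling back to ~/.superset. Treat that as an assumption and check your own configuration. A minimal sketch for locating the file, for instance to copy it before an upgrade; superset_db_path is a hypothetical helper name.

```shell
#!/bin/sh
# Print the assumed default location of the SQLite metadata database.
# Assumption: SUPERSET_HOME overrides the data directory, else ~/.superset.
# superset_db_path is a hypothetical helper name.
superset_db_path() {
  printf '%s/superset.db\n' "${SUPERSET_HOME:-$HOME/.superset}"
}
```

A backup could then be taken with something like `cp "$(superset_db_path)" /some/backup/dir/`, where the destination is whatever location suits your setup.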

Load the examples
(venv) [root@server01 yum.repos.d]# superset load_examples
Loading examples into <SQLA engine=u'sqlite:root/.superset/superset.db'>
Creating default CSS templates
Loading energy related dataset
Creating table [wb_health_population] reference
2018-01-27 17:30:07,570:INFO:root:Creating database reference
2018-01-27 17:30:07,615:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
Loading [World Bank's Health Nutrition and Population Stats]
Creating table [wb_health_population] reference
2018-01-27 17:30:21,899:INFO:root:Creating database reference
2018-01-27 17:30:22,004:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
Creating slices
Creating a World's Health Bank dashboard
Loading [Birth names]
Done loading table!
--------------------------------------------------------------------------------
Creating table [birth_names] reference
2018-01-27 17:30:25,343:INFO:root:Creating database reference
2018-01-27 17:30:25,378:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
Creating some slices
Creating a dashboard
Loading [Random time series data]
Done loading table!
--------------------------------------------------------------------------------
Creating table [random_time_series] reference
2018-01-27 17:30:26,121:INFO:root:Creating database reference
2018-01-27 17:30:26,152:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
Creating a slice
Loading [Random long/lat data]
Done loading table!
--------------------------------------------------------------------------------
Creating table reference
2018-01-27 17:30:33,725:INFO:root:Creating database reference
2018-01-27 17:30:33,779:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
Creating a slice
Loading [Country Map data]
Done loading table!
--------------------------------------------------------------------------------
Creating table reference
2018-01-27 17:30:33,881:INFO:root:Creating database reference
2018-01-27 17:30:33,916:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
Creating a slice
Loading [Multiformat time series]
Done loading table!
--------------------------------------------------------------------------------
Creating table [multiformat_time_series] reference
2018-01-27 17:30:34,028:INFO:root:Creating database reference
2018-01-27 17:30:34,060:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
Creating some slices
Loading [Misc Charts] dashboard
Creating the dashboard
Loading DECK.gl demo
Loading deck.gl dashboard
Creating Scatterplot slice
Creating Screen Grid slice
Creating Hex slice
Creating Grid slice
Creating a dashboard
Loading flights data
Done loading table!
Creating table [random_time_series] reference
2018-01-27 17:30:38,306:INFO:root:Creating database reference
2018-01-27 17:30:38,342:INFO:root:Database.get_sqla_engine(). Masked URL: sqlite:root/.superset/superset.db
  
Initialize Superset
(venv) [root@server01 yum.repos.d]# superset init
2018-01-27 17:31:05,313:INFO:root:Syncing role definition
2018-01-27 17:31:05,313:INFO:root:Creating database reference
2018-01-27 17:31:05,384:INFO:root:Syncing Admin perms
2018-01-27 17:31:05,488:INFO:root:Syncing Alpha perms
2018-01-27 17:31:05,850:INFO:root:Syncing Gamma perms
2018-01-27 17:31:06,211:INFO:root:Syncing granter perms
2018-01-27 17:31:06,503:INFO:root:Syncing sql_lab perms
2018-01-27 17:31:06,869:INFO:root:Fetching a set of all perms to lookup which ones are missing
2018-01-27 17:31:06,965:INFO:root:Creating missing datasource permissions.
2018-01-27 17:31:06,971:INFO:root:Creating missing database permissions.
2018-01-27 17:31:07,013:INFO:root:Creating missing metrics permissions
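
The two steps above (load_examples, then init) can be wrapped in a small helper so that a failure in the first stops the second. This is only a sketch: the function name run_superset_setup is made up for illustration, and it assumes the venv is already activated so that the superset command is on PATH.

```shell
# run_superset_setup (hypothetical helper name): run the post-install steps
# in the same order as this record, stopping at the first step that fails.
run_superset_setup() {
    for step in load_examples init; do
        echo ">>> superset $step"
        # Abort with a message on stderr if the subcommand returns non-zero.
        superset "$step" || { echo "step '$step' failed" >&2; return 1; }
    done
}

# Usage, inside the activated venv:
#   run_superset_setup
```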

Start Superset
(venv) [root@server01 yum.repos.d]# superset runserver
Starting server with command:
gunicorn -w 2 --timeout 60 -b  0.0.0.0:8088 --limit-request-line 0 --limit-request-field_size 0 superset:app

[2018-01-27 17:31:28 +0000] [42819] [INFO] Starting gunicorn 19.7.1
[2018-01-27 17:31:28 +0000] [42819] [INFO] Listening at: http://0.0.0.0:8088 (42819)
[2018-01-27 17:31:28 +0000] [42819] [INFO] Using worker: sync
[2018-01-27 17:31:28 +0000] [42824] [INFO] Booting worker with pid: 42824
[2018-01-27 17:31:28 +0000] [42825] [INFO] Booting worker with pid: 42825
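
As the output shows, superset runserver only prints and runs a gunicorn command line. The sketch below rebuilds that same command with the worker count, timeout, bind address and port as parameters, in case different values are needed. The variable names WORKERS, TIMEOUT, BIND_ADDR and PORT and the function name are assumptions for illustration; the defaults are the values from the log above.

```shell
# superset_gunicorn_cmd (hypothetical helper): print the gunicorn command line
# that `superset runserver` showed above, with overridable settings.
superset_gunicorn_cmd() {
    workers="${WORKERS:-2}"          # -w: number of worker processes
    timeout="${TIMEOUT:-60}"         # --timeout: worker timeout in seconds
    bind_addr="${BIND_ADDR:-0.0.0.0}"
    port="${PORT:-8088}"
    printf 'gunicorn -w %s --timeout %s -b %s:%s --limit-request-line 0 --limit-request-field_size 0 superset:app\n' \
        "$workers" "$timeout" "$bind_addr" "$port"
}

# Print the command. To start it in the background from the activated venv,
# something like this should work:
#   nohup $(superset_gunicorn_cmd) > superset.log 2>&1 &
superset_gunicorn_cmd
```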

Open the address in a browser and log in with the administrator account configured earlier
http://192.168.42.100:8088/
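
Before opening the browser, it can help to check from the command line that the server answers on that address. The helper below is a minimal sketch; the name wait_for_http is made up for illustration. It polls a URL with curl and treats any 2xx or 3xx status as "up", since the start page may redirect to the login page.

```shell
# wait_for_http URL [TRIES] (hypothetical helper): poll URL until it answers
# with a 2xx or 3xx status, trying once per second.
wait_for_http() {
    url="$1"
    tries="${2:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -s: silent, -o /dev/null: discard the body, -w: print only the status code
        code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")"
        case "$code" in
            2??|3??) echo "up (HTTP $code)"; return 0 ;;
        esac
        i=$((i + 1))
        sleep 1
    done
    echo "no answer from $url after $tries tries" >&2
    return 1
}

# Usage:
#   wait_for_http http://192.168.42.100:8088/
```

If the check succeeds on the server itself but the page does not load from another machine, a host firewall may be blocking port 8088. On a CentOS 7 host running firewalld, that would mean opening 8088/tcp; this is an assumption about the environment, not something recorded in this log.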


Reposted from: https://my.oschina.net/peakfang/blog/2877822