初步探索 superset新增导出excel功能

一、环境

  • Python 3.8.10
  • Node v16.16.0
  • Windows 10 专业版 64 位操作系统
  • Superset 1.5

二、前端操作步骤

我们可以参照csv的导出方式,去扩展excel。以以下页面为例

2.1 确定导出csv的地方

2.1.1查找样式

发现有个chart-slice的类名,以这个关键词进行全局搜索发现Chart下有包含

2.1.2 找到渲染导出csv的button组件

进入chart组件可以看到关联组件SliceHeader

按照查找思路SliceHeader --> SliceHeaderControls,该文件下全局搜索CSV,可以确定csv的button组件导出位置

2.2 写入导出EXCEL组件

\superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx

2.2.1 添加导出菜单

目前可以看菜单按钮有两块

{this.props.slice.viz_type !== 'filter_box' &&
    this.props.supersetCanCSV && (
    <Menu.Item key={MENU_KEYS.EXPORT_CSV}>{t('Export CSV')}</Menu.Item>
)}
          

{this.props.slice.viz_type !== 'filter_box' &&
  isFeatureEnabled(FeatureFlag.ALLOW_FULL_CSV_EXPORT) &&
  this.props.supersetCanCSV &&
  isTable && (
    <Menu.Item key={MENU_KEYS.EXPORT_FULL_CSV}>
      {t('Export full CSV')}
    </Menu.Item>
  )}

依葫芦画瓢,在后面新增两个excel导出按钮

{this.props.slice.viz_type !== 'filter_box' &&
  this.props.supersetCanExcel && (
    <Menu.Item key={MENU_KEYS.EXPORT_EXCEL}>{t('Export EXCEL')}</Menu.Item>
  )}
{this.props.slice.viz_type !== 'filter_box' &&
    isFeatureEnabled(FeatureFlag.ALLOW_FULL_EXCEL_EXPORT) &&
    this.props.supersetCanExcel &&
    isTable && (
      <Menu.Item key={MENU_KEYS.EXPORT_FULL_EXCEL}>
        {t('Export full EXCEL')}
      </Menu.Item>
  )}

2.2.2 添加菜单点击事件

在\superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx 中找到handleMenuClick,添加以下代码:

case MENU_KEYS.EXPORT_EXCEL:
  // eslint-disable-next-line no-unused-expressions
  this.props.exportExcel && this.props.exportExcel(this.props.slice.slice_id);
  break;
case MENU_KEYS.EXPORT_FULL_CSV:
// eslint-disable-next-line no-unused-expressions
    this.props.exportFullCSV &&
      this.props.exportFullCSV(this.props.slice.slice_id);
    break;

 2.2.3 添加相应字段

此时应该有很多未定义报错,所以需要添加以下代码

找到MENU_KEYS,在末尾添加

EXPORT_FULL_CSV: 'export_full_csv',
EXPORT_FULL_EXCEL: 'export_full_excel',

找到\superset-1.5\superset-frontend\packages\superset-ui-core\src\utils\featureFlags.ts中的FeatureFlag,在末尾添加

ALLOW_FULL_EXCEL_EXPORT = 'ALLOW_FULL_EXCEL_EXPORT',

2.2.4 消除props不通过的校验

现在还有些问题是因为有些属性没有声明,props校验不通过,所以需要添加属性。目前有三个属性未添加supersetCanExcel、exportFullExcel、exportExcel。

  • \superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx

找到export interface SliceHeaderControlsProps,在末尾添加

exportExcel?: (sliceId: number) => void;
exportFullExcel?: (sliceId: number) => void;
supersetCanExcel?: boolean;
  • \superset-1.5\superset-frontend\src\dashboard\components\SliceHeader\index.tsx

找到const SliceHeader: FC<SliceHeaderProps>,在末尾添加

exportExcel = () => ({}),
exportFullExcel,
supersetCanExcel = false,
  • superset-1.5\superset-frontend\src\dashboard\containers\Chart.jsx

在supersetCanCSV: !!dashboardInfo.superset_can_csv, 后面新增

supersetCanExcel: !!dashboardInfo.superset_can_excel,
  • \superset-1.5\superset-frontend\src\dashboard\actions\hydrate.js

由于数据源来自于dashboardInfo,所以我们找到来源为hydrate.js,此时发现这个步骤牵扯到权限问题,目前还未实现,所以我们暂时写死

// superset_can_excel: findPermission('can_excel', 'Superset', roles),
  superset_can_excel: true,

2.2.5 组件传值

在\superset-1.5\superset-frontend\src\dashboard\components\SliceHeader\index.tsx中找到

exportCSV={exportCSV}
exportFullCSV={exportFullCSV}

在后面添加

exportExcel={exportExcel}
exportFullExcel={exportFullExcel}
supersetCanExcel={supersetCanExcel}

2.3 添加Excel导出方法

2.3.1 \superset-1.5\superset-frontend\src\dashboard\components\gridComponents\Chart.jsx

现在需要新增导出excel方法,同样参照exportCSV添加以下代码

exportExcel(isFullExcel = false) {
    this.props.logEvent(LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART, {
      slice_id: this.props.slice.slice_id,
      is_cached: this.props.isCached,
    });
    exportChart({
      formData: isFullExcel
      ? { ...this.props.formData, row_limit: this.props.maxRows }
      : this.props.formData,
      // resultType: 'results',
      resultType: 'full',
      resultFormat: 'excel',
    });
}

  exportFullExcel() {
    this.exportExcel(true);
  }

2.3.2 \superset-1.5\superset-frontend\src\logger\LogUtils.ts

需要添加

  export const LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART =
  'export_excel_dashboard_chart'; 

2.4 修改exportChart相关导出方法

exportChart是主要的导出csv的方法

export const exportChart = ({
  formData,
  resultFormat = 'json',
  resultType = 'full',
  force = false,
  ownState = {},
}) => {
  let url;
  let payload;
  if (shouldUseLegacyApi(formData)) {
    const endpointType = getLegacyEndpointType({ resultFormat, resultType });
    url = getExploreUrl({
      formData,
      endpointType,
      allowDomainSharding: false,
    });
    payload = formData;
  } else {
    url = '/api/v1/chart/data';
    payload = buildV1ChartDataPayload({
      formData,
      force,
      resultFormat,
      resultType,
      ownState,
    });
  }
  console.log("url:", url, payload);
  postForm(url, payload);
};

shouldUseLegacyApi这个函数主要是根据useLegacyApi来判断调用哪个接口,useLegacyApi是在图表注册时就已经确定好了。里面有两个函数,getLegacyEndpointType(生成最终类型的函数), getExploreUrl(生成导出url的函数)。

2.4.1 修改getLegacyEndpointType

export const getLegacyEndpointType = ({ resultType, resultFormat }) =>
  (resultFormat === 'csv' || resultFormat === 'excel' ) ? resultFormat : resultType;

2.4.2 修改 getExploreUrl

找到

if (endpointType === 'csv') {
    search.csv = 'true';
  }

在后面新增

if(endpointType === 'excel') {
    search.excel = 'true';
}

前端已经修改完毕,最终修改结果

三、后端操作步骤

此时点击export excel可以发现导出的连接未http://localhost:5000/superset/explore_json/?form_data=%7B%22slice_id%22%3A351%7D&excel=true

我们可以推测后台的实现方法和explore_json有关,所以在后端查找该方法。

\superset-1.5\superset\views\core.py

@api
    @has_access_api
    @handle_api_exception
    @event_logger.log_this
    @expose(
        "/explore_json/<datasource_type>/<int:datasource_id>/",
        methods=EXPLORE_JSON_METHODS,
    )
    @expose("/explore_json/", methods=EXPLORE_JSON_METHODS)
    @etag_cache()
    @check_resource_permissions(check_datasource_perms)
    def explore_json(
        self, datasource_type: Optional[str] = None, datasource_id: Optional[int] = None
    ) -> FlaskResponse:
        """Serves all request that GET or POST form_data

        This endpoint evolved to be the entry point of many different
        requests that GETs or POSTs a form_data.

        `self.generate_json` receives this input and returns different
        payloads based on the request args in the first block

        TODO: break into one endpoint for each return shape"""

        response_type = ChartDataResultFormat.JSON.value
        responses: List[Union[ChartDataResultFormat, ChartDataResultType]] = list(
            ChartDataResultFormat
        )
        responses.extend(list(ChartDataResultType))
        for response_option in responses:
            if request.args.get(response_option) == "true":
                response_type = response_option
                break
        # Verify user has permission to export CSV file
        if (
            response_type == ChartDataResultFormat.CSV
            and not security_manager.can_access("can_csv", "Superset")
        ):
            return json_error_response(
                _("You don't have the rights to ") + _("download as csv"),
                status=403,
            )

        form_data = get_form_data()[0]

        try:
            datasource_id, datasource_type = get_datasource_info(
                datasource_id, datasource_type, form_data
            )

            force = request.args.get("force") == "true"

            # TODO: support CSV, SQL query and other non-JSON types
            if (
                is_feature_enabled("GLOBAL_ASYNC_QUERIES")
                and response_type == ChartDataResultFormat.JSON
            ):
                # First, look for the chart query results in the cache.
                try:
                    viz_obj = get_viz(
                        datasource_type=cast(str, datasource_type),
                        datasource_id=datasource_id,
                        form_data=form_data,
                        force_cached=True,
                        force=force,
                    )
                    payload = viz_obj.get_payload()

                    # If the chart query has already been cached, return it immediately.
                    if payload is not None:
                        return self.send_data_payload_response(viz_obj, payload)
                except CacheLoadError:
                    pass

                # Otherwise, kick off a background job to run the chart query.
                # Clients will either poll or be notified of query completion,
                # at which point they will call the /explore_json/data/<cache_key>
                # endpoint to retrieve the results.
                try:
                    async_channel_id = async_query_manager.parse_jwt_from_request(
                        request
                    )["channel"]
                    job_metadata = async_query_manager.init_job(
                        async_channel_id, g.user.get_id()
                    )
                    load_explore_json_into_cache.delay(
                        job_metadata, form_data, response_type, force
                    )
                except AsyncQueryTokenException:
                    return json_error_response("Not authorized", 401)

                return json_success(json.dumps(job_metadata), status=202)

            viz_obj = get_viz(
                datasource_type=cast(str, datasource_type),
                datasource_id=datasource_id,
                form_data=form_data,
                force=force,
            )

            return self.generate_json(viz_obj, response_type)
        except SupersetException as ex:
            return json_error_response(utils.error_msg_from_exception(ex), 400)

3.1 添加类型

根据上述代码,我们需要在ChartDataResultFormat里面新增一个excel的类型

\superset-1.5\superset\common\chart_data.py

class ChartDataResultFormat(str, Enum):
    """
    Chart data response format
    """

    CSV = "csv"
    JSON = "json"
    EXCEL = "excel" // 新增

3.2 实现generate_json

\superset-1.5\superset\views\core.py

找到最后return的地方,可以看到需要实现generate_json,在这个方法新增如下代码

if response_type == ChartDataResultFormat.EXCEL:
   return ExcelResponse(
      viz_obj.get_excel(), headers=generate_download_headers("xlsx")
   )

3.3 实现ExcelResponse

\superset-1.5\superset\views\base.py

class ExcelResponse(Response):  # pylint: disable=too-many-ancestors
    """
       Override Response to take into account xlsx encoding from config.py
    """

    charset = conf.get("EXCEL_EXPORT").get("encoding", "utf-8")

3.4 实现get_excel

\superset-1.5\superset\viz.py

在头部导入

import io

在get_csv后面新增如下代码

def get_excel(self) -> bytes:
    data = io.BytesIO()
    df = self.get_df_payload()["df"]
    include_index = not isinstance(df.index, pd.RangeIndex)
    df.to_excel(data, index=include_index, **config.get("EXCEL_EXPORT"))
    data.seek(0)
    return data.read()

3.5 修改config.py

\superset-1.5\superset\config.py

CSV_EXPORT = {"encoding": "utf-8"} 后面新增

EXCEL_EXPORT = {"encoding": "utf-8"}

3.6 下载openpyxl

此时调用会报错,原因是缺少openpyxl包,所以执行

pip install openpyxl

最后一个流程就大致完成了,我们可以在这个页面测试下

  下载成功

四、总结

这是superset二次开发支持excel的初步探索,目前还有一些问题未解决。如下:

  • Sqllab中点击.xlsx

 点击.xlsx,需要实现对应的/superset/excel/${this.props.query.id}方法。可以参考

\superset-1.5\superset\views\core.py中的这段实现csv的方法

@has_access
@event_logger.log_this
@expose("/csv/<client_id>")
def csv(  # pylint: disable=no-self-use,too-many-locals
    self, client_id: str
) -> FlaskResponse:
    """Download the query results as csv."""
    logger.info("Exporting CSV file [%s]", client_id)
    query = db.session.query(Query).filter_by(client_id=client_id).one()

    try:
        query.raise_for_access()
    except SupersetSecurityException as ex:
        flash(ex.error.message)
        return redirect("/")

    blob = None
    if results_backend and query.results_key:
        logger.info("Fetching CSV from results backend [%s]", query.results_key)
        blob = results_backend.get(query.results_key)
    if blob:
        logger.info("Decompressing")
        payload = utils.zlib_decompress(
            blob, decode=not results_backend_use_msgpack
        )
        obj = _deserialize_results_payload(
            payload, query, cast(bool, results_backend_use_msgpack)
        )
        columns = [c["name"] for c in obj["columns"]]
        df = pd.DataFrame.from_records(obj["data"], columns=columns)
        logger.info("Using pandas to convert to CSV")
    else:
        logger.info("Running a query to turn into CSV")
        if query.select_sql:
            sql = query.select_sql
            limit = None
        else:
            sql = query.executed_sql
            limit = ParsedQuery(sql).limit
        if limit is not None and query.limiting_factor in {
            LimitingFactor.QUERY,
            LimitingFactor.DROPDOWN,
            LimitingFactor.QUERY_AND_DROPDOWN,
        }:
            # remove extra row from `increased_limit`
            limit -= 1
        df = query.database.get_df(sql, query.schema)[:limit]

    csv_data = csv.df_to_escaped_csv(df, index=False, **config["CSV_EXPORT"])
    quoted_csv_name = parse.quote(query.name)
    response = CsvResponse(
        csv_data, headers=generate_download_headers("csv", quoted_csv_name)
    )
    event_info = {
        "event_type": "data_export",
        "client_id": client_id,
        "row_count": len(df.index),
        "database": query.database.name,
        "schema": query.schema,
        "sql": query.sql,
        "exported_format": "csv",
    }
    event_rep = repr(event_info)
    logger.debug(
        "CSV exported: %s", event_rep, extra={"superset_event": event_info}
    )
    return response

  • 导出excel权限问题

其余的导出excel功能也可以参考这种思路去实现

参考:

注:若有错误,欢迎指出

  • 0
    点赞
  • 5
    收藏
    觉得还不错? 一键收藏
  • 3
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值