一、环境
- Python 3.8.10
- Node v16.16.0
- Windows 10 专业版 64 位操作系统
-
Superset 1.5
二、前端操作步骤
我们可以参照csv的导出方式,去扩展excel。以以下页面为例
2.1 确定导出csv的地方
2.1.1查找样式
发现有个chart-slice的类名,以这个关键词进行全局搜索发现Chart下有包含
2.1.2 找到渲染导出csv的button组件
进入chart组件可以看到关联组件SliceHeader
按照查找思路SliceHeader --> SliceHeaderControls,该文件下全局搜索CSV,可以确定csv的button组件导出位置
2.2 写入导出EXCEL组件
\superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx
2.2.1 添加导出菜单
目前可以看菜单按钮有两块
{this.props.slice.viz_type !== 'filter_box' &&
this.props.supersetCanCSV && (
<Menu.Item key={MENU_KEYS.EXPORT_CSV}>{t('Export CSV')}</Menu.Item>
)}
{this.props.slice.viz_type !== 'filter_box' &&
isFeatureEnabled(FeatureFlag.ALLOW_FULL_CSV_EXPORT) &&
this.props.supersetCanCSV &&
isTable && (
<Menu.Item key={MENU_KEYS.EXPORT_FULL_CSV}>
{t('Export full CSV')}
</Menu.Item>
)}
依葫芦画瓢,在后面新增两个excel导出按钮
{this.props.slice.viz_type !== 'filter_box' &&
this.props.supersetCanExcel && (
<Menu.Item key={MENU_KEYS.EXPORT_EXCEL}>{t('Export EXCEL')}</Menu.Item>
)}
{this.props.slice.viz_type !== 'filter_box' &&
isFeatureEnabled(FeatureFlag.ALLOW_FULL_EXCEL_EXPORT) &&
this.props.supersetCanExcel &&
isTable && (
<Menu.Item key={MENU_KEYS.EXPORT_FULL_EXCEL}>
{t('Export full EXCEL')}
</Menu.Item>
)}
2.2.2 添加菜单点击事件
在\superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx 中找到handleMenuClick,添加以下代码:
case MENU_KEYS.EXPORT_EXCEL:
// eslint-disable-next-line no-unused-expressions
this.props.exportExcel && this.props.exportExcel(this.props.slice.slice_id);
break;
case MENU_KEYS.EXPORT_FULL_CSV:
// eslint-disable-next-line no-unused-expressions
this.props.exportFullCSV &&
this.props.exportFullCSV(this.props.slice.slice_id);
break;
2.2.3 添加相应字段
此时应该有很多未定义报错,所以需要添加以下代码
找到MENU_KEYS,在末尾添加
EXPORT_FULL_CSV: 'export_full_csv',
EXPORT_FULL_EXCEL: 'export_full_excel',
找到\superset-1.5\superset-frontend\packages\superset-ui-core\src\utils\featureFlags.ts中的FeatureFlag,在末尾添加
ALLOW_FULL_EXCEL_EXPORT = 'ALLOW_FULL_EXCEL_EXPORT',
2.2.4 消除props不通过的校验
现在还有些问题是因为有些属性没有声明,props校验不通过,所以需要添加属性。目前有三个属性未添加supersetCanExcel、exportFullExcel、exportExcel。
-
\superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx
找到export interface SliceHeaderControlsProps,在末尾添加
exportExcel?: (sliceId: number) => void;
exportFullExcel?: (sliceId: number) => void;
supersetCanExcel?: boolean;
-
\superset-1.5\superset-frontend\src\dashboard\components\SliceHeader\index.tsx
找到const SliceHeader: FC<SliceHeaderProps>,在末尾添加
exportExcel = () => ({}),
exportFullExcel,
supersetCanExcel = false,
-
superset-1.5\superset-frontend\src\dashboard\containers\Chart.jsx
在supersetCanCSV: !!dashboardInfo.superset_can_csv, 后面新增
supersetCanExcel: !!dashboardInfo.superset_can_excel,
-
\superset-1.5\superset-frontend\src\dashboard\actions\hydrate.js
由于数据源来自于dashboardInfo,所以我们找到来源为hydrate.js,此时发现这个步骤牵扯到权限问题,目前还未实现,所以我们暂时写死
// superset_can_excel: findPermission('can_excel', 'Superset', roles),
superset_can_excel: true,
2.2.5 组件传值
在\superset-1.5\superset-frontend\src\dashboard\components\SliceHeader\index.tsx中找到
exportCSV={exportCSV}
exportFullCSV={exportFullCSV}
在后面添加
exportExcel={exportExcel}
exportFullExcel={exportFullExcel}
supersetCanExcel={supersetCanExcel}
2.3 添加Excel导出方法
2.3.1 \superset-1.5\superset-frontend\src\dashboard\components\gridComponents\Chart.jsx
现在需要新增导出excel方法,同样参照exportCSV添加以下代码
exportExcel(isFullExcel = false) {
this.props.logEvent(LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART, {
slice_id: this.props.slice.slice_id,
is_cached: this.props.isCached,
});
exportChart({
formData: isFullExcel
? { ...this.props.formData, row_limit: this.props.maxRows }
: this.props.formData,
// resultType: 'results',
resultType: 'full',
resultFormat: 'excel',
});
}
exportFullExcel() {
this.exportExcel(true);
}
2.3.2 \superset-1.5\superset-frontend\src\logger\LogUtils.ts
需要添加
export const LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART =
'export_excel_dashboard_chart';
2.4 修改exportChart相关导出方法
exportChart是主要的导出csv的方法
export const exportChart = ({
formData,
resultFormat = 'json',
resultType = 'full',
force = false,
ownState = {},
}) => {
let url;
let payload;
if (shouldUseLegacyApi(formData)) {
const endpointType = getLegacyEndpointType({ resultFormat, resultType });
url = getExploreUrl({
formData,
endpointType,
allowDomainSharding: false,
});
payload = formData;
} else {
url = '/api/v1/chart/data';
payload = buildV1ChartDataPayload({
formData,
force,
resultFormat,
resultType,
ownState,
});
}
console.log("url:", url, payload);
postForm(url, payload);
};
shouldUseLegacyApi这个函数主要是根据useLegacyApi来判断调用哪个接口,useLegacyApi是在图表注册时就已经确定好了。里面有两个函数,getLegacyEndpointType(生成最终类型的函数), getExploreUrl(生成导出url的函数)。
2.4.1 修改getLegacyEndpointType
export const getLegacyEndpointType = ({ resultType, resultFormat }) =>
(resultFormat === 'csv' || resultFormat === 'excel' ) ? resultFormat : resultType;
2.4.2 修改 getExploreUrl
找到
if (endpointType === 'csv') {
search.csv = 'true';
}
在后面新增
if(endpointType === 'excel') {
search.excel = 'true';
}
前端已经修改完毕,最终修改结果
三、后端操作步骤
此时点击export excel可以发现导出的连接未http://localhost:5000/superset/explore_json/?form_data=%7B%22slice_id%22%3A351%7D&excel=true
我们可以推测后台的实现方法和explore_json有关,所以在后端查找该方法。
\superset-1.5\superset\views\core.py
@api
@has_access_api
@handle_api_exception
@event_logger.log_this
@expose(
"/explore_json/<datasource_type>/<int:datasource_id>/",
methods=EXPLORE_JSON_METHODS,
)
@expose("/explore_json/", methods=EXPLORE_JSON_METHODS)
@etag_cache()
@check_resource_permissions(check_datasource_perms)
def explore_json(
self, datasource_type: Optional[str] = None, datasource_id: Optional[int] = None
) -> FlaskResponse:
"""Serves all request that GET or POST form_data
This endpoint evolved to be the entry point of many different
requests that GETs or POSTs a form_data.
`self.generate_json` receives this input and returns different
payloads based on the request args in the first block
TODO: break into one endpoint for each return shape"""
response_type = ChartDataResultFormat.JSON.value
responses: List[Union[ChartDataResultFormat, ChartDataResultType]] = list(
ChartDataResultFormat
)
responses.extend(list(ChartDataResultType))
for response_option in responses:
if request.args.get(response_option) == "true":
response_type = response_option
break
# Verify user has permission to export CSV file
if (
response_type == ChartDataResultFormat.CSV
and not security_manager.can_access("can_csv", "Superset")
):
return json_error_response(
_("You don't have the rights to ") + _("download as csv"),
status=403,
)
form_data = get_form_data()[0]
try:
datasource_id, datasource_type = get_datasource_info(
datasource_id, datasource_type, form_data
)
force = request.args.get("force") == "true"
# TODO: support CSV, SQL query and other non-JSON types
if (
is_feature_enabled("GLOBAL_ASYNC_QUERIES")
and response_type == ChartDataResultFormat.JSON
):
# First, look for the chart query results in the cache.
try:
viz_obj = get_viz(
datasource_type=cast(str, datasource_type),
datasource_id=datasource_id,
form_data=form_data,
force_cached=True,
force=force,
)
payload = viz_obj.get_payload()
# If the chart query has already been cached, return it immediately.
if payload is not None:
return self.send_data_payload_response(viz_obj, payload)
except CacheLoadError:
pass
# Otherwise, kick off a background job to run the chart query.
# Clients will either poll or be notified of query completion,
# at which point they will call the /explore_json/data/<cache_key>
# endpoint to retrieve the results.
try:
async_channel_id = async_query_manager.parse_jwt_from_request(
request
)["channel"]
job_metadata = async_query_manager.init_job(
async_channel_id, g.user.get_id()
)
load_explore_json_into_cache.delay(
job_metadata, form_data, response_type, force
)
except AsyncQueryTokenException:
return json_error_response("Not authorized", 401)
return json_success(json.dumps(job_metadata), status=202)
viz_obj = get_viz(
datasource_type=cast(str, datasource_type),
datasource_id=datasource_id,
form_data=form_data,
force=force,
)
return self.generate_json(viz_obj, response_type)
except SupersetException as ex:
return json_error_response(utils.error_msg_from_exception(ex), 400)
3.1 添加类型
根据上述代码,我们需要在ChartDataResultFormat里面新增一个excel的类型
\superset-1.5\superset\common\chart_data.py
class ChartDataResultFormat(str, Enum):
"""
Chart data response format
"""
CSV = "csv"
JSON = "json"
EXCEL = "excel" // 新增
3.2 实现generate_json
\superset-1.5\superset\views\core.py
找到最后return的地方,可以看到需要实现generate_json,在这个方法新增如下代码
if response_type == ChartDataResultFormat.EXCEL:
return ExcelResponse(
viz_obj.get_excel(), headers=generate_download_headers("xlsx")
)
3.3 实现ExcelResponse
\superset-1.5\superset\views\base.py
class ExcelResponse(Response): # pylint: disable=too-many-ancestors
"""
Override Response to take into account xlsx encoding from config.py
"""
charset = conf.get("EXCEL_EXPORT").get("encoding", "utf-8")
3.4 实现get_excel
\superset-1.5\superset\viz.py
在头部导入
import io
在get_csv后面新增如下代码
def get_excel(self) -> bytes:
data = io.BytesIO()
df = self.get_df_payload()["df"]
include_index = not isinstance(df.index, pd.RangeIndex)
df.to_excel(data, index=include_index, **config.get("EXCEL_EXPORT"))
data.seek(0)
return data.read()
3.5 修改config.py
\superset-1.5\superset\config.py
CSV_EXPORT = {"encoding": "utf-8"} 后面新增
EXCEL_EXPORT = {"encoding": "utf-8"}
3.6 下载openpyxl
此时调用会报错,原因是缺少openpyxl包,所以执行
pip install openpyxl
最后一个流程就大致完成了,我们可以在这个页面测试下
下载成功
四、总结
这是superset二次开发支持excel的初步探索,目前还有一些问题未解决。如下:
- Sqllab中点击.xlsx
点击.xlsx,需要实现对应的/superset/excel/${this.props.query.id}方法。可以参考
\superset-1.5\superset\views\core.py中的这段实现csv的方法
@has_access
@event_logger.log_this
@expose("/csv/<client_id>")
def csv( # pylint: disable=no-self-use,too-many-locals
self, client_id: str
) -> FlaskResponse:
"""Download the query results as csv."""
logger.info("Exporting CSV file [%s]", client_id)
query = db.session.query(Query).filter_by(client_id=client_id).one()
try:
query.raise_for_access()
except SupersetSecurityException as ex:
flash(ex.error.message)
return redirect("/")
blob = None
if results_backend and query.results_key:
logger.info("Fetching CSV from results backend [%s]", query.results_key)
blob = results_backend.get(query.results_key)
if blob:
logger.info("Decompressing")
payload = utils.zlib_decompress(
blob, decode=not results_backend_use_msgpack
)
obj = _deserialize_results_payload(
payload, query, cast(bool, results_backend_use_msgpack)
)
columns = [c["name"] for c in obj["columns"]]
df = pd.DataFrame.from_records(obj["data"], columns=columns)
logger.info("Using pandas to convert to CSV")
else:
logger.info("Running a query to turn into CSV")
if query.select_sql:
sql = query.select_sql
limit = None
else:
sql = query.executed_sql
limit = ParsedQuery(sql).limit
if limit is not None and query.limiting_factor in {
LimitingFactor.QUERY,
LimitingFactor.DROPDOWN,
LimitingFactor.QUERY_AND_DROPDOWN,
}:
# remove extra row from `increased_limit`
limit -= 1
df = query.database.get_df(sql, query.schema)[:limit]
csv_data = csv.df_to_escaped_csv(df, index=False, **config["CSV_EXPORT"])
quoted_csv_name = parse.quote(query.name)
response = CsvResponse(
csv_data, headers=generate_download_headers("csv", quoted_csv_name)
)
event_info = {
"event_type": "data_export",
"client_id": client_id,
"row_count": len(df.index),
"database": query.database.name,
"schema": query.schema,
"sql": query.sql,
"exported_format": "csv",
}
event_rep = repr(event_info)
logger.debug(
"CSV exported: %s", event_rep, extra={"superset_event": event_info}
)
return response
-
导出excel权限问题
其余的导出excel功能也可以参考这种思路去实现
参考:
注:若有错误,欢迎指出