sino风控数据整体流程梳理

风控服务的模块和对象

分类模块交付形式业务方/使用方核心场景
风控:
我司对客在
授信、调额、开户、充值
等方面的管控
风控database/query数据集mc数据分析:赵鹏、卢少烽客户线下标签dwd_admkt_risk_corporation_offline_tag_df
企查查数据ods_qcc_shell_scan_list_df...
客户认证bi风控策略:申秋语客户认证过程明细dwd_admkt_cust_corporation_auth_df
dwd_admkt_cust_guarantee_auth_df
dwd_admkt_cust_corporation_final_auth_df
ads_admkt_cust_corporation_final_auth_df
oneidbi、接口风控信审:赵鹏、xray、风控系统客户唯一聚合IDads_admkt_cust_pub_oneid_corporation,dim_admkt_cust_oneid_corporation_pair_snap,ads_admkt_oneid_corporation_spend_df
风控大盘bi风控信审:赵鹏、王琳客户综合评分dwd_admkt_risk_corporation_offline_tag_df
,ads_admkt_risk_corporation_base_df,ads_admkt_risk_corporation_performance_df,ads_admkt_risk_corporation_tag_df
ads_admkt_risk_corporation_synthesize_score_df,ads_admkt_risk_corporation_relation_score_df
风控系统接口风控系统:张洁、张帅客户风险提示标签ads_admkt_risk_corporation_risk_tag_df,ads_admkt_risk_spend_account_df
xray客户评级接口媒资:程晓君、高亚召fb客户评级ads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df
gg客户评级ads_admkt_adver_violation_act_disab_gg_account_corporation_df
tt客户评级ads_admkt_risk_tt_adver_rpt
ads_admkt_risk_xray_corporation_level_tt_account_df
ads_admkt_risk_xray_corporation_level_corporation_df
ads_admkt_risk_xray_corporation_level_adver_df
真金客户报表bi风控策略:何雨瑶、真金广告、客户报表ads_admkt_risk_channel_standard_corporation_violation_type_df
合规:
媒体对账户(媒体没有签约主体的概念)、媒体对我司(BM,MCC,BC等)
在媒体政策和整体违规率水平
的要求
xray广告素材审核接口媒体合规:周玉珊
数字化工程:王佳佳、陈思捷
fb广告素材审核ads_pub_fb_adinsights_larr_drr_spend
gg广告素材审核ads_admkt_risk_xray_review_ad_material_di,ads_admkt_risk_xray_review_ad_account_corporation_df
tt广告素材审核ads_admkt_adver_pub_multi_ad_status,ads_admkt_risk_ad_cluster_violation_type_score
fb/gg/tt3大渠道合规报表bi风控策略:何雨瑶、申秋语fb合规报表ads_admkt_risk_violation_channel_fb_account_df,ads_admkt_risk_violation_channel_corporation_df,
gg合规报表dwd_admkt_adver_act_disab_gg_account_df,ads_admkt_adver_violation_act_disab_gg_account_corporation_df
tt合规报表ads_admkt_risk_tt_adver_rpt
开A跑Bbi媒体合规:周玉珊fbdwd_admkt_adver_account_link_df 全渠道全量账户开户链接
dwd_admkt_adver_ad_link_df 3大渠道有消耗的广告投放链接
dwd_admkt_risk_act_ad_link_compare_df fb/gg开A跑B标签
ads_admkt_risk_gg_act_ad_link_compare_df GG开A跑B应用表
ads_admkt_risk_fb_act_ad_link_compare_df FB开A跑B应用表
gg
其他蚂蚁金融合作接口风控金融:梅君城ods_risk_external_policy_info
ods_risk_external_corporation_audit_result_info
ads_admkt_risk_external_corporation_order_info_df
ads_admkt_risk_external_bill_invoice_info_df
ads_admkt_risk_external_corporation_abolish_invoice_info_df
ads_admkt_risk_external_invoice_payment_df
风控sinoclick朱佳慧ads_admkt_risk_violation_fb_ad_df
预付资损风控资管:徐静、吴极之前的文档:
案件汇报链路: 案件汇报链路
预付问题说明: Mediabuy预付问题说明

技术配合梳理的文档:
1,第一批(深圳翊家,蓝色跳动)to 检察院
第一批预付咨损证据(to 检察院)
2,第三批第一波
第三批第一波预付咨损证据
3,第三批第二波
第三批第二波预付咨损证据

合规管理的3个阶段

1、投前审核。(数据配合媒资系统开户审批)
2、投中审查。(算法和数据一起检测违规)
3、投后劝退。(风控部劝退屡教不改客户)
暂时无法在飞书文档外展示此内容

指标维度总线矩阵

维度业务用数
媒体业务过程指标指标释义数据时效广告账户签约主体广告主BM风控报表客户评级素材审核开A跑B风控系统真金开户审核
1d7d28d30d1d7d28d30d1d7d28d30d12mhis1d7d28d30dhis1d7d28d风控日常媒体违规率运营合规客户违规率运营、管控fb客户评级gg客户评级tt客户评级fb素材审核gg素材审核tt素材审核fb-AB名单管控gg-AB名单管控授信申请授信调额风控额度管控风控日常消耗管控资管逾期管控开户申请审核
Meta广告违规广告违规消耗媒体判定的违规消耗T-4
总消耗广告总消耗
LARR违规消耗/总消耗
违规广告数媒体判定的违规广告数
有消耗广告数有消耗广告数
账户被封账户被封消耗账户被封消耗T-1
总消耗账户总消耗
DRR账户被封消耗/账户总消耗
被封账户数被封账户数
有消耗账户数有消耗账户数
GG账户被封被封账户数(lag)含30天内转出的被封账户数T-1
账户总数(lag)含30天内转出的总账户数
封户率(lag)含30天内转出的被封账户数/含30天内转出的总账户数
被封账户数(无lag)不含30天内转出的被封账户数
账户总数(无lag)不含30天内转出的总账户数
封户率(无lag)不含30天内转出的被封账户数/不含30天内转出的总账户数
新开被封账户数新开被封账户数
新开账户数新开账户数
新户封户率新开被封账户数/新开账户数
被封消耗(美金)被封消耗(美金)
消耗(美金)总消耗(美金)
被封消耗占比被封消耗(美金)/总消耗(美金)
TT账户被封账户总数账户总数T-2
被封账户数被封账户数
消耗总消耗
被封消耗被封消耗
新开账户数新开账户数
新开被封账户数新开被封账户数
新开账户消耗新开账户消耗
新开被封账户消耗新开被封账户消耗
Snap账号违规高危账户当日广告拒登率≥80%且昨日广告拒登率<80%T-1
广告违规违规广告数违规广告数
广告数广告数
广告拒登率违规广告数/广告数
全渠道
 
消耗
 
总消耗账单+api消耗T-1

 
消耗月数消耗月数
月均消耗月均消耗

余额
余额余额(剔除无上限账户)`
活跃账户余额活跃账户余额

一、风控

一、风控database/query数据集

风控database的技术方案: 风控database建设

二、风控大盘(指标+综合评分)

1,文档

1.产品需求文档: 第1批基础指标和报表

2,流程图

暂时无法在飞书文档外展示此内容
暂时无法在飞书文档外展示此内容
先把字段按基本信息、指标、标签来分,是基于应用层公共能力建设的考虑,对于这个需求 如果关联提供不满足时效要求,就在数仓先组装好 做个性化的ads表提供。
反思总结:1,需求频次太杂,比如第4第5次迭代中的 有关字段值格式的调整和加字段这类需求可以调研清楚之后合并起来做。可以减少开发、测试、上线、刷数等一系列繁复的流程。调格式 加字段这类的需求 业务价值并不很大,最好是业务产品核对清楚后 一次性解决上线。开发可以参与一些业务调研性的事情,给出合理建议。

三、风控系统

1,文档

2,流程图

暂时无法在飞书文档外展示此内容

四、xray-客户评级流程

1,文档

2,流程图

暂时无法在飞书文档外展示此内容

二,合规

一、FB违规指标加工流程

1,文档

1.产品需求文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=478478400
2.数仓技术文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=465895586

2,流程图

暂时无法在飞书文档外展示此内容

3,问题解释

-- fb官方名词解释:larr drr rad
分子1 违规广告消耗(rad)
分子2 被封账户消耗
分母 总消耗
larr = 分子1 / 分母(计算时分子分母的时间窗口必须保持一致)
drr = 分子2 / 分母
-------------------------------------------------------------------
LARR 是 update_date-27~update_date是因为媒体给的违规数据无法做到T-1的频率去更新(API抓不到很多拒登广告),所以根据以往我们对更新频率的观察给的这个时间区间;
DRR 是 t-28~t-1是因为分子的被封消耗我们API大部分能抓取到且能达到T-1的更新频率;
账户的drr in ('0','1',null)
larr>1?
select * -- 124个账户
from sino_dc_cdm.dws_admkt_adver_violation_fb_account_nd
where dt = '20230330'
and fb_larr_28d_stu > 1
为什么是近28天:媒体就用近28天的数据来考核
fb-BM介绍:https://zhuanlan.zhihu.com/p/575805295

二、GG违规指标加工流程

1,文档

2,流程图

暂时无法在飞书文档外展示此内容
通知(被封):ods_admkt_adver_gg_mcc_illegal_act_disab_df -> dwd_admkt_adver_act_disab_gg_account_df
子账号设置:ods_admkt_gg_mcc_subaccount_setup_report_df -> dwd_admkt_adver_act_gg_mcc_trans_scd
预算: ods_admkt_gg_mcc_budget_report_df
MCC每日全量账户信息表 ads_admkt_adver_violation_act_disab_gg_mcc_account_df(scd)
google被封账户指标明细表 ads_admkt_adver_violation_act_disab_gg_account(act_disab_gg)
google被封账户汇总表 ads_admkt_adver_violation_act_disab_gg_account_banned
google签约主体合规明细 ads_admkt_adver_violation_act_disab_gg_account_corporation_df

3,名词解释

我司2个大MCC:'Sinointeractive HK','深诺MCC'
木瓜:'GCC-Papaya' -- 20230817上线新增新增木瓜、赛文斯的逻辑,20231013木瓜账户全部转入我司
赛文斯:'GCC_s'
 
select account as mcc ,count(distinct customer_id) as act_cnt from sino_dc_ods.ods_admkt_gg_mcc_subaccount_setup_report_df where dt = '20240306' group by account;
mcc
act_cnt
GCC_s
430
Sinointeractive HK
60095
深诺MCC
38531

三、TT违规指标加工流程

1,文档

1.产品需求文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=431063130
http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=494371267
2.数仓技术文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=448692378

2,流程图

暂时无法在飞书文档外展示此内容

3,名词解释

四、xray-广告素材审核

Xray pre:https://pre-xray.meetsocial.cn/facebook/advertising_blacklist/advertising_sub#/ tt/gg 分组坑位:title,素材id,落地页

1,文档

1.产品需求文档: 素材违规数据需求
2.数仓技术文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=445808911

2,流程图

v1、现状(fb/tt):
暂时无法在飞书文档外展示此内容
fb现状流程:( 0617-FB素材审核数据缺失

思考/未来工作方向
1,tt广告素材审核,走算法,数仓,xray这个流程,ai先审一边再给人工审,应该是可以很大程度上提升审核效率。 2,业务系统不直接调用算法,数仓提供实时能力。
3,json结构不灵活,如果新增审核类型,算法 数仓 xray 都要调整处理json的代码。json格式如何调整。 xray可以直接用数仓的解析json的结果,不用重复解析。
v2、规划和方向():
暂时无法在飞书文档外展示此内容
gg:
暂时无法在飞书文档外展示此内容

3,xray相关表

 
任务名(xray-fb客户评级相关) ads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df ads_admkt_risk_xray_corporation_level_fb_account_disap_type_df ads_admkt_risk_xray_corporation_level_fb_ad_df 任务名(xray-tt客户评级相关) ai_oneid_client_group ads_admkt_risk_xray_corporation_level_tt_account_df ads_admkt_risk_xray_corporation_level_corporation_df ads_admkt_risk_xray_corporation_level_adver_df http_message_ads_admkt_risk_xray ads_admkt_risk_tt_adver_rpt XRAY-FB广告素材审核 ads_pub_fb_adinsights_larr_drr_spend XRAY-TT广告素材审核 ads_admkt_risk_ad_cluster_violation_type_score ads_admkt_risk_ad_violation_type_score ads_admkt_cust_pub_corporation_nd ads_admkt_adver_pub_multi_ad_status /api/dispatch/getAdMaterialList 获取广告 限制scheduleStartTime (限制20000) /api/dispatch/textViolationList + /api/dispatch/activisionViolationList 1970<ad_create_time<2029 /api/dispatch/getCorporationViolationsCount 累计违规次数做排序 channel_warn_cnt_std,channel_disabled_cnt_std,exposure 排序 xray-分组坑位规则:md5(title,素材id,落地页) XRAY-GG广告素材审核 ads_admkt_risk_xray_review_ad_material_di exp_ads_admkt_risk_xray_review_ad_material_di_olap ads_admkt_risk_xray_review_ad_account_corporation_df exp_ads_admkt_risk_xray_review_ad_account_corporation_df_olap http_message_ads_admkt_risk_xray_ad_material_review

4, 违规类型分析统计

三、风控相关接口梳理

业务流

暂时无法在飞书文档外展示此内容

接口详情

暂时无法在飞书文档外展示此内容
名称API PathappidSLA接口文档
媒资
签约主体分渠道未使用账户数olap_holoads_admkt_cust_unused_account_detail/api/dispatch/custom/category/corporation/unusedActCnt/list10:00签约主体分渠道未使用账户数_接口
风控系统
风险提示标签one_service->olap_holoads_admkt_risk_corporation_risk_tag_df/api/dispatch/risk/corporation/corporationRiskTag/list签约主体风险提示标签_接口
账户日消耗one_service->olap_holoads_admkt_risk_spend_account_df/api/dispatch/risk/account/accountSpend30d/list10:00近30天账户每日api消耗
账户属性列表one_servicedim_admkt_adver_account_corporation_opt/api/dispatch/industry/category/account/list广告账户属性列表
签约主体预警标签_接口digit_fastsino_dc_risk.risk_warning_label_application/api/dispatch/risk/corporation/corporationWarnTag/list签约主体预警标签_接口
蚂蚁02:00https://q6y68vu0j8.feishu.cn/drive/folder/fldcnIRbzA8zns1XkihIiG5GRef
蚂蚁-保单数据列表external->olap_holoods_risk_external_policy_info/api/dispatch/ant/guaranteeSlip/listant-group蚂蚁风险-保单数据列表
蚂蚁-客户申报列表external->olap_holoods_risk_external_corporation_audit_result_info/api/dispatch/ant/accountDeclaration/list蚂蚁风险-客户申报列表
蚂蚁-客户补充信息列表external->olap_holoads_admkt_risk_external_corporation_order_info_df/api/dispatch/ant/customerSupplInfo/list蚂蚁风险-客户补充信息列表
蚂蚁-账单回款列表external->olap_holoads_admkt_risk_external_bill_invoice_info_df/api/dispatch/ant/accountRefund/list蚂蚁风险-账单回款列表
蚂蚁-签约主体已作废发票列表external->olap_holoads_admkt_risk_external_corporation_abolish_invoice_info_df/api/dispatch/ant/corpAbolishInvoice/list蚂蚁风险-签约主体已作废发票列表
蚂蚁-发票还款明细列表olap_holoads_admkt_risk_external_invoice_payment_df/api/dispatch/ant/invoicePaymentItem/list蚂蚁风险-发票还款明细列表
xray/fb评级
FB渠道-每日违规消耗数据列表one_service->olap_holoads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df/api/dispatch/facebook/illegal/spend/Dailysino-dam10:00FB渠道-每日违规消耗数据列表
FB渠道-签约主体违规消耗topN广告账户列表one_service->olap_holoads_admkt_risk_xray_corporation_level_fb_account_disap_type_df/api/dispatch/facebook/top/illegal/spend/actListsino-dam10:00FB渠道-签约主体违规消耗topN广告账户列表
FB渠道-签约主体违规原因次数topN列表one_service->olap_holoads_admkt_risk_xray_corporation_level_fb_account_disap_type_df/api/dispatch/facebook/top/reviewsino-dam10:00FB渠道-签约主体违规原因次数topN列表
FB渠道-签约主体下违规消耗topN的广告列表one_service->olap_holoads_admkt_risk_xray_corporation_level_fb_ad_df/api/dispatch/facebook/top/illegal/spend/adListsino-dam10:00FB渠道-签约主体下违规消耗topN的广告列表
oneid_签约主体间关系信息列表olap_holoads_admkt_oneid_corporation_spend_df/api/dispatch/corporation/cluster/source/target/relation/Listoneid_签约主体间关系信息列表
签约主体oneid关系查询one_service->olap_holoai_oneid_client_group/api/dispatch/corporation/cluster/relation/List签约主体oneid关系列表
xray/tt评级10:00https://q6y68vu0j8.feishu.cn/drive/folder/BMBafXr3HlusBXdCF53c4cm3nKh?from=space_personal_filelist
TT渠道-被封账户列表one_service->olap_holoads_admkt_risk_xray_corporation_level_tt_account_df/api/dispatch/risk/account/ttActDisab/listTT渠道-被封账户列表
FB渠道-广告违规明细数据列表one_service->olap_holoads_admkt_adver_violation_ad_disap_fb_ad_df/api/dispatch/risk/ad/fbAdDisap/listFB渠道-广告违规明细数据列表
FB渠道-账户被封明细数据列表one_service->olap_holoads_admkt_adver_violation_act_disab_fb_account_df/api/dispatch/risk/account/fbActDisab/listFB渠道-账户被封明细数据列表
签约主体客户评级指标数据列表one_service->olap_holoads_admkt_risk_xray_corporation_level_corporation_df/api/dispatch/risk/corporation/corporationLevelIndex/list签约主体客户评级指标数据列表
广告主客户评级指标数据列表one_service->olap_holoads_admkt_risk_xray_corporation_level_adver_df/api/dispatch/risk/adver/corporationLevelIndex/list广告主客户评级指标数据列表
签约主体广告主客户评级指标数据列表one_service->olap_holoads_admkt_risk_tt_adver_rpt/api/dispatch/risk/corporationAdver/corporationLevelIndex/list签约主体广告主客户评级指标数据列表
xray/gg评级
GG渠道-签约主体违规指标列表digit_fastads_admkt_adver_violation_act_disab_gg_account_corporation_df/api/dispatch/risk/gg/corporation/corporationLevelIndex/listGG渠道-签约主体违规指标列表
GG渠道-账户被封明细数据digit_fastads_admkt_adver_violation_act_disab_gg_account/api/dispatch/risk/account/ggActDisab/listGG渠道-账户被封明细数据
签约主体历史指标接口one_service->olap_holoads_admkt_risk_corporation_month_cm/api/dispatch/historyIndicators
签约主体成效指标接口one_service->olap_holoads_admkt_risk_corporation_digital_rpt/api/dispatch/getCorporationDetail
签约主体基础信息接口one_service->digit_fastdim_admkt_cust_corporation/api/dispatch/getCorporationListsino-rm-report
素材审核
文字素材违规查询olap_holoads_admkt_risk_ad_violation_type_score/api/dispatch/textViolationListsinoclick,sino-xray文字素材违规查询
图片视频素材违规查询olap_holoads_admkt_risk_ad_cluster_violation_type_score/api/dispatch/activisionViolationListsinoclick,sino-xray图片视频素材违规查询
素材违规最新日期olap_holoads_admkt_risk_ad_violation_type_score/api/dispatch/violationMaxTimeDatasinoclickFB渠道素材违规最新日期
获取签约主体违规次数one_serviceads_admkt_cust_pub_corporation_nd/api/dispatch/getCorporationViolationsCountsino-xray,sino-ka获取签约主体违规信息聚合
获取tiktok广告素材数据one_service->olap_holoads_admkt_adver_pub_multi_ad_status/api/dispatch/getAdMaterialListsino-xray获取广告素材审查列表
素材审核-广告素材(无算法)列表digit_fastads_admkt_risk_xray_review_ad_material_di/api/dispatch/review/adMaterial/Listsino-xray素材审核-广告素材(无算法)列表
素材审核-广告账户签约主体(无算法)列表digit_fastads_admkt_risk_xray_review_ad_account_corporation_df/api/dispatch/review/adActCorp/Listsino-xray素材审核-广告账户签约主体(无算法)列表
阿里APIFB违规数据接口material_holo->olap_holoads_pub_fb_adinsights_larr_drr_spend/api/dispatch/alapi/facebook/adinsights/larr/arr/spend/listsino-xray素材审核-fb广告素材(老链路)列表
媒资开户https://q6y68vu0j8.feishu.cn/drive/folder/TmyXfA950ls3Q3d6dU5cApXEnpe
媒资开户-签约主体/广告主分渠道指标-列表digit_fastads_admkt_risk_dam_open_act_corporation_channel_df/api/dispatch/xray/dam/openAct/corpAdver/channel/Listsino-dam媒资开户签约主体/广告主分渠道指标-列表
媒资开户-开户链接跨账号/签约主体使用指标-列表digit_fastads_admkt_risk_dam_open_act_account_ad_url_df/api/dispatch/xray/dam/openAct/ActAdUrl/actCorpCnt/Listsino-dam媒资开户-开户链接跨账号/签约主体使用指标-列表
国家地区公共明细-列表digit_fastdim_admkt_pub_country/api/dispatch/pub/country/continents/Listsino-dam国家地区公共明细-列表
账户原始开户链接一级域名匹配列表digit_fastdwd_admkt_adver_account_link_df/api/dispatch/pub/account/actOriginParseUrl/Listsino-xray账户原始开户链接一级域名匹配列表

四、其他

一、余额预警

二、URL公共能力建设

暂时无法在飞书文档外展示此内容

  三.资产文档

   1)风控域数据字段,指标字典: 风控域指标资产
   2)风控数据的想法和思考
暂时无法在飞书文档外展示此内容

四、应急处理

1.场景1:爬虫爬取违规广告明细异常

  1. 20230720之前
爬虫爬取违规广告明细,如果当天拉取不到
1)通知xray先暂停拉数,等数据好了(爬虫-mc-holo常规流程跑完 验数)以后再开启。
2)通知xray先暂停拉数,如果当天爬虫没有具体交付时间,或者时间太晚。
和业务方沟通是否可以(昨天拉到的违规广告明细数据,作为今天的违规广告明细,继续往下跑。) 2),1 dwd_admkt_adver_ad_disap_fb_pcpi_ad_df设置强规则,执行下面语句后,在运维中心将任务置成功即可。 INSERT OVERWRITE TABLE sino_dc_cdm.dwd_admkt_adver_ad_disap_fb_pcpi_ad_df PARTITION (dt = '20230508') SELECT account_id ,ad_id ,disapproved_date ,disap_spend_usd ,review_feedback_type ,landing_url ,pcbm ,update_date ,dw_create_time FROM sino_dc_cdm.dwd_admkt_adver_ad_disap_fb_pcpi_ad_df WHERE dt = '20230507' ;
2),1 dwd_admkt_adver_ad_disap_fb_pcpi_ad_df设置弱规则,执行下面语句后,需手动按照上下游依赖手动调起任务。
  1. 20230720之后,增加兜底逻辑:当天数据异常用前一天爬取的数据。
正常情况下pcpi的违规广告明细2点开始获取,4点30获取完成。中间爬虫会有异常告警机制,出现问题会介入处理。
如果当天pcpi的违规广告明细, 如果当天8点钟还没拉下来,就用前一天爬到的数据 当做今天爬到的数据。(这种异常处理,代码层面做控制,无需人工手动处理)
如果连续2天都爬不到数据,需要人工介入处理。

2.场景2:违规广告明细正常获取,广告域fb数据延迟。

1,冻结推送3个推送holo的任务+1个发mq的任务。 exp_ads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df_holo exp_ads_admkt_risk_xray_corporation_level_fb_account_disap_type_df_holo
exp_ads_admkt_risk_xray_corporation_level_fb_ad_df_holo
send_rabbitmq_xray_corporation_level 2,广告域fb分时区后,计算评级指标的任务都是8点起,会依赖多个fb任务实例。 在运维中心几个关键指标表,手动操作-去除依赖, dws_admkt_adver_violation_fb_account_nd
dws_admkt_adver_violation_corporation_nd
dws_admkt_adver_violation_advertiser_nd 3,3个 ads_admkt_risk_xray_corporation_level_fb_* 表跑完后,验证下数据。 4,启动 推送3个推送holo的任务+1个发mq的任务

五、风控数据质量保障方案

六、flink

  1. 流程图

暂时无法在飞书文档外展示此内容
  1. 代码

  1. kafka_source

 
CREATE TEMPORARY TABLE `ods_amazon_listing_df_pre_source` ( `review_type` VARCHAR, `task_status` VARCHAR, `task_version` VARCHAR, `task_no` VARCHAR, `marketplace_id` VARCHAR, `seller_id` VARCHAR, `code` VARCHAR, `send_kafka_time` VARCHAR, `insert_time` VARCHAR, `task_type` VARCHAR, `api_only`VARCHAR, `asin` VARCHAR, `data` VARCHAR ) with ( 'connector' = 'kafka', 'properties.bootstrap.servers' = '${secret_values.kafka_servers}', 'properties.group.id' = 'datahub-amazon-listing-info-prod', 'topic' = 'datahub-amazon-listing-info', 'scan.topic-partition-discovery.interval' = '100000', 'properties.auto.offset.reset' = 'earliest', 'format' = 'json' ); --listing_sink_mc CREATE TEMPORARY TABLE `ods_amazon_listing_df_pre_sink_mc` ( task_version STRING, task_type STRING, category_browse_path_by_id STRING, category_browse_path_by_name STRING, main_image STRING, attached_images STRING, videos STRING, brand STRING, brand_store_link STRING, brand_store_text STRING, rating_star STRING, rating_star_rating STRING, price_discount STRING, price_list_price STRING , price STRING, amazon_choice STRING, coupon STRING, key_attributes STRING, desc_five STRING, bundle STRING, aplus STRING, brand_story STRING, product_info STRING, buyer_videos STRING, rating_5star STRING, rating_4star STRING, rating_3star STRING, rating_2star STRING, rating_1star STRING, deal STRING, best_seller STRING, new_release STRING, product_desc STRING, search_term STRING, `asin` STRING, pasin STRING, marketplace_id STRING, title STRING, api_only STRING, seller_id STRING, by_feature STRING, variants STRING, send_kafka_time STRING, insert_time STRING, dh_create_time STRING, attribution string, pt STRING ) WITH ( 'connector' = 'odps', 'endpoint' = 'http://service.cn-shenzhen.maxcompute.aliyun-inc.com/api', 'tunnelEndpoint' = 'http://dt.cn-shenzhen.maxcompute.aliyun-inc.com', 'project' = '${secret_values.mc_project_name}', 'tablename' = 'odsopr_amazon_listing_df', 'accessid' = '${secret_values.accessid}', 'accesskey' = '${secret_values.accesskey}', 'partition' = 'pt' ) ; --listing_sink_holo create TEMPORARY table `ods_amazon_listing_df_pre_sink_holo`( task_version VARCHAR, task_type int, category_browse_path_by_id VARCHAR, category_browse_path_by_name VARCHAR, main_image VARCHAR, attached_images VARCHAR, videos VARCHAR, brand VARCHAR, brand_store_link VARCHAR, brand_store_text VARCHAR, rating_star numeric(38,4), rating_star_rating bigint, price_discount VARCHAR, price_list_price VARCHAR, price VARCHAR, amazon_choice VARCHAR, coupon VARCHAR, key_attributes VARCHAR, desc_five VARCHAR, bundle VARCHAR, aplus VARCHAR, brand_story VARCHAR, product_info VARCHAR, buyer_videos VARCHAR, rating_5star VARCHAR, rating_4star VARCHAR, rating_3star VARCHAR, rating_2star VARCHAR, rating_1star VARCHAR, deal VARCHAR, best_seller VARCHAR, new_release VARCHAR, product_desc VARCHAR, search_term VARCHAR, `asin` VARCHAR, pasin VARCHAR, marketplace_id VARCHAR, title VARCHAR, api_only VARCHAR, seller_id VARCHAR, by_feature STRING, variants STRING, send_kafka_time VARCHAR, insert_time bigint, dh_create_time VARCHAR, attribution varchar ) with ( 'connector' = 'hologres', 'dbname' = '${secret_values.holo_dbname_datahub}', 'tablename' = 'ods.ods_amazon_listing_df', 'username' = '${secret_values.holo_username_datahub}', 'password' = '${secret_values.holo_password_datahub}', 'endpoint' = '${secret_values.holo_endpoint_datahub}', 'mutatetype' = 'insertorupdate', 'ignoredelete' = 'true' ); BEGIN STATEMENT SET; --listing insert into ods_amazon_listing_df_pre_sink_mc select task_version, task_type, JSON_VALUE(data, '$.category.browsePathById') as category_browse_path_by_id, JSON_VALUE(data, '$.category.browsePathByName') as category_browse_path_by_name, JSON_VALUE(data, '$.main_image') as main_image, JSON_VALUE(data, '$.attached_images') as attached_images, JSON_VALUE(data, '$.videos') as videos, JSON_VALUE(data, '$.brand') as brand, JSON_VALUE(data, '$.brand_store.link') as brand_store_link, JSON_VALUE(data, '$.brand_store.text') as brand_store_text, JSON_VALUE(data, '$.rating_star.star') as rating_star, JSON_VALUE(data, '$.rating_star.rating') as rating_star_rating, JSON_VALUE(data, '$.price.discount') as price_discount, JSON_VALUE(data, '$.price.list_price') as price_list_price, JSON_VALUE(data, '$.price.price') as price, JSON_VALUE(data, '$.amazon_choice') as amazon_choice, JSON_VALUE(data, '$.coupon') as coupon, JSON_VALUE(data, '$.key_attributes') as key_attributes, JSON_VALUE(data, '$.desc_five') as desc_five, JSON_VALUE(data, '$.bundle') as bundle, JSON_VALUE(data, '$.aplus') as aplus, JSON_VALUE(data, '$.brand_story') as brand_story, JSON_VALUE(data, '$.product_info') as product_info, JSON_VALUE(data, '$.buyer_videos') as buyer_videos, JSON_VALUE(data, '$.rating.5') as rating_5star, JSON_VALUE(data, '$.rating.4') as rating_4star, JSON_VALUE(data, '$.rating.3') as rating_3star, JSON_VALUE(data, '$.rating.2') as rating_2star, JSON_VALUE(data, '$.rating.1') as rating_1star, JSON_VALUE(data, '$.deal') as deal, JSON_VALUE(data, '$.best_seller') as best_seller, JSON_VALUE(data, '$.new_release') as new_release, JSON_VALUE(data, '$.product_desc') as product_desc, JSON_VALUE(data, '$.search_term') as search_term, `asin`, JSON_VALUE(data, '$.pasin ') as pasin, marketplace_id, JSON_VALUE(data, '$.title') as title, api_only, seller_id, JSON_VALUE(data,'$.by_feature') as by_feature, JSON_VALUE(data,'$.variants') as variants, send_kafka_time, insert_time, cast (LOCALTIMESTAMP as String) as dh_create_time, JSON_VALUE(data, '$.attribution') as attribution, DATE_FORMAT(LOCALTIMESTAMP, 'yyyyMMdd') as pt from ods_amazon_listing_df_pre_source where task_status = '1' ; insert into ods_amazon_listing_df_pre_sink_holo select task_version, cast(task_type as int) as task_type, JSON_VALUE(data, '$.category.browsePathById') as category_browse_path_by_id, JSON_VALUE(data, '$.category.browsePathByName') as category_browse_path_by_name, JSON_VALUE(data, '$.main_image') as main_image, JSON_VALUE(data, '$.attached_images') as attached_images, JSON_VALUE(data, '$.videos') as videos, JSON_VALUE(data, '$.brand') as brand, JSON_VALUE(data, '$.brand_store.link') as brand_store_link, JSON_VALUE(data, '$.brand_store.text') as brand_store_text, cast(JSON_VALUE(data, '$.rating_star.star') as numeric(38,4)) as rating_star, cast(JSON_VALUE(data, '$.rating_star.rating') as bigint) as rating_star_rating, JSON_VALUE(data, '$.price.discount') as price_discount, JSON_VALUE(data, '$.price.list_price') as price_list_price, JSON_VALUE(data, '$.price.price') as price, JSON_VALUE(data, '$.amazon_choice') as amazon_choice, JSON_VALUE(data, '$.coupon') as coupon, JSON_VALUE(data, '$.key_attributes') as key_attributes, JSON_VALUE(data, '$.desc_five') as desc_five, JSON_VALUE(data, '$.bundle') as bundle, JSON_VALUE(data, '$.aplus') as aplus, JSON_VALUE(data, '$.brand_story') as brand_story, JSON_VALUE(data, '$.product_info') as product_info, JSON_VALUE(data, '$.buyer_videos') as buyer_videos, JSON_VALUE(data, '$.rating.5') as rating_5star, JSON_VALUE(data, '$.rating.4') as rating_4star, JSON_VALUE(data, '$.rating.3') as rating_3star, JSON_VALUE(data, '$.rating.2') as rating_2star, JSON_VALUE(data, '$.rating.1') as rating_1star, JSON_VALUE(data, '$.deal') as deal, JSON_VALUE(data, '$.best_seller') as best_seller, JSON_VALUE(data, '$.new_release') as new_release, JSON_VALUE(data, '$.product_desc') as product_desc, JSON_VALUE(data, '$.search_term') as search_term, `asin`, JSON_VALUE(data, '$.pasin ') as pasin, marketplace_id, JSON_VALUE(data, '$.title') as title, api_only, COALESCE(seller_id,'') as seller_id, JSON_VALUE(data,'$.by_feature') as by_feature, JSON_VALUE(data,'$.variants') as variants, send_kafka_time, cast(insert_time as bigint) as insert_time, cast (LOCALTIMESTAMP as String) as dh_create_time, JSON_VALUE(data, '$.attribution') as attribution from ods_amazon_listing_df_pre_source where task_status = '1' ; end;
  1. mysql_source

 
CREATE temporary TABLE mysqlcdc_ods_amazon_sales_metrics_df_source ( `id` bigint not null, `create_time` TIMESTAMP(6), `update_time` TIMESTAMP(6), `seller_id` char(14), `marketplace_id` char(14), `identity` string, `identity_type` varchar(10), `granularity` varchar(10), `account_currency` varchar(32), `sale_value` decimal(20,4), `sale_cnt` int, `order_item_cnt` int, `order_cnt` int, `calendar_date` date, `interval_start_time_utc` timestamp, `interval_end_time_utc` timestamp, `interval_start_time` timestamp, `interval_end_time` timestamp, PRIMARY KEY (id) NOT ENFORCED ) WITH ( 'connector' = '' ,'hostname' = '' ,'password' = '' ,'port' = '' ,'username' = '' ,'database-name' = '' ,'table-name' = '' ) ; --主订单 CREATE TEMPORARY TABLE `mysqlcdc_ods_amazon_sales_metrics_df_mc_sink` ( id bigint, create_time string, update_time string, seller_id string, marketplace_id string, `identity` string, identity_type string, granularity string, account_currency string, sale_value decimal(38,4), sale_cnt bigint, order_item_cnt bigint, order_cnt bigint, calendar_date string, interval_start_time_utc string, interval_end_time_utc string, interval_start_time string, interval_end_time string, pt String ) WITH ( 'connector' = 'odps', 'endpoint' = 'http://service.cn-shenzhen.maxcompute.aliyun-inc.com/api', 'tunnelEndpoint' = 'http://dt.cn-shenzhen.maxcompute.aliyun-inc.com', 'project' = '${secret_values.mc_project_name}', 'tablename' = 'odsopr_amazon_sales_metrics_df', 'accessid' = '${secret_values.accessid}', 'accesskey' = '${secret_values.accesskey}', 'partition' = 'pt' ) ; CREATE TEMPORARY TABLE `mysqlcdc_ods_amazon_sales_metrics_df_holo_sink` ( id bigint, create_time timestamp, update_time timestamp, seller_id string, marketplace_id string, `identity` string, identity_type string, granularity string, account_currency string, sale_value decimal(38,4), sale_cnt bigint, order_item_cnt bigint, order_cnt bigint, calendar_date date, interval_start_time_utc timestamp, interval_end_time_utc timestamp, interval_start_time timestamp, interval_end_time timestamp, dw_create_time string ) WITH ( 'connector' = 'hologres', 'dbname' = '${secret_values.holo_dbname_datahub}', 'tablename' = 'ods.ods_amazon_sales_metrics_df', 'username' = '${secret_values.holo_username_datahub}', 'password' = '${secret_values.holo_password_datahub}', 'endpoint' = '${secret_values.holo_endpoint_datahub}', 'mutatetype' = 'insertorupdate', 'ignoredelete' = 'true' ) ; --子订单 BEGIN STATEMENT SET; --mc表 insert into mysqlcdc_ods_amazon_sales_metrics_df_mc_sink select id, cast(create_time as string) as create_time, cast(update_time as string) as update_time, cast(seller_id as string) as seller_id, cast(marketplace_id as string) as marketplace_id, cast(`identity` as string) as `identity`, cast(identity_type as string) as identity_type, cast(granularity as string) as granularity, cast(account_currency as string) as account_currency, cast(sale_value as decimal(38,4)) as sale_value, cast(sale_cnt as bigint) as sale_cnt, cast(order_item_cnt as bigint) as order_item_cnt, cast(order_cnt as bigint) as order_cnt, cast(calendar_date as string) as calendar_date, cast(interval_start_time_utc as string) as interval_start_time_utc, cast(interval_end_time_utc as string) as interval_end_time_utc, cast(interval_start_time as string) as interval_start_time, cast(interval_end_time as string) as interval_end_time, DATE_FORMAT(LOCALTIMESTAMP, 'yyyyMMdd') as pt from mysqlcdc_ods_amazon_sales_metrics_df_source ; --holo表 insert into mysqlcdc_ods_amazon_sales_metrics_df_holo_sink select id, cast(create_time as TIMESTAMP) as create_time, cast(update_time as TIMESTAMP) as update_time, cast(seller_id as string) as seller_id, cast(marketplace_id as string) as marketplace_id, cast(`identity` as string) as `identity`, cast(identity_type as string) as identity_type, cast(granularity as string) as granularity, cast(account_currency as string) as account_currency, cast(sale_value as numeric(38,4)) as sale_value, cast(sale_cnt as bigint) as sale_cnt, cast(order_item_cnt as bigint) as order_item_cnt, cast(order_cnt as bigint) as order_cnt, cast(calendar_date as date) as calendar_date, cast(interval_start_time_utc as TIMESTAMP) as interval_start_time_utc, cast(interval_end_time_utc as TIMESTAMP) as interval_end_time_utc, cast(interval_start_time as TIMESTAMP) as interval_start_time, cast(interval_end_time as TIMESTAMP) as interval_end_time, cast (substr(LOCALTIMESTAMP,1,19) as String) as dw_create_time from mysqlcdc_ods_amazon_sales_metrics_df_source ; end;

六、UDF

  1. 使用

 
-- 注意:需要整体执行。不运行set(内网+专线)直接运行函数,会报错 set odps.session.networklink=sz_data; -- 内网+专线 set odps.stage.reducer.num = 40; select sino_dc_cdm.LinkRecognizeUDF('https://media.meetsocial.com/gg_media/2156178354047647962.jpeg','dataworks');
 
select sino_dc_cdm.ContainsAllSubstrUDF('','play.google,itunes.apple') ; --<>[]表示子字符串中至少有一个在链接中可以找到
 
select sino_dc_cdm.SecretAESUDF('Gla6hPHj8MeK1gMTDk8ZVfcsZ4xaf1XRf3cpxSnsAOc=') ; SELECT corporation_id ,name ,mobile ,id_no ,bank_card_no ,sino_dc_cdm.SecretAESUDF(name ) as name_decryption ,sino_dc_cdm.SecretAESUDF(mobile ) as mobile_decryption ,sino_dc_cdm.SecretAESUDF(id_no ) as id_no_decryption ,sino_dc_cdm.SecretAESUDF(bank_card_no) as bank_card_no_decryption FROM sino_dc_ods.ods_crm_corporation_certify_data_df a WHERE a.dt = '${dt}' ;
  1. 代码

  1. LinkRecognizeUDF

 
import darabonba.core.RequestModel; import okhttp3.*; import org.apache.hadoop.hive.ql.exec.UDF; import org.eclipse.jetty.http.MetaData; import org.json.JSONObject; import java.io.IOException; import java.io.PrintWriter; import java.io.StringWriter; public class LinkRecognizeUDF extends UDF { public String evaluate(String strurl , String appid) throws IOException { if (strurl == null) { return null; } if (strurl.length() == 0) { return null; } return linkRecognize(strurl,appid); } public String linkRecognize(String strurl,String appid) throws IOException { OkHttpClient client = new OkHttpClient(); // 使用 HttpUrl.Builder 来构建 URL,添加 GET 参数 // https://o-test-sino-dc-stb.meetsocial.cn/ // https://n-pre-sino-dc-stb.meetsocial.cn/ // https://n-sino-dc-stb.meetsocial.cn/ HttpUrl.Builder urlBuilder = HttpUrl.parse("https://n-sino-dc-stb.meetsocial.cn/url/usabilityTest").newBuilder(); urlBuilder.addQueryParameter("url", strurl); String url = urlBuilder.build().toString(); // 构建 Request 对象 Request request = new Request.Builder() .url(url) .addHeader("X-SINO-APP-ID", appid) .build(); String responseStr = ""; Integer maxRetry = 2; Integer retryCount = 0; while (retryCount < maxRetry) { try { // 执行 GET 请求 Response response = client.newCall(request).execute(); if (response.isSuccessful()) { // 如果响应成功 String responseData = response.body().string(); // 获取响应体的字符串形式 responseStr = responseData; } else { responseStr = getResult("-1", response.code() + " " + response.message()); } // 关闭响应体 response.close(); break; } catch (IOException e) { e.printStackTrace(); responseStr = getResult("-2",getExceptionAsString(e)); } retryCount++; } return responseStr; } public static String getExceptionAsString(Throwable throwable) { StringWriter stringWriter = new StringWriter(); PrintWriter printWriter = new PrintWriter(stringWriter); throwable.printStackTrace(printWriter); printWriter.flush(); return stringWriter.toString(); } public static String getResult(String code, String message) { JSONObject jsonObject = new JSONObject(); jsonObject.put("code", code); jsonObject.put("message", message); return jsonObject.toString(); } public static void main(String[] args) throws IOException { System.out.println(new LinkRecognizeUDF().evaluate("https://meida-file.oss-cn-shenzhen.aliyuncs.com/857457769472428.mp4","dataworks")); } }
  1. ContainsAllSubstrUDF

 
import org.apache.hadoop.hive.ql.exec.UDF; import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; import java.util.HashSet; public class ContainsAllSubstrUDF extends UDF { public String evaluate(String str, String substrs) throws IOException { if (str == null) { return null; } if (str.length() == 0) { return null; } return ContainsAllSubstr(str,substrs); } public static String ContainsAllSubstr(String str, String substrs) { HashSet<String> set = new HashSet<String>(); for (String substr : substrs.split(",")) { if (!set.contains(substr)) { if (str.contains(substr)) { set.add(substr); } else { } } } return set.toString() ; } public static void main(String[] args) throws IOException { // System.out.println(ContainsAllSubstr("qwerty",new String[]{"s","w","t"})); System.out.println(new ContainsAllSubstrUDF().ContainsAllSubstr("www.flashcycling.apps.apple.com.amazon.com.com","apps.apple.com,amazon.com")); System.out.println(new ContainsAllSubstrUDF().evaluate("www.flashcycling.apps.apple.com.amazon.com.com","apps.apple.com,amazon.com")); } }
  1. SecretAESUDF

 
import org.apache.hadoop.hive.ql.exec.UDF; import javax.crypto.Cipher; import javax.crypto.spec.SecretKeySpec; import java.io.IOException; import java.util.Base64; public class SecretAESUDF extends UDF { public String evaluate(String str ) throws IOException { if (str == null) { return null; } if (str.length() == 0) { return null; } if (str.length() == 1) { return null; } return Secretaes(str); } public String Secretaes(String str) { String decryptedData = ""; try { String encryptedData = str ; String key = "RQpgBF22P4afl10D"; byte[] encryptedDataBytes = Base64.getDecoder().decode(encryptedData); Cipher cipher = Cipher.getInstance("AES"); SecretKeySpec secretKeySpec = new SecretKeySpec(key.getBytes(),"AES"); cipher.init(Cipher.DECRYPT_MODE,secretKeySpec); byte[] decryptedDataBytes = cipher.doFinal(encryptedDataBytes); String decryptedDatatmp = new String(decryptedDataBytes,"UTF-8"); decryptedData = decryptedDatatmp; } catch (Exception e) { e.printStackTrace(); } return decryptedData; } public static void main(String[] args) throws IOException { System.out.println(new SecretAESUDF().evaluate("jk1piCLmwPvayqW5KTsPwpO5YMp53+ZTwsnqhY8oVZU=")); } }
 
import com.aliyun.oss.ClientConfiguration; import com.aliyun.oss.OSS; import com.aliyun.oss.OSSClient; import com.aliyun.oss.common.comm.Protocol; import org.apache.hadoop.hive.ql.exec.UDF; import com.aliyun.oss.OSSClientBuilder; import java.net.URL; import java.util.Date; public class UrlSignatureUDF extends UDF { public String evaluate(String objectName,Long s) { if (objectName == null) { return null; } if (objectName.length() == 0) { return null; } return ChackStringOp(objectName,s); } //处理字符串 public static String ChackStringOp(String object_,Long ss) { // Endpoint,其它Region请按实际情况填写。 String endpoint = "oss-cn-shenzhen.aliyuncs.com"; // 阿里云账号AccessKey拥有所有API的访问权限,风险很高。建议创建并使用RAM用户进行API访问或日常运维,请登录RAM控制台创建RAM用户。 String accessKeyId = "LTAI5tMhKY7ZKa5aURZJiKs4"; String accessKeySecret = "L0ubasTpzlvEPPZTOzkm7M59JIbznD"; // 填写Bucket名称,例如examplebucket。 String bucketName = "meida-file"; //特殊处理一下 String objectName= object_.replace("%2F", "/"); String result = null; try { URL Url = new URL(objectName); String s = Url.getPath(); result = s; if (objectName.contains(".cn/")){ int index = objectName.indexOf(".cn/"); result=objectName.substring(index+4); } if (objectName.contains(".com/")){ int index = objectName.indexOf(".com/"); result=objectName.substring(index+5); } } catch (Exception e) { e.printStackTrace(); } //设置https请求头 ClientConfiguration clientConfiguration = new ClientConfiguration(); clientConfiguration.setProtocol(Protocol.HTTPS); // 创建OSSClient实例。 OSS ossClient = new OSSClient(endpoint, accessKeyId, accessKeySecret,clientConfiguration); String str = null ; try { // 设置URL过期时间为1小时。 Date expiration = new Date(System.currentTimeMillis() + ss * 1000); // 生成以GET方法访问的签名URL, URL url = ossClient.generatePresignedUrl(bucketName, result, expiration); ossClient.shutdown(); str = url.toString(); }catch (Exception ce) { System.out.println("Error Message:" + ce.getMessage()); } finally { if (ossClient != null) { //关闭连接 ossClient.shutdown(); } } return str; } public static void main(String[] args) { System.out.println(new UrlSignatureUDF().evaluate("http://media.meetsocial.cn/fb_video%2Fa2c0126e488562ef040bcb576290d03a40ed9720393668f5adeb59814f68b468.mp4",Long.valueOf(100))); } }
 
import org.apache.hadoop.hive.ql.exec.UDF; import java.text.SimpleDateFormat; import java.util.Date; public class UDFDatestamp2Date extends UDF { public String evaluate(String s) { if (s == null) { return null ; } if (s.length() == 0 ) { return null ; } return timeStamp2Date(s); } public String timeStamp2Date (String arg) { if (arg.length()==10) { SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); String date = sdf.format(new Date(Long.valueOf(arg + "000"))); return date; } if (arg.length()==13) { SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); String date = sdf.format(new Date(Long.valueOf(arg))); return date; } return null ; } // public static void main(String[] args) { // System.out.println( new UDFDatestamp2Date().evaluate("1656777604")); // System.out.println( new UDFDatestamp2Date().evaluate("1656111111111")); // } }
 
-- py 数据导出 -- mc_export_data """ -*- encoding:utf-8 -*- ========================== author: @time: 2021/12/01 12:12:21 Project:BigData ========================== """ from datetime import datetime, timedelta import pandas as pd from odps import ODPS maxCompute_database = { "access_id": "LTAI5tMhKY7ZKa5aURZJiKs4", "secret_access_key": "L0ubasTpzlvEPPZTOzkm7M59JIbznD", "endpoint": "http://service.cn-shenzhen.maxcompute.aliyun.com/api", "project": "sino_dc_cdm_dev" } def mc_cdm_connect(sql): try: print("MaxCompute_数据库链接成功,稍等片刻!") tmp = [] o = ODPS(**maxCompute_database) with o.execute_sql(sql).open_reader() as reader: for read in reader: for res in read: tmp.append(list(res)) return tmp except ConnectionError as err: print('联系开发') def export_mc_data(mc_sql_script): get_mc_data_list = mc_cdm_connect(mc_sql_script) mc_data_list = [] mc_data_list_name = [] for mc in get_mc_data_list: mc_data_list.append(mc[1]) mc_data_list_name.append(mc[0]) count = mc_data_list_name.count(mc_data_list_name[0]) if count == 1: mc_name_num = len(mc_data_list_name) mc_list = [] for i in range(0, len(mc_data_list), mc_name_num): mc_list.append(mc_data_list[i:i + mc_name_num]) column_name = mc_data_list_name[0:mc_name_num] save_excel = pd.DataFrame(mc_list, columns=column_name) file_name = (datetime.now() + timedelta()).strftime("%Y.%m.%d_%H时%M分%S秒") save_excel.to_excel(f"./{file_name}.xlsx", index=False) print(f'文件:{file_name}.xlsx') elif count > 1: mc_name_num = mc_data_list_name.index(mc_data_list_name[0], 1, len(mc_data_list_name)) mc_list = [] for i in range(0, len(mc_data_list), mc_name_num): mc_list.append(mc_data_list[i:i + mc_name_num]) column_name = mc_data_list_name[0:mc_name_num] save_excel = pd.DataFrame(mc_list, columns=column_name) file_name = (datetime.now() + timedelta()).strftime("%Y.%m.%d_%H时%M分%S秒") save_excel.to_excel(f"./{file_name}.xlsx", index=False) print(f'文件:{file_name}.xlsx') else: print('逻辑有问题,请联系开发') sql_mc_export = """ SELECT * FROM sino_dc_cdm_dev.corp_link_20240925_3 aa """ if __name__ == '__main__': export_mc_data(sql_mc_export)

七、广告知识

归因设置

含义: 用户与广告互动(如点击、观看广告等)后,能够将用户的后继行动(如应用安装、购买或者其他转化行为)归因于该广告互动的时间段。广告组层级选择归因设置。
用途:影响如何评估广告活动的效果,以及如何分配营销预算给表现最佳的广告渠道。
种类
  • 点击归因:用户 点击了广告并采取了操作。
  • 浏览归因:虽然用户 看见广告后未点击广告,但也在统计时间窗内采取了操作。
  • 互动观看归因:用户 观看视频广告达 10 秒或广告播放进度达 97%(若广告总长不足 10 秒)后,在 1 天内执行了操作,就会计入互动观看转化次数.
例子:
假如归因窗口设置为7天点击广告XX,以如下表格为例,只有用户A、用户B的安装被归到10月号那一天的install转化指标中, 用户C的安装被归因到10月2号的广告YY投放中。
10月1号
10月2号
10月3号
10月4号
10月5号
10月6号
10月7号
10月8号
10月9号
用户A
点击广告XX
安装
用户B
点击广告XX
安装
用户C
点击广告XX
点击广告YY
安装
归因窗口种类
解释
1d_view
1 Day View-through, 一天展示归因 , 用户浏览广告但是未点击它,在接下来的24小时内进行了某种转化,那么这个转化就被归因于该广告的展示
展示归因窗口 对认识到 品牌曝光 转化有间接影响 的广告主来说非常重要
7d_view
7 Day View-through, 七天展示归因, 用户浏览广告但是未点击它,在接下来的7天内进行了某种转化,那么这个转化就被归因于该广告的展示
1d_click
点击后 1 天内归因,指的是在点击广告后 24 小时内发生的转化事件。它可以用于评估广告系列的 短期效果, 一般用于
  • 衡量 即时购物 广告系列的效果。
  • 比较不同 广告素材 的效果。
  • 优化 着陆页 的设计。
7d_click
点击后 7 天内归因,是在点击广告后 7 天(168 小时)内发生的转化事件。它可以用于评估广告系列的 短期效果
default
7d_click,1d_view

广告结构

第一层级:广告系列 campaign
广告系列顾名思义是整个广告的基础,在这一层操作中选择并确定你所需要 广告目标,开始并连接下一层,最终创建广告组。
第二层级:广告组
在广告组的操作中,首先需要选择并 创建受众(包括:地区、性别、年龄等其他选择条件);然后,设置本次广告的预算和 排期(如发布以及结束时间);接着就是设置广告竞价;最后连接下一层,创建广告。
第三层级:广告

fb采集加工链路

暂时无法在飞书文档外展示此内容

fb资料

  1. 1.Facebook广告账户申请【P2】
通过访问https://business.facebook.com/,您可以申请自己的Facebook广告账户。这是开始在Facebook平台上进
行广告投放的第一步。
  1. 2.Facebook中国官网【P2】
https://reachtheworldonfacebook.com/contact-us/,这是Facebook在中国的官方网站,无需翻墙即可访问。您可
以在该网站上获取最新的消息、成功案例和政策指南等,并建议订阅相关内容以保持更新。
  1. 3.Facebook官方认证代理商【P2】:
https://reachtheworldonfacebook.com/reseller/,如果您需要专业的广告服务支持,可以通过该网站找到
Facebook官方认证的代理商。
  1. 4.Facebook自学工具:Blueprint【P0】
https://reachtheworldonfacebook.com/facebook-blueprint/,Blueprint是Facebook提供的自学工具,为广告主提
供有关广告策略、目标受众和广告管理的教育和培训资源。//增加
https://www.facebook.com/business/learn/courses https://www.facebookblueprint.com/student/catalog
  1. 5.Facebook工具:跨境商机大数据平台【P2】
https://crossborderinsightsfinder.com/zh-cn/,该平台提供了跨境电商相关的数据洞察和市场分析,帮助广告主了
解和利用跨境商机。//暂无权限
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值