风控服务的模块和对象
分类 | 模块 | 交付形式 | 业务方/使用方 | 核心场景 | |
风控: 我司对客在 授信、调额、开户、充值 等方面的管控 | 风控database/query数据集 | mc | 数据分析:赵鹏、卢少烽 | 客户线下标签 | dwd_admkt_risk_corporation_offline_tag_df |
企查查数据 | ods_qcc_shell_scan_list_df... | ||||
客户认证 | bi | 风控策略:申秋语 | 客户认证过程明细 | dwd_admkt_cust_corporation_auth_df dwd_admkt_cust_guarantee_auth_df dwd_admkt_cust_corporation_final_auth_df ads_admkt_cust_corporation_final_auth_df | |
oneid | bi、接口 | 风控信审:赵鹏、xray、风控系统 | 客户唯一聚合ID | ads_admkt_cust_pub_oneid_corporation,dim_admkt_cust_oneid_corporation_pair_snap,ads_admkt_oneid_corporation_spend_df | |
风控大盘 | bi | 风控信审:赵鹏、王琳 | 客户综合评分 | dwd_admkt_risk_corporation_offline_tag_df ,ads_admkt_risk_corporation_base_df,ads_admkt_risk_corporation_performance_df,ads_admkt_risk_corporation_tag_df ads_admkt_risk_corporation_synthesize_score_df,ads_admkt_risk_corporation_relation_score_df | |
风控系统 | 接口 | 风控系统:张洁、张帅 | 客户风险提示标签 | ads_admkt_risk_corporation_risk_tag_df,ads_admkt_risk_spend_account_df | |
xray客户评级 | 接口 | 媒资:程晓君、高亚召 | fb客户评级 | ads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df | |
gg客户评级 | ads_admkt_adver_violation_act_disab_gg_account_corporation_df | ||||
tt客户评级 | ads_admkt_risk_tt_adver_rpt ads_admkt_risk_xray_corporation_level_tt_account_df ads_admkt_risk_xray_corporation_level_corporation_df ads_admkt_risk_xray_corporation_level_adver_df | ||||
真金客户报表 | bi | 风控策略:何雨瑶、 | 真金广告、客户报表 | ads_admkt_risk_channel_standard_corporation_violation_type_df | |
合规: 媒体对账户(媒体没有签约主体的概念)、媒体对我司(BM,MCC,BC等) 在媒体政策和整体违规率水平 的要求 | xray广告素材审核 | 接口 | 媒体合规:周玉珊 数字化工程:王佳佳、陈思捷 | fb广告素材审核 | ads_pub_fb_adinsights_larr_drr_spend |
gg广告素材审核 | ads_admkt_risk_xray_review_ad_material_di,ads_admkt_risk_xray_review_ad_account_corporation_df | ||||
tt广告素材审核 | ads_admkt_adver_pub_multi_ad_status,ads_admkt_risk_ad_cluster_violation_type_score | ||||
fb/gg/tt3大渠道合规报表 | bi | 风控策略:何雨瑶、申秋语 | fb合规报表 | ads_admkt_risk_violation_channel_fb_account_df,ads_admkt_risk_violation_channel_corporation_df, | |
gg合规报表 | dwd_admkt_adver_act_disab_gg_account_df,ads_admkt_adver_violation_act_disab_gg_account_corporation_df | ||||
tt合规报表 | ads_admkt_risk_tt_adver_rpt | ||||
开A跑B | bi | 媒体合规:周玉珊 | fb | dwd_admkt_adver_account_link_df 全渠道全量账户开户链接 dwd_admkt_adver_ad_link_df 3大渠道有消耗的广告投放链接 dwd_admkt_risk_act_ad_link_compare_df fb/gg开A跑B标签 ads_admkt_risk_gg_act_ad_link_compare_df GG开A跑B应用表 ads_admkt_risk_fb_act_ad_link_compare_df FB开A跑B应用表 | |
gg | |||||
其他 | 蚂蚁金融合作 | 接口 | 风控金融:梅君城 | ods_risk_external_policy_info ods_risk_external_corporation_audit_result_info ads_admkt_risk_external_corporation_order_info_df ads_admkt_risk_external_bill_invoice_info_df ads_admkt_risk_external_corporation_abolish_invoice_info_df ads_admkt_risk_external_invoice_payment_df | |
风控sinoclick | 朱佳慧 | ads_admkt_risk_violation_fb_ad_df | |||
预付资损 | 风控资管:徐静、吴极 | 之前的文档: 案件汇报链路: 案件汇报链路 预付问题说明: Mediabuy预付问题说明 技术配合梳理的文档: 1,第一批(深圳翊家,蓝色跳动)to 检察院 第一批预付咨损证据(to 检察院) 2,第三批第一波 第三批第一波预付咨损证据 3,第三批第二波 第三批第二波预付咨损证据 |
合规管理的3个阶段
1、投前审核。(数据配合媒资系统开户审批)
2、投中审查。(算法和数据一起检测违规)
3、投后劝退。(风控部劝退屡教不改客户)
暂时无法在飞书文档外展示此内容
指标维度总线矩阵
维度 | 业务用数 | |||||||||||||||||||||||||||||||||||||||||
媒体 | 业务过程 | 指标 | 指标释义 | 数据时效 | 广告 | 账户 | 签约主体 | 广告主 | BM | 风控报表 | 客户评级 | 素材审核 | 开A跑B | 风控系统 | 真金 | 开户审核 | ||||||||||||||||||||||||||
1d | 7d | 28d | 30d | 1d | 7d | 28d | 30d | 1d | 7d | 28d | 30d | 12m | his | 1d | 7d | 28d | 30d | his | 1d | 7d | 28d | 风控日常媒体违规率运营 | 合规客户违规率运营、管控 | fb客户评级 | gg客户评级 | tt客户评级 | fb素材审核 | gg素材审核 | tt素材审核 | fb-AB名单管控 | gg-AB名单管控 | 授信申请 | 授信调额 | 风控额度管控 | 风控日常消耗管控 | 资管逾期管控 | 开户申请审核 | |||||
Meta | 广告违规 | 广告违规消耗 | 媒体判定的违规消耗 | T-4 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||
总消耗 | 广告总消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||
LARR | 违规消耗/总消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||
违规广告数 | 媒体判定的违规广告数 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||
有消耗广告数 | 有消耗广告数 | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
账户被封 | 账户被封消耗 | 账户被封消耗 | T-1 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||
总消耗 | 账户总消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||
DRR | 账户被封消耗/账户总消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||
被封账户数 | 被封账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||
有消耗账户数 | 有消耗账户数 | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
GG | 账户被封 | 被封账户数(lag) | 含30天内转出的被封账户数 | T-1 | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||
账户总数(lag) | 含30天内转出的总账户数 | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||||
封户率(lag) | 含30天内转出的被封账户数/含30天内转出的总账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
被封账户数(无lag) | 不含30天内转出的被封账户数 | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
账户总数(无lag) | 不含30天内转出的总账户数 | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
封户率(无lag) | 不含30天内转出的被封账户数/不含30天内转出的总账户数 | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
新开被封账户数 | 新开被封账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
新开账户数 | 新开账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||
新户封户率 | 新开被封账户数/新开账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
被封消耗(美金) | 被封消耗(美金) | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
消耗(美金) | 总消耗(美金) | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
被封消耗占比 | 被封消耗(美金)/总消耗(美金) | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
TT | 账户被封 | 账户总数 | 账户总数 | T-2 | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||
被封账户数 | 被封账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
消耗 | 总消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||
被封消耗 | 被封消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||
新开账户数 | 新开账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||
新开被封账户数 | 新开被封账户数 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||
新开账户消耗 | 新开账户消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||
新开被封账户消耗 | 新开被封账户消耗 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||
Snap | 账号违规 | 高危账户 | 当日广告拒登率≥80%且昨日广告拒登率<80% | T-1 | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
广告违规 | 违规广告数 | 违规广告数 | ✓ | ✓ | ✓ | |||||||||||||||||||||||||||||||||||||
广告数 | 广告数 | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
广告拒登率 | 违规广告数/广告数 | ✓ | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||||
全渠道 | 消耗 | 总消耗 | 账单+api消耗 | T-1 | ✓ | ✓ | ||||||||||||||||||||||||||||||||||||
消耗月数 | 消耗月数 | ✓ | ✓ | |||||||||||||||||||||||||||||||||||||||
月均消耗 | 月均消耗 | ✓ | ✓ | |||||||||||||||||||||||||||||||||||||||
余额 | 余额 | 余额(剔除无上限账户) | ✓ | ` | ✓ | |||||||||||||||||||||||||||||||||||||
活跃账户余额 | 活跃账户余额 | ✓ | ✓ |
一、风控
一、风控database/query数据集
风控database的技术方案:
风控database建设
二、风控大盘(指标+综合评分)
1,文档
1.产品需求文档:
第1批基础指标和报表
2.数仓技术文档:
风控大盘基础指标_一期
风控大盘基础指标_二期+客户综合评分
2,流程图
暂时无法在飞书文档外展示此内容
暂时无法在飞书文档外展示此内容
先把字段按基本信息、指标、标签来分,是基于应用层公共能力建设的考虑,对于这个需求 如果关联提供不满足时效要求,就在数仓先组装好 做个性化的ads表提供。
反思总结:1,需求频次太杂,比如第4第5次迭代中的 有关字段值格式的调整和加字段这类需求可以调研清楚之后合并起来做。可以减少开发、测试、上线、刷数等一系列繁复的流程。调格式 加字段这类的需求 业务价值并不很大,最好是业务产品核对清楚后 一次性解决上线。开发可以参与一些业务调研性的事情,给出合理建议。
三、风控系统
1,文档
2,流程图
暂时无法在飞书文档外展示此内容
四、xray-客户评级流程
1,文档
2,流程图
暂时无法在飞书文档外展示此内容
二,合规
一、FB违规指标加工流程
1,文档
1.产品需求文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=478478400
2.数仓技术文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=465895586
2,流程图
暂时无法在飞书文档外展示此内容
3,问题解释
-- fb官方名词解释:larr drr rad
分子1 违规广告消耗(rad)
分子2 被封账户消耗
分母 总消耗
larr = 分子1 / 分母(计算时分子分母的时间窗口必须保持一致)
drr = 分子2 / 分母
-------------------------------------------------------------------
LARR 是 update_date-27~update_date是因为媒体给的违规数据无法做到T-1的频率去更新(API抓不到很多拒登广告),所以根据以往我们对更新频率的观察给的这个时间区间;
DRR 是 t-28~t-1是因为分子的被封消耗我们API大部分能抓取到且能达到T-1的更新频率;
账户的drr in ('0','1',null)
larr>1?
select * -- 124个账户
from sino_dc_cdm.dws_admkt_adver_violation_fb_account_nd
where dt = '20230330'
and fb_larr_28d_stu > 1
为什么是近28天:媒体就用近28天的数据来考核
fb-BM介绍:https://zhuanlan.zhihu.com/p/575805295
二、GG违规指标加工流程
1,文档
1.产品需求文档:
Google 客户违规数据报表-按商务开放
2.数仓技术文档:
Google 客户违规数据(木瓜赛文斯)
2,流程图
暂时无法在飞书文档外展示此内容
通知(被封):ods_admkt_adver_gg_mcc_illegal_act_disab_df -> dwd_admkt_adver_act_disab_gg_account_df
子账号设置:ods_admkt_gg_mcc_subaccount_setup_report_df -> dwd_admkt_adver_act_gg_mcc_trans_scd
预算: ods_admkt_gg_mcc_budget_report_df
MCC每日全量账户信息表 ads_admkt_adver_violation_act_disab_gg_mcc_account_df(scd)
google被封账户指标明细表 ads_admkt_adver_violation_act_disab_gg_account(act_disab_gg)
google被封账户汇总表 ads_admkt_adver_violation_act_disab_gg_account_banned
google签约主体合规明细 ads_admkt_adver_violation_act_disab_gg_account_corporation_df
3,名词解释
我司2个大MCC:'Sinointeractive HK','深诺MCC'
木瓜:'GCC-Papaya' -- 20230817上线新增新增木瓜、赛文斯的逻辑,20231013木瓜账户全部转入我司
赛文斯:'GCC_s'
select account as mcc ,count(distinct customer_id) as act_cnt from sino_dc_ods.ods_admkt_gg_mcc_subaccount_setup_report_df where dt = '20240306' group by account;
mcc
|
act_cnt
|
GCC_s
|
430
|
Sinointeractive HK
|
60095
|
深诺MCC
|
38531
|
三、TT违规指标加工流程
1,文档
1.产品需求文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=431063130
http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=494371267
2.数仓技术文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=448692378
2,流程图
暂时无法在飞书文档外展示此内容
3,名词解释
四、xray-广告素材审核
Xray pre:https://pre-xray.meetsocial.cn/facebook/advertising_blacklist/advertising_sub#/ tt/gg 分组坑位:title,素材id,落地页
1,文档
1.产品需求文档:
素材违规数据需求
2.数仓技术文档:http://conf.meetsocial.cn:6688/pages/viewpage.action?pageId=445808911
3.接口文档:
文字素材违规查询
图片视频素材违规查询
2,流程图
v1、现状(fb/tt):
暂时无法在飞书文档外展示此内容
fb现状流程:(
0617-FB素材审核数据缺失)
思考/未来工作方向
1,tt广告素材审核,走算法,数仓,xray这个流程,ai先审一边再给人工审,应该是可以很大程度上提升审核效率。 2,业务系统不直接调用算法,数仓提供实时能力。
3,json结构不灵活,如果新增审核类型,算法 数仓 xray 都要调整处理json的代码。json格式如何调整。 xray可以直接用数仓的解析json的结果,不用重复解析。
v2、规划和方向():
暂时无法在飞书文档外展示此内容
gg:
暂时无法在飞书文档外展示此内容
3,xray相关表
任务名(xray-fb客户评级相关) ads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df ads_admkt_risk_xray_corporation_level_fb_account_disap_type_df ads_admkt_risk_xray_corporation_level_fb_ad_df 任务名(xray-tt客户评级相关) ai_oneid_client_group ads_admkt_risk_xray_corporation_level_tt_account_df ads_admkt_risk_xray_corporation_level_corporation_df ads_admkt_risk_xray_corporation_level_adver_df http_message_ads_admkt_risk_xray ads_admkt_risk_tt_adver_rpt XRAY-FB广告素材审核 ads_pub_fb_adinsights_larr_drr_spend XRAY-TT广告素材审核 ads_admkt_risk_ad_cluster_violation_type_score ads_admkt_risk_ad_violation_type_score ads_admkt_cust_pub_corporation_nd ads_admkt_adver_pub_multi_ad_status /api/dispatch/getAdMaterialList 获取广告 限制scheduleStartTime (限制20000) /api/dispatch/textViolationList + /api/dispatch/activisionViolationList 1970<ad_create_time<2029 /api/dispatch/getCorporationViolationsCount 累计违规次数做排序 channel_warn_cnt_std,channel_disabled_cnt_std,exposure 排序 xray-分组坑位规则:md5(title,素材id,落地页) XRAY-GG广告素材审核 ads_admkt_risk_xray_review_ad_material_di exp_ads_admkt_risk_xray_review_ad_material_di_olap ads_admkt_risk_xray_review_ad_account_corporation_df exp_ads_admkt_risk_xray_review_ad_account_corporation_df_olap http_message_ads_admkt_risk_xray_ad_material_review
4, 违规类型分析统计
三、风控相关接口梳理
业务流
暂时无法在飞书文档外展示此内容
接口详情
暂时无法在飞书文档外展示此内容
名称 | 表 | API Path | appid | SLA | 接口文档 | |
媒资 | ||||||
签约主体分渠道未使用账户数 | olap_holo | ads_admkt_cust_unused_account_detail | /api/dispatch/custom/category/corporation/unusedActCnt/list | 10:00 | 签约主体分渠道未使用账户数_接口 | |
风控系统 | ||||||
风险提示标签 | one_service->olap_holo | ads_admkt_risk_corporation_risk_tag_df | /api/dispatch/risk/corporation/corporationRiskTag/list | 签约主体风险提示标签_接口 | ||
账户日消耗 | one_service->olap_holo | ads_admkt_risk_spend_account_df | /api/dispatch/risk/account/accountSpend30d/list | 10:00 | 近30天账户每日api消耗 | |
账户属性列表 | one_service | dim_admkt_adver_account_corporation_opt | /api/dispatch/industry/category/account/list | 广告账户属性列表 | ||
签约主体预警标签_接口 | digit_fast | sino_dc_risk.risk_warning_label_application | /api/dispatch/risk/corporation/corporationWarnTag/list | 签约主体预警标签_接口 | ||
蚂蚁 | 02:00 | https://q6y68vu0j8.feishu.cn/drive/folder/fldcnIRbzA8zns1XkihIiG5GRef | ||||
蚂蚁-保单数据列表 | external->olap_holo | ods_risk_external_policy_info | /api/dispatch/ant/guaranteeSlip/list | ant-group | 蚂蚁风险-保单数据列表 | |
蚂蚁-客户申报列表 | external->olap_holo | ods_risk_external_corporation_audit_result_info | /api/dispatch/ant/accountDeclaration/list | 蚂蚁风险-客户申报列表 | ||
蚂蚁-客户补充信息列表 | external->olap_holo | ads_admkt_risk_external_corporation_order_info_df | /api/dispatch/ant/customerSupplInfo/list | 蚂蚁风险-客户补充信息列表 | ||
蚂蚁-账单回款列表 | external->olap_holo | ads_admkt_risk_external_bill_invoice_info_df | /api/dispatch/ant/accountRefund/list | 蚂蚁风险-账单回款列表 | ||
蚂蚁-签约主体已作废发票列表 | external->olap_holo | ads_admkt_risk_external_corporation_abolish_invoice_info_df | /api/dispatch/ant/corpAbolishInvoice/list | 蚂蚁风险-签约主体已作废发票列表 | ||
蚂蚁-发票还款明细列表 | olap_holo | ads_admkt_risk_external_invoice_payment_df | /api/dispatch/ant/invoicePaymentItem/list | 蚂蚁风险-发票还款明细列表 | ||
xray/fb评级 | ||||||
FB渠道-每日违规消耗数据列表 | one_service->olap_holo | ads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df | /api/dispatch/facebook/illegal/spend/Daily | sino-dam | 10:00 | FB渠道-每日违规消耗数据列表 |
FB渠道-签约主体违规消耗topN广告账户列表 | one_service->olap_holo | ads_admkt_risk_xray_corporation_level_fb_account_disap_type_df | /api/dispatch/facebook/top/illegal/spend/actList | sino-dam | 10:00 | FB渠道-签约主体违规消耗topN广告账户列表 |
FB渠道-签约主体违规原因次数topN列表 | one_service->olap_holo | ads_admkt_risk_xray_corporation_level_fb_account_disap_type_df | /api/dispatch/facebook/top/review | sino-dam | 10:00 | FB渠道-签约主体违规原因次数topN列表 |
FB渠道-签约主体下违规消耗topN的广告列表 | one_service->olap_holo | ads_admkt_risk_xray_corporation_level_fb_ad_df | /api/dispatch/facebook/top/illegal/spend/adList | sino-dam | 10:00 | FB渠道-签约主体下违规消耗topN的广告列表 |
oneid_签约主体间关系信息列表 | olap_holo | ads_admkt_oneid_corporation_spend_df | /api/dispatch/corporation/cluster/source/target/relation/List | oneid_签约主体间关系信息列表 | ||
签约主体oneid关系查询 | one_service->olap_holo | ai_oneid_client_group | /api/dispatch/corporation/cluster/relation/List | 签约主体oneid关系列表 | ||
xray/tt评级 | 10:00 | https://q6y68vu0j8.feishu.cn/drive/folder/BMBafXr3HlusBXdCF53c4cm3nKh?from=space_personal_filelist | ||||
TT渠道-被封账户列表 | one_service->olap_holo | ads_admkt_risk_xray_corporation_level_tt_account_df | /api/dispatch/risk/account/ttActDisab/list | TT渠道-被封账户列表 | ||
FB渠道-广告违规明细数据列表 | one_service->olap_holo | ads_admkt_adver_violation_ad_disap_fb_ad_df | /api/dispatch/risk/ad/fbAdDisap/list | FB渠道-广告违规明细数据列表 | ||
FB渠道-账户被封明细数据列表 | one_service->olap_holo | ads_admkt_adver_violation_act_disab_fb_account_df | /api/dispatch/risk/account/fbActDisab/list | FB渠道-账户被封明细数据列表 | ||
签约主体客户评级指标数据列表 | one_service->olap_holo | ads_admkt_risk_xray_corporation_level_corporation_df | /api/dispatch/risk/corporation/corporationLevelIndex/list | 签约主体客户评级指标数据列表 | ||
广告主客户评级指标数据列表 | one_service->olap_holo | ads_admkt_risk_xray_corporation_level_adver_df | /api/dispatch/risk/adver/corporationLevelIndex/list | 广告主客户评级指标数据列表 | ||
签约主体广告主客户评级指标数据列表 | one_service->olap_holo | ads_admkt_risk_tt_adver_rpt | /api/dispatch/risk/corporationAdver/corporationLevelIndex/list | 签约主体广告主客户评级指标数据列表 | ||
xray/gg评级 | ||||||
GG渠道-签约主体违规指标列表 | digit_fast | ads_admkt_adver_violation_act_disab_gg_account_corporation_df | /api/dispatch/risk/gg/corporation/corporationLevelIndex/list | GG渠道-签约主体违规指标列表 | ||
GG渠道-账户被封明细数据 | digit_fast | ads_admkt_adver_violation_act_disab_gg_account | /api/dispatch/risk/account/ggActDisab/list | GG渠道-账户被封明细数据 | ||
签约主体历史指标接口 | one_service->olap_holo | ads_admkt_risk_corporation_month_cm | /api/dispatch/historyIndicators | |||
签约主体成效指标接口 | one_service->olap_holo | ads_admkt_risk_corporation_digital_rpt | /api/dispatch/getCorporationDetail | |||
签约主体基础信息接口 | one_service->digit_fast | dim_admkt_cust_corporation | /api/dispatch/getCorporationList | sino-rm-report | ||
素材审核 | ||||||
文字素材违规查询 | olap_holo | ads_admkt_risk_ad_violation_type_score | /api/dispatch/textViolationList | sinoclick,sino-xray | 文字素材违规查询 | |
图片视频素材违规查询 | olap_holo | ads_admkt_risk_ad_cluster_violation_type_score | /api/dispatch/activisionViolationList | sinoclick,sino-xray | 图片视频素材违规查询 | |
素材违规最新日期 | olap_holo | ads_admkt_risk_ad_violation_type_score | /api/dispatch/violationMaxTimeData | sinoclick | FB渠道素材违规最新日期 | |
获取签约主体违规次数 | one_service | ads_admkt_cust_pub_corporation_nd | /api/dispatch/getCorporationViolationsCount | sino-xray,sino-ka | 获取签约主体违规信息聚合 | |
获取tiktok广告素材数据 | one_service->olap_holo | ads_admkt_adver_pub_multi_ad_status | /api/dispatch/getAdMaterialList | sino-xray | 获取广告素材审查列表 | |
素材审核-广告素材(无算法)列表 | digit_fast | ads_admkt_risk_xray_review_ad_material_di | /api/dispatch/review/adMaterial/List | sino-xray | 素材审核-广告素材(无算法)列表 | |
素材审核-广告账户签约主体(无算法)列表 | digit_fast | ads_admkt_risk_xray_review_ad_account_corporation_df | /api/dispatch/review/adActCorp/List | sino-xray | 素材审核-广告账户签约主体(无算法)列表 | |
阿里APIFB违规数据接口 | material_holo->olap_holo | ads_pub_fb_adinsights_larr_drr_spend | /api/dispatch/alapi/facebook/adinsights/larr/arr/spend/list | sino-xray | 素材审核-fb广告素材(老链路)列表 | |
媒资开户 | https://q6y68vu0j8.feishu.cn/drive/folder/TmyXfA950ls3Q3d6dU5cApXEnpe | |||||
媒资开户-签约主体/广告主分渠道指标-列表 | digit_fast | ads_admkt_risk_dam_open_act_corporation_channel_df | /api/dispatch/xray/dam/openAct/corpAdver/channel/List | sino-dam | 媒资开户签约主体/广告主分渠道指标-列表 | |
媒资开户-开户链接跨账号/签约主体使用指标-列表 | digit_fast | ads_admkt_risk_dam_open_act_account_ad_url_df | /api/dispatch/xray/dam/openAct/ActAdUrl/actCorpCnt/List | sino-dam | 媒资开户-开户链接跨账号/签约主体使用指标-列表 | |
国家地区公共明细-列表 | digit_fast | dim_admkt_pub_country | /api/dispatch/pub/country/continents/List | sino-dam | 国家地区公共明细-列表 | |
账户原始开户链接一级域名匹配列表 | digit_fast | dwd_admkt_adver_account_link_df | /api/dispatch/pub/account/actOriginParseUrl/List | sino-xray | 账户原始开户链接一级域名匹配列表 |
四、其他
一、余额预警
二、URL公共能力建设
暂时无法在飞书文档外展示此内容
四、应急处理
1.场景1:爬虫爬取违规广告明细异常
-
20230720之前
爬虫爬取违规广告明细,如果当天拉取不到
1)通知xray先暂停拉数,等数据好了(爬虫-mc-holo常规流程跑完 验数)以后再开启。
2)通知xray先暂停拉数,如果当天爬虫没有具体交付时间,或者时间太晚。
和业务方沟通是否可以(昨天拉到的违规广告明细数据,作为今天的违规广告明细,继续往下跑。) 2),1 dwd_admkt_adver_ad_disap_fb_pcpi_ad_df设置强规则,执行下面语句后,在运维中心将任务置成功即可。 INSERT OVERWRITE TABLE sino_dc_cdm.dwd_admkt_adver_ad_disap_fb_pcpi_ad_df PARTITION (dt = '20230508') SELECT account_id ,ad_id ,disapproved_date ,disap_spend_usd ,review_feedback_type ,landing_url ,pcbm ,update_date ,dw_create_time FROM sino_dc_cdm.dwd_admkt_adver_ad_disap_fb_pcpi_ad_df WHERE dt = '20230507' ;
2),1 dwd_admkt_adver_ad_disap_fb_pcpi_ad_df设置弱规则,执行下面语句后,需手动按照上下游依赖手动调起任务。
-
20230720之后,增加兜底逻辑:当天数据异常用前一天爬取的数据。
正常情况下pcpi的违规广告明细2点开始获取,4点30获取完成。中间爬虫会有异常告警机制,出现问题会介入处理。
如果当天pcpi的违规广告明细,
如果当天8点钟还没拉下来,就用前一天爬到的数据 当做今天爬到的数据。(这种异常处理,代码层面做控制,无需人工手动处理)
如果连续2天都爬不到数据,需要人工介入处理。
2.场景2:违规广告明细正常获取,广告域fb数据延迟。
1,冻结推送3个推送holo的任务+1个发mq的任务。 exp_ads_admkt_risk_xray_corporation_level_fb_account_and_corporation_df_holo exp_ads_admkt_risk_xray_corporation_level_fb_account_disap_type_df_holo
exp_ads_admkt_risk_xray_corporation_level_fb_ad_df_holo
send_rabbitmq_xray_corporation_level 2,广告域fb分时区后,计算评级指标的任务都是8点起,会依赖多个fb任务实例。 在运维中心几个关键指标表,手动操作-去除依赖, dws_admkt_adver_violation_fb_account_nd
dws_admkt_adver_violation_corporation_nd
dws_admkt_adver_violation_advertiser_nd 3,3个 ads_admkt_risk_xray_corporation_level_fb_* 表跑完后,验证下数据。 4,启动 推送3个推送holo的任务+1个发mq的任务
五、风控数据质量保障方案
六、flink
-
流程图
暂时无法在飞书文档外展示此内容
-
代码
-
kafka_source
CREATE TEMPORARY TABLE `ods_amazon_listing_df_pre_source` ( `review_type` VARCHAR, `task_status` VARCHAR, `task_version` VARCHAR, `task_no` VARCHAR, `marketplace_id` VARCHAR, `seller_id` VARCHAR, `code` VARCHAR, `send_kafka_time` VARCHAR, `insert_time` VARCHAR, `task_type` VARCHAR, `api_only`VARCHAR, `asin` VARCHAR, `data` VARCHAR ) with ( 'connector' = 'kafka', 'properties.bootstrap.servers' = '${secret_values.kafka_servers}', 'properties.group.id' = 'datahub-amazon-listing-info-prod', 'topic' = 'datahub-amazon-listing-info', 'scan.topic-partition-discovery.interval' = '100000', 'properties.auto.offset.reset' = 'earliest', 'format' = 'json' ); --listing_sink_mc CREATE TEMPORARY TABLE `ods_amazon_listing_df_pre_sink_mc` ( task_version STRING, task_type STRING, category_browse_path_by_id STRING, category_browse_path_by_name STRING, main_image STRING, attached_images STRING, videos STRING, brand STRING, brand_store_link STRING, brand_store_text STRING, rating_star STRING, rating_star_rating STRING, price_discount STRING, price_list_price STRING , price STRING, amazon_choice STRING, coupon STRING, key_attributes STRING, desc_five STRING, bundle STRING, aplus STRING, brand_story STRING, product_info STRING, buyer_videos STRING, rating_5star STRING, rating_4star STRING, rating_3star STRING, rating_2star STRING, rating_1star STRING, deal STRING, best_seller STRING, new_release STRING, product_desc STRING, search_term STRING, `asin` STRING, pasin STRING, marketplace_id STRING, title STRING, api_only STRING, seller_id STRING, by_feature STRING, variants STRING, send_kafka_time STRING, insert_time STRING, dh_create_time STRING, attribution string, pt STRING ) WITH ( 'connector' = 'odps', 'endpoint' = 'http://service.cn-shenzhen.maxcompute.aliyun-inc.com/api', 'tunnelEndpoint' = 'http://dt.cn-shenzhen.maxcompute.aliyun-inc.com', 'project' = '${secret_values.mc_project_name}', 'tablename' = 'odsopr_amazon_listing_df', 'accessid' = '${secret_values.accessid}', 'accesskey' = '${secret_values.accesskey}', 'partition' = 'pt' ) ; --listing_sink_holo create TEMPORARY table `ods_amazon_listing_df_pre_sink_holo`( task_version VARCHAR, task_type int, category_browse_path_by_id VARCHAR, category_browse_path_by_name VARCHAR, main_image VARCHAR, attached_images VARCHAR, videos VARCHAR, brand VARCHAR, brand_store_link VARCHAR, brand_store_text VARCHAR, rating_star numeric(38,4), rating_star_rating bigint, price_discount VARCHAR, price_list_price VARCHAR, price VARCHAR, amazon_choice VARCHAR, coupon VARCHAR, key_attributes VARCHAR, desc_five VARCHAR, bundle VARCHAR, aplus VARCHAR, brand_story VARCHAR, product_info VARCHAR, buyer_videos VARCHAR, rating_5star VARCHAR, rating_4star VARCHAR, rating_3star VARCHAR, rating_2star VARCHAR, rating_1star VARCHAR, deal VARCHAR, best_seller VARCHAR, new_release VARCHAR, product_desc VARCHAR, search_term VARCHAR, `asin` VARCHAR, pasin VARCHAR, marketplace_id VARCHAR, title VARCHAR, api_only VARCHAR, seller_id VARCHAR, by_feature STRING, variants STRING, send_kafka_time VARCHAR, insert_time bigint, dh_create_time VARCHAR, attribution varchar ) with ( 'connector' = 'hologres', 'dbname' = '${secret_values.holo_dbname_datahub}', 'tablename' = 'ods.ods_amazon_listing_df', 'username' = '${secret_values.holo_username_datahub}', 'password' = '${secret_values.holo_password_datahub}', 'endpoint' = '${secret_values.holo_endpoint_datahub}', 'mutatetype' = 'insertorupdate', 'ignoredelete' = 'true' ); BEGIN STATEMENT SET; --listing insert into ods_amazon_listing_df_pre_sink_mc select task_version, task_type, JSON_VALUE(data, '$.category.browsePathById') as category_browse_path_by_id, JSON_VALUE(data, '$.category.browsePathByName') as category_browse_path_by_name, JSON_VALUE(data, '$.main_image') as main_image, JSON_VALUE(data, '$.attached_images') as attached_images, JSON_VALUE(data, '$.videos') as videos, JSON_VALUE(data, '$.brand') as brand, JSON_VALUE(data, '$.brand_store.link') as brand_store_link, JSON_VALUE(data, '$.brand_store.text') as brand_store_text, JSON_VALUE(data, '$.rating_star.star') as rating_star, JSON_VALUE(data, '$.rating_star.rating') as rating_star_rating, JSON_VALUE(data, '$.price.discount') as price_discount, JSON_VALUE(data, '$.price.list_price') as price_list_price, JSON_VALUE(data, '$.price.price') as price, JSON_VALUE(data, '$.amazon_choice') as amazon_choice, JSON_VALUE(data, '$.coupon') as coupon, JSON_VALUE(data, '$.key_attributes') as key_attributes, JSON_VALUE(data, '$.desc_five') as desc_five, JSON_VALUE(data, '$.bundle') as bundle, JSON_VALUE(data, '$.aplus') as aplus, JSON_VALUE(data, '$.brand_story') as brand_story, JSON_VALUE(data, '$.product_info') as product_info, JSON_VALUE(data, '$.buyer_videos') as buyer_videos, JSON_VALUE(data, '$.rating.5') as rating_5star, JSON_VALUE(data, '$.rating.4') as rating_4star, JSON_VALUE(data, '$.rating.3') as rating_3star, JSON_VALUE(data, '$.rating.2') as rating_2star, JSON_VALUE(data, '$.rating.1') as rating_1star, JSON_VALUE(data, '$.deal') as deal, JSON_VALUE(data, '$.best_seller') as best_seller, JSON_VALUE(data, '$.new_release') as new_release, JSON_VALUE(data, '$.product_desc') as product_desc, JSON_VALUE(data, '$.search_term') as search_term, `asin`, JSON_VALUE(data, '$.pasin ') as pasin, marketplace_id, JSON_VALUE(data, '$.title') as title, api_only, seller_id, JSON_VALUE(data,'$.by_feature') as by_feature, JSON_VALUE(data,'$.variants') as variants, send_kafka_time, insert_time, cast (LOCALTIMESTAMP as String) as dh_create_time, JSON_VALUE(data, '$.attribution') as attribution, DATE_FORMAT(LOCALTIMESTAMP, 'yyyyMMdd') as pt from ods_amazon_listing_df_pre_source where task_status = '1' ; insert into ods_amazon_listing_df_pre_sink_holo select task_version, cast(task_type as int) as task_type, JSON_VALUE(data, '$.category.browsePathById') as category_browse_path_by_id, JSON_VALUE(data, '$.category.browsePathByName') as category_browse_path_by_name, JSON_VALUE(data, '$.main_image') as main_image, JSON_VALUE(data, '$.attached_images') as attached_images, JSON_VALUE(data, '$.videos') as videos, JSON_VALUE(data, '$.brand') as brand, JSON_VALUE(data, '$.brand_store.link') as brand_store_link, JSON_VALUE(data, '$.brand_store.text') as brand_store_text, cast(JSON_VALUE(data, '$.rating_star.star') as numeric(38,4)) as rating_star, cast(JSON_VALUE(data, '$.rating_star.rating') as bigint) as rating_star_rating, JSON_VALUE(data, '$.price.discount') as price_discount, JSON_VALUE(data, '$.price.list_price') as price_list_price, JSON_VALUE(data, '$.price.price') as price, JSON_VALUE(data, '$.amazon_choice') as amazon_choice, JSON_VALUE(data, '$.coupon') as coupon, JSON_VALUE(data, '$.key_attributes') as key_attributes, JSON_VALUE(data, '$.desc_five') as desc_five, JSON_VALUE(data, '$.bundle') as bundle, JSON_VALUE(data, '$.aplus') as aplus, JSON_VALUE(data, '$.brand_story') as brand_story, JSON_VALUE(data, '$.product_info') as product_info, JSON_VALUE(data, '$.buyer_videos') as buyer_videos, JSON_VALUE(data, '$.rating.5') as rating_5star, JSON_VALUE(data, '$.rating.4') as rating_4star, JSON_VALUE(data, '$.rating.3') as rating_3star, JSON_VALUE(data, '$.rating.2') as rating_2star, JSON_VALUE(data, '$.rating.1') as rating_1star, JSON_VALUE(data, '$.deal') as deal, JSON_VALUE(data, '$.best_seller') as best_seller, JSON_VALUE(data, '$.new_release') as new_release, JSON_VALUE(data, '$.product_desc') as product_desc, JSON_VALUE(data, '$.search_term') as search_term, `asin`, JSON_VALUE(data, '$.pasin ') as pasin, marketplace_id, JSON_VALUE(data, '$.title') as title, api_only, COALESCE(seller_id,'') as seller_id, JSON_VALUE(data,'$.by_feature') as by_feature, JSON_VALUE(data,'$.variants') as variants, send_kafka_time, cast(insert_time as bigint) as insert_time, cast (LOCALTIMESTAMP as String) as dh_create_time, JSON_VALUE(data, '$.attribution') as attribution from ods_amazon_listing_df_pre_source where task_status = '1' ; end;
-
mysql_source
CREATE temporary TABLE mysqlcdc_ods_amazon_sales_metrics_df_source ( `id` bigint not null, `create_time` TIMESTAMP(6), `update_time` TIMESTAMP(6), `seller_id` char(14), `marketplace_id` char(14), `identity` string, `identity_type` varchar(10), `granularity` varchar(10), `account_currency` varchar(32), `sale_value` decimal(20,4), `sale_cnt` int, `order_item_cnt` int, `order_cnt` int, `calendar_date` date, `interval_start_time_utc` timestamp, `interval_end_time_utc` timestamp, `interval_start_time` timestamp, `interval_end_time` timestamp, PRIMARY KEY (id) NOT ENFORCED ) WITH ( 'connector' = '' ,'hostname' = '' ,'password' = '' ,'port' = '' ,'username' = '' ,'database-name' = '' ,'table-name' = '' ) ; --主订单 CREATE TEMPORARY TABLE `mysqlcdc_ods_amazon_sales_metrics_df_mc_sink` ( id bigint, create_time string, update_time string, seller_id string, marketplace_id string, `identity` string, identity_type string, granularity string, account_currency string, sale_value decimal(38,4), sale_cnt bigint, order_item_cnt bigint, order_cnt bigint, calendar_date string, interval_start_time_utc string, interval_end_time_utc string, interval_start_time string, interval_end_time string, pt String ) WITH ( 'connector' = 'odps', 'endpoint' = 'http://service.cn-shenzhen.maxcompute.aliyun-inc.com/api', 'tunnelEndpoint' = 'http://dt.cn-shenzhen.maxcompute.aliyun-inc.com', 'project' = '${secret_values.mc_project_name}', 'tablename' = 'odsopr_amazon_sales_metrics_df', 'accessid' = '${secret_values.accessid}', 'accesskey' = '${secret_values.accesskey}', 'partition' = 'pt' ) ; CREATE TEMPORARY TABLE `mysqlcdc_ods_amazon_sales_metrics_df_holo_sink` ( id bigint, create_time timestamp, update_time timestamp, seller_id string, marketplace_id string, `identity` string, identity_type string, granularity string, account_currency string, sale_value decimal(38,4), sale_cnt bigint, order_item_cnt bigint, order_cnt bigint, calendar_date date, interval_start_time_utc timestamp, interval_end_time_utc timestamp, interval_start_time timestamp, interval_end_time timestamp, dw_create_time string ) WITH ( 'connector' = 'hologres', 'dbname' = '${secret_values.holo_dbname_datahub}', 'tablename' = 'ods.ods_amazon_sales_metrics_df', 'username' = '${secret_values.holo_username_datahub}', 'password' = '${secret_values.holo_password_datahub}', 'endpoint' = '${secret_values.holo_endpoint_datahub}', 'mutatetype' = 'insertorupdate', 'ignoredelete' = 'true' ) ; --子订单 BEGIN STATEMENT SET; --mc表 insert into mysqlcdc_ods_amazon_sales_metrics_df_mc_sink select id, cast(create_time as string) as create_time, cast(update_time as string) as update_time, cast(seller_id as string) as seller_id, cast(marketplace_id as string) as marketplace_id, cast(`identity` as string) as `identity`, cast(identity_type as string) as identity_type, cast(granularity as string) as granularity, cast(account_currency as string) as account_currency, cast(sale_value as decimal(38,4)) as sale_value, cast(sale_cnt as bigint) as sale_cnt, cast(order_item_cnt as bigint) as order_item_cnt, cast(order_cnt as bigint) as order_cnt, cast(calendar_date as string) as calendar_date, cast(interval_start_time_utc as string) as interval_start_time_utc, cast(interval_end_time_utc as string) as interval_end_time_utc, cast(interval_start_time as string) as interval_start_time, cast(interval_end_time as string) as interval_end_time, DATE_FORMAT(LOCALTIMESTAMP, 'yyyyMMdd') as pt from mysqlcdc_ods_amazon_sales_metrics_df_source ; --holo表 insert into mysqlcdc_ods_amazon_sales_metrics_df_holo_sink select id, cast(create_time as TIMESTAMP) as create_time, cast(update_time as TIMESTAMP) as update_time, cast(seller_id as string) as seller_id, cast(marketplace_id as string) as marketplace_id, cast(`identity` as string) as `identity`, cast(identity_type as string) as identity_type, cast(granularity as string) as granularity, cast(account_currency as string) as account_currency, cast(sale_value as numeric(38,4)) as sale_value, cast(sale_cnt as bigint) as sale_cnt, cast(order_item_cnt as bigint) as order_item_cnt, cast(order_cnt as bigint) as order_cnt, cast(calendar_date as date) as calendar_date, cast(interval_start_time_utc as TIMESTAMP) as interval_start_time_utc, cast(interval_end_time_utc as TIMESTAMP) as interval_end_time_utc, cast(interval_start_time as TIMESTAMP) as interval_start_time, cast(interval_end_time as TIMESTAMP) as interval_end_time, cast (substr(LOCALTIMESTAMP,1,19) as String) as dw_create_time from mysqlcdc_ods_amazon_sales_metrics_df_source ; end;
六、UDF
-
使用
-- 注意:需要整体执行。不运行set(内网+专线)直接运行函数,会报错 set odps.session.networklink=sz_data; -- 内网+专线 set odps.stage.reducer.num = 40; select sino_dc_cdm.LinkRecognizeUDF('https://media.meetsocial.com/gg_media/2156178354047647962.jpeg','dataworks');
select sino_dc_cdm.ContainsAllSubstrUDF('','play.google,itunes.apple') ; --<>[]表示子字符串中至少有一个在链接中可以找到
select sino_dc_cdm.SecretAESUDF('Gla6hPHj8MeK1gMTDk8ZVfcsZ4xaf1XRf3cpxSnsAOc=') ; SELECT corporation_id ,name ,mobile ,id_no ,bank_card_no ,sino_dc_cdm.SecretAESUDF(name ) as name_decryption ,sino_dc_cdm.SecretAESUDF(mobile ) as mobile_decryption ,sino_dc_cdm.SecretAESUDF(id_no ) as id_no_decryption ,sino_dc_cdm.SecretAESUDF(bank_card_no) as bank_card_no_decryption FROM sino_dc_ods.ods_crm_corporation_certify_data_df a WHERE a.dt = '${dt}' ;
-
代码
-
LinkRecognizeUDF
import darabonba.core.RequestModel; import okhttp3.*; import org.apache.hadoop.hive.ql.exec.UDF; import org.eclipse.jetty.http.MetaData; import org.json.JSONObject; import java.io.IOException; import java.io.PrintWriter; import java.io.StringWriter; public class LinkRecognizeUDF extends UDF { public String evaluate(String strurl , String appid) throws IOException { if (strurl == null) { return null; } if (strurl.length() == 0) { return null; } return linkRecognize(strurl,appid); } public String linkRecognize(String strurl,String appid) throws IOException { OkHttpClient client = new OkHttpClient(); // 使用 HttpUrl.Builder 来构建 URL,添加 GET 参数 // https://o-test-sino-dc-stb.meetsocial.cn/ // https://n-pre-sino-dc-stb.meetsocial.cn/ // https://n-sino-dc-stb.meetsocial.cn/ HttpUrl.Builder urlBuilder = HttpUrl.parse("https://n-sino-dc-stb.meetsocial.cn/url/usabilityTest").newBuilder(); urlBuilder.addQueryParameter("url", strurl); String url = urlBuilder.build().toString(); // 构建 Request 对象 Request request = new Request.Builder() .url(url) .addHeader("X-SINO-APP-ID", appid) .build(); String responseStr = ""; Integer maxRetry = 2; Integer retryCount = 0; while (retryCount < maxRetry) { try { // 执行 GET 请求 Response response = client.newCall(request).execute(); if (response.isSuccessful()) { // 如果响应成功 String responseData = response.body().string(); // 获取响应体的字符串形式 responseStr = responseData; } else { responseStr = getResult("-1", response.code() + " " + response.message()); } // 关闭响应体 response.close(); break; } catch (IOException e) { e.printStackTrace(); responseStr = getResult("-2",getExceptionAsString(e)); } retryCount++; } return responseStr; } public static String getExceptionAsString(Throwable throwable) { StringWriter stringWriter = new StringWriter(); PrintWriter printWriter = new PrintWriter(stringWriter); throwable.printStackTrace(printWriter); printWriter.flush(); return stringWriter.toString(); } public static String getResult(String code, String message) { JSONObject jsonObject = new JSONObject(); jsonObject.put("code", code); jsonObject.put("message", message); return jsonObject.toString(); } public static void main(String[] args) throws IOException { System.out.println(new LinkRecognizeUDF().evaluate("https://meida-file.oss-cn-shenzhen.aliyuncs.com/857457769472428.mp4","dataworks")); } }
-
ContainsAllSubstrUDF
import org.apache.hadoop.hive.ql.exec.UDF; import java.io.IOException; import java.util.ArrayList; import java.util.Arrays; import java.util.HashSet; public class ContainsAllSubstrUDF extends UDF { public String evaluate(String str, String substrs) throws IOException { if (str == null) { return null; } if (str.length() == 0) { return null; } return ContainsAllSubstr(str,substrs); } public static String ContainsAllSubstr(String str, String substrs) { HashSet<String> set = new HashSet<String>(); for (String substr : substrs.split(",")) { if (!set.contains(substr)) { if (str.contains(substr)) { set.add(substr); } else { } } } return set.toString() ; } public static void main(String[] args) throws IOException { // System.out.println(ContainsAllSubstr("qwerty",new String[]{"s","w","t"})); System.out.println(new ContainsAllSubstrUDF().ContainsAllSubstr("www.flashcycling.apps.apple.com.amazon.com.com","apps.apple.com,amazon.com")); System.out.println(new ContainsAllSubstrUDF().evaluate("www.flashcycling.apps.apple.com.amazon.com.com","apps.apple.com,amazon.com")); } }
-
SecretAESUDF
import org.apache.hadoop.hive.ql.exec.UDF; import javax.crypto.Cipher; import javax.crypto.spec.SecretKeySpec; import java.io.IOException; import java.util.Base64; public class SecretAESUDF extends UDF { public String evaluate(String str ) throws IOException { if (str == null) { return null; } if (str.length() == 0) { return null; } if (str.length() == 1) { return null; } return Secretaes(str); } public String Secretaes(String str) { String decryptedData = ""; try { String encryptedData = str ; String key = "RQpgBF22P4afl10D"; byte[] encryptedDataBytes = Base64.getDecoder().decode(encryptedData); Cipher cipher = Cipher.getInstance("AES"); SecretKeySpec secretKeySpec = new SecretKeySpec(key.getBytes(),"AES"); cipher.init(Cipher.DECRYPT_MODE,secretKeySpec); byte[] decryptedDataBytes = cipher.doFinal(encryptedDataBytes); String decryptedDatatmp = new String(decryptedDataBytes,"UTF-8"); decryptedData = decryptedDatatmp; } catch (Exception e) { e.printStackTrace(); } return decryptedData; } public static void main(String[] args) throws IOException { System.out.println(new SecretAESUDF().evaluate("jk1piCLmwPvayqW5KTsPwpO5YMp53+ZTwsnqhY8oVZU=")); } }
import com.aliyun.oss.ClientConfiguration; import com.aliyun.oss.OSS; import com.aliyun.oss.OSSClient; import com.aliyun.oss.common.comm.Protocol; import org.apache.hadoop.hive.ql.exec.UDF; import com.aliyun.oss.OSSClientBuilder; import java.net.URL; import java.util.Date; public class UrlSignatureUDF extends UDF { public String evaluate(String objectName,Long s) { if (objectName == null) { return null; } if (objectName.length() == 0) { return null; } return ChackStringOp(objectName,s); } //处理字符串 public static String ChackStringOp(String object_,Long ss) { // Endpoint,其它Region请按实际情况填写。 String endpoint = "oss-cn-shenzhen.aliyuncs.com"; // 阿里云账号AccessKey拥有所有API的访问权限,风险很高。建议创建并使用RAM用户进行API访问或日常运维,请登录RAM控制台创建RAM用户。 String accessKeyId = "LTAI5tMhKY7ZKa5aURZJiKs4"; String accessKeySecret = "L0ubasTpzlvEPPZTOzkm7M59JIbznD"; // 填写Bucket名称,例如examplebucket。 String bucketName = "meida-file"; //特殊处理一下 String objectName= object_.replace("%2F", "/"); String result = null; try { URL Url = new URL(objectName); String s = Url.getPath(); result = s; if (objectName.contains(".cn/")){ int index = objectName.indexOf(".cn/"); result=objectName.substring(index+4); } if (objectName.contains(".com/")){ int index = objectName.indexOf(".com/"); result=objectName.substring(index+5); } } catch (Exception e) { e.printStackTrace(); } //设置https请求头 ClientConfiguration clientConfiguration = new ClientConfiguration(); clientConfiguration.setProtocol(Protocol.HTTPS); // 创建OSSClient实例。 OSS ossClient = new OSSClient(endpoint, accessKeyId, accessKeySecret,clientConfiguration); String str = null ; try { // 设置URL过期时间为1小时。 Date expiration = new Date(System.currentTimeMillis() + ss * 1000); // 生成以GET方法访问的签名URL, URL url = ossClient.generatePresignedUrl(bucketName, result, expiration); ossClient.shutdown(); str = url.toString(); }catch (Exception ce) { System.out.println("Error Message:" + ce.getMessage()); } finally { if (ossClient != null) { //关闭连接 ossClient.shutdown(); } } return str; } public static void main(String[] args) { System.out.println(new UrlSignatureUDF().evaluate("http://media.meetsocial.cn/fb_video%2Fa2c0126e488562ef040bcb576290d03a40ed9720393668f5adeb59814f68b468.mp4",Long.valueOf(100))); } }
import org.apache.hadoop.hive.ql.exec.UDF; import java.text.SimpleDateFormat; import java.util.Date; public class UDFDatestamp2Date extends UDF { public String evaluate(String s) { if (s == null) { return null ; } if (s.length() == 0 ) { return null ; } return timeStamp2Date(s); } public String timeStamp2Date (String arg) { if (arg.length()==10) { SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); String date = sdf.format(new Date(Long.valueOf(arg + "000"))); return date; } if (arg.length()==13) { SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); String date = sdf.format(new Date(Long.valueOf(arg))); return date; } return null ; } // public static void main(String[] args) { // System.out.println( new UDFDatestamp2Date().evaluate("1656777604")); // System.out.println( new UDFDatestamp2Date().evaluate("1656111111111")); // } }
-- py 数据导出 -- mc_export_data """ -*- encoding:utf-8 -*- ========================== author: @time: 2021/12/01 12:12:21 Project:BigData ========================== """ from datetime import datetime, timedelta import pandas as pd from odps import ODPS maxCompute_database = { "access_id": "LTAI5tMhKY7ZKa5aURZJiKs4", "secret_access_key": "L0ubasTpzlvEPPZTOzkm7M59JIbznD", "endpoint": "http://service.cn-shenzhen.maxcompute.aliyun.com/api", "project": "sino_dc_cdm_dev" } def mc_cdm_connect(sql): try: print("MaxCompute_数据库链接成功,稍等片刻!") tmp = [] o = ODPS(**maxCompute_database) with o.execute_sql(sql).open_reader() as reader: for read in reader: for res in read: tmp.append(list(res)) return tmp except ConnectionError as err: print('联系开发') def export_mc_data(mc_sql_script): get_mc_data_list = mc_cdm_connect(mc_sql_script) mc_data_list = [] mc_data_list_name = [] for mc in get_mc_data_list: mc_data_list.append(mc[1]) mc_data_list_name.append(mc[0]) count = mc_data_list_name.count(mc_data_list_name[0]) if count == 1: mc_name_num = len(mc_data_list_name) mc_list = [] for i in range(0, len(mc_data_list), mc_name_num): mc_list.append(mc_data_list[i:i + mc_name_num]) column_name = mc_data_list_name[0:mc_name_num] save_excel = pd.DataFrame(mc_list, columns=column_name) file_name = (datetime.now() + timedelta()).strftime("%Y.%m.%d_%H时%M分%S秒") save_excel.to_excel(f"./{file_name}.xlsx", index=False) print(f'文件:{file_name}.xlsx') elif count > 1: mc_name_num = mc_data_list_name.index(mc_data_list_name[0], 1, len(mc_data_list_name)) mc_list = [] for i in range(0, len(mc_data_list), mc_name_num): mc_list.append(mc_data_list[i:i + mc_name_num]) column_name = mc_data_list_name[0:mc_name_num] save_excel = pd.DataFrame(mc_list, columns=column_name) file_name = (datetime.now() + timedelta()).strftime("%Y.%m.%d_%H时%M分%S秒") save_excel.to_excel(f"./{file_name}.xlsx", index=False) print(f'文件:{file_name}.xlsx') else: print('逻辑有问题,请联系开发') sql_mc_export = """ SELECT * FROM sino_dc_cdm_dev.corp_link_20240925_3 aa """ if __name__ == '__main__': export_mc_data(sql_mc_export)
七、广告知识
归因设置
含义: 用户与广告互动(如点击、观看广告等)后,能够将用户的后继行动(如应用安装、购买或者其他转化行为)归因于该广告互动的时间段。广告组层级选择归因设置。
用途:影响如何评估广告活动的效果,以及如何分配营销预算给表现最佳的广告渠道。
种类:
-
点击归因:用户 点击了广告并采取了操作。
-
浏览归因:虽然用户 看见广告后未点击广告,但也在统计时间窗内采取了操作。
-
互动观看归因:用户 观看视频广告达 10 秒或广告播放进度达 97%(若广告总长不足 10 秒)后,在 1 天内执行了操作,就会计入互动观看转化次数.
例子:
假如归因窗口设置为7天点击广告XX,以如下表格为例,只有用户A、用户B的安装被归到10月号那一天的install转化指标中, 用户C的安装被归因到10月2号的广告YY投放中。
10月1号
|
10月2号
|
10月3号
|
10月4号
|
10月5号
|
10月6号
|
10月7号
|
10月8号
|
10月9号
| |
用户A
|
点击广告XX
|
安装
| |||||||
用户B
|
点击广告XX
|
安装
| |||||||
用户C
|
点击广告XX
|
点击广告YY
|
安装
|
归因窗口种类
|
解释
|
1d_view
|
1 Day View-through,
一天展示归因
, 用户浏览广告但是未点击它,在接下来的24小时内进行了某种转化,那么这个转化就被归因于该广告的展示
展示归因窗口
对认识到
品牌曝光
对
转化有间接影响
的广告主来说非常重要
|
7d_view
|
7 Day View-through,
七天展示归因, 用户浏览广告但是未点击它,在接下来的7天内进行了某种转化,那么这个转化就被归因于该广告的展示
|
1d_click
|
点击后 1 天内归因,指的是在点击广告后 24 小时内发生的转化事件。它可以用于评估广告系列的
短期效果,
一般用于
|
7d_click
|
点击后 7 天内归因,是在点击广告后 7 天(168 小时)内发生的转化事件。它可以用于评估广告系列的
短期效果
|
default
|
7d_click,1d_view
|
广告结构
第一层级:广告系列 campaign
广告系列顾名思义是整个广告的基础,在这一层操作中选择并确定你所需要
广告目标,开始并连接下一层,最终创建广告组。
第二层级:广告组
在广告组的操作中,首先需要选择并
创建受众(包括:地区、性别、年龄等其他选择条件);然后,设置本次广告的预算和
排期(如发布以及结束时间);接着就是设置广告竞价;最后连接下一层,创建广告。
第三层级:广告
fb采集加工链路
暂时无法在飞书文档外展示此内容
fb资料
-
1.Facebook广告账户申请【P2】
通过访问https://business.facebook.com/,您可以申请自己的Facebook广告账户。这是开始在Facebook平台上进
行广告投放的第一步。
-
2.Facebook中国官网【P2】
https://reachtheworldonfacebook.com/contact-us/,这是Facebook在中国的官方网站,无需翻墙即可访问。您可
以在该网站上获取最新的消息、成功案例和政策指南等,并建议订阅相关内容以保持更新。
-
3.Facebook官方认证代理商【P2】:
https://reachtheworldonfacebook.com/reseller/,如果您需要专业的广告服务支持,可以通过该网站找到
Facebook官方认证的代理商。
-
4.Facebook自学工具:Blueprint【P0】
https://reachtheworldonfacebook.com/facebook-blueprint/,Blueprint是Facebook提供的自学工具,为广告主提
供有关广告策略、目标受众和广告管理的教育和培训资源。//增加
https://www.facebook.com/business/learn/courses https://www.facebookblueprint.com/student/catalog
-
5.Facebook工具:跨境商机大数据平台【P2】
https://crossborderinsightsfinder.com/zh-cn/,该平台提供了跨境电商相关的数据洞察和市场分析,帮助广告主了
解和利用跨境商机。//暂无权限