Data + AI Summit Europe 2020: All High-Resolution Slide Decks for Download

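Data + AI Summit Europe 2020 (formerly Spark + AI Summit Europe) was held from November 17 to 19, 2020. Because of the COVID-19 pandemic, the conference was held online, like the one in June. It ran for three days: the first day was training, and the second and third days were the main conference. The conference covered technical content from practitioners who use Apache Spark™, Delta Lake, MLflow, Structured Streaming, BI and SQL analytics, and deep learning and machine learning frameworks to solve tough data problems. For the full agenda, see: https://databricks.com/dataaisummit/europe-2020/agenda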

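Unlike the June conference, the keynotes this time brought no blockbuster announcements, but the second and third days still offered solid content worth a look. Over the next few days, this official account will also introduce some of the more interesting sessions, so stay tuned.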

The conference topics covered the following areas:

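• AI use cases and new opportunities
• Best practices and use cases for Apache Spark™, Delta Lake, MLflow, and more
• Data engineering, including streaming architectures
• SQL analytics and BI using data warehouses and data lakes
• Data science, including the Python ecosystem
• Machine learning and deep learning applications
• Production machine learning (MLOps)
• Large-scale data analytics and ML research
• Industry use cases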

How to download

Follow the WeChat official account 过往记忆大数据 or Java与大数据架构 and reply spark-9902 to get the slides.

Downloadable slide decks

Slides for the following 129 sessions are available for download. Note that all of the slides can also be viewed online at https://www.iteblog.com/archives/9902.html

• 3D: DBT using Databricks and Delta
• Accelerated Training of Transformer Models
• Achieving Lakehouse Models with Spark 3.0
• Acoustics & AI for Conservation
• Active Governance Across the Delta Lake with Alation
• Add Historical Analysis of Operational Data with Easy Configurations in Fivetran Automated Data Integration
• Advanced Natural Language Processing with Apache Spark NLP
• Apache Liminal (Incubating)—Orchestrate the Machine Learning Pipeline
• Apache Spark Streaming in K8s with ArgoCD & Spark Operator
• Apply MLOps at Scale
• Arbitrary Stateful Aggregation and MERGE INTO
• Bank Struggles Along the Way for the Holy Grail of Personalization: Customer 360
• Building a Cross Cloud Data Protection Engine
• Building a Distributed Collaborative Data Pipeline with Apache Spark
• Building a MLOps Platform Around MLflow to Enable Model Productionalization in Just a Few Minutes
• Building a Real-Time Supply Chain View: How Gousto Merges Incoming Streams of Inventory Data at Scale to Track Ingredients Throughout its Supply Chain
• Building a SIMD Supported Vectorized Native Engine for Spark SQL
• Building a Streaming Data Pipeline for Trains Delays Processing
• Building a Streaming Microservices Architecture
• Building an ML Tool to predict Article Quality Scores using Delta & MLFlow
• Building Identity Graph at Scale for Programmatic Media Buying Using Apache Spark and Delta Lake
• Building Notebook-based AI Pipelines with Elyra and Kubeflow
• Building the Next-gen Digital Meter Platform for Fluvius
• CI/CD Templates: Continuous Delivery of ML-Enabled Data Pipelines on Databricks
• Cloud-native Semantic Layer on Data Lake
• Common Strategies for Improving Performance on Your Delta Lakehouse
• Comprehensive View on Date-time APIs of Apache Spark 3.0
• Containerized Stream Engine to Build Modern Delta Lake
• Context-aware Fast Food Recommendation with Ray on Apache Spark at Burger King
• Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS SageMaker for Enterprise AI Scenarios
• Cost Efficiency Strategies for Managed Apache Spark Service
• Data Engineers in Uncertain Times: A COVID-19 Case Study
• Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
• Data Privacy with Apache Spark: Defensive and Offensive Approaches
• Data Time Travel by Delta Time Machine
• Data Time Travel by Delta Time Machine
• Data Versioning and Reproducible ML with DVC and MLflow
• Databricks University Alliance Meetup - Data + AI Summit EU 2020
• Databricks Whitelabel: Making Petabyte Scale Data Consumable to All Our Customers
• Delta: Building Merge on Read
• Delta Lake: Optimizing Merge
• Designing and Implementing a Real-time Data Lake with Dynamically Changing Schema
• Detecting and Recognising Highly Arbitrary Shaped Texts from Product Images
• Deterministic Machine Learning with MLflow and mlf-core
• Developing ML-enabled Data Pipelines on Databricks using IDE & CI/CD at Runtastic
• Digital Turbine Adopts A Lakehouse to Scale to Their Analytics Needs
• Distributed and Scalable Model Lifecycle Capabilities
• Diving into Delta Lake: Unpacking the Transaction Log
• eBay’s Work on Dynamic Partition Pruning & Runtime Filter
• Efficient Query Processing Using Machine Learning
• Embedding Insight through Prediction Driven Logistics
• End to End Supply Chain Control Tower
• Extending Apache Spark – Beyond Spark Session Extensions
• Foundations of Data Teams
• Frequently Bought Together Recommendations Based on Embeddings
• From Query Plan to Query Performance: Supercharging your Apache Spark Queries using the Spark UI SQL Tab
• From Zero to Hero with Kafka Connect
• Generalized Pipeline Parallelism for DNN Training
• Getting Started with Apache Spark on Kubernetes
• Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads
• How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark
• How The Weather Company Uses Apache Spark to Serve Weather Data Fast at Low Cost
• Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and Parquet Reader
• Introducing MLflow for End-to-End Machine Learning on Databricks
• Koalas: Interoperability Between Koalas and Apache Spark
• Leveraging Apache Spark and Delta Lake for Efficient Data Encryption at Scale
• Livestream Economy: The Application of Real-time Media and Algorithmic Personalisation in Urbanism
• Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
• MATS stack (MLFlow, Airflow, Tensorflow, Spark) for Cross-system Orchestration of Machine Learning Pipelines
• MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams
• Migrate and Modernize Hadoop-Based Security Policies for Databricks
• Migrating Airflow-based Apache Spark Jobs to Kubernetes – the Native Way
• ML Production Pipelines: A Classification Model
• ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed Feedback Environment
• MLflow at Company Scale
• MLOps Using MLflow
• Model Experiments Tracking and Registration using MLflow on Databricks
• Monitoring Half a Million ML Models, IoT Streaming Data, and Automated Quality Check on Delta Lake
• Moving to Databricks & Delta
• NLP Text Recommendation System Journey to Automated Training
• Operating and Supporting Delta Lake in Production
• Optimising Geospatial Queries with Dynamic File Pruning
• Optimizing Apache Spark UDFs
• Our Journey to Release a Patient-Centric AI App to Reduce Public Health Costs
• Parallel Ablation Studies for Machine Learning with Maggy on Apache Spark
• Personalization Journey: From Single Node to Cloud Streaming
• Photon Technical Deep Dive: How to Think Vectorized
• Polymorphic Table Functions: The Best Way to Integrate SQL and Apache Spark
• Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch and More!)
• Productionizing Real-time Serving With MLflow
• Project Zen: Improving Apache Spark for Python Users
• Query or Not to Query? Using Apache Spark Metrics to Highlight Potentially Problematic Queries
• Ray and Its Growing Ecosystem
• Real-time Feature Engineering with Apache Spark Streaming and Hof
• Real-Time Health Score Application using Apache Spark on Kubernetes
• Reproducible AI Using PyTorch and MLflow
• Reproducible AI Using PyTorch and MLflow
• Revealing the Power of Legacy Machine Data
• Scale and Optimize Data Engineering Pipelines with Software Engineering Best Practices: Modularity and Automated Testing
• Scale-Out Using Spark in Serverless Herd Mode!
• Scaling Machine Learning Feature Engineering in Apache Spark at Facebook
• Scaling Machine Learning with Apache Spark
• Seamless MLOps with Seldon and MLflow
• SHAP & Game Theory For Recommendation Systems
• Simplifying AI integration on Apache Spark
• Skew Mitigation For Facebook PetabyteScale Joins
• Solving Data Discovery Challenges at Lyft with Amundsen, an Open-source Metadata Platform
• Spark NLP: State of the Art Natural Language Processing at Scale
• Spark SQL Beyond Official Documentation
• Spark SQL Join Improvement at Facebook
• Speeding Time to Insight with a Modern ELT Approach
• Stateful Streaming with Apache Spark: How to Update Decision Logic at Runtime
• Stories from the Financial Service AI Trenches: Lessons Learned from Building AI Models in EY
• Streaming Inference with Apache Beam and TFX
• TeraCache: Efficient Caching Over Fast Storage Devices
• The Beauty of (Big) Data Privacy Engineering
• The Hidden Value of Hadoop Migration
• The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analytics Engineer
• The Pill for Your Migration Hell
• Transforming GE Healthcare with Data Platform Strategy
• Trust, Context and, Regulation: Achieving More Explainable AI in Financial Services
• Unlocking Geospatial Analytics Use Cases with CARTO and Databricks
• Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update/Delete SQL Operation
• Using Machine Learning at Scale: A Gaming Industry Experience!
• Using Machine Learning at Scale: A Gaming Industry Experience!
• Using NLP to Explore Entity Relationships in COVID-19 Literature
• Using Redash for SQL Analytics on Databricks
• What is New with Apache Spark Performance Monitoring in Spark 3.0
• X-RAIS: The Third Eye

Java与大数据架构

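A programmer with 7 years of experience and 100,000+ followers. Java与大数据架构 shares practical content on Java programming, Spark, Flink, Kafka, Elasticsearch, data lakes, and more. Scan the QR code to follow!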
