Enhanced PolyBase in SQL Server 2019: Installation and Basic Overview

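This article introduces the enhanced PolyBase feature in SQL Server 2019, which allows querying data from diverse sources such as Oracle, Teradata, MongoDB, and PostgreSQL without moving the data. Building on SQL Server 2016, SQL Server 2019 extends PolyBase to support more data sources and ODBC drivers. The article also covers how to install the SQL Server 2019 PolyBase component on Windows, and how to install and configure Azure Data Studio to support the new SQL Server 2019 features.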

Microsoft recently introduced SQL Server 2019 at its Ignite 2018 event. For an overview of the SQL Server 2019 preview and instructions for installing it on Windows, see the article SQL Server 2019 overview and installation.

We will explore the enhanced PolyBase feature of SQL Server 2019 in a series of articles. This first part covers the following topics:

  • Overview of ETL and PolyBase
  • Installing PolyBase in SQL Server 2019
  • Overview and installation of Azure Data Studio
  • The SQL Server 2019 preview extension in Azure Data Studio

Overview of ETL and PolyBase

In today’s industry, data lives in many different databases such as Oracle, MongoDB, Teradata, and PostgreSQL. Applications need to access data from these sources and combine it into a single source, which is a challenging task for database developers and data scientists. We normally use ETL (Extract-Transform-Load) to move data between the different sources.

The ETL process involves the following steps:

  • Extract: read data from the data source of your choice and extract the specific data you need
  • Transform: apply logic and rules to this data and convert it into the required form
  • Load: write the data to the destination database
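
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular ETL tool works: it uses in-memory SQLite databases as stand-ins for the source and destination systems, and the table names (`orders`, `orders_clean`), columns, and rules are hypothetical.

```python
import sqlite3


def extract(source):
    """Extract: read only the rows and columns we need from the source."""
    query = "SELECT id, name, amount FROM orders WHERE amount IS NOT NULL"
    return source.execute(query).fetchall()


def transform(rows):
    """Transform: apply rules (normalize names, convert amounts to cents)."""
    return [(rid, name.strip().upper(), round(amount * 100))
            for rid, name, amount in rows]


def load(dest, rows):
    """Load: write the transformed rows to the destination database."""
    dest.execute(
        "CREATE TABLE IF NOT EXISTS orders_clean "
        "(id INTEGER PRIMARY KEY, name TEXT, amount_cents INTEGER)"
    )
    dest.executemany("INSERT INTO orders_clean VALUES (?, ?, ?)", rows)
    dest.commit()


def run_etl():
    # Hypothetical source system, populated with sample data for the sketch.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, name TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, " alice ", 10.5), (2, "bob", None), (3, "carol", 7.25)],
    )
    # Hypothetical destination system; the data is physically copied here.
    dest = sqlite3.connect(":memory:")
    load(dest, transform(extract(source)))
    return dest.execute(
        "SELECT id, name, amount_cents FROM orders_clean ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

Note that the destination ends up holding its own copy of the data, which is where the storage and security concerns discussed next come from.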

ETL is valuable because it lets us apply business logic to the data, transform data from various sources, and move it into a single destination or into multiple formats. However, the ETL process comes with some challenges:

  • Moving data from the source requires extra resources in terms of disk space
  • Data security is another concern: each copy of the data must be secured against unauthorized access
  • An ETL process can be slow to run, and its complex logic takes effort to maintain

SQL Server 2016 introduced a new feature, PolyBase, that allows querying both relational and non-relational data sources. This data virtualization integrates data from multiple sources without moving it, creating a virtual data layer that is sometimes called a data lake or data hub. We can access all the data from a single place, which also lets us control security from a single point. In SQL Server 2016, PolyBase can query Hadoop and Azure Blob Storage.

In the article SQL Server 2016 – PolyBase tutorial, we explored how to query a CSV file stored in Azure Blob Storage from SQL Server 2016 using PolyBase.

SQL Server 2019 enhances PolyBase to access data from additional data sources such as Oracle, Teradata, MongoDB, and PostgreSQL. We can also access data from any data source that has an ODBC driver. We create external tables that link to these data sources (SQL Server, Oracle, Teradata, MongoDB, or any data source with an ODBC driver), and users query these external tables just as they would query a regular relational table. When a query runs, the data is retrieved from the linked data source through the external table and returned to the user.
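
To make the external-table idea more concrete, the following Python sketch assembles the kind of T-SQL statements typically used to link an external source: `CREATE EXTERNAL DATA SOURCE` followed by `CREATE EXTERNAL TABLE`. It only builds and prints the statements; it does not connect to a server. All names here (`OracleSales`, `OracleCredential`, `dbo.Customers`, the host, and the remote object `XE.SALES.CUSTOMERS`) are hypothetical, the database scoped credential is assumed to exist already, and the exact options may differ between preview builds, so treat this as an assumption-laden sketch rather than a tested script.

```python
def polybase_ddl(source_name, location, credential, table, columns, remote_object):
    """Build the T-SQL to define an external data source and an external table.

    columns is a list of (column_name, sql_type) pairs describing the
    local shape of the remote object.
    """
    column_list = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    data_source = (
        f"CREATE EXTERNAL DATA SOURCE {source_name}\n"
        f"WITH (LOCATION = '{location}', CREDENTIAL = {credential});"
    )
    external_table = (
        f"CREATE EXTERNAL TABLE {table} (\n"
        f"    {column_list}\n"
        f")\n"
        f"WITH (LOCATION = '{remote_object}', DATA_SOURCE = {source_name});"
    )
    return [data_source, external_table]


if __name__ == "__main__":
    # Hypothetical Oracle source; every identifier below is illustrative only.
    statements = polybase_ddl(
        source_name="OracleSales",
        location="oracle://oracle-host:1521",
        credential="OracleCredential",
        table="dbo.Customers",
        columns=[("CustomerId", "INT"), ("Name", "NVARCHAR(100)")],
        remote_object="XE.SALES.CUSTOMERS",
    )
    print("\n\n".join(statements))
```

Once such an external table exists, a plain `SELECT` against `dbo.Customers` would read from the remote source at query time, with no copy of the data stored locally.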

The image below shows PolyBase in SQL Server 2019:

PolyBase in SQL Server 2019
