Azure Data Catalog frequently asked questions



This article answers frequently asked questions about the Azure Data Catalog service.

What is Azure Data Catalog?

Data Catalog is a fully managed service, hosted in Microsoft Azure, that serves as a system of registration and discovery for enterprise data sources. With Data Catalog, any user, from analysts to data scientists and developers, can register, discover, understand, and consume data sources.

What customer challenges does it solve?

Data Catalog addresses the challenges of data-source discovery and "dark data" so that users can discover and understand enterprise data sources.

What are its target audiences?

Data Catalog is designed for technical and non-technical users, including:

Data developers and BI and analytics professionals: People who are responsible for producing data and analytics content for others to consume.

Data stewards: People who know the data, what it means, and how it is intended to be used.

Data consumers: People who need to easily discover, understand, and connect to the data they need to do their job, by using the tool of their choice.

Central IT: People who need to make hundreds of data sources discoverable by business users, and who need to maintain oversight over how data is being used and by whom.

What is its availability by region?

Data Catalog services are currently available in the following data centers:

West US

East US

West Europe

North Europe

Australia East

Southeast Asia

What are its limits on the number of data assets?

The Free Edition of Data Catalog is limited to 5,000 registered data assets.

The Standard Edition of Data Catalog supports up to 100,000 registered data assets.

Any object registered in Data Catalog, such as a table, view, file, or report, counts as a data asset.
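
As a rough illustration of these limits, the following Python sketch checks whether a planned registration would fit within an edition's quota. The function and dictionary names are hypothetical and are not part of any Data Catalog SDK; only the two limits come from this FAQ.

```python
# Limits quoted in this FAQ. All names below are illustrative assumptions,
# not part of any Data Catalog API.
EDITION_LIMITS = {"free": 5_000, "standard": 100_000}


def remaining_capacity(edition: str, registered_assets: int) -> int:
    """Return how many more assets fit under the edition's limit."""
    limit = EDITION_LIMITS[edition.lower()]
    return max(limit - registered_assets, 0)


def can_register(edition: str, registered_assets: int, new_assets: int) -> bool:
    """Return True if new_assets more objects stay within the limit."""
    return new_assets <= remaining_capacity(edition, registered_assets)
```

Because every table, view, file, and report counts separately, a single database with many objects can consume a large share of the Free Edition quota.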

What are its supported data source and asset types?

For a list of currently supported data sources, see Data Catalog DSR.

How do I request support for another data source?

To submit feature requests and other feedback, go to the Data Catalog section of the Azure Feedback Forums.

Why do I get a "Catalog already exists" error when I try to create a new catalog?

When you purchase Office 365 E5 with a Power BI Pro license, Microsoft automatically creates a default catalog in the subscription's region. This catalog uses the free SKU. The Office 365 / Power BI user license is managed in the administration page.

However, this type of data catalog does not have an Administrator option and is not visible in the Azure portal. You cannot delete this type of data catalog. Similarly, you cannot rename it or move it to another region.

User accounts that are assigned a Power BI Pro license automatically have access to the data catalog, under the license agreement accepted when signing up for Office 365 E5 with the Power BI Pro license. This type of user has full access to data catalog assets but has no administrative privileges. This type of user is not part of the Catalog User role in Azure Data Catalog.

How do I get started with Data Catalog?

The best way to get started is to read Getting Started with Data Catalog. That article is an end-to-end overview of the capabilities in the service.

How do I register my data?

To register your data in Data Catalog:

In the Azure Data Catalog portal, in the Publish area, start the Azure Data Catalog registration tool.

In the Data Catalog data source registration tool, sign in with the same credentials that you use to access the Data Catalog portal.

Select the data source and the specific assets that you want to register.

What properties does it extract for data assets that are registered?

The specific properties differ from data source to data source but, in general, the Data Catalog publishing service extracts the following information:

Asset Name

Asset Type

Asset Description

Attribute/Column Names

Attribute/Column Data Types

Attribute/Column Description


Registering data assets with Data Catalog does not move or copy your data to the cloud. Registering assets from a data source copies the assets' metadata to Azure, but the data remains in the existing data-source location. The exception to this rule is if you choose to upload preview records or a data profile when you register the assets. When you include a preview, up to 20 records are copied from each asset and stored as a snapshot in Data Catalog. When you include a data profile, aggregate information is calculated and included in the metadata that's stored in the catalog. Aggregate information can include the size of tables, the percentage of null values per column, or the minimum, maximum, and average values for columns.
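
To make the difference between a preview and a data profile concrete, here is a minimal Python sketch that builds both from in-memory rows. It is an illustration only: the registration tool's actual implementation is not described in this FAQ, and every function and field name below is an assumption. Only the 20-record cap and the kinds of aggregates come from the text above.

```python
from statistics import mean

PREVIEW_LIMIT = 20  # the FAQ states that up to 20 records are copied per asset


def build_preview(rows):
    """Return a snapshot of at most PREVIEW_LIMIT records."""
    return rows[:PREVIEW_LIMIT]


def build_profile(rows, columns):
    """Compute aggregate metadata only; no row values are stored as-is."""
    profile = {"row_count": len(rows), "columns": {}}
    for col in columns:
        values = [row.get(col) for row in rows]
        non_null = [v for v in values if v is not None]
        null_pct = 100.0 * (len(values) - len(non_null)) / len(values) if values else 0.0
        stats = {"null_percent": null_pct}
        # Min, max, and average are added only when every non-null value is numeric.
        numeric = [
            v for v in non_null
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        if numeric and len(numeric) == len(non_null):
            stats.update(min=min(numeric), max=max(numeric), avg=mean(numeric))
        profile["columns"][col] = stats
    return profile
```

The preview holds actual records, which is why it counts as an exception to the "metadata only" rule, while the profile holds only summary statistics derived from the data.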


For data sources such as SQL Server Analysis Services that have a first-class Description property, the Data Catalog data source registration tool extracts that property value. For on-premises SQL Server relational databases, which lack a first-class Description property, the Data Catalog data source registration tool extracts the value from the ms_description extended property for objects and columns. This property is not supported for SQL Azure.

How long should it take for newly registered assets to appear in the catalog?

After you register assets with Data Catalog, there may be a delay of 5 to 10 seconds before they appear in the Data Catalog portal.
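
If a script needs to act on a newly registered asset, it can poll until the asset becomes visible rather than assume it appears immediately. The sketch below is a generic polling helper in Python; `is_visible` is a hypothetical callable that you would supply (for example, a function that searches the catalog for the asset), and the default timeout and interval are assumptions chosen to comfortably exceed the 5 to 10 second delay mentioned above.

```python
import time


def wait_until_visible(is_visible, timeout=30.0, interval=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call is_visible() repeatedly until it returns True or timeout elapses.

    is_visible is a caller-supplied function (hypothetical here) that
    reports whether the asset can already be found in the catalog.
    Returns True if the asset became visible, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if is_visible():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the helper can be exercised in tests without real waiting.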

How do I annotate and enrich the metadata for my registered data assets?

The simplest way to provide metadata for registered assets is to select the asset in the Data Catalog portal and then enter the values in the properties pane or schema pane for the selected object.

You can also provide some metadata, such as experts and tags, during the registration process. The values you provide in the Data Catalog publishing service apply to all assets being registered at that time. To view the recently registered objects in the portal for additional annotation, select the View Portal button on the final screen of the Data Catalog data source registration tool.

How do I delete my registered data objects?

You can delete an object from Data Catalog by selecting the object in the portal and then clicking the Delete button. Deleting the object removes its metadata from Data Catalog but does not affect the underlying data source.

What is an expert?

An expert is a person who has an informed perspective about a data object. An object can have multiple experts. An expert does not need to be the "owner" of an object, but is simply someone who knows how the data can and should be used.

How do I share information with the Data Catalog team if I encounter problems?

To report problems, share information, and ask questions, go to the Azure Data Catalog forum.

Does the catalog work with another data source that I'm interested in?

We're actively working on adding more data sources to Data Catalog. If you want to see a specific data source supported, suggest it (or voice your support if it has already been suggested) in the Data Catalog section of the Azure Feedback Forums.

What permissions do I need to register assets with Data Catalog?

To run the Data Catalog registration tool, you need permissions on the data source that allow you to read the metadata from the source. To also include a preview, you must have permissions that allow you to read the data from the objects being registered.

Data Catalog also allows catalog administrators to restrict which users and groups can add metadata to the catalog.

Will Data Catalog be made available for on-premises deployment as well?

Data Catalog is a cloud service that can work with both cloud and on-premises data sources to deliver a hybrid data-source discovery solution. There are currently no plans for a version of the Data Catalog service that runs on-premises.

Can I extract more or richer metadata from the data sources I register?

We're actively working to expand the capabilities of Data Catalog. If you want to have additional metadata extracted from the data source during registration, suggest it (or vote for it, if it has already been suggested) in the Data Catalog section of the Azure Feedback Forums.

If you would like to include column/schema metadata, previews, or data profiles for data sources where the data source registration tool does not extract this metadata, you can use the Data Catalog API to add it. For additional information, see Azure Data Catalog REST API.

How do I restrict the visibility of registered data assets, so that only certain people can discover them?

Select the data assets in Data Catalog, and then click the Take Ownership button. Owners of data assets in Data Catalog can change the visibility settings to either allow all users to discover the owned assets or restrict visibility to specific users.
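
The ownership and visibility rule above can be modeled in a few lines of Python. This is a conceptual sketch, not the service's implementation: the `Asset` class and all function names are hypothetical, and the choice to keep owners in the visibility list after a restriction is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Asset:
    name: str
    owners: Set[str] = field(default_factory=set)
    # None means every catalog user can discover the asset.
    visible_to: Optional[Set[str]] = None


def take_ownership(asset: Asset, user: str) -> None:
    """Model of the Take Ownership action: add the user as an owner."""
    asset.owners.add(user)


def restrict_visibility(asset: Asset, acting_user: str, allowed_users) -> None:
    """Only owners may change visibility settings."""
    if acting_user not in asset.owners:
        raise PermissionError(f"{acting_user} does not own {asset.name}")
    # Assumption for this sketch: owners always keep visibility.
    asset.visible_to = set(allowed_users) | asset.owners


def discoverable(assets, user: str):
    """Return the names of the assets that the given user can discover."""
    return [a.name for a in assets
            if a.visible_to is None or user in a.visible_to]
```

In this model an asset with no visibility restriction stays discoverable by everyone, which matches the default described in the answer.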

How do I update the registration for a data asset so that changes in the data source are reflected in the catalog?

To update the metadata for data assets that are already registered in the catalog, re-register the data source that contains the assets. Any changes in the data source, such as columns being added to or removed from tables or views, are updated in the catalog, but any annotations provided by users are retained.
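
The re-registration behavior described above amounts to a merge: structural metadata is refreshed from the source, while user annotations are carried over. The Python sketch below illustrates that idea with plain dictionaries. The data layout and the `reregister` function are hypothetical, and dropping the annotations of a removed column along with the column is an assumption of this example rather than documented behavior.

```python
def reregister(existing_asset, fresh_columns):
    """Merge freshly extracted schema into an already registered asset.

    existing_asset: {"columns": {name: {"type": str, "annotations": dict}}, ...}
    fresh_columns:  {name: type} as read from the data source just now.
    """
    old_columns = existing_asset["columns"]
    merged = {}
    for name, data_type in fresh_columns.items():
        # Keep user annotations for columns that still exist in the source.
        annotations = old_columns.get(name, {}).get("annotations", {})
        merged[name] = {"type": data_type, "annotations": dict(annotations)}
    # Asset-level fields (for example tags or experts) are left untouched.
    return {**existing_asset, "columns": merged}
```

For example, if a column's type changes in the source and a new column is added, the result reflects both changes while the description a user wrote for the existing column remains in place.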

My question isn't answered here. Where can I go for answers?

Go to the Azure Data Catalog forum. Questions asked there will find their way here.

