What is Azure Data Catalog?
8/27/2018 • 4 minutes to read
Azure Data Catalog is a fully managed cloud service that lets users discover the data sources they need and
understand the data sources they find. At the same time, Data Catalog helps organizations get more value from
their existing investments.
With Data Catalog, any user (analyst, data scientist, or developer) can discover, understand, and consume data
sources. Data Catalog includes a crowdsourcing model of metadata and annotations. It is a single, central place for
all of an organization's users to contribute their knowledge and build a community and culture of data.
Next steps
To get started with Data Catalog, go to:
Microsoft Azure Data Catalog
Get started with Azure Data Catalog
Azure Data Catalog common scenarios
8/27/2018 • 5 minutes to read
This article presents common scenarios where Azure Data Catalog can help your organization get more value
from its existing data sources.
You can publish metadata by using a public API or a click-once registration tool, or by manually entering
information directly in the Azure Data Catalog web portal. The following table summarizes all data sources that
are supported by the catalog today, and the publishing capabilities for each. Also listed are the external data tools
that each data source can launch from the portal's "open-in" experience. The second table contains a more technical
specification of each data-source connection property.
Azure Storage table ✓ ✓ ✓
HDFS directory ✓ ✓ ✓
HDFS file ✓ ✓ ✓
DB2 table ✓ ✓ ✓
DB2 view ✓ ✓ ✓
FTP directory ✓ ✓ ✓
FTP file ✓ ✓ ✓
HTTP report ✓
HTTP endpoint ✓
HTTP file ✓
OData function ✓
PostgreSQL table ✓ ✓ ✓
PostgreSQL view ✓ ✓ ✓
Salesforce object ✓ ✓ ✓
SharePoint list ✓
Azure Cosmos DB collection ✓ ✓ ✓
Generic ODBC table ✓ ✓ ✓
Generic ODBC view ✓ ✓ ✓
Sybase table ✓ ✓ ✓
Sybase view ✓ ✓ ✓
If you want to see a specific data source supported, suggest it (or voice your support if it has already been
suggested) by going to the Data Catalog on the Azure Feedback Forums.
Azure Data Catalog is a fully managed cloud service that serves as a system of registration and system of
discovery for enterprise data assets. For a detailed overview, see What is Azure Data Catalog.
This tutorial helps you get started with Azure Data Catalog. You perform the following procedures in this
tutorial:
PROCEDURE DESCRIPTION
Provision data catalog In this procedure, you provision or set up Azure Data
Catalog. You do this step only if the catalog has not been set
up before. You can have only one data catalog per
organization (Microsoft Azure Active Directory domain) even
though there are multiple subscriptions associated with your
Azure account.
Register data assets In this procedure, you register data assets from the
AdventureWorks2014 sample database with the data
catalog. Registration is the process of extracting key
structural metadata such as names, types, and locations
from the data source and copying that metadata to the
catalog. The data source and data assets remain where they
are, but the metadata is used by the catalog to make them
more easily discoverable and understandable.
Discover data assets In this procedure, you use the Azure Data Catalog portal to
discover data assets that were registered in the previous
step. After a data source has been registered with Azure
Data Catalog, its metadata is indexed by the service so that
users can easily search for the data they need.
Annotate data assets In this procedure, you provide annotations (information such
as descriptions, tags, documentation, or experts) for the data
assets. This information supplements the metadata extracted
from the data source and makes the data source more
understandable to more people.
Connect to data assets In this procedure, you open data assets in integrated client
tools (such as Excel and SQL Server Data Tools) and a non-
integrated tool (SQL Server Management Studio).
Manage data assets In this procedure, you set up security for your data assets.
Data Catalog does not give users access to the data itself.
The owner of the data source controls data access.
With Data Catalog, you can discover data sources and view
the metadata related to the sources registered in the
catalog. There may be situations, however, where data
sources should be visible only to specific users or to
members of specific groups. For these scenarios, you can use
Data Catalog to take ownership of registered data assets
within the catalog and control the visibility of the assets you
own.
Remove data assets In this procedure, you learn how to remove data assets from
the data catalog.
Tutorial prerequisites
Azure subscription
To set up Azure Data Catalog, you must be the owner or co-owner of an Azure subscription.
Azure subscriptions help you organize access to cloud service resources like Azure Data Catalog. They also help
you control how resource usage is reported, billed, and paid for. Each subscription can have a different billing
and payment setup, so you can have different subscriptions and different plans by department, project, regional
office, and so on. Every cloud service belongs to a subscription, and you need to have a subscription before
setting up Azure Data Catalog. To learn more, see Manage accounts, subscriptions, and administrative roles.
If you don't have a subscription, you can create a free trial account in just a couple of minutes. See Free Trial for
details.
Azure Active Directory
To set up Azure Data Catalog, you must be signed in with an Azure Active Directory (Azure AD) user account.
You must be the owner or co-owner of an Azure subscription.
Azure AD provides an easy way for your business to manage identity and access, both in the cloud and on-
premises. You can use a single work or school account to sign in to any cloud or on-premises web application.
Azure Data Catalog uses Azure AD to authenticate sign-in. To learn more, see What is Azure Active Directory.
Azure Active Directory policy configuration
You may encounter a situation where you can sign in to the Azure Data Catalog portal, but when you attempt to
sign in to the data source registration tool, you encounter an error message that prevents you from signing in.
This error may occur when you are on the company network or when you are connecting from outside the
company network.
The registration tool uses forms authentication to validate user sign-ins against Azure Active Directory. For
successful sign-in, an Azure Active Directory administrator must enable forms authentication in the global
authentication policy.
With the global authentication policy, you can enable authentication separately for intranet and extranet
connections, as shown in the following image. Sign-in errors may occur if forms authentication is not enabled
for the network from which you're connecting.
For more information, see Configuring authentication policies.
2. Sign in with a user account that is the owner or co-owner of an Azure subscription. You see the following
page after signing in.
3. Specify a name for the data catalog, the subscription you want to use, and the location for the catalog.
4. Expand Pricing and select an Azure Data Catalog edition (Free or Standard).
5. Expand Catalog Users and click Add to add users for the data catalog. You are automatically added to this
group.
6. Expand Catalog Administrators and click Add to add additional administrators for the data catalog. You
are automatically added to this group.
7. Click Create Catalog to create the data catalog for your organization. You see the home page for the data
catalog after it is created.
4. You can view properties of the data catalog and update them. For example, click Pricing tier and change
the edition.
Adventure Works sample database
In this tutorial, you register data assets (tables) from the AdventureWorks2014 sample database for the SQL
Server Database Engine, but you can use any supported data source if you would prefer to work with data that
is familiar and relevant to your role. For a list of supported data sources, see Supported data sources.
Install the Adventure Works 2014 OLTP database
The Adventure Works database supports standard online transaction-processing scenarios for a fictitious
bicycle manufacturer (Adventure Works Cycles), which includes products, sales, and purchasing. In this tutorial,
you register information about products into Azure Data Catalog.
To install the Adventure Works sample database:
1. Download Adventure Works 2014 Full Database Backup.zip from CodePlex.
2. To restore the database on your machine, follow the instructions in Restore a Database Backup by using SQL
Server Management Studio, or follow these steps:
a. Open SQL Server Management Studio and connect to the SQL Server Database Engine.
b. Right-click Databases and click Restore Database.
c. Under Restore Database, click the Device option for Source and click Browse.
d. Under Select backup devices, click Add.
e. Go to the folder where you have the AdventureWorks2014.bak file, select the file, and click OK to
close the Locate Backup File dialog box.
f. Click OK to close the Select backup devices dialog box.
g. Click OK to close the Restore Database dialog box.
You can now register data assets from the Adventure Works sample database by using Azure Data Catalog.
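If you prefer a script to the dialog boxes, the restore steps above can be approximated with a single T-SQL statement. The following Python sketch only builds that statement as text; the backup path is a hypothetical example, and depending on where the backup was created you might also need a WITH MOVE clause, which is not shown here.

```python
def build_restore_statement(database: str, backup_path: str) -> str:
    """Return a T-SQL RESTORE DATABASE statement for a full backup file."""
    # Escape "]" inside the bracket-quoted identifier and "'" inside the string literal.
    safe_name = database.replace("]", "]]")
    safe_path = backup_path.replace("'", "''")
    return (
        f"RESTORE DATABASE [{safe_name}] "
        f"FROM DISK = N'{safe_path}' "
        "WITH RECOVERY;"
    )


if __name__ == "__main__":
    # Hypothetical path: point it at the folder that holds AdventureWorks2014.bak.
    print(build_restore_statement("AdventureWorks2014", r"C:\Backups\AdventureWorks2014.bak"))
```

You can paste the printed statement into a query window in SQL Server Management Studio and run it against the SQL Server Database Engine.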
4. On the Microsoft Azure Data Catalog page, click SQL Server and Next.
5. Enter the SQL Server connection properties for AdventureWorks2014 (see the following example) and
click CONNECT.
6. Register the metadata of your data asset. In this example, you register Production/Product objects
from the AdventureWorks Production namespace:
a. In the Server Hierarchy tree, expand AdventureWorks2014 and click Production.
b. Select Product, ProductCategory, ProductDescription, and ProductPhoto by using Ctrl+click.
c. Click the move selected arrow (>). This action moves all selected objects into the Objects to be
registered list.
d. Select Include a Preview to include a snapshot preview of the data. The snapshot includes up to 20
records from each table, and it is copied into the catalog.
e. Select Include Data Profile to include a snapshot of the object statistics for the data profile (for
example: minimum, maximum, and average values for a column, number of rows).
f. In the Add tags field, enter adventure works, cycles. This action adds search tags for these data
assets. Tags are a great way to help users find a registered data source.
g. Specify the name of an expert on this data (optional).
h. Click REGISTER. Azure Data Catalog registers your selected objects. In this exercise, the selected
objects from Adventure Works are registered. The registration tool extracts metadata from the
data asset and copies that data into the Azure Data Catalog service. The data remains where it
currently resides, and it remains under the control of the administrators and policies of the current
system.
i. To see your registered data source objects, click View Portal. In the Azure Data Catalog portal,
confirm that you see all four tables and the database in the grid view.
In this exercise, you registered objects from the Adventure Works sample database so that they can be easily
discovered by users across your organization. In the next exercise, you learn how to discover registered data
assets.
3. Confirm that you see all four tables and the database (AdventureWorks2014) in the results. You can
switch between grid view and list view by clicking buttons on the toolbar as shown in the following
image. Notice that the search keyword is highlighted in the search results because the Highlight option
is ON. You can also specify the number of results per page in search results.
The Searches panel is on the left and the Properties panel is on the right. On the Searches panel, you
can change search criteria and filter results. The Properties panel displays properties of a selected object
in the grid or list.
4. Click Product in the search results. Click the Preview, Columns, Data Profile, and Documentation
tabs, or click the arrow to expand the bottom pane.
On the Preview tab, you see a preview of the data in the Product table.
5. Click the Columns tab to find details about columns (such as name and data type) in the data asset.
6. Click the Data Profile tab to see the profiling of data (for example: number of rows, size of data, or
minimum value in a column) in the data asset.
7. Filter the results by using Filters on the left. For example, click Table for Object Type, and you see only
the four tables, not the database.
3. Select one of the actions you can take on the saved search (Rename, Delete, Save As Default search).
Boolean operators
You can broaden or narrow your search with Boolean operators.
1. In the search box, enter tags:cycles AND objectType:table, and press ENTER.
2. Confirm that you see only tables (not the database) in the results.
Comparison operators
With comparison operators, you can use comparisons other than equality for properties that have numeric and
date data types.
1. In the search box, enter lastRegisteredTime:>"06/09/2016".
2. Clear the Table filter under Object Type.
3. Press ENTER.
4. Confirm that you see the Product, ProductCategory, ProductDescription, and ProductPhoto tables
and the AdventureWorks2014 database you registered in search results.
See How to discover data assets for detailed information about discovering data assets and Data Catalog
Search syntax reference for search syntax.
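If you compose search strings in a script or tool, it can help to build them from small pieces. The following Python sketch assembles the same queries used in this exercise. The helper names (term, compare, all_of, any_of) are hypothetical and are not part of Data Catalog; only the ">" comparison appears in this tutorial, so treat the other operators as assumptions to check against the search syntax reference.

```python
def term(prop: str, value: str) -> str:
    """Property-scoped term, for example tags:cycles."""
    return f"{prop}:{value}"


def compare(prop: str, op: str, value: str) -> str:
    """Comparison term, for example lastRegisteredTime:>"06/09/2016"."""
    # Only ">" is shown in this tutorial; the other operators are assumptions.
    if op not in (">", "<", ">=", "<="):
        raise ValueError(f"unsupported comparison operator: {op}")
    return f'{prop}:{op}"{value}"'


def all_of(*parts: str) -> str:
    """Narrow the search: every part must match."""
    return " AND ".join(parts)


def any_of(*parts: str) -> str:
    """Broaden the search: any part may match. Parentheses keep the group isolated."""
    return "(" + " OR ".join(parts) + ")"


if __name__ == "__main__":
    print(all_of(term("tags", "cycles"), term("objectType", "table")))
    print(compare("lastRegisteredTime", ">", "06/09/2016"))
```

The first printed query matches the Boolean operator example, and the second matches the comparison operator example.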
The Description helps others discover and understand why and how to use the selected data asset. You
can also add more tags and view columns. Now you can try searching and filtering to discover data
assets by using the descriptive metadata you’ve added to the catalog.
You can also do the following on this page:
Add experts for the data asset. Click Add in the Experts area.
Add tags at the dataset level. Click Add in the Tags area. A tag can be a user tag or a glossary tag. The
Standard Edition of Data Catalog includes a business glossary that helps catalog administrators define a
central business taxonomy. Catalog users can then annotate data assets with glossary terms. For more
information, see How to set up the Business Glossary for Governed Tagging.
Add tags at the column level. Click Add under Tags for the column you want to annotate.
Add description at the column level. Enter Description for a column. You can also view the description
metadata extracted from the data source.
Add Request access information that shows users how to request access to the data asset.
Choose the Documentation tab and provide documentation for the data asset. With Azure Data
Catalog documentation, you can use your data catalog as a content repository to create a complete
narrative of your data assets.
You can also add an annotation to multiple data assets. For example, you can select all the data assets you
registered and specify an expert for them.
Azure Data Catalog supports a crowd-sourcing approach to annotations. Any Data Catalog user can add tags
(user or glossary), descriptions, and other metadata, so that any user with a perspective on a data asset and its
use can have that perspective captured and available to other users.
See How to annotate data assets for detailed information about annotating data assets.
2. Click Open in the download pop-up window. This experience may vary depending on the browser.
4. Keep the defaults in the Import Data dialog box and click OK.
5. View the data source in Excel.
In this exercise, you connected to data assets discovered by using Azure Data Catalog. With the Azure Data
Catalog portal, you can connect directly by using the client applications integrated into the Open in menu. You
can also connect with any application you choose by using the connection location information included in the
asset metadata. For example, you can use SQL Server Management Studio to connect to the
AdventureWorks2014 database to access the data in the data assets registered in this tutorial.
1. Open SQL Server Management Studio.
2. In the Connect to Server dialog box, enter the server name from the Properties pane in the Azure Data
Catalog portal.
3. Use appropriate authentication and credentials to access the data asset. If you don't have access, use
information in the Request Access field to get it.
Click View Connection Strings to view and copy ADO.NET, ODBC, and OLEDB connection strings to the
clipboard for use in your application.
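For orientation, the following Python sketch shows the general shape of the three kinds of connection strings for a SQL Server database that uses Windows authentication. It is an illustration only: the exact strings that the portal generates may differ, and the ODBC driver name and OLE DB provider used here are assumptions about what is installed on your machine.

```python
def connection_strings(server: str, database: str) -> dict:
    """Return example SQL Server connection strings that use Windows authentication."""
    return {
        "ADO.NET": (
            f"Data Source={server};Initial Catalog={database};Integrated Security=True;"
        ),
        # Assumed driver name; use whichever ODBC driver is installed locally.
        "ODBC": (
            f"Driver={{ODBC Driver 17 for SQL Server}};Server={server};"
            f"Database={database};Trusted_Connection=yes;"
        ),
        # Assumed provider; newer OLE DB providers use a different Provider value.
        "OLEDB": (
            f"Provider=SQLOLEDB;Data Source={server};"
            f"Initial Catalog={database};Integrated Security=SSPI;"
        ),
    }


if __name__ == "__main__":
    # "localhost" is a placeholder; use the server name from the Properties pane.
    for kind, value in connection_strings("localhost", "AdventureWorks2014").items():
        print(f"{kind}: {value}")
```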
NOTE
The management capabilities described in this exercise are available only in the Standard Edition of Azure Data Catalog,
not in the Free Edition. In Azure Data Catalog, you can take ownership of data assets, add co-owners to data assets, and
set the visibility of data assets.
4. To restrict visibility, choose Owners & These Users in the Visibility section and click Add. Enter user
email addresses in the text box and press ENTER.
Remove data assets
In this exercise, you use the Azure Data Catalog portal to remove preview data from registered data assets and
delete data assets from the catalog.
In Azure Data Catalog, you can delete an individual asset or delete multiple assets.
1. Go to the Azure Data Catalog home page.
2. In the Search text box, enter tags:cycles and press ENTER.
3. Select an item in the result list and click Delete on the toolbar as shown in the following image:
If you are using the list view, the check box is to the left of the item as shown in the following image:
You can also select multiple data assets and delete them as shown in the following image:
NOTE
The default behavior of the catalog is to allow any user to register any data source, and to allow any user to delete any
data asset that has been registered. The management capabilities included in the Standard Edition of Azure Data Catalog
provide additional options for taking ownership of assets, restricting who can discover assets, and restricting who can
delete assets.
Summary
In this tutorial, you explored essential capabilities of Azure Data Catalog, including registering, annotating,
discovering, and managing enterprise data assets. Now that you’ve completed the tutorial, it’s time to get
started. You can begin today by registering the data sources you and your team rely on, and by inviting
colleagues to use the catalog.
References
How to register data assets
How to discover data assets
How to annotate data assets
How to document data assets
How to connect to data assets
How to manage data assets
Approach and process for adopting Azure Data Catalog
8/27/2018 • 16 minutes to read
This article helps you get started adopting Azure Data Catalog in your organization. To successfully adopt Azure
Data Catalog, you focus on three key items: define your vision, identify key business use cases within your
organization, and choose a pilot project.
NOTE
We wrote a sample tool that uses the Azure Data Catalog API to migrate an Excel workbook to Data Catalog. To learn
about the Data Catalog API and the sample tool, download the Ad Hoc workbook code sample, and check out the Azure
Data Catalog REST API documentation.
After the pilot project is in place, it's time to execute your Data Catalog adoption plan.
Execute
At this point you have identified use cases for Data Catalog, and you have identified your first project. In addition,
you have registered the key Adventure Works data sources and have added information from the existing Excel
workbook using the tool that IT built. Now it's time to work with the pilot team to start the Data Catalog adoption
process.
Here are some tips to get you started:
Create excitement - Business users get excited if they believe that Azure Data Catalog makes their lives
easier. Keep the conversation focused on the solution and the benefits it provides, not the technology.
Facilitate change - Start small and communicate the plan to business users. To be successful, it's crucial to
involve users from the beginning so that they influence the outcome and develop a sense of ownership about
the solution.
Groom early adopters - Early adopters are business users that are passionate about what they do, and excited
to evangelize the benefits of Azure Data Catalog to their peers.
Target training - Business users do not need to know everything about Data Catalog, so target training to
address specific team goals. Focus on what users do, and how some of their tasks might change, to incorporate
Azure Data Catalog into their daily routine.
Be willing to fail - If the pilot isn't achieving the desired results, reevaluate, and identify areas to change - fix
problems in the pilot before moving on to a larger scope.
Before your pilot team jumps into using Data Catalog, schedule a kick-off meeting to discuss expectations for the
pilot project, and provide initial training.
Set expectations
Setting expectations and goals helps business users focus on specific deliverables. To keep the project on track,
assign regular (for example: daily or weekly based on the scope and duration of the pilot) homework assignments.
One of the most valuable capabilities of Data Catalog is crowdsourcing data assets so that business users can
benefit from knowledge of enterprise data. A great homework assignment is for each pilot team member to
register or annotate at least one data source they have used. See Register a data source and How to annotate data
sources.
Meet with the team on a regular schedule to review some of the annotations. Good annotations about data sources
are at the heart of a successful Data Catalog adoption because they provide meaningful data source insights in a
central location. Without good annotations, knowledge about data sources remains scattered throughout the
enterprise. See How to annotate data sources.
The ultimate test of the project is whether users can discover and understand the data sources they need to
use. Pilot users should regularly test the catalog to ensure that the data sources they use for their day-to-day work
are relevant. When a required data source is missing or not properly annotated, this should serve as a reminder to
register additional data sources or to provide additional annotations. This practice not only adds value to the
pilot effort but also builds effective habits that carry over to other teams after the pilot is complete.
Provide training
Training should be enough to get the users started, and tailored to the specific goals and experience level of the
pilot team members. To get started with training, you can follow the steps in the Get started with Azure Data
Catalog article. In addition, you can download the Azure Data Catalog Pilot Project Training presentation. This
PowerPoint presentation should help you get started introducing Data Catalog to your pilot team members.
Conclusion
Once your pilot team is running fairly smoothly and you have achieved your initial goals, you should expand Data
Catalog adoption to more teams. Apply and refine what you learned from your pilot project to expand Data
Catalog throughout your organization.
The early adopters who participated in the pilot can be helpful to get the word out about the benefits of adopting
Data Catalog. They can share with other teams how Data Catalog helped their team solve business problems,
discover data sources more easily, and share insights about the data sources they use. For example, early adopters
on the Adventure Works pilot team could show others how easy it is to find information about Adventure Works
data assets that were once hard to find and understand.
This article was about getting started with Azure Data Catalog in your organization. We hope you were able to
start a Data Catalog pilot project, and expand Data Catalog throughout your organization.
You need to take care of a few things before you can set up Azure Data Catalog. Don’t worry, this process does not
take long.
Azure subscription
To set up Data Catalog, you must be the owner or co-owner of an Azure subscription.
Azure subscriptions help you organize access to cloud-service resources such as Data Catalog. Subscriptions also
help you control how resource usage is reported, billed, and paid for. Each subscription can have a separate billing
and payment setup, so you can have subscriptions and plans that vary by department, project, regional office, and
so on. Every cloud service belongs to a subscription, and you need to have a subscription before you set up Data
Catalog. To learn more, see Manage accounts, subscriptions, and administrative roles.
NOTE
By using the Azure portal, you can sign in with either a personal Microsoft account or an Azure Active Directory work or
school account. To set up Data Catalog by using either the Azure portal or the Data Catalog portal, you must sign in with an
Azure Active Directory account, not a personal account.
This article provides answers to frequently asked questions related to the Azure Data Catalog service.
What properties does it extract for data assets that are registered?
The specific properties differ from data source to data source but, in general, the Data Catalog publishing service
extracts the following information:
Asset Name
Asset Type
Asset Description
Attribute/Column Names
Attribute/Column Data Types
Attribute/Column Description
IMPORTANT
Registering data assets with Data Catalog does not move or copy your data to the cloud. Registering assets from a data
source copies the assets’ metadata to Azure, but the data remains in the existing data-source location. The exception to this
rule is if you choose to upload preview records or a data profile when you register the assets. When you include a preview, up
to 20 records are copied from each asset and stored as a snapshot in Data Catalog. When you include a data profile,
aggregate information is calculated and included in the metadata that's stored in the catalog. Aggregate information can
include the size of tables, the percentage of null values per column, or the minimum, maximum, and average values for
columns.
NOTE
For data sources such as SQL Server Analysis Services that have a first-class Description property, the Data Catalog data
source registration tool extracts that property value. For SQL Server relational databases, which lack a first-class Description
property, the Data Catalog data source registration tool extracts the value from the ms_description extended property for
objects and columns. For more information, see Using Extended Properties on Database Objects.
How long should it take for newly registered assets to appear in the catalog?
After you register assets with Data Catalog, there may be a period of 5 to 10 seconds before they appear in the
Data Catalog portal.
What is an expert?
An expert is a person who has an informed perspective about a data object. An object can have multiple experts.
An expert does not need to be the “owner” for an object, but is simply someone who knows how the data can and
should be used.
Does the catalog work with another data source that I’m interested in?
We’re actively working on adding more data sources to Data Catalog. If you want to see a specific data source
supported, suggest it (or voice your support if it has already been suggested) by going to the Data Catalog on the
Azure Feedback Forums.
How is Azure Data Catalog related to the Data Catalog in Power BI for Office 365?
You can think of Azure Data Catalog as an evolution of the Data Catalog in Power BI. As of spring 2017, Azure
Data Catalog is used to enable the sharing and discovery of queries in Excel 2016 and Power Query for Excel. Data
Catalog capabilities in Excel are available to users with Power BI Pro licenses.
Can I extract more or richer metadata from the data sources I register?
We’re actively working to expand the capabilities of Data Catalog. If you want to have additional metadata
extracted from the data source during registration, suggest it (or vote for it, if it has already been suggested) in the
Data Catalog on the Azure Feedback Forums.
If you would like to include column/schema metadata, previews, or data profiles, for data sources where this
metadata is not extracted by the data source registration tool, you can use the Data Catalog API to add this
metadata. For additional information, see Azure Data Catalog REST API.
How do I update the registration for a data asset so that changes in the data source are reflected in the catalog?
To update the metadata for data assets that are already registered in the catalog, simply re-register the data source
that contains the assets. Any changes in the data source, such as columns being added or removed from tables or
views, are updated in the catalog, but any annotations provided by users are retained.
Introduction
Azure Data Catalog is a fully managed cloud service that serves as a system of registration and discovery for
enterprise data sources. In other words, Data Catalog helps people discover, understand, and use data sources, and
it helps organizations get more value from their existing data. The first step to making a data source discoverable
via Data Catalog is to register that data source.
Structural metadata
When you register a data source, the registration tool extracts information about the structure of the objects you
select. This information is referred to as structural metadata.
For all objects, this structural metadata includes the object’s location, so that users who discover the data can use
that information to connect to the object in the client tools of their choice. Other structural metadata includes
object name and type, and attribute/column name and data type.
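To make the idea of structural metadata concrete, the following Python sketch reads object names, object types, and column names and data types from a database without touching any rows. It uses SQLite from the Python standard library as a stand-in data source, which is an assumption made only so that the example is self-contained; it is not how the registration tool itself is implemented, and the function name is hypothetical.

```python
import sqlite3


def extract_structural_metadata(conn: sqlite3.Connection, location: str) -> list:
    """Collect the name, type, location, and columns of every table and view."""
    assets = []
    objects = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()
    for name, obj_type in objects:
        # PRAGMA table_info returns one row per column: (cid, name, type, ...).
        columns = [
            {"name": col[1], "type": col[2]}
            for col in conn.execute(f'PRAGMA table_info("{name}")')
        ]
        assets.append(
            {"name": name, "type": obj_type, "location": location, "columns": columns}
        )
    return assets


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Product (ProductID INTEGER, Name TEXT)")
    for asset in extract_structural_metadata(conn, "sqlite://memory"):
        print(asset)
```

Only the description of the objects is collected; the rows in the Product table are never read.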
Descriptive metadata
In addition to the core structural metadata that's extracted from the data source, the data source registration tool
extracts descriptive metadata. For SQL Server Analysis Services and SQL Server Reporting Services, this
metadata is taken from the Description properties exposed by these services. For SQL Server, values provided
by using the ms_description extended property are extracted. For Oracle Database, the data-source registration tool
extracts the COMMENTS column from the ALL_TAB_COMMENTS view.
In addition to the descriptive metadata that's extracted from the data source, users can enter descriptive metadata
by using the data source registration tool. Users can add tags, and they can identify experts for the objects being
registered. All this descriptive metadata is copied to the Data Catalog service along with the structural metadata.
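If you want to check which descriptions already exist in a source before you register it, you can query the same locations yourself. The following Python sketch only holds two example queries as text: one against the SQL Server extended properties catalog view and one against the Oracle ALL_TAB_COMMENTS view. These queries are illustrations written for this article, not the queries that the registration tool runs.

```python
import textwrap

# Example queries only; they are not taken from the registration tool.
DESCRIPTION_QUERIES = {
    "sql_server": """
        SELECT s.name AS schema_name,
               o.name AS object_name,
               c.name AS column_name,  -- NULL when the description is on the object itself
               CAST(ep.value AS nvarchar(4000)) AS description
        FROM sys.extended_properties AS ep
        JOIN sys.objects AS o ON ep.major_id = o.object_id
        JOIN sys.schemas AS s ON o.schema_id = s.schema_id
        LEFT JOIN sys.columns AS c
               ON ep.major_id = c.object_id AND ep.minor_id = c.column_id
        WHERE ep.class = 1 AND ep.name = N'MS_Description'
    """,
    "oracle": """
        SELECT owner, table_name, table_type, comments
        FROM all_tab_comments
        WHERE comments IS NOT NULL
    """,
}


def description_query(source_type: str) -> str:
    """Return the example query text for the given source type."""
    return textwrap.dedent(DESCRIPTION_QUERIES[source_type]).strip()


if __name__ == "__main__":
    print(description_query("sql_server"))
```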
Include previews
By default, only metadata is extracted from data sources and copied to the Data Catalog service, but
understanding a data source is often made easier when you can view a sample of the data it contains.
By using the Data Catalog data-source registration tool, you can include a snapshot preview of the data in each
table and view that is registered. If you choose to include previews during registration, the registration tool
includes up to 20 records from each table and view. This snapshot is then copied to the catalog along with the
structural and descriptive metadata.
NOTE
Wide tables with a large number of columns might have fewer than 20 records included in their preview.
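The following Python sketch illustrates what a snapshot preview amounts to: a capped number of records read once and stored with the metadata. SQLite is used as a stand-in data source so that the example is self-contained, and the function name is hypothetical. The additional reduction for wide tables mentioned in the note is not modeled.

```python
import sqlite3

PREVIEW_LIMIT = 20  # The article states that a preview holds up to 20 records.


def snapshot_preview(conn: sqlite3.Connection, table: str, limit: int = PREVIEW_LIMIT) -> list:
    """Return up to `limit` rows of `table` as a list of column-name to value dicts."""
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,))
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Product (ProductID INTEGER, Name TEXT)")
    conn.executemany(
        "INSERT INTO Product VALUES (?, ?)",
        [(i, f"Product {i}") for i in range(1, 51)],
    )
    preview = snapshot_preview(conn, "Product")
    print(len(preview), preview[0])
```

Because the preview is a snapshot, later changes in the source table are not reflected until the data source is registered again.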
NOTE
Text and date columns do not include average or standard deviation statistics in their data profile.
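As a rough illustration of a data profile, the following Python sketch computes per-column aggregates and, in line with the preceding note, adds average and standard deviation only for numeric columns. The function name and the choice of population standard deviation are assumptions; the article does not say how the service calculates these values.

```python
import statistics
from numbers import Number


def profile_column(values: list) -> dict:
    """Compute simple aggregate statistics for one column of values."""
    non_null = [v for v in values if v is not None]
    profile = {
        "row_count": len(values),
        "null_percent": 100.0 * (len(values) - len(non_null)) / len(values) if values else 0.0,
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }
    # Average and standard deviation are meaningful only for numeric columns.
    is_numeric = bool(non_null) and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in non_null
    )
    if is_numeric:
        profile["average"] = statistics.mean(non_null)
        profile["std_dev"] = statistics.pstdev(non_null)  # population form, by assumption
    return profile


if __name__ == "__main__":
    print(profile_column([10, None, 20, 30]))
    print(profile_column(["Road", "Mountain", None]))
```

Only these aggregates would be stored with the metadata; the individual values stay in the data source.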
Update registrations
Registering a data source makes it discoverable in Data Catalog by using the metadata and optional preview
extracted during registration. If the data source needs to be updated in the catalog (for example, if the schema of
an object has changed, tables originally excluded should be included, or you want to update the data that's
included in the previews), the data source registration tool can be re-run.
Re-registering an already-registered data source performs a merge “upsert” operation: existing objects are
updated, and new objects are created. Any metadata provided by users through the Data Catalog portal is
retained.
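The merge behavior can be pictured with a small Python sketch. It only illustrates the upsert semantics described above and is not the service's implementation; the dictionary layout and the `source_metadata` and `user_annotations` field names are assumptions made for the example.

```python
def reregister(catalog: dict, extracted: dict) -> dict:
    """Merge freshly extracted metadata into the catalog (upsert)."""
    for name, source_metadata in extracted.items():
        if name in catalog:
            # Existing object: refresh the extracted metadata, keep user annotations.
            catalog[name]["source_metadata"] = source_metadata
        else:
            # New object: create it with no user annotations yet.
            catalog[name] = {"source_metadata": source_metadata, "user_annotations": {}}
    return catalog


# Hypothetical catalog state before re-running the registration tool.
catalog = {
    "dbo.Orders": {
        "source_metadata": {"columns": ["OrderID"]},
        "user_annotations": {"tags": ["sales"]},
    }
}
# Hypothetical result of a new extraction: one changed table, one new table.
extracted = {
    "dbo.Orders": {"columns": ["OrderID", "OrderDate"]},
    "dbo.Customers": {"columns": ["CustomerID"]},
}
reregister(catalog, extracted)
```

After the call, `dbo.Orders` carries the new column list and still has its `sales` tag, and `dbo.Customers` exists as a new entry.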
Summary
Because it copies structural and descriptive metadata from a data source to the catalog service, registering the
data source in Data Catalog makes the data easier to discover and understand. After you have registered the data
source, you can annotate, manage, and discover it by using the Data Catalog portal.
Next steps
For more information about registering data sources, see the Get Started with Azure Data Catalog tutorial.
How to discover data sources in Azure Data Catalog
8/27/2018 • 2 minutes to read
Introduction
Azure Data Catalog is a fully managed cloud service that serves as a system of registration and discovery for
enterprise data sources. In other words, Data Catalog helps people discover, understand, and use data sources, and
it helps organizations get more value from their existing data. After a data source is registered with Data Catalog,
its metadata is indexed by the service, so that you can easily search to discover the data you need.
Search syntax
Although the default free text search is simple and intuitive, you can also use Data Catalog search syntax for
greater control over the search results. Data Catalog search supports the following techniques:
TECHNIQUE: Basic search
USE: Basic search that uses one or more search terms. Results are any assets that match any property with one or
more of the terms specified.
EXAMPLE: sales data
TECHNIQUE: Grouping with parentheses
USE: Use parentheses to group parts of the query to achieve logical isolation, especially in conjunction with Boolean
operators.
EXAMPLE: name:finance AND (tags:Q1 OR tags:Q2)
TECHNIQUE: Comparison operators
USE: Use comparisons other than equality for properties that have numeric and date data types.
EXAMPLE: modifiedTime > "11/05/2014"
For more information about Data Catalog search, see the Azure Data Catalog article.
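As a rough illustration of how these techniques combine, the following Python sketch composes query strings and URL-encodes one of them. The helper names (`prop`, `any_of`, `all_of`, `compare`) are hypothetical and not part of Data Catalog; only the query syntax itself comes from the examples above.

```python
from urllib.parse import quote


def prop(name: str, value: str) -> str:
    """Scope a term to one property, for example tags:Q1."""
    return f"{name}:{value}"


def any_of(*terms: str) -> str:
    """Join terms with OR and wrap them in parentheses for logical isolation."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND."""
    return " AND ".join(terms)


def compare(name: str, operator: str, value: str) -> str:
    """Build a comparison such as modifiedTime > "11/05/2014"."""
    return f'{name} {operator} "{value}"'


query = all_of(prop("name", "finance"), any_of(prop("tags", "Q1"), prop("tags", "Q2")))
print(query)  # name:finance AND (tags:Q1 OR tags:Q2)

recent = compare("modifiedTime", ">", "11/05/2014")
encoded = quote(query)  # percent-encode before placing the query in a URL
```

Building the grouped part with a dedicated helper keeps the parentheses balanced, which matters once Boolean operators are mixed.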
Hit highlighting
When you view search results, any displayed properties that match the specified search terms (such as the data
asset name, description, and tags) are highlighted to make it easier to identify why a given data asset was returned
by a given search.
NOTE
To turn off hit highlighting, use the Highlight switch in the Data Catalog portal.
When you view search results, it might not always be obvious why a data asset is included, even with hit
highlighting enabled. Because all properties are searched by default, a data asset might be returned because of a
match on a column-level property. And because multiple users can annotate registered data assets with their own
tags and descriptions, not all metadata might be displayed in the list of search results.
In the default tile view, each tile displayed in the search results includes a View search term matches icon, so
that you can quickly view the number of matches and their location, and jump to them if you want.
Summary
Because registering a data source with Data Catalog copies structural and descriptive metadata from the data
source to the catalog service, the data source becomes easier to discover and understand. After you've registered a
data source, you can discover it by using filtering and search from within the Data Catalog portal.
Next steps
For step-by-step details about how to discover data sources, see Get Started with Azure Data Catalog.
How to annotate data sources
8/27/2018 • 4 minutes to read
Introduction
Microsoft Azure Data Catalog is a fully managed cloud service that serves as a system of registration and
system of discovery for enterprise data sources. In other words, Data Catalog is all about helping people discover,
understand, and use data sources, and helping organizations to get more value from their existing data. When a
data source is registered with Data Catalog, its metadata is copied and indexed by the service, but the story
doesn’t end there. Data Catalog allows users to provide their own descriptive metadata – such as descriptions and
tags – to supplement the metadata extracted from the data source, and to make the data source more
understandable to more people.
ANNOTATION NOTES
Friendly name Friendly names can be supplied at the data asset level, to
make the data assets more easily understood. Friendly names
are most useful when the underlying object name is cryptic,
abbreviated or otherwise not meaningful to users.
Tags (user tags) Tags can be supplied at the data asset and attribute / column
levels. User tags are user-defined labels that can be used to
categorize data assets or attributes.
Tags (glossary tags) Tags can be supplied at the data asset and attribute / column
levels. Glossary tags are centrally-defined glossary terms that
can be used to categorize data assets or attributes using a
common business taxonomy. For more information, see How
to set up the Business Glossary for Governed Tagging.
Experts Experts can be supplied at the data asset level. Experts identify
users or groups with expert perspectives on the data and can
serve as points of contact for users who discover the
registered data sources and have questions that are not
answered by the existing annotations.
Request access Request access information can be supplied at the data asset
level. This information is for users who discover a data source
that they do not yet have permissions to access. Users can
enter the email address of the user or group who grants
access, the URL of the process or tool that users need to gain
access, or can enter the process itself as text.
NOTE
Tags and experts can also be provided when registering data assets using the Data Catalog data source registration tool.
When selecting multiple tables and views, only columns that all selected data assets have in common will be
displayed in the Data Catalog portal. This allows users to provide tags and descriptions for all columns with the
same name for all selected assets.
Summary
Registering a data source with Data Catalog makes that data discoverable by copying structural and descriptive
metadata from the data source into the Catalog service. Once a data source has been registered, users can provide
annotations to make it easier to discover and understand from within the Data Catalog portal.
See also
Get Started with Azure Data Catalog tutorial for step-by-step details about how to annotate data sources.
Document data sources
8/27/2018 • 2 minutes to read
Introduction
Microsoft Azure Data Catalog is a fully managed cloud service that serves as a system of registration and
system of discovery for enterprise data sources. In other words, Azure Data Catalog is all about helping people
discover, understand, and use data sources, and helping organizations to get more value from their existing data.
When a data source is registered with Azure Data Catalog, its metadata is copied and indexed by the service, but
the story doesn’t end there. Azure Data Catalog also allows users to provide their own complete documentation
that can describe the usage and common scenarios for the data source.
In How to annotate data sources, you learn that experts who know the data source can annotate it with tags and a
description. The Azure Data Catalog portal includes a rich text editor so that users can fully document data
assets and containers. The editor includes paragraph formatting, such as headings, text formatting, bulleted lists,
numbered lists, and tables.
Tags and descriptions are great for simple annotations. However, to help data consumers better understand the
use of a data source, and business scenarios for a data source, an expert can provide complete, detailed
documentation. It's easy to document a data source. Select a data asset or container, and choose Documentation.
Documenting data assets
Azure Data Catalog documentation lets you use your Data Catalog as a content repository
to create a complete narrative of your data assets. You can explore detailed content that describes containers and
tables. If you already have content in another content repository, such as SharePoint or a file share, you can add to
the asset documentation links to reference this existing content. This feature makes your existing documents more
discoverable.
NOTE
Documentation is not included in the search index.
The level of documentation can range from describing the characteristics and value of a data asset container to a
detailed description of table schema within a container. The level of documentation provided should be driven by
your business needs. But in general, here are a few pros and cons of documenting data assets:
Document just a container: All the content is in one place, but might lack necessary details for users to make an
informed decision.
Document just the tables: Content is specific to that object, but your users have multiple places for documents.
Document containers and tables: Most comprehensive approach, but might introduce more maintenance of the
documents.
Summary
Documenting data sources with Azure Data Catalog can create a narrative about your data assets in as much
detail as you need. By using links, you can link to content stored in an existing content repository, which brings
your existing docs and data assets together. Once your users discover appropriate data assets, they can have a
complete set of documentation.
How to connect to data sources
8/27/2018 • 3 minutes to read
Introduction
Microsoft Azure Data Catalog is a fully managed cloud service that serves as a system of registration and
system of discovery for enterprise data sources. In other words, Azure Data Catalog is all about helping people
discover, understand, and use data sources, and helping organizations to get more value from their existing data. A
key aspect of this scenario is using the data – once a user discovers a data source and understands its purpose, the
next step is to connect to the data source to put its data to use.
When using the list view, the menu is available in the search bar at the top of the portal window.
Supported Client Applications
When using the “Open in…” menu for data sources in the Azure Data Catalog portal, the correct client application
must be installed on the client computer.
SQL Server Data Tools (protocol: vsweb://): requires Visual Studio 2013 Update 4 or later
with SQL Server tooling installed
Summary
Registering a data source with Azure Data Catalog makes that data discoverable by copying structural and
descriptive metadata from the data source into the Catalog service. Once a data source has been registered, and
discovered, users can connect to the data source from the Azure Data Catalog portal “Open in…” menu or by using
their data tools of choice.
See also
Get Started with Azure Data Catalog tutorial for step-by-step details about how to connect to data sources.
How to work with big data sources in Azure Data Catalog
8/27/2018 • 2 minutes to read
Introduction
Microsoft Azure Data Catalog is a fully managed cloud service that serves as a system of registration and
system of discovery for enterprise data sources. It is all about helping people discover, understand, and use data
sources, and helping organizations to get more value from their existing data sources, including big data.
Azure Data Catalog supports the registration of Azure Blob Storage blobs and directories as well as Hadoop
HDFS files and directories. The semi-structured nature of these data sources provides great flexibility. However, to
get the most value from registering them with Azure Data Catalog, users must consider how the data sources
are organized.
\vehicle_maintenance_events
    \2013
    \2014
    \2015
        \01
            \2015-01-trailer01.csv
            \2015-01-trailer92.csv
            \2015-01-canister9635.csv
            ...
\location_tracking_events
    \2013
    ...
In this example, vehicle_maintenance_events and location_tracking_events represent logical data sets. Each of these
folders contains data files that are organized by year and month into subfolders. Each of these folders could
potentially contain hundreds or thousands of files.
In this pattern, registering individual files with Azure Data Catalog probably does not make sense. Instead,
register the directories that represent the data sets that are meaningful to the users working with the data.
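A minimal sketch of that choice, assuming file paths shaped like the example tree (data set folder first, then year and month subfolders). The function name and the `depth` parameter are hypothetical, and the third file name is made up for the example.

```python
def directories_to_register(file_paths, depth=1):
    """Return the distinct leading folders (the logical data sets) for the given files."""
    datasets = set()
    for raw in file_paths:
        parts = [p for p in raw.split("\\") if p]
        folders = parts[:-1]  # drop the file name, keep only the folders
        if len(folders) >= depth:
            datasets.add("\\" + "\\".join(folders[:depth]))
    return sorted(datasets)


files = [
    r"\vehicle_maintenance_events\2015\01\2015-01-trailer01.csv",
    r"\vehicle_maintenance_events\2015\01\2015-01-trailer92.csv",
    r"\location_tracking_events\2013\05\2013-05-truck07.csv",  # made-up file name
]
to_register = directories_to_register(files)
print(to_register)
```

With `depth=1` the thousands of monthly files collapse into one directory per logical data set, which is the unit worth annotating and discovering.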
When an analyst or data scientist is working with the data contained in the larger directory structures, the data in
these reference files can be used to provide more detailed information for entities that are referred to only by
name or ID in the larger data set.
In this pattern, it makes sense to register the individual reference data files with Azure Data Catalog. Each file
represents a data set, and each one can be annotated and discovered individually.
Alternate patterns
The patterns described in the preceding section are just two possible ways a big data store may be organized, but
each implementation is different. Regardless of how your data sources are structured, when registering big data
sources with Azure Data Catalog, focus on registering the files and directories that represent the data sets that
are of value to others within your organization. Registering all files and directories can clutter the catalog, making it
harder for users to find what they need.
Summary
Registering data sources with Azure Data Catalog makes them easier to discover and understand. By registering
and annotating the big data files and directories that represent logical data sets, you can help users find and use
the big data sources they need.
Data profile data sources
8/27/2018 • 3 minutes to read
Introduction
Microsoft Azure Data Catalog is a fully managed cloud service that serves as a system of registration and
system of discovery for enterprise data sources. In other words, Azure Data Catalog is all about helping people
discover, understand, and use data sources, and helping organizations to get more value from their existing data.
When a data source is registered with Azure Data Catalog, its metadata is copied and indexed by the service, but
the story doesn’t end there.
The Data Profiling feature of Azure Data Catalog examines the data from supported data sources in your
catalog and collects statistics and information about that data. It's easy to include a profile of your data assets.
When you register a data asset, choose Include Data Profile in the data source registration tool.
NOTE
You can also add documentation to an asset to describe how data could be integrated into an application. See How to
document data sources.
NOTE
Selecting Include Data Profile in the data source registration tool includes both table and column-level profile information.
However, the Data Catalog API allows data assets to be registered with only one set of profile information included.
Summary
Data profiling provides statistics and information about registered data assets to help you determine the suitability
of the data to solve business problems. Along with annotating and documenting data sources, data profiles can
give users a deeper understanding of your data.
See Also
How to register data sources
Get started with Azure Data Catalog
Manage data assets in Azure Data Catalog
8/27/2018 • 3 minutes to read
Introduction
Azure Data Catalog is designed for data-source discovery, so that you can easily discover and understand the data
sources you need to perform analysis and make decisions. These discovery capabilities make the biggest impact
when you and other users can find and understand the broadest range of available data sources. With these
elements in mind, the default behavior of Data Catalog is for all registered data sources to be visible to and
discoverable by all catalog users.
Data Catalog does not give you access to the data itself. Data access is controlled by the owner of the data source.
With Data Catalog, you can discover data sources and view the metadata that's related to the sources that are
registered in the catalog.
There might be situations, however, where data sources should only be visible to specific users, or to members of
specific groups. In such scenarios, users can take ownership of registered data assets within the catalog and then
control the visibility of the assets they own.
NOTE
The functionality described in this article is available only in the Standard Edition of Azure Data Catalog. The Free Edition
does not provide capabilities for ownership and restricting data-asset visibility.
NOTE
Ownership in Data Catalog affects only the metadata that's stored in the catalog. Ownership does not confer any
permissions on the underlying data source.
Take ownership
Users can take ownership of data assets by selecting the Take Ownership option in the Data Catalog portal. No
special permissions are required to take ownership of an unowned data asset. Any user can take ownership of an
unowned data asset.
Add owners and co-owners
If a data asset is already owned, other users cannot simply take ownership. They must be added as co-owners by
an existing owner. Any owner can add additional users or security groups as co-owners.
NOTE
It is a best practice to have at least two individuals as owners for any owned data asset.
Remove owners
Just as any asset owner can add co-owners, any asset owner can remove any co-owner.
An asset owner who removes him or herself as an owner can no longer manage the asset. If the asset owner
removes him or herself as an owner and there are no other co-owners, the asset reverts to an unowned state.
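The ownership rules in the preceding sections can be summarized in a toy Python model. This is only a sketch of the rules as stated in this article, not how the service implements them; the class and method names are hypothetical, and catalog administrators are deliberately left out.

```python
class DataAsset:
    """Toy model of asset ownership; catalog administrators are ignored here."""

    def __init__(self, name):
        self.name = name
        self.owners = set()

    def take_ownership(self, user):
        # Any user may take ownership, but only while the asset is unowned.
        if self.owners:
            raise PermissionError("already owned: an existing owner must add you")
        self.owners.add(user)

    def add_co_owner(self, acting_user, new_owner):
        self._require_owner(acting_user)
        self.owners.add(new_owner)

    def remove_owner(self, acting_user, owner):
        self._require_owner(acting_user)
        self.owners.discard(owner)  # an empty set means the asset is unowned again

    def _require_owner(self, user):
        if user not in self.owners:
            raise PermissionError(f"{user} is not an owner of {self.name}")


asset = DataAsset("dbo.Orders")
asset.take_ownership("alice")
asset.add_co_owner("alice", "bob")
asset.remove_owner("bob", "alice")  # any owner can remove any co-owner
asset.remove_owner("bob", "bob")    # the last owner leaves
print(asset.owners)                 # set()
```

Once the owner set is empty, the asset is back in the unowned state and any user can take ownership again.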
Control visibility
Data-asset owners can control the visibility of the data assets they own. To restrict visibility from the default, where
all Data Catalog users can discover and view the data asset, the asset owner can toggle the visibility setting from
Everyone to Owners & These Users in the properties for the asset. Owners can then add specific users and
security groups.
NOTE
Whenever possible, asset ownership and visibility permissions should be assigned to security groups and not to individual
users.
Catalog administrators
Data Catalog administrators are implicitly co-owners of all assets in the catalog. Asset owners cannot remove
visibility from administrators, and administrators can manage ownership and visibility for all data assets in the
catalog.
Summary
The Data Catalog crowdsourcing approach to metadata and data-asset discovery allows all catalog users to
contribute and discover. The Standard Edition of Data Catalog adds ownership and management capabilities to limit
the visibility and use of specific data assets.
Save searches and pin data assets in Azure Data Catalog
8/27/2018 • 3 minutes to read
Introduction
Azure Data Catalog provides capabilities for data source discovery. You can quickly search and filter the catalog to
locate data sources and understand their intended purpose, making it easier to find the right data for the job at
hand.
But what if you need to regularly work with the same data? And what if you and other users regularly contribute
your knowledge to the same data sources in the catalog? In these situations, having to repeatedly issue the same
searches can be inefficient. This is where saved search and pinned data assets can help.
Saved searches
A saved search in Data Catalog is a reusable, per-user search definition. You can define a search, including search
terms, tags, and other filters, and then save it. You can re-run the saved search definition later to return any data
assets that match its search criteria.
Create a saved search
To create a saved search, do the following:
1. In the Azure Data Catalog portal, in the Current Search window, click Save.
2. Enter the search criteria that you want to reuse, and then click Save.
3. When you are prompted, enter a name for the saved search. Pick a name that is meaningful and that
describes the data assets that will be returned by the search.
Manage saved searches
After you have saved one or more searches, a Saved Searches option is displayed beneath the Current Search
box. When the list is expanded, all saved searches are displayed.
To enter a new name for the saved search, select Rename. The search definition is not changed.
To remove the saved search from your list, select Delete, and then confirm the deletion.
To mark the saved search as your default search, select Save As Default. If you perform an “empty” search
from the Azure Data Catalog home page, your default search is executed. In addition, the search that's
marked as the default search is displayed at the top of the Saved Searches list.
Organizational saved searches
All users in your organization can save searches for their own use. Data Catalog administrators can also save
searches for all users within the organization. When administrators save a search, they're presented with a Share
within the company option. Selecting this option shares the saved search with all users in the organization.
Unpinning a data asset is equally straightforward. Simply click the unpin icon to toggle the setting for the selected
asset.
The My Assets section
The Data Catalog portal home page includes a My Assets section that displays assets of interest to the current
user. This section includes both pinned assets and saved searches.
Summary
Azure Data Catalog provides capabilities that make it easier to discover the data sources you need, so you and
other organization members can spend less time looking for data and more time working with it. Saved searches
and pinned data assets build on these core capabilities so users can easily identify data sources that they work with
repeatedly.
Set up the business glossary for governed tagging
8/27/2018 • 4 minutes to read
Introduction
Azure Data Catalog enables data-source discovery, so you can easily discover and understand the data sources
that you need to perform analysis and make decisions. These capabilities make the biggest impact when you can
find and understand the broadest range of available data sources.
One Data Catalog feature that promotes greater understanding of asset data is tagging. By using tagging, you
can associate keywords with an asset or a column, which in turn makes it easier to discover the asset via searching
or browsing. Tagging also helps you more easily understand the context and intent of the asset.
However, tagging can sometimes cause problems of its own. Some examples of problems that tagging can
introduce are:
The use of abbreviations on some assets and expanded text on others. This inconsistency hinders the discovery
of assets, even though the intent was to tag the assets with the same tag.
Potential variations in meaning, depending on context. For example, a tag called Revenue on a customer data
set might mean revenue by customer, but the same tag on a quarterly sales dataset might mean quarterly
revenue for the company.
To help address these and other similar challenges, Data Catalog includes a business glossary.
By using the Data Catalog business glossary, an organization can document key business terms and their
definitions to create a common business vocabulary. This governance enables consistency in data usage across the
organization. After a term is defined in the business glossary, it can be assigned to a data asset in the catalog. This
approach, called governed tagging, works the same way as tagging.
Data Catalog administrators and members of the glossary administrators role can create, edit, and delete glossary
terms in the business glossary. All Data Catalog users can view the term definitions and tag assets with glossary
terms.
Creating glossary terms
Data Catalog administrators and glossary administrators can create glossary terms by clicking the New Term
button. Each glossary term contains the following fields:
A business definition for the term
A description that captures the intended use or business rules for the asset or column
A list of stakeholders who know the most about the term
The parent term, which defines the hierarchy in which the term is organized
NOTE
User tags are the only type of tag supported in the Free Edition of Data Catalog.
Summary
By using the business glossary in Azure Data Catalog, and the governed tagging it enables, you can identify,
manage, and discover data assets in a consistent manner. The business glossary can promote learning of the
business vocabulary by organization members. The glossary also supports capturing meaningful metadata, which
simplifies asset discovery and understanding.
Next steps
REST API documentation for business glossary operations
How to secure access to data catalog and data assets
8/27/2018 • 2 minutes to read
IMPORTANT
This feature is available only in the standard edition of Azure Data Catalog.
Azure Data Catalog allows you to specify who can access the data catalog and what operations (register, annotate,
take ownership) they can perform on metadata in the catalog.
3. Click Add.
4. Enter the fully qualified user name or name of the security group in the Azure Active Directory (AAD)
associated with the catalog. Use a comma (,) as a separator if you are adding more than one user or group.
IMPORTANT
We recommend that you add security groups to catalog users and assign permissions to those groups, rather than adding
users directly. Then, add users to the security groups that match their roles and their required access to the catalog.
Special considerations
The permissions assigned to security groups are additive. For example, suppose a user is in two groups: one group has
annotate permissions and the other does not. The user then has annotate permissions.
The permissions assigned explicitly to a user override the permissions assigned to the groups to which the user
belongs. In the previous example, suppose you explicitly add the user to catalog users and do not assign annotate
permissions. The user cannot annotate data assets, even though the user is a member of a group that does have
annotate permissions.
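These two rules can be expressed as a short Python sketch. It models only what this section states (group permissions are additive, an explicit user assignment overrides them); the function name, the data layout, and the user and group names are assumptions for illustration.

```python
def effective_permissions(user, explicit, group_permissions, memberships):
    """Resolve a user's catalog permissions.

    explicit:          user -> set of permissions assigned directly to that user
    group_permissions: group -> set of permissions assigned to that group
    memberships:       user -> list of groups the user belongs to
    """
    if user in explicit:
        # An explicit assignment overrides anything granted through groups.
        return set(explicit[user])
    combined = set()
    for group in memberships.get(user, []):
        combined |= group_permissions.get(group, set())  # group grants add up
    return combined


group_permissions = {"analysts": {"annotate"}, "viewers": set()}
memberships = {"dana": ["analysts", "viewers"]}

# Only group membership: the annotate permission from one group is enough.
via_groups = effective_permissions("dana", {}, group_permissions, memberships)

# Explicitly added without annotate: the explicit assignment wins.
via_explicit = effective_permissions(
    "dana", {"dana": {"register"}}, group_permissions, memberships
)
```

Here `via_groups` contains `annotate`, while `via_explicit` contains only `register`, matching the two cases described above.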
Next steps
Get started with Azure Data Catalog
How to view related data assets in Azure Data Catalog?
8/27/2018 • 2 minutes to read
Azure Data Catalog allows you to view data assets related to a selected data asset and view relationships between
them.
NOTE
For Data Catalog to import the relationship between two data assets, you must register both assets at the same time. If you
added one of them separately, register it again together with the other data asset to import the relationship between them.
In this example, there are two relationships for the selected ProductSubcategory data asset:
ProductSubcategoryID column of the Product table has a foreign key relationship with ProductSubcategoryID
column of the selected ProductSubcategory table.
ProductCategoryID column of the selected ProductSubcategory table has a foreign key relationship with
ProductCategoryID column of the ProductCategory table.
NOTE
Notice the direction of the arrow in the relationships tree view.
To see more details such as the fully qualified name of the column, move the mouse over and you see a popup
similar to the following image:
To include relationships between assets that have already been registered, re-register those assets.
Next steps
How to manage data assets
Azure Data Catalog developer concepts
8/27/2018 • 18 minutes to read
Microsoft Azure Data Catalog is a fully managed cloud service that provides capabilities for data source
discovery and for crowdsourcing data source metadata. Developers can use the service via its REST APIs.
Understanding the concepts implemented in the service is important for developers to successfully integrate with
Azure Data Catalog.
Key concepts
The Azure Data Catalog conceptual model is based on four key concepts: The Catalog, Users, Assets, and
Annotations.
Common properties
These properties apply to all root asset types and all annotation types.
assetCreatedDate String
assetCreatedBy String
assetModifiedDate String
assetModifiedBy String
Annotation types
Annotation types represent types of metadata that can be assigned to other types within the catalog.
expert SecurityPrincipal
TableDataProfile
("tableDataProfiles")
ColumnsDataProfile
("columnsDataProfiles")
ColumnDataClassification
("columnDataClassifications")
Common types
Common types can be used as the types for properties, but are not Items.
Common Type Properties Data Type Comments
DataSourceInfo
DataSourceLocation
Column
Asset identity
Azure Data Catalog uses "protocol" and identity properties from the "address" property bag of the
DataSourceLocation "dsl" property to generate identity of the asset, which is used to address the asset inside the
Catalog. For example, the "tds" protocol has identity properties "server", "database", "schema" and "object". The
combinations of the protocol and the identity properties are used to generate the identity of the SQL Server Table
Asset. Azure Data Catalog provides several built-in data source protocols, which are listed at Data source reference
specification - DSL Structure. The set of supported protocols can be extended programmatically (Refer to Data
Catalog REST API reference). Administrators of the Catalog can register custom data source protocols. The
following table describes the properties needed to register a custom protocol.
Custom data source protocol specification
DataSourceProtocol
DataSourceProtocolIdentityProperty
DataSourceProtocolIdentitySet
name string The name of the identity set.
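To make the asset identity rule described earlier in this section concrete, here is a Python sketch that derives an identity key from a "dsl" value. The exact way the service computes and encodes identities is not described in this article, so the tuple form, the function name, and the sample server and database values are assumptions; only the "tds" protocol and its four identity properties come from the text.

```python
# Identity properties per protocol. Only "tds" is taken from the text above.
IDENTITY_PROPERTIES = {"tds": ("server", "database", "schema", "object")}


def asset_identity(dsl: dict) -> tuple:
    """Combine the protocol with its identity properties from the address bag."""
    protocol = dsl["protocol"]
    address = dsl["address"]
    names = IDENTITY_PROPERTIES[protocol]
    missing = [n for n in names if n not in address]
    if missing:
        raise ValueError(f"address is missing identity properties: {missing}")
    # Properties outside the identity set do not influence the identity.
    return (protocol,) + tuple(address[n] for n in names)


# Illustrative values only.
orders = {
    "protocol": "tds",
    "address": {
        "server": "test.contoso.com",
        "database": "Northwind",
        "schema": "dbo",
        "object": "Orders",
    },
}
identity = asset_identity(orders)
print(identity)
```

Two registrations whose protocol and identity properties match resolve to the same key, which is why they address the same asset in the catalog.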
Key concepts
The Azure Data Catalog uses two authorization mechanisms:
Role-based authorization
Permission-based authorization
Roles
There are three roles: Administrator, Owner, and Contributor. Each role has its scope and rights, which are
summarized in the following table.
Contributor. Scope: each individual asset and annotation. Rights: Read, Update, Delete, ViewRoles. Note: all
the rights are revoked if the Read right on the item is revoked from the Contributor.
NOTE
Read, Update, Delete, ViewRoles rights are applicable to any item (asset or annotation) while TakeOwnership,
ChangeOwnership, ChangeVisibility, ViewPermissions are only applicable to the root asset.
Delete right applies to an item and any subitems or single item underneath it. For example, deleting an asset also deletes any
annotations for that asset.
Permissions
A permission is a list of access control entries. Each access control entry assigns a set of rights to a security principal.
Permissions can only be specified on an asset (that is, a root item) and apply to the asset and any subitems.
During the Azure Data Catalog preview, only the Read right is supported in the permissions list, to enable the scenario of
restricting the visibility of an asset.
By default, any authenticated user has the Read right for any item in the catalog unless visibility is restricted to the set
of principals in the permissions.
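The default-visibility rule can be sketched in Python as follows. This is a simplified reading of the paragraph above, not the service's logic: it ignores roles such as Owner and Administrator, and the function name is hypothetical. The shape of each access control entry follows the permissions JSON used in the REST examples in this article.

```python
def can_read(principal_id, permissions=None):
    """Return True if the principal may read (see) the asset."""
    if not permissions:
        # No permissions list: any authenticated user has the Read right.
        return True
    for entry in permissions:
        if entry["principal"]["objectId"] != principal_id:
            continue
        if any(r["right"] == "Read" for r in entry["rights"]):
            return True
    # Visibility is restricted and this principal is not listed.
    return False


restricted = [
    {
        "principal": {
            "objectId": "27b9a0eb-bb71-4297-9f1f-c462dab7192a",
            "upn": "user3@contoso.com",
        },
        "rights": [{"right": "Read"}],
    }
]
```

With this list, `can_read("27b9a0eb-bb71-4297-9f1f-c462dab7192a", restricted)` is true, any other object ID is refused, and passing no list at all falls back to the open default.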
REST API
PUT and POST view item requests can be used to control roles and permissions: in addition to the item payload, two
system properties can be specified: roles and permissions.
NOTE
permissions is only applicable to a root item.
The Owner role is only applicable to a root item.
By default, when an item is created in the catalog, its Contributor is set to the currently authenticated user. If the item should be
updatable by everyone, Contributor should be set to the <Everyone> special security principal in the roles property when the item
is first published (refer to the following example). Contributor cannot be changed and stays the same during the lifetime of an
item (even an Administrator or Owner doesn’t have the right to change the Contributor). The only value supported for
explicitly setting the Contributor is <Everyone>: Contributor can only be the user who created the item or <Everyone>.
Examples
Set Contributor to <Everyone> when publishing an item. The special security principal <Everyone> has objectId
"00000000-0000-0000-0000-000000000201".
POST https://api.azuredatacatalog.com/catalogs/default/views/tables/?api-version=2016-03-30
NOTE
Some HTTP client implementations may automatically reissue requests in response to a 302 from the server, but typically
strip Authorization headers from the request. Since the Authorization header is required to make requests to Azure Data
Catalog, you must ensure the Authorization header is still provided when reissuing a request to a redirect location specified
by Azure Data Catalog. The following sample code demonstrates it using the .NET HttpWebRequest object.
Body
{
    "roles": [
        {
            "role": "Contributor",
            "members": [
                {
                    "objectId": "00000000-0000-0000-0000-000000000201"
                }
            ]
        }
    ]
}
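The request above, together with the note about redirects, can be sketched in Python as follows. This is a minimal sketch, not the sample referred to in the note: the function and class names are hypothetical, the "Bearer" authorization scheme and the empty item payload are assumptions, and only the URL, the roles body, and the <Everyone> objectId come from the example. Automatic redirects are turned off so that a 302 can be reissued by hand with the Authorization header still attached.

```python
import json
import urllib.error
import urllib.request

# Values taken from the example above.
PUBLISH_URL = ("https://api.azuredatacatalog.com/catalogs/default/views/tables/"
               "?api-version=2016-03-30")
EVERYONE_OBJECT_ID = "00000000-0000-0000-0000-000000000201"


def build_publish_request(access_token, item_payload=None):
    """Build a POST request whose body sets Contributor to <Everyone>."""
    body = dict(item_payload or {})  # hypothetical item payload, empty by default
    body["roles"] = [
        {"role": "Contributor", "members": [{"objectId": EVERYONE_OBJECT_ID}]}
    ]
    return urllib.request.Request(
        PUBLISH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the token is sent with the Bearer scheme.
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


class NoAutoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so they can be reissued by hand."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def send(request, max_redirects=3):
    """Send the request; on a 302, reissue it with the caller's headers kept."""
    opener = urllib.request.build_opener(NoAutoRedirect)
    for _ in range(max_redirects + 1):
        try:
            return opener.open(request)
        except urllib.error.HTTPError as err:
            location = err.headers.get("Location")
            if err.code != 302 or not location:
                raise
            # request.headers holds only the headers set by the caller,
            # so Authorization is carried over but the old Host is not.
            request = urllib.request.Request(
                location,
                data=request.data,
                headers=dict(request.headers),
                method=request.get_method(),
            )
    raise RuntimeError("too many redirects")
```

Calling send(build_publish_request(token)) would issue the request; the token itself has to be obtained separately.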
Assign owners and restrict visibility for an existing root item:
PUT https://api.azuredatacatalog.com/catalogs/default/views/tables/042297b0...1be45ecd462a?api-version=2016-03-30
{
    "roles": [
        {
            "role": "Owner",
            "members": [
                {
                    "objectId": "c4159539-846a-45af-bdfb-58efd3772b43",
                    "upn": "user1@contoso.com"
                },
                {
                    "objectId": "fdabd95b-7c56-47d6-a6ba-a7c5f264533f",
                    "upn": "user2@contoso.com"
                }
            ]
        }
    ],
    "permissions": [
        {
            "principal": {
                "objectId": "27b9a0eb-bb71-4297-9f1f-c462dab7192a",
                "upn": "user3@contoso.com"
            },
            "rights": [
                {
                    "right": "Read"
                }
            ]
        },
        {
            "principal": {
                "objectId": "4c8bc8ce-225c-4fcf-b09a-047030baab31",
                "upn": "user4@contoso.com"
            },
            "rights": [
                {
                    "right": "Read"
                }
            ]
        }
    ]
}
NOTE
In a PUT request, it’s not required to specify an item payload in the body: PUT can be used to update just roles and/or permissions.
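A body with the same shape as the PUT example can be assembled programmatically. The following is a minimal sketch in Python; the helper names principal and build_access_body are hypothetical, and the object IDs and UPNs are the sample values from the example above.

```python
import json


def principal(object_id, upn):
    """Return a security principal in the shape used by the example body."""
    return {"objectId": object_id, "upn": upn}


def build_access_body(owners, readers):
    """Build the roles and permissions properties.

    owners and readers are lists of (objectId, upn) pairs. Only the Read
    right is emitted, matching the permissions list in the example.
    """
    return {
        "roles": [
            {"role": "Owner", "members": [principal(*o) for o in owners]}
        ],
        "permissions": [
            {"principal": principal(*r), "rights": [{"right": "Read"}]}
            for r in readers
        ],
    }


body = build_access_body(
    owners=[("c4159539-846a-45af-bdfb-58efd3772b43", "user1@contoso.com")],
    readers=[("27b9a0eb-bb71-4297-9f1f-c462dab7192a", "user3@contoso.com")],
)
print(json.dumps(body, indent=4))
```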
Keyboard shortcuts for Azure Data Catalog
8/27/2018 • 2 minutes to read
Keyboard shortcuts for the Data Catalog data source registration tool
General keyboard shortcuts
OPERATION PRESS
Authentication page
OPERATION                                                      PRESS
Change selected type when the focus is on a tile               LEFT, UP, RIGHT, or DOWN ARROW
On the discover page, when an asset has focus, select asset    SPACE or ENTER
Azure AD receives improvements on an ongoing basis. To stay up-to-date with the most recent developments, this
article provides you with information about:
The latest releases
Known issues
Bug fixes
Deprecated functionality
Plans for changes
This page is updated monthly, so revisit it regularly.
August 2018
Changes to Azure Active Directory IP address ranges
Type: Plan for change
Service category: Other
Product capability: Platform
We're introducing larger IP ranges to Azure AD, which means if you've configured Azure AD IP address ranges for
your firewalls, routers, or Network Security Groups, you'll need to update them. We're making this update so you
won't have to change your firewall, router, or Network Security Groups IP range configurations again when Azure
AD adds new endpoints.
Network traffic is moving to these new ranges over the next two months. To continue with uninterrupted service,
you must add these updated values to your IP Addresses before September 10, 2018:
20.190.128.0/18
40.126.0.0/18
We strongly recommend not removing the old IP Address ranges until all of your network traffic has moved to the
new ranges. For updates about the move and to learn when you can remove the old ranges, see Office 365 URLs
and IP address ranges.
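When reviewing firewall rules or logs, it can help to check whether a given address falls inside the two new ranges. The following is a minimal sketch using Python's standard ipaddress module; the function name in_new_ranges is hypothetical, and only the two CIDR ranges come from this article.

```python
import ipaddress

# The two new Azure AD ranges listed above.
NEW_AZURE_AD_RANGES = [
    ipaddress.ip_network("20.190.128.0/18"),
    ipaddress.ip_network("40.126.0.0/18"),
]


def in_new_ranges(address):
    """Return True if the IPv4 address belongs to one of the new ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in NEW_AZURE_AD_RANGES)


print(in_new_ranges("20.190.130.1"))
print(in_new_ranges("40.126.64.0"))
```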
Converged security info management for self-service password reset (SSPR) and Multi-Factor Authentication (MFA)
Type: New feature
Service category: SSPR
Product capability: User Authentication
This new feature helps people manage their security info (such as phone number, mobile app, and so on) for SSPR
and MFA in a single location and experience. Previously, this was done in two different
locations.
This converged experience also works for people using either SSPR or MFA. Additionally, if your organization
doesn't enforce MFA or SSPR registration, people can still register any MFA or SSPR security info methods
allowed by your organization from the My Apps portal.
This is an opt-in public preview. Administrators can turn on the new experience (if desired) for a selected group or
for all users in a tenant. For more information about the converged experience, see the Converged experience blog
Privileged Identity Management (PIM) for Azure resources supports Management Group resource types
Type: New feature
Service category: Privileged Identity Management
Product capability: Privileged Identity Management
Just-In-Time activation and assignment settings can now be applied to Management Group resource types, just
like you already do for Subscriptions, Resource Groups, and Resources (such as VMs, App Services, and more). In
addition, anyone with a role that provides administrator access for a Management Group can discover and manage
that resource in PIM.
For more information about PIM and Azure resources, see Discover and manage Azure resources by using
Privileged Identity Management
New support to add Google as an identity provider for B2B guest users in Azure Active Directory (preview)
Type: New feature
Service category: B2B
Product capability: B2B/B2C
By setting up federation with Google in your organization, you can let invited Gmail users sign in to your shared
apps and resources using their existing Google account, without having to create a personal Microsoft account
(MSA) or an Azure AD account.
This is an opt-in public preview. For more information about Google federation, see Add Google as an identity
provider for B2B guest users.
July 2018
Improvements to Azure Active Directory email notifications
Type: Changed feature
Service category: Other
Product capability: Identity lifecycle management
Azure Active Directory (Azure AD) emails now feature an updated design, as well as changes to the sender email
address and sender display name, when sent from the following services:
Azure AD Access Reviews
Azure AD Connect Health
Azure AD Identity Protection
Azure AD Privileged Identity Management
Enterprise App Expiring Certificate Notifications
Enterprise App Provisioning Service Notifications
The email notifications will be sent from the following email address and display name:
Email address: azure-noreply@microsoft.com
Display name: Microsoft Azure
For an example of some of the new e-mail designs and more information, see Email notifications in Azure AD PIM.
Connect Health for Sync - An easier way to fix orphaned and duplicate attribute sync errors
Type: New feature
Service category: AD Connect
Product capability: Monitoring & Reporting
Azure AD Connect Health introduces self-service remediation to help you highlight and fix sync errors. This feature
troubleshoots duplicated attribute sync errors and fixes objects that are orphaned from Azure AD. This diagnosis
has the following benefits:
Narrows down duplicated attribute sync errors, providing specific fixes
Applies a fix for dedicated Azure AD scenarios, resolving errors in a single step
No upgrade or configuration is required to turn on and use this feature
For more information, see Diagnose and remediate duplicated attribute sync errors
Converged security info management for self-service password reset and Multi-Factor Authentication
Type: New feature
Service category: SSPR
Product capability: User Authentication
This new feature lets users manage their security info (for example, phone number, email address, mobile app, and
so on) for self-service password reset (SSPR) and Multi-Factor Authentication (MFA) in a single experience. Users
will no longer have to register the same security info for SSPR and MFA in two different experiences. This new
experience also applies to users who have either SSPR or MFA.
If an organization isn't enforcing MFA or SSPR registration, users can register their security info through the My
Apps portal. From there, users can register any methods enabled for MFA or SSPR.
This is an opt-in public preview. Admins can turn on the new experience (if desired) for a selected group of users or
all users in a tenant.
Use the Microsoft Authenticator app to verify your identity when you reset your password
Type: Changed feature
Service category: SSPR
Product capability: User Authentication
This feature lets non-admins verify their identity while resetting a password using a notification or code from
Microsoft Authenticator (or any other authenticator app). After admins turn this self-service password reset
method on, users who have registered a mobile app through aka.ms/mfasetup or aka.ms/setupsecurityinfo can use
their mobile app as a verification method while resetting their password.
Mobile app notification can only be turned on as part of a policy that requires two methods to reset your password.
June 2018
Change notice: Security fix to the delegated authorization flow for apps using Azure AD Activity Logs API
Type: Plan for change
Service category: Reporting
Product capability: Monitoring & Reporting
Due to our stronger security enforcement, we’ve had to make a change to the permissions for apps that use a
delegated authorization flow to access Azure AD Activity Logs APIs. This change will occur by June 26, 2018.
If any of your apps use Azure AD Activity Log APIs, follow these steps to ensure the app doesn’t break after the
change happens.
To update your app permissions
1. Sign in to the Azure portal, select Azure Active Directory, and then select App Registrations.
2. Select your app that uses the Azure AD Activity Logs API, select Settings, select Required permissions, and
then select the Windows Azure Active Directory API.
3. In the Delegated permissions area of the Enable access blade, select the box next to Read directory data,
and then select Save.
4. Select Grant permissions, and then select Yes.
NOTE
You must be a Global administrator to grant permissions to the app.
For more information, see the Grant permissions area of the Prerequisites to access the Azure AD reporting API
article.
Configure TLS settings to connect to Azure AD services for PCI DSS compliance
Type: New feature
Service category: N/A
Product capability: Platform
Transport Layer Security (TLS) is a protocol that provides privacy and data integrity between two communicating
applications and is the most widely deployed security protocol used today.
The PCI Security Standards Council has determined that early versions of TLS and Secure Sockets Layer (SSL)
must be disabled in favor of enabling new and more secure app protocols, with compliance starting on June 30,
2018. This change means that if you connect to Azure AD services and require PCI DSS compliance, you must
disable TLS 1.0. Multiple versions of TLS are available, but TLS 1.2 is the latest version available for Azure Active
Directory Services. We highly recommend moving directly to TLS 1.2 for both client/server and browser/server
combinations.
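For client applications, the same recommendation can be applied in code by refusing anything older than TLS 1.2. The following is a minimal sketch using Python's standard ssl module (Python 3.7 or later); the function name make_tls12_context is hypothetical, and how the context is passed to an HTTP client depends on the client library in use.

```python
import ssl


def make_tls12_context():
    """Create a client-side TLS context that rejects TLS 1.0 and TLS 1.1."""
    context = ssl.create_default_context()
    # Connections that cannot negotiate at least TLS 1.2 will fail.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls12_context()
print(context.minimum_version)
```

With the standard library, the context can be supplied through the context parameter of urllib.request.urlopen.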
Out-of-date browsers might not support newer TLS versions, such as TLS 1.2. To see which versions of TLS are
supported by your browser, go to the Qualys SSL Labs site and click Test your browser. We recommend you
upgrade to the latest version of your web browser and preferably enable only TLS 1.2.
To enable TLS 1.2, by browser
Microsoft Edge and Internet Explorer (both are set using Internet Explorer)
1. Open Internet Explorer, select Tools > Internet Options > Advanced.
2. In the Security area, select use TLS 1.2, and then select OK.
3. Close all browser windows and restart Internet Explorer.
Google Chrome
1. Open Google Chrome, type chrome://settings/ into the address bar, and press Enter.
2. Expand the Advanced options, go to the System area, and select Open proxy settings.
3. In the Internet Properties box, select the Advanced tab, go to the Security area, select use TLS 1.2,
and then select OK.
4. Close all browser windows and restart Google Chrome.
Mozilla Firefox
1. Open Firefox, type about:config into the address bar, and then press Enter.
2. Search for the term, TLS, and then select the security.tls.version.max entry.
3. Set the value to 3 to force the browser to use up to version TLS 1.2, and then select OK.
NOTE
Firefox version 60.0 supports TLS 1.3, so you can also set the security.tls.version.max value to 4.
New "all guests" conditional access policy template created during Terms of Use (ToU) creation
Type: New feature
Service category: Terms of Use
Product capability: Governance
During the creation of your Terms of Use (ToU), a new conditional access policy template is also created for "all
guests" and "all apps". This new policy template applies the newly created ToU, streamlining the creation and
enforcement process for guests.
For more information, see Azure Active Directory Terms of use feature.
New "custom" conditional access policy template created during Terms of Use (ToU) creation
Type: New feature
Service category: Terms of Use
Product capability: Governance
During the creation of your Terms of Use (ToU), a new “custom” conditional access policy template is also created.
This new policy template lets you create the ToU and then immediately go to the conditional access policy creation
blade, without needing to manually navigate through the portal.
For more information, see Azure Active Directory Terms of use feature.
May 2018
ExpressRoute support changes
Type: Plan for change
Service category: Authentications (Logins)
Product capability: Platform
Software as a Service offerings, like Azure Active Directory (Azure AD), are designed to work best by going directly
through the Internet, without requiring ExpressRoute or any other private VPN tunnels. Because of this, on August
1, 2018, we will stop supporting ExpressRoute for Azure AD services using Azure public peering and Azure
communities in Microsoft peering. Any services impacted by this change might notice Azure AD traffic gradually
shifting from ExpressRoute to the Internet.
While we're changing our support, we also know there are still situations where you might need to use a dedicated
set of circuits for your authentication traffic. Because of this, Azure AD will continue to support per-tenant IP range
restrictions using ExpressRoute and services already on Microsoft peering with the "Other Office 365 Online
services" community. If your services are impacted, but you require ExpressRoute, you must do the following:
If you're on Azure public peering. Move to Microsoft peering and sign up for the Other Office 365
Online services (12076:5100) community. For more info about how to move from Azure public peering to
Microsoft peering, see the Move a public peering to Microsoft peering article.
If you're on Microsoft peering. Sign up for the Other Office 365 Online service (12076:5100)
community. For more info about routing requirements, see the Support for BGP communities section of the
ExpressRoute routing requirements article.
If you must continue to use dedicated circuits, you'll need to talk to your Microsoft Account team about how to get
authorization to use the Other Office 365 Online service (12076:5100) community. The MS Office-managed
review board will verify whether you need those circuits and make sure you understand the technical implications
of keeping them. Unauthorized subscriptions trying to create route filters for Office 365 will receive an error
message.
Use Internal URLs to access apps from anywhere with our My Apps Sign-in Extension and the Azure AD
Application Proxy
Type: New feature
Service category: My Apps
Product capability: SSO
Users can now access applications through internal URLs even when outside your corporate network by using the
My Apps Secure Sign-in Extension for Azure AD. This will work with any application that you have published using
Azure AD Application Proxy, on any browser that also has the Access Panel browser extension installed. The URL
redirection functionality is automatically enabled once a user logs into the extension. The extension is available for
download on Edge, Chrome, and Firefox.
Azure AD access reviews of groups and app access now provides recurring reviews
Type: New feature
Service category: Access Reviews
Product capability: Governance
Access review of groups and apps is now generally available as part of Azure AD Premium P2. Administrators will
be able to configure access reviews of group memberships and application assignments to automatically recur at
regular intervals, such as monthly or quarterly.
Azure AD Activity logs (sign-ins and audit) are now available through MS Graph
Type: New feature
Service category: Reporting
Product capability: Monitoring & Reporting
Azure AD Activity logs, which include Sign-ins and Audit logs, are now available through MS Graph. We have
exposed two endpoints through MS Graph to access these logs. Check out our documents for programmatic
access to Azure AD Reporting APIs to get started.
The May release of AADConnect contains a public preview of the integration with PingFederate, important
security updates, many bug fixes, and great new troubleshooting tools.
Type: Changed feature
Service category: AD Connect
Product capability: Identity Lifecycle Management
The May release of AADConnect contains a public preview of the integration with PingFederate, important security
updates, many bug fixes, and great new troubleshooting tools. You can find the release notes here.
ID tokens can no longer be returned using the query response_mode for new apps.
Type: Changed feature
Service category: Authentications (Logins)
Product capability: User Authentication
Apps created on or after April 25, 2018 will no longer be able to request an id_token using the query
response_mode. This brings Azure AD in line with the OIDC specifications and helps reduce your app’s attack
surface. Apps created before April 25, 2018 are not blocked from using the query response_mode with a
response_type of id_token. The error returned, when requesting an id_token from AAD, is AADSTS70007:
‘query’ is not a supported value of ‘response_mode’ when requesting a token.
The fragment and form_post response_modes continue to work. When creating new application objects (for
example, for App Proxy usage), ensure that one of these response_modes is used.
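A sign-in request that asks for an id_token with one of the response_modes that continue to work can be sketched as follows. This is a minimal illustration in Python, not an authoritative client: the function name is hypothetical, the authorize endpoint path and the parameter set (client_id, redirect_uri, scope, nonce) are assumptions based on a typical OpenID Connect request, and the check simply rejects query the way a new app would be rejected.

```python
from urllib.parse import urlencode

# response_modes that continue to work for id_token requests, per the text above.
ALLOWED_RESPONSE_MODES = {"fragment", "form_post"}


def build_id_token_request_url(tenant, client_id, redirect_uri, nonce,
                               response_mode="form_post"):
    """Build a sign-in URL that requests an id_token (illustrative only)."""
    if response_mode not in ALLOWED_RESPONSE_MODES:
        raise ValueError(
            "response_mode %r is not supported when requesting an id_token"
            % response_mode
        )
    params = {
        "client_id": client_id,
        "response_type": "id_token",
        "redirect_uri": redirect_uri,
        "response_mode": response_mode,
        "scope": "openid",
        "nonce": nonce,
    }
    # Assumed endpoint shape: tenant-specific authorize endpoint.
    return ("https://login.microsoftonline.com/%s/oauth2/authorize?" % tenant
            + urlencode(params))


url = build_id_token_request_url(
    "contoso.com", "my-client-id", "https://app.contoso.com/signin", "abc123"
)
print(url)
```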
April 2018
Azure AD B2C access tokens are GA
Type: New feature
Service category: B2C - Consumer Identity Management
Product capability: B2B/B2C
You can now access Web APIs secured by Azure AD B2C using access tokens. The feature is moving from public
preview to GA. The UI experience to configure Azure AD B2C applications and web APIs has been improved, and
other minor improvements were made.
For more information, see Azure AD B2C: Requesting access tokens.
Grant B2B users in Azure AD access to your on-premises applications (public preview)
Type: New feature
Service category: B2B
Product capability: B2B/B2C
As an organization that uses Azure Active Directory (Azure AD) B2B collaboration capabilities to invite guest users
from partner organizations to your Azure AD, you can now provide these B2B users access to on-premises apps.
These on-premises apps can use SAML-based authentication or Integrated Windows Authentication (IWA) with
Kerberos constrained delegation (KCD).
For more information, see Grant B2B users in Azure AD access to your on-premises applications.
Self-service password reset from Windows 10 lock screen for hybrid Azure AD joined machines
Type: Changed feature
Service category: Self Service Password Reset
Product capability: User Authentication
We have updated the Windows 10 SSPR feature to include support for machines that are hybrid Azure AD joined.
This feature, available in Windows 10 RS4, allows users to reset their password from the lock screen of a
Windows 10 machine. Users who are enabled and registered for self-service password reset can utilize this feature.
For more information, see Azure AD password reset from the login screen.
March 2018
Certificate expire notification
Type: Fixed
Service category: Enterprise Apps
Product capability: SSO
Azure AD sends a notification when a certificate for a gallery or non-gallery application is about to expire.
Some users did not receive notifications for enterprise applications configured for SAML-based single sign-on.
This issue was resolved. Azure AD sends notifications for certificates expiring in 7, 30, and 60 days. You are able to
see this event in the audit logs.
For more information, see:
Manage Certificates for federated single sign-on in Azure Active Directory
Audit activity reports in the Azure Active Directory portal
Restrict browser access using Intune Managed Browser with Azure AD application-based conditional access for
iOS and Android
Type: New feature
Service category: Conditional Access
Product capability: Identity Security & Protection
Now in public preview!
Intune Managed Browser SSO: Your employees can use single sign-on across native clients (like Microsoft
Outlook) and the Intune Managed Browser for all Azure AD-connected apps.
Intune Managed Browser Conditional Access Support: You can now require employees to use the Intune
Managed browser using application-based conditional access policies.
Read more about this in our blog post.
For more information, see:
Setup application-based conditional access
Configure managed browser policies
Office 365 native clients are supported by Seamless SSO using a non-interactive protocol
Type: New feature
Service category: Authentications (Logins)
Product capability: User Authentication
Users of Office 365 native clients (version 16.0.8730.xxxx and above) get a silent sign-on experience using
Seamless SSO. This support is provided by the addition of a non-interactive protocol (WS-Trust) to Azure AD.
For more information, see How does sign-in on a native client with Seamless SSO work?
Users get a silent sign-on experience, with Seamless SSO, if an application sends sign-in requests to Azure AD's
tenant endpoints
Type: New feature
Service category: Authentications (Logins)
Product capability: User Authentication
Users get a silent sign-on experience, with Seamless SSO, if an application (for example,
https://contoso.sharepoint.com) sends sign-in requests to Azure AD's tenant endpoints - that is,
https://login.microsoftonline.com/contoso.com/<..> or https://login.microsoftonline.com/<tenant_ID>/<..> -
instead of Azure AD's common endpoint (https://login.microsoftonline.com/common/<...>).
For more information, see Azure Active Directory Seamless Single Sign-On.
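To check which kind of endpoint an application's sign-in URL points at, the first path segment can be inspected. The following is a minimal sketch in Python; the function name uses_tenant_endpoint is hypothetical, and it only distinguishes the common endpoint from a tenant-specific one as described above, without validating the tenant name or ID.

```python
from urllib.parse import urlparse


def uses_tenant_endpoint(sign_in_url):
    """Return True if the URL targets a tenant endpoint rather than /common."""
    parts = urlparse(sign_in_url)
    if parts.hostname != "login.microsoftonline.com":
        return False
    segments = [segment for segment in parts.path.split("/") if segment]
    # The first path segment is the tenant name, the tenant ID, or "common".
    return bool(segments) and segments[0].lower() != "common"


print(uses_tenant_endpoint("https://login.microsoftonline.com/contoso.com/oauth2/authorize"))
print(uses_tenant_endpoint("https://login.microsoftonline.com/common/oauth2/authorize"))
```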
Need to add only one Azure AD URL, instead of two URLs previously, to users' Intranet zone settings to roll out
Seamless SSO
Type: New feature
Service category: Authentications (Logins)
Product capability: User Authentication
To roll out Seamless SSO to your users, you need to add only one Azure AD URL to the users' Intranet zone
settings by using group policy in Active Directory: https://autologon.microsoftazuread-sso.com. Previously,
customers were required to add two URLs.
For more information, see Azure Active Directory Seamless Single Sign-On.
Support for provisioning all user attribute values available in the Workday Get_Workers API
Type: New feature
Service category: App Provisioning
Product capability: 3rd Party Integration
The public preview of inbound provisioning from Workday to Active Directory and Azure AD now supports the
ability to extract and provision all attribute values available in the Workday Get_Workers API. This adds
support for hundreds of additional standard and custom attributes beyond the ones shipped with the initial
version of the Workday inbound provisioning connector.
For more information, see: Customizing the list of Workday user attributes
February 2018
Improved navigation for managing users and groups
Type: Plan for change
Service category: Directory Management
Product capability: Directory
The navigation experience for managing users and groups has been streamlined. You can now navigate from the
directory overview directly to the list of all users, with easier access to the list of deleted users. You can also
navigate from the directory overview directly to the list of all groups, with easier access to group management
settings. And also from the directory overview page, you can search for a user, group, enterprise application, or app
registration.
Availability of sign-ins and audit reports in Microsoft Azure operated by 21Vianet (Azure China 21Vianet)
Type: New feature
Service category: Azure Stack
Product capability: Monitoring & Reporting
Azure AD Activity log reports are now available in Microsoft Azure operated by 21Vianet (Azure China 21Vianet)
instances. The following logs are included:
Sign-ins activity logs - Includes all the sign-ins logs associated with your tenant.
Self service Password Audit Logs - Includes all the SSPR audit logs.
Directory Management Audit logs - Includes all the directory management-related audit logs like User
management, App Management, and others.
With these logs, you can gain insights into how your environment is doing. The provided data enables you to:
Determine how your apps and services are utilized by your users.
Troubleshoot issues preventing your users from getting their work done.
For more information about how to use these reports, see Azure Active Directory reporting.
Use the "Report Reader" role (non-admin role) to view Azure AD Activity Reports
Type: New feature
Service category: Reporting
Product capability: Monitoring & Reporting
In response to customer feedback asking for non-admin roles to have access to Azure AD activity logs, we have
enabled the ability for users who are in the "Report Reader" role to access Sign-ins and Audit activity within the
Azure portal as well as using our Graph APIs.
For more information about how to use these reports, see Azure Active Directory reporting.
IMPORTANT
This build introduces schema and sync rule changes. The Azure AD Connect Synchronization Service triggers Full Import
and Full Synchronization steps after an upgrade. For information on how to change this behavior, see How to defer full
synchronization after upgrade.
January 2018
New Federated Apps available in Azure AD app gallery
Type: New feature
Service category: Enterprise Apps
Product capability: 3rd Party Integration
In January 2018, the following new apps with federation support were added in the app gallery:
IBM OpenPages, OneTrust Privacy Management Software, Dealpath, IriusRisk Federated Directory, and Fidelity
NetBenefits.
For more information about the apps, see SaaS application integration with Azure Active Directory.
For more information about listing your application in the Azure AD app gallery, see List your application in the
Azure Active Directory application gallery.
Seamless sign into apps enabled for Password SSO directly from app's URL
Type: New feature
Service category: My Apps
Product capability: SSO
The My Apps browser extension is now available via a convenient tool that gives you the My Apps single-sign on
capability as a shortcut in your browser. After installing, users will see a waffle icon in their browser that provides
them quick access to apps. Users can now take advantage of:
The ability to directly sign in to password-SSO based apps from the app’s sign-in page
Launch any app using the quick search feature
Shortcuts to recently used apps from the extension
The extension is available for Edge, Chrome, and Firefox.
For more information, see My Apps Secure Sign-in Extension.
December 2017
Terms of use in the Access Panel
Type: New feature
Service category: Terms of use
Product capability: Compliance
You now can go to the Access Panel and view the terms of use that you previously accepted.
Follow these steps:
1. Go to the MyApps portal, and sign in.
2. In the upper-right corner, select your name, and then select Profile from the list.
3. On your Profile, select Review terms of use.
4. Now you can review the terms of use you accepted.
For more information, see the Azure AD terms of use feature (preview).
Fewer sign-in prompts: A new "keep me signed in" experience for Azure AD sign-in
Type: New feature
Service category: Azure AD
Product capability: User authentication
The Keep me signed in check box on the Azure AD sign-in page was replaced with a new prompt that shows up
after you successfully authenticate.
If you respond Yes to this prompt, the service gives you a persistent refresh token. This behavior is the same as
when you selected the Keep me signed in check box in the old experience. For federated tenants, this prompt
shows after you successfully authenticate with the federated service.
For more information, see Fewer sign-in prompts: The new "keep me signed in" experience for Azure AD is in
preview.
November 2017
Access Control service retirement
Type: Plan for change
Service category: Access Control service
Product capability: Access Control service
Azure Active Directory Access Control (also known as the Access Control service) will be retired in late 2018. More
information that includes a detailed schedule and high-level migration guidance will be provided in the next few
weeks. You can leave comments on this page with any questions about the Access Control service, and a team
member will answer them.
October 2017
Deprecate Azure AD reports
Type: Plan for change
Service category: Reporting
Product capability: Identity Lifecycle Management
The Azure portal provides you with:
A new Azure AD administration console.
New APIs for activity and security reports.
Due to these new capabilities, the report APIs under the /reports endpoint were retired on December 10, 2017.
Terms of use
Type: New feature
Service category: Terms of use
Product capability: Compliance
You can use Azure AD terms of use to present information such as relevant disclaimers for legal or compliance
requirements to users.
You can use Azure AD terms of use in the following scenarios:
General terms of use for all users in your organization
Specific terms of use based on a user's attributes (for example, doctors vs. nurses or domestic vs. international
employees, done by dynamic groups)
Specific terms of use for accessing high-impact business apps, like Salesforce
For more information, see Azure AD terms of use.
Access reviews
Type: New feature
Service category: Access reviews
Product capability: Compliance
Organizations can use access reviews (preview) to efficiently manage group memberships and access to enterprise
applications:
You can recertify guest user access by using access reviews of their access to applications and memberships of
groups. Reviewers can efficiently decide whether to allow guests continued access based on the insights
provided by the access reviews.
You can recertify employee access to applications and group memberships with access reviews.
You can collect the access review controls into programs relevant for your organization to track reviews for
compliance or risk-sensitive applications.
For more information, see Azure AD access reviews.
Hide third-party applications from My Apps and the Office 365 app launcher
Type: New feature
Service category: My Apps
Product capability: Single sign-on
You now can better manage apps that show up on your users' portals through a new hide app property. You can
hide apps to help in cases where app tiles show up for back-end services or duplicate tiles and clutter users' app
launchers. The toggle is in the Properties section of the third-party app and is labeled Visible to user? You also
can hide an app programmatically through PowerShell.
For more information, see Hide a third-party application from a user's experience in Azure AD.
What's available?
As part of the transition to the new admin console, two new APIs for retrieving Azure AD activity logs are available.
The new set of APIs provides richer filtering and sorting functionality in addition to providing richer audit and sign-
in activities. The data previously available through the security reports now can be accessed through the Identity
Protection Risk Events API in Microsoft Graph.
September 2017
Hotfix for Identity Manager
Type: Changed feature
Service category: Identity Manager
Product capability: Identity lifecycle management
A hotfix roll-up package (build 4.4.1642.0) is available as of September 25, 2017, for Identity Manager 2016
Service Pack 1. This roll-up package:
Resolves issues and adds improvements.
Is a cumulative update that replaces all Identity Manager 2016 Service Pack 1 updates up to build 4.4.1459.0 for
Identity Manager 2016.
Requires you to have Identity Manager 2016 build 4.4.1302.0.
For more information, see Hotfix rollup package (build 4.4.1642.0) is available for Identity Manager 2016 Service
Pack 1.
Azure Data Catalog terminology
8/27/2018 • 4 minutes to read
Catalog
The Azure Data Catalog is a cloud-based metadata repository in which data sources and data assets can be
registered. The catalog serves as a central storage location for structural metadata extracted from data sources and
for descriptive metadata added by users.
Data source
A data source is a system or container that manages data assets. Examples include SQL Server databases, Oracle
databases, SQL Server Analysis Services databases (tabular or multidimensional) and SQL Server Reporting
Services servers.
Data asset
Data assets are objects contained within data sources that can be registered with the catalog. Examples include
SQL Server tables and views, Oracle tables and views, SQL Server Analysis Services measures, dimensions and
KPIs, and SQL Server Reporting Services reports.
Structural metadata
Structural metadata is the metadata extracted from a data source that describes the structure of a data asset. This
includes the asset's location, its object name and type, and additional type-specific characteristics. For example, the
structural metadata for tables and views includes the names and data types for the object’s columns.
Descriptive metadata
Descriptive metadata is metadata that describes the purpose or intent of a data asset. Typically, descriptive
metadata is added by catalog users using the Azure Data Catalog portal, but it can also be extracted from the data
source during registration. For example, the Azure Data Catalog registration tool will extract descriptions from the
Description property in SQL Server Analysis Services and SQL Server Reporting Services, and from the
ms_description extended property in SQL Server databases, if these properties have been populated with values.
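The split between structural and descriptive metadata can be pictured as a small data model. The following is a minimal sketch only, not the catalog's actual schema: the class names, field names, and sample values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Column:
    name: str
    data_type: str


@dataclass
class StructuralMetadata:
    """Extracted from the data source during registration (hypothetical shape)."""
    location: str       # where the asset lives, e.g. server and database
    object_name: str
    object_type: str    # e.g. "Table" or "View"
    columns: List[Column] = field(default_factory=list)


@dataclass
class DescriptiveMetadata:
    """Added by a catalog user, or extracted when the source provides it."""
    description: str
    contributed_by: str = ""


@dataclass
class DataAsset:
    structural: StructuralMetadata
    descriptive: List[DescriptiveMetadata] = field(default_factory=list)


# Registration supplies the structural part of the asset.
asset = DataAsset(
    structural=StructuralMetadata(
        location="sales-server/SalesDb",
        object_name="Orders",
        object_type="Table",
        columns=[Column("OrderId", "int"), Column("OrderDate", "datetime")],
    )
)

# Users contribute descriptive metadata afterwards.
asset.descriptive.append(
    DescriptiveMetadata(
        description="One row per customer order.",
        contributed_by="analyst@example.com",
    )
)
```

Keeping the descriptive entries in a list reflects the idea that several users can each annotate the same asset.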
Request access
A data asset's descriptive metadata can include information on how to request access to the data asset or data
source. This information is presented with the data asset location, and can include one or more of the following
options:
The email address of the user or team responsible for granting access to the data source.
The URL of the documented process that users must follow to gain access to the data source.
The URL of an identity and access management tool (such as Microsoft Identity Manager) that can be used to
gain access to the data source.
A free-text entry that describes how users can gain access to the data source.
Preview
A preview in Azure Data Catalog is a snapshot of up to 20 records that can be extracted from the data source
during registration, and stored in the catalog with the data asset metadata. The preview can help users who
discover a data asset better understand its function and purpose, because seeing sample data can be more
valuable than seeing only the column names and data types. Previews are supported only for tables and views, and
must be explicitly selected by the user during registration.
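As an illustration of what a preview amounts to, the sketch below reads at most 20 rows from a table. It uses an in-memory SQLite database as a stand-in data source; the function name take_preview and the sample table are assumptions made for this example, and the registration tool's own implementation is not shown in this article.

```python
import sqlite3

PREVIEW_LIMIT = 20  # a preview holds up to 20 records


def take_preview(conn, table, limit=PREVIEW_LIMIT):
    """Return the column names and at most `limit` rows from `table`."""
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
    columns = [d[0] for d in cursor.description]
    return columns, cursor.fetchall()


# Stand-in data source with 50 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, i * 1.5) for i in range(50)],
)

columns, rows = take_preview(conn, "orders")
print(columns, len(rows))  # ['order_id', 'amount'] 20
```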
Data Profile
A data profile in Azure Data Catalog is a snapshot of table-level and column-level metadata about a registered data
asset that can be extracted from the data source during registration, and stored in the catalog with the data asset
metadata. The data profile can help users who discover a data asset better understand its function and purpose.
Similar to previews, data profiles must be explicitly selected by the user during registration.
NOTE
Extracting a data profile can be a costly operation for large tables and views, and may significantly increase the time required
to register a data source.
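The following sketch suggests why profiling can be costly. It computes one table-level statistic (row count) and a few column-level statistics (null count, distinct count, minimum, maximum) against an in-memory SQLite table. The choice of statistics and the function name profile_table are assumptions for illustration; this article does not list which statistics the catalog actually collects.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name for use in a SQLite statement."""
    return '"' + name.replace('"', '""') + '"'


def profile_table(conn, table):
    """Collect table-level and column-level statistics for `table`."""
    quoted_table = quote_identifier(table)
    row_count = conn.execute(
        f"SELECT COUNT(*) FROM {quoted_table}"
    ).fetchone()[0]

    # Read the column names without fetching any rows.
    cursor = conn.execute(f"SELECT * FROM {quoted_table} LIMIT 0")
    column_names = [d[0] for d in cursor.description]

    columns = {}
    for name in column_names:
        col = quote_identifier(name)
        # Each column needs its own pass over the table.
        nulls, distinct, minimum, maximum = conn.execute(
            f"SELECT COUNT(*) - COUNT({col}), COUNT(DISTINCT {col}), "
            f"MIN({col}), MAX({col}) FROM {quoted_table}"
        ).fetchone()
        columns[name] = {
            "nulls": nulls,
            "distinct": distinct,
            "min": minimum,
            "max": maximum,
        }
    return {"row_count": row_count, "columns": columns}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, None), (3, 10.0), (4, 25.5)],
)

profile = profile_table(conn, "orders")
print(profile["row_count"], profile["columns"]["amount"])
```

Because every column triggers an aggregate query over the whole table, the work grows with both the row count and the number of columns.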
User perspective
In Azure Data Catalog, any user can provide descriptive metadata for a registered data asset. Each user has a
distinct perspective on the data and its use. For example, the administrator responsible for a server may provide
the details of its service level agreement (SLA) or backup windows; a data steward may provide links to
documentation for the business processes the data supports; and an analyst may provide a description in the terms
most relevant to other analysts, which can be the most valuable information for users who need to discover and
understand the data.
Each of these perspectives is inherently valuable, and with Azure Data Catalog each user can provide the
information that is meaningful to them, while all users can use that information to understand the data and its
purpose.
Expert
An expert is a user who has been identified as having an informed “expert” perspective for a data asset. Any user
can add themselves or another user as an expert for an asset. Being listed as an expert does not convey any
additional privileges in Azure Data Catalog; it allows users to easily locate those perspectives that are most likely to
be useful when reviewing an asset’s descriptive metadata.
Owner
An owner is a user who has additional privileges for managing a data asset in Azure Data Catalog. Users can take
ownership of registered data assets, and owners can add other users as co-owners. For more information, see How
to manage data assets.
NOTE
Ownership and management are available only in the Standard Edition of Azure Data Catalog.
Registration
Registration is the act of extracting data asset metadata from a data source and copying it to the Azure Data
Catalog service. Data assets that have been registered can then be annotated and discovered.
See also
What is Azure Data Catalog? - This article provides an overview of the Azure Data Catalog service, the value it
provides, and the scenarios it supports.
Get started with Azure Data Catalog - This article provides an end-to-end tutorial that shows you how to use
Azure Data Catalog for data source discovery.