Last updated: 2015-10-29

Getting started with Watson Analytics

IBM
Note
Before using this information and the product it supports, read the information in the Notices section.

Product Information
This document applies to Watson Analytics and may also apply to subsequent releases.
Licensed Materials - Property of IBM
Copyright IBM Corporation 2015.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
Chapter 1. What is Watson Analytics?
Chapter 2. Uploading data
Chapter 3. Refining the data
Chapter 4. Asking questions and exploring insights
Chapter 5. Finding out what drives outcomes
Chapter 6. Assembling insights to share with others
Chapter 7. Adding and exploring Tweets
Chapter 8. What's next?
Notices
Index

Chapter 1. What is Watson Analytics?
We all ask questions about our data every day.

Some of our questions are about a status or situation. What is the revenue by
country in Europe? What is the trend of my product costs in the United States?
What is the breakdown of revenue by country and product line? Questions that are
about a status or a situation are handled in Explore.

Some of our questions are about why something happened. Why did certain sales
deals close while others did not? Why do some customers leave? What drives the
result that I see in my report? These questions are handled in Predict.

And when you find interesting information, you want to share these insights with
other people by creating beautiful dashboards quickly and easily. This is what
Assemble is all about. You can add insights from Explore and Predict to Assemble
and also create new visualizations on the go.

In short, when it comes to data, we want to know what is happening, why it is
happening, and what insights need to be communicated to others.

IBM Watson Analytics can help you understand your data better and find
insights that are hidden in your data. Watson Analytics offers you the benefits of
advanced analytics without the complexity. A smart data discovery service
available on the cloud, it guides data exploration, automates predictive analytics,
and makes it easy to create dashboards and infographics.

You can get answers and new insights to make confident decisions in minutes, all
on your own.

When you start Watson Analytics, you are on the Welcome page, where you can
access all of the capabilities: Explore, Predict, Assemble, and Refine (where you
shape how the data will appear).

Scenario for the tutorial

In this tutorial, you're a Human Resources manager who has been given a big
project: you'll be leading a new training initiative for your entire global company.
You want to better understand where the training budget is currently invested in
all areas of the company because, at the moment, you know only how it's spent in
your own area.

We know your time is valuable. We created this tutorial to guide you through
some basic concepts and features using a sample data set so that you can quickly
learn more about Watson Analytics and its innovative way of bringing you closer
to your data. Then you can add your own data and start discovering new insights
in your business. Have fun!

Chapter 2. Uploading data
Let's start by getting data for the tutorial. There are lots of sample data sets on the
IBM Watson Analytics Community, including the one that's used in this tutorial.
WA_HR_Training is about a fictitious company that sells camping equipment.

Procedure
1. Go to the Watson Analytics Resources page
(https://community.watsonanalytics.com/resources/).
2. Scroll down to find and tap Sample Dataset - Human Resources Training.
3. Tap the WA_HR_Training link and tap Save File.
4. In Watson Analytics, tap Add.
5. Drag WA_HR_Training.csv to Drop file or browse.

While uploading the file, Watson Analytics analyzes the data and metadata,
creates hierarchies from the metadata, and identifies concepts to use in
analyses.
The data set appears as a tile on the Welcome page and you're ready to get to
work.

Chapter 3. Refining the data
You can work with your data sets to review your data or tune it to match the way
you want to see or work with it. The changes that you make are saved as a
separate version of the original data set.

About this task

If you refine a data set, the changes that you make to it are available in Predict,
Explore, and Assemble. If you modify the data in an exploration, prediction, or
view, the changed data is available only in that asset.

There are different reasons to refine your data sets. For example, you might want
to enrich the data by adding more value, such as calculations. Or, you might want
to filter the data in a particular area of your business. In addition, you might want
to make data more usable by renaming columns, changing data types, and
modifying the default aggregations. Additionally, you might want to create
hierarchies and groups.

When you refine a data set, a new data set is created that is related to your
original data set.

You can also learn more about your data. When you view the data metrics for
your data set, you see the following information.
- The quality score for each column, which indicates a column's potential
  readiness for use in a prediction.
- The percentage of data that is missing from each column.
- Distribution graphs of the data in each numeric column.

You can also sort and scroll through your data to preview it.
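
If you want a rough feel for one of these metrics before you upload a file, the percentage of missing data per column is easy to approximate yourself. The following is a minimal sketch, not how Watson Analytics computes its metrics; the sample rows and the `percent_missing` helper are hypothetical and stand in for a real file such as WA_HR_Training.csv.

```python
import csv
import io

# Hypothetical sample data; the real WA_HR_Training.csv has other columns and rows.
SAMPLE = """Organization,Department,Course cost
GO Americas operations,Sales,1200
GO Central Europe operations,,800
GO Americas operations,Finance,
GO Americas operations,Sales,950
"""


def percent_missing(csv_text):
    """Return {column name: percentage of rows whose value is empty}."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    result = {}
    for column in rows[0]:
        # A value counts as missing when it is empty or only whitespace.
        missing = sum(1 for row in rows if not (row[column] or "").strip())
        result[column] = 100.0 * missing / len(rows)
    return result


missing_by_column = percent_missing(SAMPLE)
for column, pct in missing_by_column.items():
    print(f"{column}: {pct:.1f}% missing")
```

To try it on your own file, read the file's text and pass it to `percent_missing` in place of `SAMPLE`.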

Procedure
1. On the Welcome page, tap Refine and then select the WA_HR_Training data
set.

2. The Organization column is organized by geographic region except for two
items for the GO Accessories area. Let's filter them out of the refined data set.
Tap the Organization column title.

3. Select the items that will be filtered out: GO Accessories corporate and GO
Accessories operations.
The data set is filtered to show only these two.

4. Now tap Invert to select all the other items.
GO Accessories corporate and GO Accessories operations are now filtered out
of the refined data set. When you use this refined data set to create
explorations, predictions, and views, these two items will not be available for
analysis.

5. Tap to save the refined data set. Keep the default name and tap Save.
6. Let's move back to the Welcome page and ask a question. Tap the Navigate
menu in the app bar and tap Welcome.
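
The select-then-invert filter in steps 3 and 4 can be pictured as two complementary selections over the same rows. Here is a small sketch under stated assumptions: the sample rows and cost values are made up, and the code only mirrors the idea of the filter, not how Watson Analytics stores a refinement.

```python
import csv
import io

# Hypothetical rows; only the organization names come from the tutorial.
SAMPLE = """Organization,Course cost
GO Americas operations,1200
GO Accessories corporate,300
GO Central Europe operations,800
GO Accessories operations,450
"""

EXCLUDED = {"GO Accessories corporate", "GO Accessories operations"}

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Step 3: selecting the two items shows only the matching rows.
selected = [row for row in rows if row["Organization"] in EXCLUDED]

# Step 4: inverting the selection keeps every other row instead.
refined = [row for row in rows if row["Organization"] not in EXCLUDED]

# The original rows stay untouched; the refinement is a separate list,
# much like the refined data set is saved separately from the original.
print([row["Organization"] for row in refined])
```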

Chapter 4. Asking questions and exploring insights
Now that you've refined the data, let's start discovering some new insights about
the data.

Procedure
1. On the Welcome page, tap the refined data set, WA_HR_Training
Refinement.

IBM Watson Analytics generates some starting points for you. Each starting
point is a way to start diving into the data. You can select a starting point or
ask your own question about the data.

2. If you're wondering how to ask a question, there's a coach to help you. Tap
How to ask a question.
You see categories of questions. Take some time to look through the questions
in each category. If you select one of the questions, you'll see a new set of
starting points. But for this tutorial, let's return to the previous dialog box.

3. Tap .
4. Enter the question what is the cost of courses by organization and then
press Enter.
The new starting points have a relevancy score because the question provides
a context for the starting points.

5. Select the What is the breakdown of Course cost by Organization starting
point. It opens in the Explore capability.
Take a look at the visualization. The size of each box in the tree map tells you
the amount spent on training by each organization.

Now look below the visualization. This is the data tray.

In Refining the data, you filtered Organization to exclude GO Accessories
corporate and GO Accessories operations. Tap Organization in the data tray
and you'll see that these items don't appear.

6. You can easily get more details about the data that a part of a visualization
represents. Let's look at the numbers for one of the largest boxes. Touch and
hold, or hover over, GO Central Europe operations.

Before going further, take a look just above the visualization.

This is the insight bar. Without you having to do a thing, Watson Analytics
identifies patterns and associations in your data and automatically creates
other starting points for you to explore. As you change the data in the
visualization, Watson Analytics creates new starting points to reflect the
changes in the visualization. Don't worry, you'll get to use the insight bar in a
little bit.
7. This visualization shows you data for all years but let's find out how costs
have evolved over time. Tap to see where you can add data and then
complete the following actions:

a. Under Rows, tap Add a column.

b. Scroll down and tap Year.

c. Tap Done, then tap to see the legend again.
You see costs for courses by year.

8. Let's find out which departments spend the most on training. Tap
Organization in the interactive title just above the visualization. Then tap
Department.
The blue text in the title shows you the most important columns in the
visualization. When you tap the blue text, the most relevant columns are
shown first but you can select any column.

You see that the sales departments have spent the most on courses each year.
It's that easy! Look how quickly you are interacting with the data and
discovering insights in Watson Analytics.

9. Save the exploration.
Name the exploration Tutorial - Course Costs.

10. Let's set aside, or collect, this visualization to use later in a dashboard. Tap
the icon that is located at the bottom right side of the window, below the
visualization. You can collect only the visualization that you are currently
viewing.
11. Let's find out how many new hires are planned for each department. Tap 20
is the lowest Planned position count for Department Operations and tap
New page.

You see that the Sales departments are planning to hire the most people and
the Operations department the least.

12. Let's collect this visualization too.

13. If you want to share this exploration, tap


14. Return to the Welcome page.
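
If you're curious what a question like "What is the breakdown of Course cost by Organization" boils down to, it is a sum of one measure grouped by one or more columns. Adding Year in step 7 simply adds a second grouping column. The sketch below illustrates that idea only; the records, the cost values, and the `breakdown` helper are hypothetical and are not part of Watson Analytics.

```python
from collections import defaultdict

# Hypothetical records; column names follow the tutorial, values are made up.
RECORDS = [
    {"Organization": "GO Americas operations", "Year": 2013, "Course cost": 1200.0},
    {"Organization": "GO Americas operations", "Year": 2014, "Course cost": 900.0},
    {"Organization": "GO Central Europe operations", "Year": 2013, "Course cost": 1500.0},
    {"Organization": "GO Central Europe operations", "Year": 2014, "Course cost": 1700.0},
]


def breakdown(records, measure, *dimensions):
    """Sum `measure` for each distinct combination of the `dimensions` columns."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record[measure]
    return dict(totals)


# One grouping column: the sizes of the boxes in the tree map.
by_org = breakdown(RECORDS, "Course cost", "Organization")

# Two grouping columns: the same totals split by year.
by_org_year = breakdown(RECORDS, "Course cost", "Organization", "Year")

for key, total in sorted(by_org_year.items()):
    print(key, total)
```

Swapping "Organization" for "Department", as in step 8, would only change the grouping column that is passed in.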

Chapter 5. Finding out what drives outcomes
You've explored the data set and learned that the Sales departments use more of
the training budget than other departments and that Sales is planning to hire the
most people. Now let's see what other factors are at play in this data set.

Procedure
1. On the Welcome page, tap Predict and select WA_HR_Training Refinement.

2. Name the workbook Tutorial - Expenses.


3. Watson Analytics selected Year as the target but let's select a different target.
Tap Select target and tap Expense total. Then remove Year.
A target is a variable from your data set that you want to understand. The
target field's outcomes are influenced by other fields (input fields) in the data.

4. Tap Create.
Watson Analytics automatically analyzes the data when creating your
prediction, applying statistical algorithms to the data to discover insights,
patterns, and correlations in the data set.
Take a look at the spiral visualization. It shows you the top key drivers, or
predictors, in color with other predictors in gray. The closer the predictor is to
the center, the stronger that predictor is.

5. You can view details about each predictor. Hover over the blue circle, Position.
You see that the predictive strength is 74.3%. What does this mean? Predictive
strength measures how accurately a predictor predicts a target. A
predictor with a predictive strength of 100% perfectly predicts a target.

In this case, a predictive strength of 74.3% means we got it right 74.3% of the
time based on the data that was provided.

Each predictor has a corresponding detailed visualization that contains
information about the predictor and how it affects the target. The color of the
circle in the spiral visualization is also found in the corresponding detailed
visualization.

6. Tap the detailed visualization called Position drives Expense Total.


You see some details about how position influences expenses.

Close the detailed visualization.
7. Let's find out what other fields influence our target, Expense Total. In the
prediction scenario selector, tap Two Fields.
One Field leads to predictions that are easier to understand but might be less
predictive. Combination might lead to a prediction that is more accurate, but
harder to understand. Two Fields might be somewhere in the middle.

Here's a new set of visualizations about how combining two predictors
influences your target.

8. Open the detailed visualization called Position and Organization drives
Expense total (the blue dot).
You can see that a Level 3 Sales Rep within GO Americas operations has the
highest average expense total overall, as indicated by the darkest shaded color.

9. Let's collect this visualization to use later in a dashboard. Tap .
10. Return to the Welcome page.
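
To make the "we got it right 74.3% of the time" reading of predictive strength more concrete, here is a toy illustration. It is only a sketch of that reading, not the algorithm that Watson Analytics uses: it assumes a categorical target (the tutorial's Expense total is numeric), the observations are made up, and `one_field_accuracy` is a hypothetical helper. For each predictor value it guesses the most common target value and then counts how often that guess matches the data that was provided.

```python
from collections import Counter, defaultdict

# Hypothetical (Position, expense band) pairs; not taken from the sample data set.
OBSERVATIONS = [
    ("Sales Rep", "high"),
    ("Sales Rep", "high"),
    ("Sales Rep", "low"),
    ("Analyst", "low"),
    ("Analyst", "low"),
    ("Manager", "high"),
    ("Manager", "low"),
    ("Manager", "high"),
]


def one_field_accuracy(observations):
    """Percentage of rows matched by guessing the most common target per predictor value."""
    outcomes = defaultdict(Counter)
    for predictor_value, target_value in observations:
        outcomes[predictor_value][target_value] += 1
    # For each predictor value, the best single guess is right as many
    # times as its most common target value occurs.
    correct = sum(counter.most_common(1)[0][1] for counter in outcomes.values())
    return 100.0 * correct / len(observations)


strength = one_field_accuracy(OBSERVATIONS)
print(f"Guess matches the data {strength:.1f}% of the time")
```

A predictor whose values each map to exactly one target value would score 100% in this sketch, which matches the tutorial's description of a perfect predictor.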

Chapter 6. Assembling insights to share with others
You can easily communicate the analysis and insights that you discovered in IBM
Watson Analytics for a report or presentation by combining visualizations with
text, images, and shapes.

Procedure
1. On the Welcome page, tap Assemble and select WA_HR_Training
Refinement.

You aren't required to select a data set in Assemble. You could simply create a
dashboard of the visualizations that you collected earlier in the tutorial. But
let's select a data set now so that you can easily create another visualization.

2. The name of the view defaults to the name of the data set. Change the name
to Tutorial - Dashboard in the Name your view field.

3. Let's select a template for our dashboard. Under Dashboard, select the
template that has four panes.

Templates contain predefined layouts and grid lines for easy arrangement and
alignment of the visualizations that you want to share.
4. Tap Create.
Let's start by creating a visualization in the dashboard.
5. From the data tray at the bottom of the window, drag Department to the top
left pane. You'll see a square with arrows in it appear in the middle of the
pane. Drop Department on top of the square.

Watson Analytics displays the list of departments.

6. Drag External hires onto the list of departments.


Watson Analytics creates a visualization that shows the number of external
hires by department.

7. Let's add the visualizations that you set aside earlier. Tap on the app bar.
You see the collection of visualizations that you set aside earlier.

8. Drag the Tutorial - Expenses visualization to the top right pane and drop it on
the square that appears in the pane.

Visualizations from Predict display as non-interactive images.

9. Add the first Tutorial - Course Costs visualization to the bottom left pane.
Visualizations from Explore remain interactive. While in Assemble, you can
change the data that's displayed in the exploration by using the interactive
title or by tapping .

10. Add the second Tutorial - Course Costs visualization to the bottom right
pane.

Before sharing this dashboard with others, you can enhance it by adding a
title, images, and shapes. See the Docs in Watson Analytics for the details.

Chapter 7. Adding and exploring Tweets
You want to see what the sentiment is for some of the major ways to deliver
training to see if there's one that could work for your pilot training initiative. Your
organization is located in many different countries so e-learning and mobile
learning are the two that you're interested in.

About this task

You can complete this task only if you are subscribed to the Professional edition or
the Personal edition of Watson Analytics.

Procedure
1. On the Welcome page, tap Add and tap Upload data.

2. Tap Twitter.

3. Complete the following actions:
a. Enter these hashtags separated by a space: #elearning #mlearning
b. To get a reasonable and current sample size and quick return on the query,
enter a date range going back 7 days from today. Or use the date ranges
that are shown in the screen capture.
c. In the Data set name box, highlight the default name and replace it with
Tutorial - Twitter.

4. Tap Upload data.


5. Tap the data set that's created.

IBM Watson Analytics analyzes sentiment and generates some starting points
for you.

6. Enter the question Compare the number of tweets by matching hashtag and
sentiment, and then press Enter.
7. Tap the starting point called How does the number of Tweet compare by
Matching Hashtags and Sentiment?
You see that there is a lot more discussion around e-learning.

You can now set aside this exploration and add it to a dashboard in
Assemble.
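
Two parts of this procedure are simple enough to sketch outside the product: the date range going back 7 days from today, and the count of Tweets by matching hashtag and sentiment. The example below is a rough illustration only. The Tweet rows, the field names `hashtag` and `sentiment`, and the `last_seven_days` helper are all hypothetical; Watson Analytics retrieves the Tweets and scores their sentiment for you.

```python
from collections import Counter
from datetime import date, timedelta


def last_seven_days(today):
    """Return (start, end) for a date range going back 7 days from `today`."""
    return today - timedelta(days=7), today


# Hypothetical rows standing in for the uploaded Twitter data set.
TWEETS = [
    {"hashtag": "#elearning", "sentiment": "positive"},
    {"hashtag": "#elearning", "sentiment": "neutral"},
    {"hashtag": "#elearning", "sentiment": "positive"},
    {"hashtag": "#mlearning", "sentiment": "negative"},
    {"hashtag": "#mlearning", "sentiment": "positive"},
]

# Count Tweets for each (hashtag, sentiment) pair, and for each hashtag overall.
counts = Counter((t["hashtag"], t["sentiment"]) for t in TWEETS)
per_hashtag = Counter(t["hashtag"] for t in TWEETS)

start, end = last_seven_days(date(2015, 10, 29))
print(f"Date range: {start} to {end}")
for (hashtag, sentiment), n in sorted(counts.items()):
    print(hashtag, sentiment, n)
```

In practice you would pass `date.today()` to `last_seven_days`; a fixed date is used here so that the output is repeatable.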

Results

Because you can quickly tap into Twitter data and understand the sentiment of a
series of disparate or competitive hashtags through Watson Analytics, you can now
begin to focus on key aspects of your business that may have seemed elusive in
the past. This provides a means to better align with departmental goals and even
organizational agendas by using the power of Watson Analytics.

Chapter 8. What's next?
You've learned how easy it is to get started with IBM Watson Analytics. Now you
can add your own data and start discovering new insights. Here are some tips to
help you get going.

Add your own data

We prepared the sample data for this tutorial. You can use other sample data sets
from the Watson Analytics community. You can also upload your own data sets.
Before uploading other data sets, take a look at the data files to see if there are
things you want to change.

Watson Analytics can provide better predictions and explorations if the quality of
your data is high. If the quality of your data is low, the accuracy of the analyses in
your explorations and predictions is less reliable. You can improve the quality of
your data.

When a data set is loaded, Watson Analytics reads the data and determines a data
quality score that describes the data set's suitability for making predictions. The
higher the score, the better the quality of data. If you provide a high-quality data
set, Watson Analytics provides a high data quality score.

You can see the score that is associated with each data set in the list of assets on
the Welcome page. For example, a score of 68 indicates a data set of medium
quality. The lower the score, the higher the number of outliers or missing values
and other issues.

To obtain a high data quality score, clean your data before you import it into
Watson Analytics.
- Remove blank rows from your data file.
- Remove summary rows and columns from your data file.
- Eliminate nested column headings and nested row headings.
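
The first two cleanup tips can be scripted if you have many files to prepare. The following is a minimal sketch, assuming a CSV file whose summary rows are labeled in the first column; the sample data, the `SUMMARY_LABELS` set, and the `clean_rows` helper are hypothetical. Nested headings usually need a manual look, so they are not handled here.

```python
import csv
import io

# Hypothetical export with a blank row and a summary row.
RAW = """Organization,Department,Course cost
GO Americas operations,Sales,1200
,,
GO Central Europe operations,Finance,800
Total,,2000
"""

# Assumed labels that mark a summary row; adjust them to match your own files.
SUMMARY_LABELS = {"total", "subtotal", "grand total"}


def clean_rows(csv_text):
    """Return the header plus data rows, without blank rows or summary rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    cleaned = [header]
    for row in reader:
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank rows
        if cells[0].lower() in SUMMARY_LABELS:
            continue  # skip summary rows such as "Total"
        cleaned.append(cells)
    return cleaned


cleaned = clean_rows(RAW)
for row in cleaned:
    print(row)
```

You could then write `cleaned` back to a new file with `csv.writer` and upload that file instead of the original.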

Learn more

There is much more you can do in Watson Analytics. Check out the Help menu.

In this menu, you have access to the following sources of more information:
- Detailed topics in the Docs
- Introductory Tours that open in a separate browser tab
- Little Hints that appear in various places in Watson Analytics
- Expert blogs, discussion forums, and sample data in the Community

We also have a Watson Analytics channel on YouTube where you can watch
videos:

https://www.youtube.com/user/watsonanalytics

Wait, there's more!

If you are subscribed to the Professional edition or the Personal edition of Watson
Analytics, you have access to more types of data:
- Twitter metadata
- Cognos BI reports
- Data from SPSS Statistics .sav files
- Databases such as IBM DB2, IBM dashDB, IBM SQL Database for Bluemix,
  Microsoft SQL Server, MySQL, Oracle, and PostgreSQL

You also have access to other features such as:
- More storage space
- Larger data sets
- Accessing data in the cloud, such as Box, Dropbox, and Microsoft OneDrive
- Sharing data sets, refined data sets, explorations, predictions, and views with
  other users in the same subscription

If you want to upgrade to one of these editions, see
http://www.ibm.com/marketplace/cloud/watson-analytics/us/en-us.

Notices
This information was developed for products and services offered worldwide.

This material may be available from IBM in other languages. However, you may be
required to own a copy of the product or product version in that language in order
to access it.

IBM may not offer the products, services, or features discussed in this document in
other countries. Consult your local IBM representative for information on the
products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may
be used instead. However, it is the user's responsibility to evaluate and verify the
operation of any non-IBM product, program, or service. This document may
describe products, services, or features that are not included in the Program or
license entitlement that you have purchased.

IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not grant you
any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM
Intellectual Property Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
19-21, Nihonbashi-Hakozakicho, Chuo-ku
Tokyo 103-8510, Japan

The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law: INTERNATIONAL
BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS"
WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE. Some states do not allow disclaimer of express or implied warranties in
certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors.
Changes are periodically made to the information herein; these changes will be
incorporated in new editions of the publication. IBM may make improvements
and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM
product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact:

IBM Software Group
Attention: Licensing
3755 Riverside Dr.
Ottawa, ON
K1V 1B7
Canada

Such information may be available, subject to appropriate terms and conditions,
including in some cases, payment of a fee.

The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement or any equivalent agreement
between us.

Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environments may
vary significantly. Some measurements may have been made on development-level
systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been
estimated through extrapolation. Actual results may vary. Users of this document
should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of
those products, their published announcements or other publicly available sources.
IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or
withdrawal without notice, and represent goals and objectives only.

This information is for planning purposes only. The information herein is subject to
change before the products described become available.

This information contains examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.

If you are viewing this information softcopy, the photographs and color
illustrations may not appear.

Depending upon the configurations deployed, this Software Offering may use
session and persistent cookies that collect each user's
- name
- user name
- password
- personally identifiable information other than name, user name, password,
  profile name and position

for purposes of
- session management
- authentication
- enhanced user usability
- single sign-on configuration
- usage tracking or functional purposes other than session management,
  authentication, enhanced user usability and single sign-on configuration

These cookies cannot be disabled.

If the configurations deployed for this Software Offering provide you as customer
the ability to collect personally identifiable information from end users via cookies
and other technologies, you should seek your own legal advice about any laws
applicable to such data collection, including any requirements for notice and
consent.

For more information about the use of various technologies, including cookies, for
these purposes, see IBM's Privacy Policy at http://www.ibm.com/privacy and
IBM's Online Privacy Statement at http://www.ibm.com/privacy/details in the
section entitled "Cookies, Web Beacons and Other Technologies" and the "IBM
Software Products and Software-as-a-Service Privacy Statement" at
http://www.ibm.com/software/info/product-privacy.

Trademarks
IBM, the IBM logo and ibm.com are trademarks or registered trademarks of
International Business Machines Corp., registered in many jurisdictions worldwide.
Other product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at "Copyright and
trademark information" at www.ibm.com/legal/copytrade.shtml.

The following terms are trademarks or registered trademarks of other companies:
- Microsoft, Windows, Windows NT, and the Windows logo are trademarks of
  Microsoft Corporation in the United States, other countries, or both.

Microsoft product screen shot(s) used with permission from Microsoft.

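Index
A
assembling visualizations, Chapter 6

C
cleaning data, Chapter 8
collecting, Chapters 4 and 5
collection, Chapter 6
communicating with others, Chapter 6

D
data
  additional types, Chapter 8
  cleaning, Chapter 8
  preparing, Chapter 3
  Twitter, Chapter 7
  uploading, Chapter 2

E
exploring
  data, Chapter 4
  Twitter, Chapter 7

F
filtering
  data set, Chapter 3
  Twitter, Chapter 7

I
importing, Chapter 2
insights, Chapter 4

P
Personal edition, Chapter 8
predictive insights, Chapter 5
predictors, Chapter 5
preparing data, Chapters 2 and 3
Professional edition, Chapter 8

Q
questions, Chapter 4

R
refining data, Chapter 3

S
shaping data, Chapter 3
spiral, Chapter 5

T
targets, Chapter 5
Tweets, Chapter 7
Twitter, Chapter 7

U
upgrading
  editions, Chapter 8
uploading data, Chapter 2

V
visualizations
  adding, Chapter 6
  details, Chapters 4 and 5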
