
Data Discovery & Visualization
TEAM 2

ADHARSH R - 2019201002
ATHIRA GOPINATHAN - 2019201013
GAYATHRI V - 2019201023
MADESH V V - 2019201034
RAJA BHARTHI - 2019201045
MAMDOUH ALAJLANI - 2019201065


What is Data Discovery?
• Data discovery is the collection and analysis of data from various sources to gain
insight from hidden patterns and trends.
• It is the first step in fully harnessing an organization’s data to inform critical
business decisions.
• Data is gathered, combined, and analyzed in a sequence of steps.
• The goal is to make messy and scattered data clean, understandable, and user-friendly.
Why is it so popular?
• Data is considered an invaluable commodity and “currency” for businesses.
• It helps companies derive trusted insights that they can apply to their competitive
advantage.
• Data improves the decision-making process, powers growth strategies, significantly
boosts the customer experience, and enables organizations to drive innovation with
their business models.
• Studies suggest that 79% of enterprise executives believe that companies that do not
leverage big data in the right way will lose their competitive position and could
ultimately face extinction.
Steps involved in Data Discovery
1. Connect and Blend Data - Data scattered across many sources must be brought into a single
place where analysis can take place. For example, an operations analyst who wants to see how
weather trends influence sales needs to blend weather data with sales data from the
organization’s CRM.

2. Cleanse and Prepare Data - Data needs to be cleaned and structured in ways that facilitate
reliable and robust analysis. In survey analyses, for example, marketing researchers must break
down free-response answers to catch mistakes and categorize responses.

3. Share Data - Once the data is structured and free from redundant or unneeded information, it
must be shared with others in the organization. Even though this data is the single version of
the truth, it can be leveraged in different ways.
4. Analyze and Generate Insights - Common tools include distributional analysis,
predictive models, and market basket analysis. It is important to understand the type of
insights generated by different analytical tools.

5. Visualize Insights - Insights need to be communicated once they are found, and
visualizations make this easy for users.
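The first two steps can be sketched with the weather-and-sales example from step 1. This is a minimal illustration in plain Python, not a description of any particular tool: the records, field names, and the helpers `clean_amount` and `blend` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical sample records; in practice these would come from a CRM export
# and a weather feed.
sales = [
    {"date": "2024-01-01", "amount": "120.50"},
    {"date": "2024-01-01", "amount": " 80.00 "},
    {"date": "2024-01-02", "amount": "n/a"},  # dirty value, dropped below
    {"date": "2024-01-02", "amount": "200"},
]
weather = [
    {"date": "2024-01-01", "condition": "rain"},
    {"date": "2024-01-02", "condition": "sun"},
]


def clean_amount(raw):
    """Cleanse step: return the amount as a float, or None if it cannot be parsed."""
    try:
        return float(raw.strip())
    except ValueError:
        return None


def blend(sales_rows, weather_rows):
    """Blend step: join sales to weather on date, then total valid sales per condition."""
    condition_by_date = {row["date"]: row["condition"] for row in weather_rows}
    totals = defaultdict(float)
    for row in sales_rows:
        amount = clean_amount(row["amount"])
        condition = condition_by_date.get(row["date"])
        if amount is None or condition is None:
            continue  # skip rows that fail cleansing or have no weather match
        totals[condition] += amount
    return dict(totals)


sales_by_condition = blend(sales, weather)
print(sales_by_condition)
```

With the sample rows above, the unparseable "n/a" amount is dropped and the remaining sales total 200.5 on the rainy day and 200.0 on the sunny day.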
Data Discovery - trends
BIG DATA DISCOVERY SMART DATA DISCOVERY

• Automatically understanding the links to related data.
• For example: if a data visualization shows a drop in revenue, the system will explain
the reasons why revenue is dropping, or what events might be causing this change.
• A human asks a question and a machine answers.
Benefits of Data Discovery
• Gather Actionable Insights - Data discovery takes complex data and structures it in
ways which allow users to visualize and comprehend the information within it.
• Save Time - Data discovery aggregates and formats data from various sources and
different structures to facilitate its analysis. This process provides analysts with the
right data in the right format.
• Scale Data Across Teams - Departments or users can leverage the same data in
different ways to create unique insights. Data discovery facilitates this process and
provides all users with a single version of the truth.
• Clean and Reuse Data - Data analysis is a continuous process. As new data is
collected, current data needs to be cleaned, stored, and made available for future
use.
Challenges of Data Discovery
• Volume - Data is available in enormous amounts and sizes, which can hamper analysis
and introduce bias. Data discovery must overcome this challenge with strong data
governance and capable technology.
• Variety - Many sources of data, in a variety of formats, pose a challenge to data
consistency. Successful data discovery requires strong technical skills to gather and
clean data so it’s ready to be analyzed and consumed.
• Velocity - Velocity is the speed at which data is created. Data discovery
becomes harder as the rate of data creation grows by the day. New data must be
continuously and correctly added to the repository to ensure timely insights.
• Consistency - Data must remain consistent across an organization so everyone
within it is on the same page. Inconsistencies can result in poor decisions based on
invalid or out-of-date data.
• Data Management - Mismanaged data introduces several hurdles into the data
discovery process. Data collected and stored inaccurately, illogically, or
inappropriately can introduce errors into an analysis without the user’s knowledge.
Data Discovery Tools
Excel

Excel is the base model for data discovery. Its capabilities allow users to pull, prepare and
analyse data within one document. While it can perform all required functions, many tasks
require manual manipulation of the data. The manual and non-customizable nature of the
platform severely limits the depth of analysis that can be conducted.

R

Mostly used by statisticians, R requires a specialised skill set. The platform is one of
the least user-friendly, but it is also extremely useful. It is open source, and it
can explore, edit, analyse and visualize data in many ways.
Qlik Sense, QlikView and Qlik NPrinting

Qlik Sense is a self-service data visualization and data discovery solution that provides
immediate analysis results. It is powered by QIX (Qlik’s associative engine) and gives
flexible access to data sets stored in memory. QlikView is a dashboard and analysis product
based on the same in-memory technology. Qlik NPrinting is a report generation, distribution
and scheduling application which can be used to create reports based on Qlik Sense or
QlikView content.
MicroStrategy

MicroStrategy enables users to easily access, blend and analyse data. It links to data
and automatically formats it with built-in data wrangling and parsing tools. Users can
share and distribute data throughout any organisation. Dossiers can be uploaded to a
central library where they can be accessed, viewed and analysed by others.
Data Visualization
• The process of breaking complex data collections into information that
users can understand and manage.
• By using visual elements like charts, graphs, and maps, data
visualization tools provide an accessible way to see and understand
trends, outliers, and patterns in data.
• In the world of Big Data, data visualization tools and technologies are
essential to analyze massive amounts of information and make data-
driven decisions.
WHAT?
• The process of displaying data (often in large quantities) in a meaningful fashion to
provide insights that will support better decisions.
WHY?
• To identify trends
• To identify patterns
• To find exceptions
• To compare groups of data
Example
Let’s say you are a retailer and you want to compare the sales of jackets to the sales of
socks over the course of the previous years. The most common way to represent the
data is a table, as shown:
Now here’s the data in a line graph visualization:
Bar Chart
• When? To compare multiple variables in a single timeframe, or a single variable in a
time series.
• Excel distinguishes between vertical and horizontal bar charts, calling the former
column charts and the latter bar charts.
Line Chart

When?
• To show how one or more variables change over time (progression).
• Connects a series of data points with a continuous line.
• Makes small changes easier to see, but is less versatile than a bar graph.
Pie Chart
When?
• To see parts of a whole on a percentage basis.
• Data visualization professionals don’t recommend using pie charts, because it is
difficult to compare the relative sizes of areas.
Area Chart
• When? To show cumulative changes of multiple variables over time.
• A variation of the line chart.
• A time-series graph that should be used with care: too many data series clutter the
chart and overwhelm the observer with detail.
Scatter Chart
• When? To see the correlation between two variables.
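The correlation a scatter chart lets you judge by eye can also be put into a number. The sketch below computes the Pearson correlation coefficient in plain Python; the function name `pearson` and the sample values are illustrative assumptions, not data from the presentation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up points that lie exactly on a rising straight line.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]
print(pearson(x_values, y_values))
```

A result near 1 corresponds to points rising along a line, near -1 to points falling along a line, and near 0 to no linear relationship.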
Bubble Chart
• When? To plot and compare three variables in two dimensions.
• Like scatter plots, but with more functionality, because the size and/or color of each
bubble represents additional data.
Miscellaneous Charts
• Stock Chart • Choropleth
• Surface Chart • Sankey Diagram
• Doughnut Chart • Network Diagram
• Radar Chart • Heat Maps
• Histograms • Population Pyramids
Excel Data Visualization Tools
Data Bars
• Display colored bars that are scaled to the magnitude of the data values (similar to a
bar chart) but placed directly within the cells of a range.
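The scaling idea behind data bars can be sketched in a few lines of Python. This is only a text-based illustration under simple assumptions (bars start at zero and the largest value fills the full width); it is not how Excel itself is implemented, and `data_bars` is a hypothetical helper.

```python
def data_bars(values, width=20):
    """Return one bar string per value, scaled so the largest value fills `width`."""
    if not values:
        return []
    largest = max(values)
    if largest <= 0:
        return ["" for _ in values]  # nothing positive to draw
    # Negative values are drawn as empty bars in this simplified sketch.
    return ["█" * round(width * max(v, 0) / largest) for v in values]


# Made-up monthly figures for illustration.
figures = [50, 100, 25]
for value, bar in zip(figures, data_bars(figures)):
    print(f"{value:>5} {bar}")
```

Each bar sits next to its value, much as a data bar sits inside its cell, so magnitudes can be compared at a glance without a separate chart.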
Color Scales
• Shade cells based on their numerical value using a color palette.
• Color-coding of quantitative data is commonly called a heat map.
Icon Sets
• Provide information using various symbols such as arrows or stoplight colors.
• Each icon represents a range of values.
Line and Column Sparklines
• Line sparklines are useful for time-series data, while column sparklines are more
appropriate for categorical data.
Win-loss Sparklines
• Win-loss sparklines are useful for data that move up or down over time.
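To make the difference between the sparkline types concrete, here is a small text-only sketch in Python. It assumes the win-loss input is a series of period-over-period changes (positive for a gain, negative for a loss); the function names and sample numbers are made up, and the output only imitates what Excel draws inside a cell.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight column heights, lowest to highest


def column_sparkline(values):
    """Map each value to a block character whose height reflects its size."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        return BLOCKS[0] * len(values)  # flat series: all columns at the lowest height
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[int((v - low) * top / span)] for v in values)


def win_loss_sparkline(changes):
    """Show only the direction of each change: up, down, or no change."""
    return "".join("▲" if c > 0 else "▼" if c < 0 else "·" for c in changes)


# Made-up figures for illustration.
print(column_sparkline([3, 8, 5, 12, 7]))
print(win_loss_sparkline([3, -1, 0, 2]))
```

The column version keeps the relative magnitudes, while the win-loss version deliberately throws them away and keeps only the sign of each value.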
American Cancer Society: A Case Study
• The American Cancer Society has been working for over 100 years to find a cure for
cancer and to help patients fight back, get well, and stay well.
• The Society realized it needed help understanding how people interact with its sites
and apps. It knew its users had different needs and goals, but it was a challenge for the
digital marketing team to work out how to group users in a way that would be beneficial
to all parties.
The Challenge
• In 2012, the members of the Society decided to seek help in understanding how visitors
were interacting with their sites and apps.
• Because the website offers information, participation opportunities, and a way to
donate, visitors come to the site with different needs and goals.
• The challenge was to isolate these visitors, group them into different customer
segments, and help the ACS achieve its goals.
• The Society turned to digital analytics and began working with Search Discovery to
help understand how users were interacting with its sites and applications, and to
address data quality concerns with the Google Analytics implementation on several
sites.
The Goals
• Understand how different users interact with the organization’s sites and apps
• Get a sense of how users change their behavior
• Engage with specific users more effectively
Process
The American Cancer Society and Search Discovery started working with Google Analytics to
capture data that would help them identify user segments, which were later classified as
follows:
• Info seekers: People looking for cancer signs and symptoms; some need help
understanding a cancer diagnosis.
• Event participants: People seeking opportunities to participate in walks, races, and
other events to raise funds for cancer research and support services.
• Donors: People who simply want to make donations to help the fight against cancer.
Custom metrics were used to send a score for each event:

Recency score = 1 point awarded if the previous session was within the past seven days

Engagement score = 1 point awarded for every three pages viewed

Conversion score = 1 point awarded for each donation, registration for an event, or
view of an entire article on cancer information

Revenue score = 1 point awarded for gifts larger than the average gift size of about $70
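The four scoring rules can be written out as a short Python sketch. This is an illustration of the rules as stated on the slide, not the Society's actual Google Analytics implementation: the `Session` structure and its field names are hypothetical, and awarding the revenue point once per above-average gift is an assumption, since the slide does not say how multiple gifts are counted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

AVERAGE_GIFT = 70.0  # approximate average gift size quoted on the slide


@dataclass
class Session:
    """Hypothetical summary of one visit; field names are illustrative only."""
    days_since_previous: Optional[int]  # None for a first-time visitor
    pages_viewed: int
    donations: List[float] = field(default_factory=list)
    event_registrations: int = 0
    full_article_views: int = 0


def score(session: Session) -> Dict[str, int]:
    """Apply the four scoring rules to a single session."""
    recent = session.days_since_previous is not None and session.days_since_previous <= 7
    return {
        "recency": 1 if recent else 0,
        # One point for every three pages viewed (integer division).
        "engagement": session.pages_viewed // 3,
        # One point per donation, event registration, or full article view.
        "conversion": (len(session.donations)
                       + session.event_registrations
                       + session.full_article_views),
        # Assumption: one point per gift above the average gift size.
        "revenue": sum(1 for gift in session.donations if gift > AVERAGE_GIFT),
    }


example = Session(days_since_previous=3, pages_viewed=7,
                  donations=[100.0, 20.0], event_registrations=1)
print(score(example))
```

For the example session, a return visit after three days earns the recency point, seven pages earn two engagement points, two donations plus one registration earn three conversion points, and only the $100 gift earns a revenue point.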
Insights
Results
• The new custom metrics allowed the Society’s analysts to identify behavior changes in
each segment that would have otherwise gone unnoticed.
• A separate site called Making Strides Against Breast Cancer is used to raise money
specifically for breast cancer research.
• Analysis showed an increase in the site performance score in October, which is Breast
Cancer Awareness Month.
• In response, the Society marketing team created new promotions on Cancer.org to
drive traffic to the Making Strides site. It worked. More than 39,000 people followed
those links throughout the month. The team also created a new donation form on
Cancer.org that sent funds only to breast cancer research. The result was a 5.4%
jump in Cancer.org revenue year over year.
Q&A
Thank You!
