
White Paper

Big Data Analytics: Future Architectures, Skills and Roadmaps for the CIO

September 2011
Sponsored by
By Philip Carter

Brave New World of Big Data

The ‘Big Data Era’ has arrived — multi-petabyte data warehouses, social
media interactions, real-time sensory data feeds, geospatial information
and other new data sources are presenting organisations with a range of
challenges, but also significant opportunities. IDC believes that as CIOs
start to adopt the new class of technologies required to process, discover
and analyse these massive data sets that cannot be dealt with using
traditional databases and architectures, it will become clear that the real
value will be derived from the high-end analytics that can be performed
on the increasing volumes, velocity and variety of data that organisations
are generating – or Big Data analytics.
One of the key differences between analytics in the traditional mode, and what we are dealing with in terms
of the Big Data era is that we are gathering data that we may or may not need – and from the perspective of
analysis, this means ‘we don’t know what we don’t know’ – hence, the variables and models are likely to be
entirely new, requiring a different infrastructure strategy and perhaps most importantly, new skill sets.

The objective of this white paper is to explore the initial impact that Big Data is having on organisations,
particularly their IT departments, which are being forced to re-assess architectures, delivery models and future
roadmaps. It explores the following areas in more detail:

Defining Big Data. This is not in the context of the quantity or threshold that actually quantifies Big Data (as this is changing all the time, and will be applied differently, depending on the vertical and market segment), but more in terms of a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-speed capture, discovery and/or analysis.

Hadoop, MapReduce, Key Value Store? There is a lot of hype around the new technologies that are being used by the market to deal with the Big Data phenomenon. We will highlight some of these and their relative importance.

The Value of Big Data… in Analytics. The bottom line here is that it is getting more complicated to process and analyse these large and growing data sets – and it essentially requires a re-assessment of the broader information management strategies for the majority of organisations that have started their business analytics journey.

Why Big Data Analytics is Important (and Different). Many have asked the question – what is new with this trend? This section will highlight the traditional use of business analytics in the old ‘pre-Big Data’ world, versus Big Data analytics in the ‘New World’. This will also look at the various use cases that IDC expects to see being most commonly used across a variety of industries.

The Skill Factor – the Rise of the Data Scientist. With the raft of new technologies and organisational structures that need to be put in place as the Big Data phenomenon becomes a reality, there will be increasing demand for ‘data scientists’ – the next-generation analytical professionals who are able to extract information from large data sets and then present value-added content of business value to non-data experts – who also have the unique skill of understanding the new models that need to be put in place.

Mapping out the Big Data Analytics Journey. The Big Data analytics journey will be an iterative one – it is therefore important to map this out in the context of a broader framework. This section aims to do exactly that, and also provide some recommendations to CIOs as they embark on this exciting journey into the brave new world of Big Data analytics.

Situation Overview
The Rise of Business Analytics

Much has been written on how the amount of data in the world is
exploding in volume. According to the recent IDC Digital Universe
study, the amount of information created and replicated will surpass 1.8
zettabytes (1.8 trillion gigabytes) in 2011 – growing by a factor of 9 in just
five years.
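
As a quick check on the figures quoted above, the following minimal Python sketch converts 1.8 trillion gigabytes into zettabytes and derives the average yearly growth implied by a ninefold increase over five years. It assumes decimal (SI) units and steady compound growth; the function names are illustrative and do not come from the IDC study.

```python
# Decimal (SI) units are assumed: 1 GB = 10**9 bytes, 1 ZB = 10**21 bytes.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_ZETTABYTE = 10**21


def gigabytes_to_zettabytes(gigabytes: float) -> float:
    """Convert a size in gigabytes to zettabytes."""
    return gigabytes * BYTES_PER_GIGABYTE / BYTES_PER_ZETTABYTE


def implied_annual_growth(factor: float, years: int) -> float:
    """Yearly growth rate implied by growing `factor`-fold over `years`,
    assuming the same compound rate every year (a simplifying assumption)."""
    return factor ** (1 / years) - 1


zettabytes_2011 = gigabytes_to_zettabytes(1.8e12)  # 1.8 trillion gigabytes
yearly_growth = implied_annual_growth(9, 5)        # ninefold in five years

print(f"{zettabytes_2011:.1f} ZB")
print(f"{yearly_growth:.0%} per year")
```

Under these assumptions, 1.8 trillion gigabytes corresponds to 1.8 zettabytes, and a ninefold increase in five years works out to roughly 55% growth per year.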

Big Data is a dynamic that seemed to appear from almost nowhere. But in reality, Big Data is not new – and it is moving into mainstream and getting a lot more attention. The growth of Big Data is being enabled by inexpensive storage, a proliferation of sensor and data capture technology, increasing connections to information via the cloud and virtualised storage infrastructure, as well as innovative software and analysis tools. It is no surprise then that business analytics as a technology area is rising on the radars of CIOs and line-of-business (LOB) executives. To validate this, as part of a recent survey of 5,722 end users in the US market, business analytics ranked in the top five IT initiatives of organisations. The key drivers for business analytics adoption remained conservative or defensive. The focus on cost control, customer retention and optimising operations is likely a reflection of the continued economic uncertainty. However, top drivers vary significantly by organisation size and industry. Similarly, IDC surveyed 693 European organisations in February 2011 where 51% of respondents said that BI and analytics are high-priority technologies. In emerging markets such as Asia/Pacific, the focus is very much on capturing the next wave of growth.

According to more than 1000 CIOs and LOB executives that were interviewed as part of the Asia/Pacific C-Suite Barometer in February 2011, business analytics was rated as the number one technology area that would enable their organisations to gain a competitive edge in the year ahead.

Figure 1: The Rise of Business Analytics

Q: You (CIO/CTO) mentioned ‘harnessing ICT to gain competitive advantage’… which of the following technologies or solutions would be your leading choice to better harness ICT?

Top 5 responses (bar chart; horizontal axis shows % of respondents, 0 to 35), in the order shown:
Business intelligence/analytics
Network
Social media/online channel
Collaboration (including video, mobility)
Cloud computing/services

Source: IDC, 2011

With more businesses in Asia investing in IT to ride the hyper growth wave in emerging markets, they are harnessing analytics-led solutions to gain better customer insights, manage risk and financial metrics more effectively, and at the same time, strive for unique market differentiation. Historically, organisations have made significant investments in applications with the objective of automating business processes and capturing data to improve operational efficiency. Many of these projects are still ongoing, but what is becoming increasingly clear to the senior management of these entities is that they (and their business managers) have not been able to get hold of the right information (mainly due to poorly integrated systems and questionable data quality) at the right time (due to performance and scalability issues) to the right stakeholders within their organisations for the critical decision-making capabilities needed to drive the necessary business impact. And where they are unable to do this, the line of business is procuring and deploying their own solutions in a new wave of ‘shadow IT’ investments focusing on business analytics, thereby forcing CIOs to re-examine these issues with a specific focus on driving better IT-business alignment. These are taking place even without the ‘Big Data’ dynamic in the picture – which when added, creates the ‘perfect storm’ for Big Data analytics to take centre stage.


A Note on Terminology:
BI or Analytics?

We have some challenges when defining and using terminology for business analytics. Because the BI market is mature, many terms have been around for a long time and have either become obsolete or have been redefined over the years. For example, the term ‘BI’ itself is sometimes used in a narrow sense (only query, reporting, and analysis [QRA] technology) and at times, in a broad sense to refer to the whole of what IDC calls business analytics (including data warehousing and analytic applications in addition to front-end tools). The term ‘analytics’ is relatively new and its meaning is often unclear – does it refer to advanced analytics including predictive analytics, optimisation and forecasting, or analytic applications? In some submarkets, such as Web analytics, the term ‘analytics’ simply means a dashboard on top of some data.

For the purpose of this white paper, we interpret ‘BI’ to mean either QRA tools or BI across the board (in its narrow definition), or ‘business analytics’ (in its broad definition) in IDC terminology. We interpret ‘analytics’ to mean either advanced analytics (data mining, statistics, optimisation and forecasting) or analytic applications (FPSM, CRM and marketing analytics, supply chain analytics, etc.). Business Analytics is a combination of the above (and also includes data warehousing technologies) and this is highlighted by IDC’s Business Analytics Taxonomy for 2011 (see figure 2 below):

Figure 2: IDC Business Analytics Taxonomy

Performance Management & Analytic Applications
  Financial Performance & Strategy Management: Budgeting, Planning, Consolidation, Profitability, Strategy Management
  CRM Analytic Applications: Sales, Customer Service, Contact Centre, Marketing, Web Site Analytics, Price Optimisation
  Supply Chain Analytic Applications: Procurement, logistics, inventory, manufacturing
  Services Operations Analytic Applications: Financial services, education, government, healthcare, communications services, etc.
  Production Planning Analytic Applications: Demand, supply, and production planning
  Workforce Analytic Applications

Business Intelligence Tools
  Query, Reporting, and Analysis Tools: Dashboards, production reporting, OLAP, ad-hoc query
  Advanced Analytics Tools: Data mining and statistics
  Content Analysis Tools
  Spatial Information Analytics Tools

Data Warehouse Management Platform
  Data Warehouse Management
  Data Warehouse Generation: Data extraction, transformation, loading; data quality

Source: IDC, 2011


Defining ‘Big Data’


Big Data is not so much about the content that is created, nor is it even about consumption. It is more about the analysis of the data and how that needs to be done. It is not really a ‘thing’, but instead a dynamic/activity that crosses many IT borders. IDC defines Big Data in this way:

“Big Data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high velocity capture, discovery and/or analysis.”

Figure 3: Defining ‘Big Data’

[Chart: data volumes (vertical axis) against time (horizontal axis), with labelled regions for semi-structured data (e.g. weblogs, social media feeds) and unstructured data (video, rich media, etc.). Data = Big, Complex, High Velocity & Wide Variety]
Source: IDC, 2011

The Volume. Volume builds up in two places. One is embodied more in the structured data realm. Some of this is held in transactional data stores and is linked to the ever-present electronic trail that individuals and businesses create in the wake of rapidly increasing online activity. Sensory data (machine-to-machine) contributes to this area too. The other is in existing data warehouses or data marts, which have over time grown to petabyte scale.

The Variety. The other aspect of this Big Data phenomenon is the need to analyse semi-structured and unstructured data. Text, video and other forms of media will require a completely different architecture and technologies to perform the required analysis. For example, if you look at the social media phenomenon, many marketing departments are looking at ways to do sentiment and brand analysis based on what is being posted on Facebook, Twitter and YouTube. This dynamic becomes more complex in Asia with local social media sites like RenRen in China and Nate in Korea.

The Velocity. There will also be demand to analyse this data on a more regular basis – for example, taking into account all transactions rather than a sample to obtain a more complete view of risk on a trade in real time.

In summary, Big Data refers to data sets whose volume, variety, velocity and complexity make it impossible for current databases and architectures to store and manage. IDC intentionally does not define Big Data as larger than a certain threshold (i.e. terabytes), mainly since this threshold would be a moving target depending on the sector, as well as the fact that it will obviously grow over time. More important is the value that organisations can derive from this phenomenon – and the resulting need to rethink their information strategies to extract the value.


Other Definitions:
Hadoop, MapReduce, Key Value Store

With the focus on Big Data going mainstream, a range of new technologies have hit the market. The table
below gives an overview of these technologies, with associated context (note that the list is not exhaustive).

Table 1: Big Data Technologies (Terminology)

Technology | Context
Big Table | Proprietary distributed database system built on the Google File System. Inspiration for HBase.
Cassandra | An open source (free) database management system designed to handle huge amounts of data on a distributed system. This system was originally developed at Facebook and is now managed as a project of the Apache Software Foundation.
Data Warehouse & Analytical Appliance | Consists of an integrated set of servers, storage, operating system(s), database, business intelligence, data mining and other software specifically pre-installed and pre-optimised for data warehousing.
Distributed System | Multiple computers, communicating through a network, used to solve a common computational problem. The problem is divided into multiple tasks, each of which is solved by one or more computers working in parallel. Benefits: improved price:performance ratio, higher reliability and more scalability.
Google File System | Proprietary distributed file system developed by Google: part of the inspiration for Hadoop.
Hadoop | An open source (free) software framework for processing huge data sets on certain kinds of problems on a distributed system. Its development was inspired by Google’s MapReduce and Google File System. It was originally developed at Yahoo! and is now managed as a project of the Apache Software Foundation.
HBase | An open source (free) distributed, non-relational database modeled on Google’s Big Table. It was originally developed by Powerset and is now managed as a project by the Apache Software Foundation as part of Hadoop.
MapReduce | A software framework introduced by Google for processing huge data sets on certain kinds of problems on a distributed system. Also implemented in Hadoop.
Non-relational database/Key Value Store | A non-relational database is one that does not store data in tables (rows and columns) – in contrast to a relational database. Key Value Stores allow for the management of schema-less (NoSQL) entities.
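
To make the MapReduce entry in the table above more concrete, here is a minimal single-process Python sketch of the map, shuffle and reduce steps applied to a word count. It illustrates the programming model only: it does not use Hadoop or any Google API, it runs on one machine rather than a cluster, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by key."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(documents: list[str]) -> dict[str, int]:
    # In a real distributed system each document (or block of data) would be
    # mapped on a different node; here everything runs in one process.
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data analytics", "big data skills"])
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both steps can in principle be spread across many machines, which is the property that frameworks of this kind build on.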

Although some of these terms will be used throughout this white paper, the focus is not to examine them in too much detail – because as one IT executive recently mentioned – ‘to know the technology is one thing, but to apply it in the right environment is something entirely different’. The new technology needs to be tied back to business requirements as much as possible – not just examining the technology for the sake of it. Having said that, most IT executives are not aware of the technologies and trends developing in this area – and where they are aware of it, their strategy is to put a couple of people in their enterprise architecture team to experiment with the new technologies (i.e. in memory, Hadoop, MapReduce, Key Value Stores etc) that are being used to deal with the ‘Big Data’ phenomenon.


Big Data Analytics:


The Old World vs. The New Era

Many have asked the question – what is new with this trend? This section highlights the traditional use of business analytics in the old ‘pre-Big Data’ world, versus Big Data analytics in the ‘Brave New World’. This will also look at the various use cases that IDC expects to see being used most commonly across a variety of industries. The majority of IT organisations have progressed in terms of their infrastructure architectures over time; from predominantly mainframe-based environments in the 1980s to a focus on client-server in the 1990s and the Web at the turn of the century, to what is now popularly known as ‘private cloud’. This supposed state of ‘nirvana’ constitutes a consolidated, virtualised set of infrastructure resources (server, storage and network) that can be self-provisioned in an automated fashion by business users – complete with SLAs that have the security, performance, availability and cost profiles transparent to all in the form of a service catalog. Very few organisations, if any, have achieved this state of infrastructure ‘nirvana’, and most are still battling with a spaghetti-like tangle of compute resources in their datacenter. And now, we have this external force of Big Data as mentioned earlier that is forcing CIOs to re-architect their infrastructure – particularly in the context of how analytics capabilities are deployed in an enterprise-wide fashion.

Below is an overview of the changes that IDC sees happening in the infrastructure world that are increasingly impacting the Big Data analytics world:

Table 2: Old World vs. New Era (Big Data Infrastructure)

Dimension | Old World | New Era
Tenancy | Infrastructure Silos | Pooled resources
Architecture | Performance ‘tuned’ | Linear scalability (linked to distributed parallel processing and ‘in memory’ storage)
Delivery Model | On Premise | Hybrid (with cloud bursting capabilities) and widespread use of the appliance


Based on IDC’s research in this space, here are three suggestions for CIOs in dealing with these issues:

Cloud Bursting. The private cloud journey will line up well with the enterprise-wide analytical requirements highlighted earlier, but CIOs need to ensure that workload assessments are conducted rigorously and that risk is mitigated where possible. Critical to this approach will be the evaluation of cloud bursting capabilities from external vendors (i.e. Infrastructure as a service), particularly as organisations start to leverage more real-time analytics environments, to ensure that the use of infrastructure resources maps closely to demand – and that there are no issues in terms of performance and availability.

Analytical Appliance. In terms of delivery models, IDC has seen significant performance benefits from analytical appliances for customers that are dealing with the impact of Big Data. In addition, since the software is optimised and pre-integrated with appliances, the deployment timeframes are typically shorter. As part of a recent global survey of CIOs, 10% of the respondents indicated that they will be looking at analytical appliances as a delivery model in 2011. IDC also believes that the demand for reference architectures will rise as CIOs look to integrate these appliances within existing data warehousing environments. In line with this increased adoption of the analytical appliance as a delivery model, IDC believes that IT departments will allocate less budget towards technical skills (i.e. installation, configuration and management), and more on the high-end analytical skills needed to help drive the necessary business impact across multiple functions.

Enterprise Architecture. Enterprise analytics needs an enterprise architecture that scales effectively with growth – and the rise of Big Data analytics means that this issue needs to be addressed more urgently. Organisations need to look at creating a ‘high performance analytical environment’ that leverages in-database analytics, parallel processing as well as in-memory storage to deal with the increased volume, velocity and variety of data. Particularly, in terms of dealing with unstructured data, more attention needs to be paid to Hadoop – an open source software framework set up by Apache that allows for the distributed processing of large data sets across clusters of computers. However, there will be an ongoing tension between global standards and local requirements – and the use of Hadoop would be a good example of this. Another would be the ability to process mixed workloads (e.g. analytical and operational) in the same infrastructure environment such as the appliance that was mentioned earlier. CIOs need to consider ways in which they can deliver value in terms of solving specific business problems, while at the same time, being cognizant of global architecture standards and specifications. While certain global governance models will not allow for the usage of some of these technologies in a production environment, business expectations will force IT departments to re-assess the way the enterprise architecture agenda is utilised at a local level.


The bottom line here is that it is getting more complicated to process and analyse these large, complex and growing data sets – and it essentially requires a re-assessment of the broader information management strategy for the majority of organisations that have started their business analytics journey. But the impact is potentially enormous. If you look at optimising the price on every item in a global retail chain or detecting fraud in real time – you get a sense of the type of problems that Big Data analytics can be used to solve.

Table 3: Old World vs. New Era (Big Data Analytics)

Dimension | Old World | New Era
Data Sets | Predefined | All-encompassing and iterative
Data Velocity | Batch | Proactive and dynamic (real-time where appropriate)
Data Analysis | Predominantly Historic | Predictive, Forecasting & Optimisation

However, despite the clear potential of such analytics – it is important to understand that it will not necessarily be relevant or applicable to every use case. IDC believes that these use cases can be best mapped out across two of the Big Data dimensions – namely velocity and variety as outlined below:

Figure 4: Potential Use Cases for Big Data Analytics

Use cases plotted by data velocity (vertical axis, batch to real time) against data variety (horizontal axis: structured, semi-structured, unstructured), listed from the real-time end of the chart to the batch end:
Credit & Market Risk in Banks
Fraud Detection (Credit Card) & Financial Crimes (AML) in Banks (including Social Network Analysis)
Event-based Marketing in Financial Services and Telecoms
Markdown Optimization in Retail
Claims and Tax Fraud in Public Sector
Predictive Maintenance in Aerospace
Social Media Sentiment Analysis
Demand Forecasting in Manufacturing
Disease Analysis on Electronic Health Records
Traditional Data Warehousing
Text Mining
Video Surveillance/Analysis


A better sense of the potential impact of deploying Big Data analytics to drive high value impact can be derived by exploring these use cases in more detail:

Real-time Fraud Detection in Banks. Involves the ability to detect, prevent and manage fraud across multiple products, lines of business and channels for a bank. This requires the ability to capture the history for different types of entities (e.g. card, account, customer, terminal ID or IP address) involved in transactions, amplifying accuracy in detecting customer behaviours that fall outside the norm during point-of-sale (POS) transactions. This information can be used by multiple predictive models, for fraud detection and credit risk assessment.

Markdown Optimisation in Retail. The ability for retailers to optimise prices for a wide range of products in real time based on demand forecasting scenarios (that include the impact of promotions, seasonality and important calendar events) has a major impact on margins. These capabilities can also be augmented by social media sentiment analysis to ascertain customer demand for certain products on a more real-time basis.

Disease Analysis on Electronic Health Records. As healthcare services evolve, analysts can get hold of a patient’s entire medical history in electronic format. This will present a major opportunity for Big Data analytics. For example, in the case of a disease such as diabetes, the ability to correlate patient medical history with dietary data (potentially from market basket analysis in retail) and optimised exercise schedules will provide medical practitioners with new insights that they had only previously dreamt of.
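
The fraud detection use case above rests on keeping a history per entity and flagging behaviour that falls outside that entity's norm. The Python sketch below illustrates the idea with a simple z-score rule on transaction amounts. This rule is only a stand-in for the predictive models the text refers to, and the class name, thresholds and entity identifiers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


class EntityHistory:
    """Keeps past transaction amounts per entity (card, account, terminal ID,
    IP address, ...) and flags amounts far outside that entity's norm."""

    def __init__(self, min_history: int = 5, threshold: float = 3.0) -> None:
        self.min_history = min_history  # observations needed before scoring
        self.threshold = threshold      # z-score above which we flag
        self._amounts: dict[str, list[float]] = defaultdict(list)

    def is_outlier(self, entity_id: str, amount: float) -> bool:
        """Score the new amount against the entity's history, then record it."""
        history = self._amounts[entity_id]
        flagged = False
        if len(history) >= self.min_history:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0:
                flagged = abs(amount - mu) / sigma > self.threshold
            else:
                # All past amounts identical: any different amount is unusual.
                flagged = amount != mu
        history.append(amount)
        return flagged


detector = EntityHistory()
typical = [20.0, 25.0, 22.0, 19.0, 24.0, 23.0]
typical_flags = [detector.is_outlier("card-1", amount) for amount in typical]
large_flag = detector.is_outlier("card-1", 500.0)
print(typical_flags, large_flag)
```

In this example the six typical amounts are not flagged, while the 500.0 transaction is. A production system would combine many such signals across entity types and channels rather than relying on a single amount-based rule.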

The Skill Factor


As highlighted earlier, IDC believes that the real value from Big Data will be derived from the high-end
analytics that can be performed on the increasing volumes, velocity and variety of data that organisations
are generating. In Asia, most organisations are not aware of the type and level of skills that are required (the exceptions are some of the MNCs, where this is mainly being driven out of the US and Europe). IDC also believes
that this is linked to the general lack of awareness and skill available historically in the high-end analytics
arena (regardless of the Big Data phenomenon).

High-end analytics will require new sets of skills in two key categories:

Technical skills. For the new class of technologies required to process, discover and analyse these massive data sets that cannot be dealt with using traditional databases and architectures (i.e. in memory, Hadoop, MapReduce, Key Value Stores etc). Some of these technologies will be delivered as an appliance – and skills to better understand how the software interacts with the hardware to leverage the data will be required.

The new type of business analyst/statistician. One of the key differences between analytics in the ‘Old World’ and what we are dealing with in the Big Data era is that we are gathering data that we may or may not need – and from the perspective of analysis, this means ‘we don’t know what we don’t know’ – i.e. there is so much unstructured data that the variables and analytical models are likely to be entirely new. This means that there is a need to re-think the way the analytical power users approach their work by creating a ‘Sandbox Mentality’ where discovery is always the starting point. Generally, a background in data mining and statistics would be a good starting point for this type of analysis. Moving forward, there will be increasing demand for ‘data scientists’ – next-generation business analysts with strong statistical skills who are able to extract information from large data sets and then present value to non-analytical experts – but with the unique skill of understanding the new algorithms and analytical models that will have the most significant business impact in the short term. Globally, IDC is seeing a lot of interest in this more analytically inclined skill set. Roles and responsibilities have not been defined – but it basically fits in with the earlier comments in terms of ‘we don’t know what we don’t know’ – i.e. there is so much unstructured data that the variables and analytical models are likely to be entirely new. It requires a very ‘out-of-the-box’ type of thinking and creativity in terms of the analytics that needs to be done on these new data types and structures.

For example, if you look at the social media phenomenon (contributing to the semi-structured and unstructured data part of Big Data), many marketing departments are looking at ways to do sentiment and brand analysis based on what is being posted on Facebook, Twitter and YouTube (massive amounts as you can expect). This dynamic becomes more complex in Asia with local social media sites like RenRen in China and Nate in Korea. Currently, IT is not the first port of call for the chief marketing officer since it lacks the skills to understand what needs to be done (and in many cases, is still trying to work out what role it should play in the policy or governance of the use of social media). So the make-up of the IT department needs to be re-assessed in terms of technical, business and relationship skills.

The maturity model below highlights how IDC sees these skills (both technical and business) mapping out in the context of the organisations that have adopted business analytics over time – with a view to how this could evolve in the era of Big Data analytics:


Figure 5: The Big Data Analytics Maturity Model

Phases, running from Old World to New Era: Pilot, Departmental Analytics, Enterprise Analytics, Big Data Analytics

Staff Skills (IT)
  Pilot: Little or no expertise in analytics – basic knowledge of BI tools
  Departmental Analytics: Data warehouse team focused on performance, availability and security
  Enterprise Analytics: Advanced data modelers and stewards key part of the IT department
  Big Data Analytics: Business Analytics Competency Centre (BACC) that includes ‘data scientists’

Staff Skills (Business/IT)
  Pilot: Functional knowledge of BI tools
  Departmental Analytics: Few business analysts – limited usage of advanced analytics
  Enterprise Analytics: Savvy analytical modelers and statisticians utilised
  Big Data Analytics: Complex problem solving integrated into Business Analytics Competency Centre (BACC)

Technology & Tools
  Pilot: Simple historical BI reporting and dashboards
  Departmental Analytics: Data warehouse implemented, broad usage of BI tools, limited analytical data marts
  Enterprise Analytics: In database mining, and limited usage of parallel processing and analytical appliance
  Big Data Analytics: Widespread adoption of appliance for multiple workloads. Architecture and governance for emerging technologies

Financial Impact
  Pilot: No substantial financial impact. No ROI models in place
  Departmental Analytics: Certain revenue generating KPIs in place with ROI clearly understood
  Enterprise Analytics: Significant revenue impact (measured and monitored on a regular basis)
  Big Data Analytics: Business strategy and competitive differentiation is based on analytics

Data Governance
  Pilot: Little or none (Skunk works)
  Departmental Analytics: Initial data warehouse model and architecture
  Enterprise Analytics: Data definitions and models standardised
  Big Data Analytics: Clear master data management strategy

Line of Business (LOB)
  Pilot: Frustrated
  Departmental Analytics: Visible
  Enterprise Analytics: Aligned (including LOB executives)
  Big Data Analytics: Cross-departmental (with CEO visibility)

CIO Engagement
  Pilot: Hidden
  Departmental Analytics: Limited
  Enterprise Analytics: Involved
  Big Data Analytics: Transformative

% of Customers (IDC Estimates)
  Pilot: 20%
  Departmental Analytics: 65%
  Enterprise Analytics: 10%
  Big Data Analytics: 5%

In terms of capturing and developing the right skills in the era of Big Data analytics, the creation of a Business Analytics Competency Centre that sits across the business and IT departments will be critical. IDC believes that this type of structure not only clarifies the roles and responsibilities of key stakeholders for this transformation, it also drives internal visibility, provides a mechanism for education as well as bridging the IT/business gap (and the marketing and sales teams in particular – as key individuals from these departments will need to be represented) since improving decision making amongst front-office staff will be the primary focus of these projects.

In conjunction with the skills dimension, IDC believes that this structure should be involved in the following areas:
- Technology identification/deployment
- Business case creation and ROI justification
- Data governance frameworks with clear policies and guidelines around master data management, data quality and data models
- Ensure IT/Business alignment by involving the critical stakeholders at the right time
- Involve the CIO as the supporter of the necessary transformation from an IT perspective that will in turn create the necessary business impact

Very few organisations have reached the level of maturity that can truly harness the potential that Big Data analytics represents – and practically speaking, it is a major challenge to have ticked off all the relevant boxes, but this transformation is a necessary one in order for organisations to truly differentiate themselves in the current economic environment. The CIO (and the IT department) needs to play a critical role in this transformation. The next section highlights some suggestions that IDC believes should be taken into account in the context of this journey.


The CIO ‘Big Data Analytics’ Checklist


Architect for the Future. Historically, a lot of work in analytics has been focused on ‘workarounds’ due to the limited scalability of the underlying hardware. As a result, many IT departments would create materialised views or pre-calculated data structures so that business users could work off these without impacting the performance of the systems that were processing the underlying data. Clustering, parallel processing and in-memory technologies mean that all that underlying data can now be used in the analytical environment. However, it is important not to fall into the same trap of blindly adding capacity based on availability. There is a need to assess multiple delivery models (e.g. cloud, particularly for bursting capabilities; analytical appliances; and the traditional client/server or 3-tiered Web architecture approach) on a case-by-case basis, as one size will definitely not fit all.

Create a ‘Sandbox Mentality’. One of the key differences between analytics in the traditional old-school batch mode and what we are dealing with in the Big Data era is that we are gathering data that we may or may not need. From an analysis perspective, this means ‘we don’t know what we don’t know’: there is so much unstructured data that the variables and analytical models are likely to be entirely new. This means that there is a need to re-think the way that analytical power users go about developing their models by creating more of a ‘Sandbox Mentality’ where a discovery process is always the starting point, particularly in terms of drawing linkages between unstructured, semi-structured and structured data. As part of this, new types of skills will need to be brought on board to understand social media nuance (most likely from younger cohorts such as Gen Y, the Millennials, or Gen Z).

Not Too Much ‘Tinkering’. Whenever a new set of cool technologies hits the market, there is a tendency for IT departments to ‘tinker’, which delays the immediate business benefits. So while a certain amount of experimentation is a good thing (as outlined in the context of the ‘Sandbox Mentality’ highlighted earlier; Hadoop and MapReduce definitely fit into this category), CIOs need to be careful that not too much time is wasted on experimentation versus delivering business value.

Get the Team Right. The first step in this process involves the CIO assessing his/her own IT department to examine relevant skill levels and organisational structures. In some cases, it will necessitate an internal transformation to get the business to take notice of the change. It then requires that the right people are empowered to execute the IT analytics strategy, with the relevant processes and governance structures in place to enable them to deliver on business expectations effectively. Part of this will require the CIO to gain a much deeper understanding of the capabilities of the underlying analytics technology, but it will also involve working with LOB executives to hire the right type of analytically minded managers and knowledge workers who can make the best use of the underlying technological capabilities.

Take Analytics to the Enterprise. The majority of IT projects in this space have been focused on building a data warehouse combined with a variety of BI tools to surface the underlying information to the end users. However, in terms of sophisticated analytics functionality, the lack of IT skills has meant that these projects have been largely departmental and tactical in nature, leading to a ‘siloed’ mentality. As a result, assessing something such as risk-adjusted profitability (combining financial, credit scoring and customer data) would be impossible. This needs to change, and it requires a different level of IT/business collaboration to do so, with the CIO personally focused on an enterprise-wide approach in deploying analytics to ensure that these projects are successful.
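
To make the risk-adjusted profitability example from ‘Take Analytics to the Enterprise’ more concrete, the following is a minimal sketch of what combining the three data sources could look like. It is not taken from the IDC paper: the customer IDs, figures, field names and the LOSS_GIVEN_DEFAULT constant are all hypothetical, and the expected-loss formula (default probability × exposure × loss given default) is one common convention assumed here for illustration.

```python
# All data, field names and constants below are hypothetical illustrations.

# Financial data, as a finance department might hold it.
finance = {
    "C001": {"revenue": 12000.0, "cost": 7000.0},
    "C002": {"revenue": 9000.0, "cost": 4000.0},
}

# Credit scoring data, as a risk department might hold it.
credit = {
    "C001": {"default_probability": 0.02},
    "C002": {"default_probability": 0.25},
}

# Customer data, as a sales or CRM system might hold it.
customers = {
    "C001": {"exposure": 50000.0},
    "C002": {"exposure": 40000.0},
}

LOSS_GIVEN_DEFAULT = 0.5  # assumed share of exposure lost if a customer defaults


def risk_adjusted_profit(customer_id: str) -> float:
    """Accounting profit minus expected credit loss for one customer."""
    fin = finance[customer_id]
    default_probability = credit[customer_id]["default_probability"]
    exposure = customers[customer_id]["exposure"]
    expected_loss = default_probability * exposure * LOSS_GIVEN_DEFAULT
    return fin["revenue"] - fin["cost"] - expected_loss


# The calculation only works because all three sources share a customer key.
results = {cid: risk_adjusted_profit(cid) for cid in finance}
```

In this sketch both customers show the same accounting profit of 5,000, yet after the risk adjustment the first keeps roughly 4,500 while the second drops to about zero. The calculation is only possible when all three departmental data sets can be joined on a shared customer key, which is the enterprise-wide view that siloed projects lack.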


Governance and Enablement. This is where existing investments made in data warehousing technologies, if done correctly, will pay dividends. The data models and reference architecture that IT has in place will ensure that data definitions and standards are consistent across the various business departments. Further work needs to be done in the master data management (MDM) space in terms of bridging the operational and analytical gap around data governance, but fundamentally this platform should provide the necessary management and control that IT requires. When it comes to business enablement, IDC sees a new class of projects emerging that combines business analytics with business process management capabilities; more specifically, decision management software components that include tools for rule management, data mining, query and reporting, complex event processing (CEP), collaboration, BPM suites, search, and content analysis. IDC believes that IT departments that can complement previous investments in data warehousing and business intelligence technologies with a better understanding of the decision-making process in each of their organisations, and of the underlying decision management software, will be best placed to manage the IT governance versus business enablement dilemma.

Conclusion
Despite the varying levels of maturity and adoption of business analytics, businesses are definitely gearing
up for the utilisation of more advanced solutions and offerings in this space. In line with this, organisations
need to plan strategically and build a robust roadmap before adopting business analytics. The new
generation of business managers is more aware of the benefits of competing on business analytics and will
be looking to drive adoption of this technology area more aggressively. Moving forward, IDC believes that a
new approach is required to proactively ‘effect’ the necessary change, with a specific focus on the following
areas:

Elevating the status of the CIO to that of one with more transformative impact on the organisation
by playing an integral role in the deployment of the enterprise analytics strategy – and ensuring that
these technologies have the expected business impact

An assessment of alternative delivery models (such as appliances, in-memory technologies and Hadoop for Big
Data)

Capturing higher-level LOB attention and visibility as the next wave of business analytics projects is
integrated with complex event processing (CEP) and business activity monitoring (BAM) technologies
to drive a new class of projects that IDC defines as ‘decision management’

The CIO is gradually becoming much more important in the boardroom and is playing a key role
in purchase decisions for advanced applications such as business analytics. Moreover, the CIO and the
IT department need to leverage a broader set of business analytics capabilities to create a new information
management strategy that deals with the emerging Big Data dynamic and delivers improved
decision-making capabilities to business stakeholders across the organisation.

#AP14962U

ABOUT THIS PUBLICATION


This publication was produced by IDC Go-to-Market Services. IDC Go-to-Market Services makes IDC content available in a wide range of formats for
distribution by various companies. A license to distribute IDC content does not imply endorsement of or opinion about the licensee.

COPYRIGHT AND RESTRICTIONS


Any IDC information or reference to IDC that is to be used in advertising, press releases, or promotional materials requires prior written approval from
IDC. For permission requests, contact the GMS information line at 65-6829-7757 or gmsap@idc.com. Translation and/or localization of this document
requires an additional license from IDC.
For more information on IDC, visit www.idc.com. For more information on IDC GMS, visit www.idc.com/gms.
IDC Asia/Pacific, 80 Anson Road, #38-00 Fuji Xerox Towers, Singapore 079970. P. 65.6226.0330 F. 65.6220.6116 www.idc.com.
Copyright 2011 IDC. Reproduction is forbidden unless authorized. All rights reserved.
