
Advanced Analytics

Akhlak Hossain
ID:100421977
Computing and Mathematics, UoD
• Big Data and its types.
• Data Analytics and Business Intelligence.
• Key Technologies.
• Technological Aspects of Data Analytics.
• Analytical Applications.
• Benefits.
• Limitations.
• Which one to be used?
• Analytical Algorithm.
• Regression Analysis
• Benefits and Limitations.
• Decision Tree.
• Benefits and Limitations.
• Dataset.
• Conclusion.

Program Outline
• What is Big Data?
• Large volume of data
• Analyzed for better decisions and strategic business moves.

Big Data and its Types


• 13 V’s of Big Data:
• Validity
• Volatility
• Verbosity
• Vulnerability
• Verification
• Volume
• Velocity
• Variety
• Variability
• Value
• Veracity
• Visualization
• Viscosity

Big Data and its Types


Data analytics (DA) is the science of examining raw data with
the purpose of drawing conclusions about that information.
Data analytics is used in many industries to allow companies
and organizations to make better business decisions, and in the
sciences to verify or disprove existing models or theories.
Business intelligence (BI) is a technology-driven process for
analyzing data and presenting actionable information to help
corporate executives, business managers and other end users
make more informed business decisions.

Data Analytics and BI


Big data analytics is not confined to any single technology.
Advanced analytics can of course be applied to big data, but in
practice several types of technology work together to get the
most value out of the data. Here are some of them:
• Data management.
• Data mining
• Hadoop
• In-memory analytics
• Predictive analytics
• Text mining

Key Technologies
Big data affects organizations across practically every industry.
Some of the areas where data analytics is carried out are
discussed below.
• Banking.
• Education.
• Government.
• Healthcare.
• Manufacturing.
• Retail.

Technological Aspects of Data Analytics:


Banking:
• To understand customers and boost their satisfaction.
• To minimize risk by maintaining regulatory compliance.
• Technologies Used:
• Data Management.
• Predictive Analytics.
Education:
• To make sure students are making adequate progress.
• To evaluate the support of Teachers.
• Technologies Used:
• Data Management.
• Predictive Analytics.

Technological Aspects of Data Analytics:


Healthcare:
• Patient records.
• Treatment plans.
• Prescription information.
• Technologies Used:
• Data management.
• In-memory analytics.
• Predictive analytics.
Retail:
• To know what customers want and what they crave.
• To handle Transactions.
• Building Customer Relationship.
• Technologies Used:
• Hadoop.
• Data management.
• Predictive analytics.

Technological Aspects of Data Analytics:


Government:
• Managing utilities.
• Running agencies.
• Dealing with traffic congestion.
• Preventing crime.
• Technologies Used:
• Data management.
• Data mining.
• In-memory analytics.
• Predictive analytics.
Manufacturing:
• To boost quality and output
• To minimize waste
• To solve problems faster
• To make more agile business decisions
• Technologies Used:
• Data mining.
• In-memory analytics.
• Predictive analytics.

Technological Aspects of Data Analytics:


• Analytical software or applications are organization-based
software.
• Used to improve the performance of business operations.
• More precisely, these applications are a type of
business intelligence solution.

Analytical Applications
R
1. Cost-effective alternative.
2. Counterpart of SAS.
3. It is free and can be downloaded by anyone.
4. Relatively low-level programming; straightforward procedures can take longer code.
5. Largest online support.

SAS
1. Commercial software and hence not cheap.
2. Secure UI, and flexible for people who have some knowledge of SQL.
3. Well-analyzed upgrades, which make it easier to use.
4. Dedicated customer service that provides support with ease.
5. Around 15000 data can be based in SAS, the largest of all.

Analytical Applications
R
1. Good integration between the programming language and the statistical functions.
2. Relatively easy to integrate the application with other languages.
3. Packages include a wide variety of quantitative applications.
4. A large help network is instantly available on almost any topic related to the application.
5. Always up to date.

SAS
1. SAS has the ability to work on many platforms.
2. The software is reasonable to afford.
3. Sufficiently effective and flexible to meet a user's demands.
4. It is easy and straightforward to enter data and set up files.
5. Integrated framework with a common architecture that is shared by its modules.

Benefits of Analytical Applications


R
1. May not be able to hold large data sets as efficiently as SAS.
2. Documentation is imprecise and hard for a non-analyst to follow.
3. Rapidly consumes most of the available memory while the application is running.
4. Poor choice of tool for data mining.

SAS
1. It is primitive and can therefore seem a bit hard to use for first-time users.
2. No graphical possibilities.
3. Not cheap.
4. Not an open source application.

Limitations of Analytical Applications


• As I will be carrying out follow-up research, I would go for R,
as it is free to use and its logic is a bit easier.
• However, the TCO (Total Cost of Ownership) of using R
might actually be higher than that of SAS. For example, an
analytics company may decide to use R exclusively, figuring
that since it does not have to pay for SAS licenses, its cost
of project delivery will go down, leading to better profit
margins, lower billing to clients and better competitiveness
in the market; license fees, however, are only one part of
the total cost.

Which one to be used?


• What is an Analytical Algorithm?
• It is a set of heuristics and calculations that
builds a data-mining model from a given set of data.
• It evaluates the provided data, looking for distinct types of
patterns.
• It uses the outcomes of the analysis to extract significant
patterns and precise statistics.
• Why do we need to analyze algorithms?
• To classify problems and algorithms by difficulty, and to
anticipate performance.
• For a better understanding of, and improvements in,
implementations of algorithms.

Analytical Algorithm
• Study of the relationships between variables.
• Easy to use and applicable in many situations.
• One of the most commonly used tools for business analysis.
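
As a minimal sketch of what studying the relationship between two variables can look like in R, the example below fits a straight line to a small made-up data set with base R's lm(). The variable names (ad_spend, sales) and all numbers are hypothetical and only serve the illustration.

```r
# Hypothetical toy data: advertising spend and resulting sales.
# The values are made up so that sales = 2 * ad_spend + 1 exactly.
toy <- data.frame(
  ad_spend = c(1, 2, 3, 4, 5),
  sales    = c(3, 5, 7, 9, 11)
)

# Fit a simple linear regression: sales explained by ad_spend.
fit <- lm(sales ~ ad_spend, data = toy)

# Estimated intercept and slope of the fitted line.
coefs <- coef(fit)
print(coefs)

# Use the fitted model to predict sales for a new spend level.
new_point <- data.frame(ad_spend = 6)
predicted <- predict(fit, newdata = new_point)
print(predicted)
```

The slope describes how much the dependent variable changes per unit of the independent variable, which is the kind of relationship a business forecast would build on.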

Regression Analysis
Benefits
1. Uses analysis and research to foresee what is likely to happen in the following quarter or year.
2. Can provide an understanding of how changes in customer spending or shifts in the local economy will affect an organization.
3. Supports business decisions.
4. Can reduce a large amount of data to actionable information.
5. Provides new insight for managers by disclosing patterns and relationships that have not been noticed previously.

Limitations
1. Focuses on the relationship between dependent and independent variables.
2. Not correct in most cases.
3. Regression assumes that the data are independent.
4. This assumption is frequently, but not always, sensible.
5. It does not completely describe the relationship between variables.

Benefits and Limitations


• It breaks down a dataset into smaller and smaller subsets
while the decision tree is incrementally developed.
• The final result is a tree with decision nodes and leaf
nodes.
• A decision node has two or more branches.
• A leaf node represents a classification or decision.
• The topmost decision node in a tree, which corresponds to
the best predictor, is called the root node.
• Decision trees can handle both categorical and numerical
data.
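
The structure described above can be seen by growing a small tree in R. The sketch below is only an illustration under stated assumptions: it uses the rpart package (shipped as a recommended package with standard R installations), which implements CART-style recursive partitioning rather than ID3, and the built-in iris example data rather than the dataset of this report.

```r
# rpart is one of the "recommended" packages shipped with standard R.
library(rpart)

# iris is a built-in example data set: four numeric measurements
# plus a categorical Species label.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Print the tree: each line is a node, leaf nodes are marked with '*'.
print(fit)

# Classify the training rows with the fitted tree.
pred <- predict(fit, newdata = iris, type = "class")

# Share of rows whose predicted class matches the true class.
accuracy <- mean(pred == iris$Species)
print(accuracy)
```

In the printed tree, the first split is the root node and every line ending in an asterisk is a leaf node carrying a class label.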

Decision Tree
• The core algorithm for building decision trees is ID3, developed by
J. R. Quinlan.
• ID3 uses Entropy and Information Gain to construct a decision tree.
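
The two quantities named above can be sketched in a few lines of base R. This is an illustrative sketch, not a full ID3 implementation: the helper names (entropy, information_gain) and the toy weather/violation data are hypothetical. Entropy is computed as the negative sum of p * log2(p) over the class proportions p, and information gain as the entropy before a split minus the weighted entropy of the resulting subsets.

```r
# Shannon entropy (in bits) of a vector of class labels.
entropy <- function(labels) {
  p <- table(labels) / length(labels)  # class proportions
  p <- p[p > 0]                        # drop empty classes to avoid log2(0)
  -sum(p * log2(p))
}

# Information gain from splitting `labels` by the values of `attribute`:
# entropy before the split minus the weighted entropy of the subsets.
information_gain <- function(labels, attribute) {
  subsets <- split(labels, attribute)
  weights <- sapply(subsets, length) / length(labels)
  entropy(labels) - sum(weights * sapply(subsets, entropy))
}

# Hypothetical toy data: was there a violation, given the weather?
weather   <- c("sunny", "sunny", "rainy", "rainy")
violation <- c("yes",   "yes",   "no",    "no")

print(entropy(violation))                    # entropy of a 50/50 label set
print(information_gain(violation, weather))  # weather separates the labels fully
```

ID3 would evaluate the information gain of every candidate attribute at a node and split on the one with the highest value.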

Decision Tree (cont.)


Benefits
1. Can be utilised in parallel with other project management tools.
2. Works with continuous attributes.
3. A decision tree can easily be modified according to the information.
4. Decision tree documentation is easy to maintain.
5. A decision tree can also represent decision alternatives, conceivable results and risk events illustratively.

Limitations
1. The more decisions there are in a tree, the less accurate any expected outcomes are likely to be.
2. Large trees that include dozens of decision nodes can be convoluted and may have limited value.
3. An unrealistic decision tree could guide you toward a bad decision.
4. Unexpected events may alter decisions.

Benefits and Limitations


• The dataset has been downloaded from:
https://www.data.gov/
• This dataset contains police traffic enforcement activity.
It covers all types of enforcement activity, such as
parking violations, traffic violations, speed limit
violations etc., occurring in each state.
• The reason for choosing this dataset is to show how
violations have decreased and the impact on society.

Dataset
• For my proposed dataset, regression analysis has been
chosen.
• With the help of regression analysis I will be able to
predict the rate at which incidents are increasing and how
to decrease it.
• Moreover, it will also help me to identify in which state
incidents are taking place and at what hour they happen
most frequently.

Output and Recommendation


Variable Name   Type of Data   Description
Incnum          Numeric        Incident number
Inctype         String         Incident type
Inctypecode     Numeric        Incident type code
Dtrecieved      Numeric        Date received
Stnum           String         State number
Stname1         String         State name 1
Stname2         String         State name 2
Rate            Numeric        Number of incidents that took place
X               Numeric        Geographic coordinate X
Y               Numeric        Geographic coordinate Y

Attributes of Dataset
• Regression in R needs numeric (or factor) columns; a free-text
string column cannot be used as a number.
• In order to progress further we need to remove the string columns.
• Command used to remove the string column “inctype”:
• rti$inctype <- NULL
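
The effect of this command can be checked on a small stand-in data frame. The sketch below is an assumption-laden illustration: it builds a hypothetical three-row `rti` with a few of the columns from the attributes table (all values are made up), removes `inctype` by assigning NULL, and confirms that only numeric columns remain.

```r
# Hypothetical stand-in for the real data frame; values are made up.
rti <- data.frame(
  incnum      = c(101, 102, 103),
  inctype     = c("Parking Violation", "Traffic Violation", "Parking Violation"),
  inctypecode = c(1, 2, 1),
  rate        = c(12, 7, 9),
  stringsAsFactors = FALSE
)

# Inspect the column types before the change.
str(rti)

# Assigning NULL to a column removes it from the data frame.
rti$inctype <- NULL
print(names(rti))

# Check that only numeric columns remain.
all_numeric <- all(sapply(rti, is.numeric))
print(all_numeric)
```

If the incident type should stay available as a predictor, one alternative is to convert the column with factor() rather than remove it; the numeric code in `inctypecode` already carries the same information here.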

Further Breakdown
• In this report I have built up an understanding of the current
state of business intelligence, along with an introduction to
analytical algorithms and what they are.
• Analytical techniques such as decision trees and
regression analysis have been examined, along with the
benefits and limitations of each technique.
• Analytical applications such as SAS and R have been briefly
described, along with their benefits and limitations.
• Finally, a short description of the dataset for my next
paper has been included in the report.

Conclusion
