
Neural Network Model For Data Mining


Lubna Shaikh
Guided by Prof. Jitali Patel

January 2013

Computer Science Department

Seminar

Outline

1. Major Steps
2. Network Construction and Training
   - Is Neural Network Approach Appropriate?
   - Select Appropriate Paradigm
   - Select Input Data and Facts
   - Data Preparation
   - Training Strategies
3. Network Pruning
4. Rule Extraction
   - Rule Extraction Algorithm
5. Traditional Approaches vs. Neural Networks
6. Conclusion
7. References


The following are the major steps required to develop a Neural Network Model for Data Mining:

1. Network Construction and Training
2. Network Pruning
3. Rule Extraction
4. Knowledge Representation


Steps Involved

1. Is neural network approach appropriate?
2. Select appropriate paradigm
3. Select input data and facts
4. Prepare data
5. Train and test network
6. Use the network for data mining


Is Neural Network Approach Appropriate?

The neural network approach is appropriate when the problem shows one or more of these characteristics:
- Inadequate knowledge base
- Volatile knowledge base
- Data-intensive system
- Standard technology is inadequate
- Qualitative or complex quantitative reasoning is required
- Data is intrinsically noisy and error-prone
- Project development time is short and training time for the neural network is reasonable


Select Appropriate Paradigm


Decide the network architecture according to the general problem area:
- Classification
- Filtering
- Pattern recognition
- Optimization
- Data compression
- Prediction

Select the network size:
- Number of inputs
- Number of outputs
- Number of hidden layers
- Number of neurons per layer
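As an illustration of these size choices, the sketch below wires up a fully connected feed-forward network from the four numbers above. The specific sizes, the NumPy forward pass, and the tanh hidden units are illustrative assumptions, not choices made in the original slides.

```python
import numpy as np

# Hypothetical size choices: 8 inputs, 2 hidden layers of 5 neurons each, 1 output.
layer_sizes = [8, 5, 5, 1]

rng = np.random.default_rng(0)
# One weight matrix and one bias vector per pair of consecutive layers.
weights = [rng.normal(0, 0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    """Forward pass with tanh hidden units and a linear output layer."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(x @ w + b)
    return x @ weights[-1] + biases[-1]

print(forward(np.ones(8)))
```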

Select Appropriate Paradigm

- Decide on the learning method
- Decide on the transfer function
- Decide on the nature of the input/output
- Decide on the type of training used


Data Set Considerations

- Size
- Noise
- Knowledge domain representation
- Training set and test set
- Insufficient data
- Coding the input data


Data Set Size

The optimal size of the training set depends on the type of network used, and it is relatively large. A rule of thumb for backpropagation networks:

Training set size = (number of hidden layers / testing tolerance) + number of input neurons
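As a worked example of applying this rule (the numbers are invented for illustration, not taken from the slides): a backpropagation network with 2 hidden layers, a testing tolerance of 0.1, and 10 input neurons would call for roughly 2 / 0.1 + 10 = 30 training facts.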


Noise and Knowledge Domain Representation

For backpropagation networks, training is often more successful when the data contain some noise. The training set should contain a good representation of the entire universe of the domain, which may increase the number of training facts and in turn cause the network size to change.


Selection of Variables

Reduce the size of the input data without degrading the performance of the network, using:
- Principal Component Analysis
- A manual method
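A minimal sketch of the first option, reducing the input dimensionality with PCA before the data reach the network; the random data set and the choice of 5 components are assumptions for illustration, not values from the original slides.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical raw input data: 200 samples, 20 candidate input variables.
rng = np.random.default_rng(42)
X_raw = rng.normal(size=(200, 20))

# Keep the 5 principal components with the most variance,
# shrinking the number of network inputs from 20 to 5.
pca = PCA(n_components=5)
X_reduced = pca.fit_transform(X_raw)

print(X_reduced.shape)                      # (200, 5)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```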


Insufficient Data

Scarce data makes the allocation of the data into a training set and a testing set critical. Common remedies:

- Rotation scheme: the data set has N facts; set aside one fact and train the system with the remaining N-1 facts, then set aside another fact and retrain with the other N-1 facts, repeating the process N times (see the sketch after this list).
- Made-up data: include made-up data; the idea of bootstrapping is also used. A decision must be made as to whether the distribution of the original data should be maintained.
- Expert-made data.
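A minimal sketch of the rotation scheme, training on N-1 facts and testing on the single fact that was set aside, for every fact in turn. The tiny synthetic data set and the scikit-learn MLPClassifier are stand-ins chosen for illustration, not part of the original slides.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.neural_network import MLPClassifier

# Hypothetical scarce data set: N = 12 facts with 4 input attributes each.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

scores = []
for train_idx, test_idx in LeaveOneOut().split(X):
    # Train on N-1 facts, test on the one fact that was set aside.
    net = MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000, random_state=0)
    net.fit(X[train_idx], y[train_idx])
    scores.append(net.score(X[test_idx], y[test_idx]))

print("rotation-scheme accuracy:", np.mean(scores))
```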

Coding the Input Data

The training data set should be properly normalized and should match the design of the network. Functions used:
- Zero-mean, unit-variance (z-score)
- Min-max
- Cut-off
- Sigmoidal
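A minimal sketch of the first two scalings in NumPy; the sample values and function names are invented for illustration and are not taken from the original slides.

```python
import numpy as np

def zscore(x):
    """Zero-mean, unit-variance scaling of a 1-D array."""
    return (x - x.mean()) / x.std()

def min_max(x, low=0.0, high=1.0):
    """Linear rescaling of a 1-D array into the range [low, high]."""
    return low + (x - x.min()) * (high - low) / (x.max() - x.min())

raw = np.array([2.0, 4.0, 6.0, 10.0])
print(zscore(raw))
print(min_max(raw))
```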


Data Preparation

In a distributed data representation, the qualities that define a unique pattern are spread out over more than one neuron. For example, a purple object can be described as being half red and half blue: the two neurons assigned to red and blue can together define purple, eliminating the need to assign a third, purple neuron.
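A tiny sketch of this distributed encoding; the colour values are invented purely to mirror the example above.

```python
# Input neurons: [red, blue]. A local (one-neuron-per-colour) encoding would need
# a third neuron for purple; the distributed encoding reuses the first two.
encodings = {
    "red":    [1.0, 0.0],
    "blue":   [0.0, 1.0],
    "purple": [0.5, 0.5],  # half red, half blue
}
print(encodings["purple"])
```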


Training Strategies

The main intention of training is not to memorize the examples of the training set, but to build a general model of the input/output relationships based on the training examples.

Generalisation: a general model means that the input/output relationships derived from the training set apply equally well to new sets of data from the same problem that were not included in the training set. The main goal of a neural network is thus the generalization to new data of the relationships learned on the training set.

Overfitting: too much training can cause the network to memorize all the examples of the training set together with their noise, errors, and inconsistencies, and therefore generalize poorly on new data.
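One common safeguard against the overfitting described above (early stopping, not named in the original slides) is to hold out part of the training data and stop training when the error on the held-out portion stops improving. A minimal sketch using scikit-learn's built-in support for this, with an invented regression data set:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=300)

# 10% of the training data is held out; training stops once the held-out
# error no longer improves, limiting memorization of noise.
net = MLPRegressor(hidden_layer_sizes=(10,), early_stopping=True,
                   validation_fraction=0.1, n_iter_no_change=10,
                   max_iter=2000, random_state=0)
net.fit(X, y)
print("stopped after", net.n_iter_, "iterations")
```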

Network Dimension

The overfitting problem depends on the model size: the number of free parameters, the number of constraints, and the number of independent training examples. A rule of thumb for obtaining good generalization is to use the smallest network that fits the training data. Besides the better expected generalization, a small network is also faster to train.


Network Pruning

Pruning algorithms: the general pruning approach consists of training a relatively large network and gradually removing either weights or complete units that seem not to be necessary. The large initial size allows the network to learn quickly and with a lower sensitivity to initial conditions and local minima; the reduced final size helps to improve generalization. There are basically two ways of reducing the size of the original network.


Network Pruning

- Sensitivity methods: after learning, the sensitivity of the error function to the removal of every element (unit or weight) is estimated, and the element with the least effect can be removed.
- Penalty-term methods: weight-decay terms are added to the error function to reward the network for choosing efficient solutions, i.e., networks with small weight values are favoured. At the end of the learning process, the weights with the smallest values can be removed; even if they are not, a network with several weights close to 0 already acts as a smaller system.
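A minimal sketch of the penalty-term idea: fit with a weight-decay (L2) penalty by gradient descent, then zero out the smallest-magnitude weights. The linear model, the data, and the 0.1 pruning threshold are stand-ins chosen for illustration, not from the original slides.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
true_w = np.array([2.0, -1.5, 0.0, 0.0, 0.5, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=200)

w = np.zeros(6)
lr, decay = 0.01, 0.01
for _ in range(2000):
    # Gradient of mean squared error plus the weight-decay penalty term.
    grad = X.T @ (X @ w - y) / len(y) + decay * w
    w -= lr * grad

# Prune: weights driven close to zero by the penalty can be removed outright.
pruned = np.where(np.abs(w) < 0.1, 0.0, w)
print("trained:", np.round(w, 3))
print("pruned: ", np.round(pruned, 3))
```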


Rule extraction

Rule extraction extracts classification rules from the pruned network. The rules generated are of the form:

if (a1 θ1 v1) and (a2 θ2 v2) and ... and (an θn vn) then Cj

where the ai are the attributes of an input tuple, the vi are constants, the θi are relational operators (=, <, >, !=), and Cj is one of the class labels.
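For instance, a rule in this form extracted from a hypothetical credit-screening network might read (attribute names and thresholds invented for illustration): if (age > 30) and (income > 50000) and (defaults = 0) then C1.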


Difficulties in defining relationships:

- The links may still be too many to express the relationship between an input tuple and its class label in the form of if-then rules. If a network has n input links with binary values, there could be as many as 2^n distinct input patterns, so the rules could be quite lengthy or complex even for a small n.
- The activation value of a hidden unit could be anywhere in the range [-1, 1], depending on the input tuple, which makes it difficult to derive an explicit relationship between the continuous activation values of the hidden units and the output values of a unit in the output layer.

Rule Extraction Algorithm (RX)


1. Apply a clustering algorithm to find clusters of hidden-node activation values.
2. Enumerate the discretized activation values and compute the network outputs.
3. Generate rules that describe the network outputs in terms of the discretized hidden-unit activation values.
4. For each hidden unit, enumerate the input values that lead to each discretized activation value and generate a set of rules describing the hidden unit's discretized values in terms of the inputs.
5. Merge the two sets of rules obtained in the previous two steps to obtain rules that relate the inputs and the outputs.
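A rough sketch of the flavour of steps 1-3, assuming the hidden activations of an already-trained network are available as an array: each hidden unit's activations are clustered into 3 discrete levels and a small decision tree then expresses the outputs in terms of those levels. The data, the number of levels, and the use of a decision tree are illustrative choices, not the exact RX procedure.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Stand-ins for a trained network: hidden activations in [-1, 1] and class labels.
hidden_acts = np.tanh(rng.normal(size=(100, 2)))
labels = (hidden_acts.sum(axis=1) > 0).astype(int)

# Step 1: cluster each hidden unit's activation values into a few discrete levels.
discrete = np.column_stack([
    KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(hidden_acts[:, [j]])
    for j in range(hidden_acts.shape[1])
])

# Steps 2-3: describe the network output in terms of the discretized activations.
tree = DecisionTreeClassifier(max_depth=3).fit(discrete, labels)
print(export_text(tree, feature_names=["h0_level", "h1_level"]))
```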

Clustering

The 1st step of RX clusters the activation values of the hidden units into a manageable number of discrete values, without sacrificing the classification accuracy of the network. A neural-network-based clustering method represents each cluster as an exemplar, which acts as a prototype; new objects can be assigned to the cluster whose exemplar is most similar, based on some distance measure. The neural network approach to clustering has strong theoretical links with actual brain processing. Two such methods:
- Competitive learning
- Self-organizing feature maps

Competitive Learning

- Hierarchical architecture of artificial neurons arranged in layers of clusters.
- Learning proceeds in a winner-takes-all fashion: the winning unit within each cluster becomes active (the filled circles in the figure) while the others are inactive.
- Connections between layers are excitatory: inputs are received from lower levels.
- The units within a cluster compete to respond to the pattern that is output from the layer below.


Competitive Learning

- Connections within layers are inhibitory, so that only one unit in a given cluster may be active.
- The winning unit adjusts the weights on its connections with the other units in the cluster so that it will respond more strongly to the same pattern in the future.
- The number of clusters and the number of units per cluster are input parameters.
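A minimal sketch of the winner-takes-all weight update for a single layer of competing units; the two-dimensional data, the number of units, and the learning rate are arbitrary illustrative choices rather than values from the original slides.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))      # stream of input patterns
n_units, lr = 4, 0.05
W = rng.normal(size=(n_units, 2))     # one weight vector per competing unit

for x in data:
    # The unit whose weight vector is closest to the input wins the competition.
    winner = np.argmin(np.linalg.norm(W - x, axis=1))
    # Only the winner updates its weights, moving toward the input pattern.
    W[winner] += lr * (x - W[winner])

print(np.round(W, 2))   # learned exemplars (cluster prototypes)
```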


Competitive Learning

Figure: Competitive Learning



Self-Organizing Feature Maps (SOMs)


The learning algorithm in an SOM still follows the competitive model, but the updating rule produces an output layer in which the topology of the patterns in the input space is preserved. That means that if patterns x_r and x_s are close in the input space (close on the basis of the similarity measure adopted in the winner-takes-all rule), the corresponding firing neural units are topologically close in the network layer. A network that performs such a mapping is called a feature map. Feature maps not only group input patterns into clusters, but also visually describe the relationships among these clusters in the input space.

Self-Organizing Feature Maps (SOMs)


A Kohonen map usually consists of a two-dimensional array of neurons, fully connected with the input vector and without lateral connections, arranged on a square or hexagonal lattice. The topology-preserving property is obtained by a learning rule that involves the winner unit and its neighbours in the weight-updating process. As a consequence, close neurons in the output layer learn to fire for input vectors with similar characteristics. During training, the network assigns to firing neurons a position on the map, based on the dominant feature of the activating input vector. For this reason, Kohonen maps are also called Self-Organizing Maps (SOMs).
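A minimal sketch of the Kohonen updating rule just described, where the winner and its lattice neighbours all move toward the input; the 5x5 grid, learning rate, and Gaussian neighbourhood width are illustrative assumptions, not values given in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = 5                                  # 5x5 lattice of neurons
W = rng.normal(size=(grid, grid, 2))      # each neuron holds a 2-D weight vector
coords = np.indices((grid, grid)).transpose(1, 2, 0).astype(float)

lr, sigma = 0.1, 1.0
for x in rng.normal(size=(1000, 2)):      # stream of 2-D input patterns
    # Winner: the neuron whose weight vector is closest to the input.
    dists = np.linalg.norm(W - x, axis=2)
    wi = np.unravel_index(np.argmin(dists), dists.shape)
    # Neighbourhood: Gaussian falloff with lattice distance from the winner.
    lattice_dist = np.linalg.norm(coords - np.array(wi, dtype=float), axis=2)
    h = np.exp(-lattice_dist**2 / (2 * sigma**2))
    # Winner and neighbours move toward the input, weighted by the neighbourhood.
    W += lr * h[..., None] * (x - W)

print(np.round(W[0, 0], 2))
```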

Self-Organizing Feature Maps (SOMs)

Figure: Self-Organizing Maps (SOMs)



Rule Extraction Algorithm

The 2nd step is to relate these discretized activation values to the activation values of the output layer, i.e., the class labels. The 3rd step is to relate the discretized activation values to the attribute values at the input nodes connected to each hidden node. The input to this rule-generation step is a set of discrete patterns with their class labels, and it produces the rules describing the relationship between the patterns and their class labels.


Knowledge Extraction
One of the reasons for the success of neural networks is their ability to develop an internal representation of the knowledge necessary to solve a given problem. However, this internal knowledge representation is very difficult to understand and to translate into symbolic knowledge, due to its distributed nature: at the end of the learning process, the network's knowledge is spread over all of its weights and units. In addition, even if a translation into symbolic rules is possible, it might not have physical meaning, because the network's computation does not take into account the physical ranges of the input variables.

Knowledge Extraction
Generally, the internal knowledge representation of neural networks presents a very low degree of human comprehensibility and, for this reason, has often been described as opaque to the outside world. This lack of insight into how decisions are made inside a neural network definitely represents a strong limitation for the application of ANNs to intelligent data analysis, since several real-world applications need an explanation of how a given decision is reached.


Advantages of Neural Networks


- High accuracy: neural networks are able to approximate complex non-linear mappings.
- Noise tolerance: neural networks are very flexible with respect to incomplete, missing, and noisy data.
- Independence from prior assumptions: neural networks can be updated with fresh data, making them useful for dynamic environments.
- Hidden nodes in supervised neural networks can be regarded as latent variables.
- Neural networks can be implemented in parallel hardware.

Disadvantages of Neural Networks


One of the main drawbacks of ANN paradigms is the lack of criteria for the a priori definition of the optimal network size for a given task. The space generated by all possible ANN structures of different sizes for a selected ANN paradigm can then become the object of other data analysis techniques; genetic algorithms, for example, have recently been applied to this problem to build a population of good ANN architectures with respect to a given task.


Disadvantages of Neural Networks


An ANN's decision process remains quite opaque, and a translation into meaningful symbolic knowledge is hard to perform. By contrast, fuzzy systems are usually appreciated for the transparency of their decision algorithms. The combination of the ANN approach and fuzzy logic has produced hybrid architectures called neuro-fuzzy networks, in which learning rules are no longer constrained to traditional crisp logic but exploit the linguistic power of fuzzy logic.


Traditional Approaches of Data Mining vs. Neural Networks


Foundation: Logic vs. Brain

- Traditional approach: simulates and formalizes the human reasoning and logic process. It treats the brain as a black box and focuses on how the elements of reasoning are related to each other and on how to give the machine the same capabilities.
- Neural networks: simulate the intelligence functions of the brain. They focus on modelling the brain structure, attempting to create a system that functions like the brain because its structure is similar to the structure of the brain.


Processing Techniques: Sequential vs. Parallel

- Traditional approach: the processing method is inherently sequential.
- Neural networks: the processing method is inherently parallel; each neuron in a neural network functions in parallel with the others.


Learning: Static and External vs. Dynamic and Internal

- Traditional approach: learning takes place outside of the system; the knowledge is obtained outside the system and then coded into it.
- Neural networks: learning is an integral part of the system and its design; knowledge is stored as the strengths of the connections among the neurons, and it is the job of the network to learn these weights from the data set presented to it.


Reasoning Method: Deductive vs. Inductive

- Traditional approach: deductive in nature; using the system involves a deductive reasoning process, applying generalized knowledge to a given case.
- Neural networks: inductive in nature; the network constructs an internal knowledge base from the data presented to it and generalizes from the data, so that when presented with a new set of data it can make a decision based on the generalized internal knowledge.


Knowledge Representation: Explicit vs. Implicit

- Traditional approach: represents knowledge in an explicit form, so rules and relationships can be inspected and altered.
- Neural networks: knowledge is stored implicitly as interconnection strengths among neurons; nowhere in the system can one point to a piece of computer code or a numerical value as a discernible piece of knowledge.


Conclusion

A study of the neural-network-based data mining process shows that neural networks are well suited to data mining problems because of their robustness, self-organizing adaptivity, parallel processing, distributed storage, and high degree of fault tolerance. Combining data mining methods with neural network models can greatly improve the efficiency of data mining, and the combination has been widely used. One remaining issue is to reduce the training time of neural networks; the speed of network training can be improved by developing fast algorithms. Even so, the time required to extract rules with the neural network approach is still longer than the time needed by the decision-tree-based approach.

References

- Hongjun Lu, Rudy Setiono, and Huan Liu, "Effective Data Mining Using Neural Networks," IEEE Transactions on Knowledge and Data Engineering.
- Hamid Nemati, "Introduction to Data Mining Using Artificial Neural Networks."
- Kurt Thearling, "An Introduction to Data Mining," www.thearling.com.
- Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques," Morgan Kaufmann, 2001.
- J. A. Anderson, "An Introduction to Neural Networks," Prentice Hall, 2003.

- http://www.mathworks.in/products/neural network/description2.html
