
25/11/2014

ID3 algorithm - Wikipedia, the free encyclopedia

ID3 algorithm
From Wikipedia, the free encyclopedia

In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan[1] used
to generate a decision tree from a dataset. ID3 is the precursor to the C4.5 algorithm, and is typically used
in the machine learning and natural language processing domains.

Contents
1 Algorithm
  1.1 Summary
  1.2 Pseudocode
  1.3 Properties
  1.4 Usage
2 The ID3 metrics
  2.1 Entropy
  2.2 Information gain
3 See also
4 References
5 External links

Algorithm
The ID3 algorithm begins with the original set S as the root node. On each iteration of the algorithm, it
iterates through every unused attribute of the set S and calculates the entropy H(S) (or information gain
IG(A, S)) of that attribute. It then selects the attribute which has the smallest entropy (or largest
information gain) value. The set S is then split by the selected attribute (e.g. age < 50, 50 <= age < 100,
age >= 100) to produce subsets of the data. The algorithm continues to recurse on each subset, considering
only attributes never selected before.
Recursion on a subset may stop in one of these cases:
- every element in the subset belongs to the same class (+ or -); the node is then turned into a leaf and
  labelled with the class of the examples
- there are no more attributes to be selected, but the examples still do not belong to the same class
  (some are + and some are -); the node is then turned into a leaf and labelled with the most common
  class of the examples in the subset
- there are no examples in the subset; this happens when no example in the parent set matched a
  specific value of the selected attribute, for example if there was no example with age >= 100. A leaf
  is then created and labelled with the most common class of the examples in the parent set.
Throughout the algorithm, the decision tree is constructed with each non-terminal node representing the
selected attribute on which the data was split, and terminal nodes representing the class label of the final
subset of this branch.

Summary
1. Calculate the entropy of every attribute using the data set S
2. Split the set S into subsets using the attribute for which entropy is minimum (or, equivalently,
   information gain is maximum)
3. Make a decision tree node containing that attribute
4. Recurse on subsets using remaining attributes.
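
Steps 1 and 2 of the summary can be sketched in Python as below. This is only an illustration under stated assumptions: rows are plain dictionaries, and the helper names (`entropy`, `split_entropy`, `best_attribute`) and the small weather-style sample are hypothetical, not part of the original article.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split_entropy(rows, attribute, target):
    """Weighted average entropy of the subsets obtained by splitting on `attribute`."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row[target])
    return sum(len(s) / len(rows) * entropy(s) for s in subsets.values())


def best_attribute(rows, attributes, target):
    """Pick the attribute whose split leaves the least entropy (step 2)."""
    return min(attributes, key=lambda a: split_entropy(rows, a, target))


# Hypothetical sample: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "-"},
    {"outlook": "sunny", "windy": "yes", "play": "-"},
    {"outlook": "rain", "windy": "no", "play": "+"},
    {"outlook": "rain", "windy": "yes", "play": "+"},
]
chosen = best_attribute(rows, ["outlook", "windy"], "play")
```

In this sample, splitting on "outlook" leaves two pure subsets, so it is the attribute selected.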

Pseudocode
ID3 (Examples, Target_Attribute, Attributes)
    Create a root node for the tree
    If all examples are positive, Return the single-node tree Root, with label = +.
    If all examples are negative, Return the single-node tree Root, with label = -.
    If the set of predicting attributes is empty, then Return the single-node tree Root,
    with label = most common value of the target attribute in the examples.
    Otherwise Begin
        A ← The Attribute that best classifies examples.
        Decision Tree attribute for Root = A.
        For each possible value, v_i, of A,
            Add a new tree branch below Root, corresponding to the test A = v_i.
            Let Examples(v_i) be the subset of examples that have the value v_i for A
            If Examples(v_i) is empty
                Then below this new branch add a leaf node with label = most common target value in the examples
            Else below this new branch add the subtree ID3 (Examples(v_i), Target_Attribute, Attributes - {A})
    End
    Return Root
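
The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration rather than a reference implementation: the nested-dictionary tree representation, the `values` argument listing every possible value of each attribute, and the sample rows are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(rows, target):
    """Entropy (in bits) of the target labels in `rows`."""
    counts = Counter(row[target] for row in rows)
    return -sum(n / len(rows) * log2(n / len(rows)) for n in counts.values())


def most_common(rows, target):
    """Most frequent target label in `rows`."""
    return Counter(row[target] for row in rows).most_common(1)[0][0]


def id3(rows, target, attributes, values):
    """Return a class label (leaf) or {"attribute": A, "branches": {value: subtree}}."""
    labels = {row[target] for row in rows}
    if len(labels) == 1:          # all examples share one class
        return labels.pop()
    if not attributes:            # no attributes left to test
        return most_common(rows, target)

    def remaining_entropy(attribute):
        # Weighted entropy of the non-empty subsets after splitting on `attribute`.
        total = 0.0
        for value in values[attribute]:
            subset = [row for row in rows if row[attribute] == value]
            if subset:
                total += len(subset) / len(rows) * entropy(subset, target)
        return total

    best = min(attributes, key=remaining_entropy)
    rest = [a for a in attributes if a != best]
    branches = {}
    for value in values[best]:
        subset = [row for row in rows if row[best] == value]
        if not subset:            # empty subset: fall back to the parent's majority class
            branches[value] = most_common(rows, target)
        else:
            branches[value] = id3(subset, target, rest, values)
    return {"attribute": best, "branches": branches}


# Hypothetical sample; "snow" never occurs, which exercises the empty-subset case.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "+"},
    {"outlook": "sunny", "windy": "yes", "play": "+"},
    {"outlook": "sunny", "windy": "yes", "play": "+"},
    {"outlook": "rain", "windy": "no", "play": "+"},
    {"outlook": "rain", "windy": "yes", "play": "-"},
    {"outlook": "rain", "windy": "yes", "play": "-"},
]
values = {"outlook": ["sunny", "rain", "snow"], "windy": ["no", "yes"]}
tree = id3(rows, "play", ["outlook", "windy"], values)
```

Unlike the pseudocode, this sketch handles any number of classes rather than only + and -, and it breaks ties between equally good attributes by list order.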

Properties
ID3 does not guarantee an optimal solution; it can get stuck in local optima. It uses a greedy approach by
selecting the best attribute to split the dataset on each iteration. One improvement that can be made to the
algorithm is to use backtracking during the search for the optimal decision tree.
ID3 can overfit to the training data. To avoid overfitting, smaller decision trees should be preferred over
larger ones. This algorithm usually produces small trees, but it does not always produce the smallest
possible tree.
ID3 is harder to use on continuous data. If the values of any given attribute are continuous, then there are
many more places to split the data on this attribute, and searching for the best value to split by can be time
consuming.
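
To illustrate why continuous attributes are costly, the sketch below searches for a binary split threshold by trying the midpoint between every pair of consecutive distinct values. Threshold splitting of this kind is a technique used by successors such as C4.5, not part of ID3 itself; the function name `best_threshold` and the sample ages are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_entropy) for the best split `value < threshold`.

    Every midpoint between consecutive distinct values is a candidate, and each
    candidate requires a pass over the data, hence the cost on large datasets.
    Returns None when there are fewer than two distinct values.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [c for v, c in pairs if v < threshold]
        right = [c for v, c in pairs if v >= threshold]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical sample: the classes separate cleanly between ages 47 and 52.
ages = [22, 35, 47, 52, 61, 70]
classes = ["-", "-", "-", "+", "+", "+"]
threshold, score = best_threshold(ages, classes)
```

With n distinct values there are n - 1 candidate thresholds, compared with a single multiway split for a categorical attribute.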

Usage
The ID3 algorithm is used by training on a dataset S to produce a decision tree which is stored in memory.
At runtime, this decision tree is used to classify new unseen test cases by working down the decision tree
using the values of this test case to arrive at a terminal node that tells you what class this test case belongs
to.
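
Working down a stored tree can be sketched as below. The nested-dictionary layout (internal nodes as `{"attribute": ..., "branches": {...}}`, leaves as plain class labels) and the sample tree are hypothetical choices made for this illustration; any in-memory tree structure would serve.

```python
def classify(tree, case):
    """Follow the branch matching the case's attribute value until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        value = case[node["attribute"]]
        node = node["branches"][value]  # raises KeyError for a value unseen in training
    return node


# Hypothetical tree, as it might look after training.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": "+",
        "rain": {"attribute": "windy", "branches": {"no": "+", "yes": "-"}},
    },
}
label = classify(tree, {"outlook": "rain", "windy": "yes"})
```

Each test case visits at most one node per attribute, so classification cost is bounded by the depth of the tree.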

The ID3 metrics
Entropy
Entropy H(S) is a measure of the amount of uncertainty in the (data) set S (i.e. entropy characterizes the
(data) set S).

$$H(S) = -\sum_{x \in X} p(x) \log_2 p(x)$$

Where,
- S: the current (data) set for which entropy is being calculated (changes every iteration of the ID3
  algorithm)
- X: set of classes in S
- p(x): the proportion of the number of elements in class x to the number of elements in set S

When H(S) = 0, the set S is perfectly classified (i.e. all elements in S are of the same class).

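In ID3, entropy is calculated for each remaining attribute, that is, the weighted entropy of the subsets
produced by splitting on that attribute. The attribute with the smallest entropy is used to split the set S
on this iteration. The higher the entropy, the higher the potential to improve the classification here.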
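
As a small worked example, the entropy of a set can be computed by summing -p(x) log2 p(x) over its classes. The function name and the sample label lists below are assumptions chosen for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """H(S): sum of -p(x) * log2(p(x)) over the classes x present in `labels`."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


mixed = entropy(["+"] * 9 + ["-"] * 5)  # 9 positive and 5 negative: about 0.940 bits
even = entropy(["+", "-"])              # evenly split two-class set: 1 bit
pure = entropy(["+", "+", "+"])         # single class: equals 0, perfectly classified
```

For two classes, entropy peaks at 1 bit when the classes are evenly split and falls to 0 as the set becomes pure.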

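Information gain
Information gain IG(A, S) is the measure of the difference in entropy from before to after the set S is split
on an attribute A. In other words, how much uncertainty in S was reduced after splitting set S on attribute
A.

$$IG(A, S) = H(S) - \sum_{t \in T} p(t) H(t)$$

Where,
- H(S): entropy of set S
- T: the subsets created from splitting set S by attribute A, such that $S = \bigcup_{t \in T} t$
- p(t): the proportion of the number of elements in t to the number of elements in set S
- H(t): entropy of subset t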

In ID3, information gain can be calculated (instead of entropy) for each remaining attribute. The attribute
with the largest information gain is used to split the set S on this iteration.
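
Information gain can be computed as the entropy of the set before the split minus the weighted entropy of the subsets after it. The sketch below assumes rows stored as dictionaries; the function names and the five-row sample are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the whole set minus the weighted entropy of its subsets."""
    before = entropy([row[target] for row in rows])
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row[target])
    after = sum(len(t) / len(rows) * entropy(t) for t in subsets.values())
    return before - after


# Hypothetical sample: "size" determines the label, "color" is nearly uninformative.
rows = [
    {"size": "small", "color": "red", "label": "+"},
    {"size": "small", "color": "blue", "label": "+"},
    {"size": "small", "color": "red", "label": "+"},
    {"size": "large", "color": "red", "label": "-"},
    {"size": "large", "color": "blue", "label": "-"},
]
gain_size = information_gain(rows, "size", "label")    # about 0.971 bits
gain_color = information_gain(rows, "color", "label")  # about 0.020 bits
```

Because the entropy before the split is the same for every attribute, choosing the largest gain picks the same attribute as choosing the smallest weighted entropy after the split.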

See also
CART
C4.5 algorithm

References
1. Quinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81-106.

Mitchell, Tom M. Machine Learning. McGraw-Hill, 1997. pp. 55-58.
Grzymala-Busse, Jerzy W. "Selected Algorithms of Machine Learning from Examples." Fundamenta
Informaticae 18, (1993): 193-207.

External links
Seminars - http://www2.cs.uregina.ca/
(http://www2.cs.uregina.ca/~hamilton/courses/831/notes/ml/dtrees/4_dtrees1.html)
Description and examples - http://www.cise.ufl.edu/
(http://www.cise.ufl.edu/~ddd/cap6635/Fall97/Shortpapers/2.htm)
Description and examples - http://www.cis.temple.edu/
(http://www.cis.temple.edu/~ingargio/cis587/readings/id3c45.html)
An implementation of ID3 in Python
(http://www.onlamp.com/pub/a/python/2006/02/09/ai_decision_trees.html)
An implementation of ID3 in Ruby (http://ai4r.org/machineLearning.html)
An implementation of ID3 in Common Lisp (http://www.pvv.ntnu.no/~oyvinht/static/OSS/clid3/)
An implementation of ID3 algorithm in C# (http://www.codeproject.com/cs/algorithms/id3.asp)
An implementation of ID3 in Perl (https://metacpan.org/module/AI::DecisionTree)
An implementation of ID3 in Prolog (http://ftp.cs.stanford.edu/cs/robotics/shoham/prolog.tar.Z)
An implementation of ID3 in C (the code comments are not in English)
(http://id3alg.altervista.org)
Retrieved from "http://en.wikipedia.org/w/index.php?title=ID3_algorithm&oldid=633226059"
Categories: Decision trees | Classification algorithms

This page was last modified on 10 November 2014 at 13:30.
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may
apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia is a
registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.
