Home
NumericalExcelTutorial
MicroscopicPedestrianSimulation
KardiTeknomo'sTutorial
MicroPedSimFreeDownload
PersonalDevelopmentHandbook
Research
P ub li c a tions
Tut or ia ls
R e s um e
S e r v ic e
R e s our c e s
C ont a c t
LinearDiscriminantAnalysis(LDA)
ByKardiTeknomo,PhD.
<Previous|Next|Index>
Purpose
ThepurposeofDiscriminantAnalysisistoclassifyobjects(people,customers,things,etc.)
intooneoftwoormoregroupsbasedonasetoffeaturesthatdescribetheobjects(e.g.
gender,age,income,weight,preferencescore,etc.).Ingeneral,weassignanobjecttoone
ofanumberofpredeterminedgroupsbasedonobservationsmadeontheobject.
Notethatthegroupsareknownorpredeterminedanddonothaveorder(i.e.nominalscale).
Theclassificationproblemgivesseveralobjectswithasetfeaturesmeasuredfromthose
objects.Whatwearelookingforistwothings:
1.Whichsetoffeaturescanbestdeterminegroupmembershipoftheobject?
2.Whatistheclassificationruleormodeltobestseparatethosegroups?
(Checkthedifferenceofdiscriminantanalysisandclusteranalysis)
Thefirstpurposeisfeatureselectionandthesecondpurposeisclassification.Inthistutorial
wewillnotcoverthefirstpurpose(readerinterestedinthisstepwiseapproachcanuse
statisticalsoftwaresuchasSPSS,SASorstatisticalpackageofMatlab.However,wedocover
thesecondpurposetogettheruleofclassificationandpredictnewobjectbasedontherule.
LinearDiscriminantAnalysis
Forexample,wewanttoknowwhetherasoapproductisgoodorbadbasedonseveral
measurementsontheproductsuchasweight,volume,people'spreferentialscore,smell,
colorcontrastetc.Theobjecthereissoap.Theclasscategoryorthegroup("good"and"bad")
iswhatwearelookingfor(itisalsocalleddependentvariable).Eachmeasurementonthe
productiscalledfeaturesthatdescribetheobject(itisalsocalledindependentvariable).
Thus,indiscriminantanalysis,thedependentvariable(Y)isthegroupandtheindependent
variables(X)aretheobjectfeaturesthatmightdescribethegroup.Thedependentvariableis
alwayscategory(nominalscale)variablewhiletheindependentvariablescanbeany
measurementscale(i.e.nominal,ordinal,intervalorratio).
Ifwecanassumethatthegroupsarelinearlyseparable,wecanuselineardiscriminantmodel
(LDA).Linearlyseparablesuggeststhatthegroupscanbeseparatedbyalinearcombination
offeaturesthatdescribetheobjects.Ifonlytwofeatures,theseparatorsbetweenobjects
groupwillbecomelines.Ifthefeaturesarethree,theseparatorisaplaneandthenumberof
features(i.e.independentvariables)ismorethan3,theseparatorsbecomeahyperplane.
LDAFormula
Usingclassificationcriteriontominimizetotalerrorofclassification(TEC),wetendtomake
theproportionofobjectthatitmisclassifiesassmallaspossible.TECistheperformancerule
inthe'longrun'onarandomsampleofobjects.Thus,TECshouldbethoughtasthe
probabilitythattheruleunderconsiderationwillmisclassifyanobject.Theclassificationruleis
toassignanobjecttothegroupwithhighestconditionalprobability.Thisiscalled
BayesRule.ThisrulealsominimizestheTEC.Ifthereare groups,theBayes'ruleisto
assigntheobjecttogroup where .