Non-Euclidean or non-metric measures can be informative
Abstract
Statistical learning algorithms often rely on the Euclidean distance. In practice, non-Euclidean or non-metric dissimilarity measures may arise when contours, spectra or shapes are compared by edit distances or as a consequence of robust object matching [1,2]. It is an open issue whether such measures are advantageous for statistical learning or whether they should be constrained to obey the metric axioms.
The k-nearest neighbor (NN) rule is widely applied to general dissimilarity data as the most natural approach. Alternative methods exist that embed such data into suitable representation spaces in which statistical classifiers are constructed [3]. In this paper, we investigate the relation between non-Euclidean aspects of dissimilarity data and the classification performance of the direct NN rule and some classifiers trained in representation spaces. This is evaluated on a parameterized family of edit distances, in which parameter values control the strength of non-Euclidean behavior. Our finding is that the discriminative power of this measure increases with increasing non-Euclidean and non-metric aspects until a certain optimum is reached. The conclusion is that statistical classifiers perform well and the optimal values of the parameters characterize a non-Euclidean and somewhat non-metric measure.
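
To make two notions from the abstract concrete, the following is a minimal sketch, not the procedure used in the paper. It assumes a small, made-up symmetric dissimilarity matrix `D` and labels `labels` (both hypothetical), counts violations of the triangle inequality as one simple indicator of non-metric behavior, and applies the 1-NN rule directly to the dissimilarities with a leave-one-out error estimate. The function names `triangle_violations` and `nn_leave_one_out_error` are illustrative only.

```python
from itertools import permutations


def triangle_violations(D, tol=1e-12):
    """Count ordered triples (i, j, k) of distinct indices
    for which D[i][j] > D[i][k] + D[k][j], i.e. the triangle
    inequality fails when going from i to j via k."""
    n = len(D)
    return sum(
        1
        for i, j, k in permutations(range(n), 3)
        if D[i][j] > D[i][k] + D[k][j] + tol
    )


def nn_leave_one_out_error(D, labels):
    """Leave-one-out error of the 1-NN rule applied directly to a
    precomputed dissimilarity matrix (no embedding involved)."""
    n = len(D)
    errors = 0
    for i in range(n):
        # Nearest other object according to the given dissimilarities.
        nearest = min((j for j in range(n) if j != i), key=lambda j: D[i][j])
        errors += labels[nearest] != labels[i]
    return errors / n


# Hypothetical toy data: symmetric, zero diagonal, deliberately non-metric.
D = [
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 1.2, 4.5],
    [5.0, 1.2, 0.0, 1.5],
    [4.0, 4.5, 1.5, 0.0],
]
labels = ["a", "a", "b", "b"]

print("triangle violations:", triangle_violations(D))
print("1-NN leave-one-out error:", nn_leave_one_out_error(D, labels))
```

The NN rule only needs the ranking of dissimilarities, so it runs unchanged whether or not `D` satisfies the metric axioms. Classifiers built in an embedded representation space, by contrast, depend on how the non-Euclidean part of the data is handled.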