SimilaritySearch and ManifoldLearning
SimSearchManifoldLearning.ManifoldKnnIndex — Type
ManifoldKnnIndex{DistType,MinRecall}
Implements the ManifoldLearning.AbstractNearestNeighbors interface to interoperate with the non-linear dimensionality reduction methods of the ManifoldLearning package.
It should be passed to the fit method as a type, e.g.,
fit(ManifoldKnnIndex{L2Distance,0.9}) # will use an approximate index with an expected recall of 0.9
DistType should be any distance in the SimilaritySearch package or in Distances.jl, or any other distance type that follows SemiMetric.
The second argument of the composite type indicates the quality and therefore the type of index to use:
- It takes values between 0 and 1.
  - 0 means a SearchGraph index using ParetoRecall optimization for construction and searching; this tries to achieve a competitive structure in both quality and search speed.
  - 1 means an ExhaustiveSearch index; this computes the exact solution (exact knns) at the cost of speed. It can work pretty well on small datasets and on very high dimensionality. Really high dimensions suffer from the curse of dimensionality, such that an index like SearchGraph degrades to ExhaustiveSearch.
  - 0 < value < 1: uses a SearchGraph, and the value is the minimum recall score that the index should achieve. In particular, it constructs the index using ParetoRecall and then applies a final optimization with MinRecall. Small values produce faster searches with lower quality, and high values produce slower searches with higher quality. Values of 0.8 or 0.9 should work pretty well.
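The rule above can be summarized with a small, dependency-free sketch. The function name describe_index and the symbols it returns are hypothetical and introduced only for illustration; they are not part of SimSearchManifoldLearning, and the sketch does not build any index.

```julia
# Hypothetical helper that mirrors the documented rule for the MinRecall parameter.
# It only reports which index and optimization strategy the rule selects.
function describe_index(minrecall::Real)
    0 <= minrecall <= 1 || throw(ArgumentError("MinRecall must be in [0, 1]"))
    if minrecall == 0
        # SearchGraph built and searched with ParetoRecall
        (index = :SearchGraph, optimization = :ParetoRecall)
    elseif minrecall == 1
        # exact search, no optimization step involved
        (index = :ExhaustiveSearch, optimization = :none)
    else
        # SearchGraph built with ParetoRecall, then tuned with MinRecall
        (index = :SearchGraph, optimization = :MinRecall)
    end
end

describe_index(0.9)
```

Under this reading, ManifoldKnnIndex{L2Distance,0.9} falls in the last branch, while a value of exactly 1 selects exact search.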
Note: The minimum performance is evaluated on a small training set taken from the database. This could lead to some overfitting of the parameters and, therefore, to worse performance on an unseen query set. If you notice this effect, please see the documentation of the optimize! function in SimilaritySearch.
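To check for the effect described in the note, one option is to measure recall on queries that were not used for tuning. The following is a minimal sketch of the usual recall computation in plain Julia; the function names recall and mean_recall and the toy id matrices are assumptions made for illustration, not the evaluation code used by SimilaritySearch.

```julia
# Recall of one approximate k-nearest-neighbor result against the exact one:
# the fraction of true neighbors that the approximate search also returned.
function recall(exact::AbstractVector{<:Integer}, approx::AbstractVector{<:Integer})
    isempty(exact) && return 1.0
    length(intersect(exact, approx)) / length(exact)
end

# Mean recall over a batch of queries; column i holds the neighbor ids of query i.
function mean_recall(exact::AbstractMatrix{<:Integer}, approx::AbstractMatrix{<:Integer})
    size(exact, 2) == size(approx, 2) || throw(DimensionMismatch("different number of queries"))
    sum(recall(view(exact, :, i), view(approx, :, i)) for i in axes(exact, 2)) / size(exact, 2)
end

exact  = [1 4; 2 5; 3 6]   # exact 3-nn ids for two held-out queries (toy data)
approx = [1 4; 2 9; 7 6]   # approximate 3-nn ids for the same queries (toy data)
r = mean_recall(exact, approx)   # each query recovers 2 of its 3 true neighbors
```

If the value measured on held-out queries is clearly below the requested minimum recall, the parameters were likely tuned too closely to the training sample.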
The distance functions are defined to work under the evaluate(::SemiMetric, u, v) function (borrowed from the Distances.jl package).
KNN predefined types
SimSearchManifoldLearning.ExactEuclidean — Type
ExactEuclidean
ManifoldKnnIndex's type specialization for exact search with the Euclidean distance.
SimSearchManifoldLearning.ExactManhattan — Type
ExactManhattan
ManifoldKnnIndex's type specialization for exact search with the Manhattan distance.
SimSearchManifoldLearning.ExactChebyshev — Type
ExactChebyshev
ManifoldKnnIndex's type specialization for exact search with the Chebyshev distance.
SimSearchManifoldLearning.ExactCosine — Type
ExactCosine
ManifoldKnnIndex's type specialization for exact search with the cosine distance.
SimSearchManifoldLearning.ExactAngle — Type
ExactAngle
ManifoldKnnIndex's type specialization for exact search with the angle distance.
SimSearchManifoldLearning.ApproxEuclidean — Type
ApproxEuclidean
ManifoldKnnIndex's type specialization for approximate search with the Euclidean distance (expected recall of 0.9).
SimSearchManifoldLearning.ApproxManhattan — Type
ApproxManhattan
ManifoldKnnIndex's type specialization for approximate search with the Manhattan distance (expected recall of 0.9).
SimSearchManifoldLearning.ApproxChebyshev — Type
ApproxChebyshev
ManifoldKnnIndex's type specialization for approximate search with the Chebyshev distance (expected recall of 0.9).
SimSearchManifoldLearning.ApproxCosine — Type
ApproxCosine
ManifoldKnnIndex's type specialization for approximate search with the cosine distance (expected recall of 0.9).
SimSearchManifoldLearning.ApproxAngle — Type
ApproxAngle
ManifoldKnnIndex's type specialization for approximate search with the angle distance (expected recall of 0.9).
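As a usage sketch, the predefined types can be passed wherever ManifoldLearning accepts a nearest-neighbor type. The example below assumes a recent ManifoldLearning release in which fit takes an nntype keyword and scurve returns a data matrix together with labels; the dataset and the parameter values k=12 and maxoutdim=2 are illustrative choices, not recommendations from this package.

```julia
using ManifoldLearning, SimSearchManifoldLearning

# Sample 3-dimensional S-curve dataset shipped with ManifoldLearning (one point per column).
X, labels = ManifoldLearning.scurve(segments=5)

# Exact neighbors with the Euclidean distance.
M_exact = fit(Isomap, X; k=12, maxoutdim=2, nntype=ExactEuclidean)

# Approximate neighbors with the Euclidean distance (expected recall of 0.9).
M_approx = fit(Isomap, X; k=12, maxoutdim=2, nntype=ApproxEuclidean)

# Low-dimensional embeddings, one column per embedded point.
Y_exact = predict(M_exact)
Y_approx = predict(M_approx)
```

On a dataset this small the exact variant is usually affordable; the approximate types are intended for larger collections where exhaustive search becomes the bottleneck.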