Solving single queries

by: Eric S. Téllez

using SimilaritySearch

This example shows how to solve a single query at a time instead of a batch of them. This is particularly useful for some applications; it also shows how queries are solved, which can help avoid some memory allocations.

dim = 8
db = MatrixDatabase(randn(Float32, dim, 10^4))
queries = MatrixDatabase(randn(Float32, dim, 100))
dist = SqL2Distance()
G = SearchGraph(; dist, db)
ctx = SearchGraphContext()
index!(G, ctx)

Suppose you want to compute the \(k\) nearest neighbors of a query. For this we use the structure KnnResult, which is a priority queue of maximum size \(k\).


for _ in 1:10
    res = KnnResult(3)

    @time search(G, ctx, randn(Float32, dim), res)
    @show minimum(res), maximum(res), argmin(res), argmax(res)
    @show collect(IdView(res))
    @show collect(DistView(res))
end
  0.207479 seconds (118.20 k allocations: 7.439 MiB, 99.96% compilation time)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (2.17419f0, 3.781705f0, 0x00000042, 0x00000023)
collect(IdView(res)) = UInt32[0x00000042, 0x0000001c, 0x00000023]
collect(DistView(res)) = Float32[2.17419, 2.7570584, 3.781705]
  0.000015 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (2.9567637f0, 8.3428545f0, 0x00000009, 0x00000059)
collect(IdView(res)) = UInt32[0x00000009, 0x00000047, 0x00000059]
collect(DistView(res)) = Float32[2.9567637, 6.4577327, 8.3428545]
  0.000006 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (1.8161318f0, 2.363171f0, 0x00000051, 0x00000007)
collect(IdView(res)) = UInt32[0x00000051, 0x00000023, 0x00000007]
collect(DistView(res)) = Float32[1.8161318, 2.3379018, 2.363171]
  0.000008 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (1.8514413f0, 3.9120069f0, 0x00000003, 0x0000001f)
collect(IdView(res)) = UInt32[0x00000003, 0x0000003c, 0x0000001f]
collect(DistView(res)) = Float32[1.8514413, 3.377074, 3.9120069]
  0.000004 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (3.570466f0, 3.9146912f0, 0x0000002c, 0x00000064)
collect(IdView(res)) = UInt32[0x0000002c, 0x00000058, 0x00000064]
collect(DistView(res)) = Float32[3.570466, 3.8789546, 3.9146912]
  0.000004 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (2.2877853f0, 5.390996f0, 0x0000004f, 0x00000013)
collect(IdView(res)) = UInt32[0x0000004f, 0x0000003f, 0x00000013]
collect(DistView(res)) = Float32[2.2877853, 4.867247, 5.390996]
  0.000004 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (4.23781f0, 5.603115f0, 0x00000023, 0x00000032)
collect(IdView(res)) = UInt32[0x00000023, 0x0000004f, 0x00000032]
collect(DistView(res)) = Float32[4.23781, 5.4622717, 5.603115]
  0.000004 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (4.6007833f0, 5.5480776f0, 0x0000002a, 0x0000001b)
collect(IdView(res)) = UInt32[0x0000002a, 0x00000023, 0x0000001b]
collect(DistView(res)) = Float32[4.6007833, 5.0899153, 5.5480776]
  0.000004 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (4.8126698f0, 4.8592696f0, 0x00000024, 0x00000001)
collect(IdView(res)) = UInt32[0x00000024, 0x00000004, 0x00000001]
collect(DistView(res)) = Float32[4.8126698, 4.813215, 4.8592696]
  0.000003 seconds (3 allocations: 160 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (2.365991f0, 4.04924f0, 0x0000002c, 0x00000020)
collect(IdView(res)) = UInt32[0x0000002c, 0x00000003, 0x00000020]
collect(DistView(res)) = Float32[2.365991, 3.51064, 4.04924]
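
For reference, a single \(k\) nearest neighbor query can also be solved by exhaustive search, which is handy for checking the results of an index on small data. The following is a minimal sketch in plain Julia that does not use SimilaritySearch; `brute_force_knn` is a hypothetical helper introduced only for illustration, and it assumes the squared Euclidean distance over the columns of a matrix.

```julia
# Exhaustive k-nearest-neighbor search over the columns of a matrix.
# `brute_force_knn` is a hypothetical helper, not part of SimilaritySearch.
function brute_force_knn(X::AbstractMatrix, q::AbstractVector, k::Integer)
    n = size(X, 2)
    dists = Vector{Float32}(undef, n)
    for i in 1:n
        # squared Euclidean distance between column i and the query
        dists[i] = sum(abs2, view(X, :, i) .- q)
    end
    # indices of the k smallest distances, in ascending order of distance
    ids = partialsortperm(dists, 1:min(k, n))
    return ids, dists[ids]
end

X = randn(Float32, 8, 100)
q = randn(Float32, 8)
ids, dists = brute_force_knn(X, q, 3)
@show ids dists
```

This sketch evaluates every distance, so its cost grows linearly with the number of columns; it is meant as a correctness baseline rather than a replacement for an index.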

KnnResult

This structure is the container for the result, and it also specifies the number of elements to retrieve. As mentioned before, it is a priority queue:


res = KnnResult(4)
push_item!(res, 1, 10)
push_item!(res, 2, 9)
push_item!(res, 3, 8)
push_item!(res, 4, 7)
push_item!(res, 6, 5)
@show res

# it also supports removals
@show :popfirst! => popfirst!(res)
push_item!(res, 7, 0.1)
@show :push_item! => res
@show :pop! => pop!(res)
res
# It can be iterated

@show collect(res)
res = SimilaritySearch.KnnResult(SimilaritySearch.AdjacencyLists.IdWeight[SimilaritySearch.AdjacencyLists.IdWeight(0x00000006, 5.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000004, 7.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000003, 8.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000002, 9.0f0)], 4)
:popfirst! => popfirst!(res) = :popfirst! => SimilaritySearch.AdjacencyLists.IdWeight(0x00000006, 5.0f0)
:push_item! => res = :push_item! => SimilaritySearch.KnnResult(SimilaritySearch.AdjacencyLists.IdWeight[SimilaritySearch.AdjacencyLists.IdWeight(0x00000007, 0.1f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000004, 7.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000003, 8.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000002, 9.0f0)], 4)
:pop! => pop!(res) = :pop! => SimilaritySearch.AdjacencyLists.IdWeight(0x00000002, 9.0f0)
collect(res) = SimilaritySearch.AdjacencyLists.IdWeight[SimilaritySearch.AdjacencyLists.IdWeight(0x00000007, 0.1f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000004, 7.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000003, 8.0f0)]
3-element Vector{IdWeight}:
 IdWeight(0x00000007, 0.1f0)
 IdWeight(0x00000004, 7.0f0)
 IdWeight(0x00000003, 8.0f0)
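
The behavior shown above (keep at most \(k\) items, ordered by weight, discarding the worst one when the capacity is exceeded) can be sketched in plain Julia. This is only an illustration of the idea under simple assumptions, not the implementation used by KnnResult; `BoundedKnn` and `push_bounded!` are hypothetical names, and the sketch assumes a capacity of at least one.

```julia
# A minimal bounded container that keeps the k smallest-weight items sorted by weight.
# `BoundedKnn` and `push_bounded!` are hypothetical names, not part of SimilaritySearch.
struct BoundedKnn
    items::Vector{Pair{Int,Float32}}  # id => weight, ascending by weight
    k::Int                            # maximum number of items (assumed >= 1)
end

BoundedKnn(k::Integer) = BoundedKnn(Pair{Int,Float32}[], k)

function push_bounded!(res::BoundedKnn, id::Integer, weight::Real)
    w = Float32(weight)
    # reject the item when the container is full and it does not improve on the worst kept item
    if length(res.items) == res.k && w >= last(res.items).second
        return res
    end
    # insert at the position that keeps the weights sorted
    pos = searchsortedfirst(res.items, Int(id) => w; by=last)
    insert!(res.items, pos, Int(id) => w)
    # drop the worst item when the capacity is exceeded
    length(res.items) > res.k && pop!(res.items)
    return res
end

res = BoundedKnn(4)
for (id, w) in [(1, 10), (2, 9), (3, 8), (4, 7), (6, 5)]
    push_bounded!(res, id, w)
end
@show res.items
```

With the same five insertions used above, the sketch keeps the items with ids 6, 4, 3, and 2, in that order, and drops the item with weight 10. A sorted vector keeps the code short; a binary heap would be the usual choice when \(k\) is large.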

Environment and dependencies

Julia Version 1.10.9
Commit 5595d20a287 (2025-03-10 12:51 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cascadelake)
Threads: 64 default, 0 interactive, 32 GC (on 64 virtual cores)
Environment:
  JULIA_PROJECT = .
  JULIA_NUM_THREADS = auto
  JULIA_LOAD_PATH = @:@stdlib
Status `~/sites/SimilaritySearchDemos/Project.toml`
  [aaaa29a8] Clustering v0.15.8
  [944b1d66] CodecZlib v0.7.8
  [a93c6f00] DataFrames v1.7.0
  [c5bfea45] Embeddings v0.4.6
  [f67ccb44] HDF5 v0.17.2
  [b20bd276] InvertedFiles v0.8.0 `~/.julia/dev/InvertedFiles`
  [682c06a0] JSON v0.21.4
  [23fbe1c1] Latexify v0.16.6
  [eb30cadb] MLDatasets v0.7.18
  [06eb3307] ManifoldLearning v0.9.0
⌃ [ca7969ec] PlotlyLight v0.11.0
  [91a5bcdd] Plots v1.40.11
  [27ebfcd6] Primes v0.5.7
  [ca7ab67e] SimSearchManifoldLearning v0.3.0 `~/.julia/dev/SimSearchManifoldLearning`
  [053f045d] SimilaritySearch v0.12.0 `~/.julia/dev/SimilaritySearch`
⌅ [2913bbd2] StatsBase v0.33.21
  [f3b207a7] StatsPlots v0.15.7
  [7f6f6c8a] TextSearch v0.19.0 `~/.julia/dev/TextSearch`
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`