Solving single queries

by: Eric S. Téllez

using SimilaritySearch

This example shows how to solve single queries instead of a batch of them. This is useful for some applications, for instance those that receive queries one at a time. It also shows how queries are solved, which can be used to avoid some memory allocations.

dim = 8
db = MatrixDatabase(randn(Float32, dim, 10^4))
queries = MatrixDatabase(randn(Float32, dim, 100))
dist = SqL2Distance()
G = SearchGraph(; dist, db)
ctx = SearchGraphContext()
index!(G, ctx)

Suppose you want to compute the \(k\) nearest neighbors of a query. For this we use a knn queue, created with `knnqueue`, which is a priority queue of maximum size \(k\). The queue is passed to `search`, which fills it with the results.


for _ in 1:10
    res = knnqueue(ctx, 3)

    @time search(G, ctx, randn(Float32, dim), res)
    @show minimum(res), maximum(res), argmin(res), argmax(res)
    @show collect(IdView(res))
    @show collect(DistView(res))
end
  0.302833 seconds (154.11 k allocations: 10.185 MiB, 99.98% compilation time)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (5.73508f0, 8.136073f0, 0x00000047, 0x00000049)
collect(IdView(res)) = UInt32[0x00000047, 0x0000000c, 0x00000049]
collect(DistView(res)) = Float32[5.73508, 8.045934, 8.136073]
  0.000016 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (2.4517207f0, 4.1300344f0, 0x00000054, 0x0000001a)
collect(IdView(res)) = UInt32[0x00000054, 0x0000002d, 0x0000001a]
collect(DistView(res)) = Float32[2.4517207, 3.6453025, 4.1300344]
  0.000007 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (5.665649f0, 6.3988633f0, 0x00000019, 0x0000003d)
collect(IdView(res)) = UInt32[0x00000019, 0x00000022, 0x0000003d]
collect(DistView(res)) = Float32[5.665649, 6.12617, 6.3988633]
  0.000006 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (4.363997f0, 6.268838f0, 0x0000000c, 0x00000029)
collect(IdView(res)) = UInt32[0x0000000c, 0x00000025, 0x00000029]
collect(DistView(res)) = Float32[4.363997, 4.8510885, 6.268838]
  0.000005 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (1.9081182f0, 3.2303193f0, 0x00000061, 0x00000054)
collect(IdView(res)) = UInt32[0x00000061, 0x0000000d, 0x00000054]
collect(DistView(res)) = Float32[1.9081182, 2.8516128, 3.2303193]
  0.000004 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (3.5657682f0, 7.0970955f0, 0x0000002a, 0x0000003e)
collect(IdView(res)) = UInt32[0x0000002a, 0x00000056, 0x0000003e]
collect(DistView(res)) = Float32[3.5657682, 6.842711, 7.0970955]
  0.000004 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (1.0025893f0, 2.340893f0, 0x00000017, 0x0000000d)
collect(IdView(res)) = UInt32[0x00000017, 0x0000000c, 0x0000000d]
collect(DistView(res)) = Float32[1.0025893, 1.2908223, 2.340893]
  0.000005 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (2.1722112f0, 5.5402813f0, 0x00000027, 0x00000002)
collect(IdView(res)) = UInt32[0x00000027, 0x00000021, 0x00000002]
collect(DistView(res)) = Float32[2.1722112, 4.0453386, 5.5402813]
  0.000004 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (1.3056961f0, 3.5194216f0, 0x00000060, 0x00000062)
collect(IdView(res)) = UInt32[0x00000060, 0x00000023, 0x00000062]
collect(DistView(res)) = Float32[1.3056961, 3.0779376, 3.5194216]
  0.000003 seconds (2 allocations: 128 bytes)
(minimum(res), maximum(res), argmin(res), argmax(res)) = (2.4991374f0, 4.457182f0, 0x0000000d, 0x00000034)
collect(IdView(res)) = UInt32[0x0000000d, 0x0000001c, 0x00000034]
collect(DistView(res)) = Float32[2.4991374, 3.6591063, 4.457182]
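
To make explicit what each query above returns, the following is a minimal sketch of an exhaustive \(k\) nearest neighbor search under the squared Euclidean distance, written in plain Julia without any package. The names `sqL2` and `bruteforce_knn` are hypothetical helpers introduced only for this illustration; they are not part of SimilaritySearch, and the sketch says nothing about how `SearchGraph` solves the query internally.

```julia
# Exhaustive k-nearest-neighbor baseline in plain Julia (no packages needed).
# `sqL2` and `bruteforce_knn` are hypothetical helper names, not library functions.

# Squared Euclidean distance between two vectors of equal length.
function sqL2(a::AbstractVector, b::AbstractVector)
    s = zero(promote_type(eltype(a), eltype(b)))
    for i in eachindex(a, b)   # throws if the lengths differ
        d = a[i] - b[i]
        s += d * d
    end
    s
end

# Return the ids (column indices) and distances of the k columns of X
# closest to q, sorted from nearest to farthest.
function bruteforce_knn(X::AbstractMatrix, q::AbstractVector, k::Integer)
    dists = [sqL2(view(X, :, j), q) for j in axes(X, 2)]
    ids = collect(partialsortperm(dists, 1:k))
    ids, dists[ids]
end

# Four 2-dimensional points stored as columns, as in a matrix database.
X = Float32[0 1 2 3;
            0 0 0 0]
q = Float32[0.75, 0]
ids, dists = bruteforce_knn(X, q, 3)
@show ids dists
```

A baseline like this evaluates the distance to every item, so it is mainly useful for checking the results of an index on small datasets.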

Knn queue

This structure is the container for the result, and it is also used to specify the number of elements to retrieve. As mentioned before, it is a priority queue of bounded size.


res = knnqueue(ctx, 4)
push_item!(res, 1, 10)
push_item!(res, 2, 9)
push_item!(res, 3, 8)
push_item!(res, 4, 7)
push_item!(res, 6, 5)
@show collect(viewitems(res))

# It also supports removals. `pop_min!` is not exported because it is currently available only for the `KnnSorted` queue backend.
@show :pop_min! => SimilaritySearch.pop_min!(res)
push_item!(res, 7, 0.1)
@show :push_item! => res
@show :pop_max! => pop_max!(res)
res
# It can be iterated

@show collect(viewitems(res))
collect(viewitems(res)) = SimilaritySearch.AdjacencyLists.IdWeight[SimilaritySearch.AdjacencyLists.IdWeight(0x00000006, 5.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000004, 7.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000003, 8.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000002, 9.0f0)]
:pop_min! => SimilaritySearch.pop_min!(res) = :pop_min! => SimilaritySearch.AdjacencyLists.IdWeight(0x00000006, 5.0f0)
:push_item! => res = :push_item! => SimilaritySearch.KnnSorted{Vector{SimilaritySearch.AdjacencyLists.IdWeight}}(SimilaritySearch.AdjacencyLists.IdWeight[SimilaritySearch.AdjacencyLists.IdWeight(0x00000007, 0.1f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000004, 7.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000003, 8.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000002, 9.0f0)], 1, 4, 4, 0, 0)
:pop_max! => pop_max!(res) = :pop_max! => SimilaritySearch.AdjacencyLists.IdWeight(0x00000002, 9.0f0)
collect(viewitems(res)) = SimilaritySearch.AdjacencyLists.IdWeight[SimilaritySearch.AdjacencyLists.IdWeight(0x00000007, 0.1f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000004, 7.0f0), SimilaritySearch.AdjacencyLists.IdWeight(0x00000003, 8.0f0)]
3-element Vector{IdWeight}:
 IdWeight(0x00000007, 0.1f0)
 IdWeight(0x00000004, 7.0f0)
 IdWeight(0x00000003, 8.0f0)
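
The behavior of a priority queue of maximum size \(k\) can be sketched in plain Julia as follows. This is only an illustration under simple assumptions (two sorted vectors and insertion by binary search); `BoundedKnn` and `push_bounded!` are hypothetical names that are not part of SimilaritySearch, and the library's own queue backends may be implemented differently.

```julia
# A minimal bounded k-nearest queue in plain Julia.
# `BoundedKnn` and `push_bounded!` are hypothetical names for this sketch only.
struct BoundedKnn
    k::Int                  # maximum number of items kept
    ids::Vector{UInt32}     # item identifiers, aligned with `dists`
    dists::Vector{Float32}  # distances, kept sorted in increasing order
end

BoundedKnn(k::Integer) = BoundedKnn(k, UInt32[], Float32[])

# Insert (id, dist), keeping entries sorted by distance and at most k of them.
# Returns true if the item was kept, false if it was too far to enter the queue.
function push_bounded!(res::BoundedKnn, id::Integer, dist::Real)
    res.k == 0 && return false
    if length(res.dists) == res.k
        # The queue is full: reject items that are not closer than the current farthest.
        dist >= res.dists[end] && return false
        pop!(res.ids)
        pop!(res.dists)
    end
    pos = searchsortedfirst(res.dists, dist)
    insert!(res.ids, pos, id)
    insert!(res.dists, pos, dist)
    true
end

# Same insertions as in the example above, with a queue of size 4.
res = BoundedKnn(4)
for (id, dist) in [(1, 10), (2, 9), (3, 8), (4, 7), (6, 5)]
    push_bounded!(res, id, dist)
end
@show res.ids res.dists
```

With these insertions the sketch keeps the items 6, 4, 3, and 2 in increasing order of distance, which is the same content as the first `collect(viewitems(res))` output shown above: the item with distance 10 is evicted once a fifth, closer item arrives.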

Environment and dependencies

Julia Version 1.10.10
Commit 95f30e51f41 (2025-06-27 09:51 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × Intel(R) Xeon(R) Silver 4216 CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cascadelake)
Threads: 64 default, 0 interactive, 32 GC (on 64 virtual cores)
Environment:
  JULIA_NUM_THREADS = auto
  JULIA_PROJECT = .
  JULIA_LOAD_PATH = @:@stdlib
Status `~/Research/SimilaritySearchDemos/Project.toml`
  [aaaa29a8] Clustering v0.15.8
  [944b1d66] CodecZlib v0.7.8
  [a93c6f00] DataFrames v1.8.0
  [c5bfea45] Embeddings v0.4.6
  [f67ccb44] HDF5 v0.17.2
  [b20bd276] InvertedFiles v0.8.1
  [682c06a0] JSON v0.21.4
  [23fbe1c1] Latexify v0.16.10
  [eb30cadb] MLDatasets v0.7.18
  [06eb3307] ManifoldLearning v0.9.0
⌃ [ca7969ec] PlotlyLight v0.11.0
  [91a5bcdd] Plots v1.40.20
  [27ebfcd6] Primes v0.5.7
  [ca7ab67e] SimSearchManifoldLearning v0.3.1
  [053f045d] SimilaritySearch v0.13.0
⌅ [2913bbd2] StatsBase v0.33.21
  [f3b207a7] StatsPlots v0.15.7
  [7f6f6c8a] TextSearch v0.19.6
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`