Benchmarking Vector Search Databases with One Million Documents

In DocArray, Documents inside a DocumentArray can live in a document store instead of in memory. An external store provides longer persistence and faster retrieval. Ever wonder which one is best for your use case? Here's a comprehensive benchmark to help guide you.

Today, DocArray supports six external stores: SQLite, Weaviate, Qdrant, Elasticsearch, Redis, and AnnLite. Apart from SQLite, all of them support approximate nearest neighbor search. The look-and-feel of a DocumentArray with an external store is almost the same as a regular in-memory DocumentArray, which lets you easily switch between backends under the same DocArray idiom. For example, to use Elasticsearch as the backend, simply:

from docarray import DocumentArray

da = DocumentArray(storage='elasticsearch', config={'n_dim': 128})

Or if you want to use Qdrant, you can just change one parameter:

from docarray import DocumentArray

da = DocumentArray(storage='qdrant', config={'n_dim': 128})

You get the idea!

Methodology

We created a DocumentArray with one million Documents based on sift1m, a dataset containing 1 million vectors (128 dimensions each) using the L2 distance metric.

sift1m | TensorFlow Datasets
Pre-trained embeddings for approximate nearest neighbor search using the Euclidean distance. This dataset consists of two splits: 1. ‘database’: consists of 1,000,000 data points, each with features: ‘embedding’ (128 floats), ‘index’ (int64), ‘neighbors’ (empty list). 2. ‘test’: consists of…
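
To make the task concrete, here is a minimal exhaustive L2 nearest-neighbor search over random vectors standing in for sift1m. This is what the in-memory and SQLite stores do, and it is the ground truth that the ANN backends approximate:

```python
import numpy as np

# Illustrative stand-in for sift1m: exhaustive (exact) L2 nearest-neighbor
# search over random 128-dimensional vectors. sift1m has 1,000,000 of these.
rng = np.random.default_rng(42)
database = rng.random((1000, 128))
query = database[7]                  # query a vector we know is in the index

# Squared L2 distances from the query to every database vector
dists = np.sum((database - query) ** 2, axis=1)
top10 = np.argsort(dists)[:10]       # indices of the 10 nearest neighbors

print(top10[0])  # → 7: the vector is its own nearest neighbor
```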

We benchmarked the document stores as summarized below:

Name Usage Client version Database version
In-memory DocumentArray DocumentArray() DocArray 0.18.2 N/A
SQLite DocumentArray(storage='sqlite') 2.6.0 N/A
Weaviate DocumentArray(storage='weaviate') 3.9.0 1.16.1
Qdrant DocumentArray(storage='qdrant') 0.10.3 0.10.1
AnnLite DocumentArray(storage='annlite') 0.3.13 N/A
Elasticsearch DocumentArray(storage='elasticsearch') 8.4.3 8.2.0
Redis DocumentArray(storage='redis') 4.3.4 2.6.0

Core tasks

We focused on the following tasks on every document store:

  1. Create: Add one million Documents to the document store via extend(), using backend batching capabilities when applicable.
  2. Read: Retrieve existing Documents from the document store by .id, i.e. da['some_id'].
  3. Update: Update existing Documents in the document store by .id, i.e. da['some_id'] = Document(...).
  4. Delete: Delete Documents from the document store by .id, i.e. del da['some_id'].
  5. Find by condition: Search existing Documents by .tags via find() using boolean filters, with backend-side filtering when possible, as described in Query by Conditions.
  6. Find by vector: Retrieve existing Documents by .embedding via find() using exact or approximate nearest neighbor search, as described in Find Nearest Neighbours.

The above tasks are often atomic operations in the high-level DocArray API. Hence, understanding their performance gives users a good estimation of the experience when using DocumentArray with different backends.

Parametric combinations

Most of these document stores use their own implementation of HNSW (an approximate nearest neighbor search algorithm) but with different parameters:

  • ef_construct - the HNSW build parameter that controls the index time/index accuracy. Higher ef_construct leads to longer construction, but better index quality.
  • m - maximum connections, the number of bi-directional links created for every new element during construction. Higher m works better on datasets with high intrinsic dimensionality and/or high recall, while lower m works better for datasets with low intrinsic dimensionality and/or low recall.
  • ef - the size of the dynamic candidate list during search. Higher ef at query time leads to higher accuracy but slower search.
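
These knobs surface through the config dict when constructing a DocumentArray. The key names below are illustrative assumptions, as each backend spells them slightly differently (e.g. ef_construction vs ef_construct, max_connections vs m), so check the store's configuration docs before copying:

```python
# Hypothetical HNSW config keys -- actual names vary per backend;
# consult each document store's configuration documentation.
hnsw_config = {
    'n_dim': 128,             # embedding dimensionality (sift1m)
    'ef_construction': 64,    # build-time quality/speed trade-off
    'max_connections': 16,    # "m": bi-directional links per element
    'ef_search': 32,          # query-time quality/speed trade-off
}

# da = DocumentArray(storage='annlite', config=hnsw_config)  # needs a backend
```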

Experiment setup

We are interested in the single-query performance on the above six tasks with different combinations of the above three parameters. Single-query performance is measured by evaluating one Document at a time, repeated for tasks 2, 3, 4, 5, and 6. Finally, the average over repetitions is reported.
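
The measurement loop can be sketched as follows (a simplified stand-in; the full logic is in the benchmark script):

```python
import statistics
import time

# Run one operation at a time, repeat, and report the average latency.
# 'op' stands for any of tasks 2-6 (read, update, delete, find by
# condition, find by vector).
def measure(op, repetitions=100):
    latencies = []
    for _ in range(repetitions):
        start = time.perf_counter()
        op()
        latencies.append(time.perf_counter() - start)
    return statistics.mean(latencies)  # seconds per operation

# Example with a trivial stand-in operation:
avg_s = measure(lambda: sum(range(1000)))
print(f'{avg_s * 1000:.4f} ms per op')
```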

We now elaborate on the setup of our experiments. First, some high-level statistics of the experiment:

Name Value
Number of created Documents 1,000,000
Number of Documents on tasks 2, 3, 4, 5, 6 1
Dimension of .embedding 128
Number of results for the task "Find by vector" 10,000

Each Document follows the structure:

{
  "id": "94ee6627ee7f582e5e28124e78c3d2f9",
  "tags": {"i": 10},
  "embedding": [0.49841760378680844, 0.703959752118305, 0.6920759535687985, 0.10248648858410625, ...]
}
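
A stand-in generator for such Documents (plain dicts with random embeddings; the real benchmark uses sift1m vectors):

```python
import uuid
import random

# Build plain-dict stand-ins matching the Document structure above.
def make_doc(i):
    return {
        'id': uuid.uuid4().hex,                        # 32-char hex id
        'tags': {'i': i},                              # used for "find by condition"
        'embedding': [random.random() for _ in range(128)],
    }

docs = [make_doc(i) for i in range(1000)]  # the benchmark creates 1,000,000
print(len(docs[0]['embedding']))  # → 128
```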

We use the Recall@K value as an indicator of search quality. The in-memory and SQLite stores do not implement approximate nearest neighbor search; they use exhaustive search instead. Hence, they give the maximum Recall@K but are the slowest.
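
Recall@K compares the approximate top-K against the exact top-K. A minimal sketch of the metric on a single query:

```python
# Recall@K = |approximate top-K ∩ exact top-K| / K, averaged over queries.
def recall_at_k(approx_ids, exact_ids, k):
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

exact = [3, 7, 12, 25, 31, 40, 41, 57, 60, 99]   # ground-truth top-10 (exhaustive)
approx = [3, 7, 12, 25, 31, 40, 41, 57, 88, 91]  # ANN result: 8 of 10 correct

print(recall_at_k(approx, exact, k=10))  # → 0.8
```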

The experiments were conducted on an AWS EC2 t2.2xlarge instance (Intel Xeon CPU E5-2676 v3 @ 2.40GHz) with Python 3.10.6 and DocArray 0.18.2.

As Weaviate, Qdrant, Elasticsearch, and Redis follow a client/server pattern, we set them up with their official Docker images in a single-node configuration with 32 GB of RAM allocated; that is, only one replica and one shard are operated during the benchmarking. We did not opt for a cluster setup because our benchmarks mainly aim to assess the capabilities of a single server instance.
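
As a rough sketch of such a single-node setup (image names, tags, ports, and flags here are assumptions for illustration; the actual configuration lives in the benchmark script):

```shell
# Illustrative only -- verify image tags and ports against each project's docs.
# Cap each container at 32 GB of RAM, as in the benchmark.
docker run -d -m 32g -p 6333:6333 qdrant/qdrant:v0.10.1
docker run -d -m 32g -p 6379:6379 redis/redis-stack-server:latest
```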

The full benchmark script (benchmarking_sift1m.py in the docarray repository) is available for reproducing the results.

Latency result

In-Memory

m ef_construct ef Recall@10 Find by vector (s) Find by condition (s) Create 1M (s) Read (ms) Update (ms) Delete (ms)
N/A N/A N/A 1.000 2.37 11.17 1.06 0.17 0.05 0.14

SQLite

m ef_construct ef Recall@10 Find by vector (s) Find by condition (s) Create 1M (s) Read (ms) Update (ms) Delete (s)
N/A N/A N/A 1.000 54.32 78.63 16,421.51 1.09 0.40 28.87

AnnLite

m ef_construct ef Recall@10 Find by vector (ms) Find by condition (ms) Create 1M (s) Read (ms) Update (ms) Delete (ms)
16 64 32 0.873 1.42 0.40 114.30 0.36 12.93 18.01
16 64 64 0.942 1.51 0.37 114.18 0.38 14.43 15.38
16 64 128 0.977 1.76 0.39 135.75 0.35 12.30 13.66
16 64 256 0.986 1.98 0.36 111.66 0.32 12.39 14.51
16 128 32 0.897 1.43 0.37 134.94 0.34 17.82 18.08
16 128 64 0.960 1.53 0.38 148.67 0.36 24.42 46.17
16 128 128 0.988 1.67 0.37 136.90 0.37 13.76 31.10
16 128 256 0.996 1.99 0.37 134.40 0.36 13.95 30.39
16 256 32 0.905 1.51 0.37 200.29 0.37 16.94 18.10
16 256 64 0.965 1.54 0.37 186.36 0.36 32.40 45.42
16 256 128 0.990 1.68 0.39 173.68 0.37 12.42 14.60
16 256 256 0.997 2.07 0.36 183.66 0.36 18.86 35.82
32 64 32 0.895 1.49 0.37 116.49 0.33 12.63 17.55
32 64 64 0.954 1.59 0.37 112.83 0.34 11.74 12.26
32 64 128 0.983 1.75 0.36 114.32 0.37 17.26 16.86
32 64 256 0.993 2.06 0.37 114.64 0.34 14.64 15.88
32 128 32 0.930 1.52 0.38 142.51 0.35 14.17 15.93
32 128 64 0.975 1.58 0.40 156.41 0.34 16.17 31.42
32 128 128 0.993 1.81 0.37 147.05 0.35 19.81 39.87
32 128 256 0.998 2.15 0.38 144.64 0.34 29.62 40.21
32 256 32 0.946 1.49 0.38 196.37 0.36 20.55 15.37
32 256 64 0.984 1.62 0.37 211.81 0.35 32.65 35.15
32 256 128 0.996 1.88 0.37 194.97 0.33 12.72 13.93
32 256 256 0.999 2.25 0.37 204.65 0.35 22.13 31.54

Qdrant

m ef_construct ef Recall@10 Find by vector (ms) Find by condition (ms) Create 1M (s) Read (ms) Update (ms) Delete (ms)
16 64 32 0.965 3.50 403.70 448.99 3.74 1.88 3.74
16 64 64 0.986 4.11 396.10 453.71 3.25 1.80 3.95
16 64 128 0.995 5.09 418.13 456.74 1.59 2.00 4.03
16 64 256 0.998 5.24 410.67 459.59 1.57 1.98 4.00
16 128 32 0.974 5.03 412.48 462.62 1.45 1.90 4.08
16 128 64 0.993 5.13 392.27 460.42 1.56 1.79 3.81
16 128 128 0.998 4.32 379.69 461.63 1.48 1.86 3.96
16 128 256 0.999 5.67 381.22 459.57 1.53 1.79 3.85
16 256 32 0.982 5.26 387.67 462.77 1.58 1.80 4.07
16 256 64 0.995 5.94 386.60 463.52 1.47 1.92 3.96
16 256 128 0.998 5.76 385.84 463.80 1.58 1.78 4.11
16 256 256 0.999 6.29 393.34 464.37 1.62 1.84 4.03
32 64 32 0.969 4.53 390.39 459.58 1.54 1.83 3.80
32 64 64 0.992 3.94 399.62 459.31 1.56 1.85 4.10
32 64 128 0.997 5.34 390.16 458.17 1.62 1.87 3.85
32 64 256 0.999 5.51 426.42 459.17 1.50 1.97 4.00
32 128 32 0.983 5.42 385.82 460.31 1.57 1.88 3.99
32 128 64 0.995 4.26 381.91 462.88 1.51 1.81 3.89
32 128 128 0.998 5.69 389.73 462.65 1.47 1.85 3.82
32 128 256 0.999 6.05 399.89 464.31 1.55 1.85 3.76
32 256 32 0.990 6.13 385.43 463.11 1.48 1.91 3.82
32 256 64 0.997 3.96 397.49 462.75 1.46 1.87 3.92
32 256 128 0.999 3.82 375.39 464.36 1.45 1.79 3.95
32 256 256 0.999 4.42 381.84 462.36 1.48 1.67 3.83

Weaviate

m ef_construct ef Recall@10 Find by vector (ms) Find by condition (ms) Create 1M (s) Read (ms) Update (ms) Delete (ms)
16 64 32 0.871 4.84 2.30 574.92 7.80 4.88 19.43
16 64 64 0.939 4.99 2.38 580.66 4.52 4.15 9.42
16 64 128 0.977 5.44 2.40 577.36 2.62 3.81 9.66
16 64 256 0.984 6.43 2.37 639.69 2.55 3.62 9.63
16 128 32 0.897 4.82 2.47 655.69 4.51 4.61 26.50
16 128 64 0.960 5.11 2.34 659.43 2.64 4.23 26.89
16 128 128 0.988 7.20 2.36 818.47 4.55 4.59 34.54
16 128 256 0.996 6.38 2.38 659.77 2.62 4.81 26.89
16 256 32 0.906 4.86 2.32 787.04 2.58 5.29 26.00
16 256 64 0.965 5.09 2.49 782.85 2.62 5.36 26.50
16 256 128 0.990 5.60 2.35 784.30 2.65 5.16 25.51
16 256 256 0.997 6.49 2.30 780.20 2.60 5.05 26.55
32 64 32 0.895 4.62 2.34 1022.22 4.30 3.87 22.42
32 64 64 0.955 7.08 2.38 939.90 15.23 3.58 25.71
32 64 128 0.983 5.39 2.42 1001.35 2.75 3.57 21.51
32 64 256 0.995 7.92 2.37 981.43 2.71 3.64 29.34
32 128 32 0.929 4.92 2.49 675.61 4.50 5.06 8.61
32 128 64 0.975 5.15 2.39 673.31 4.49 5.16 27.09
32 128 128 0.993 7.26 2.45 1297.00 4.47 5.34 6.04
32 128 256 0.998 6.38 2.40 1383.55 2.62 4.69 8.49
32 256 32 0.946 6.46 2.40 1846.17 4.63 4.94 6.51
32 256 64 0.984 6.29 3.14 1926.27 2.56 6.99 21.92
32 256 128 0.996 6.36 2.41 1364.68 4.42 5.66 25.66
32 256 256 0.999 6.64 2.58 1966.97 2.64 5.91 22.22

Elasticsearch

m ef_construct ef Recall@10 Find by vector (ms) Find by condition (ms) Create 1M (s) Read (ms) Update (ms) Delete (ms)
16 64 32 0.889 4.39 7.50 508.94 14.04 70.96 69.41
16 64 64 0.947 5.51 6.31 449.22 11.92 41.26 34.40
16 64 128 0.980 7.20 6.36 434.35 12.91 71.19 57.10
16 64 256 0.990 8.81 5.93 504.32 11.71 67.09 57.06
16 128 32 0.897 4.24 6.45 688.20 14.00 72.93 61.18
16 128 64 0.953 5.10 6.64 678.95 15.25 47.03 43.08
16 128 128 0.981 6.43 7.11 719.78 12.25 55.61 46.85
16 128 256 0.993 8.59 7.01 720.77 16.59 64.65 58.07
16 256 32 0.902 4.37 6.46 1,048.13 13.13 68.83 71.74
16 256 64 0.958 5.43 7.19 1,138.32 18.90 73.47 62.13
16 256 128 0.983 6.60 6.54 1,077.97 11.58 73.65 56.86
16 256 256 0.993 8.80 6.80 1,108.34 12.93 60.73 47.59
32 64 32 0.945 5.02 7.32 471.34 11.26 69.82 55.91
32 64 64 0.976 6.18 6.48 480.60 11.58 51.82 43.04
32 64 128 0.992 7.29 7.32 527.22 11.92 72.21 57.79
32 64 256 0.997 11.42 6.77 487.11 11.72 52.50 46.61
32 128 32 0.954 4.90 6.73 790.79 13.68 69.82 66.17
32 128 64 0.984 5.72 7.00 812.03 12.65 48.82 42.13
32 128 128 0.996 7.65 7.46 861.62 12.32 61.79 57.73
32 128 256 0.999 10.44 6.61 840.29 14.27 67.59 58.75
32 256 32 0.959 4.80 6.69 1,424.29 11.77 68.75 73.07
32 256 64 0.987 6.08 7.51 1,506.04 15.66 66.59 55.46
32 256 128 0.997 8.02 6.63 1,408.87 11.89 72.99 65.46
32 256 256 0.999 11.55 7.69 1,487.95 13.37 50.19 58.59

Redis

m ef_construct ef Recall@10 Find by vector (ms) Find by condition (ms) Create 1M (s) Read (ms) Update (ms) Delete (ms)
16 64 32 0.872 1.67 0.63 563.15 1.00 1.88 25.58
16 64 64 0.941 1.78 0.63 563.27 0.98 1.85 25.17
16 64 128 0.976 1.98 0.70 563.09 0.95 1.99 24.89
16 64 256 0.991 2.32 0.56 562.37 0.91 2.01 25.39
16 128 32 0.897 1.73 0.62 754.93 0.91 2.96 25.36
16 128 64 0.959 1.78 0.51 721.33 0.89 2.31 23.23
16 128 128 0.988 2.37 0.70 775.26 1.24 4.25 28.60
16 128 256 0.997 2.63 0.64 799.26 1.06 2.72 27.36
16 256 32 0.905 1.70 0.58 1091.72 0.93 3.20 10.65
16 256 64 0.965 2.06 0.66 1196.05 1.03 5.24 28.84
16 256 128 0.990 2.33 0.62 1232.47 1.02 3.67 27.35
16 256 256 0.998 2.80 0.67 1203.37 1.05 4.44 27.85
32 64 32 0.896 1.74 0.56 625.56 0.85 2.05 6.05
32 64 64 0.954 1.86 0.65 626.49 0.92 1.74 25.02
32 64 128 0.982 2.05 0.56 626.09 0.94 1.79 25.99
32 64 256 0.994 2.47 0.59 625.44 0.99 1.64 25.05
32 128 32 0.930 1.79 0.73 871.87 0.94 8.99 4.52
32 128 64 0.975 2.10 0.67 953.10 1.06 2.35 27.11
32 128 128 0.993 2.49 0.69 921.87 1.03 3.06 27.58
32 128 256 0.998 3.06 0.64 926.96 1.06 2.45 27.27
32 256 32 0.947 1.82 0.59 1315.16 0.92 4.21 9.03
32 256 64 0.984 2.28 0.79 1489.83 1.05 4.92 29.27
32 256 128 0.996 2.75 0.79 1511.17 1.05 4.03 28.48
32 256 256 0.999 3.15 0.63 1534.68 1.03 3.26 28.19

QPS result

When we consider each Document as a query, we can convert the above metrics into queries per second (QPS). Higher values are better.
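
The conversion is simply the reciprocal of the per-query latency (small differences from the tables come from rounding the reported latencies):

```python
# QPS = 1 / per-query latency in seconds.
def to_qps(latency, unit='s'):
    seconds = latency / 1000 if unit == 'ms' else latency
    return 1.0 / seconds

print(round(to_qps(2.37), 2))     # in-memory "find by vector": 2.37 s → 0.42 QPS
print(round(to_qps(0.17, 'ms')))  # in-memory "read": 0.17 ms → ~5882 QPS
                                  # (table reports 6,061 from the unrounded latency)
```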

In-memory

m ef_construct ef Recall@10 Find by vector Find by condition Create 1M Read Update Delete
N/A N/A N/A 1.000 0.42 0.09 947,284 6,061 21,505 7,246

SQLite

m ef_construct ef Recall@10 Find by vector Find by condition Create 1M Read Update Delete
N/A N/A N/A 1.000 0.02 0.01 61 915 2,476 0.03

AnnLite

m ef_construct ef Recall@10 Find by vector Find by condition Create 1M Read Update Delete
16 64 32 0.873 706 2,519 8,749 2,762 77 56
16 64 64 0.942 662 2,674 8,758 2,625 69 65
16 64 128 0.977 570 2,597 7,366 2,825 81 73
16 64 256 0.986 504 2,762 8,956 3,155 81 69
16 128 32 0.897 698 2,710 7,411 2,976 56 55
16 128 64 0.960 652 2,611 6,726 2,786 41 22
16 128 128 0.988 599 2,670 7,304 2,721 73 32
16 128 256 0.996 504 2,729 7,440 2,751 72 33
16 256 32 0.905 663 2,695 4,993 2,681 59 55
16 256 64 0.965 648 2,695 5,366 2,762 31 22
16 256 128 0.990 594 2,540 5,758 2,730 81 68
16 256 256 0.997 483 2,786 5,445 2,786 53 28
32 64 32 0.895 671 2,674 8,585 3,003 79 57
32 64 64 0.954 629 2,725 8,863 2,915 85 82
32 64 128 0.983 572 2,762 8,747 2,710 58 59
32 64 256 0.993 487 2,681 8,723 2,976 68 63
32 128 32 0.930 657 2,625 7,017 2,882 71 63
32 128 64 0.975 632 2,500 6,394 2,959 62 32
32 128 128 0.993 553 2,703 6,800 2,825 50 25
32 128 256 0.998 465 2,660 6,914 2,985 34 25
32 256 32 0.946 672 2,646 5,092 2,809 49 65
32 256 64 0.984 618 2,703 4,721 2,874 31 28
32 256 128 0.996 531 2,734 5,129 3,040 79 72
32 256 256 0.999 445 2,740 4,886 2,874 45 32

Qdrant

m ef_construct ef Recall@10 Find by vector Find by condition Create 1M Read Update Delete
16 64 32 0.965 286 2.48 2,227 267 532 267
16 64 64 0.986 244 2.52 2,204 308 557 253
16 64 128 0.995 197 2.39 2,189 629 501 248
16 64 256 0.998 191 2.44 2,176 636 505 250
16 128 32 0.974 199 2.42 2,162 691 527 245
16 128 64 0.993 195 2.55 2,172 641 559 263
16 128 128 0.998 231 2.63 2,166 675 537 253
16 128 256 0.999 176 2.62 2,176 653 559 260
16 256 32 0.982 190 2.58 2,161 633 554 246
16 256 64 0.995 168 2.59 2,157 680 521 253
16 256 128 0.998 174 2.59 2,156 632 561 244
16 256 256 0.999 159 2.54 2,153 616 543 248
32 64 32 0.969 221 2.56 2,176 648 547 263
32 64 64 0.992 254 2.50 2,177 643 541 244
32 64 128 0.997 187 2.56 2,183 616 535 260
32 64 256 0.999 182 2.35 2,178 666 508 250
32 128 32 0.983 184 2.59 2,172 635 533 251
32 128 64 0.995 235 2.62 2,160 663 554 257
32 128 128 0.998 176 2.57 2,161 682 540 262
32 128 256 0.999 165 2.50 2,154 647 541 266
32 256 32 0.990 163 2.59 2,159 676 524 262
32 256 64 0.997 252 2.52 2,161 685 535 255
32 256 128 0.999 262 2.66 2,154 687 560 253
32 256 256 0.999 226 2.62 2,163 676 598 261

Weaviate

m ef_construct ef Recall@10 Find by vector Find by condition Create 1M Read Update Delete
16 64 32 0.871 207 436 1,739 128 205 51
16 64 64 0.939 200 421 1,722 221 241 106
16 64 128 0.977 184 416 1,732 381 262 103
16 64 256 0.984 156 422 1,563 392 277 104
16 128 32 0.897 207 405 1,525 222 217 38
16 128 64 0.960 196 427 1,516 380 236 37
16 128 128 0.988 139 424 1,222 220 218 29
16 128 256 0.996 157 421 1,516 381 208 37
16 256 32 0.906 206 430 1,271 388 189 38
16 256 64 0.965 197 402 1,277 382 187 38
16 256 128 0.990 179 425 1,275 378 194 39
16 256 256 0.997 154 435 1,282 384 198 38
32 64 32 0.895 217 427 978 233 258 45
32 64 64 0.955 141 421 1,064 66 279 39
32 64 128 0.983 185 414 999 364 280 46
32 64 256 0.995 126 422 1,019 370 275 34
32 128 32 0.929 203 402 1,480 222 198 116
32 128 64 0.975 194 418 1,485 223 194 37
32 128 128 0.993 138 409 771 224 187 166
32 128 256 0.998 157 417 723 382 213 118
32 256 32 0.946 155 418 542 216 202 154
32 256 64 0.984 159 318 519 391 143 46
32 256 128 0.996 157 415 733 226 177 39
32 256 256 0.999 151 387 508 378 169 45

Elasticsearch

m ef_construct ef Recall@10 Find by vector Find by condition Create 1M Read Update Delete
16 64 32 0.889 228 133 1,965 71 14 14
16 64 64 0.947 182 159 2,226 84 24 29
16 64 128 0.980 139 157 2,302 77 14 18
16 64 256 0.990 113 169 1,983 85 15 18
16 128 32 0.897 236 155 1,453 71 14 16
16 128 64 0.953 196 151 1,473 66 21 23
16 128 128 0.981 155 141 1,389 82 18 21
16 128 256 0.993 116 143 1,387 60 15 17
16 256 32 0.902 229 155 954 76 15 14
16 256 64 0.958 184 139 878 53 14 16
16 256 128 0.983 151 153 928 86 14 18
16 256 256 0.993 114 147 902 77 16 21
32 64 32 0.945 199 137 2,122 89 14 18
32 64 64 0.976 162 154 2,081 86 19 23
32 64 128 0.992 137 137 1,897 84 14 17
32 64 256 0.997 88 148 2,053 85 19 21
32 128 32 0.954 204 149 1,265 73 14 15
32 128 64 0.984 175 143 1,231 79 20 24
32 128 128 0.996 131 134 1,161 81 16 17
32 128 256 0.999 96 151 1,190 70 15 17
32 256 32 0.959 208 149 702 85 15 14
32 256 64 0.987 165 133 664 64 15 18
32 256 128 0.997 125 151 710 84 14 15
32 256 256 0.999 87 130 672 75 20 17

Redis

m ef_construct ef Recall@10 Find by vector Find by condition Create 1M Read Update Delete
16 64 32 0.872 600 1,585 1,776 1,001 533 39
16 64 64 0.941 563 1,595 1,775 1,018 541 40
16 64 128 0.976 504 1,425 1,776 1,058 504 40
16 64 256 0.991 431 1,795 1,778 1,094 499 39
16 128 32 0.897 579 1,621 1,325 1,099 338 39
16 128 64 0.959 562 1,961 1,386 1,125 432 43
16 128 128 0.988 422 1,420 1,290 804 235 35
16 128 256 0.997 380 1,567 1,251 943 368 37
16 256 32 0.905 587 1,726 916 1,080 313 94
16 256 64 0.965 485 1,527 836 971 191 35
16 256 128 0.990 429 1,616 811 978 273 37
16 256 256 0.998 357 1,493 831 951 225 36
32 64 32 0.896 574 1,799 1,599 1,173 488 165
32 64 64 0.954 537 1,541 1,596 1,083 575 40
32 64 128 0.982 488 1,789 1,597 1,065 560 38
32 64 256 0.994 405 1,695 1,599 1,014 612 40
32 128 32 0.930 558 1,378 1,147 1,060 111 221
32 128 64 0.975 476 1,490 1,049 948 425 37
32 128 128 0.993 402 1,456 1,085 970 327 36
32 128 256 0.998 326 1,570 1,079 943 408 37
32 256 32 0.947 548 1,682 760 1,083 238 111
32 256 64 0.984 438 1,266 671 951 203 34
32 256 128 0.996 364 1,263 662 952 248 35
32 256 256 0.999 318 1,600 652 971 306 35

Vector search recall & QPS

For the find-by-vector task in particular, the chart below depicts Recall@10 (the fraction of true nearest neighbors found, on average over all queries) against QPS. Points further to the upper-right (higher recall at higher QPS) are better.
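
"Upper-right is better" can be made precise: a configuration is interesting only if no other configuration beats it on both recall and QPS at once. A sketch of extracting that (Pareto) frontier from hypothetical (Recall@10, QPS) pairs:

```python
# A point is dominated if another point has both strictly higher recall
# and strictly higher QPS; the frontier keeps only non-dominated points.
def pareto_frontier(points):
    return [p for p in points
            if not any(q[0] > p[0] and q[1] > p[1] for q in points)]

points = [(0.87, 706), (0.94, 662), (0.98, 570), (0.94, 200), (0.99, 483)]
print(sorted(pareto_frontier(points)))
# → [(0.87, 706), (0.94, 662), (0.98, 570), (0.99, 483)]
```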

Rationale on experiment design

Our experiments are designed to be fair and identical across all backends, while favoring document stores that benefit DocArray users the most. Note that such a benchmark was impossible to set up before DocArray, as each store has its own API and the definition of a task varies.

Our benchmark is based on the following principles:

  • Cover the most important operations: We understand that some backends are better at some operations than others, and some offer better quality. Therefore, we try to benchmark on six operations (CRUD + Find by vector + Find by condition) and report quality measurement (Recall@K).
  • Not just speed, but also quality: We show the trade-off between quality and speed as you tune your parameters in each document store.
  • Same experiment, same API: DocArray offers the same API across all backends, so we built the same benchmarking experiment on top of it. Furthermore, we made sure to run the experiment with a series of HNSW parameters for backends that support approximate nearest neighbor search. All backends run in official Docker containers, local to the DocArray client, which keeps network overhead similar across backends. We also allocate the same resources to those Docker containers, and all servers run in a single-node setup.
  • Benefit users as much as possible: We offer the same conditions and resources to all backends, but our experiment favors backends that use resources efficiently. Therefore, some backends might not use the network, or use gRPC instead of HTTP, or use batch operations. We’re okay with that, as long as it benefits the DocArray and Jina user.
  • Open to improvements: We are constantly improving the performance of storage backends from the DocArray side and updating benchmarks accordingly. If you believe we missed an optimization (e.g. perform an operation in batches, benefit from a recent feature in upstream, avoid unnecessary steps), feel free to raise a PR or issue. We’re open to your contributions!

Known limitations

  • Incomplete coverage of document stores: We do not benchmark algorithms or ANN libraries like Faiss, Annoy, or ScaNN; we only benchmark backends that can be used as document stores. In fact, we do not benchmark HNSW itself, although it is used internally by some backends. Other storage backends that support vector search are not yet integrated with DocArray. We’re open to contributions to DocArray’s repository to support them.
  • Client/server setup introduces random network latency: Although a real-life scenario would be the clients and server living on different machines with potentially multiple clients in parallel, we chose to keep both on the same machine and have only one client process to minimize network overhead.
  • Benchmarks are conducted end-to-end: We benchmark function calls from DocArray, not just the underlying backend vector database. Therefore, results for a particular backend can be influenced (positively or negatively) by our interface. If you spot bottlenecks we would be thrilled to know about them and improve our code accordingly.
  • We use similar underlying search algorithms but different implementations: In this benchmark we apply a set of HNSW parameters (ef, ef_construct, and max_connections) equally across all backends. Note that there might be other parameters that storage backends fix internally, which might or might not be accessible, and which can have a big impact on performance. This means that even similar configurations cannot be compared directly.
  • Benchmark is for DocArray users, not for research: This benchmark showcases what a user can expect to get from DocArray without tuning hyper-parameters of a vector database. In practice, we strongly recommend tuning them to achieve high quality results.

Conclusions

We hope our benchmark results can help users select the store that suits their use case. Depending on the dataset size and the desired quality, some stores may be preferable to others. Here are some of our conclusions:

  • If your dataset has fewer than 10,000 Documents, you should just use the in-memory DocumentArray as-is to enjoy the best quality for nearest neighbor search with reasonable latency (e.g. ~20 ms/query).
  • If your dataset is large but still fits into memory (say, 1 million Documents), Redis and AnnLite offer great speed in CRUD and vector search operations. In particular, AnnLite is designed as a monolithic package, which saves a lot of network overhead compared to the other backends. Unfortunately, this also means AnnLite cannot be scaled out natively. Nonetheless, if you are using DocArray together with Jina, you can always leverage Jina’s sharding features, a store-agnostic solution to scale out any document store, including AnnLite.
  • Finally, if your dataset does not fit in memory and you do not care much about the speed of nearest neighbor search, you can use SQLite as storage. If speed does matter, Weaviate, Qdrant, and Elasticsearch are good options.