Documentation

From a pile of vectors to a served query API in 3 minutes

3-Minute Quickstart

Vector Panda takes your embedding vectors and serves them as a high-performance similarity search API. Upload Parquet, CSV, or binary vector files and go from files to live queries in three steps.

1

Install the SDK

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ veep
2

Upload your vectors

from veep import Client

client = Client(api_key="sk_live_your_key_here")

# Upload a Parquet file containing your embeddings
client.upload("my-collection", "embeddings.parquet")

Vector Panda ingests the file, distributes vectors across workers, and begins serving them within seconds.

3

Query for similar vectors

results = client.query(
    "my-collection",
    vector=[0.1, 0.2, 0.3, ...],
    top_k=10,
    include_metadata=True
)

for r in results:
    print(f"{r.key}: {r.score:.4f}")

That's it

Your vectors are now searchable via the API. No index configuration, no cluster setup, no YAML files.

Installation

Install the veep Python SDK. Requires Python 3.9 or later.

Early access

veep is currently published to Test PyPI while in alpha; the package will move to the main PyPI index before GA. Install with the commands below.

# Install from Test PyPI (alpha)
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ veep

# With Parquet upload support
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ "veep[parquet]"

The core package depends only on requests. The [parquet] extra adds pyarrow for file validation during uploads.

Authentication

All API access requires an API key. You can find your key in the dashboard after signing in.

from veep import Client

# Pass the key directly
client = Client(api_key="sk_live_your_key_here")

# Or set the VEEP_API_KEY environment variable
# export VEEP_API_KEY=sk_live_your_key_here
client = Client()  # reads from VEEP_API_KEY

You can also set VEEP_HOST to point at a custom endpoint. By default the SDK connects to https://api.vectorpanda.com.

Keep your key secret

Never commit API keys to version control. Use environment variables or a secrets manager.

Upload Vectors

Vector Panda ingests vectors from multiple file formats. Upload your data and the ingestion pipeline handles the rest.

Supported formats

Format Extensions Description
Apache Parquet .parquet Columnar format with vectors, keys, and metadata columns. Recommended for most use cases.
CSV .csv Comma-separated values with vector columns. Good for small datasets or exports from other tools.
Binary vectors .fvecs, .bvecs, .ivecs Standard binary vector formats (float32, uint8, int32). Common in ANN benchmark datasets.

Basic upload

client.upload("products", "product_embeddings.parquet")

With options (Parquet)

client.upload(
    "products",
    "product_embeddings.parquet",
    vector_column="emb",       # column containing vectors (default: "emb")
    key_column="product_id",    # column for vector keys (auto-generated if omitted)
)

Upload flow

When you call upload():

  1. The file is uploaded to Vector Panda over HTTPS
  2. The ingestion pipeline detects the format from the file extension
  3. Vectors are extracted, distributed across workers, and indexed
  4. Vectors become queryable within seconds of ingestion completing

You can upload multiple files to the same collection. Each upload adds vectors; it does not replace existing ones.

Deleting uploaded files

client.delete("products", "product_embeddings.parquet")

Query Vectors

Search a collection for the most similar vectors to a query vector.

results = client.query(
    "products",             # collection name
    vector=[0.1, 0.2, ...],    # query vector (must match collection dimensions)
    top_k=10,                  # max results (default: 10)
    similarity_threshold=0.7,  # minimum score (default: 0.7)
    distance_metric="cosine",  # cosine | euclidean | dot_product
    include_metadata=True,    # include metadata fields (default: False)
)

Working with results

query() returns a QueryResults object. It is iterable and indexable.

# Iterate
for result in results:
    print(result.key, result.score)

# Index
best = results[0]
print(best.key, best.score, best.metadata)

# Length
print(f"Found {len(results)} matches")

Each Result object has three fields:

Field Type Description
key str Vector identifier
score float Similarity score (higher = more similar)
metadata dict Metadata fields (empty dict if not requested)

Index acceleration

For large collections, you can request index-accelerated search:

results = client.query(
    "products",
    vector=query_vec,
    use_index="pca",
    index_params={"reduced_dimensions": 64},
)

Manage Collections

A collection is a namespace for your vectors. Collections are created automatically when you upload your first file. You can list all collections accessible to your API key.

collections = client.collections()

for col in collections:
    print(f"{col.name}: {col.vector_count} vectors, {col.storage_gb:.2f} GB, tier={col.tier}")

Each Collection object has these fields:

Field Type Description
name str Collection name
tier str Storage tier: hot, warm, or paused
is_active bool Whether the collection is currently queryable
vector_count int Total vectors stored
storage_gb float Storage used in gigabytes

Tiers & Pausing

Every collection runs in one of three storage tiers. New collections default to warm. Free-tier accounts (no payment method) are limited to warm; hot requires a payment method on file.

Tier Storage Queries Use case
Hot NVMe SSD Unlimited, lowest latency Production workloads, real-time search
Warm SSD Unlimited Development, moderate-traffic apps (default)
Paused HDD None (must resume first) Archival, cost savings for inactive collections

Pausing a collection

Pausing moves data to HDD and stops serving queries. Your index parameters and vector data are preserved. Paused collections bill at $0.09/M vectors (base 512D).

Resuming

Resuming rebuilds index artifacts from the HDD copy, distributes them to workers, and promotes the new epoch to serving. Typical resume times: under 30 s for 100K vectors, ~1 min for 1M, ~3-4 min for 10M.

Auto-pause

Free-tier collections with no queries for 30 days are automatically paused. You'll receive an email notification 7 days before auto-pause.

Metadata

Metadata is stored alongside vectors and can be returned with query results. In Parquet files, any column that is not the vector column or key column is treated as metadata. CSV files also support metadata columns.

Supported metadata types: strings, integers, floats, booleans.

# Query with metadata
results = client.query(
    "products",
    vector=query_vec,
    include_metadata=True
)

for r in results:
    print(r.metadata)
    # {"product_name": "Widget", "price": 29.99, "category": "electronics"}

Metadata is fetched on demand

When include_metadata=False (the default), metadata is not loaded during search, keeping queries fast. Set it to True only when you need the extra fields.

Distance Metrics

Vector Panda supports three distance metrics for similarity search. Pass the distance_metric parameter to query().

Metric Value Best for
Cosine similarity "cosine" Text embeddings, normalized vectors (default)
Euclidean distance "euclidean" Spatial data, image features
Dot product "dot_product" Maximum inner product search, recommendation scores

Pricing & Free Tier

Every account gets 250,000 free vector slots at 512 dimensions — no credit card required. The free allowance scales inversely with dimensions: higher dimensions use more storage per vector, so fewer vectors fit in the same space.

Dimension multiplier

All pricing uses 512D as the baseline. The cost multiplier is dimensions / 512.

Dimensions Multiplier Free vectors Example models
128   0.25x 1,000,000 Custom, lightweight
256   0.50x 500,000   E5-small
384   0.75x 333,333   all-MiniLM-L6-v2
512   1.00x 250,000   CLIP ViT-B/32
768   1.50x 166,666   all-mpnet-base-v2
1024  2.00x 125,000   Cohere embed-v3
1536  3.00x 83,333    OpenAI text-embedding-3-small
3072  6.00x 41,666    OpenAI text-embedding-3-large
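The table follows directly from the dimensions / 512 formula. A quick sketch of the arithmetic in plain Python (function names are illustrative, not part of the SDK):

```python
def multiplier(dimensions: int) -> float:
    """Cost multiplier relative to the 512D baseline."""
    return dimensions / 512

def free_vectors(dimensions: int, base_free: int = 250_000) -> int:
    """Free vector slots at a given dimension (base allowance: 250K at 512D)."""
    return int(base_free / multiplier(dimensions))

print(multiplier(768))      # 1.5
print(free_vectors(768))    # 166666
print(free_vectors(1536))   # 83333
```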

Storage rates

Tier Rate (per M vectors/mo, 512D)
Hot    $2.99
Warm   $0.49
Paused $0.09

Cost examples

Free tier: 250K vectors at 512D (or 83K at 1536D) — $0/mo.

1M vectors, 768D, warm: 1M × 1.5x × $0.49 = $0.74/mo before the free-tier deduction (166,666 vectors free at 768D).

5M vectors, 1536D, hot: 5M × 3.0x × $2.99 = $44.85/mo, minus free tier (83K vectors free ≈ $0.75 off).
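These examples can be reproduced with a small estimator. This is a sketch of the published arithmetic (multiplier × rate, minus the free allowance), not an official billing tool:

```python
RATES = {"hot": 2.99, "warm": 0.49, "paused": 0.09}  # $ per M vectors/mo at 512D

def monthly_cost(vectors: int, dimensions: int, tier: str,
                 free_at_512d: int = 250_000) -> float:
    """Estimated monthly storage cost in dollars, after the free allowance."""
    mult = dimensions / 512
    free = free_at_512d / mult            # free slots shrink as dimensions grow
    billable = max(vectors - free, 0)
    return round(billable / 1_000_000 * mult * RATES[tier], 2)

print(monthly_cost(5_000_000, 1536, "hot"))  # 44.1 ($44.85 minus the ~$0.75 free allowance)
```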

No base fee

There is no monthly minimum or platform fee. You pay only for storage above the free tier.

Rate Limits

Rate limits are applied per API key. If you exceed the limit, the server returns HTTP 429 Too Many Requests. Retry after the delay indicated in the Retry-After header.

Tier Queries/sec Uploads/min
Free 10  5
Warm 50  20
Hot  200 60

# Handle rate limiting
from veep import ServerError

try:
    results = client.query("products", vector=vec)
except ServerError as e:
    if e.status_code == 429:
        print("Rate limited — slow down or upgrade tier")
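One common way to handle 429s is a small retry wrapper with exponential backoff. This is a generic sketch (the helper name, backoff schedule, and attempt count are arbitrary choices, not SDK behavior):

```python
import time

def with_backoff(fn, max_attempts=4, base_delay=1.0):
    """Call fn(), retrying on rate-limit errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as e:
            # Retry only on HTTP 429; re-raise everything else immediately
            if getattr(e, "status_code", None) != 429 or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Usage (assuming a client from the earlier sections):
# results = with_backoff(lambda: client.query("products", vector=vec))
```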

Error Handling

The SDK raises typed exceptions for different error conditions. All exceptions inherit from VeepError.

from veep import Client, AuthError, NotFoundError, ServerError, UploadError

try:
    results = client.query("my-collection", vector=vec)
except AuthError:
    print("Invalid API key")
except ServerError as e:
    print(f"Server error (HTTP {e.status_code}): {e}")

Exception When
AuthError API key missing, invalid, or rejected (HTTP 401)
NotFoundError Collection or file not found (HTTP 404)
ServerError Unexpected server error (HTTP 5xx). Has .status_code
UploadError File not found, wrong format, or missing vector column

File Formats

Parquet (recommended)

Apache Parquet is the recommended format. It supports vector columns, key columns, and arbitrary metadata columns in a single file.

# Minimum: a vector column named "emb"
emb: list<float32>[768]    # fixed-size list of your embedding dimensions

# Recommended: vector column + key column + metadata
emb: list<float32>[768]    # embedding vectors
id: string                  # unique key per vector
title: string               # metadata
category: string            # metadata
price: float64              # metadata

Creating a Parquet file

import pyarrow as pa
import pyarrow.parquet as pq

# Your data
vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
keys = ["doc_1", "doc_2"]
titles = ["First document", "Second document"]

table = pa.table({
    "emb": vectors,
    "id": keys,
    "title": titles,
})

pq.write_table(table, "my_embeddings.parquet")

CSV

Plain CSV files with vector values. Useful for small datasets or quick experiments.

Binary vector formats (fvecs / bvecs / ivecs)

Standard binary formats used by ANN benchmark datasets (SIFT, GloVe, etc.). Each file contains a sequence of vectors, each prefixed with its dimension count.

  • .fvecs — float32 vectors
  • .bvecs — uint8 vectors
  • .ivecs — int32 vectors
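The layout is simple enough to handle with the standard library. A minimal sketch of the .fvecs convention (a little-endian int32 dimension count followed by that many float32 values, repeated per vector); the helper names are illustrative:

```python
import struct

def write_fvecs(path, vectors):
    """Write float32 vectors in .fvecs layout: <int32 dim> then <dim float32s>."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(struct.pack("<i", len(vec)))
            f.write(struct.pack(f"<{len(vec)}f", *vec))

def read_fvecs(path):
    """Read all vectors from an .fvecs file."""
    vectors = []
    with open(path, "rb") as f:
        while header := f.read(4):
            (dim,) = struct.unpack("<i", header)
            vectors.append(list(struct.unpack(f"<{dim}f", f.read(4 * dim))))
    return vectors

write_fvecs("sample.fvecs", [[0.5, 1.0, -2.0]])
print(read_fvecs("sample.fvecs"))  # [[0.5, 1.0, -2.0]]
```

.bvecs and .ivecs follow the same layout with uint8 and int32 payloads respectively.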

Generating Embeddings

Vector Panda stores and searches vectors. You generate them using any embedding model. Here are examples with popular providers.

OpenAI

from openai import OpenAI

openai = OpenAI()
response = openai.embeddings.create(
    input="Vector databases are fast",
    model="text-embedding-3-small"  # 1536 dimensions
)
vector = response.data[0].embedding

Sentence Transformers

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384 dimensions
vector = model.encode("Vector databases are fast").tolist()

Common dimension sizes

Model Dimensions
OpenAI text-embedding-3-small 1,536
OpenAI text-embedding-3-large 3,072
all-MiniLM-L6-v2 384
CLIP ViT-B/32 512
Cohere embed-english-v3 1,024

Best Practices

Keep dimensions consistent

All vectors in a collection must have the same number of dimensions. Use the same embedding model for all data in a collection and for query vectors.
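A cheap pre-upload guard catches dimension mismatches before they reach the API. A sketch in plain Python (the function name is illustrative):

```python
def check_dimensions(vectors):
    """Raise if the vectors don't all share one dimension; return it otherwise."""
    dims = {len(v) for v in vectors}
    if len(dims) != 1:
        raise ValueError(f"Inconsistent vector dimensions found: {sorted(dims)}")
    return dims.pop()

print(check_dimensions([[0.1, 0.2], [0.3, 0.4]]))  # 2
```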

Use cosine similarity for text

Most text embedding models produce normalized vectors. Cosine similarity (the default) is the correct metric for these. Use dot product only if your model documentation specifically recommends it.
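The reason cosine is safe for normalized embeddings: once vectors have unit length, cosine similarity and dot product coincide. A quick numeric check in plain Python (helper names are illustrative):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.hypot(*a) * math.hypot(*b))

def normalize(v):
    n = math.hypot(*v)
    return [x / n for x in v]

a, b = normalize([3.0, 4.0]), normalize([1.0, 2.0])
# For unit-length vectors, the two metrics agree
print(math.isclose(cosine(a, b), dot(a, b)))  # True
```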

Request metadata only when needed

Leave include_metadata=False (the default) for search-only workloads. Metadata lookup adds a round-trip to the storage layer. Request it when you need to display or filter on metadata fields.

Batch your uploads

Combine vectors into fewer, larger Parquet files rather than uploading many small files. Each file triggers an ingestion cycle, so larger files are more efficient.

Health check

Use client.health() to verify connectivity before running queries in scripts or CI pipelines.