Metadata

vecs allows you to associate key-value pairs of metadata with indexes and ids in your collections. You can then add filters to queries that reference the metadata.
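As a rough sketch of how metadata and filters fit together, the function below upserts two records with a `year` field and then runs a filtered query. The collection name `docs`, the vectors, and the connection string argument are placeholders. The keyword names (`records`, `data`, `filters`) follow recent vecs releases and may differ in older versions.

```python
# Filter in the metadata query language: keep only records whose "year" is 2012.
YEAR_FILTER = {"year": {"$eq": 2012}}


def query_by_year(db_connection: str):
    """Upsert two records with metadata, then run a filtered similarity query."""
    import vecs  # imported lazily so this module loads without a database driver

    vx = vecs.create_client(db_connection)
    docs = vx.get_or_create_collection(name="docs", dimension=3)

    # Each record is a tuple of (id, vector, metadata).
    docs.upsert(
        records=[
            ("vec0", [0.1, 0.2, 0.3], {"year": 1973}),
            ("vec1", [0.7, 0.8, 0.9], {"year": 2012}),
        ]
    )

    # Results are restricted to records whose metadata matches the filter.
    return docs.query(data=[0.4, 0.5, 0.6], limit=1, filters=YEAR_FILTER)
```

Calling `query_by_year` requires a reachable Postgres database with the vector extension available, so it is defined here but not invoked.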

Types

Metadata is stored as binary JSON. As a result, allowed metadata types are drawn from JSON primitive types.

  • Boolean
  • String
  • Number

The technical limit of a metadata field associated with a vector is 1GB. In practice you should keep metadata fields as small as possible to maximize performance.
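One way to catch unsupported values before they reach the database is a small client-side check. The helper below is hypothetical and not part of vecs. It accepts Booleans, strings, and numbers, plus lists of those (an assumption, since the `$contains` operator targets array fields), and returns the size of the JSON text encoding, which only approximates the size of the stored binary JSON.

```python
import json

SCALAR_TYPES = (bool, str, int, float)


def check_metadata(metadata: dict) -> int:
    """Raise TypeError on unsupported values; return the JSON-encoded size in bytes."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} is not a string")
        # Treat a list as a collection of scalars; wrap a scalar so one loop covers both.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if not isinstance(item, SCALAR_TYPES):
                raise TypeError(f"metadata field {key!r} holds unsupported value {item!r}")
    # Size of the JSON text, used here as a rough proxy for the stored size.
    return len(json.dumps(metadata).encode("utf-8"))
```

Nested objects and `None` are rejected by this sketch because they fall outside the three listed types.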

Metadata Query Language

The metadata query language is loosely based on MongoDB's selectors.

vecs currently supports a subset of those operators.

Comparison Operators

Comparison operators compare a provided value with a value stored in a metadata field of the vector store.

Operator     Description
$eq          Matches values that are equal to a specified value
$ne          Matches values that are not equal to a specified value
$gt          Matches values that are greater than a specified value
$gte         Matches values that are greater than or equal to a specified value
$lt          Matches values that are less than a specified value
$lte         Matches values that are less than or equal to a specified value
$in          Matches values that are contained in a scalar list of specified values
$contains    Matches values where a scalar is contained within an array metadata field
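The semantics of these operators can be illustrated with a small in-memory matcher. This is only a sketch for reasoning about filters. It is not how vecs evaluates them, and the `matches` helper and its handling of missing fields are assumptions.

```python
import operator

# Each operator name maps to a function (stored_value, specified_value) -> bool.
COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    # The stored scalar must appear in the specified list.
    "$in": lambda stored, specified: stored in specified,
    # The specified scalar must appear in the stored array.
    "$contains": lambda stored, specified: isinstance(stored, list) and specified in stored,
}


def matches(metadata: dict, clause: dict) -> bool:
    """Evaluate a single-field clause such as {"year": {"$eq": 2020}}."""
    field, condition = next(iter(clause.items()))
    op, specified = next(iter(condition.items()))
    if field not in metadata:
        return False  # assumption of this sketch: a missing field never matches
    return COMPARISONS[op](metadata[field], specified)
```

For example, `matches({"tags": ["important"]}, {"tags": {"$contains": "important"}})` evaluates the clause against an array field, while `$in` goes the other way and tests a stored scalar against a provided list.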

Logical Operators

Logical operators compose other operators, and can be nested.

Operator    Description
$and        Joins query clauses with a logical AND; returns all documents that match the conditions of both clauses.
$or         Joins query clauses with a logical OR; returns all documents that match the conditions of either clause.
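To make the nesting concrete, here is a minimal recursive evaluator, assuming each filter is a single-key dict. It is an illustration of the semantics rather than vecs's implementation, and to stay short it handles only `$eq` at the leaves. The `evaluate` function and the `nested` filter are hypothetical.

```python
def evaluate(filt: dict, metadata: dict) -> bool:
    """Recursively evaluate a filter built from $and, $or and $eq leaves."""
    key, value = next(iter(filt.items()))
    if key == "$and":
        # Every nested clause must match.
        return all(evaluate(clause, metadata) for clause in value)
    if key == "$or":
        # At least one nested clause must match.
        return any(evaluate(clause, metadata) for clause in value)
    # Leaf clause of the form {"field": {"$eq": expected}}.
    op, expected = next(iter(value.items()))
    if op != "$eq":
        raise ValueError(f"operator {op!r} is not handled in this sketch")
    return key in metadata and metadata[key] == expected


# An $and nested inside an $or.
nested = {
    "$or": [
        {"year": {"$eq": 2020}},
        {"$and": [{"priority": {"$eq": "pro"}}, {"is_priority_customer": {"$eq": True}}]},
    ]
}
```

A record matches `nested` if its `year` is 2020, or if it is both a `pro` record and a priority customer.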

Performance

For best performance, use scalar key-value pairs for metadata and prefer $eq, $and and $or filters where possible. Those variants are most consistently able to make use of indexes.
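Following that guidance, an `$in` filter over a short list can be expressed with the preferred operators instead. The helper below is a hypothetical convenience, not part of vecs. Whether the rewritten form is actually faster depends on the index and the query planner, so it is worth measuring on real data.

```python
def in_as_or_of_eq(field: str, values: list) -> dict:
    """Rewrite {"field": {"$in": values}} as an $or of $eq clauses."""
    return {"$or": [{field: {"$eq": value}} for value in values]}


# Logically equivalent to {"priority": {"$in": ["enterprise", "pro"]}}.
priority_filter = in_as_or_of_eq("priority", ["enterprise", "pro"])
```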

Examples


year equals 2020


{"year": {"$eq": 2020}}


year equals 2020 or gross greater than or equal to 5000.0


{
  "$or": [
    {"year": {"$eq": 2020}},
    {"gross": {"$gte": 5000.0}}
  ]
}


last_name is less than "Brown" and is_priority_customer is true


{
  "$and": [
    {"last_name": {"$lt": "Brown"}},
    {"is_priority_customer": {"$eq": true}}
  ]
}


priority contained by ["enterprise", "pro"]


{
  "priority": {"$in": ["enterprise", "pro"]}
}

tags, an array, contains the string "important"


{
  "tags": {"$contains": "important"}
}