Optimizing Search Speed for Vector Similarity in a Filtered Collection Schema #34503

JaeHyeonSoon · 2024-07-09T01:59:55Z

Hello,
My collection schema is as follows:

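from pymilvus import CollectionSchema, DataType, FieldSchema

fields = [
    # Primary key
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=512),
    # Scalar field used in the filter (word == "apple")
    FieldSchema(name="word", dtype=DataType.VARCHAR, max_length=512),
    # Scalar field returned with each search hit
    FieldSchema(name="tag", dtype=DataType.VARCHAR, max_length=16),
    # Embedding used for the similarity search
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=768)
]

schema = CollectionSchema(fields)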

I want to filter by word == "apple", run a vector similarity search over only the matching rows, and return the tag values of the 10 most similar results. How can I optimize the search speed for this task?
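
For concreteness, the query described above can be sketched as the keyword arguments for pymilvus's Collection.search. This is only an illustration under assumptions: the helper name build_filtered_search_kwargs is hypothetical, and the metric type ("L2") and the nprobe value are placeholders that would have to match whatever index is actually built on the vector field.

```python
from typing import Any, Dict, List, Optional

VECTOR_DIM = 768  # matches dim=768 in the schema above


def build_filtered_search_kwargs(
    query_vector: List[float],
    word: str,
    top_k: int = 10,
    search_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for Collection.search (hypothetical helper).

    The result filters on the scalar field `word`, searches the `vector`
    field, and asks for `tag` to be returned with each hit.
    """
    if len(query_vector) != VECTOR_DIM:
        raise ValueError(f"expected a {VECTOR_DIM}-dim vector, got {len(query_vector)}")
    if '"' in word:
        # Kept simple: this sketch does not escape quotes inside the filter.
        raise ValueError("word must not contain double quotes")

    if search_params is None:
        # Assumption: placeholder values; they depend on the index type
        # and metric chosen when the vector index was created.
        search_params = {"metric_type": "L2", "params": {"nprobe": 16}}

    return {
        "data": [query_vector],           # one query vector
        "anns_field": "vector",           # vector field to search
        "param": search_params,
        "limit": top_k,                   # top 10 by default
        "expr": f'word == "{word}"',      # scalar filter
        "output_fields": ["tag"],         # only tag is needed per hit
    }


# Usage against a loaded collection (requires a running Milvus instance):
#   results = collection.search(**build_filtered_search_kwargs(vec, "apple"))
#   tags = [hit.entity.get("tag") for hit in results[0]]
```

Requesting only tag in output_fields keeps the text and word columns out of the result set, so less data is fetched per hit.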

Thanks.