[Bug]: Scalar indexes cannot search out data #34548
Comments
I just backed up this collection to another cluster, and the problem can still be reproduced there. After deleting the collection from the backup cluster and then backing up and restoring it again, the problem can no longer be reproduced.
/assign @zhagnlu
I don't think this is caused by scalar filtering preventing HNSW from traversing its layers: for another set of data, the same scalar-filtered search does return data after several attempts, and no data operations were performed during this period.
This collection is configured with replicas. Could this be caused by index differences between the replicas?
@zhagnlu There is no difference; repeated requests return completely different results.
If you use a plain query instead of hybrid search, do repeated requests still return completely different results?
Vector search and query both return normally.
@zhagnlu Another symptom is that some searches return nothing at all once a scalar filter is added, even though the vector search alone returns results and the scalar filter does match data. In those cases hybrid search returns an empty result.
@yanliang567 Yes, sometimes the returned results are inconsistent, and sometimes they are incorrect. This appears only with the HNSW index combined with scalar filtering.
Regenerated debug log
@yanliang567 @cydrain @liliu-z Can you help us check together?
@syang Could you please tell us the filter_rate and index building parameters?
@alwayslove2013 This collection has fewer than 20,000 entities, and I believe the M and efConstruction values are large enough. I am aware of the data-island problem that arises when scalar filtering is combined with HNSW, and I have previously investigated it and adjusted the index build parameters.
Hi @syang1997, can you share your script to reproduce this issue?
I'm writing a demo to reproduce this issue.
Hi @syang1997, one more question: I see you're using Milvus v2.3.15. Have you tried Milvus v2.4.x?
@syang1997
We have already discussed this with the community once, and the preliminary conclusion is that the cause is still the data-island problem mentioned earlier.
@xiaofan-luan The symptom is that nothing is returned at all, rather than fewer than topK results, so I suspect that all of the first-layer nodes of HNSW are filtered out.
After discussion, the likely cause is that the filter removed 70-80% of the data from the HNSW search, which breaks graph connectivity.
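To make the connectivity explanation above concrete, here is a minimal toy sketch. It is not Milvus or knowhere code: it assumes a single-layer chain graph and a search that never steps onto filtered-out nodes, which is only a simplified model of the suspected failure mode. All names (`build_chain_graph`, `filtered_graph_search`, `brute_force_search`) are hypothetical.

```python
from collections import deque


def build_chain_graph(n, m=2):
    """Toy single-layer graph: node i is linked to the m ids closest to it."""
    graph = {}
    for i in range(n):
        by_distance = sorted((abs(i - j), j) for j in range(n) if j != i)
        graph[i] = [j for _, j in by_distance[:m]]
    return graph


def filtered_graph_search(graph, entry, query, passes_filter, k):
    """Walk the graph from `entry`, never stepping onto a filtered-out node.

    This models the assumed failure mode only; it is not how any real
    HNSW implementation is guaranteed to behave.
    """
    visited = {entry}
    frontier = deque([entry])
    candidates = [entry] if passes_filter(entry) else []
    while frontier:
        node = frontier.popleft()
        for neighbour in graph[node]:
            if neighbour in visited:
                continue
            visited.add(neighbour)
            if passes_filter(neighbour):
                candidates.append(neighbour)
                frontier.append(neighbour)
    return sorted(candidates, key=lambda j: abs(j - query))[:k]


def brute_force_search(n, query, passes_filter, k):
    """Exact filtered search: scan every id that passes the filter."""
    matches = [j for j in range(n) if passes_filter(j)]
    return sorted(matches, key=lambda j: abs(j - query))[:k]


def keep_20_percent(j):
    """Hypothetical scalar filter that excludes 80% of the ids."""
    return j % 5 == 3


graph = build_chain_graph(100)
graph_hits = filtered_graph_search(graph, entry=0, query=40,
                                   passes_filter=keep_20_percent, k=5)
exact_hits = brute_force_search(100, query=40,
                                passes_filter=keep_20_percent, k=5)
unfiltered_hits = filtered_graph_search(graph, 0, 40, lambda j: True, 5)

print(graph_hits)   # [] : every neighbour of the entry point is filtered out
print(exact_hits)   # [38, 43, 33, 48, 28] : matching data does exist
```

In this toy model the filtered graph search returns nothing at all, not merely fewer than topK results, while an exact scan with the same filter finds matches. That is the same shape as the symptom reported in this issue, under the stated assumptions.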
This fix will be released with 2.4.7
/assign @yanliang567
Is there an existing issue for this?
Environment
Current Behavior
Hybrid search returns no data, but a separate query with the same scalar filter does return data
![img_3](https://cdn.statically.io/img/private-user-images.githubusercontent.com/48277927/346984958-f7bc02b7-61de-4564-b176-c9b753f766c0.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE5NjI0ODksIm5iZiI6MTcyMTk2MjE4OSwicGF0aCI6Ii80ODI3NzkyNy8zNDY5ODQ5NTgtZjdiYzAyYjctNjFkZS00NTY0LWIxNzYtYzliNzUzZjc2NmMwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI2VDAyNDk0OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTBlZGZkZGM4OWJlMDBiMzAzMTBjMWE3MmVkYWE1MzQ1MTgzZmM1YzA4NTAwMjNiOWJhZDIzN2M3MmY1NWUzZTUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.CgNYDMXmcNGTL5tzlOf7ny4CRQnyWOBqFbOdk4aiehc)
![img_4](https://cdn.statically.io/img/private-user-images.githubusercontent.com/48277927/346984977-95e5651e-1560-45bf-a264-f3ea0764b97e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE5NjI0ODksIm5iZiI6MTcyMTk2MjE4OSwicGF0aCI6Ii80ODI3NzkyNy8zNDY5ODQ5NzctOTVlNTY1MWUtMTU2MC00NWJmLWEyNjQtZjNlYTA3NjRiOTdlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI2VDAyNDk0OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTkxMThlMWEwYjZmYTY3NDIwODM3YzE2OGM4NWY1NDgzYTM0MjQwOWVmOWM1ZmNiYzllOGQxNzhjMzMzMDY5YWUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.5BsDp8PLIjvPAY_kzmfeBY4xTVmPwQIoKLw5SWRAiio)
![img_5](https://cdn.statically.io/img/private-user-images.githubusercontent.com/48277927/346985167-bc4cfbc5-d69e-43ed-b972-63c379cfc1a2.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjE5NjI0ODksIm5iZiI6MTcyMTk2MjE4OSwicGF0aCI6Ii80ODI3NzkyNy8zNDY5ODUxNjctYmM0Y2ZiYzUtZDY5ZS00M2VkLWI5NzItNjNjMzc5Y2ZjMWEyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MjYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzI2VDAyNDk0OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWQ3OWQyODFkMGIyZmE3ZTc4Mzc4YzUxZTY4OTYzMjM0MjU3MDc1NGU1N2YzOWVlNmI2NTBlOGNkZTNiNDUwNWImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.00rQ4t4Gdr7WODdJ1t9N_YLl7ckCzBw13_5S5sMo7Ag)
The scalar query on its own returns data
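For anyone trying to check the same behavior, a minimal sketch of the comparison might look like the following. It assumes a pymilvus `Collection` object that is already loaded, and uses its `query` and `search` methods. The helper name, the default metric type, and the `ef` value are assumptions for illustration, not the reporter's actual settings; the metric type must match the one the index was built with.

```python
def compare_query_and_filtered_search(collection, expr, query_vector, anns_field,
                                      metric_type="L2", ef=64, top_k=10):
    """Return (rows matched by the scalar filter, hits from the filtered search).

    `collection` is expected to behave like a loaded pymilvus Collection.
    metric_type and ef are placeholder assumptions; use the values that
    match your own index.
    """
    # Scalar-only query: how many entities satisfy the filter expression?
    rows = collection.query(expr=expr)

    # Vector search restricted by the same filter expression.
    results = collection.search(
        data=[query_vector],
        anns_field=anns_field,
        param={"metric_type": metric_type, "params": {"ef": ef}},
        limit=top_k,
        expr=expr,
    )
    return len(rows), len(results[0])
```

A result such as `(n, 0)` with `n > 0` would match the behavior described above: the filter matches data, but the filtered vector search returns nothing.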
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
milvus-log (1).tar.gz
Anything else?
No response