[PR #48] [MERGED] feat: add search by metadata function and test #66

Closed
opened 2026-02-16 06:15:59 -05:00 by yindo · 0 comments

📋 Pull Request Information

Original PR: https://github.com/langchain-ai/langchain-milvus/pull/48
Author: @sunwupark
Created: 3/13/2025
Status: Merged
Merged: 3/31/2025
Merged by: @zc277584121

Base: main ← Head: main


📝 Commits (2)

  • aed2ea2 feat: add search by metadata function and test
  • 32ae92d feat: applied linting

📊 Changes

2 files changed (+83 additions, -0 deletions)

📝 libs/milvus/langchain_milvus/vectorstores/milvus.py (+38 -0)
📝 libs/milvus/tests/integration_tests/vectorstores/test_milvus.py (+45 -0)

📄 Description

I have implemented a new method, search_by_metadata, for the Milvus vector store. This method allows users to query documents based on metadata conditions without performing a vector similarity search.

Issue

#47

Key Features:

  • Enables metadata filtering using an expression (e.g., "city == 'Seoul'").
  • Supports specifying output fields to retrieve specific metadata.
  • Returns results as a list of Document objects.
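As an illustration of the features listed above, a call could look like the following sketch. It is written against any object that exposes search_by_metadata, so it runs without a Milvus server. The helper name documents_in_seoul, the SupportsMetadataSearch protocol, and the field names "text" and "city" are assumptions made for this example and are not part of the PR.

```python
from typing import Any, List, Optional, Protocol


class SupportsMetadataSearch(Protocol):
    """Hypothetical protocol: any store exposing the method added in this PR."""

    def search_by_metadata(
        self, expr: str, fields: Optional[List[str]] = None, limit: int = 10
    ) -> List[Any]:
        ...


def documents_in_seoul(store: SupportsMetadataSearch, limit: int = 5) -> List[Any]:
    """Return up to `limit` documents whose `city` metadata equals 'Seoul'."""
    return store.search_by_metadata(
        expr="city == 'Seoul'",    # filter expression taken from the PR's example
        fields=["text", "city"],   # assumed field names; the text field is included
        limit=limit,
    )
```

The text field is requested explicitly because the implementation reads it from every returned row to build page_content.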

Code Implementation:

def search_by_metadata(
    self,
    expr: str,
    fields: Optional[List[str]] = None,
    limit: int = 10,
) -> List[Document]:
    """
    Searches the Milvus vector store based on metadata conditions.

    Args:
        expr (str): A filtering expression (e.g., `"city == 'Seoul'"`).
        fields (Optional[List[str]]): List of fields to retrieve.
                                      If None, retrieves all available fields.
        limit (int): Maximum number of results to return.

    Returns:
        List[Document]: List of documents matching the metadata filter.
    """
    from pymilvus import MilvusException

    # Without a collection there is nothing to query.
    if self.col is None:
        logger.debug("No existing collection to search.")
        return []

    # Default to every field known to the store.
    if fields is None:
        fields = self.fields

    try:
        # Scalar query only: no embedding or similarity search is involved.
        results = self.col.query(expr=expr, output_fields=fields, limit=limit)
        # Each row becomes a Document; the full row, text field included,
        # is passed through as its metadata.
        return [
            Document(page_content=result[self._text_field], metadata=result)
            for result in results
        ]
    except MilvusException as e:
        # Query failures are logged and reported as an empty result.
        logger.error(f"Metadata search failed: {e}")
        return []

Testing:

I have also written a test function to verify the functionality of search_by_metadata. Let me know if any modifications or additional tests are needed.
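The test added in this PR is an integration test and needs a running Milvus instance. For readers who want to follow the method's three code paths without one, here is a rough stand-alone sketch. Everything in it is a stand-in: Doc replaces Document, FakeQueryError replaces MilvusException, and FakeCollection and FakeStore are hypothetical classes that only mimic the control flow shown above. The fake query understands nothing beyond a single `key == 'value'` comparison.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)


@dataclass
class Doc:
    """Stand-in for Document (assumption, avoids the real dependency)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class FakeQueryError(Exception):
    """Stand-in for pymilvus.MilvusException."""


class FakeCollection:
    """Hypothetical in-memory collection; supports only `key == 'value'`."""

    def __init__(self, rows: List[Dict[str, Any]], fail: bool = False):
        self.rows = rows
        self.fail = fail

    def query(self, expr: str, output_fields: List[str], limit: int):
        if self.fail:
            raise FakeQueryError("simulated failure")
        key, value = (part.strip() for part in expr.split("=="))
        value = value.strip("'\"")
        matches = [r for r in self.rows if str(r.get(key)) == value]
        return [{f: r[f] for f in output_fields} for r in matches[:limit]]


class FakeStore:
    """Mirrors the control flow of search_by_metadata from this PR."""

    def __init__(self, col, fields: List[str], text_field: str = "text"):
        self.col = col
        self.fields = fields
        self._text_field = text_field

    def search_by_metadata(
        self, expr: str, fields: Optional[List[str]] = None, limit: int = 10
    ) -> List[Doc]:
        if self.col is None:
            return []
        if fields is None:
            fields = self.fields
        try:
            rows = self.col.query(expr=expr, output_fields=fields, limit=limit)
            return [Doc(page_content=r[self._text_field], metadata=r) for r in rows]
        except FakeQueryError as e:
            logger.error("Metadata search failed: %s", e)
            return []


def test_search_by_metadata_paths() -> None:
    rows = [
        {"text": "a", "city": "Seoul"},
        {"text": "b", "city": "Busan"},
        {"text": "c", "city": "Seoul"},
    ]
    store = FakeStore(FakeCollection(rows), fields=["text", "city"])

    # Matching rows come back as documents, in insertion order.
    docs = store.search_by_metadata("city == 'Seoul'")
    assert [d.page_content for d in docs] == ["a", "c"]
    # The limit caps the number of results.
    assert len(store.search_by_metadata("city == 'Seoul'", limit=1)) == 1
    # No collection, or a failing query, both yield an empty list.
    assert FakeStore(None, ["text"]).search_by_metadata("city == 'Seoul'") == []
    broken = FakeStore(FakeCollection(rows, fail=True), ["text", "city"])
    assert broken.search_by_metadata("city == 'Seoul'") == []
```

Because a failed query and an empty match both return an empty list, a caller cannot tell the two apart from the return value alone; the error log is the only signal.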


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

yindo added the pull-request label 2026-02-16 06:15:59 -05:00
yindo closed this issue 2026-02-16 06:16:00 -05:00
Reference: langchain-ai/langchain-milvus#66