Stop response generation in langchain framework #36

2026-02-16T00:15:17-05:00

yindo commented

Originally created by @Aniketparab1999 on GitHub (Mar 18, 2024).

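Python code:

    from langchain.chains import RetrievalQA

    # turbo_llm and compression_retriever are defined elsewhere in my script.
    qa_chain = RetrievalQA.from_chain_type(
        llm=turbo_llm,
        chain_type="stuff",
        retriever=compression_retriever,
        return_source_documents=True,
    )
    response = qa_chain("What is Langchain?")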

This is the Python code I am using to query a PDF with a RAG approach.
My requirement: if generating the response takes more than 1 minute, generation should be stopped on the backend.
How can I do that? Is there a Python pattern or architecture for this?
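One possible direction, sketched here under assumptions rather than as a confirmed LangChain recipe, is to run the call asynchronously and wrap it in `asyncio.wait_for`, which cancels the awaited task once the deadline passes. The names `call_with_timeout` and `fake_chain` below are hypothetical; `fake_chain` only stands in for the real chain so the sketch runs without LangChain installed.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


async def call_with_timeout(
    make_call: Callable[[], Awaitable[Any]],
    timeout_s: float = 60.0,
) -> Optional[Any]:
    """Await make_call(); cancel it and return None if it exceeds timeout_s."""
    try:
        # wait_for cancels the inner task when the deadline passes.
        return await asyncio.wait_for(make_call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


async def fake_chain(query: str, delay_s: float) -> dict:
    """Hypothetical stand-in for an async chain call: sleeps, then answers."""
    await asyncio.sleep(delay_s)
    return {"query": query, "result": "stub answer"}


# Finishes well inside its deadline, so the answer dict comes back.
fast = asyncio.run(
    call_with_timeout(lambda: fake_chain("What is Langchain?", 0.01), timeout_s=1.0)
)

# Exceeds its deadline, so the call is cancelled and None comes back.
slow = asyncio.run(
    call_with_timeout(lambda: fake_chain("What is Langchain?", 1.0), timeout_s=0.05)
)
```

With the chain from the question, the callable might be `lambda: qa_chain.ainvoke({"query": "What is Langchain?"})` with `timeout_s=60`, assuming the installed LangChain version exposes `ainvoke` on the chain. Cancellation only stops work that is actually running asynchronously. If the LLM or retriever falls back to synchronous code in a worker thread, that thread may keep running after the timeout, so setting a request timeout on the LLM client as well is probably worth checking.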

Reference: run-llama/rags#36