mirror of
https://github.com/open-webui/cookbook.git
synced 2026-07-01 20:24:03 -04:00
Merge pull request #5 from westbrook-ai/pharma-blog
feat: Add Pharmaceutical Industry Blog
+113
-65
@@ -1,91 +1,143 @@
# What Would It Take for a Pharma Company to Run AI On Its Own Infrastructure?
# What Would It Take for a Pharma Company to Run AI In-House?
*For R&D leaders, CIOs, and digital transformation executives evaluating AI solutions for their organization.*
*This article is for informational purposes only and does not constitute regulatory, legal, or compliance advice. Organizations should evaluate AI deployments with qualified professionals based on their specific regulatory obligations and risk profile.*
*This article is for informational purposes only and does not constitute scientific, medical, regulatory, legal, or compliance advice. Organizations should evaluate AI deployments with qualified counsel based on their regulatory environment, therapeutic areas, and data governance obligations.*
<!-- TODO: Replace with hero image for social sharing previews -->

---
## The Problem With Third-Party AI in Pharma
## Why This Question Matters Now
In 2023, Samsung [banned employees from using ChatGPT](https://mashable.com/article/samsung-chatgpt-leak-leads-to-employee-ban) after engineers inadvertently uploaded proprietary source code and internal meeting notes to the service - data that could be stored on external servers and potentially used for model training. Samsung wasn't a pharma company, but the lesson landed hard across the industry: if it can happen with source code, it can happen with compound structures, clinical trial data, and manufacturing processes.
According to a Mordor Intelligence estimate, the pharmaceutical industry's AI market is [projected to reach $25.7 billion by 2030](https://www.mordorintelligence.com/industry-reports/artificial-intelligence-in-pharmaceutical-market), up from roughly $4 billion at the time of that analysis. Stanford's 2025 AI Index [estimated more than $250 billion](https://hai.stanford.edu/ai-index/2025-ai-index-report/economy) in AI investment across all sectors. Yet amid this surge, drugmakers have, by and large, yet to see substantially shorter development timelines or meaningful improvements in clinical success rates. The open question isn't whether to adopt AI, but how to adopt it effectively.
The pharma industry's exposure is particularly acute. A 2024 Kiteworks study found that [**83% of pharmaceutical companies operate without automated safeguards**](https://www.contractpharma.com/exclusives/ai-data-security-the-83-compliance-gap-facing-pharmaceutical-companies/) to prevent sensitive data from leaking through AI tools. Only 17% have implemented technical controls like DLP scanning. The rest rely on training emails (40%), warnings without follow-up (20%), or have no AI usage policy at all (13%). Meanwhile, 27% of life sciences organizations report that over 30% of their AI-handled data contains sensitive or proprietary content.
The numbers reveal the disconnect. A 2024 Kiteworks study reported that [**83% of pharmaceutical companies operate without automated safeguards**](https://www.contractpharma.com/exclusives/ai-data-security-the-83-compliance-gap-facing-pharmaceutical-companies/) to prevent sensitive data from leaking through AI tools. Only 17% have implemented technical controls like DLP scanning. The rest rely on training emails (40%), warnings without follow-up (20%), or have no AI usage policy at all (13%). McKinsey's [State of AI report](https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-how-organizations-are-rewiring-to-capture-value) reported that 47% of organizations using generative AI have already experienced at least one negative consequence, with cybersecurity leading the list. Meanwhile, companies that simply [bolt AI onto existing workflows](https://www.mckinsey.com/industries/life-sciences/our-insights/the-synthesis/how-pharma-is-rewriting-the-ai-playbook-perspectives-from-industry-leaders) are having trouble capturing meaningful value.
This creates a paradox: the data that would benefit most from AI analysis - pre-IND compound structures, clinical endpoint designs, manufacturing process parameters - is the data organizations can least afford to send to a third-party API. And the regulatory environment makes the problem harder, not easier, to solve.
That backdrop is pushing some pharma companies to ask a more fundamental question: rather than relying solely on third-party AI platforms, **what would it take to run AI internally**? Four challenges are driving the conversation:
**The most valuable data for AI analysis is often the most sensitive to expose.** Pre-IND compound structures, unpublished mechanism-of-action data, clinical endpoint designs, and manufacturing process parameters, for example, make up the intellectual property that underpins a company's pipeline. Sending it to a cloud LLM can raise questions about control. Even with contractual protections, once data hits a third-party API, it can be difficult to fully guarantee how it's stored, cached, or used for model improvement. For organizations where a single patent filing depends on maintaining trade secret protection, that trade-off can be difficult to accept.
**Regulated workflows demand validated, auditable systems.** If a scientist uses an LLM to draft a clinical study report section, summarize adverse events, or review CMC documentation, the tool that produced that output may come under scrutiny similar to any computerized system, depending on the intended use and the organization's regulatory framework. Regulators expect electronic records with audit trails, access controls, and attributable authorship. Any AI tool that can't tell you who asked what, when, or what sources informed the answer raises questions under those expectations.
**Scientific hallucinations compound through the pipeline.** When an AI fabricates a drug-drug interaction, misattributes a clinical outcome to the wrong study arm, or cites a retracted paper, the consequences aren't just embarrassing. They can contaminate safety assessments, mislead regulatory reviewers, and delay or derail programs. Scientists need every AI-generated claim traceable to a source document they can verify themselves.
**Scientists who need computational AI the most can't always access it.** Drug discovery, clinical biostatistics, and pharmacometrics increasingly depend on computational workflows to run analyses, generate visualizations, and fit models. But according to a Tracekey report, [30% of IT-related positions in pharmaceutical companies were unfilled at the time of the report](https://www.tracekey.com/en/skills-shortage-in-pharma/) in key markets, and the scientists with deep domain expertise are rarely the ones writing Python scripts. Industry commentators have described the need for ["trilingualism"](https://www.mckinsey.com/industries/life-sciences/our-insights/the-synthesis/how-pharma-is-rewriting-the-ai-playbook-perspectives-from-industry-leaders) (proficiency in data science, domain science, and business strategy) but acknowledge that people with all three are rare. The result is that computational work bottlenecks at the IT team, and scientists wait in a queue instead of iterating on analyses.
These challenges are leading some pharma companies to ask whether AI they can *control*, *validate*, *audit*, and *put directly in scientists' hands* might be worth exploring, instead of relying solely on third-party platforms.
---
## Three Risks That Keep R&D Leaders Up at Night
## What Self-Hosted AI Would Actually Require
### 1. Intellectual Property Leaves the Building
Every major cloud provider and a growing roster of startups now offer some flavor of "AI for life sciences." Most follow the same playbook: upload your data, call their API, get results. When the stakes are low (for tasks like summarizing published literature, drafting internal comms, or triaging help-desk tickets), that might be fine. But the moment the work touches a compound patent strategy, an IND-enabling dataset, or a safety signal trending toward a regulatory action, the risk of relying on a third party increases. The question stops being *"can we use AI?"* and becomes *"can we prove — to an inspector, an auditor, a partner in a licensing negotiation — exactly where this output came from, who saw the underlying data, and that it remained on infrastructure we control?"*
Your pipeline is your competitive advantage. Pre-IND compound structures, unpublished mechanism-of-action data, manufacturing process parameters - this is the intellectual property that underpins billions in R&D investment. Sending it to a cloud LLM means relinquishing physical control. Even with contractual protections, once data hits a third-party API, you cannot verify how it's stored, cached, or used downstream. For organizations where a single patent filing depends on maintaining trade secret protection, that risk may be difficult to accept.
That's why the conversation in pharma has moved past feature checklists. When organizations start evaluating self-hosted AI seriously, the requirements tend to cluster around five areas:
### 2. The Regulatory Traceability Gap
- **Data locality.** The ability to run AI entirely on company-controlled infrastructure, whether it's an on-premise data center, private cloud, or air-gapped environment. With the right configuration, this can reduce third-party data exposure, limit model training risk, and reduce external API calls for inference.
AI isn't exempt from GxP. If a scientist uses an LLM to draft a clinical study report section, summarize adverse events, or review CMC documentation, the tool that produced that output falls under the same scrutiny as any computerized system in a regulated environment. [FDA 21 CFR Part 11](https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11) requires electronic records with audit trails, access controls, and attributable authorship. [EMA Annex 11](https://health.ec.europa.eu/system/files/2016-11/annex11_01-2011_en_0.pdf) imposes equivalent requirements. A SaaS tool that cannot provide attributable, auditable records of who asked what, when, and what sources informed the answer may not satisfy these requirements.
- **Source-grounded responses with citations.** The ability for scientists to query their internal company documents — SOPs, study protocols, regulatory guidance, literature databases, pharmacopeia references — and receive responses with inline citations and relevance scores. This does not eliminate hallucination, but it can improve traceability for verification workflows. **All AI-generated content must be reviewed and verified by qualified scientists before reliance or use in any regulatory submission.**
### 3. Hallucinations Compound Through the Pipeline
- **Group-based access control.** The ability to map role-based permissions to functional groups (R&D, Clinical, Regulatory, Pharmacovigilance, Manufacturing, Medical Affairs), restrict administrators from viewing certain conversations in the chat interface, and control model access, document access, and feature access per group.
When an AI fabricates a drug-drug interaction, misattributes a clinical outcome to the wrong study arm, or cites a retracted paper, the consequences aren't just embarrassing - they can contaminate safety assessments, mislead regulatory reviewers, and delay or derail programs worth hundreds of millions. Scientists need AI-generated claims traceable to a source document they can verify themselves. **All AI-generated content must be reviewed by qualified personnel before use in any decision-making, clinical, or regulatory context.**
- **Configurable audit and retention controls.** Conversation logging, configurable retention, SSO integration, and restrictions on chat deletion that can support an organization's compliance and audit requirements, including the kind of electronic records regulators expect in validated environments.
The common thread: some organizations are asking whether AI infrastructure they can *deploy inside their own walls*, *validate against their own standards*, and *audit with their own tools* might be worth exploring.
- **Computational capabilities accessible through natural language.** The ability for scientists to run real code — Python, R, Julia — through conversational prompts, in sandboxed environments on internal infrastructure, without requiring programming expertise or IT tickets. This can help bridge the gap between domain scientists and the computational workflows they need.
These aren't unique to any one product, but they are the criteria that organizations exploring self-hosted AI tend to evaluate against.
---
## What "AI You Control" Looks Like in Practice
## One Approach: Self-Hosting
Self-hosting changes the equation. Instead of trusting vendor claims about data handling, you inspect the infrastructure yourself. But what does that actually look like inside a pharma organization?
[Open WebUI](https://docs.openwebui.com/) is a general-purpose, self-hostable AI platform with a publicly available codebase. It's one example of a platform that can be configured to address the requirements above — organizations should evaluate whether and how its capabilities fit their own compliance and governance requirements.
[Open WebUI](https://docs.openwebui.com/) is a general-purpose, open-source AI platform that can be self-hosted. It's one example of a platform that can be configured to address these concerns - organizations should evaluate whether and how its capabilities fit their own regulatory, quality, and governance requirements.
### Illustrative Examples
Here's how different teams across a pharma organization might interact with a self-hosted AI platform, depending on how it's configured:
> **Note:** The following scenarios are illustrative and do not represent validated or endorsed workflows. All drug names, compound data, study identifiers, and clinical results in these scenarios are entirely fictional and created solely for illustration. Organizations must design, test, and validate their own AI workflows according to their regulatory and governance requirements.
> **Note:** The following scenarios are illustrative and do not represent validated or endorsed workflows. Any use of AI-generated content in regulatory submissions requires your organization's own validation, human expert review, and quality sign-off processes. Open WebUI is a general-purpose tool - it does not replace these processes.
#### Querying Internal Documents with Source Citations
**A scientist in R&D** queries the company's internal compound library: *"Summarize the primary efficacy endpoints from our internal studies for compound X, including the statistical methods used."* The response pulls from internal study reports and cites each by document name with relevance scores. She clicks each citation to verify it against the source PDF. The scientist has access to compound libraries and assay protocols; she cannot see manufacturing SOPs or pharmacovigilance data. The entire exchange is logged under her SSO identity.
A regulatory affairs scientist is preparing a Module 2.7 clinical summary for an eCTD submission. She opens Open WebUI, configured with the company's internal document library, and queries: *"Summarize the primary efficacy endpoints from our Phase III trials for compound X, including the statistical methods used."* The response can draw from internal clinical study reports, can cite each by document name with relevance scores, and can structure the summary in a format the scientist can adapt to their reporting requirements. She clicks each citation to verify it against the source PDF. The conversation can be logged under her SSO identity for search and audit workflows.
**A regulatory affairs specialist** is drafting a response to an FDA deficiency letter. She queries the precedent correspondence knowledge base: *"Find examples of how we've previously addressed CMC deficiency observations related to dissolution testing."* The platform retrieves relevant passages from prior FDA interactions and cites the specific documents. She reviews, edits, and routes the content through her organization's quality procedures before it goes anywhere near a submission.
Two weeks later, during an internal pre-submission readiness review, a colleague asks how a specific claim in the summary was generated. The QA team can pull up the audit trail: the exact query, the AI response, the source documents cited, and the timestamp — all attributable to a named user, all retained on company-controlled infrastructure. That level of traceability is increasingly emphasized in regulatory guidance.
**A pharmacovigilance officer** reviews a safety signal. His group has been configured with RAG-only access to curated safety databases - no web search, no file upload, no general-purpose chat. The platform prioritizes retrieval from MedDRA dictionaries and signal detection SOPs. He knows the underlying model may still draw on its training data, so he verifies every claim against the source documents before including anything in a safety assessment.
Later, when a question arises about how a specific claim was drafted, the team can review the conversation log: the query, the AI response, the source documents cited, and the timestamp - attributable to a named user and retained on company-controlled infrastructure when configured for local-only operation.
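
As a rough illustration of what one such log entry might hold, the sketch below models a single conversation turn as a record. The class name `AuditRecord`, every field name, and the example values are hypothetical and chosen for this example only; this is not Open WebUI's storage schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One logged conversation turn (hypothetical schema, for illustration only)."""

    user_id: str                      # identity asserted by the SSO provider
    query: str                        # what the user asked
    response: str                     # what the model returned
    cited_sources: tuple[str, ...]    # document names shown as citations
    # Timestamp is taken in UTC when the record is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize with sorted keys so records are easy to diff and search."""
        return json.dumps(asdict(self), sort_keys=True)


# Fictional example values.
record = AuditRecord(
    user_id="jdoe@example.com",
    query="Summarize the primary efficacy endpoints for compound X.",
    response="(model output would appear here)",
    cited_sources=("CSR-001.pdf", "SAP-001.pdf"),
)
print(record.to_json())
```

The record is frozen so that a logged turn cannot be mutated in memory after creation; retention and deletion rules would still be enforced by whatever system stores these records.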
<!-- TODO: Replace with real screenshot of chat UI showing inline citations and source panel -->

#### Visualizing Clinical Trial Data with Open Terminal
The example above shows how Open WebUI can ground responses in source documents — the scientist asks a question, and the AI pulls from the company's CSRs to produce a cited summary. But text summaries may not be sufficient for all use cases. When that same scientist needs to present the Verixanib efficacy data to a cross-functional team ahead of a data monitoring committee meeting, she needs visuals — and she needs them built from the same underlying trial data, not recreated manually.
[Open Terminal](https://docs.openwebui.com/features/extensibility/open-terminal) is a sandboxed computing environment built into Open WebUI. Instead of writing code, a scientist provides the relevant data and describes an analysis in plain language. The AI writes and executes code inside an isolated Docker container (which can include scientific libraries) on the organization's infrastructure and returns the results directly in the chat. The file browser lets scientists upload datasets, preview outputs, and download finished artifacts without ever leaving the interface.
She switches to an Open Terminal session, attaches the consolidated Verixanib efficacy dataset from the company's Knowledge collection — data corresponding to the CSRs she just queried — and types:
*"From this Verixanib Phase III trial data, create a four-panel efficacy overview: (1) a line graph showing ACR20 response rate over time by treatment arm across all three studies, (2) a grouped bar chart comparing ACR20, ACR50, and ACR70 response rates at the primary endpoint for each treatment arm in VRX-301, (3) a line graph of DAS28-CRP change from baseline over time for VRX-302 treatment arms including an active comparator, and (4) a grouped bar chart of Week 24 HAQ-DI improvement across all monotherapy arms from VRX-303."*
The AI can read the dataset, generate the four-panel figure, and return it in the chat. She reviews the output, asks for a color adjustment to match the company's slide template, and can download the final figure — all within the same conversation, all on the same infrastructure where the source CSRs live.

#### SAR Visualization in Drug Discovery
A medicinal chemist is deep in lead optimization. His team has synthesized 200+ analogs of a kinase inhibitor scaffold, and the SAR is getting complex — potency cliffs appear when certain R-groups are swapped, but the team hasn't yet mapped the full landscape systematically. He references a compound library CSV (SMILES strings, IC50 values, selectivity ratios, and physicochemical descriptors) from Knowledge, enables Open Terminal, and types:
*"Using RDKit: compute Morgan fingerprints for these compounds, cluster them by Tanimoto similarity, and generate a heatmap showing the relationship between structural clusters and pIC50. Annotate the cluster with the best selectivity ratio. Add a second panel showing a matched molecular pair analysis for substitutions at the R1 position — plot the ΔpIC50 for each transformation as a horizontal bar chart, sorted by magnitude."*
The AI can install RDKit and scikit-learn, calculate fingerprints, run the clustering, identify the matched pairs, and produce a two-panel figure: the SAR heatmap on top, the matched molecular pair bar chart below. The chemist spots a potential pattern — a fluorine-to-chlorine swap at R1 appears to boost potency by ~0.5 log units without killing selectivity — and sends the figure to the project team with a follow-up synthesis proposal for experimental validation.
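
To make the clustering step in this scenario concrete, here is a minimal sketch of Tanimoto similarity and a simple grouping pass. It assumes each fingerprint is already available as a set of "on" bit indices; in the scenario those would come from RDKit's Morgan fingerprints, whereas the toy sets below are invented. The `leader_cluster` function is a deliberately simple stand-in for illustration, not the algorithm RDKit or the AI would necessarily use.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def leader_cluster(fingerprints: list[set[int]], threshold: float) -> list[list[int]]:
    """Group compounds by similarity to each cluster's first member (its "leader").

    Each compound joins the first existing cluster whose leader is at least
    `threshold` similar; otherwise it starts a new cluster. Returns lists of
    indices into `fingerprints`.
    """
    clusters: list[list[int]] = []
    for idx, fp in enumerate(fingerprints):
        for members in clusters:
            leader = fingerprints[members[0]]
            if tanimoto(fp, leader) >= threshold:
                members.append(idx)
                break
        else:
            # No sufficiently similar leader was found.
            clusters.append([idx])
    return clusters


# Invented toy fingerprints: two close analogs and one unrelated compound.
toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}]
print(leader_cluster(toy_fps, threshold=0.5))
```

The result of a leader-style pass depends on input order and on the chosen threshold, which is one reason a chemist would still inspect the clusters rather than take them at face value.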

Industry leaders have described this iterative approach — the model predicts, the scientist validates, and the scientist builds on those results — as a [virtuous cycle](https://www.mckinsey.com/industries/life-sciences/our-insights/the-synthesis/how-pharma-is-rewriting-the-ai-playbook-perspectives-from-industry-leaders). Open Terminal can make that cycle more accessible to scientists who can describe what they need in natural language.
#### Detecting Safety Signals in Pharmacovigilance Data
A pharmacovigilance scientist is running a routine disproportionality screen on the company's adverse event database. A new signal has appeared for a marketed product — a cluster of hepatic events that wasn't flagged in the clinical program. She needs to characterize it before the next PSMF update. She exports the case-level data (MedDRA preferred terms, time-to-onset, seriousness criteria, reporter type, and co-suspect medications) from the safety database, uploads the CSV to Open Terminal, and types:
*"Calculate the proportional reporting ratio (PRR) and reporting odds ratio (ROR) with 95% confidence intervals for all hepatic preferred terms in this dataset. Generate three visualizations: a forest plot of PRR values for all hepatic PTs that cross the signaling threshold (PRR ≥ 2, chi-squared ≥ 4, N ≥ 3), a time-to-onset histogram for the flagged events grouped by seriousness, and a bubble chart showing case volume by preferred term on the x-axis, PRR on the y-axis, and bubble size scaled to the proportion of serious cases."*
The AI can process the dataset, run the statistical calculations, and return three figures. The forest plot shows that three hepatic PTs appear to exceed the signaling threshold. The time-to-onset histogram shows a concentration in the first 90 days — a pattern the scientist can evaluate further as part of the signal assessment. The bubble chart gives the scientist an at-a-glance view of which terms carry the most signal weight. She downloads the figures, attaches them to a signal evaluation report, and has a starting point for the preliminary assessment that the safety committee will review — reducing the iteration time between the pharmacovigilance scientist and the computational analysis.
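
For readers who want to see the arithmetic behind the prompt above, the sketch below computes PRR and ROR with 95% confidence intervals from a 2x2 table of report counts and applies the PRR ≥ 2, chi-squared ≥ 4, N ≥ 3 screen. It rests on several assumptions: the function and class names are hypothetical, N is read as the number of reports for the product-event pair, the chi-squared statistic is computed without continuity correction, and the counts in the example are invented.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Disproportionality:
    prr: float
    prr_ci: tuple[float, float]
    ror: float
    ror_ci: tuple[float, float]
    chi2: float
    is_signal: bool


def _log_ci(estimate: float, se: float, z: float) -> tuple[float, float]:
    """Confidence interval computed on the log scale, then exponentiated."""
    log_est = math.log(estimate)
    return (math.exp(log_est - z * se), math.exp(log_est + z * se))


def disproportionality(a: int, b: int, c: int, d: int, z: float = 1.96) -> Disproportionality:
    """PRR and ROR from a 2x2 table of spontaneous reports.

    a: product of interest, event of interest
    b: product of interest, all other events
    c: all other products, event of interest
    d: all other products, all other events
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive in this simple sketch")
    prr = (a / (a + b)) / (c / (c + d))
    ror = (a * d) / (b * c)
    se_log_prr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    n = a + b + c + d
    # Pearson chi-squared without continuity correction (an assumption here).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Screening rule from the prompt, with N taken to be cell a.
    is_signal = prr >= 2 and chi2 >= 4 and a >= 3
    return Disproportionality(
        prr=prr,
        prr_ci=_log_ci(prr, se_log_prr, z),
        ror=ror,
        ror_ci=_log_ci(ror, se_log_ror, z),
        chi2=chi2,
        is_signal=is_signal,
    )


# Invented counts for one hepatic preferred term.
print(disproportionality(a=20, b=980, c=100, d=19900))
```

A screen like this only flags candidates for review; whether a flagged term represents a real signal remains a judgment for the pharmacovigilance scientist and the safety committee.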

#### Building Regulatory Submission Visuals
A regulatory operations lead is coordinating a rolling NDA submission. The publishing team needs a clear visual for internal stakeholders showing the submission architecture — which eCTD modules are complete, which are in review, which are blocked on pending data, and how the cross-references between sections are structured. He opens Open Terminal and types:
*"From this submission tracker spreadsheet, create two visuals: First, a Gantt-style timeline chart showing each eCTD module (1 through 5) with sub-sections color-coded by status — green for finalized, yellow for in-review, red for blocked. Include a vertical 'today' line and the target submission date. Second, a dependency diagram showing the cross-reference relationships between Module 2.7 (Clinical Summary), Module 5.3 (Clinical Study Reports), and Module 2.5 (Clinical Overview) — draw directional arrows showing which sections reference which, and flag any referenced section that isn't yet finalized."*
The AI can parse the spreadsheet, generate a color-coded Gantt chart overlaid with milestone markers and a dependency graph with red-flagged nodes for incomplete cross-references. The regulatory lead can see that Module 2.7 references three Module 5.3 study reports that are still in QC — a dependency that may not have been as visible in the flat tracker. He shares the visuals in the publishing team's next standup, and they reprioritize QC accordingly.

---
## Mapping AI to Your Organizational Structure
## What Access Control Could Look Like
The scenarios above depend on one thing: each functional group seeing only the models, documents, and capabilities assigned to it. Open WebUI includes a group-based access control system that can map to your organizational structure. **The table below is an illustrative configuration - organizations should design their own group structure based on their specific functional areas, risk profile, and governance requirements.**
Open WebUI provides a group-based access control system. The table below shows one example of how a pharma company might map functional groups to AI capabilities. **This is an illustrative configuration — organizations should design their own group structure based on their specific needs, risk tolerance, and governance requirements.**
<!-- TODO: Replace with screenshot of Admin Panel → Groups showing functional groups -->

| Functional Group | AI Capabilities | Knowledge Bases | Special Permissions |
|---|---|---|---|
| **R&D / Discovery** | Full | Compound libraries, assay protocols, literature databases | Code interpreter *(run analysis scripts on screening data)* |
| **Clinical Operations** | Full | Study protocols, CRF templates, monitoring plan libraries | Web search enabled |
| **Biostatistics** | Full | Analysis datasets, statistical analysis plans, CDISC standards libraries | Open Terminal *(survival analysis, enrollment dashboards, forest plots)*, code interpreter |
| **Clinical Development** | Full | Study protocols, investigator brochures, CRF templates, monitoring plan libraries | Document extraction *(structured data from regulatory letters)*, web search enabled |
| **Manufacturing / CMC** | Full | Batch records, process validation reports, equipment SOPs | Open Terminal *(batch trend analysis, process parameter visualization)*, file upload |
| **Medical Affairs** | Full | Product monographs, congress abstracts, medical education slide decks | Web search enabled |
| **Pharmacovigilance** | Advanced analysis only | MedDRA dictionaries, CIOMS forms, signal detection SOPs | RAG-only mode *(responses grounded in curated internal source documents)* |
| **R&D / Discovery** | Full | Compound libraries, assay protocols, literature databases | Open Terminal *(SAR analysis, molecular modeling, visualization)*, code interpreter |
| **Regulatory Affairs** | Full | eCTD templates, FDA/EMA guidance, precedent correspondence | Document extraction *(structured data from regulatory letters)* |
| **Pharmacovigilance** | Advanced analysis only | MedDRA dictionaries, CIOMS forms, signal detection SOPs | RAG-only mode *(responses grounded in internal source documents)* |
| **Manufacturing / CMC** | Full | Batch records, process validation reports, equipment SOPs | File upload enabled |
| **Medical Affairs** | Full | Product monographs, congress abstracts, KOL slide decks | Web search enabled |
| **Support Staff** | Basic tasks only | Company policies, HR procedures, training materials | No file upload, no web search |
| **Support Staff** | Basic tasks only | Company policies, HR procedures, training materials | No file upload, no web search, no terminal access |
Groups synchronize with your identity provider (Okta, Azure AD, Ping Identity) via OAuth, so when someone transfers between departments, their AI permissions update on next login.
Groups can synchronize with your identity provider (such as Okta, Azure AD, or Ping Identity) via OAuth, so functional group membership can stay aligned with your organization's directory.
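
One way to reason about a mapping like the table above is as a lookup from directory groups to allowed knowledge bases and features. The sketch below is purely conceptual: the group keys, feature names, `GroupPolicy`, and the choice to union permissions across groups are all assumptions made for illustration, and none of it reflects Open WebUI's actual configuration format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupPolicy:
    knowledge_bases: frozenset[str]
    features: frozenset[str]


# Hypothetical policies loosely following three rows of the table.
POLICIES: dict[str, GroupPolicy] = {
    "pharmacovigilance": GroupPolicy(
        knowledge_bases=frozenset({"meddra", "cioms_forms", "signal_detection_sops"}),
        features=frozenset({"rag"}),
    ),
    "biostatistics": GroupPolicy(
        knowledge_bases=frozenset({"analysis_datasets", "statistical_analysis_plans"}),
        features=frozenset({"rag", "open_terminal", "code_interpreter"}),
    ),
    "support_staff": GroupPolicy(
        knowledge_bases=frozenset({"company_policies", "hr_procedures"}),
        features=frozenset({"rag"}),
    ),
}


def effective_policy(idp_groups: list[str]) -> GroupPolicy:
    """Union the policies of every recognized group; unknown groups grant nothing."""
    knowledge_bases: set[str] = set()
    features: set[str] = set()
    for group in idp_groups:
        policy = POLICIES.get(group)
        if policy is not None:
            knowledge_bases |= policy.knowledge_bases
            features |= policy.features
    return GroupPolicy(frozenset(knowledge_bases), frozenset(features))


def can_use(idp_groups: list[str], feature: str) -> bool:
    """True if any of the user's groups grants the feature."""
    return feature in effective_policy(idp_groups).features


print(can_use(["pharmacovigilance"], "web_search"))
print(can_use(["biostatistics"], "open_terminal"))
```

Whether permissions should combine by union, as here, or by the most restrictive group is a governance decision; a deny-by-default rule for unrecognized groups is the one choice in this sketch that most organizations would likely keep.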
---
## The Infrastructure Behind It
## What Infrastructure Is Involved
*This section is a reference for your IT or infrastructure team. If you're evaluating at a strategic level, the key takeaway is: a self-hosted AI platform can deploy on existing infrastructure (VMware, Azure, AWS, or bare metal), scale horizontally, and can operate without external dependencies once models are loaded.*
*This section is a reference for your IT or infrastructure team. If you're evaluating at a strategic level, the key takeaway is simple: a self-hosted AI platform can run on existing infrastructure (VMware, Azure, AWS, or bare metal), scale with the organization, and be deployed with dependencies that can run on internal infrastructure.*
For large pharma organizations (500-10,000+ employees), a production deployment typically requires high availability and data isolation. Here's a reference architecture using Open WebUI - for full deployment instructions, see the **[Technical Setup Guide](setup.md)**.
For large pharma organizations (500–10,000+ employees), a production deployment typically requires high availability, data isolation, and infrastructure that can support regulatory requirements. Here's a reference architecture using Open WebUI — for full deployment instructions, see the **[Technical Setup Guide](setup.md)**.
```mermaid
flowchart TB
@@ -112,6 +164,10 @@ flowchart TB
vllm["vLLM<br/>(Large Local Models)"]
end
subgraph terminal["Open Terminal"]
ot["Open Terminal<br/>Docker Container(s)<br/>Sandboxed Execution"]
end
subgraph optional["Optional Services"]
functions["Functions<br/>(Built-in Plugins)"]
otel["OpenTelemetry<br/>Collector"]
@@ -121,32 +177,35 @@ flowchart TB
proxy --> owui
owui --> data
owui --> inference
owui --> terminal
owui -.-> optional
```
What makes this architecture relevant for pharma specifically:
- **Air-gappable.** When models are pre-loaded, the entire stack can run without outbound internet access. No prompts, completions, or embeddings leave your network when configured for local-only inference.
|
||||
- **Separable by function.** RBAC and group-based document access mean R&D doesn't see PV data, PV doesn't see CMC data, and IT can manage the platform without viewing conversation content - all configurable at the application level.
|
||||
- **Auditable.** Conversations are timestamped and attributed to authenticated users via SSO. Chat deletion and temporary chats can be disabled at the application level, creating records that your quality team can evaluate as part of their own governance framework.
|
||||
- **Scalable without re-architecture.** Stateless application nodes scale horizontally. Add capacity during high-demand periods, scale back during quieter ones. Lose any single node without service interruption.
|
||||
**Key design decisions:**
|
||||
- **Stateless application nodes** — horizontal scaling allows capacity to flex with demand across the organization
|
||||
- **Inference can run locally** — via Ollama (lightweight models) and vLLM (large models with GPU optimization), so prompts can stay on-network when configured accordingly
|
||||
- **Unified data layer** — PostgreSQL handles both application data and vector search, reducing operational complexity
|
||||
- **Redis session coordination** — enables multi-node deployments where any instance can serve any request without requiring session affinity
|
||||
- **Sandboxed Open Terminal containers** — each terminal runs in an isolated Docker container with resource limits enforced; scientists get a sandboxed computing environment while IT maintains control
|
||||
|
||||
---
|
||||
|
||||
## What It Takes to Get There
|
||||
## Considerations Before Getting Started
|
||||
|
||||
Self-hosting AI is not trivial. Before committing, organizations should consider:
|
||||
|
||||
- **Validation effort.** If AI is used in GxP-adjacent workflows, the platform will likely need to go through your Computer System Validation process. This is an organizational responsibility, not something the software provides out of the box. Plan for IQ/OQ/PQ scoping, test case development, and documentation.
|
||||
- **Infrastructure costs.** Open WebUI itself is free, but GPU servers, storage, and networking are not. A single-department pilot can run on one GPU server in your existing environment; a full production deployment involves dedicated compute and storage.
|
||||
- **Governance design.** Who approves AI use cases? How are outputs reviewed? What's the policy for AI-assisted content in submissions? These questions matter more than the technology.
|
||||
- **Ongoing maintenance.** Model updates, security patches, user support, and knowledge base curation are ongoing responsibilities. A pilot can typically be running within hours; a full production rollout typically takes a few weeks.
|
||||
- **Infrastructure costs.** Open WebUI itself is free to use (see [license](https://github.com/open-webui/open-webui/blob/main/LICENSE) for terms), but GPU servers, storage, and networking are not. A single-department pilot may run on one GPU server; an organization-wide deployment involves dedicated compute and storage.
|
||||
- **Governance design.** Who approves AI use cases? How are outputs reviewed? What's the policy for AI-assisted work product in regulatory submissions? These questions matter more than the technology.
|
||||
- **Validation and testing.** Any AI deployment in a regulated environment should go through appropriate validation, security review, and integration testing before production use. This is typically a multi-week program.
|
||||
- **Ongoing maintenance.** Model updates, security patches, user support, and knowledge base curation are ongoing responsibilities.
|
||||
|
||||
For organizations that want to explore the technical details, the complete Docker Compose stack, security hardening checklist, RBAC configuration guide, and backup strategy are in our companion guide:
|
||||
For organizations that want to explore the technical details, the complete Docker Compose stack (including Open Terminal configuration), security hardening checklist, RBAC configuration guide, and backup strategy are in our companion guide:
|
||||
|
||||
**[Technical Setup Guide →](setup.md)**
|
||||
|
||||
For organizations that want deployment guidance, [Open WebUI Enterprise](https://docs.openwebui.com/enterprise/) offers hands-on support including regulatory alignment guidance *(compliance determination remains your organization's responsibility)*, white-label branding, and dedicated SLAs.
|
||||
For organizations that want deployment guidance, [Open WebUI Enterprise](https://docs.openwebui.com/enterprise/) offers hands-on support including regulatory alignment guidance *(compliance determination remains the organization's responsibility)*, white-label branding, and dedicated SLAs.
|
||||
|
||||
*Note: No software alone establishes regulatory compliance. Organizations should validate controls, policies, and use cases with qualified regulatory, legal, and security teams.*
|
||||
|
||||
**[Learn more about Enterprise → sales@openwebui.com](mailto:sales@openwebui.com)**
|
||||
|
||||
@@ -154,17 +213,6 @@ For organizations that want deployment guidance, [Open WebUI Enterprise](https:/
|
||||
|
||||
### Disclaimer
|
||||
|
||||
*Open WebUI is a general-purpose AI platform. It is not a validated GxP system. All regulatory compliance determinations - including 21 CFR Part 11, EMA Annex 11, HIPAA, and any other applicable framework - are the sole responsibility of the deploying organization. AI-generated content is not a substitute for professional scientific, clinical, or regulatory judgment and must always be reviewed by qualified personnel before use.*
|
||||
*Open WebUI is a general-purpose AI platform, not a pharmaceutical or life sciences product validated for any specific regulatory standard (including GxP, 21 CFR Part 11, EU Annex 11, or ICH guidelines). All compliance determinations — including computerized system validation, data integrity, electronic records requirements, and any other applicable regulatory framework — are the sole responsibility of the deploying organization. AI-generated content is not scientific, medical, or regulatory advice and is not a substitute for qualified professional judgment. All AI outputs must be reviewed and verified by qualified scientists before use in any regulated context. Open WebUI is not a medical device and has not been cleared, approved, or registered with the FDA or any other regulatory authority as medical device software.*
|
||||
|
||||
---
|
||||
|
||||
*Open WebUI is free to use and self-hostable. It powers AI deployments at organizations ranging from small research teams to Fortune 500 companies. [See who's using Open WebUI →](https://docs.openwebui.com/enterprise/customers/)*
|
||||
|
||||
---
|
||||
|
||||
### References
|
||||
|
||||
1. *"Samsung Bans ChatGPT Among Employees After Sensitive Code Leak."* Mashable, 2023. [mashable.com](https://mashable.com/article/samsung-chatgpt-leak-leads-to-employee-ban)
|
||||
2. *"AI Data Security: The 83% Compliance Gap Facing Pharmaceutical Companies."* Contract Pharma / Kiteworks, 2024. [contractpharma.com](https://www.contractpharma.com/exclusives/ai-data-security-the-83-compliance-gap-facing-pharmaceutical-companies/)
|
||||
3. *21 CFR Part 11 - Electronic Records; Electronic Signatures.* U.S. Food & Drug Administration. [ecfr.gov](https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11)
|
||||
4. *Annex 11: Computerised Systems.* European Commission, EudraLex Volume 4. [health.ec.europa.eu](https://health.ec.europa.eu/system/files/2016-11/annex11_01-2011_en_0.pdf)
|
||||
|
||||
+160
-56
@@ -21,9 +21,11 @@ This guide is a technical reference companion to [What Would It Take for a Pharm
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
The production stack is identical to any Open WebUI enterprise deployment: reverse proxy with TLS, stateless application nodes, PostgreSQL + PGVector for data and vector search, Redis for session coordination, and local inference via Ollama and vLLM. See the [blog post](article.md) for the architecture diagram and rationale.
|
||||
The production stack is based on the same components as any Open WebUI enterprise deployment: reverse proxy with TLS, stateless application nodes, PostgreSQL + PGVector for data and vector search, Redis for session coordination, local inference via Ollama and vLLM, and [Open Terminal](https://docs.openwebui.com/features/extensibility/open-terminal) for sandboxed code execution. See the [blog post](article.md) for the architecture diagram and rationale.
|
||||
|
||||
What makes this configuration different from a generic deployment isn't the infrastructure - it's the **configuration layer on top**: how access is structured, how knowledge bases map to functional groups, and how audit records are retained. The settings below are examples of how organizations in regulated industries have approached these decisions.
|
||||
This deployment also includes the [Inline Visualizer](https://github.com/Classic298/open-webui-plugins/tree/main/inline-visualizer) tool and skill, which renders interactive HTML/SVG visualizations directly in the chat. When combined with Open Terminal for computational work, this gives scientists two complementary paths to visual output: Open Terminal for figures generated by matplotlib, RDKit, and other scientific Python libraries, and Inline Visualizer for interactive diagrams, flowcharts, and explorable visuals rendered natively in the conversation.
|
||||
|
||||
What makes this configuration different from a generic deployment isn't the infrastructure - it's the **configuration layer on top**: how access is structured, how knowledge bases map to functional groups, and how audit records are retained. The settings below are examples of how organizations in regulated industries might approach these decisions.
|
||||
|
||||
**Example configuration decisions:**
|
||||
|
||||
@@ -60,7 +62,7 @@ For an organization with 500–10,000+ employees and concurrent usage of ~100–
|
||||
### Network Requirements
|
||||
|
||||
- All services communicate on an internal Docker network - no public exposure except the reverse proxy
|
||||
- Outbound internet access is **not required** if models are pre-pulled (fully air-gappable)
|
||||
- Outbound internet access is **not required** if models are pre-pulled (can be air-gapped after initial image and model provisioning)
|
||||
- Ports: only `443` (HTTPS) exposed externally
|
||||
|
||||
---
|
||||
@@ -164,6 +166,9 @@ services:
|
||||
- ENABLE_OTEL=${ENABLE_OTEL:-False}
|
||||
- OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_ENDPOINT:-}
|
||||
|
||||
# --- Open Terminal integration ---
|
||||
- TERMINAL_SERVER_CONNECTIONS=${TERMINAL_SERVER_CONNECTIONS:-[]}
|
||||
|
||||
# --- Persistent config ---
|
||||
- ENABLE_PERSISTENT_CONFIG=True
|
||||
volumes:
|
||||
@@ -225,6 +230,7 @@ services:
|
||||
- ENABLE_DB_MIGRATIONS=False # Node-1 handles migrations
|
||||
- ENABLE_OTEL=${ENABLE_OTEL:-False}
|
||||
- OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_ENDPOINT:-}
|
||||
- TERMINAL_SERVER_CONNECTIONS=${TERMINAL_SERVER_CONNECTIONS:-[]}
|
||||
- ENABLE_PERSISTENT_CONFIG=True
|
||||
volumes:
|
||||
- owui-data:/app/backend/data
|
||||
@@ -303,7 +309,7 @@ services:
|
||||
# Ollama - Local model inference (smaller models, ≤13B)
|
||||
# ---------------------------------------------------------------------------
|
||||
ollama:
|
||||
image: ollama/ollama:latest
|
||||
image: ollama/ollama:0.6 # Pin to a specific version for production environments
|
||||
container_name: owui-ollama
|
||||
restart: unless-stopped
|
||||
volumes:
|
||||
@@ -322,7 +328,7 @@ services:
|
||||
# vLLM - GPU-optimized inference (large models, 70B+)
|
||||
# ---------------------------------------------------------------------------
|
||||
vllm:
|
||||
image: vllm/vllm-openai:latest
|
||||
image: vllm/vllm-openai:v0.8 # Pin to a specific version for production environments
|
||||
container_name: owui-vllm
|
||||
restart: unless-stopped
|
||||
command: >
|
||||
@@ -344,6 +350,27 @@ services:
|
||||
networks:
|
||||
- owui-net
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Open Terminal - Sandboxed code execution for scientists
|
||||
# ---------------------------------------------------------------------------
|
||||
open-terminal:
|
||||
image: ghcr.io/open-webui/open-terminal:latest # Pin to a specific version for production environments
|
||||
container_name: owui-terminal
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
- OPEN_TERMINAL_API_KEY=${OPEN_TERMINAL_API_KEY}
|
||||
- OPEN_TERMINAL_PIP_PACKAGES=rdkit-pypi scikit-learn lifelines matplotlib seaborn
|
||||
- OPEN_TERMINAL_MAX_SESSIONS=16
|
||||
volumes:
|
||||
- terminal-data:/home/user
|
||||
deploy:
|
||||
resources:
|
||||
limits:
|
||||
memory: 4G
|
||||
cpus: "4.0"
|
||||
networks:
|
||||
- owui-net
|
||||
|
||||
# =============================================================================
|
||||
# Named Volumes
|
||||
# =============================================================================
|
||||
@@ -356,6 +383,8 @@ volumes:
|
||||
driver: local
|
||||
ollama-data:
|
||||
driver: local
|
||||
terminal-data:
|
||||
driver: local
|
||||
|
||||
# =============================================================================
|
||||
# Network
|
||||
@@ -490,6 +519,7 @@ generate_secret() {
|
||||
}
|
||||
|
||||
if [ ! -f .env ]; then
|
||||
TERMINAL_KEY=$(generate_secret)
|
||||
cat > .env << EOF
|
||||
# =============================================================================
|
||||
# Open WebUI - Environment Configuration
|
||||
@@ -520,6 +550,10 @@ VLLM_MAX_MODEL_LEN=8192
|
||||
VLLM_API_KEY=$(generate_secret)
|
||||
HF_TOKEN=hf_your_token_here
|
||||
|
||||
# --- Open Terminal ---
|
||||
OPEN_TERMINAL_API_KEY=${TERMINAL_KEY}
|
||||
TERMINAL_SERVER_CONNECTIONS='[{"url":"http://open-terminal:8000","key":"${TERMINAL_KEY}"}]'
|
||||
|
||||
# --- Workers ---
|
||||
UVICORN_WORKERS=4
|
||||
|
||||
@@ -615,7 +649,7 @@ These are the same Open WebUI environment variables used in any deployment. This
|
||||
|
||||
| Variable | Value | Rationale |
|
||||
|---|---|---|
|
||||
| `ENABLE_SIGNUP` | `False` | All users provisioned via SSO or admin. No uncontrolled account creation. |
|
||||
| `ENABLE_SIGNUP` | `False` | All users provisioned via SSO or admin. Disables the self-registration form. |
|
||||
| `DEFAULT_USER_ROLE` | `pending` | New SSO users require explicit admin approval before accessing any AI capabilities. |
|
||||
| `BYPASS_MODEL_ACCESS_CONTROL` | `False` | Enforces RBAC model restrictions - users only see models assigned to their functional group. |
|
||||
| `BYPASS_ADMIN_ACCESS_CONTROL` | `False` | Admins are subject to the same workspace access rules as regular users. |
|
||||
@@ -627,7 +661,7 @@ These are the same Open WebUI environment variables used in any deployment. This
|
||||
|---|---|---|
|
||||
| `VECTOR_DB` | `pgvector` | Uses PostgreSQL's PGVector extension - one database for both application data and vector search. |
|
||||
| `RAG_TOP_K` | `5` | Returns the top 5 most relevant document chunks. Tune based on document density. |
|
||||
| `ENABLE_RAG_HYBRID_SEARCH` | `True` | BM25 + vector ensemble search with reranking. Recommended for scientific documents where exact terminology matters alongside semantic similarity. |
|
||||
| `ENABLE_RAG_HYBRID_SEARCH` | `True` | BM25 + vector ensemble search with reranking. May be beneficial for scientific documents where exact terminology matters alongside semantic similarity. |
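
To make "BM25 + vector ensemble" more concrete, here is a minimal sketch of weighted reciprocal rank fusion, one common way to merge a keyword ranking with a vector ranking. It illustrates the general idea only and is not a description of Open WebUI's internal implementation. The function name `fuse_rankings`, the constant `k = 60`, and the equal weights are assumptions, and the reranking step is omitted.

```python
from collections import defaultdict


def fuse_rankings(rankings: list[list[str]], weights: list[float], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs with weighted reciprocal rank fusion."""
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, chunk_id in enumerate(ranking, start=1):
            # A chunk ranked highly by either retriever gets a larger contribution.
            scores[chunk_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is stable.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical chunk IDs: the keyword ranking favors the exact term,
# the vector ranking favors semantically similar text.
bm25_ranking = ["chunk-a", "chunk-b", "chunk-c"]
vector_ranking = ["chunk-c", "chunk-a", "chunk-d"]
fused = fuse_rankings([bm25_ranking, vector_ranking], weights=[0.5, 0.5])
```

Chunks that appear in both lists tend to rise to the top, which is the behavior that matters when exact terminology and semantic similarity both carry signal.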
|
||||
|
||||
### Infrastructure
|
||||
|
||||
@@ -637,9 +671,18 @@ These are the same Open WebUI environment variables used in any deployment. This
|
||||
| `REDIS_URL` | `redis://redis:6379/0` | Session coordination across stateless nodes. |
|
||||
| `WEBSOCKET_MANAGER` | `redis` | Routes streaming responses through Redis for multi-node consistency. |
|
||||
| `ENABLE_DB_MIGRATIONS` | `True` (node-1 only) | Only one node should run migrations on startup to prevent race conditions. |
|
||||
| `TERMINAL_SERVER_CONNECTIONS` | JSON array | Pre-configures the Open Terminal connection so it's available to all users on startup. Format: `[{"url":"http://open-terminal:8000","key":"<API_KEY>"}]`. Can also be configured manually in Admin Settings → Integrations. |
|
||||
|
||||
> **Redis note:** The `timeout 1800` setting in the Docker Compose Redis config is critical. Without it, idle connections accumulate until `maxclients` is exhausted and all logins fail. See the [Open WebUI Redis documentation](https://docs.openwebui.com/reference/env-configuration/#redis_url).
|
||||
|
||||
### Open Terminal
|
||||
|
||||
| Variable | Value | Rationale |
|---|---|---|
| `OPEN_TERMINAL_API_KEY` | Generated secret | Bearer API key for authenticating requests from Open WebUI to the terminal container. |
| `OPEN_TERMINAL_PIP_PACKAGES` | `rdkit-pypi scikit-learn lifelines matplotlib seaborn` | Pre-installs scientific Python libraries at container startup. Scientists can install additional packages at runtime. |
| `OPEN_TERMINAL_MAX_SESSIONS` | `16` | Maximum concurrent interactive terminal sessions. Helps limit resource consumption. |
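
As a rough illustration of how the Open Terminal key and the `TERMINAL_SERVER_CONNECTIONS` value relate, the following Python sketch generates one secret and builds the matching JSON array. The helper name `build_terminal_env` is hypothetical and the key length is an assumption. The URL `http://open-terminal:8000` is the one shown in the format above.

```python
import json
import secrets


def build_terminal_env(url: str = "http://open-terminal:8000") -> dict[str, str]:
    """Return the two values that must agree between Open WebUI and the terminal container."""
    # One shared secret: Open WebUI sends it, the terminal container checks it.
    api_key = secrets.token_urlsafe(32)  # key length is an assumption
    # Compact JSON keeps the value on one line with no stray whitespace.
    connections = json.dumps([{"url": url, "key": api_key}], separators=(",", ":"))
    return {
        "OPEN_TERMINAL_API_KEY": api_key,
        "TERMINAL_SERVER_CONNECTIONS": connections,
    }
```

When writing these values to a `.env` file, wrapping the JSON in single quotes keeps its double quotes intact. Parsing the result with `json.loads` before deployment is a cheap way to catch quoting mistakes.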
|
||||
|
||||
---
|
||||
|
||||
## Technical Controls Reference
|
||||
@@ -662,12 +705,12 @@ The following table lists technical capabilities that Open WebUI provides when c
|
||||
| **User identity** | SSO provides authenticated identity. The platform authenticates individual user accounts via SSO/OIDC. |
|
||||
| **Deployment model** | Can be deployed on internal infrastructure with no external dependencies when models are pre-loaded. |
|
||||
| **Data integrity** | Chat deletion can be disabled at the application level. PostgreSQL WAL for write-ahead logging. Automated backups. |
|
||||
| **Data migration** | PostgreSQL `pg_dump`/`pg_restore` with integrity verification. Standard, well-documented process. |
|
||||
| **Business continuity** | Stateless nodes with automatic failover, Redis HA, PostgreSQL WAL archiving for point-in-time recovery. |
|
||||
| **Data migration** | PostgreSQL `pg_dump`/`pg_restore` with integrity verification. |
|
||||
| **Business continuity** | Stateless application nodes with health-check-based recovery. PostgreSQL WAL archiving can support point-in-time recovery. Redis Sentinel and PostgreSQL replication can be added for high availability. |
|
||||
|
||||
### What This Might Mean for a Validation Team
|
||||
|
||||
If your organization uses a risk-based approach to CSV (Computer System Validation), the GAMP categorization, validation scope, and testing depth are decisions your validation team must make based on your specific deployment, customizations, and intended use. Open WebUI's open-source codebase and container-based deployment with version-pinned images may facilitate aspects of your validation process, but the validation strategy itself is an organizational responsibility.
|
||||
If your organization uses a risk-based approach to CSV (Computer System Validation), the GAMP categorization, validation scope, and testing depth are decisions your validation team must make based on your specific deployment, customizations, and intended use. Open WebUI's publicly available codebase and container-based deployment with version-pinned images may facilitate aspects of your validation process, but the validation strategy itself is an organizational responsibility.
|
||||
|
||||
---
|
||||
|
||||
@@ -693,7 +736,7 @@ ENABLE_OAUTH_GROUP_CREATION=True
|
||||
ENABLE_OAUTH_ROLE_MANAGEMENT=True
|
||||
```
|
||||
|
||||
> **Tip:** Set `ENABLE_OAUTH_GROUP_MANAGEMENT=True` so that functional group membership syncs automatically from your identity provider. When a scientist transfers from R&D to Medical Affairs in your directory, their Open WebUI permissions update on next login - no manual reprovisioning.
|
||||
> **Tip:** Set `ENABLE_OAUTH_GROUP_MANAGEMENT=True` so that functional group membership syncs automatically from your identity provider. When a scientist transfers from R&D to Medical Affairs in your directory, their Open WebUI permissions update on next login - no manual reprovisioning needed in most cases.
|
||||
|
||||
### Step 2: Create Functional Groups
|
||||
|
||||
@@ -702,46 +745,52 @@ Navigate to **Admin Panel → Groups** and create groups matching your organizat
|
||||
<!-- TODO: Replace with screenshot of Admin Panel → Groups page -->
|
||||

|
||||
|
||||
1. **R&D / Discovery**
|
||||
1. **Biostatistics**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Analysis datasets, statistical analysis plans, CDISC standards libraries
|
||||
- Permissions: Open Terminal enabled, code interpreter enabled, file upload enabled
|
||||
- *Rationale: Biostatisticians run survival analyses, enrollment dashboards, and forest plots. Open Terminal gives them a computational environment without requiring IT tickets.*
|
||||
|
||||
2. **Clinical Development**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Study protocols, investigator brochures, CRF templates, monitoring plan libraries
|
||||
- Permissions: Document extraction enabled, file upload enabled, web search enabled
|
||||
- *Rationale: Clinical teams work with both internal protocols and public clinical trial registries. Web search enables ClinicalTrials.gov lookups. Document extraction helps process regulatory correspondence.*
|
||||
|
||||
3. **Manufacturing / CMC**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Batch records, process validation reports, equipment SOPs
|
||||
- Permissions: Open Terminal enabled, file upload enabled
|
||||
- *Rationale: CMC scientists frequently upload batch records and deviation reports for AI-assisted review of deviation data. Open Terminal enables batch trend analysis and process parameter visualization.*
|
||||
|
||||
4. **Medical Affairs**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Product monographs, congress abstracts, medical education slide decks
|
||||
- Permissions: Web search enabled, file upload enabled
|
||||
- *Rationale: Medical Affairs teams need access to public literature and congress proceedings alongside internal medical information.*
|
||||
|
||||
5. **Pharmacovigilance**
|
||||
- Models: Reasoning models only (e.g., Llama 3.1 70B via vLLM)
|
||||
- Knowledge bases: MedDRA dictionaries, CIOMS forms, signal detection SOPs
|
||||
- Permissions: RAG-only mode (no web search, no file upload)
|
||||
- *Rationale: PV work is safety-critical. Restricting to RAG-only mode prioritizes retrieval from curated internal documents and disables web search, reducing exposure to uncontrolled external content. The underlying model may still draw on its training data.*
|
||||
|
||||
6. **R&D / Discovery**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Compound libraries, assay protocols, literature databases
|
||||
- Permissions: Code interpreter enabled, file upload enabled, web search enabled
|
||||
- *Rationale: Discovery scientists need the broadest toolset - running analysis scripts on screening data, uploading proprietary assay results, and searching public literature.*
|
||||
- Permissions: Open Terminal enabled, code interpreter enabled, file upload enabled, web search enabled
|
||||
- *Rationale: Discovery scientists need the broadest toolset - running SAR analyses and molecular modeling in Open Terminal, uploading proprietary assay results, and searching public literature.*
|
||||
|
||||
2. **Clinical Operations**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Study protocols, CRF templates, monitoring plan libraries
|
||||
- Permissions: File upload enabled, web search enabled
|
||||
- *Rationale: Clinical teams work with both internal protocols and public clinical trial registries. Web search enables ClinicalTrials.gov lookups.*
|
||||
|
||||
3. **Regulatory Affairs**
|
||||
7. **Regulatory Affairs**
|
||||
- Models: All available models
|
||||
- Knowledge bases: eCTD templates, FDA/EMA guidance, precedent correspondence
|
||||
- Permissions: Document extraction enabled, file upload enabled
|
||||
- *Rationale: Regulatory scientists frequently need to extract structured data from FDA letters, EMA assessment reports, and deficiency notices.*
|
||||
|
||||
4. **Pharmacovigilance**
|
||||
- Models: Reasoning models only (e.g., Llama 3.1 70B via vLLM)
|
||||
- Knowledge bases: MedDRA dictionaries, CIOMS forms, signal detection SOPs
|
||||
- Permissions: RAG-only mode (no web search, no file upload)
|
||||
- *Rationale: PV work is safety-critical. Restricting to RAG-only mode prioritizes retrieval from curated internal documents and disables web search, reducing exposure to uncontrolled external content. The underlying model may still draw on its training data.*
|
||||
|
||||
5. **Manufacturing / CMC**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Batch records, process validation reports, equipment SOPs
|
||||
- Permissions: File upload enabled
|
||||
- *Rationale: CMC scientists frequently upload batch records and deviation reports for AI-assisted root cause analysis.*
|
||||
|
||||
6. **Medical Affairs**
|
||||
- Models: All available models
|
||||
- Knowledge bases: Product monographs, congress abstracts, KOL slide decks
|
||||
- Permissions: Web search enabled, file upload enabled
|
||||
- *Rationale: Medical Affairs teams need access to public literature and congress proceedings alongside internal medical information.*
|
||||
|
||||
7. **Support Staff**
|
||||
8. **Support Staff**
|
||||
- Models: Small models only (e.g., Llama 3.1 8B via Ollama)
|
||||
- Knowledge bases: Company policies, HR procedures, training materials
|
||||
- Permissions: No file upload, no web search
|
||||
- Permissions: No file upload, no web search, no terminal access
|
||||
- *Rationale: Minimal access footprint for non-scientific users.*
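
Before clicking through the Admin Panel, it can help to write the group design down as data and check it against policy. The sketch below encodes the eight groups from this step. The permission labels (`open_terminal`, `web_search`, and so on) are illustrative names chosen for this example, not Open WebUI setting keys, and `groups_with` is a hypothetical helper.

```python
# Permission labels are illustrative, not Open WebUI configuration keys.
GROUP_PERMISSIONS: dict[str, set[str]] = {
    "Biostatistics": {"open_terminal", "code_interpreter", "file_upload"},
    "Clinical Development": {"document_extraction", "file_upload", "web_search"},
    "Manufacturing / CMC": {"open_terminal", "file_upload"},
    "Medical Affairs": {"web_search", "file_upload"},
    "Pharmacovigilance": set(),  # RAG-only: no web search, no file upload
    "R&D / Discovery": {"open_terminal", "code_interpreter", "file_upload", "web_search"},
    "Regulatory Affairs": {"document_extraction", "file_upload"},
    "Support Staff": set(),  # no file upload, no web search, no terminal access
}


def groups_with(permission: str) -> list[str]:
    """Return the groups holding a permission, sorted so review output is stable."""
    return sorted(
        group for group, perms in GROUP_PERMISSIONS.items() if permission in perms
    )


# Policy check: the restricted groups must not gain external or execution access.
for restricted in ("Pharmacovigilance", "Support Staff"):
    assert not GROUP_PERMISSIONS[restricted] & {"web_search", "file_upload", "open_terminal"}
```

A table like this can be reviewed by QA alongside the RBAC configuration, and re-checked whenever a group's permissions change.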
|
||||
|
||||
### Step 3: Assign Models to Groups
|
||||
@@ -765,23 +814,65 @@ For each knowledge base in **Admin Panel → Knowledge**:
|
||||
1. Set access control to the relevant functional groups
|
||||
2. Users will only see knowledge bases assigned to their group(s) in the chat interface
|
||||
|
||||
### Step 5: Install the Inline Visualizer Tool & Skill
|
||||
|
||||
The [Inline Visualizer](https://github.com/Classic298/open-webui-plugins/tree/main/inline-visualizer) plugin renders interactive HTML/SVG visualizations directly in the chat. It includes a theme-aware design system with color ramps, SVG utility classes, and a communication bridge that lets visualizations send prompts back to the chat for conversational exploration.
|
||||
|
||||
This plugin has two components:
|
||||
|
||||
| Component | File | Install Location |
|---|---|---|
| **Tool** | `tool.py` | Workspace → Tools |
| **Skill** | `SKILL.md` | Workspace → Knowledge → Create Skill |
|
||||
|
||||
**Install the Tool:**
|
||||
|
||||
1. Copy the contents of [`tool.py`](https://github.com/Classic298/open-webui-plugins/blob/main/inline-visualizer/tool.py)
|
||||
2. In Open WebUI, go to **Workspace → Tools → + Create New**
|
||||
3. Paste the code and click **Save**
|
||||
|
||||
**Install the Skill:**
|
||||
|
||||
1. Copy the contents of [`SKILL.md`](https://github.com/Classic298/open-webui-plugins/blob/main/inline-visualizer/SKILL.md)
|
||||
2. In Open WebUI, go to **Workspace → Knowledge → + Create Skill**
|
||||
3. Name it **`visualize`** (this exact name is required)
|
||||
4. Paste the contents and click **Save**
|
||||
|
||||
**Attach to Models:**
|
||||
|
||||
1. Go to **Admin Panel → Models** and edit each model that should support visualizations
|
||||
2. Under **Tools**, enable the **Inline Visualizer** tool
|
||||
3. Under **Skills**, attach the **visualize** skill
|
||||
4. Ensure native function calling is enabled for the model
|
||||
5. Save
|
||||
|
||||
**Enable Interactive Features (Optional):**
|
||||
|
||||
1. Go to **Settings → Interface**
|
||||
2. Enable **iframe Sandbox Allow Same Origin**
|
||||
|
||||
Without this, visualizations render normally but interactive buttons that send prompts back to the chat (`sendPrompt`) will not work.
|
||||
|
||||
> **Tip:** Performance depends on model capability; stronger models generally produce better results for complex, visually detailed interactive visualizations.
|
||||
|
||||
---
|
||||
|
||||
## Knowledge Base Setup Guide
|
||||
|
||||
Open WebUI's RAG system ingests documents and creates searchable vector embeddings in PGVector. This section provides an example knowledge base design for pharmaceutical contexts.
|
||||
|
||||
### Recommended Knowledge Base Structure
|
||||
### Example Knowledge Base Structure
|
||||
|
||||
| Knowledge Base | Contents | Functional Groups | Notes |
|
||||
|---|---|---|---|
|
||||
| `Compound Library` | Structures, SAR data, screening results, MoA summaries | R&D / Discovery | *High sensitivity - restrict strictly to R&D* |
|
||||
| `Assay Protocols` | Standard assay procedures, validation data, reference standards | R&D / Discovery | |
|
||||
| `Clinical Protocols` | Study protocols, ICH E6/E8/E9 references, SAPs | Clinical Operations | |
|
||||
| `CRF Templates` | Case report forms, data management plans, reconciliation guides | Clinical Operations | |
|
||||
| `Clinical Protocols` | Study protocols, ICH E6/E8/E9 references, SAPs | Clinical Development | |
|
||||
| `CRF Templates` | Case report forms, data management plans, reconciliation guides | Clinical Development | |
|
||||
| `Statistical Methods` | SAPs, CDISC standards, analysis dataset specifications | Biostatistics | |
|
||||
| `Regulatory Guidance` | FDA guidance library, EMA guidelines, ICH harmonized guidelines | Regulatory Affairs | *Consider splitting by region (FDA/EMA/PMDA)* |
|
||||
| `Submission Templates` | eCTD module templates, cover letters, precedent review correspondence | Regulatory Affairs | |
|
||||
| `PV Reference` | MedDRA hierarchy, CIOMS forms, signal detection SOPs, PSUR templates | Pharmacovigilance | *Reviewed documents only - no drafts* |
|
||||
| `PV Reference` | MedDRA hierarchy, CIOMS forms, signal detection SOPs, PSUR templates | Pharmacovigilance | *Reviewed documents only - no drafts (organizational policy — not enforced by the platform)* |
|
||||
| `Manufacturing SOPs` | Batch records, process validation reports, equipment qualification docs | Manufacturing / CMC | |
|
||||
| `Medical Information` | Product monographs, SmPCs, congress posters, medical response letters | Medical Affairs | |
|
||||
| `Company Policies` | HR handbook, compliance policies, IT security procedures, training guides | All groups | |
|
||||
@@ -796,23 +887,23 @@ Open WebUI's RAG system ingests documents and creates searchable vector embeddin
|
||||
3. Upload documents (supported formats: PDF, DOCX, TXT, Markdown, HTML, CSV, XLSX, PPTX)
|
||||
4. Open WebUI automatically:
|
||||
- Extracts text from uploaded documents
|
||||
- Chunks the content for optimal retrieval
|
||||
- Chunks the content for retrieval
|
||||
- Generates vector embeddings and stores them in PGVector
|
||||
5. Users in the assigned groups can now reference this knowledge base in chat by typing `#` followed by the knowledge base name
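
The chunking step is easier to reason about with a toy version. The sketch below is a fixed-size character splitter with overlap, used here as a stand-in: the splitter Open WebUI actually applies, and its settings, are not described in this guide, so the function name and the default sizes are assumptions.

```python
def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once a window reaches the end, so no chunk is a pure repeat of the tail.
        if start + size >= len(text):
            break
        start += step
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, as long as it is shorter than the overlap. Larger chunks keep more context per hit but make each retrieved passage less specific.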
|
||||
|
||||
### RAG Best Practices
|
||||
|
||||
- **Regulatory submissions**: Large eCTD modules should be split by section (e.g., upload Module 2.3 Quality Overall Summary separately from Module 3.2.P Drug Product). This improves retrieval precision significantly.
|
||||
- **SOPs and batch records**: These are typically well-structured documents that RAG handles effectively. Use descriptive filenames that include the SOP number and revision (e.g., `SOP-MFG-042-Rev3-Tablet-Compression.pdf`).
|
||||
- **Regulatory submissions**: Large eCTD modules should be split by section (e.g., upload Module 2.5 Quality Overall Summary separately from Module 3.2.P Drug Product). This can improve retrieval precision.
|
||||
- **SOPs and batch records**: These are typically well-structured documents that tend to work well with RAG retrieval. Use descriptive filenames that include the SOP number and revision (e.g., `SOP-MFG-042-Rev3-Tablet-Compression.pdf`).
|
||||
- **Literature databases**: For large literature collections (1,000+ papers), consider organizing into topic-specific knowledge bases rather than one monolithic collection. This lets users target their retrieval.
|
||||
- **Citation verification**: RAG provides relevance scores with each retrieved chunk. Scientists must always verify citations against the source - RAG reduces hallucination but does not eliminate it. **All AI-generated content must be reviewed by qualified personnel before use in any clinical, regulatory, or safety-critical context.** This is especially critical for PV and regulatory use cases.
|
||||
- **Citation verification**: RAG provides relevance scores with each retrieved chunk. Scientists must always verify citations against the source - RAG can help reduce certain types of hallucination by grounding responses in retrieved documents, but does not eliminate it. **All AI-generated content must be reviewed by qualified personnel before use in any clinical, regulatory, or safety-critical context.** This is especially critical for PV and regulatory use cases.
|
||||
- **Version control**: When SOPs are revised or guidance documents update, upload the new version and remove the old one. Knowledge bases can be updated without downtime. Maintain a document version log outside Open WebUI for your QMS.
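The filename convention suggested for SOPs above can be checked before upload. The following is a minimal shell sketch, assuming a `SOP-<AREA>-<NUMBER>-Rev<N>-<Title>.pdf` pattern inferred from the single example in this guide. The function name `is_valid_sop_filename` and the exact regular expression are hypothetical and would need adapting to your own document numbering scheme.

```shell
# Hypothetical helper: succeeds (exit 0) when a filename follows the assumed
# SOP-<AREA>-<NUMBER>-Rev<N>-<Title>.pdf convention, fails (exit 1) otherwise.
is_valid_sop_filename() {
  # Strip any directory prefix so only the filename itself is checked.
  sop_name=${1##*/}
  # AREA: uppercase letters; NUMBER and Rev: digits; Title: letters, digits, hyphens.
  printf '%s\n' "$sop_name" \
    | LC_ALL=C grep -Eq '^SOP-[A-Z]+-[0-9]+-Rev[0-9]+-[A-Za-z0-9-]+\.pdf$'
}
```

One way to use it is to loop over a staging directory and print a warning for each file where `is_valid_sop_filename "$f"` fails, so that non-conforming names are fixed before they reach a knowledge base.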
|
||||
|
||||
### Embedding Model Selection
|
||||
|
||||
The Docker Compose stack pulls `nomic-embed-text` via Ollama for generating embeddings locally. Configure this in **Admin Panel → Settings → Documents → Embedding Model**.
|
||||
|
||||
For higher-quality embeddings (recommended for 10,000+ document deployments), consider using a dedicated embedding endpoint. Set `RAG_OPENAI_API_BASE_URL` to point to a self-hosted embedding service or use Ollama's built-in embedding support (both options keep data on your infrastructure when configured accordingly).
|
||||
For alternative embedding configurations (which may be beneficial for 10,000+ document deployments), consider using a dedicated embedding endpoint. Set `RAG_OPENAI_API_BASE_URL` to point to a self-hosted embedding service or use Ollama's built-in embedding support (both options keep data on your infrastructure when configured accordingly).
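As a rough illustration of the dedicated-endpoint option, the `.env` fragment below sketches how an OpenAI-compatible embedding service on the internal network might be wired in. The hostname `embeddings.internal`, the port, the key, and the model name are placeholders rather than values from this guide, and the variable names other than `RAG_OPENAI_API_BASE_URL` should be verified against the Open WebUI environment variable documentation for the version you deploy.

```shell
# Hypothetical .env fragment: all values below are placeholders.
RAG_EMBEDDING_ENGINE=openai
RAG_OPENAI_API_BASE_URL=http://embeddings.internal:8080/v1
RAG_OPENAI_API_KEY=replace-with-your-key
RAG_EMBEDDING_MODEL=your-embedding-model-name
```

Keeping the base URL on an internal hostname is what keeps document text on your own infrastructure. Pointing it at a public API would send chunk contents off-site.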
|
||||
|
||||
---
|
||||
|
||||
@@ -844,15 +935,15 @@ The following checklist describes operational security measures. **This is not a
|
||||
- [ ] `ENABLE_ADMIN_CHAT_ACCESS=False` - restricts IT administrators from viewing user conversation content at the application level
|
||||
- [ ] `ENABLE_ADMIN_EXPORT=False` - disables bulk data extraction at the application level
|
||||
- [ ] `USER_PERMISSIONS_CHAT_DELETE=False` - disables chat deletion at the application level
|
||||
- [ ] `USER_PERMISSIONS_CHAT_TEMPORARY=False` - no unlogged conversations
|
||||
- [ ] `ENABLE_COMMUNITY_SHARING=False` - no external data sharing
|
||||
- [ ] `USER_PERMISSIONS_CHAT_TEMPORARY=False` - disables the temporary chat option in the UI
|
||||
- [ ] `ENABLE_COMMUNITY_SHARING=False` - disables the community sharing feature
|
||||
- [ ] PostgreSQL configured with encryption at rest (transparent data encryption or full-disk encryption on the host)
|
||||
- [ ] Redis `requirepass` set if Redis is network-accessible (not needed when Redis is internal-only via Docker network)
|
||||
- [ ] Backup encryption enabled (see [Backup & Disaster Recovery](#backup--disaster-recovery))
|
||||
|
||||
### Model & Inference Security
|
||||
|
||||
- [ ] When configured for local-only inference, all models run via Ollama or vLLM on your infrastructure
|
||||
- [ ] When configured for local-only inference, all models can run via Ollama or vLLM on your infrastructure
|
||||
- [ ] Hugging Face token is stored only in `.env`, not committed to version control
|
||||
- [ ] `.env` file has restrictive permissions: `chmod 600 .env`
|
||||
- [ ] For production deployments, consider migrating secrets from `.env` to a dedicated secrets manager (e.g., HashiCorp Vault, AWS Secrets Manager)
|
||||
@@ -860,6 +951,16 @@ The following checklist describes operational security measures. **This is not a
|
||||
- [ ] Docker image tags pinned to specific versions (not `:main` or `:latest`) for reproducible, auditable deployments
|
||||
- [ ] If Functions are used: LLM-Guard or equivalent function installed for prompt injection scanning
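The `.env` permission item in the checklist above lends itself to an automated check. The following is a minimal sketch, not part of the reference stack. The function name `check_env_file` is hypothetical, and the `stat` fallback assumes either GNU coreutils (`-c`) or BSD/macOS (`-f`) semantics.

```shell
# Hypothetical helper: verifies that an env file exists and has mode 600.
# Returns 0 on success, 1 if the file is missing or has looser permissions.
check_env_file() {
  env_file=${1:-.env}
  if [ ! -f "$env_file" ]; then
    echo "[ERROR] $env_file not found"
    return 1
  fi
  # GNU stat uses -c '%a'; BSD/macOS stat uses -f '%Lp'.
  env_mode=$(stat -c '%a' "$env_file" 2>/dev/null || stat -f '%Lp' "$env_file")
  if [ "$env_mode" != "600" ]; then
    echo "[WARN] $env_file has mode $env_mode; expected 600"
    return 1
  fi
  echo "[OK] $env_file has mode 600"
}
```

Calling `check_env_file .env` from a pre-deployment script or CI job would surface a loosened file mode before the stack is started.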
|
||||
|
||||
### Open Terminal Security
|
||||
|
||||
- [ ] `OPEN_TERMINAL_API_KEY` is set — without it, the terminal endpoint is unauthenticated (the Docker Compose reference above includes this key by default)
|
||||
- [ ] Open Terminal container is on the internal Docker network only — not exposed to external traffic
|
||||
- [ ] Resource limits applied: `memory: 4G` and `cpus: 4.0` (or appropriate for your environment)
|
||||
- [ ] `OPEN_TERMINAL_MAX_SESSIONS=16` to help limit resource consumption from concurrent terminal sessions
|
||||
- [ ] Docker socket is **not** mounted (`/var/run/docker.sock`) — unless explicitly required and the environment is trusted
|
||||
- [ ] Named volume mounted at `/home/user` for file persistence across container restarts
|
||||
- [ ] Open Terminal access restricted to appropriate functional groups via Admin Settings → Integrations
|
||||
|
||||
### Operational Security
|
||||
|
||||
- [ ] `ENABLE_DB_MIGRATIONS=True` on exactly one node; `False` on all others
|
||||
@@ -881,7 +982,8 @@ The following checklist describes operational security measures. **This is not a
|
||||
| **Redis** | `redis-data` volume | AOF + RDB snapshots (handled by Redis config) |
|
||||
| **Ollama models** | `ollama-data` volume | Volume snapshot or re-pull (models are public) |
|
||||
| **Open WebUI data** | `owui-data` volume | Volume snapshot |
|
||||
| **Configuration** | `.env`, `nginx/`, `docker-compose.yml` | Git repository (exclude secrets) |
|
||||
| **Open Terminal data** | `terminal-data` volume | Volume snapshot (or ephemeral — rebuild on demand) |
|
||||
| **Configuration** | `.env`, `nginx/`, `docker-compose.yml` | Git repository (`.env` must be excluded via `.gitignore` — it contains credentials) |
|
||||
| **TLS certificates** | `nginx/certs/` | Certificate management system |
|
||||
|
||||
### Automated PostgreSQL Backup Script
|
||||
@@ -912,7 +1014,7 @@ docker compose exec -T postgres pg_dump \
|
||||
# Verify backup integrity (pg_restore runs on the host against the host-side file)
|
||||
pg_restore --list "${BACKUP_FILE}" > /dev/null 2>&1 \
|
||||
&& echo "[OK] Backup verified: ${BACKUP_FILE}" \
|
||||
|| echo "[ERROR] Backup verification failed: ${BACKUP_FILE}"
|
||||
|| { echo "[ERROR] Backup verification failed: ${BACKUP_FILE}"; exit 1; }
|
||||
|
||||
# Prune old backups
|
||||
find "${BACKUP_DIR}" -name "openwebui_*.sql.gz" -mtime +${RETENTION_DAYS} -delete
|
||||
@@ -929,6 +1031,8 @@ echo "[INFO] Backup complete. Size: $(du -h "${BACKUP_FILE}" | cut -f1)"
|
||||
|
||||
### RPO / RTO Targets
|
||||
|
||||
> **Note:** The following targets are estimates based on the reference architecture above and assume properly configured infrastructure. Actual recovery performance depends on your specific deployment, network conditions, and data volume.
|
||||
|
||||
| Scenario | RPO (Data Loss) | RTO (Downtime) |
|
||||
|---|---|---|
|
||||
| Single node failure | 0 (stateless, auto-recovered) | < 30 seconds (health check interval) |
|
||||
@@ -940,4 +1044,4 @@ For mission-critical deployments, enable PostgreSQL WAL archiving for point-in-t
|
||||
|
||||
---
|
||||
|
||||
*This guide is maintained alongside [What Would It Take for a Pharma Company to Run AI On Its Own Infrastructure?](article.md). For questions about enterprise deployment, contact [sales@openwebui.com](mailto:sales@openwebui.com).*
|
||||
*This guide is maintained alongside [What Would It Take for a Pharma Company to Run AI On Its Own Infrastructure?](article.md). See the [article disclaimer](article.md#disclaimer) for important limitations and responsibilities. For questions about enterprise deployment, contact [sales@openwebui.com](mailto:sales@openwebui.com).*
|
||||
|
||||
@@ -56,6 +56,7 @@ generate_secret() {
|
||||
}
|
||||
|
||||
if [ ! -f .env ]; then
|
||||
TERMINAL_KEY=$(generate_secret)
|
||||
cat > .env << EOF
|
||||
# =============================================================================
|
||||
# Open WebUI — Pharma Industry Environment Configuration
|
||||
@@ -86,6 +87,10 @@ VLLM_MAX_MODEL_LEN=8192
|
||||
VLLM_API_KEY=$(generate_secret)
|
||||
HF_TOKEN=hf_your_token_here
|
||||
|
||||
# --- Open Terminal ---
|
||||
OPEN_TERMINAL_API_KEY=${TERMINAL_KEY}
|
||||
TERMINAL_SERVER_CONNECTIONS='[{"url":"http://open-terminal:8000","key":"${TERMINAL_KEY}"}]'
|
||||
|
||||
# --- Workers ---
|
||||
UVICORN_WORKERS=4
|
||||
|
||||
@@ -152,6 +157,10 @@ echo " - privkey.pem (private key)"
|
||||
echo " 3. Update nginx/nginx.conf server_name to match your domain"
|
||||
echo " 4. Start the stack: docker compose up -d"
|
||||
echo " 5. Access the UI at: https://ai.yourcompany.com"
|
||||
echo " 6. Install the Inline Visualizer tool & skill (see setup.md Step 5)"
|
||||
echo ""
|
||||
echo " Open Terminal is pre-configured via TERMINAL_SERVER_CONNECTIONS."
|
||||
echo " Verify it's connected: Admin Settings → Integrations → Open Terminal"
|
||||
echo ""
|
||||
echo " To check service health: docker compose ps"
|
||||
echo " To view logs: docker compose logs -f open-webui-1"
|
||||
|
||||