mirror of
https://github.com/langchain-ai/langsmith-self-hosted-workshops.git
synced 2026-07-01 20:44:14 -04:00
Remove deprecated 01_aws_preflight.ipynb and update references

- Delete AWS-specific preflight notebook (replaced by cloud-agnostic version)
- Update README and teardown notebook to reference `01_preflight.ipynb`

The cloud-agnostic version supports both AWS and Azure deployments.
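A rename like this is easy to leave half-finished, so a quick search for leftover mentions of the old notebook name can help. The sketch below is illustrative only and is not part of this commit: the helper name `find_stale_references` and the choice to scan only `.md` and `.ipynb` files are assumptions.

```python
from pathlib import Path

OLD_NAME = "01_aws_preflight.ipynb"  # notebook deleted by this commit
# Assumption: references to the notebook live only in Markdown docs and notebooks.
SCANNED_SUFFIXES = {".md", ".ipynb"}


def find_stale_references(root: Path, needle: str = OLD_NAME) -> list[tuple[Path, int]]:
    """Return (file, 1-based line number) pairs for every line that mentions `needle`."""
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories, unrelated file types, and Git internals.
        if not path.is_file() or path.suffix not in SCANNED_SUFFIXES:
            continue
        if ".git" in path.parts:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path, lineno))
    return hits


if __name__ == "__main__":
    # Run from the repository root; prints nothing when no references remain.
    for path, lineno in find_stale_references(Path.cwd()):
        print(f"{path}:{lineno}")
```

An empty result suggests the README and teardown notebook updates covered every mention. Any hit points at a file that still needs the new `01_preflight.ipynb` name.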
@@ -180,7 +180,7 @@ git clone https://github.com/langchain-ai/helm.git <your-helm-path>
 ### 3. Start the Workshop

 1. Read `docs/modules/module-1.md` for module overview and context
-2. Open `notebooks/module-1/01_aws_preflight.ipynb` in Jupyter
+2. Open `notebooks/module-1/01_preflight.ipynb` in Jupyter
 3. Run the bootstrap cell (first cell) to validate your environment
 4. Follow the notebook cells sequentially
@@ -1,589 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Module 1: Preflight Checks\n",
-    "\n",
-    "## Overview\n",
-    "\n",
-    "This notebook validates your environment before deploying LangSmith. Most self-hosted failures occur **before** users ever touch the product due to:\n",
-    "\n",
-    "- Mis-sized clusters\n",
-    "- Unsupported ingress setups\n",
-    "- In-cluster databases used past their limits\n",
-    "- Missing storage primitives (blob, PVs)\n",
-    "\n",
-    "This preflight ensures you start from a **supported baseline**.\n",
-    "\n",
-    "## What We'll Check\n",
-    "\n",
-    "1. ✅ Tooling validation (cloud CLI, terraform, kubectl, helm, jq)\n",
-    "2. ✅ Cloud provider credentials & region sanity check\n",
-    "3. ✅ Cluster capacity expectations\n",
-    "4. ✅ Storage prerequisites (CSI drivers, StorageClasses)\n",
-    "5. ✅ Blob storage requirement (cloud object storage)\n",
-    "\n",
-    "**Estimated time:** 20-30 minutes\n",
-    "\n",
-    "**Supported Cloud Providers:** AWS, Azure (GCP coming soon)\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Bootstrap environment\n",
-    "import sys\n",
-    "from pathlib import Path\n",
-    "\n",
-    "# Add notebooks directory to path so we can import shared as a package\n",
-    "# Find the notebooks directory by looking for the shared folder\n",
-    "possible_paths = [\n",
-    "    Path.cwd().parent, # If cwd is module-1, go up one level to notebooks\n",
-    "    Path.cwd(), # If cwd is already notebooks\n",
-    "    Path.cwd() / \"notebooks\", # If cwd is workspace root\n",
-    "]\n",
-    "\n",
-    "notebooks_path = None\n",
-    "for path in possible_paths:\n",
-    "    if path and (path / \"shared\" / \"_bootstrap.py\").exists():\n",
-    "        notebooks_path = path\n",
-    "        break\n",
-    "\n",
-    "if not notebooks_path:\n",
-    "    # Fallback: try workspace root\n",
-    "    notebooks_path = Path.cwd() / \"notebooks\"\n",
-    "    if not (notebooks_path / \"shared\" / \"_bootstrap.py\").exists():\n",
-    "        raise RuntimeError(f\"Could not find notebooks/shared directory. Current dir: {Path.cwd()}\")\n",
-    "\n",
-    "# Add notebooks directory to path so 'shared' can be imported as a package\n",
-    "if str(notebooks_path) not in sys.path:\n",
-    "    sys.path.insert(0, str(notebooks_path))\n",
-    "\n",
-    "from shared._bootstrap import bootstrap\n",
-    "\n",
-    "# Run bootstrap: loads env, checks tools, validates AWS, creates artifacts dir\n",
-    "bootstrap_info = bootstrap()\n",
-    "print(f\"\\nBootstrap complete! Artifacts directory: {bootstrap_info['artifacts_dir']}\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Cloud Provider Account & Region Validation\n",
-    "\n",
-    "Verify you're using the correct cloud provider account/subscription and region. This is critical for avoiding accidental deployments to production or wrong regions.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import os\n",
-    "import json\n",
-    "from shared._cloud_helpers import (\n",
-    "    get_cloud_provider,\n",
-    "    get_region,\n",
-    "    get_identity,\n",
-    "    assert_account,\n",
-    ")\n",
-    "from shared._validation import require_env, print_config, ok, warn\n",
-    "\n",
-    "# Get cloud configuration\n",
-    "provider = get_cloud_provider()\n",
-    "region = get_region()\n",
-    "identity = get_identity()\n",
-    "\n",
-    "provider_display = provider.upper()\n",
-    "print(f\"### Current {provider_display} Session\")\n",
-    "print(f\"Cloud Provider: {provider_display}\")\n",
-    "print(f\"Region: {region}\")\n",
-    "\n",
-    "if provider == \"aws\":\n",
-    "    print(f\"Account ID: {identity['Account']}\")\n",
-    "    print(f\"User ARN: {identity['Arn']}\")\n",
-    "    account_var = \"AWS_ACCOUNT_ID\"\n",
-    "elif provider == \"azure\":\n",
-    "    subscription_id = identity.get(\"SubscriptionId\") or identity.get(\"Account\", \"\")\n",
-    "    subscription_name = identity.get(\"SubscriptionName\", \"\")\n",
-    "    print(f\"Subscription ID: {subscription_id}\")\n",
-    "    print(f\"Subscription Name: {subscription_name}\")\n",
-    "    account_var = \"AZURE_SUBSCRIPTION_ID\"\n",
-    "else:\n",
-    "    account_var = None\n",
-    "\n",
-    "# Optional: Validate against expected account/subscription\n",
-    "if account_var:\n",
-    "    expected_account = os.environ.get(account_var, \"\").strip()\n",
-    "    if expected_account:\n",
-    "        assert_account(expected_account)\n",
-    "    else:\n",
-    "        warn(f\"{account_var} not set in environment - skipping account validation\")\n",
-    "        print(f\"💡 Tip: Set {account_var} in your .env file to add a guardrail against wrong account deployments\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Required Environment Variables\n",
-    "\n",
-    "Verify that all required configuration is present. These values will be used throughout the deployment.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Check required environment variables\n",
-    "from shared._cloud_helpers import get_cloud_provider\n",
-    "\n",
-    "provider = get_cloud_provider()\n",
-    "\n",
-    "# Base required vars (cloud-agnostic)\n",
-    "required_vars = [\n",
-    "    \"WORKSHOP_NAME\",\n",
-    "    \"NAMESPACE\",\n",
-    "    \"CLUSTER_NAME\",\n",
-    "    \"TERRAFORM_DIR\",\n",
-    "    \"HELM_RELEASE\",\n",
-    "    \"HELM_NAMESPACE\",\n",
-    "    \"HELM_CHART_REF\",\n",
-    "]\n",
-    "\n",
-    "# Add cloud-specific required vars\n",
-    "if provider == \"aws\":\n",
-    "    required_vars.append(\"AWS_REGION\")\n",
-    "elif provider == \"azure\":\n",
-    "    required_vars.append(\"AZURE_LOCATION\")\n",
-    "\n",
-    "config = require_env(*required_vars)\n",
-    "\n",
-    "# Optional but recommended (cloud-specific)\n",
-    "optional_vars = {}\n",
-    "if provider == \"aws\":\n",
-    "    optional_vars = {\n",
-    "        \"AWS_PROFILE\": os.environ.get(\"AWS_PROFILE\", \"\"),\n",
-    "        \"AWS_ACCOUNT_ID\": os.environ.get(\"AWS_ACCOUNT_ID\", \"\"),\n",
-    "        \"VALUES_FILE\": os.environ.get(\"VALUES_FILE\", \"\"),\n",
-    "    }\n",
-    "elif provider == \"azure\":\n",
-    "    optional_vars = {\n",
-    "        \"AZURE_SUBSCRIPTION_ID\": os.environ.get(\"AZURE_SUBSCRIPTION_ID\", \"\"),\n",
-    "        \"AZURE_RESOURCE_GROUP\": os.environ.get(\"AZURE_RESOURCE_GROUP\", \"\"),\n",
-    "        \"VALUES_FILE\": os.environ.get(\"VALUES_FILE\", \"\"),\n",
-    "    }\n",
-    "\n",
-    "print(\"\\n### Configuration Summary\")\n",
-    "print(f\"Cloud Provider: {provider.upper()}\")\n",
-    "print_config(config, redact_keys={\"AWS_PROFILE\"})\n",
-    "print(\"\\n### Optional Configuration\")\n",
-    "for k, v in optional_vars.items():\n",
-    "    if v:\n",
-    "        print(f\"- {k}: {v}\")\n",
-    "    else:\n",
-    "        print(f\"- {k}: (not set)\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Cluster Capacity Expectations\n",
-    "\n",
-    "LangSmith requires adequate cluster resources. Before deploying, understand what you'll need:\n",
-    "\n",
-    "- **Minimum:** 3 nodes, 4 vCPU, 16GB RAM each (for development/testing)\n",
-    "- **Recommended:** 3 nodes, 8 vCPU, 32GB RAM each (for production workloads)\n",
-    "- **Storage:** EBS CSI driver required for ClickHouse PVCs\n",
-    "\n",
-    "Let's check if a cluster already exists and validate its configuration.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from shared._aws_helpers import eks_cluster_exists\n",
-    "from shared._shell import run\n",
-    "\n",
-    "cluster_name = os.environ[\"CLUSTER_NAME\"]\n",
-    "region = aws_region()\n",
-    "\n",
-    "print(f\"### Checking EKS Cluster: {cluster_name}\")\n",
-    "print(f\"Region: {region}\\n\")\n",
-    "\n",
-    "if eks_cluster_exists(cluster_name):\n",
-    "    ok(f\"Cluster '{cluster_name}' exists\")\n",
-    "    \n",
-    "    # Get cluster details\n",
-    "    result = run(\n",
-    "        [\"aws\", \"eks\", \"describe-cluster\", \"--name\", cluster_name, \"--region\", region, \"--output\", \"json\"],\n",
-    "        check=True,\n",
-    "        stream=False\n",
-    "    )\n",
-    "    cluster_info = json.loads(result.stdout)[\"cluster\"]\n",
-    "    \n",
-    "    print(f\"\\nCluster Status: {cluster_info['status']}\")\n",
-    "    print(f\"Kubernetes Version: {cluster_info['version']}\")\n",
-    "    print(f\"Platform Version: {cluster_info.get('platformVersion', 'N/A')}\")\n",
-    "    \n",
-    "    # Check node groups\n",
-    "    print(\"\\n### Node Groups\")\n",
-    "    ng_result = run(\n",
-    "        [\"aws\", \"eks\", \"list-nodegroups\", \"--cluster-name\", cluster_name, \"--region\", region, \"--output\", \"json\"],\n",
-    "        check=True,\n",
-    "        stream=False\n",
-    "    )\n",
-    "    nodegroups = json.loads(ng_result.stdout).get(\"nodegroups\", [])\n",
-    "    \n",
-    "    if nodegroups:\n",
-    "        for ng in nodegroups:\n",
-    "            ng_detail = run(\n",
-    "                [\"aws\", \"eks\", \"describe-nodegroup\", \"--cluster-name\", cluster_name, \n",
-    "                 \"--nodegroup-name\", ng, \"--region\", region, \"--output\", \"json\"],\n",
-    "                check=True,\n",
-    "                stream=False\n",
-    "            )\n",
-    "            ng_info = json.loads(ng_detail.stdout)[\"nodegroup\"]\n",
-    "            scaling = ng_info.get(\"scalingConfig\", {})\n",
-    "            print(f\"\\n Node Group: {ng}\")\n",
-    "            print(f\" Status: {ng_info['status']}\")\n",
-    "            print(f\" Desired: {scaling.get('desiredSize', 'N/A')}\")\n",
-    "            print(f\" Min: {scaling.get('minSize', 'N/A')}\")\n",
-    "            print(f\" Max: {scaling.get('maxSize', 'N/A')}\")\n",
-    "            print(f\" Instance Types: {', '.join(ng_info.get('instanceTypes', []))}\")\n",
-    "    else:\n",
-    "        warn(\"No node groups found\")\n",
-    "        print(\"💡 You'll need to create node groups when deploying with Terraform\")\n",
-    "else:\n",
-    "    warn(f\"Cluster '{cluster_name}' does not exist yet\")\n",
-    "    print(\"💡 This is expected if you haven't run Terraform yet. Proceed to notebook 02_terraform_apply.ipynb\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Storage Prerequisites\n",
-    "\n",
-    "LangSmith requires persistent storage for ClickHouse. The cloud storage CSI driver must be installed and StorageClasses must be configured.\n",
-    "\n",
-    "**Why this matters:** Without the appropriate CSI driver, ClickHouse PVCs will remain in `Pending` state forever.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Check if kubectl is configured for the cluster\n",
-    "from shared._cloud_helpers import (\n",
-    "    get_cloud_provider,\n",
-    "    get_region,\n",
-    "    configure_kubectl,\n",
-    "    get_storage_driver_name,\n",
-    ")\n",
-    "\n",
-    "provider = get_cloud_provider()\n",
-    "cluster_name = os.environ[\"CLUSTER_NAME\"]\n",
-    "region = get_region()\n",
-    "storage_driver = get_storage_driver_name()\n",
-    "\n",
-    "k8s_service = \"EKS\" if provider == \"aws\" else \"AKS\" if provider == \"azure\" else \"Kubernetes\"\n",
-    "print(f\"### Configuring kubectl for {k8s_service} cluster\")\n",
-    "try:\n",
-    "    # Configure kubectl (cloud-agnostic)\n",
-    "    configure_kubectl(cluster_name, region)\n",
-    "    ok(\"kubectl configured for cluster\")\n",
-    "    \n",
-    "    # Check CSI driver (cloud-specific labels)\n",
-    "    print(f\"\\n### Checking {storage_driver} Driver\")\n",
-    "    \n",
-    "    if provider == \"aws\":\n",
-    "        driver_label = \"app=ebs-csi-controller\"\n",
-    "        driver_name = \"EBS CSI\"\n",
-    "    elif provider == \"azure\":\n",
-    "        driver_label = \"app=csi-azuredisk-controller\"\n",
-    "        driver_name = \"Azure Disk CSI\"\n",
-    "    else:\n",
-    "        driver_label = None\n",
-    "        driver_name = \"Storage CSI\"\n",
-    "    \n",
-    "    if driver_label:\n",
-    "        result = run(\n",
-    "            [\"kubectl\", \"get\", \"daemonset\", \"-n\", \"kube-system\", \"-l\", driver_label, \"-o\", \"json\"],\n",
-    "            check=False,\n",
-    "            stream=False\n",
-    "        )\n",
-    "        \n",
-    "        if result.returncode == 0 and result.stdout.strip():\n",
-    "            ds_info = json.loads(result.stdout)\n",
-    "            if ds_info.get(\"items\"):\n",
-    "                ok(f\"{driver_name} driver is installed\")\n",
-    "                print(f\" DaemonSet: {ds_info['items'][0]['metadata']['name']}\")\n",
-    "            else:\n",
-    "                warn(f\"{driver_name} driver not found\")\n",
-    "                print(f\"💡 {driver_name} driver must be installed before deploying LangSmith\")\n",
-    "                print(\" The Terraform module should handle this, but verify after deployment\")\n",
-    "        else:\n",
-    "            warn(f\"{driver_name} driver not found\")\n",
-    "            print(f\"💡 {driver_name} driver must be installed before deploying LangSmith\")\n",
-    "    \n",
-    "    # Check StorageClasses\n",
-    "    print(\"\\n### Checking StorageClasses\")\n",
-    "    result = run(\n",
-    "        [\"kubectl\", \"get\", \"storageclass\", \"-o\", \"json\"],\n",
-    "        check=True,\n",
-    "        stream=False\n",
-    "    )\n",
-    "    sc_list = json.loads(result.stdout)\n",
-    "    \n",
-    "    # Find cloud-specific storage classes\n",
-    "    if provider == \"aws\":\n",
-    "        storage_scs = [sc for sc in sc_list.get(\"items\", []) if \"ebs\" in sc[\"metadata\"][\"name\"].lower() or \n",
-    "                       sc.get(\"provisioner\", \"\").endswith(\"ebs.csi.aws.com\")]\n",
-    "    elif provider == \"azure\":\n",
-    "        storage_scs = [sc for sc in sc_list.get(\"items\", []) if \"disk\" in sc[\"metadata\"][\"name\"].lower() or \n",
-    "                       sc.get(\"provisioner\", \"\").endswith(\"disk.csi.azure.com\")]\n",
-    "    else:\n",
-    "        storage_scs = []\n",
-    "    \n",
-    "    if storage_scs:\n",
-    "        ok(f\"Found {len(storage_scs)} {storage_driver} StorageClass(es):\")\n",
-    "        for sc in storage_scs:\n",
-    "            name = sc[\"metadata\"][\"name\"]\n",
-    "            default = sc.get(\"metadata\", {}).get(\"annotations\", {}).get(\"storageclass.kubernetes.io/is-default-class\", \"false\")\n",
-    "            print(f\" - {name} (default: {default})\")\n",
-    "    else:\n",
-    "        warn(f\"No {storage_driver} StorageClasses found\")\n",
-    "        print(f\"💡 At least one {storage_driver} StorageClass is required for ClickHouse PVCs\")\n",
-    "    \n",
-    "except Exception as e:\n",
-    "    warn(f\"Could not check storage prerequisites: {e}\")\n",
-    "    print(\"💡 This is expected if the cluster doesn't exist yet\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Blob Storage Requirement\n",
-    "\n",
-    "**Critical:** LangSmith requires cloud object storage (S3, Blob Storage, etc.) for blob storage in production. Inline trace payloads will explode ClickHouse if blob storage is not configured.\n",
-    "\n",
-    "Let's verify access to your cloud provider's object storage service and check if a storage account/bucket exists or needs to be created.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from shared._cloud_helpers import (\n",
-    "    get_cloud_provider,\n",
-    "    get_region,\n",
-    "    get_blob_storage_service_name,\n",
-    "    verify_blob_storage_access,\n",
-    ")\n",
-    "from shared._shell import run\n",
-    "import json\n",
-    "\n",
-    "provider = get_cloud_provider()\n",
-    "region = get_region()\n",
-    "blob_service = get_blob_storage_service_name()\n",
-    "\n",
-    "print(f\"### {blob_service} Access Check\")\n",
-    "print(f\"Cloud Provider: {provider.upper()}\")\n",
-    "print(f\"Region: {region}\\n\")\n",
-    "\n",
-    "# Test blob storage access\n",
-    "try:\n",
-    "    if provider == \"aws\":\n",
-    "        result = run(\n",
-    "            [\"aws\", \"s3\", \"ls\", \"--region\", region],\n",
-    "            check=True,\n",
-    "            stream=False\n",
-    "        )\n",
-    "        ok(f\"{blob_service} access verified\")\n",
-    "        \n",
-    "        # List buckets\n",
-    "        buckets_result = run(\n",
-    "            [\"aws\", \"s3api\", \"list-buckets\", \"--output\", \"json\"],\n",
-    "            check=True,\n",
-    "            stream=False\n",
-    "        )\n",
-    "        buckets = json.loads(buckets_result.stdout).get(\"Buckets\", [])\n",
-    "        \n",
-    "        print(f\"\\nFound {len(buckets)} S3 bucket(s):\")\n",
-    "        for bucket in buckets[:10]: # Show first 10\n",
-    "            print(f\" - {bucket['Name']} (created: {bucket['CreationDate']})\")\n",
-    "        \n",
-    "        if len(buckets) > 10:\n",
-    "            print(f\" ... and {len(buckets) - 10} more\")\n",
-    "    \n",
-    "    elif provider == \"azure\":\n",
-    "        result = run(\n",
-    "            [\"az\", \"storage\", \"account\", \"list\", \"--output\", \"json\"],\n",
-    "            check=True,\n",
-    "            stream=False\n",
-    "        )\n",
-    "        ok(f\"{blob_service} access verified\")\n",
-    "        \n",
-    "        # List storage accounts\n",
-    "        accounts = json.loads(result.stdout)\n",
-    "        \n",
-    "        print(f\"\\nFound {len(accounts)} Storage Account(s):\")\n",
-    "        for account in accounts[:10]: # Show first 10\n",
-    "            name = account.get(\"name\", \"N/A\")\n",
-    "            location = account.get(\"location\", \"N/A\")\n",
-    "            print(f\" - {name} (location: {location})\")\n",
-    "        \n",
-    "        if len(accounts) > 10:\n",
-    "            print(f\" ... and {len(accounts) - 10} more\")\n",
-    "    \n",
-    "    print(f\"\\n💡 Note: The Terraform module should create a {blob_service} resource for LangSmith blob storage\")\n",
-    "    print(\" Verify the resource exists after Terraform deployment\")\n",
-    "    \n",
-    "except Exception as e:\n",
-    "    warn(f\"{blob_service} access check failed: {e}\")\n",
-    "    if provider == \"aws\":\n",
-    "        print(\"💡 Ensure your AWS credentials have S3 permissions\")\n",
-    "    elif provider == \"azure\":\n",
-    "        print(\"💡 Ensure your Azure credentials have Storage Account permissions\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Terraform & Helm Repository Paths\n",
-    "\n",
-    "Verify that the Terraform and Helm repository paths are correctly configured and accessible.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import re\n",
-    "from pathlib import Path\n",
-    "from shared._validation import ok, warn\n",
-    "\n",
-    "def expand_env_vars(path_str: str) -> str:\n",
-    "    \"\"\"Expand environment variable references in a path string.\"\"\"\n",
-    "    # Expand $VAR and ${VAR} references\n",
-    "    def replace_var(match):\n",
-    "        var_name = match.group(1) or match.group(2)\n",
-    "        return os.environ.get(var_name, match.group(0))\n",
-    "    \n",
-    "    # Replace $VAR and ${VAR} patterns\n",
-    "    path_str = re.sub(r'\\$\\{([^}]+)\\}|\\$([a-zA-Z_][a-zA-Z0-9_]*)', replace_var, path_str)\n",
-    "    return path_str\n",
-    "\n",
-    "# Expand environment variables in paths (e.g., $TERRAFORM_REPO_DIR, $HELM_REPO_DIR, $HOME)\n",
-    "terraform_dir_str = expand_env_vars(os.environ[\"TERRAFORM_DIR\"])\n",
-    "terraform_dir = Path(terraform_dir_str).expanduser().resolve()\n",
-    "\n",
-    "helm_chart_ref_str = expand_env_vars(os.environ[\"HELM_CHART_REF\"])\n",
-    "helm_chart_ref = Path(helm_chart_ref_str).expanduser().resolve()\n",
-    "\n",
-    "print(\"### Repository Paths Check\\n\")\n",
-    "\n",
-    "# Check Terraform directory\n",
-    "print(f\"Terraform Directory: {terraform_dir}\")\n",
-    "if terraform_dir.exists():\n",
-    "    ok(f\"Terraform directory exists\")\n",
-    "    \n",
-    "    # Check for main.tf or similar\n",
-    "    tf_files = list(terraform_dir.glob(\"*.tf\"))\n",
-    "    if tf_files:\n",
-    "        print(f\" Found {len(tf_files)} Terraform file(s)\")\n",
-    "    else:\n",
-    "        warn(\"No .tf files found in Terraform directory\")\n",
-    "        print(\"💡 Ensure you're pointing to the correct Terraform module path\")\n",
-    "else:\n",
-    "    warn(f\"Terraform directory does not exist: {terraform_dir}\")\n",
-    "    print(\"💡 Update TERRAFORM_DIR in your .env file to point to the langchain-ai/terraform repo\")\n",
-    "\n",
-    "# Check Helm chart\n",
-    "print(f\"\\nHelm Chart Reference: {helm_chart_ref}\")\n",
-    "if helm_chart_ref.exists():\n",
-    "    ok(f\"Helm chart path exists\")\n",
-    "    \n",
-    "    # Check for Chart.yaml\n",
-    "    chart_yaml = helm_chart_ref / \"Chart.yaml\"\n",
-    "    if chart_yaml.exists():\n",
-    "        print(f\" Found Chart.yaml\")\n",
-    "    else:\n",
-    "        warn(\"Chart.yaml not found\")\n",
-    "        print(\"💡 Ensure you're pointing to the correct Helm chart path\")\n",
-    "else:\n",
-    "    warn(f\"Helm chart path does not exist: {helm_chart_ref}\")\n",
-    "    print(\"💡 Update HELM_CHART_REF in your .env file to point to the langchain-ai/helm chart\")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Preflight Summary\n",
-    "\n",
-    "Review the checklist below. All items should be ✅ before proceeding to Terraform deployment.\n",
-    "\n",
-    "### ✅ Checklist\n",
-    "\n",
-    "- [ ] All required tools installed (cloud CLI, terraform, kubectl, helm, jq)\n",
-    "- [ ] Cloud provider credentials valid and correct account/subscription/region\n",
-    "- [ ] Required environment variables set\n",
-    "- [ ] Terraform directory path correct\n",
-    "- [ ] Helm chart path correct\n",
-    "- [ ] Blob storage access verified (S3/Blob Storage)\n",
-    "- [ ] (If cluster exists) Storage CSI driver installed\n",
-    "- [ ] (If cluster exists) StorageClasses configured\n",
-    "\n",
-    "### Next Steps\n",
-    "\n",
-    "If all checks pass, proceed to **02_terraform_apply.ipynb** to deploy the infrastructure.\n",
-    "\n",
-    "If any checks failed, review the warnings above and fix the issues before continuing.\n"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.14.2"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
@@ -425,7 +425,7 @@
     "\n",
     "If you want to start over:\n",
     "1. Review and update your `.env` file\n",
-    "2. Run `01_aws_preflight.ipynb` again\n",
+    "2. Run `01_preflight.ipynb` again\n",
     "3. Proceed through the module notebooks\n",
     "\n",
     "**Thank you for completing Module 1!**\n"