Web Interface Overview
What This Guide Covers
This comprehensive guide teaches you how to use the DON Research API web interface and its features:
- Homepage Navigation - Understanding the main documentation page
- Swagger UI - Interactive API testing in your browser (no code required!)
- Data Formats - H5AD files, GEO accessions, URLs, gene lists, text queries
- Bio Module - ResoTrace integration for advanced workflows
- Complete Workflows - Step-by-step examples from start to finish
- Troubleshooting - Solutions for common errors
⢠Homepage: https://don-research.onrender.com/ - Quick reference docs
⢠This Guide: https://don-research.onrender.com/guide - Detailed tutorials
⢠Swagger UI: https://don-research.onrender.com/docs - Interactive testing
Who Should Use This Guide?
This guide is designed for Texas A&M researchers who:
- Have received their API token (via secure email)
- Want to test endpoints in their browser before writing code
- Need detailed explanations of Bio module features
- Want complete workflow examples for common tasks
- Need troubleshooting help for API errors
Swagger UI Tutorial: Test APIs in Your Browser
What is Swagger UI?
Swagger UI is an interactive tool that lets you test API endpoints directly in your web browser without writing any code. It's perfect for:
- Learning how endpoints work before writing scripts
- Validating your API token
- Testing with small datasets
- Debugging API calls
- Seeing real-time request/response data
Accessing Swagger UI
- Open your web browser (Chrome, Firefox, Safari, or Edge)
- Navigate to: https://don-research.onrender.com/docs
- You'll see a list of all available API endpoints organized by category
Step-by-Step: Testing Your First Endpoint
Step 1: Authenticate with Your Token
- Look for the green "Authorize" button at the top right of the page
- Click it to open the authentication dialog
- In the "Value" field, enter:
Bearer your-tamu-token-here
- Important: Include the word "Bearer " (with a space) before your token
- Example:
Bearer tamu_cai_lab_2025_HkRs17sgvbjnQax2KzD1iqYcHWbAs5xvZZ2ApKptWuc
- Click the "Authorize" button
- Click "Close" to return to the main page
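In a script, the same "Bearer " rule applies to the Authorization header. The sketch below is one way to normalize a pasted token so the prefix appears exactly once; the helper name `build_auth_header` is hypothetical and not part of the API.

```python
def build_auth_header(token: str) -> dict:
    """Return an Authorization header with exactly one 'Bearer ' prefix.

    Hypothetical helper: accepts a token pasted with or without the prefix.
    """
    cleaned = token.strip()
    # Drop an existing prefix so it is never duplicated.
    if cleaned.lower().startswith("bearer "):
        cleaned = cleaned[len("bearer "):].strip()
    # Reject empty tokens and tokens with embedded whitespace.
    if not cleaned or any(ch.isspace() for ch in cleaned):
        raise ValueError("token is empty or contains whitespace")
    return {"Authorization": f"Bearer {cleaned}"}


headers = build_auth_header("  your-tamu-token-here ")
print(headers)
```

The resulting dictionary can be passed as `headers=` to `requests.get` or `requests.post`.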
Step 2: Select an Endpoint to Test
Endpoints are organized into sections. For your first test, try the health check:
- Scroll down to find GET /health
- Click on it to expand the endpoint details
Step 3: Execute the Request
- Click the "Try it out" button (top right of the endpoint section)
- Click the "Execute" button (blue button)
- Wait a moment for the response...
Step 4: View the Results
After execution, you'll see three important sections:
Request URL:
https://don-research.onrender.com/health
Response Body: (should look like this)
{
  "status": "ok",
  "timestamp": "2025-10-27T15:30:45.123Z",
  "database": "connected",
  "don_stack": {
    "mode": "real",
    "don_gpu": true,
    "tace": true,
    "qac": true
  }
}
Response Headers:
X-Trace-ID: tamu_20251027_abc123def
content-type: application/json
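The response body shown above can also be checked programmatically. The following sketch parses a health response and reports whether the fields match the example; treating `status`, `database`, and `don_stack.mode` as the readiness criteria is an assumption for illustration, and `don_stack_ready` is a hypothetical helper.

```python
import json

# Sample body copied from the example response in this guide.
SAMPLE_HEALTH_BODY = """
{
  "status": "ok",
  "timestamp": "2025-10-27T15:30:45.123Z",
  "database": "connected",
  "don_stack": {"mode": "real", "don_gpu": true, "tace": true, "qac": true}
}
"""


def don_stack_ready(health_body: str) -> bool:
    """Return True when the health payload matches the example values.

    Assumption: "ready" means status ok, database connected, mode real.
    """
    payload = json.loads(health_body)
    stack = payload.get("don_stack", {})
    return (
        payload.get("status") == "ok"
        and payload.get("database") == "connected"
        and stack.get("mode") == "real"
    )


print(don_stack_ready(SAMPLE_HEALTH_BODY))
```

With `requests`, the same function could be called on `response.text` from a GET to `/health`.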
Testing File Upload Endpoints
Example: Build Feature Vectors
- Find POST /api/v1/genomics/vectors/build
- Click to expand → Click "Try it out"
- file parameter: Click "Choose File" → Select your .h5ad file
- mode parameter: Select cluster from dropdown
- Click "Execute"
- View response with vector counts and preview data
⢠Use small datasets (< 5,000 cells) in Swagger UI
⢠For large files, use Python scripts instead
⢠Don't upload sensitive/unpublished data via browser
⢠Copy the curl command shown for use in scripts
Understanding Response Codes
| Code | Meaning | Action |
|---|---|---|
| 200 | Success | Request completed successfully |
| 400 | Bad Request | Check parameter format (e.g., file extension) |
| 401 | Unauthorized | Check token format (must include "Bearer ") |
| 429 | Rate Limit Exceeded | Wait 1 hour or use fewer requests |
| 500 | Server Error | Contact support with trace_id |
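In a script, the table above can be turned into a small lookup so that each failure prints a next step. This is a minimal sketch; the `suggested_action` helper and the fallback message for unlisted codes are assumptions, not API behavior.

```python
# Actions copied from the response code table in this guide.
ACTIONS = {
    200: "Request completed successfully",
    400: "Check parameter format (e.g., file extension)",
    401: 'Check token format (must include "Bearer ")',
    429: "Wait 1 hour or use fewer requests",
    500: "Contact support with trace_id",
}


def suggested_action(status_code: int) -> str:
    """Map an HTTP status code to the action listed in the table."""
    # Fallback text is an assumption for codes the guide does not list.
    return ACTIONS.get(status_code, "Unlisted status code; inspect the response body")


for code in (200, 401, 404):
    print(code, "->", suggested_action(code))
```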
Supported Data Formats
Format Overview
The DON Research API supports multiple input formats, not just H5AD files:
| Format | Example | Use Case | Supported Endpoints |
|---|---|---|---|
| H5AD Files | pbmc3k.h5ad | Direct upload of single-cell data (AnnData format) | All genomics + Bio module |
| GEO Accessions | GSE12345 | Auto-download from NCBI GEO database | /load |
| Direct URLs | https://example.com/data.h5ad | Download from external HTTP/HTTPS sources | /load |
| Gene Lists (JSON) | ["CD3E", "CD8A", "CD4"] | Encode cell type marker genes as query vectors | /query/encode |
| Text Queries | "T cell markers in PBMC" | Natural language biological queries | /query/encode |
⢠Bio Module endpoints require H5AD files only (for ResoTrace compatibility)
⢠Genomics endpoints support all formats above
⢠GEO accessions and URLs are automatically downloaded and converted to H5AD
Format-Specific Usage Examples
1. H5AD Files (Most Common)
import requests

TOKEN = "your-tamu-token-here"  # replace with your API token

# Upload H5AD file directly; the request must run while the file is open
with open("pbmc_3k.h5ad", "rb") as f:
    files = {"file": ("pbmc_3k.h5ad", f, "application/octet-stream")}
    response = requests.post(
        "https://don-research.onrender.com/api/v1/genomics/vectors/build",
        headers={"Authorization": f"Bearer {TOKEN}"},
        files=files,
        data={"mode": "cluster"},
    )
print(response.json())
2. GEO Accessions
# Automatically downloads from NCBI GEO
data = {"accession_or_path": "GSE12345"}
response = requests.post(
    "https://don-research.onrender.com/api/v1/genomics/load",
    headers={"Authorization": f"Bearer {TOKEN}"},
    data=data,
)
h5ad_path = response.json()["h5ad_path"]
print(f"Downloaded to: {h5ad_path}")
3. Direct URLs
# Download from any HTTP/HTTPS source
data = {"accession_or_path": "https://example.com/dataset.h5ad"}
response = requests.post(
    "https://don-research.onrender.com/api/v1/genomics/load",
    headers={"Authorization": f"Bearer {TOKEN}"},
    data=data,
)
h5ad_path = response.json()["h5ad_path"]
4. Gene Lists (JSON)
import json

# T cell marker genes
gene_list = ["CD3E", "CD8A", "CD4", "IL7R", "CCR7"]
data = {"gene_list_json": json.dumps(gene_list)}
response = requests.post(
    "https://don-research.onrender.com/api/v1/genomics/query/encode",
    headers={"Authorization": f"Bearer {TOKEN}"},
    data=data,
)
query_vector = response.json()["psi"]  # 128-dimensional vector
5. Text Queries
# Natural language query
data = {
    "text": "T cell markers in PBMC tissue",
    "cell_type": "T cell",
    "tissue": "PBMC",
}
response = requests.post(
    "https://don-research.onrender.com/api/v1/genomics/query/encode",
    headers={"Authorization": f"Bearer {TOKEN}"},
    data=data,
)
query_vector = response.json()["psi"]
When to Use Each Format
- H5AD files: When you have preprocessed single-cell data locally
- GEO accessions: When referencing published datasets (e.g., "GSE12345" from papers)
- URLs: When data is hosted externally (collaborator's server, cloud storage)
- Gene lists: When searching for specific cell types by marker genes
- Text queries: When exploring data without knowing exact gene names
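The choice between formats can be sketched as a small classifier that maps a user-supplied string to the format and endpoint listed in the table above. The `classify_input` helper is hypothetical, and the rule that GEO accessions look like `GSE` followed by digits is an assumption based on the `GSE12345` example.

```python
import json
import re
from urllib.parse import urlparse

# Assumption: only GSE-style series accessions, as in the guide's example.
GEO_PATTERN = re.compile(r"^GSE\d+$")


def classify_input(value: str) -> tuple:
    """Return (format_name, endpoint_hint) for a user-supplied string."""
    text = value.strip()
    if GEO_PATTERN.match(text):
        return ("geo_accession", "/load")
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return ("url", "/load")
    if text.lower().endswith(".h5ad"):
        return ("h5ad_file", "file upload")
    try:
        decoded = json.loads(text)
    except ValueError:
        decoded = None
    # A non-empty JSON array of strings is treated as a gene list.
    if isinstance(decoded, list) and decoded and all(isinstance(g, str) for g in decoded):
        return ("gene_list", "/query/encode")
    # Anything else falls back to a natural language query.
    return ("text_query", "/query/encode")


for example in ("GSE12345", "pbmc3k.h5ad", '["CD3E", "CD8A"]'):
    print(example, "->", classify_input(example))
```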
𧬠Bio Module: ResoTrace Integration
Overview
The Bio Module provides advanced single-cell analysis workflows optimized for ResoTrace integration. Key capabilities:
- Export Artifacts: Convert H5AD → ResoTrace collapse maps
- Signal Sync: Compare pipeline runs for reproducibility
- Parasite Detection: QC for ambient RNA, doublets, batch effects
- Evolution Report: Track pipeline stability over parameter changes
Sync vs Async Execution Modes
Every Bio endpoint supports two execution modes:
| Mode | When to Use | Response Time | Best For |
|---|---|---|---|
| sync=true | Small datasets (< 5K cells) | Immediate (< 30s) | Quick validation, exploratory analysis, Swagger UI testing |
| sync=false | Large datasets (> 10K cells) | Background job | Production pipelines, batch processing, automated workflows |
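A script can pick the mode from the cell count. The sketch below follows the thresholds in the table; the table does not say what to do between 5K and 10K cells, so defaulting that range to the background mode is an assumption, and `choose_sync` is a hypothetical helper.

```python
def choose_sync(n_cells: int) -> str:
    """Return the value for the `sync` form field based on dataset size."""
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    # Under 5K cells: synchronous, per the table.
    # 5K and above: background job. The 5K-10K range is not specified in
    # the guide; treating it as async is an assumption.
    return "true" if n_cells < 5_000 else "false"


for n in (2_700, 7_500, 50_000):
    print(n, "cells -> sync =", choose_sync(n))
```

The returned string matches the form-field style used in this guide's examples (`"sync": "true"`).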
Feature 1: Export Artifacts
POST /api/v1/bio/export-artifacts
What it does:
- Converts .h5ad files into ResoTrace-compatible formats
- Generates collapse maps (cluster graph structure)
- Exports cell-level vector collections (128D embeddings)
- Includes PAGA connectivity (if available)
Required parameters:
- file: H5AD file upload
- cluster_key: Column in adata.obs with cluster labels (e.g., "leiden")
- latent_key: Embedding in adata.obsm (e.g., "X_umap", "X_pca")
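Before uploading, it can help to confirm locally that the chosen keys exist in the dataset. The sketch below checks key names against plain collections; `missing_export_keys` is a hypothetical helper, and the default key names are taken from the examples above.

```python
def missing_export_keys(obs_columns, obsm_keys,
                        cluster_key="leiden", latent_key="X_umap") -> list:
    """Return a list of problems; an empty list means both keys are present."""
    problems = []
    if cluster_key not in obs_columns:
        problems.append(f"cluster_key '{cluster_key}' not found in adata.obs")
    if latent_key not in obsm_keys:
        problems.append(f"latent_key '{latent_key}' not found in adata.obsm")
    return problems


# Stand-in values; with AnnData these would come from the loaded object.
print(missing_export_keys(["leiden", "n_genes"], ["X_pca"]))
```

With a loaded AnnData object, the call would be `missing_export_keys(adata.obs.columns, adata.obsm.keys())`.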
Example (Synchronous):
with open("pbmc_3k.h5ad", "rb") as f:
    files = {"file": ("pbmc.h5ad", f, "application/octet-stream")}
    data = {
        "cluster_key": "leiden",
        "latent_key": "X_umap",
        "sync": "true",
        "project_id": "cai_lab_pbmc_study",
        "user_id": "researcher_001",
    }
    response = requests.post(
        "https://don-research.onrender.com/api/v1/bio/export-artifacts",
        headers={"Authorization": f"Bearer {TOKEN}"},
        files=files,
        data=data,
    )
result = response.json()
print(f"Exported {result['nodes']} clusters")
print(f"{result['vectors']} cell vectors")
print(f"Trace ID: {result.get('trace_id')}")
Feature 2: Parasite Detector (QC)
POST /api/v1/bio/qc/parasite-detect
What it does:
- Flags low-quality cells ("parasites")
- Detects: Ambient RNA, doublets, batch effects
- Returns per-cell boolean flags
- Computes overall contamination score
Recommended Actions:
| Parasite Score | Quality | Action |
|---|---|---|
| 0-5% | Excellent | Proceed without filtering |
| 5-15% | Good | Minor filtering recommended |
| 15-30% | Moderate | Filter flagged cells |
| > 30% | Poor | Review QC pipeline |
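The table above can be applied in code once a parasite score is available. In this sketch the bands overlap at 5% and 15% in the table, so assigning those boundary values to the higher band is an assumption; `parasite_action` is a hypothetical helper.

```python
def parasite_action(score_percent: float) -> tuple:
    """Return (quality, action) for a parasite score given in percent."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score must be between 0 and 100")
    # Boundary handling (5 and 15 go to the higher band) is an assumption.
    if score_percent < 5:
        return ("Excellent", "Proceed without filtering")
    if score_percent < 15:
        return ("Good", "Minor filtering recommended")
    if score_percent <= 30:
        return ("Moderate", "Filter flagged cells")
    return ("Poor", "Review QC pipeline")


for score in (2.0, 12.5, 42.0):
    print(score, "->", parasite_action(score))
```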
For complete Bio module documentation, see the homepage or Swagger UI.
Complete Workflow Examples
Workflow 1: Cell Type Discovery with T Cells
Goal: Identify T cell clusters in PBMC dataset using marker genes
import requests
import json

API_URL = "https://don-research.onrender.com"
TOKEN = "your-tamu-token-here"
headers = {"Authorization": f"Bearer {TOKEN}"}

# Step 1: Build cluster vectors
print("Step 1: Building vectors...")
with open("pbmc_3k.h5ad", "rb") as f:
    files = {"file": ("pbmc_3k.h5ad", f, "application/octet-stream")}
    response = requests.post(
        f"{API_URL}/api/v1/genomics/vectors/build",
        headers=headers,
        files=files,
        data={"mode": "cluster"},
    )
vectors_result = response.json()
jsonl_path = vectors_result["jsonl"]
print(f"Built {vectors_result['count']} cluster vectors")

# Step 2: Encode T cell query
print("\nStep 2: Encoding T cell markers...")
t_cell_genes = ["CD3E", "CD8A", "CD4", "IL7R"]
query_data = {"gene_list_json": json.dumps(t_cell_genes)}
response = requests.post(
    f"{API_URL}/api/v1/genomics/query/encode",
    headers=headers,
    data=query_data,
)
query_vector = response.json()["psi"]
print("Encoded query vector (128 dimensions)")

# Step 3: Search for matching clusters
print("\nStep 3: Searching for T cell-like clusters...")
search_data = {
    "jsonl_path": jsonl_path,
    "psi": json.dumps(query_vector),
    "k": 5,
}
response = requests.post(
    f"{API_URL}/api/v1/genomics/vectors/search",
    headers=headers,
    data=search_data,
)
results = response.json()["hits"]
print("\nTop 5 T cell-like clusters:")
for i, hit in enumerate(results, 1):
    cluster_id = hit['meta']['cluster']
    distance = hit['distance']
    cells = hit['meta']['cells']
    print(f"{i}. Cluster {cluster_id}: distance={distance:.4f}, cells={cells}")
Workflow 2: QC Pipeline with Parasite Detection
Goal: Clean dataset by detecting and removing low-quality cells
import requests
import scanpy as sc
import numpy as np

API_URL = "https://don-research.onrender.com"
TOKEN = "your-tamu-token-here"
headers = {"Authorization": f"Bearer {TOKEN}"}

# Step 1: Detect parasites
print("Step 1: Detecting QC parasites...")
with open("pbmc_raw.h5ad", "rb") as f:
    files = {"file": ("pbmc_raw.h5ad", f, "application/octet-stream")}
    data = {
        "cluster_key": "leiden",
        "batch_key": "sample",
        "sync": "true",
    }
    response = requests.post(
        f"{API_URL}/api/v1/bio/qc/parasite-detect",
        headers=headers,
        files=files,
        data=data,
    )
result = response.json()
flags = result["flags"]  # per-cell boolean flags
parasite_score = result["parasite_score"]
print(f"Parasite score: {parasite_score:.1f}%")

# Step 2: Filter flagged cells (keep cells whose flag is False)
print("\nStep 2: Filtering flagged cells...")
adata = sc.read_h5ad("pbmc_raw.h5ad")
adata = adata[~np.array(flags, dtype=bool), :]
adata.write_h5ad("pbmc_cleaned.h5ad")
print(f"Saved {adata.n_obs} clean cells")
Troubleshooting Common Errors
Error 401: Authentication Failed
{"detail": "Invalid or missing token"}
Solutions:
- Verify token format includes "Bearer " prefix
- Check for extra whitespace in token
- Confirm token hasn't expired (valid 1 year)
Error 400: File Upload Failed
{"detail": "Expected .h5ad file"}
Solutions:
- Verify file has .h5ad extension
- Validate AnnData format: sc.read_h5ad("file.h5ad")
- Check file size < 500MB
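The extension and size checks above can be run locally before any upload. This is a minimal sketch: `check_upload` is a hypothetical helper, and reading "500MB" as 500 * 1024 * 1024 bytes is an assumption. It does not validate the AnnData contents; `sc.read_h5ad` remains the way to do that.

```python
import os
import tempfile

# Assumption: the 500MB limit is interpreted as 500 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 500 * 1024 * 1024


def check_upload(path: str) -> list:
    """Return a list of problems found before uploading; empty means OK."""
    problems = []
    if not path.lower().endswith(".h5ad"):
        problems.append("file must have a .h5ad extension")
    if not os.path.isfile(path):
        problems.append("file does not exist")
        return problems
    if os.path.getsize(path) >= MAX_UPLOAD_BYTES:
        problems.append("file must be smaller than 500MB")
    return problems


# Demo: an empty temporary .h5ad file passes; a missing .csv file fails twice.
with tempfile.NamedTemporaryFile(suffix=".h5ad") as tmp:
    demo_ok = check_upload(tmp.name)
demo_bad = check_upload("definitely_missing_file.csv")
print(demo_ok, demo_bad)
```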
Error 429: Rate Limit Exceeded
{"detail": "Rate limit exceeded"}
Solutions:
- Wait 1 hour for rate limit reset (1,000 req/hour)
- Implement exponential backoff in scripts
- Use cluster mode instead of cell mode
- Contact support for higher limits if needed
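Exponential backoff, as suggested above, can be sketched as a retry wrapper that doubles the wait after each 429 response. The `call_with_backoff` helper, its default attempt count, and its base delay are illustrative assumptions; because the limit resets hourly, short delays may not be enough once the quota is fully used.

```python
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send: zero-argument callable returning an object with .status_code.
    sleep: injectable so the wait can be replaced in tests.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Wait base_delay, then 2x, 4x, ... before the next attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return response


class FakeResponse:
    """Stand-in for a requests.Response in this offline demo."""

    def __init__(self, status_code):
        self.status_code = status_code


codes = iter([429, 429, 200])
delays = []
final = call_with_backoff(lambda: FakeResponse(next(codes)), sleep=delays.append)
print(final.status_code, delays)
```

With `requests`, `send` could be `lambda: requests.post(url, headers=headers, data=data)`; for file uploads the file object would need to be reopened or rewound on each attempt.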
Contact Support
When reporting issues, include:
- Institution: Texas A&M University (Cai Lab)
- API endpoint and method (e.g., POST /vectors/build)
- Full error message (JSON response)
- Trace ID from response header (X-Trace-ID)
- Dataset description (cells, genes, file size)
Email: support@donsystems.com | Response time: < 24 hours