San Francisco 311 City Services Agent
Learn Daemo by exploring a complete, working agent that queries 8+ million rows of San Francisco 311 service request data.
What is SF 311?
311 is a non-emergency municipal services hotline used by cities across the United States. Unlike 911 (which handles emergencies like fires, crimes, and medical situations), 311 is designed for everyday city service requests and information inquiries.
San Francisco's 311 System
San Francisco launched its 311 service in March 2007, and it has since become one of the most comprehensive municipal service request systems in the country. Residents can contact SF 311 via:
- Phone: Dial 3-1-1 from any phone in SF
- Mobile App: SF311 app for iOS and Android
- Website: sf311.org
- Twitter: @SF311
What Can You Report?
SF 311 handles a wide variety of non-emergency city services:
| Category | Examples |
|---|---|
| Street Issues | Potholes, broken sidewalks, damaged signs, streetlight outages |
| Cleanliness | Graffiti removal, illegal dumping, street cleaning requests |
| Encampments | Homeless encampment reports and cleanup requests |
| Utilities | Water leaks, sewer issues, blocked storm drains |
| Trees & Parks | Fallen trees, park maintenance, landscaping issues |
| Vehicles | Abandoned vehicles, blocked driveways, parking violations |
The Open Dataset
San Francisco publishes all 311 service requests as open data through the city's DataSF portal. This dataset:
- Contains 8+ million records dating back to 2008
- Is updated daily with new requests
- Includes request details, status, location, timestamps, and resolution info
- Is freely accessible via the Socrata Open Data API
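As a rough illustration of what "accessible via the Socrata Open Data API" means in practice, the sketch below builds a query URL for the dataset. The dataset ID (vw6y-z8j6) comes from the system prompt later in this guide; the host `data.sfgov.org`, the `/resource/<id>.json` path, and the helper name `buildSoqlUrl` are assumptions for illustration, not code taken from the template.

```typescript
// Sketch: build a Socrata query URL for the SF 311 dataset.
// Assumptions: the data.sfgov.org host and the /resource/<id>.json layout.
const SF311_DATASET_ID = "vw6y-z8j6";
const SF311_BASE_URL = `https://data.sfgov.org/resource/${SF311_DATASET_ID}.json`;

// Hypothetical helper: the whole SoQL statement travels in the $query parameter
// and has to be URL-encoded.
function buildSoqlUrl(soql: string): string {
  return `${SF311_BASE_URL}?$query=${encodeURIComponent(soql)}`;
}

const exampleSoql =
  "SELECT service_name, count(*) AS total GROUP BY service_name ORDER BY total DESC LIMIT 10";
const exampleUrl = buildSoqlUrl(exampleSoql);
console.log(exampleUrl);
```

Opening the printed URL in a browser or passing it to an HTTP client should return JSON rows, subject to Socrata's rate limits for unauthenticated requests.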
Why SF 311 for this example? It's a strong real-world dataset for demonstrating AI agents:
- Large scale: 8M+ rows tests performance patterns
- Real data: Actual city service requests, not synthetic data
- Public API: No authentication required for basic queries
- Structured: Well-defined schema with dates, categories, locations
- Meaningful: Questions have real-world relevance (city operations, civic tech)
The agent built on this data is an AI-powered data analyst that answers natural-language questions about SF 311 service requests (potholes, graffiti, encampments, street cleaning, and so on).
What This Agent Can Do
Ask questions in natural language and get answers backed by real data:
| Example Query | What It Does |
|---|---|
| "How many 311 incidents are there by type?" | Aggregates counts by service type |
| "What's the average time to close graffiti reports in the Mission?" | Analyzes cycle times by neighborhood |
| "Show me the most common complaints in District 6" | Filters and ranks by district |
| "Find recurring issues at Market and 5th" | Detects "zombie cases" at intersections |
Architecture
┌────────────────────┐ ┌──────────────────────┐ ┌──────────────────┐
│ User Query │────▶│ Daemo Engine │────▶│ SF 311 API │
│ (REST API) │ │ (LLM + Tools) │ │ (Socrata/SoQL) │
└────────────────────┘ └──────────────────────┘ └──────────────────┘
- User sends a query via HTTP POST to /agent/query
- Daemo Engine receives it and sends it to an LLM (Gemini/Anthropic/OpenAI)
- The LLM decides which tools to call based on the query and available functions
- Tools execute SoQL queries against SF's Open Data API (8M+ rows)
- Results are processed and returned as a natural language response
Quick Start
# Clone the repository
git clone https://github.com/Daemo-AI/daemo-sf-311-agent.git
cd daemo-sf-311-agent
# Install dependencies
npm install
# Configure your API keys
cp env.example .env
# Edit .env with your DAEMO_AGENT_API_KEY
# Start the server
npm run dev
Then query your agent:
curl -X POST http://localhost:5000/agent/query \
-H "Content-Type: application/json" \
-d '{"query": "What are the top 10 service types by count?"}'
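If you would rather call the agent from TypeScript than from curl, the same request can be sketched as below. `AgentRequest` and `buildAgentRequest` are hypothetical helpers written for this example; only the URL, method, header, and body shape come from the curl command above.

```typescript
// Sketch: the TypeScript equivalent of the curl call above.
// AgentRequest and buildAgentRequest are hypothetical, not part of the template.
interface AgentRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildAgentRequest(query: string, baseUrl: string = "http://localhost:5000"): AgentRequest {
  return {
    url: `${baseUrl}/agent/query`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // JSON.stringify takes care of quotes that are awkward to escape in a shell.
    body: JSON.stringify({ query }),
  };
}

const agentRequest = buildAgentRequest("What are the top 10 service types by count?");
console.log(agentRequest.url, agentRequest.body);
```

On Node 18 or later, `fetch(agentRequest.url, agentRequest)` should send it. The response shape is not documented in this guide, so the sketch stops short of parsing it.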
Project Structure
Understanding the codebase will help you adapt it for your own use case:
src/
├── app.ts # Express server setup
├── services/
│ ├── daemoService.ts # 🔑 Registers tools + system prompt
│ ├── sf311Functions.ts # 🔑 The AI tools (4 functions)
│ ├── sf311.schemas.ts # Zod schemas for inputs/outputs
│ └── socrataClient.ts # Helper for Socrata API
└── controllers/
└── agentController.ts # HTTP request handling
Focus on two files: daemoService.ts (registration + system prompt) and sf311Functions.ts (the actual tools). Everything else is boilerplate.
The Four AI Tools
The agent has 4 registered tools, each decorated with @DaemoFunction and described by Zod input and output schemas:
1. searchOrAggregate
The "Swiss Army Knife" for general SoQL queries — counts, groupings, filtering:
@DaemoFunction({
  description:
    "Execute a general search or aggregation using SoQL. Input MUST be a single object. Use this for general counts, grouping, and finding top records.",
  tags: ["311", "search", "aggregate"],
  category: "SF311",
  inputSchema: z.object({
    select: z.string().describe("Columns to select (e.g., 'service_name, count(*) as count')"),
    where: z.string().optional().describe("Filter conditions"),
    group_by: z.string().optional().describe("Group by columns"),
    order_by: z.string().optional().describe("Order by clause"),
    limit: z.number().default(100),
  }),
  outputSchema: z.object({
    results: z.array(z.any()),
    count: z.number(),
  }),
})
async searchOrAggregate(input: {
  select: string;
  where?: string;
  group_by?: string;
  order_by?: string;
  limit: number;
}) {
  // Assemble the SoQL statement clause by clause; optional clauses are skipped.
  let query = `SELECT ${input.select}`;
  if (input.where) query += ` WHERE ${input.where}`;
  if (input.group_by) query += ` GROUP BY ${input.group_by}`;
  if (input.order_by) query += ` ORDER BY ${input.order_by}`;
  query += ` LIMIT ${input.limit}`;
  const results = await this.runSoql(query);
  return { results, count: results.length };
}
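To see what string the tool ends up sending, the clause assembly can be pulled out into a standalone function. This is a sketch rather than the template's code: `SoqlParts`, `buildSoql`, and the `MAX_LIMIT` clamp are hypothetical additions, with the cap of 1000 taken from the system prompt's guidance.

```typescript
// Sketch: the clause assembly from searchOrAggregate as a pure function.
// SoqlParts, buildSoql and MAX_LIMIT are hypothetical names for illustration.
interface SoqlParts {
  select: string;
  where?: string;
  group_by?: string;
  order_by?: string;
  limit?: number;
}

const MAX_LIMIT = 1000; // assumption: mirrors the prompt's "limit your results" rule

function buildSoql(parts: SoqlParts): string {
  // Fall back to the schema default (100) when limit is missing or not a finite number.
  const requested =
    parts.limit !== undefined && isFinite(parts.limit) ? parts.limit : 100;
  const limit = Math.max(1, Math.min(MAX_LIMIT, Math.floor(requested)));

  let query = `SELECT ${parts.select}`;
  if (parts.where) query += ` WHERE ${parts.where}`;
  if (parts.group_by) query += ` GROUP BY ${parts.group_by}`;
  if (parts.order_by) query += ` ORDER BY ${parts.order_by}`;
  return `${query} LIMIT ${limit}`;
}

const trendQuery = buildSoql({
  select: "date_trunc_ym(requested_datetime) as month, count(*) as count",
  where: "service_name = 'Graffiti' AND requested_datetime > '2023-01-01T00:00:00'",
  group_by: "month",
  order_by: "month DESC",
});
console.log(trendQuery);
```

Clamping the limit in code is a design choice worth considering: the prompt asks the LLM to stay within bounds, but a guard in the tool holds even when the model ignores the instruction.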
2. analyzeCycleTimes
Measures how long cases take to close and computes the statistics in memory:
@DaemoFunction({
  description:
    "Analyze how long it takes to close cases. Calculates avg, median, min, max duration in days.",
  inputSchema: z.object({
    service_name_filter: z.string().optional().describe("Exact or prefix match for service name"),
    neighborhood: z.string().optional().describe("Exact neighborhood name"),
    days_ago: z.number().default(90).describe("Look back window in days"),
  }),
  outputSchema: z.object({
    total_closed_analyzed: z.number(),
    avg_days_to_close: z.number(),
    median_days_to_close: z.number(),
    max_days_to_close: z.number(),
    min_days_to_close: z.number(),
  }),
})
async analyzeCycleTimes(input: { ... }) { ... }
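The method body is elided above, so the following is only a guess at the in-memory step: given rows with `requested_datetime` and `closed_date`, compute the statistics named in the output schema. `ClosedCase`, `CycleStats`, and `summarizeCycleTimes` are hypothetical names, and the choice to skip rows with unparseable or negative durations is an assumption.

```typescript
// Sketch: in-memory cycle-time statistics (not the template's actual implementation).
interface ClosedCase {
  requested_datetime: string; // floating timestamp, e.g. "2024-01-01T00:00:00"
  closed_date: string;
}

interface CycleStats {
  total_closed_analyzed: number;
  avg_days_to_close: number;
  median_days_to_close: number;
  max_days_to_close: number;
  min_days_to_close: number;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function summarizeCycleTimes(cases: ClosedCase[]): CycleStats {
  const days: number[] = [];
  for (const c of cases) {
    // Floating timestamps have no zone; parsing both ends as UTC keeps the
    // difference independent of the machine's local DST rules.
    const elapsed =
      (Date.parse(c.closed_date + "Z") - Date.parse(c.requested_datetime + "Z")) / MS_PER_DAY;
    if (isFinite(elapsed) && elapsed >= 0) days.push(elapsed);
  }
  if (days.length === 0) {
    return {
      total_closed_analyzed: 0,
      avg_days_to_close: 0,
      median_days_to_close: 0,
      max_days_to_close: 0,
      min_days_to_close: 0,
    };
  }

  days.sort((a, b) => a - b);
  const sum = days.reduce((acc, d) => acc + d, 0);
  const mid = Math.floor(days.length / 2);
  // Even-length lists take the mean of the two middle values.
  const median = days.length % 2 === 1 ? days[mid] : (days[mid - 1] + days[mid]) / 2;

  return {
    total_closed_analyzed: days.length,
    avg_days_to_close: sum / days.length,
    median_days_to_close: median,
    max_days_to_close: days[days.length - 1],
    min_days_to_close: days[0],
  };
}
```

Reporting the median alongside the average matters for this kind of data: a handful of cases that stay open for months can pull the average far above what a typical request experiences.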
3. analyzeResubmissions
Detects "zombie cases" — issues closed and reopened within 7 days at the same location:
@DaemoFunction({
  description:
    "Identify 'Zombie' cases: Requests closed and then immediately resubmitted at the same location within 7 days.",
  inputSchema: z.object({
    service_name_filter: z.string().optional().describe("Service name prefix"),
    district: z.string().optional().describe("Supervisor district number"),
    days_to_analyze: z.number().default(30).describe("Time window"),
  }),
  outputSchema: z.object({
    total_cases_scanned: z.number(),
    potential_resubmissions: z.number(),
    resubmission_rate_percent: z.number(),
    examples: z.array(z.any()),
  }),
})
async analyzeResubmissions(input: { ... }) { ... }
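Since the implementation is elided, here is one plausible way the detection could work once rows are fetched: group cases by address and service type, sort each group by request time, and flag any request filed within the window after the previous case closed. `CaseRow`, `ResubmissionReport`, `toMillis`, and `findResubmissions` are hypothetical names, and the exact matching rule is an assumption.

```typescript
// Sketch: "zombie case" detection in memory (an assumed approach, not the template's code).
interface CaseRow {
  service_request_id: string;
  service_name: string;
  address: string;
  requested_datetime: string; // floating timestamp, e.g. "2024-01-05T09:30:00"
  closed_date?: string;
}

interface ResubmissionReport {
  total_cases_scanned: number;
  potential_resubmissions: number;
  resubmission_rate_percent: number;
  examples: CaseRow[];
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Floating timestamps carry no zone; parsing them as UTC keeps differences stable.
function toMillis(floating: string): number {
  return Date.parse(floating + "Z");
}

function findResubmissions(rows: CaseRow[], windowDays: number = 7): ResubmissionReport {
  // Group cases that share an address and a service type.
  const groups: Record<string, CaseRow[]> = {};
  for (const row of rows) {
    const key = `${row.address}|${row.service_name}`;
    (groups[key] = groups[key] || []).push(row);
  }

  const examples: CaseRow[] = [];
  for (const key of Object.keys(groups)) {
    const sorted = groups[key]
      .slice()
      .sort((a, b) => toMillis(a.requested_datetime) - toMillis(b.requested_datetime));
    for (let i = 1; i < sorted.length; i++) {
      const previous = sorted[i - 1];
      if (!previous.closed_date) continue; // previous case still open: not a resubmission
      const gap = toMillis(sorted[i].requested_datetime) - toMillis(previous.closed_date);
      if (gap >= 0 && gap <= windowDays * DAY_MS) examples.push(sorted[i]);
    }
  }

  const total = rows.length;
  return {
    total_cases_scanned: total,
    potential_resubmissions: examples.length,
    resubmission_rate_percent: total === 0 ? 0 : (examples.length / total) * 100,
    examples,
  };
}
```

Exact address matching is the weak point of this approach: "100 MARKET ST" and "100 Market Street" would land in different groups, so a real implementation may need to normalize addresses first.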
4. findIntersections
Specialized search for intersection-related queries:
@DaemoFunction({
  description: "Specialized search for intersection-related queries. Finds requests where address contains ' / '.",
  inputSchema: z.object({
    service_query: z.string().describe("Exact or partial service name"),
    days_ago: z.number().default(90),
  }),
  outputSchema: z.object({
    count: z.number(),
    results: z.array(z.any()),
  }),
})
async findIntersections(input: { service_query: string; days_ago: number }) { ... }
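The body is elided here too, so the sketch below only illustrates how the two inputs might turn into a SoQL WHERE clause. All three helpers are hypothetical. Two assumptions to flag: string literals are escaped by doubling single quotes, and the address test uses `like '% / %'`, which is a leading wildcard that is kept tolerable only by the date and prefix filters beside it.

```typescript
// Sketch: turning findIntersections inputs into a WHERE clause (assumed logic).

// Hypothetical helper: a cutoff in the 'YYYY-MM-DDThh:mm:ss' floating format.
// toISOString() reports UTC, so the cutoff is approximate relative to SF local time.
function floatingTimestampDaysAgo(daysAgo: number, now: Date = new Date()): string {
  const cutoff = new Date(now.getTime() - daysAgo * 24 * 60 * 60 * 1000);
  return cutoff.toISOString().slice(0, 19);
}

// Hypothetical helper: quote a value as a SoQL string literal.
// Assumption: embedded single quotes are escaped by doubling them.
function soqlString(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

function buildIntersectionWhere(
  serviceQuery: string,
  daysAgo: number,
  now: Date = new Date()
): string {
  return [
    `starts_with(service_name, ${soqlString(serviceQuery)})`,
    `requested_datetime > '${floatingTimestampDaysAgo(daysAgo, now)}'`,
    // Leading wildcard: acceptable only because the filters above narrow the scan.
    `address like '% / %'`,
  ].join(" AND ");
}

const fixedNow = new Date("2024-04-10T12:00:00Z");
console.log(buildIntersectionWhere("Graffiti", 90, fixedNow));
```

Passing `now` as a parameter keeps the function deterministic, which makes the look-back window easy to unit test.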
Notice the pattern: Each function uses inputSchema and outputSchema with Zod to give the LLM precise type information. This reduces hallucinated arguments and helps the AI call functions correctly.
Zod Schemas: The Contract
The sf311.schemas.ts file defines reusable schemas. Here's a taste:
import { z } from "zod";

// Input schema for searching cases
export const SearchCasesInputSchema = z.object({
  status: z.string().optional().describe("Case status (e.g., 'Open', 'Closed')"),
  neighborhood: z.string().optional().describe("Neighborhood name (e.g., 'Mission')"),
  service_name: z.string().optional().describe("Service category (e.g., 'Graffiti')"),
  supervisor_district: z.string().optional().describe("Supervisor district number (e.g., '9')"),
  days_old_min: z.number().optional().describe("Minimum age in days"),
  limit: z.number().optional().default(100).describe("Maximum number of results"),
});

// Output schema for a single case
export const CaseOutputSchema = z.object({
  service_request_id: z.string(),
  requested_datetime: z.string(),
  closed_date: z.string().optional(),
  status_description: z.string(),
  service_name: z.string(),
  address: z.string().optional(),
  neighborhoods_sffind_boundaries: z.string().optional(),
  days_open: z.number().optional(),
});
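Why Zod? The .describe() method adds documentation that gets passed to the LLM. This is how the AI learns that status is expected to be 'Open' or 'Closed' rather than arbitrary text. Because the field is a plain z.string(), the description guides the model but does not reject other values.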
The System Prompt (The Secret Sauce)
In daemoService.ts, there's a detailed system prompt (~90 lines) that instructs the LLM how to behave. This is where the magic happens. Here's the actual prompt:
const systemPrompt = `You are an expert Data Analyst for the San Francisco 311 Dataset (Socrata ID: vw6y-z8j6).
Your goal is to write high-performance SoQL queries to answer user questions about city infrastructure requests.
## ⚠️ CRITICAL: EXECUTION RULES
1. **SINGLE OBJECT ARGUMENTS**: When calling tools, you MUST pass a single JSON object.
- ✅ CORRECT: call searchOrAggregate({ "select": "...", "where": "..." })
- ❌ WRONG: call searchOrAggregate("...", "...")
2. **PERFORMANCE & TIMEOUTS**: This dataset has 8+ MILLION rows.
- **NEVER** use leading wildcards (e.g., \`LIKE '%Trash%'\`). This causes full table scans and WILL TIMEOUT.
- **INSTEAD**, use prefix searches: \`LIKE 'Trash%'\` or \`starts_with(service_name, 'Trash')\`.
- **ALWAYS** include a date filter if possible (e.g., \`requested_datetime > '2024-01-01T00:00:00'\`).
- **ALWAYS** limit your results to under 1000 (e.g., \`LIMIT 1000\`).
## 🧠 STRATEGY: "PROBE THEN ATTACK"
If you don't know the exact \`service_name\` or \`neighborhood\`, do not guess with wildcards.
1. **Probe**: specific groupings to find exact values.
- Query: "Show me top service names" -> \`SELECT service_name, count(*) GROUP BY service_name ORDER BY count(*) DESC LIMIT 1000\`
2. **Attack**: Once you have the exact name (e.g., 'Street and Sidewalk Cleaning'), run your detailed query using exact matches (\`=\`).
## 📚 SOCRATA (SoQL) SYNTAX GUIDE
### 1. Dates (floating_timestamp)
Format: ISO 8601 \`YYYY-MM-DDThh:mm:ss\`
- **Truncation**: \`date_trunc_ym(requested_datetime)\` (Group by month)
- **Extraction**: \`date_extract_hh(requested_datetime)\` (Hour of day)
- **Filter**: \`requested_datetime > '2023-01-01T00:00:00'\`
### 2. Text & Categories
- **Exact Match**: \`service_name = 'Encampments'\`
- **Case Sensitive**: Socrata 2.1+ is case sensitive. 'trash' != 'Trash'.
- **Prefix**: \`starts_with(service_name, 'Graffiti')\`
### 3. Location
- **Intersection**: Addresses often contain ' / ' or ' AND '.
- **Neighborhoods**: Use column \`neighborhoods_sffind_boundaries\`.
- **Districts**: Use column \`supervisor_district\` (1-11).
## 🛠️ TOOLKIT
### 1. searchOrAggregate
The "Swiss Army Knife" for SoQL. Use for almost everything.
- **Input**: \`{ select: string, where?: string, group_by?: string, order_by?: string, limit?: number }\`
- **Example (Trends)**:
- select: \`date_trunc_ym(requested_datetime) as month, count(*) as count\`
- where: \`service_name = 'Graffiti' AND requested_datetime > '2023-01-01T00:00:00'\`
- group_by: \`month\`
- order_by: \`month DESC\`
### 2. analyzeResubmissions
Use for "Zombie Cases" or "Reopened" questions.
- Logic: Finds clusters of cases at same address/type closed then reopened within 7 days.
### 3. analyzeCycleTimes
Use for "How long to close?" or "Duration" questions.
- Logic: Fetches raw start/end dates and calculates stat (Avg, Median) in memory.
### 4. findIntersections
Use ONLY for "requests at intersections".
- Optimized query looking for slash characters in addresses.
## DATA SCHEMA CHEATSHEET
- \`service_request_id\` (Text)
- \`requested_datetime\` (Floating Timestamp)
- \`closed_date\` (Floating Timestamp)
- \`status_description\` (Text: 'Open', 'Closed')
- \`service_name\` (Text: 'Street and Sidewalk Cleaning', 'Graffiti', etc.)
- \`service_subtype\` (Text: Specific type)
- \`supervisor_district\` (Number: 1-11)
- \`neighborhoods_sffind_boundaries\` (Text: 'Mission', 'Tenderloin', etc.)
- \`address\` (Text)
- \`source\` (Text: 'Mobile/Open311', 'Phone', 'Web')
`;
The system prompt is critical. A well-crafted prompt can be the difference between an agent that works and one that hallucinates. Notice how this prompt includes:
- Explicit rules for how to call functions
- Performance guardrails for a massive dataset
- A clear strategy ("Probe then Attack")
- Complete schema documentation
- Per-tool guidance on when to use each
Key Patterns in the System Prompt
| Pattern | Why It Matters |
|---|---|
| "Probe then attack" | Prevents errors from guessing field values |
| Explicit schema reference | LLM knows exact column names and types |
| Performance guardrails | Reduces timeouts on 8M+ rows (no leading wildcards, result limits, date filters) |
| Tool selection guidance | Tells LLM exactly when to use each tool |
| Syntax examples | Shows exact SoQL syntax the LLM should generate |
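The "probe then attack" strategy can be made concrete as two tool inputs with a lookup in between. This sketch mimics what the LLM is asked to do, written as plain functions; `ToolQuery`, `buildProbe`, `pickExactName`, and `buildAttack` are hypothetical names, and the quote-doubling escape is an assumption about SoQL string literals.

```typescript
// Sketch: "probe then attack" as two searchOrAggregate inputs (illustrative only).
interface ToolQuery {
  select: string;
  where?: string;
  group_by?: string;
  order_by?: string;
  limit: number;
}

// Step 1 (probe): list the exact service names that exist in the window.
function buildProbe(sinceIso: string): ToolQuery {
  return {
    select: "service_name, count(*) as count",
    where: `requested_datetime > '${sinceIso}'`,
    group_by: "service_name",
    order_by: "count(*) DESC",
    limit: 1000,
  };
}

// Between steps: match the user's loose wording against the probed values locally,
// so no wildcard ever reaches the 8M-row dataset.
function pickExactName(probeRows: { service_name: string }[], hint: string): string | undefined {
  const needle = hint.toLowerCase();
  for (const row of probeRows) {
    if (row.service_name.toLowerCase().indexOf(needle) !== -1) return row.service_name;
  }
  return undefined;
}

// Step 2 (attack): run the detailed query with an exact match.
function buildAttack(exactName: string, sinceIso: string): ToolQuery {
  const literal = `'${exactName.replace(/'/g, "''")}'`; // assumption: quotes escaped by doubling
  return {
    select: "date_trunc_ym(requested_datetime) as month, count(*) as count",
    where: `service_name = ${literal} AND requested_datetime > '${sinceIso}'`,
    group_by: "month",
    order_by: "month DESC",
    limit: 1000,
  };
}
```

The point of the pattern is that fuzzy matching happens on a few hundred distinct category names in memory, while the expensive query against the full table uses only an equality filter.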
Registration: Putting It All Together
Here's how the service is registered in daemoService.ts:
import { DaemoBuilder, DaemoHostedConnection, SessionData } from "daemo-engine";
import { SF311Functions } from "./sf311Functions";

let hostedConnection: DaemoHostedConnection | null = null;
let sessionData: SessionData | null = null;

export function initializeDaemoService(): SessionData {
  console.log("[Daemo] Initializing Daemo service...");
  const builder = new DaemoBuilder()
    .withServiceName("sf_311_service")
    .withSystemPrompt(systemPrompt); // The 90-line prompt above

  // Register the SF 311 service
  const sf311Functions = new SF311Functions();
  builder.registerService(sf311Functions);

  sessionData = builder.build();
  console.log(`[Daemo] Registered ${sessionData.Functions.length} functions`);
  return sessionData;
}

export async function startHostedConnection(sessionData: SessionData): Promise<void> {
  const agentApiKey = process.env.DAEMO_AGENT_API_KEY;
  const gatewayUrl = process.env.DAEMO_GATEWAY_URL || "localhost:50052";

  if (!agentApiKey) {
    console.warn("[Daemo] DAEMO_AGENT_API_KEY not set. Hosted connection will not start.");
    return;
  }

  hostedConnection = new DaemoHostedConnection(
    { daemoGatewayUrl: gatewayUrl, agentApiKey: agentApiKey },
    sessionData
  );
  await hostedConnection.start();
  console.log("[Daemo] Hosted connection started successfully");
}
Notice the pattern: Build → Register → Start. The DaemoBuilder collects your service configuration, then DaemoHostedConnection establishes the secure tunnel to the Daemo Engine.
Extending for Your Use Case
This template is designed to be easily customizable:
Replace the Data Source
- Create a new client (like socrataClient.ts) for your API
- Define new schemas for your data types
- Update the tool functions to query your source
Add New Tools
// In your functions file
@DaemoFunction({
  description: "Your new tool description",
})
async myNewTool(args: { param1: string; param2: number }) {
  // Your logic here
  return { result: "..." };
}
Then register in daemoService.ts:
builder.registerService(new MyNewService());
Adapt the System Prompt
Customize the system prompt for your domain:
- What data does your agent access?
- What query patterns should it follow?
- What are the performance considerations?
Environment Variables
| Variable | Required | Description |
|---|---|---|
| DAEMO_AGENT_API_KEY | ✅ | From app.daemo.ai |
| GEMINI_API_KEY | ✅* | For Gemini (default LLM) |
| ANTHROPIC_API_KEY | ✅* | For Claude |
| OPENAI_API_KEY | ✅* | For GPT-4 |
| SF_311_APP_TOKEN | Recommended | Prevents Socrata rate limits |
*One LLM provider key required
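A startup check for these variables might look like the sketch below. It is not part of the template: `EnvLike`, `EnvReport`, and `checkEnv` are hypothetical names, and the rules simply restate the table (one required agent key, at least one LLM provider key, a recommended Socrata token).

```typescript
// Sketch: validate the environment variables from the table above (hypothetical helper).
type EnvLike = Record<string, string | undefined>;

interface EnvReport {
  errors: string[];
  warnings: string[];
}

const LLM_PROVIDER_KEYS = ["GEMINI_API_KEY", "ANTHROPIC_API_KEY", "OPENAI_API_KEY"];

function checkEnv(env: EnvLike): EnvReport {
  const errors: string[] = [];
  const warnings: string[] = [];

  if (!env["DAEMO_AGENT_API_KEY"]) {
    errors.push("DAEMO_AGENT_API_KEY is required");
  }

  // At least one provider key must be present; any one of the three satisfies this.
  const hasProvider = LLM_PROVIDER_KEYS.some((key) => Boolean(env[key]));
  if (!hasProvider) {
    errors.push(`One of ${LLM_PROVIDER_KEYS.join(", ")} is required`);
  }

  if (!env["SF_311_APP_TOKEN"]) {
    warnings.push("SF_311_APP_TOKEN is not set; Socrata may rate-limit requests");
  }

  return { errors, warnings };
}
```

In a Node process you would call `checkEnv(process.env)` before starting the server and exit early if `errors` is non-empty. Taking the environment as a parameter keeps the function easy to test.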
What You'll Learn
By exploring this template, you'll understand:
- ✅ How to structure a Daemo agent project
- ✅ How to write effective @DaemoFunction decorators
- ✅ How to craft system prompts that guide LLM behavior
- ✅ How multiple tools can work together
- ✅ How to handle real-world data with proper schemas
- ✅ The "probe then attack" pattern for querying large datasets