System Design

One phone call, four agents, two streams.

The whole stack was built in 24 hours. It connects a regular phone line to a Retell-powered conversational layer, hands transcript chunks to a small OpenAI Swarm of cooperating agents, persists banking actions against an in-memory ledger, and pins generated application PDFs to IPFS through Pinata. A second WebSocket pipes everything to a Next.js operator console for live observability.

01 · Topology

What hits what, in what order

A single inbound call fans out to two streams: the LLM bidirectional WebSocket back into Retell, and a separate observability WebSocket into the operator console.
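The two-stream fan-out can be sketched in a few lines; the queue stand-ins and event shapes here are illustrative, not from the repo:

```python
import asyncio

async def fan_out(source, llm_ws, console_ws):
    # Forward each inbound Retell event to both streams.
    async for event in source:
        await llm_ws.put(event)      # critical path: LLM responses
        await console_ws.put(event)  # best-effort: operator console

async def demo():
    async def fake_retell():
        for event in ({"transcript": "hi"}, {"transcript": "balance?"}):
            yield event

    llm_ws, console_ws = asyncio.Queue(), asyncio.Queue()
    await fan_out(fake_retell(), llm_ws, console_ws)
    return llm_ws.qsize(), console_ws.qsize()

print(asyncio.run(demo()))  # each stream sees every event
```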

02 · Telephony bridge

The Retell ↔ FastAPI bidirectional socket

Retell forwards real-time transcripts and waits for streamed LLM responses on a per-call WebSocket. The bridge starts a fresh LlmClient and immediately greets the caller by name.

server/main.py
```python
if request_json["interaction_type"] == "call_details":
    # Look up caller by phone number
    number = "+1-" + ...  # normalize from caller ID
    llm_client = LlmClient(db["users"][number]["name"])
    # Greet by name on connection
    first_event = llm_client.draft_begin_message()
    await websocket.send_json(first_event.__dict__)

# stream the LLM response back to Retell
async for event in llm_client.draft_response(request):
    await websocket.send_json(event.__dict__)
    if request.response_id < response_id:
        break  # new response needed, abandon this one

# Spawn one task per inbound chunk
async for data in websocket.iter_json():
    asyncio.create_task(handle_message(data))
```
Each WebSocket message is handled in its own task so a long LLM response never blocks the next user utterance.
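The abandonment rule can be demonstrated in isolation. In this toy (class and method names are hypothetical, not from `server/main.py`), a later `response_id` invalidates an in-flight stream mid-response:

```python
import asyncio

class Session:
    def __init__(self):
        self.latest_response_id = 0
        self.sent = []

    async def respond(self, response_id, tokens):
        # Stream tokens, bailing out if a newer response has started.
        for token in tokens:
            if response_id < self.latest_response_id:
                return  # superseded: abandon this response
            await asyncio.sleep(0)  # yield control, as a real token stream would
            self.sent.append((response_id, token))

async def demo():
    s = Session()
    stale = asyncio.create_task(s.respond(0, ["a", "b", "c", "d"]))
    await asyncio.sleep(0)       # let the stale response start streaming
    s.latest_response_id = 1     # caller barged in with a new utterance
    await stale                  # it notices the barge-in and stops early
    await s.respond(1, ["x"])    # the fresh response completes
    return s.sent

print(asyncio.run(demo()))
```

The stale stream never finishes its token list; the fresh one runs to completion.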
03 · Multi-agent routing

OpenAI Swarm with handoff functions

Triage owns intent classification and delegates by calling a transfer_to_X function. Each specialist has its own tool surface and an explicit way home.

server/agent_swarm.py
```python
# Triage agent gets all transfer fns
self.triage_agent = TriageAgent([
    self.transfer_to_accounts,
    self.transfer_to_payments,
    self.transfer_to_applications,
])

# Specialists know how to return
self.accounts_agent = AccountsAgent(
    transfer_to_payments=self.transfer_to_payments,
    handle_account_balance=handle_account_balance,
    retrieve_bank_statement=retrieve_bank_statement,
)

def transfer_to_accounts(self, ctx, user_message):
    self.current_agent = self.accounts_agent
    return self.accounts_agent

def transfer_back_to_triage(self, ctx, response):
    self.current_agent = self.triage_agent
    return self.triage_agent
```
Handoffs are first-class tools: they are functions the LLM literally calls. There is no hidden router.
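Stripped of the Swarm machinery, the control flow reduces to: run the current agent's tool, and if the tool returns an agent, make that agent current. A hypothetical sketch of that loop, not the library's actual internals:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    tools: dict[str, Callable] = field(default_factory=dict)

def run_turn(current: Agent, tool_name: str) -> Agent:
    # Invoke the tool; if it returns an Agent, that's a handoff.
    result = current.tools[tool_name]()
    return result if isinstance(result, Agent) else current

accounts = Agent("accounts")
triage = Agent("triage", tools={"transfer_to_accounts": lambda: accounts})

current = run_turn(triage, "transfer_to_accounts")
print(current.name)  # the handoff swapped the active agent
```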
server/agents/triage_agent.py
```python
triage_instructions = """
You are the Triage Agent ...
1. Account balances or statements → Accounts Agent
2. Transfers, scheduled payments, cancellations → Payments Agent
3. Loan or credit card applications → Applications Agent
4. Need more info? Ask one direct question.
5. Don't expose your internal decisions.
"""
```
Verbatim from server/agents/triage_agent.py and reused as the system prompt for Live AI mode.
04 · Action layer

Real bank operations against an in-memory ledger

transfer_funds is the canonical example: validate, check funds, mutate balances, record a payment, return a Result that the LLM can read back to the caller.

server/agents/payments_agent.py
```python
def transfer_funds(ctx, from_account, to_account, amount):
    # Defensive validation
    if not validate_account_id(from_account):
        return Result(value="Source account does not exist.", agent=None)
    if not validate_account_id(to_account):
        return Result(value="Destination account does not exist.", agent=None)
    if not validate_amount(amount):
        return Result(value="Amount must be a positive number.", agent=None)

    db = get_db()
    if db["accounts"][from_account]["balance"] < amount:
        return Result(value="Insufficient funds.", agent=None)

    # Real balance mutation
    db["accounts"][from_account]["balance"] -= amount
    db["accounts"][to_account]["balance"] += amount

    # Append to ledger
    new_payment_id = generate_payment_id()
    db["payments"][new_payment_id] = {
        "from_account": from_account,
        "to_account": to_account,
        "amount": amount,
        "date": datetime.now().strftime("%Y-%m-%d"),
        "status": "Completed",
    }
    set_db(db)

    # agent=None ⇒ control returns to triage
    return Result(
        value=f"Transferred {amount:.2f} from {from_account} to {to_account}. "
              f"Payment ID: {new_payment_id}.",
        agent=None,
    )
```
05 · Applications & decentralized storage

LaTeX → PDF → Pinata IPFS

When the caller applies for a loan or credit card, the Applications Agent renders a LaTeX template, compiles to PDF with pdflatex, and pins the file to IPFS through Pinata. The CID surfaces in the operator console.

server/agents/applications_agent.py
```python
tex_filepath = os.path.join(APPLICATIONS_DIR, tex_filename)
pdf_filepath = os.path.join(APPLICATIONS_DIR, pdf_filename)

with open(tex_filepath, "w") as f:
    f.write(latex_content)

# Run pdflatex in the apps dir
subprocess.run(
    ["pdflatex", "-interaction=nonstopmode", tex_filename],
    cwd=APPLICATIONS_DIR,
    check=True,
)

# Pin the resulting PDF to IPFS
upload_pdf_to_pinata(pdf_filepath, "LOAN")
```
server/pinata.py
```python
PINATA_URL = "https://api.pinata.cloud/pinning/pinFileToIPFS"

headers = {
    "pinata_api_key": PINATA_API_KEY,
    "pinata_secret_api_key": PINATA_SECRET_API_KEY,
}

with open(pdf_path, "rb") as f:
    files = {"file": (os.path.basename(pdf_path), f, "application/pdf")}
    metadata = {"name": f"{document_type}_{stem}", "keyvalues": {...}}
    data = {"pinataMetadata": json.dumps(metadata)}
    r = requests.post(PINATA_URL, files=files, data=data, headers=headers)

# Returns IpfsHash + URL
if r.status_code == 200:
    return r.json()  # {"IpfsHash": "...", ...}
```
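Once pinned, the PDF is addressable by its CID through any public IPFS gateway via the standard `/ipfs/<CID>` path, Pinata's public gateway included. A tiny helper (hypothetical, not in `server/pinata.py`):

```python
def gateway_url(ipfs_hash: str, gateway: str = "https://gateway.pinata.cloud") -> str:
    # Build a browser-openable URL for a pinned CID.
    return f"{gateway}/ipfs/{ipfs_hash}"

print(gateway_url("QmExampleCid"))  # → https://gateway.pinata.cloud/ipfs/QmExampleCid
```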
06 · Why we picked these

Tradeoffs we made under a 24-hour clock

Telephony
Retell AI instead of Twilio Voice + custom STT/TTS

Retell collapses STT, TTS, barge-in handling, and the LLM bidirectional WebSocket into one provider. We had hours, not days.

Multi-agent
OpenAI Swarm instead of LangGraph or hand-rolled router

Swarm's handoff-as-a-tool model maps 1:1 to a phone-call transfer. Simpler mental model, smaller diff against vanilla OpenAI tool calling.

Document storage
Pinata IPFS instead of S3 / Supabase Storage

The challenge sponsor was Pinata. Bonus: a content-addressed CID is naturally tamper-evident, which is a nice property for a regulated artifact.

Bank data
In-memory dict instead of Postgres / SQLite

Hackathon scope. The dict's shape ports cleanly to a real DB later — no ORM lock-in, no migrations to maintain over the weekend.

Console transport
Plain WebSocket instead of SSE / polling

One bidirectional socket per operator, identical contract to the LLM channel. Easier to extend with operator-initiated actions later.

Frontend
Next.js 15 + RSC instead of Vite SPA

Matches Vercel deployment defaults and lets the rebuilt project page ship serverless route handlers (Live AI) without a second backend.
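To make the "ports cleanly" claim about the in-memory ledger concrete: each top-level key behaves like a table and each inner key like a primary key. A sketch of the shape — field names beyond those shown in the snippets above, and all IDs, are assumptions:

```python
db = {
    "users": {
        "+1-555-0100": {"name": "Alice"},  # keyed by caller ID
    },
    "accounts": {
        "ACC-1": {"balance": 100.0},       # mutated by transfer_funds
        "ACC-2": {"balance": 50.0},
    },
    "payments": {
        "PAY-1": {
            "from_account": "ACC-1",
            "to_account": "ACC-2",
            "amount": 25.0,
            "date": "2025-01-01",
            "status": "Completed",
        },
    },
}

# Each top-level key becomes a table; each inner key becomes a primary key.
assert db["users"]["+1-555-0100"]["name"] == "Alice"
```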

Want the real telephony loop?

The full Python backend is in server/. Set the env vars below and run uvicorn main:app — the operator console here will pick the WebSocket back up with a one-line change in src/lib/store.ts.

RETELL_API_KEY
OPENAI_API_KEY
PINATA_API_KEY