System Design

One phone call, four agents, two streams.

The whole stack was built in 24 hours. It connects a regular phone line to a Retell-powered conversational layer, hands transcript chunks to a small OpenAI Swarm of cooperating agents, persists banking actions against an in-memory ledger, and pins generated application PDFs to IPFS through Pinata. A second WebSocket pipes everything to a Next.js operator console for live observability.

01 · Topology

What hits what, in what order

A single inbound call fans out to two streams: the LLM bidirectional WebSocket back into Retell, and a separate observability WebSocket into the operator console.
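The two-stream fan-out can be sketched in a few lines; the queue stand-ins and event shapes here are illustrative, not from the repo:

```python
import asyncio

async def fan_out(source, llm_ws, console_ws):
    # Forward each inbound Retell event to both streams.
    async for event in source:
        await llm_ws.put(event)      # critical path: LLM responses
        await console_ws.put(event)  # best-effort: operator console

async def demo():
    async def fake_retell():
        for event in ({"transcript": "hi"}, {"transcript": "balance?"}):
            yield event

    llm_ws, console_ws = asyncio.Queue(), asyncio.Queue()
    await fan_out(fake_retell(), llm_ws, console_ws)
    return llm_ws.qsize(), console_ws.qsize()

print(asyncio.run(demo()))  # each stream sees every event
```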

02 · Telephony bridge

The Retell ↔ FastAPI bidirectional socket

Retell forwards real-time transcripts and waits for streamed LLM responses on a per-call WebSocket. The bridge starts a fresh LlmClient and immediately greets the caller by name.

server/main.py
```python
if request_json["interaction_type"] == "call_details":
    # Look up caller by phone number
    number = "+1-" + ...  # normalize from caller ID
    llm_client = LlmClient(db["users"][number]["name"])
    # Greet by name on connection
    first_event = llm_client.draft_begin_message()
    await websocket.send_json(first_event.__dict__)

# stream the LLM response back to Retell
async for event in llm_client.draft_response(request):
    await websocket.send_json(event.__dict__)
    if request.response_id < response_id:
        break  # new response needed, abandon this one

# Spawn one task per inbound chunk
async for data in websocket.iter_json():
    asyncio.create_task(handle_message(data))
```
Each WebSocket message is handled in its own task so a long LLM response never blocks the next user utterance.
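The abandonment rule can be demonstrated in isolation. In this toy (class and method names are hypothetical, not from `server/main.py`), a later `response_id` invalidates an in-flight stream mid-response:

```python
import asyncio

class Session:
    def __init__(self):
        self.latest_response_id = 0
        self.sent = []

    async def respond(self, response_id, tokens):
        # Stream tokens, bailing out if a newer response has started.
        for token in tokens:
            if response_id < self.latest_response_id:
                return  # superseded: abandon this response
            await asyncio.sleep(0)  # yield control, as a real token stream would
            self.sent.append((response_id, token))

async def demo():
    s = Session()
    stale = asyncio.create_task(s.respond(0, ["a", "b", "c", "d"]))
    await asyncio.sleep(0)       # let the stale response start streaming
    s.latest_response_id = 1     # caller barged in with a new utterance
    await stale                  # it notices the barge-in and stops early
    await s.respond(1, ["x"])    # the fresh response completes
    return s.sent

print(asyncio.run(demo()))
```

The stale stream never finishes its token list; the fresh one runs to completion.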
03 · Multi-agent routing

OpenAI Swarm with handoff functions

Triage owns intent classification and delegates by calling a transfer_to_X function. Each specialist has its own tool surface and an explicit way home.

server/agent_swarm.py
```python
# Triage agent gets all transfer fns
self.triage_agent = TriageAgent([
    self.transfer_to_accounts,
    self.transfer_to_payments,
    self.transfer_to_applications,
])

# Specialists know how to return
self.accounts_agent = AccountsAgent(
    transfer_to_payments=self.transfer_to_payments,
    handle_account_balance=handle_account_balance,
    retrieve_bank_statement=retrieve_bank_statement,
)

def transfer_to_accounts(self, ctx, user_message):
    self.current_agent = self.accounts_agent
    return self.accounts_agent

def transfer_back_to_triage(self, ctx, response):
    self.current_agent = self.triage_agent
    return self.triage_agent
```
Handoffs are first-class tools: they are functions the LLM literally calls. There is no hidden router.
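Stripped of the Swarm machinery, the control flow reduces to: run the current agent's tool, and if the tool returns an agent, make that agent current. A hypothetical sketch of that loop, not the library's actual internals:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    tools: dict[str, Callable] = field(default_factory=dict)

def run_turn(current: Agent, tool_name: str) -> Agent:
    # Invoke the tool; if it returns an Agent, that's a handoff.
    result = current.tools[tool_name]()
    return result if isinstance(result, Agent) else current

accounts = Agent("accounts")
triage = Agent("triage", tools={"transfer_to_accounts": lambda: accounts})

current = run_turn(triage, "transfer_to_accounts")
print(current.name)  # the handoff swapped the active agent
```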
server/agents/triage_agent.py
```python
triage_instructions = """
You are the Triage Agent ...
1. Account balances or statements → Accounts Agent
2. Transfers, scheduled payments, cancellations → Payments Agent
3. Loan or credit card applications → Applications Agent
4. Need more info? Ask one direct question.
5. Don't expose your internal decisions.
"""
```
Verbatim from server/agents/triage_agent.py and reused as the system prompt for Live AI mode.
04 · Action layer

Real bank operations against an in-memory ledger

transfer_funds is the canonical example: validate, check funds, mutate balances, record a payment, return a Result that the LLM can read back to the caller.

server/agents/payments_agent.py
```python
def transfer_funds(ctx, from_account, to_account, amount):
    # Defensive validation
    if not validate_account_id(from_account):
        return Result(value="Source account does not exist.", agent=None)
    if not validate_account_id(to_account):
        return Result(value="Destination account does not exist.", agent=None)
    if not validate_amount(amount):
        return Result(value="Amount must be a positive number.", agent=None)

    db = get_db()
    if db["accounts"][from_account]["balance"] < amount:
        return Result(value="Insufficient funds.", agent=None)

    # Real balance mutation
    db["accounts"][from_account]["balance"] -= amount
    db["accounts"][to_account]["balance"] += amount

    # Append to ledger
    new_payment_id = generate_payment_id()
    db["payments"][new_payment_id] = {
        "from_account": from_account,
        "to_account": to_account,
        "amount": amount,
        "date": datetime.now().strftime("%Y-%m-%d"),
        "status": "Completed",
    }
    set_db(db)

    # agent=None ⇒ control returns to triage
    return Result(
        value=f"Transferred {amount:.2f} from {from_account} to {to_account}. "
              f"Payment ID: {new_payment_id}.",
        agent=None,
    )
```
05 · Applications & decentralized storage

LaTeX → PDF → Pinata IPFS

When the caller applies for a loan or credit card, the Applications Agent renders a LaTeX template, compiles to PDF with pdflatex, and pins the file to IPFS through Pinata. The CID surfaces in the operator console.

server/agents/applications_agent.py
```python
tex_filepath = os.path.join(APPLICATIONS_DIR, tex_filename)
pdf_filepath = os.path.join(APPLICATIONS_DIR, pdf_filename)

with open(tex_filepath, "w") as f:
    f.write(latex_content)

# Run pdflatex in the apps dir
subprocess.run(
    ["pdflatex", "-interaction=nonstopmode", tex_filename],
    cwd=APPLICATIONS_DIR,
    check=True,
)

# Pin the resulting PDF to IPFS
upload_pdf_to_pinata(pdf_filepath, "LOAN")
```
server/pinata.py
```python
PINATA_URL = "https://api.pinata.cloud/pinning/pinFileToIPFS"

headers = {
    "pinata_api_key": PINATA_API_KEY,
    "pinata_secret_api_key": PINATA_SECRET_API_KEY,
}

with open(pdf_path, "rb") as f:
    files = {"file": (os.path.basename(pdf_path), f, "application/pdf")}
    metadata = {"name": f"{document_type}_{stem}", "keyvalues": {...}}
    data = {"pinataMetadata": json.dumps(metadata)}
    r = requests.post(PINATA_URL, files=files, data=data, headers=headers)

# Returns IpfsHash + URL
if r.status_code == 200:
    return r.json()  # {"IpfsHash": "...", ...}
```
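Once pinned, the PDF is addressable by its CID through any public IPFS gateway via the standard `/ipfs/<CID>` path, Pinata's public gateway included. A tiny helper (hypothetical, not in `server/pinata.py`):

```python
def gateway_url(ipfs_hash: str, gateway: str = "https://gateway.pinata.cloud") -> str:
    # Build a browser-openable URL for a pinned CID.
    return f"{gateway}/ipfs/{ipfs_hash}"

print(gateway_url("QmExampleCid"))  # → https://gateway.pinata.cloud/ipfs/QmExampleCid
```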
06 · Why we picked these

Tradeoffs we made under a 24-hour clock

Telephony
Retell AI instead of Twilio Voice + custom STT/TTS

Retell collapses STT, TTS, barge-in handling, and the LLM bidirectional WebSocket into one provider. We had hours, not days.

Multi-agent
OpenAI Swarm instead of LangGraph or hand-rolled router

Swarm's handoff-as-a-tool model maps 1:1 to a phone-call transfer. Simpler mental model, smaller diff against vanilla OpenAI tool calling.

Document storage
Pinata IPFS instead of S3 / Supabase Storage

The challenge sponsor was Pinata. Bonus: a content-addressed CID is naturally tamper-evident, which is a nice property for a regulated artifact.

Bank data
In-memory dict instead of Postgres / SQLite

Hackathon scope. The dict's shape ports cleanly to a real DB later — no ORM lock-in, no migrations to maintain over the weekend.

Console transport
Plain WebSocket instead of SSE / polling

One bidirectional socket per operator, identical contract to the LLM channel. Easier to extend with operator-initiated actions later.

Frontend
Next.js 15 + RSC instead of Vite SPA

Matches Vercel deployment defaults and lets the rebuilt project page ship serverless route handlers (Live AI) without a second backend.
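To make the "ports cleanly" claim about the in-memory ledger concrete: each top-level key behaves like a table and each inner key like a primary key. A sketch of the shape — field names beyond those shown in the snippets above, and all IDs, are assumptions:

```python
db = {
    "users": {
        "+1-555-0100": {"name": "Alice"},  # keyed by caller ID
    },
    "accounts": {
        "ACC-1": {"balance": 100.0},       # mutated by transfer_funds
        "ACC-2": {"balance": 50.0},
    },
    "payments": {
        "PAY-1": {
            "from_account": "ACC-1",
            "to_account": "ACC-2",
            "amount": 25.0,
            "date": "2025-01-01",
            "status": "Completed",
        },
    },
}

# Each top-level key becomes a table; each inner key becomes a primary key.
assert db["users"]["+1-555-0100"]["name"] == "Alice"
```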

Want the real telephony loop?

The full Python backend is in server/. Set the env vars below and run uvicorn main:app — the operator console here will pick the WebSocket back up with a one-line change in src/lib/store.ts.

RETELL_API_KEY
OPENAI_API_KEY
PINATA_API_KEY