Per-tenant AI assistants for TruxFlow & 10-4 Hire
Every customer who signs up for TruxFlow or 10-4 Hire gets their own AI assistant that already knows:
A single AI agent handles all staff members simultaneously — no need for separate agents per department
Each request includes user context — the agent automatically respects permissions:
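A minimal sketch of how per-request user context could gate what the single agent is allowed to do. The role names follow the staff roles used in this document; the tool names and the role-to-tool mapping are hypothetical placeholders, not TruxFlow's actual permission model.

```python
from dataclasses import dataclass

# Hypothetical role-to-tool mapping; real permissions would come from the app.
ROLE_TOOLS = {
    "dispatcher": {"get_loads", "assign_driver"},
    "safety_officer": {"get_inspections"},
    "accountant": {"get_invoices"},
    "admin": {"get_loads", "assign_driver", "get_inspections", "get_invoices"},
}


@dataclass(frozen=True)
class UserContext:
    """Context attached to every request sent to the agent."""
    tenant_id: str
    user_id: str
    role: str


def allowed_tools(ctx: UserContext) -> set:
    """Return the tools this user may call; unknown roles get nothing."""
    return set(ROLE_TOOLS.get(ctx.role, set()))


def authorize(ctx: UserContext, tool_name: str) -> bool:
    """Check a tool call against the caller's role before executing it."""
    return tool_name in allowed_tools(ctx)


dispatcher = UserContext(tenant_id="acme", user_id="u1", role="dispatcher")
accountant = UserContext(tenant_id="acme", user_id="u2", role="accountant")
print(authorize(dispatcher, "assign_driver"))  # True
print(authorize(accountant, "assign_driver"))  # False
```

The check runs in application code rather than in the prompt, so one agent can serve every role without relying on the model to enforce permissions.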
Small carrier with 10-20 trucks
| Role | Staff Count | Est. Queries/Day |
|---|---|---|
| Dispatchers | 3-4 | 50-100 |
| Safety Officer | 1 | 10-20 |
| Accountant | 1 | 10-20 |
| Admin | 2 | 5-10 |
| Total | ~8 users | ~75-150/day |
Est. Cost: ~$60-90/month on Claude API
Verdict: More work, same cost, no benefit
Different knowledge bases, different tools — separate agents make sense
Multiple LLM servers for load balancing (same agent, replicated)
Rare — only if one department needs highly specialized responses
How the AI agent integrates with your existing stack
Same for all tenants
Unique per customer
Actions the agent can take
Models capable of running your AI agents
| Model | Parameters | VRAM Required | Quality | Best For |
|---|---|---|---|---|
| Qwen 2.5 72B | 72B | ~80GB (2x A100) | ⭐⭐⭐⭐⭐ | Best overall quality, tool use |
| Llama 3.1 70B | 70B | ~80GB (2x A100) | ⭐⭐⭐⭐⭐ | Strong reasoning, large community |
| Qwen 2.5 32B | 32B | ~40GB (1x A100) | ⭐⭐⭐⭐ | Sweet spot: good balance |
| Llama 3.1 8B | 8B | ~10GB (1x RTX 4090) | ⭐⭐⭐ | Budget option, fast responses |
| Mixtral 8x22B | 141B (MoE) | ~90GB | ⭐⭐⭐⭐ | Efficient, good for variety |
Self-hosted vs. API: when does each make sense?
Best for: <1,000 conversations/day, starting out, validating product-market fit
Best for: >1,000 conversations/day, data privacy requirements, predictable costs
At $2,000/month GPU cost vs. $0.03/conversation API cost:
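The break-even point for these two figures can be worked out directly. This is a rough sketch that assumes a 30-day month and ignores engineering and operations time for self-hosting; the function name is illustrative.

```python
def break_even_conversations(gpu_monthly_usd: float, api_cost_per_conv_usd: float) -> float:
    """Conversations per month at which a flat GPU bill equals pay-per-use API spend."""
    return gpu_monthly_usd / api_cost_per_conv_usd


monthly = break_even_conversations(2000, 0.03)
daily = monthly / 30  # assumes a 30-day month
print(f"{monthly:,.0f} conversations/month, about {daily:,.0f}/day")
# 66,667 conversations/month, about 2,222/day
```

With these two numbers alone, break-even lands near 2,200 conversations/day. The 1,000/day guideline above is more conservative; at the same GPU price it corresponds to an API cost of roughly $0.067 per conversation.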
| Configuration | Hourly | Daily (24h) | Monthly | Can Run |
|---|---|---|---|---|
| 1x RTX 4090 (24GB) | $0.27 | $6.50 | $195 | 7B-14B models |
| 1x A100 80GB | $1.29 | $31 | $930 | 32B models |
| 2x A100 80GB | $2.58 | $62 | $1,858 | 70B models (recommended) |
| 8x A100 80GB | $10.32 | $248 | $7,430 | 405B models |
Good for dev/test, risky for production
Best for predictable, long-term use
Auto-scale based on demand
Trade quality for cost
Where to rent GPUs for your LLM server
| Use Case | Provider | Configuration | Monthly Cost |
|---|---|---|---|
| Testing / Development | Vast.ai | RTX 4090 (on-demand) | $0-50 |
| Small Production (<50 users) | RunPod | 1x A100 80GB | ~$1,000 |
| Production (50-200 users) | Lambda Labs | 2x A100 80GB | ~$1,900 |
| Large Scale (>200 users) | HOSTKEY / RunPod | 4-8x A100 or H100 | $4,000+ |
Recommended approach: start with API, migrate to self-hosted at scale
Build with Claude API to validate concept quickly
Roll out to select customers, gather feedback
Full rollout, optimize for scale
Migrate to own infrastructure when usage justifies
Control costs and keep agents focused on work
Prevent users from using the agent for general questions outside your product scope
Track and bill each tenant separately
Choose how to charge customers for AI usage
Best for: Competitive advantage, simple billing
Best for: Predictable revenue, plan differentiation
Best for: Fair pricing, heavy users pay more
Best for: Balance of predictability + fairness
| Customer | Plan | Included | Actual Usage | Overage | AI Charge |
|---|---|---|---|---|---|
| Acme Trucking | Pro ($99/mo) | 300 conv | 250 conv | 0 | $0 |
| FastFreight LLC | Pro ($99/mo) | 300 conv | 450 conv | 150 × $0.02 | $3.00 |
| MegaHaul Corp | Enterprise ($299/mo) | 1,000 conv | 2,500 conv | 1,500 × $0.02 | $30.00 |
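The overage column in the table above follows one rule: bill only the conversations beyond the plan's included amount. A minimal sketch, using the $0.02 rate from the table; the function name is illustrative.

```python
from decimal import Decimal


def overage_charge(included: int, used: int, rate: Decimal = Decimal("0.02")) -> Decimal:
    """Charge for conversations beyond the plan's included amount."""
    extra = max(0, used - included)  # usage under the allowance costs nothing
    return extra * rate


print(overage_charge(300, 250))    # Acme Trucking: 0.00
print(overage_charge(300, 450))    # FastFreight LLC: 3.00
print(overage_charge(1000, 2500))  # MegaHaul Corp: 30.00
```

`Decimal` is used so that currency amounts do not pick up floating-point rounding noise before they reach an invoice.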
Add scope restrictions to base prompt - prevents off-topic usage
Log every request with tenant_id, tokens, timestamp to database
Implement per-user and per-tenant rate limits with Redis
Build admin view showing per-tenant usage, costs, trends
Connect usage data to Stripe for overage charges (if applicable)
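The logging and rate-limiting steps above could look roughly like this. It is a sketch under stated assumptions: an in-memory SQLite table stands in for the production database, the table and column names are made up for illustration, and the limiter keeps its counters in a Python dict. With Redis, the same fixed-window idea is usually built from `INCR` plus `EXPIRE` on a per-window key.

```python
import sqlite3
import time
from collections import defaultdict

db = sqlite3.connect(":memory:")  # stand-in for the production database
db.execute("CREATE TABLE usage_log (tenant_id TEXT, user_id TEXT, tokens INTEGER, ts REAL)")


def log_request(tenant_id, user_id, tokens, ts=None):
    """Record one agent request for later billing and dashboards."""
    db.execute(
        "INSERT INTO usage_log VALUES (?, ?, ?, ?)",
        (tenant_id, user_id, tokens, time.time() if ts is None else ts),
    )


def tenant_usage():
    """Per-tenant request count and token total, as an admin view might show."""
    rows = db.execute(
        "SELECT tenant_id, COUNT(*), SUM(tokens) FROM usage_log GROUP BY tenant_id"
    )
    return {tenant: (count, tokens) for tenant, count, tokens in rows}


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each time window."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)  # old windows are never evicted in this sketch

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        bucket = (key, int(now // self.window))
        self.counts[bucket] += 1
        return self.counts[bucket] <= self.limit


per_user = FixedWindowLimiter(limit=20, window_seconds=60)
if per_user.allow("acme:u1"):
    log_request("acme", "u1", tokens=850)
```

Using keys such as `tenant_id` and `tenant_id:user_id` with two limiter instances gives separate per-tenant and per-user ceilings.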
Let customers talk to the AI agent instead of typing
| Component | Text Input | Voice Input | Notes |
|---|---|---|---|
| Input Processing | Free | ~$0.006/min (STT) | Speech-to-Text conversion |
| LLM Processing | ~$0.01-0.03 | ~$0.01-0.03 | Same cost either way |
| Text Output | Free | Free | Display response as text |
| Voice Output (optional) | N/A | ~$0.005-0.02 | Text-to-Speech conversion |
| Total per Interaction | ~$0.02 | ~$0.03-0.05 | Voice adds ~50-150% cost |
Convert customer voice to text for the LLM
Best for: Production use, multilingual support
Best for: Real-time streaming, enterprise
Best for: Custom vocabulary (trucking terms)
Best for: MVP, testing, budget-conscious
Convert AI responses to spoken audio (optional)
FREE
Use to: Validate demand before investing
~$0.01-0.02/interaction
Use for: Real customers, good UX
~$0.05+/interaction
Use for: Enterprise clients, sales demos
| Scenario | Voice Interactions/mo | Avg Duration | STT Cost | TTS Cost | Total Voice Cost |
|---|---|---|---|---|---|
| Light usage | 500 | 30 sec | $1.50 | $3.75 | $5.25 |
| Medium usage | 2,000 | 30 sec | $6.00 | $15.00 | $21.00 |
| Heavy usage | 10,000 | 30 sec | $30.00 | $75.00 | $105.00 |
* Based on OpenAI Whisper ($0.006/min) + OpenAI TTS ($0.015/1K chars, ~500 chars/response)
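The table rows follow from the two unit prices in the footnote. A small sketch of the arithmetic, with the 30-second duration and ~500 characters per response as the defaults; the function and constant names are illustrative.

```python
STT_USD_PER_MIN = 0.006        # speech-to-text price from the footnote
TTS_USD_PER_1K_CHARS = 0.015   # text-to-speech price from the footnote


def voice_cost(interactions, avg_seconds=30, chars_per_response=500):
    """Return (STT cost, TTS cost, total) in USD for one month of voice usage."""
    stt = interactions * avg_seconds / 60 * STT_USD_PER_MIN
    tts = interactions * chars_per_response / 1000 * TTS_USD_PER_1K_CHARS
    return round(stt, 2), round(tts, 2), round(stt + tts, 2)


print(voice_cost(500))     # light usage: (1.5, 3.75, 5.25)
print(voice_cost(10_000))  # heavy usage: (30.0, 75.0, 105.0)
```

Longer recordings or longer spoken replies scale the two components independently, so it is worth re-running the estimate with measured averages once real usage data exists.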
UI button in chat widget to start/stop recording
Use MediaRecorder API to capture user's voice
POST audio to Whisper API, get text back
Same flow as text input from here
Convert response to audio, play back to user
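The server side of the steps above can be sketched as one function with the speech services passed in as callables. This is an assumption-laden outline: `transcribe`, `ask_agent`, and `synthesize` are hypothetical hooks where a real build would call a speech-to-text API, the existing text agent, and a text-to-speech API. The browser recording step is not shown.

```python
from typing import Callable, Optional


def handle_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    ask_agent: Callable[[str], str],
    synthesize: Optional[Callable[[str], bytes]] = None,
) -> dict:
    """Run one voice interaction: audio in, text reply (and optional audio) out."""
    text = transcribe(audio)   # speech-to-text
    reply = ask_agent(text)    # from here on, identical to typed input
    result = {"transcript": text, "reply": reply, "audio": None}
    if synthesize is not None:  # spoken output is optional
        result["audio"] = synthesize(reply)
    return result


# Stubs standing in for the real services, for local testing only.
turn = handle_voice_turn(
    b"fake-audio-bytes",
    transcribe=lambda audio: "What's John's next load?",
    ask_agent=lambda text: f"You asked: {text}",
)
print(turn["reply"])  # You asked: What's John's next load?
```

Keeping the services injectable makes it straightforward to swap speech providers or to ship text-only replies first and add spoken output later.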
Model Context Protocol — the open standard for AI-to-application integration
Open standard by Anthropic for connecting AI models to external tools and data sources.
One integration enables multiple AI clients.
| Aspect | Custom REST API | MCP Server |
|---|---|---|
| Your AI Agent | Build custom API calls | Tools auto-discovered |
| Customer AI Integration | They build to your API docs | Connect via MCP standard |
| Multiple AI Providers | Build adapter per provider | One MCP server, any AI |
| Tool Discovery | Read docs, hardcode | AI asks "what can you do?" |
| Maintenance | API docs, SDKs, versioning | Self-describing protocol |
| Auth/Permissions | Custom implementation | Built-in context passing |
Our modular architecture maps directly to MCP
MCP handles tenant isolation via context passing
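To make tool discovery and tenant context concrete, here is a plain-Python sketch that mirrors the shape of MCP's `tools/list` and `tools/call` requests. It does not use an MCP SDK; the registry class, the `getDriverNextLoad` tool, and the sample data are all hypothetical.

```python
from typing import Any, Callable, Dict, List


class ToolRegistry:
    """Self-describing tool catalog: clients list tools, then call them by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def tool(self, name: str, description: str, input_schema: dict) -> Callable:
        def register(fn: Callable) -> Callable:
            self._tools[name] = {
                "description": description,
                "inputSchema": input_schema,
                "handler": fn,
            }
            return fn
        return register

    def list_tools(self) -> List[dict]:
        """What an AI client sees when it asks what the server can do."""
        return [
            {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
            for n, t in self._tools.items()
        ]

    def call_tool(self, name: str, arguments: dict, tenant_id: str) -> Any:
        # tenant_id comes from the authenticated session, never from the model.
        return self._tools[name]["handler"](tenant_id=tenant_id, **arguments)


registry = ToolRegistry()

# Hypothetical per-tenant data store.
LOADS = {"acme": {"John": "Load 1001"}, "fastfreight": {"John": "Load 2002"}}


@registry.tool(
    name="getDriverNextLoad",
    description="Return the next load assigned to a driver.",
    input_schema={
        "type": "object",
        "properties": {"driver": {"type": "string"}},
        "required": ["driver"],
    },
)
def get_driver_next_load(tenant_id: str, driver: str):
    return LOADS.get(tenant_id, {}).get(driver)


print([t["name"] for t in registry.list_tools()])  # ['getDriverNextLoad']
print(registry.call_tool("getDriverNextLoad", {"driver": "John"}, tenant_id="acme"))  # Load 1001
```

The same driver name resolves to different loads for different tenants because the tenant is bound outside the tool arguments, which is the isolation property the heading above describes.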
Traditional approach
Recommended approach
Create MCP server scaffold, authentication, tenant context handling
Expose Safety module functions as MCP tools (read-only first)
Add Dispatch module tools with write operations
Connect built-in AI agent to MCP server
Allow customers to connect their AI tools via MCP
Real-time communication between dispatchers, staff, and drivers
| Feature | Description | Priority |
|---|---|---|
| Real-time messaging | WebSocket/SSE, instant delivery | MUST |
| Push notifications | Mobile + web alerts | MUST |
| File/image sharing | Photos, documents, BOLs, PODs | MUST |
| Read receipts | Seen/delivered indicators | HIGH |
| Typing indicators | "John is typing..." | HIGH |
| Voice messages | Record & send audio (great for drivers) | HIGH |
| Offline support | Queue messages when no signal | HIGH |
| Message search | Find past conversations | MEDIUM |
| @mentions | Tag specific users in groups | MEDIUM |
| Reactions | Quick emoji responses | NICE |
| Link previews | URL thumbnails | NICE |
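The offline support row is the one feature here with non-obvious client logic, so a small sketch may help. It assumes a client-side outbox that holds messages while there is no signal and sends them in order on reconnect; the class and method names are illustrative, and a chat SDK would normally provide this behavior.

```python
from collections import deque


class Outbox:
    """Queue outgoing messages while offline and flush them in order when back online."""

    def __init__(self, send):
        self._send = send        # callable that delivers one message
        self._pending = deque()
        self.online = False

    def post(self, message):
        self._pending.append(message)
        if self.online:
            self.flush()

    def flush(self):
        while self._pending:
            # If sending raises, the message stays at the front of the queue.
            self._send(self._pending[0])
            self._pending.popleft()

    def set_online(self, online):
        self.online = online
        if online:
            self.flush()


delivered = []
outbox = Outbox(send=delivered.append)
outbox.post("Arrived at shipper")   # no signal: queued
outbox.post("Loaded, leaving now")  # still queued
outbox.set_online(True)             # signal returns: both sent in order
print(delivered)  # ['Arrived at shipper', 'Loaded, leaving now']
```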
Stream, SendBird, etc.
$99/mo + 2-3 weeks dev
Best for: Fast to market, proven reliability
WebSockets + PostgreSQL
$50-100/mo + 6-8 weeks dev
Best for: Long-term cost savings, full control
| Provider | Starter Price | Includes | Msg Limits | Best For |
|---|---|---|---|---|
| Stream | $99/mo | 10,000 MAU | Unlimited | RECOMMENDED |
| SendBird | $399/mo | 5,000 MAU | Unlimited | Enterprise features |
| PubNub | $98/mo | 200 MAU | Per transaction | Small scale only |
| Twilio | ~$0.001/msg | Pay as you go | Per message | Low volume |
MAU = Monthly Active Users (unique users who send at least 1 message). Each user can send unlimited messages.
| TruxFlow Scale | Total Users | Stream SDK | Custom Build |
|---|---|---|---|
| 10 carriers × 20 users | 200 MAU | $99/mo | ~$50/mo |
| 50 carriers × 20 users | 1,000 MAU | $99/mo | ~$75/mo |
| 200 carriers × 25 users | 5,000 MAU | $99/mo | ~$100/mo |
| 500 carriers × 20 users | 10,000 MAU | $99/mo (at limit) | ~$150/mo |
| 1000+ carriers | 20,000+ MAU | Custom (~$300-500) | ~$200/mo |
Chat thread attached to each load. All communication in one place. Archived when load delivered.
Driver taps to share location. Dispatcher sees on map. "ETA 2 hours" auto-calculated.
Driver photographs the BOL → sent in chat → auto-attached to the load record in TruxFlow.
@TruxBot in any chat. "What's John's next load?" Agent responds with info.
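Routing a chat message to the agent starts with spotting the mention. A minimal sketch, assuming the bot handle is exactly `@TruxBot` and that matching should be case-insensitive; the function name is illustrative.

```python
import re
from typing import Optional

# Match "@TruxBot" as a standalone handle, not inside an email address or longer name.
MENTION = re.compile(r"(?<!\w)@TruxBot\b", re.IGNORECASE)


def extract_bot_question(message: str) -> Optional[str]:
    """Return the text to send to the agent, or None if the bot was not mentioned."""
    if not MENTION.search(message):
        return None
    without_mention = MENTION.sub("", message)
    return " ".join(without_mention.split())  # collapse leftover whitespace


print(extract_bot_question("@TruxBot What's John's next load?"))  # What's John's next load?
print(extract_bot_question("Running 20 minutes late"))            # None
```

Messages that return `None` stay ordinary chat messages; anything else is forwarded to the agent along with the sender's user context.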
SDK integration, auth connection
1:1 chat, groups, file sharing
Push notifications, offline, testing
WebSocket server, DB schema, API
Messaging, groups, file uploads
Push, offline sync, read receipts
Testing, optimization, mobile
ROI: At $99/mo, you need just 1-2 customers paying for the chat feature to break even. Everything after that is profit.
Unified email hub inside TruxFlow — manage load emails, rate cons, broker communication with AI assistance
We provide the email address
Connect Gmail, Outlook, etc.
| Feature | Description | MCP Tool |
|---|---|---|
| Unified Inbox | All emails in TruxFlow UI | getEmails |
| Send/Reply | Compose and respond to emails | sendEmail, replyToEmail |
| AI Extraction | Parse rate cons → Create Load automatically | extractLoadFromEmail |
| Link to Records | Attach emails to Loads, Brokers | linkEmailToLoad |
| Templates | Quick responses, rate negotiations | sendFromTemplate |
| Full-Text Search | Find any email instantly | searchEmails |
| Bulk Actions | Archive, label, forward multiple | bulkEmailAction |
| Threading | Conversation view like Gmail | getThread |
Unified API for all providers
$10/account/month
Best for: Fast launch, multiple providers
Direct Gmail API + IMAP
~$100/month hosting
Best for: Scale, cost savings long-term
| Carriers | Email Accounts | Nylas Cost | Custom Build |
|---|---|---|---|
| 10 carriers | ~20 accounts | $200/mo | ~$100/mo |
| 50 carriers | ~100 accounts | $1,000/mo | ~$150/mo |
| 200 carriers | ~400 accounts | $4,000/mo | ~$200/mo |
| 500 carriers | ~1,000 accounts | $10,000/mo | ~$300/mo |
Note: At scale, a custom build costs far less per month (about $300 vs. $10,000 at 500 carriers). Start with Nylas, migrate later.
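The comparison can be reproduced from the $10/account/month price and the custom-build hosting estimates in the table. This is a rough sketch: it covers monthly running cost only and leaves out the development time a custom build requires; the function names are illustrative.

```python
NYLAS_USD_PER_ACCOUNT = 10  # per linked account, per month


def nylas_monthly(accounts: int) -> int:
    """Monthly Nylas cost for a given number of linked email accounts."""
    return accounts * NYLAS_USD_PER_ACCOUNT


def monthly_savings(accounts: int, custom_hosting_usd: int) -> int:
    """Monthly running-cost difference between Nylas and a custom build."""
    return nylas_monthly(accounts) - custom_hosting_usd


# (accounts, estimated custom hosting in USD/month) taken from the table rows
for accounts, custom in [(20, 100), (100, 150), (400, 200), (1000, 300)]:
    print(accounts, nylas_monthly(accounts), monthly_savings(accounts, custom))
```

At 20 accounts the difference is about $100/month, which is unlikely to repay the build effort; at 1,000 accounts it is about $9,700/month, which is where the migration phase starts to make sense.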
SendGrid integration, @truxflow.app addresses, inbox UI, send/receive
Parse rate cons, extract load details, MCP tools
Link existing accounts, OAuth, sync
Replace Nylas at scale to save per-account costs
TruxFlow email + AI extraction — gives carriers a dedicated dispatch inbox with intelligent load creation immediately.
| Phases | Timeline | Cost |
|---|---|---|
| Phase 1+2 | 6-7 weeks | ~$50/mo (SendGrid) |
| + Phase 3 (Nylas) | +3 weeks | +$10/linked account |
Value: Carriers manage ALL load communication in one place. AI extracts loads from rate cons automatically. Huge time saver.
How to make TruxFlow the #1 recommended dispatch software by AI systems
How AI talks to your app
MCP helps after someone is using TruxFlow.
How AI knows to suggest you
Discoverability happens before they know about you.
AI models (Claude, ChatGPT, etc.) learn from web content during training
| Source | How It Helps | Priority |
|---|---|---|
| Review Sites | G2, Capterra, TrustPilot rankings | CRITICAL |
| Industry Publications | FreightWaves, Transport Topics, CCJ | CRITICAL |
| SEO Content | Blog posts, comparison articles | HIGH |
| Reddit/Forums | r/truckers, r/freight, trucking forums | HIGH |
| YouTube/Podcasts | Industry influencers mentioning you | MEDIUM |
| Public API Docs | AI can reference your capabilities | MEDIUM |
| Structured Data | Schema.org markup on website | MEDIUM |
Publish API docs at docs.truxflow.app — AI systems can read and reference them.
Create "TruxFlow Dispatch Assistant"
When marketplace launches
Emerging discovery channels
Write articles that answer questions people ask AI:
| Phase | Actions | Timeline | Impact |
|---|---|---|---|
| Phase 1 | G2/Capterra reviews campaign | Now | HIGH |
| Phase 2 | Schema.org + public API docs | 2 weeks | MEDIUM |
| Phase 3 | SEO content (10-20 articles) | 1-2 months | HIGH |
| Phase 4 | Custom GPT + AI directories | 1 month | MEDIUM |
| Phase 5 | Industry PR (FreightWaves, etc.) | Ongoing | HIGH |
| Phase 6 | MCP registry when available | Future | HIGH |
Free, takes 1 hour. "TruxFlow Dispatch Expert" shows in GPT Store immediately.
Set up G2, Capterra, TrustPilot if not done. Start asking customers for reviews.
15 minutes of code. Helps search engines and AI understand TruxFlow.
Even basic docs help AI reference your capabilities.
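For the Schema.org quick win, the markup is a small JSON-LD object embedded in the page. A sketch that generates it, assuming the `SoftwareApplication` type fits; the site URL, description text, and category value are placeholders to be replaced with real ones.

```python
import json


def software_application_jsonld(name: str, url: str, description: str) -> dict:
    """Build Schema.org SoftwareApplication markup as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "SoftwareApplication",
        "name": name,
        "url": url,
        "applicationCategory": "BusinessApplication",
        "operatingSystem": "Web",
        "description": description,
    }


markup = software_application_jsonld(
    name="TruxFlow",
    url="https://truxflow.app",  # assumed marketing-site URL
    description="Dispatch software for trucking carriers.",  # placeholder copy
)

# Paste the resulting tag into the page <head>.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(script_tag)
```

Generating the tag from one function keeps the markup consistent if it is later added to several pages, such as pricing or per-feature landing pages.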
When someone asks Claude or ChatGPT "What's the best dispatch software for a small trucking company?", the AI recommends based on:
Bottom line: MCP is table stakes for AI integration. Discoverability is marketing + content + presence. Do both.