There's a graveyard in every brokerage's software budget: the visibility platform someone half-integrated, the pricing tool with the stale login, the workflow product that was going to fix the inbox. None of them were bad software. All of them died the same death — the vendor shipped a login and a help center, the broker's 'IT guy' (a dispatcher with patience) got it 60% configured, and entropy did the rest. Now apply that pattern to an AI agent, where configuration isn't cosmetic: thresholds decide which exceptions surface, mappings decide what the agent believes about your freight, and a wrong setting doesn't just hide a feature — it drafts the wrong thing about a real load.
The two models, honestly
AI software you run vs. managed AI service
| | AI software (you operate it) | Managed AI service (vendor operates it) |
|---|---|---|
| Onboarding | Your team maps the TMS, sets thresholds, follows docs | Vendor runs intake, mapping, and sandbox replay against your historical freight |
| Validation before go-live | Whatever testing you have time for | Shadow mode against live read-only data, comparing the agent's calls to your team's, until both sides sign off |
| Tuning over time | Drifts after the champion who configured it leaves | Vendor's job — thresholds, mappings, and new workflows tuned as your freight mix changes |
| When it breaks at 2 a.m. | A support ticket and a status page | Vendor's monitoring catches it; affected workflows pause; you get an explanation, not a mystery |
| Cost shape | Lower sticker, plus your team's hidden operating time | Higher sticker, includes the operating team |
| Control | Total — including total responsibility | Approval rules, gates, and rollout pace stay yours; the plumbing doesn't |
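The shadow-mode row in the table can be made concrete with a small sketch. It assumes, purely for illustration, that each exception observed in shadow mode is logged with the agent's proposed call next to the call the dispatcher actually made. The record shape and the names `ShadowRecord`, `agreement_rate`, and `disagreements` are hypothetical and do not describe any vendor's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowRecord:
    """One exception seen in shadow mode (hypothetical record shape)."""
    load_id: str
    agent_call: str  # what the agent would have done, e.g. "escalate"
    team_call: str   # what the dispatcher actually did


def agreement_rate(records: list[ShadowRecord]) -> float:
    """Share of records where the agent's call matched the team's call."""
    if not records:
        return 0.0
    matches = sum(1 for r in records if r.agent_call == r.team_call)
    return matches / len(records)


def disagreements(records: list[ShadowRecord]) -> list[ShadowRecord]:
    """Records both sides would review together before signing off."""
    return [r for r in records if r.agent_call != r.team_call]
```

The disagreement list is arguably the more useful output: a mismatch can mean a rushed mapping, a threshold that encodes the wrong service policy, or a dispatcher habit worth questioning.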
Why AI raises the stakes on this old question
- Configuration is judgment, not preference: a stale-tracking threshold isn't a settings choice, it's a service policy. Somebody with freight operations experience has to own it.
- Trust is path-dependent: an agent that drafts nonsense in week two — because mapping was rushed — gets ignored in week ten even after it's fixed. You only get one first impression with a dispatch floor.
- The rollout itself is the safety mechanism: sandbox replay, shadow mode, scoped go-live, then expansion gated on how often drafts get approved unchanged. That sequence — the spine of any serious vendor evaluation — is operational work someone has to actually do.
- Monitoring isn't optional: retries, dead-lettered events, and provider outages are weekly realities of any integration. Unwatched, they degrade silently into 'the AI missed it.'
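The expansion gate described in the rollout bullet above could look roughly like the sketch below. It assumes each draft the agent produces is recorded as approved or not, and edited or not, before sending. The names `DraftOutcome`, `unchanged_approval_rate`, and `may_expand` are hypothetical, and the 90% rate and 50-draft minimum are placeholder values, not recommendations; whoever owns the service policy would set the real ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DraftOutcome:
    """What happened to one agent draft (hypothetical record shape)."""
    workflow: str
    approved: bool  # a human approved sending it
    edited: bool    # the human changed it before approving


def unchanged_approval_rate(outcomes: list[DraftOutcome]) -> float:
    """Share of drafts approved exactly as the agent wrote them."""
    if not outcomes:
        return 0.0
    unchanged = sum(1 for o in outcomes if o.approved and not o.edited)
    return unchanged / len(outcomes)


def may_expand(
    outcomes: list[DraftOutcome],
    min_rate: float = 0.90,  # assumed placeholder threshold
    min_drafts: int = 50,    # assumed placeholder sample size
) -> bool:
    """Gate expansion on both enough evidence and a high enough rate."""
    if len(outcomes) < min_drafts:
        return False
    return unchanged_approval_rate(outcomes) >= min_rate
```

Requiring a minimum number of drafts keeps a handful of lucky approvals from unlocking the next workflow.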
When each model is right
- Choose software-you-run when you have genuine in-house ops engineering — people whose actual job is owning integrations — and the volume to justify them.
- Choose managed when you want the outcome without building that muscle: most brokerages under a few hundred thousand loads a year, and any team whose last three tools became shelfware.
- Builders are the exception: if you're creating your own freight agent, you don't want either — you want the execution layer as infrastructure, which is what Headless Haulbase is for.
Haulbase made its choice deliberately: the Haulbase Agent is a managed service, not a login. Haulbase owns intake, integration mapping, sandbox replay, shadow-mode review, monitoring, and escalation; your team owns the approvals, the rules, and the pace — including whether to expand from the Agent into ATMS later. You're not buying a tool to configure. You're buying a freight operations capability that shows up working, with the receipts to prove it.