// Walkthrough · 8 min read

INSIDE A CARRIER SCORECARD: A SAMPLE OUTPUT.

"AI-driven carrier scorecard" is a vague phrase. Here is the actual output a Marapone build produces, anonymized from a real brokerage deployment, with the source data behind every line so you can see exactly where each number comes from.

By the Marapone team · Updated 2026

The shape of the report

A working carrier scorecard isn't a dashboard. It's a weekly written document that an ops VP reads on Monday morning and acts on the same week. Three sections:

Top-line summary — five carriers, ranked, with one-sentence reasons.
Movement alerts — carriers whose performance changed materially in the last 30 days.
Per-carrier detail — the metrics behind the rank, by region and lane.

The scorecard is delivered as a PDF and as a structured row in the brokerage's database. Both formats are produced from the same underlying numbers, with no copy-paste.
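One way to get both formats from the same numbers is to keep a single record per line item and render it twice. The sketch below assumes a hypothetical `TopLineEntry` schema and two illustrative renderers; none of these names come from the Marapone build itself.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TopLineEntry:
    """One ranked carrier in the top-line summary (hypothetical schema)."""
    rank: int
    carrier: str
    score: float
    reason: str


def to_row(entry: TopLineEntry) -> dict:
    """Structured form, e.g. the payload for a database insert."""
    return asdict(entry)


def to_text(entry: TopLineEntry) -> str:
    """Human-readable form, e.g. one line in the PDF body."""
    return f"{entry.rank}. {entry.carrier}: {entry.score:.1f} ({entry.reason})"


# Both outputs are derived from the same object, so they cannot drift apart.
entry = TopLineEntry(1, "Carrier A", 91.2, "consistent on-time, billing accuracy 99.4%")
row = to_row(entry)
line = to_text(entry)
```

Because the row and the printed line both read from `entry`, a corrected score only has to be changed in one place.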

Anonymized top-line: Q3 2025, mid-size brokerage

Top 5 carriers, weighted score (0-100):

1. Carrier A — 91.2 (consistent on-time, billing accuracy 99.4%)
2. Carrier B — 88.7 (best lane coverage, slight detention slip)
3. Carrier C — 84.3 (recovered from Q2 dip; trending up)
4. Carrier D — 81.0 (good price, weakening on-time on Memphis lanes)
5. Carrier E — 76.8 (high reweigh disputes; investigate)

Each score is a weighted aggregate of seven inputs: on-time pickup, on-time delivery, billing accuracy, accessorial dispute rate, claim frequency, tender acceptance, and POD compliance. Weights are configurable per brokerage; the defaults reflect what most ops VPs actually care about.
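A per-brokerage weight configuration could be validated roughly like this before it is used. This is a minimal sketch: the input names, the `normalize_weights` helper, and the equal-weight placeholder are assumptions for illustration. The product's actual defaults are not listed in this article.

```python
# The seven inputs named above, as hypothetical config keys.
INPUTS = (
    "on_time_pickup",
    "on_time_delivery",
    "billing_accuracy",
    "accessorial_dispute_rate",
    "claim_frequency",
    "tender_acceptance",
    "pod_compliance",
)


def normalize_weights(raw: dict) -> dict:
    """Validate a weight config and rescale it so the weights sum to 1.0."""
    missing = set(INPUTS) - set(raw)
    extra = set(raw) - set(INPUTS)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} unexpected={sorted(extra)}")
    if any(w < 0 for w in raw.values()):
        raise ValueError("weights must be non-negative")
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: raw[name] / total for name in INPUTS}


# Placeholder config: equal weights, NOT the product's defaults.
equal = normalize_weights({name: 1.0 for name in INPUTS})
```

Rejecting unknown or missing keys up front means a typo in a client's config fails loudly instead of silently dropping an input from the score.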

Where each number comes from

Take Carrier B's score above (88.7). The component sources:

On-time pickup (96.2%): from EDI 214 status messages compared to scheduled pickup window in the TMS.
On-time delivery (94.1%): from Project44 milestone events compared to the scheduled delivery window.
Billing accuracy (98.8%): from the Marapone audit module — invoiced vs quoted.
Accessorial dispute rate (1.2%): from the dispute log in the TMS.
Claim frequency (0.18%): from claims module in TMS.
Tender acceptance (87.3%): from EDI 990 responses.
POD compliance (99.1%): POD uploaded within 48 hours of delivery.

Every number traces back to a specific event in a specific source system, with a timestamp. If anyone disputes a score, the audit trail is one click away.
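As an illustration of how a percentage can stay traceable to its events, here is a sketch of an on-time pickup calculation that returns the late events alongside the rate. The `PickupEvent` fields and the sample timestamps are hypothetical, and the rule that a pickup is on time if it happens at or before the window closes is an assumption; the article does not define the cutoff.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PickupEvent:
    shipment_id: str
    source: str            # originating system, e.g. "EDI 214"
    actual: datetime       # timestamp on the status message
    window_end: datetime   # close of the scheduled pickup window in the TMS


def on_time_rate(events):
    """Return (percent on time, late events kept as the audit trail)."""
    if not events:
        raise ValueError("no events to score")
    # Assumption: anything at or before the window close counts as on time.
    late = [e for e in events if e.actual > e.window_end]
    pct = 100.0 * (len(events) - len(late)) / len(events)
    return pct, late


# Hypothetical sample events, not data from the deployment.
events = [
    PickupEvent("S1", "EDI 214", datetime(2025, 9, 1, 9, 30), datetime(2025, 9, 1, 10, 0)),
    PickupEvent("S2", "EDI 214", datetime(2025, 9, 1, 12, 15), datetime(2025, 9, 1, 12, 0)),
    PickupEvent("S3", "EDI 214", datetime(2025, 9, 2, 8, 0), datetime(2025, 9, 2, 8, 0)),
    PickupEvent("S4", "EDI 214", datetime(2025, 9, 2, 15, 45), datetime(2025, 9, 2, 16, 0)),
]
pct, late = on_time_rate(events)
```

Returning the late events with their source and timestamps is what lets a disputed score be checked against the underlying records.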

The movement alerts section

This is the section the VP actually acts on. Anonymized example:

Movement alerts (last 30 days):

• Carrier D, Memphis lane: on-time delivery dropped from 94% to 79%. Eight late-by-2hr+ events in two weeks. Suggested action: call carrier rep before next shipper QBR.
• Carrier E, US Northeast: reweigh disputes up 4× vs prior 90-day baseline. Pattern: 12 of 18 disputes trace to an origin scale calibrated off by more than 200 lbs. Suggested action: validate origin scale at the affected DC.
• Carrier A, all lanes: tender acceptance dropped from 94% to 87%. Coincides with their fleet reduction announcement in Q3. Suggested action: diversify backup carriers on top 5 Carrier A lanes.

The model surfaces the pattern. A human decides what to do about it. The scorecard never prescribes "fire this carrier." It gives the VP the facts in the right order.
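A simple rule-based version of "changed materially" might look like the sketch below. The `MetricWindow` record, the 5-point and 2x thresholds, and the index values used for Carrier E are illustrative assumptions. The article does not describe how the production model detects movement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricWindow:
    carrier: str
    scope: str              # lane or region
    metric: str
    baseline: float         # prior-period value
    current: float          # last-30-day value
    higher_is_better: bool


def movement_alert(m, min_points=5.0, min_ratio=2.0):
    """Return a one-line alert if the metric worsened materially, else None.

    Thresholds are illustrative assumptions, not the product's settings.
    """
    if m.higher_is_better:
        if m.baseline - m.current >= min_points:
            return (f"{m.carrier}, {m.scope}: {m.metric} dropped "
                    f"from {m.baseline:g}% to {m.current:g}%")
    elif m.baseline > 0 and m.current / m.baseline >= min_ratio:
        return (f"{m.carrier}, {m.scope}: {m.metric} "
                f"up {m.current / m.baseline:g}x vs baseline")
    return None


windows = [
    MetricWindow("Carrier D", "Memphis lane", "on-time delivery", 94.0, 79.0, True),
    MetricWindow("Carrier A", "all lanes", "tender acceptance", 94.0, 87.0, True),
    # Hypothetical index values chosen only to represent the "up 4x" example.
    MetricWindow("Carrier E", "US Northeast", "reweigh disputes", 1.0, 4.0, False),
    # Hypothetical stable metric that should not trigger an alert.
    MetricWindow("Carrier X", "all lanes", "on-time pickup", 96.0, 95.5, True),
]
alerts = [a for a in map(movement_alert, windows) if a is not None]
```

The function only reports the change. Choosing the suggested action stays with a person, in line with the division of labor described above.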

Per-carrier detail (Carrier D, expanded)

Each carrier gets a one-page detail view. The structure for Carrier D in the example above:

Score: 81.0 (down 4.3 from previous month).
Total shipments tracked: 247.
Total spend: $284,000.
Lane breakdown: 6 lanes, with score per lane.
The Memphis lane (where the drop happened) has its own breakout: 38 shipments, 79% on-time delivery, 11 service exceptions in 30 days, top 3 root-cause categories from the EDI 214 reason codes.
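The lane breakout can be sketched as a filter plus a count of reason codes. The `Shipment` record, the `CAT_A` to `CAT_D` labels, and the split of exceptions across categories are hypothetical placeholders (they are not real EDI 214 reason codes). Only the totals of 38 shipments, 79% on-time, and 11 exceptions come from the example above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Shipment:
    lane: str
    on_time: bool
    reason_code: Optional[str] = None   # set when a service exception was logged


def lane_breakout(shipments, lane, top_n=3):
    """Summarize one lane: volume, on-time %, exceptions, top root causes."""
    rows = [s for s in shipments if s.lane == lane]
    if not rows:
        raise ValueError(f"no shipments on lane {lane!r}")
    reasons = Counter(s.reason_code for s in rows if s.reason_code)
    return {
        "lane": lane,
        "shipments": len(rows),
        "on_time_pct": round(100 * sum(s.on_time for s in rows) / len(rows)),
        "exceptions": sum(reasons.values()),
        "top_reasons": reasons.most_common(top_n),
    }


# Placeholder categories and split; only the totals match the article.
codes = ["CAT_A"] * 5 + ["CAT_B"] * 3 + ["CAT_C"] * 2 + ["CAT_D"]
memphis = (
    [Shipment("Memphis", True) for _ in range(27)]
    + [Shipment("Memphis", True, c) for c in codes[:3]]    # exception, still on time
    + [Shipment("Memphis", False, c) for c in codes[3:]]   # 8 late deliveries
)
other = [Shipment("Other lane", True) for _ in range(5)]
report = lane_breakout(memphis + other, "Memphis")
```

With 30 of 38 deliveries on time, the rounded rate is 79%, which is consistent with the eight late events in the movement alert.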

An ops VP reads this in 90 seconds and walks into the carrier rep call with specifics.

Why this matters more than a dashboard

Most BI dashboards for carrier performance are well-built and rarely opened. The reason is structural: a dashboard requires the user to come to it, ask the right question, and interpret the chart. By Wednesday morning the dashboard is stale and ignored.

A weekly written report does the work in advance. It picks the most important changes. It states them in language that maps to action. It arrives in the inbox at the same time every Monday. The discipline of the deliverable is what gets the system used.

Honest framing:

The hard part of carrier scoring isn't the math. It's producing a report busy people actually read and act on. The Marapone scorecard is built around that constraint.

What we configured for this client

Their custom weights:

On-time delivery: 25% (their shippers care most about this).
Billing accuracy: 20%.
On-time pickup: 15%.
Tender acceptance: 15%.
Claim frequency: 10%.
Accessorial dispute rate: 10%.
POD compliance: 5%.

Other clients weight differently. One uses 30% on billing accuracy because invoice leakage is their biggest pain. Yours might be different. The report adjusts; the inputs don't.
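To make the weighting concrete, here is a sketch that applies this client's weights to the Carrier B components listed earlier. It is a naive linear aggregate under one loud assumption: the two "lower is better" rates are inverted as 100 minus the rate. It comes out near 95.6, not the published 88.7, which suggests the production scoring normalizes components differently or that the published score used other settings. The article does not say which.

```python
# This client's weights, in percent, as listed above.
WEIGHTS = {
    "on_time_delivery": 25,
    "billing_accuracy": 20,
    "on_time_pickup": 15,
    "tender_acceptance": 15,
    "claim_frequency": 10,
    "accessorial_dispute_rate": 10,
    "pod_compliance": 5,
}

# Carrier B components, in percent, from the source breakdown.
CARRIER_B = {
    "on_time_pickup": 96.2,
    "on_time_delivery": 94.1,
    "billing_accuracy": 98.8,
    "accessorial_dispute_rate": 1.2,
    "claim_frequency": 0.18,
    "tender_acceptance": 87.3,
    "pod_compliance": 99.1,
}

# Inputs where a lower raw value means better performance.
LOWER_IS_BETTER = {"accessorial_dispute_rate", "claim_frequency"}


def weighted_score(components, weights):
    """Naive 0-100 weighted aggregate (illustrative, not the production formula)."""
    total_weight = sum(weights.values())
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    score = 0.0
    for name, weight in weights.items():
        value = components[name]
        if name in LOWER_IS_BETTER:
            value = 100.0 - value   # assumption: simple inversion
        score += weight * value
    return score / total_weight


score = weighted_score(CARRIER_B, WEIGHTS)
```

Swapping in a different `WEIGHTS` dict changes the score while `CARRIER_B` stays untouched, which is the sense in which the report adjusts and the inputs don't.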


WANT A SCORECARD
RUN ON YOUR CARRIERS?

Send us 90 days of shipment + invoice data and we'll produce a sample scorecard — anonymized in the deliverable, real numbers behind it.

Request a Sample Scorecard →
