The Setup
In 60 days, 41,377 vessel visits were logged. Of those, 34,469 (83.3%) carry an estimated_cargo_hydrostatic value. The coverage headline looks solid. The method breakdown does not.
Broken down by method (percentages are shares of all 41,377 visits, not of the 34,469 with estimates):
- 10,726 (25.9%) were computed under the no_draft_data method. That value is 0 tons. Not missing, not uncertain: zero. These are not cargo estimates; they are null-equivalent placeholders with a tag.
- 15,125 (36.6%) used fallback_tons_per_meter, a population model derived from vessel class statistics, not from the vessel's measured draft during this visit.
- 6,393 (15.5%) used draft_velocity_filtered, a real draft-change signal with medium confidence.
- 514 (1.24%) used trim_saline_corrected, the only method that accounts for both trim and port salinity.
- 6,908 (16.7%) have no estimate at all.
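The shares above can be re-derived from the raw counts. The sketch below is only a check on the arithmetic, using the counts quoted in this section; the variable names are illustrative. It also shows that the four listed methods sum to 32,758, so 1,711 visits with an estimate fall under methods this breakdown does not name.

```python
# Counts copied from the breakdown above (60-day window).
TOTAL_VISITS = 41_377
WITH_ESTIMATE = 34_469

method_counts = {
    "no_draft_data": 10_726,
    "fallback_tons_per_meter": 15_125,
    "draft_velocity_filtered": 6_393,
    "trim_saline_corrected": 514,
}


def share(count, total=TOTAL_VISITS):
    """Percentage of all logged visits, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {method: share(n) for method, n in method_counts.items()}

# Visits with no estimate at all.
no_estimate = TOTAL_VISITS - WITH_ESTIMATE

# Visits with an estimate whose method is not one of the four listed.
listed = sum(method_counts.values())
unlisted_with_estimate = WITH_ESTIMATE - listed

# Estimates that are not zero-valued placeholders.
non_placeholder = WITH_ESTIMATE - method_counts["no_draft_data"]

for method, pct in shares.items():
    print(f"{method:26s} {pct:6.2f}%")
print(f"{'no estimate':26s} {share(no_estimate):6.2f}%")
print(f"{'non-placeholder estimates':26s} {share(non_placeholder):6.2f}%")
```

Once the zero-valued no_draft_data rows are set aside, 23,743 visits (about 57% of the total) carry an estimate that is more than a placeholder.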
The Chain
The method choice is not neutral. For bulk carriers, where cargo is the primary throughput signal, trim_saline_corrected produces an average of 5,557 tons per visit. For the same vessel class, fallback_tons_per_meter produces an average of 7,641 tons. That is a gap of about 38%, and it runs in the direction a fallback model would not be expected to go: the population model overstates cargo relative to the saline-corrected measurement.
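As a quick check on the gap, the relative difference can be computed with the saline-corrected average as the reference. This is a minimal sketch; the function name is illustrative.

```python
def relative_gap(reference, other):
    """Relative difference of `other` versus `reference` (0.10 means 10% higher)."""
    return (other - reference) / reference


# Bulk carrier averages quoted above, in tons per visit.
saline_corrected_avg = 5557
fallback_avg = 7641

gap = relative_gap(saline_corrected_avg, fallback_avg)
print(f"{gap:.1%}")
```

The unrounded figure is 37.5%, which the text reports as 38%.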
The reason is selection bias. The 454 bulk carrier visits that qualify for trim_saline_corrected are vessels with complete draft data at arrival and departure in ports with known water density. They are typically mid-tier vessels with moderate loads. The fallback fires on the rest, including lighter-loaded vessels where the population average overestimates what is actually aboard.
Tankers take a different path: 6,114 tanker visits use draft_velocity_filtered at medium confidence, averaging 1,427 tons. That figure reflects draft change (load transferred at berth), not departure displacement. For tankers that load partially or transfer cargo at sea, the signal is structurally incomplete.
The Implication
Any analytics layer that aggregates estimated_cargo_hydrostatic for throughput forecasting is working with a composite signal that behaves differently depending on which method fired. A port dominated by no_draft_data visits will show systematically near-zero throughput. A port dominated by fallback_tons_per_meter bulk calls will show cargo estimates that skew roughly 38% above the saline-corrected measurement. These are not noise; they are systematic biases by port and vessel type.
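One way to keep the composite signal from blending silently is to aggregate per port and per method, and to keep placeholder rows out of the tonnage sum. The sketch below assumes each visit is a dict with hypothetical keys port, method, and estimated_cargo_hydrostatic; the sample rows are synthetic and only illustrate the shape of the output.

```python
from collections import defaultdict

# Methods whose value is a tagged placeholder, not an estimate.
PLACEHOLDER_METHODS = {"no_draft_data"}


def tons_by_port_and_method(visits):
    """Sum estimated tons per (port, method), counting unusable rows separately.

    Rows with no method are bucketed as "no_estimate". Placeholder and
    no-estimate rows are counted but contribute no tonnage.
    """
    totals = defaultdict(lambda: {"visits": 0, "tons": 0.0})
    for v in visits:
        method = v.get("method") or "no_estimate"
        bucket = totals[(v["port"], method)]
        bucket["visits"] += 1
        if method not in PLACEHOLDER_METHODS and method != "no_estimate":
            bucket["tons"] += v["estimated_cargo_hydrostatic"]
    return dict(totals)


# Synthetic rows for illustration only.
sample = [
    {"port": "A", "method": "fallback_tons_per_meter", "estimated_cargo_hydrostatic": 7000.0},
    {"port": "A", "method": "no_draft_data", "estimated_cargo_hydrostatic": 0.0},
    {"port": "A", "method": None, "estimated_cargo_hydrostatic": None},
    {"port": "B", "method": "trim_saline_corrected", "estimated_cargo_hydrostatic": 5500.0},
]

summary = tons_by_port_and_method(sample)
for (port, method), stats in sorted(summary.items()):
    print(port, method, stats)
```

Reporting the visit count alongside tonnage for each method makes it visible when a port's total rests mostly on fallback or placeholder rows.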
Screening workflows that use cargo estimate as a risk filter (looking for anomalously light vessels as a sanctions or smuggling signal, for example) will fire differently in ports where measurement is possible versus ports where fallback is the rule. A heavy fallback rate is indistinguishable from a genuine light-load signal unless the method tag is checked.
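A screening rule can make the method check explicit by returning a third outcome when the estimate does not come from a draft-based method. The sketch below is an assumption about how such a filter might look, not a description of any existing workflow: the expected_tons input, the 0.5 ratio threshold, and the choice of which methods count as draft-based are all hypothetical.

```python
# Assumption: only these methods reflect draft observed during the visit.
# draft_velocity_filtered measures cargo exchanged at berth, so a stricter
# filter might accept trim_saline_corrected alone.
DRAFT_BASED_METHODS = {"trim_saline_corrected", "draft_velocity_filtered"}


def light_load_status(visit, expected_tons, ratio_threshold=0.5):
    """Return "flag", "clear", or "unverifiable" for a single visit.

    `expected_tons` is a hypothetical baseline for the vessel; the
    threshold is illustrative, not a calibrated value.
    """
    if expected_tons <= 0:
        raise ValueError("expected_tons must be positive")
    if visit.get("method") not in DRAFT_BASED_METHODS:
        # Placeholder, fallback, or missing estimates cannot support a flag.
        return "unverifiable"
    ratio = visit["estimated_cargo_hydrostatic"] / expected_tons
    return "flag" if ratio < ratio_threshold else "clear"


measured_light = {"method": "trim_saline_corrected", "estimated_cargo_hydrostatic": 2000.0}
placeholder = {"method": "no_draft_data", "estimated_cargo_hydrostatic": 0.0}

print(light_load_status(measured_light, expected_tons=5500.0))
print(light_load_status(placeholder, expected_tons=5500.0))
```

Under this rule a zero-valued no_draft_data row is reported as unverifiable instead of being flagged as an anomalously light vessel.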
What to Watch
Coverage of trim_saline_corrected at 1.24% is not an infrastructure ceiling; it is a data availability limit. The method fires when both arrival and departure draft readings are present alongside port water density data. The port_water_density_cells table currently covers 104 cells at 35 ports. Expanding that coverage would directly raise the share of visits that qualify for the most accurate method.
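The qualifying condition described above can be sketched as a predicate, which also gives a rough way to estimate how many more visits would qualify if density coverage were extended. The field names arrival_draft_m and departure_draft_m are hypothetical, and the lookup is simplified to port level even though the table is organised by cell; the sample rows are synthetic.

```python
def qualifies_for_trim_saline(visit, density_ports):
    """True when both drafts are present and the port has water density data."""
    return (
        visit.get("arrival_draft_m") is not None
        and visit.get("departure_draft_m") is not None
        and visit.get("port") in density_ports
    )


def coverage_if_added(visits, current_ports, candidate_ports):
    """Count qualifying visits before and after adding candidate ports."""
    expanded = set(current_ports) | set(candidate_ports)
    before = sum(qualifies_for_trim_saline(v, current_ports) for v in visits)
    after = sum(qualifies_for_trim_saline(v, expanded) for v in visits)
    return before, after


# Synthetic rows for illustration only.
visits = [
    {"port": "A", "arrival_draft_m": 9.1, "departure_draft_m": 11.4},
    {"port": "B", "arrival_draft_m": 8.0, "departure_draft_m": 10.2},
    {"port": "B", "arrival_draft_m": None, "departure_draft_m": 10.0},
]

before, after = coverage_if_added(visits, current_ports={"A"}, candidate_ports={"B"})
print(before, after)
```

The third row shows the limit of this lever: adding density data for a port does nothing for visits that lack one of the two draft readings.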
The 10,726 no_draft_data visits are the most tractable gap. A subset of those vessels self-report draft in AIS messages at hourly intervals; identifying which visits have usable AIS-reported draft would pull the no_draft_data rate down without altering the method stack.
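A first pass at identifying recoverable visits could look for a valid AIS-reported draft close to both arrival and departure. The sketch below is one possible definition of "usable", not an established rule: the two-hour window, the field names, and the treatment of missing or non-positive draft as unusable are all assumptions.

```python
from datetime import datetime, timedelta


def has_usable_ais_draft(visit, reports, window=timedelta(hours=2)):
    """True if a positive draft report exists near both arrival and departure.

    `reports` is a list of dicts with hypothetical keys timestamp and draft_m.
    """
    def report_near(ts):
        return any(
            r["draft_m"] is not None
            and r["draft_m"] > 0
            and abs(r["timestamp"] - ts) <= window
            for r in reports
        )

    return report_near(visit["arrival"]) and report_near(visit["departure"])


# Synthetic visit and reports for illustration only.
visit = {
    "arrival": datetime(2026, 6, 1, 8, 0),
    "departure": datetime(2026, 6, 2, 20, 0),
}
reports = [
    {"timestamp": datetime(2026, 6, 1, 7, 30), "draft_m": 9.2},
    {"timestamp": datetime(2026, 6, 2, 12, 0), "draft_m": None},
    {"timestamp": datetime(2026, 6, 2, 21, 0), "draft_m": 11.0},
]

print(has_usable_ais_draft(visit, reports))
```

Self-reported draft is entered by the crew, so visits recovered this way may deserve a lower confidence tier than measured draft.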
Limitations
The 38% overestimate gap between fallback_tons_per_meter and trim_saline_corrected is computed on overlapping but non-identical bulk carrier populations. The comparison is informative but not a controlled experiment: the fallback fires precisely on vessels where measurement is unavailable, which may have systematically different load profiles. The actual population-level gap could be smaller or larger. The draft_velocity_filtered average for tankers reflects cargo exchanged at berth, not total cargo aboard; comparing it to full hydrostatic estimates is not a like-for-like comparison. Method confidence scores (0.720 for medium, 0.850 for high) are model-assigned thresholds, not empirically validated accuracy figures.
Data as of 2026-06-09. Sources: vessel_visits (60-day window, n=41,377), vessel_hydrostatic_profiles, port_water_density_cells.