fix(ops): fix state corruption when one badge holds multiple work orders in parallel
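Observed in production: when an administrator manually cancels a zombie DISPATCHED order, the cleanup backfires ("the more you clear, the more pile up"). The system reacts by dispatching the head of the queue to a cleaner who is still working. The listener then applies its "leftover old order" mechanism and tries to cancel the order that is currently being executed. That cancellation runs in a separate REQUIRES_NEW transaction and swallows exceptions, so the new order lands while the old one stays behind, and the same device ends up holding several non-terminal orders.

Two fixes:

1. DispatchEngineImpl.autoDispatchNext now checks at entry that the device is idle. If the assignee still holds DISPATCHED/CONFIRMED/ARRIVED/PAUSED orders (excluding completedOrderId), it returns early and dispatches nothing. All callers (cleaning/security handleCancelled, asyncCompleteAndDispatchNext, the xxl-job idle scan) are protected automatically. Adds OpsOrderMapper.selectActiveByAssignee.

2. BadgeDeviceStatusEventListener.handleDispatched no longer cancels leftover orders. The old logic used a REQUIRES_NEW transaction and swallowed exceptions; it was a brute-force fallback for data that was already corrupted, and when it failed it cancelled the wrong order. It now only logs an ERROR alert to surface the problem and cleans up just the Redis association. The real line of defense is the DispatchEngine entry point.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>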
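The entry guard described in fix 1 can be illustrated with a minimal in-memory sketch. Everything here is hypothetical scaffolding for illustration (the class `IdleGuardSketch`, its `Order` type, the `QUEUED` status and the simple FIFO queue are assumptions, not the project's real classes); only the four active statuses and the "exclude the triggering order" rule come from the commit message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for the dispatch engine; not the real DispatchEngineImpl.
class IdleGuardSketch {
    // QUEUED is an assumed name for "waiting in the queue"; the other names follow the commit message.
    enum Status { QUEUED, DISPATCHED, CONFIRMED, ARRIVED, PAUSED, COMPLETED, CANCELLED }

    static final class Order {
        final long id;
        Long assigneeId;   // null while the order is still waiting in the queue
        Status status;

        Order(long id, Long assigneeId, Status status) {
            this.id = id;
            this.assigneeId = assigneeId;
            this.status = status;
        }
    }

    // The four non-terminal statuses that mean "the device is still busy".
    static final Set<Status> ACTIVE =
            EnumSet.of(Status.DISPATCHED, Status.CONFIRMED, Status.ARRIVED, Status.PAUSED);

    final List<Order> orders = new ArrayList<>();
    final Deque<Order> queue = new ArrayDeque<>();

    Order add(Order o) { orders.add(o); return o; }

    Order enqueue(Order o) { orders.add(o); queue.add(o); return o; }

    // Counts the assignee's active orders, ignoring the order that triggered this call.
    long countActive(long assigneeId, Long excludeOrderId) {
        long n = 0;
        for (Order o : orders) {
            boolean sameAssignee = o.assigneeId != null && o.assigneeId == assigneeId;
            boolean excluded = excludeOrderId != null && o.id == excludeOrderId;
            if (sameAssignee && ACTIVE.contains(o.status) && !excluded) {
                n++;
            }
        }
        return n;
    }

    // Returns the newly dispatched order, or null when dispatch is skipped.
    Order autoDispatchNext(Long completedOrderId, Long assigneeId) {
        if (assigneeId == null) {
            return null;                       // no assignee: nothing to dispatch to
        }
        if (countActive(assigneeId, completedOrderId) > 0) {
            return null;                       // device not idle: early return, queue untouched
        }
        Order next = queue.poll();
        if (next == null) {
            return null;                       // queue is empty
        }
        next.assigneeId = assigneeId;
        next.status = Status.DISPATCHED;
        return next;
    }

    public static void main(String[] args) {
        IdleGuardSketch engine = new IdleGuardSketch();
        Order zombie = engine.add(new Order(99, 7L, Status.DISPATCHED));
        Order current = engine.add(new Order(100, 7L, Status.ARRIVED));
        engine.enqueue(new Order(101, null, Status.QUEUED));

        // The administrator cancels the zombie order while order 100 is still in progress.
        zombie.status = Status.CANCELLED;
        Order first = engine.autoDispatchNext(99L, 7L);
        System.out.println(first == null ? "skipped" : "dispatched " + first.id);   // skipped

        // Only once order 100 really finishes does the queue head go out.
        current.status = Status.COMPLETED;
        Order second = engine.autoDispatchNext(100L, 7L);
        System.out.println(second == null ? "skipped" : "dispatched " + second.id); // dispatched 101
    }
}
```

Because the guard sits at the single entry point, every caller gets the same protection without needing its own check, which is the design choice the commit message argues for.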
@@ -178,6 +178,22 @@ public class DispatchEngineImpl implements DispatchEngine {
     public DispatchResult autoDispatchNext(Long completedOrderId, Long assigneeId) {
         log.info("Auto-dispatching next order after task completion: completedOrderId={}, assigneeId={}", completedOrderId, assigneeId);
 
+        if (assigneeId == null) {
+            log.warn("autoDispatchNext has no assignee, skipping dispatch: completedOrderId={}", completedOrderId);
+            return DispatchResult.success("No assignee, dispatch skipped", null);
+        }
+
+        // Idle check: if the assignee still holds other active orders (DISPATCHED/CONFIRMED/ARRIVED/PAUSED),
+        // the device is not actually idle and no new task should be dispatched; otherwise we trigger the
+        // "multiple parallel orders on one device" state corruption. The typical case is an administrator
+        // manually cancelling a zombie DISPATCHED order, where handleCancelled calls into this method.
+        List<OpsOrderDO> activeOrders = orderMapper.selectActiveByAssignee(assigneeId, completedOrderId);
+        if (!activeOrders.isEmpty()) {
+            OpsOrderDO head = activeOrders.get(0);
+            log.info("Assignee still has active orders, skipping auto-dispatch: assigneeId={}, completedOrderId={}, activeCount={}, sampleOrderId={}, sampleStatus={}",
+                    assigneeId, completedOrderId, activeOrders.size(), head.getId(), head.getStatus());
+            return DispatchResult.success("Assignee not idle, dispatch skipped", assigneeId);
+        }
+
         Long fallbackAreaId = null;
         OpsOrderDO completedOrder = orderMapper.selectById(completedOrderId);
         if (completedOrder != null) {
@@ -92,6 +92,28 @@ public interface OpsOrderMapper extends BaseMapperX<OpsOrderDO> {
                 .last("LIMIT 1"));
     }
 
+    /**
+     * Queries the assignee's orders that have not yet finished (DISPATCHED/CONFIRMED/ARRIVED/PAUSED).
+     * <p>
+     * Used for the idle check at dispatch entry points such as autoDispatchNext: if the assignee still
+     * holds active orders, no new task should be dispatched, which avoids the "the more you clear,
+     * the more pile up" cascade of dispatches.
+     *
+     * @param assigneeId     assignee ID (badge device ID)
+     * @param excludeOrderId ID of the order to exclude (usually the order whose completion or cancellation triggered this dispatch); may be null
+     * @return list of active orders, sorted by creation time ascending
+     */
+    default List<OpsOrderDO> selectActiveByAssignee(Long assigneeId, Long excludeOrderId) {
+        return selectList(new LambdaQueryWrapperX<OpsOrderDO>()
+                .eq(OpsOrderDO::getAssigneeId, assigneeId)
+                .in(OpsOrderDO::getStatus,
+                        WorkOrderStatusEnum.DISPATCHED.getStatus(),
+                        WorkOrderStatusEnum.CONFIRMED.getStatus(),
+                        WorkOrderStatusEnum.ARRIVED.getStatus(),
+                        WorkOrderStatusEnum.PAUSED.getStatus())
+                .ne(excludeOrderId != null, OpsOrderDO::getId, excludeOrderId)
+                .orderByAsc(OpsOrderDO::getCreateTime));
+    }
+
     // ==================== Statistics and aggregation queries ====================
 
     /**
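To make the semantics of selectActiveByAssignee concrete, the following is a rough in-memory equivalent of the query. It is a sketch under assumptions: `ActiveOrderQuerySketch`, `Row` and the sample data are hypothetical, and the stream pipeline only mirrors what the wrapper appears to express (match the assignee, keep the four active statuses, apply the `id <>` clause only when `excludeOrderId` is non-null, sort by creation time ascending). It does not touch the database or the real mapper.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.Comparator;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the query; the real mapper runs this as SQL.
class ActiveOrderQuerySketch {
    enum Status { DISPATCHED, CONFIRMED, ARRIVED, PAUSED, COMPLETED, CANCELLED }

    static final class Row {
        final long id;
        final long assigneeId;
        final Status status;
        final LocalDateTime createTime;

        Row(long id, long assigneeId, Status status, LocalDateTime createTime) {
            this.id = id;
            this.assigneeId = assigneeId;
            this.status = status;
            this.createTime = createTime;
        }
    }

    static final Set<Status> ACTIVE =
            EnumSet.of(Status.DISPATCHED, Status.CONFIRMED, Status.ARRIVED, Status.PAUSED);

    static List<Row> selectActiveByAssignee(List<Row> table, long assigneeId, Long excludeOrderId) {
        return table.stream()
                .filter(r -> r.assigneeId == assigneeId)                         // .eq(assigneeId)
                .filter(r -> ACTIVE.contains(r.status))                          // .in(status, ...)
                .filter(r -> excludeOrderId == null || r.id != excludeOrderId)   // conditional .ne(id)
                .sorted(Comparator.comparing((Row r) -> r.createTime))           // .orderByAsc(createTime)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        LocalDateTime base = LocalDateTime.of(2024, 1, 1, 8, 0);
        List<Row> table = Arrays.asList(
                new Row(1, 7, Status.DISPATCHED, base.plusHours(2)),
                new Row(2, 7, Status.ARRIVED, base.plusHours(1)),
                new Row(3, 7, Status.COMPLETED, base),
                new Row(4, 8, Status.PAUSED, base));

        // No exclusion: both active orders of assignee 7, oldest first.
        for (Row r : selectActiveByAssignee(table, 7, null)) {
            System.out.println(r.id);   // prints 2, then 1
        }
        // Excluding order 1 (the order that triggered the dispatch) leaves only order 2.
        System.out.println(selectActiveByAssignee(table, 7, 1L).size());   // prints 1
    }
}
```

Passing the triggering order as `excludeOrderId` matters when that order's status has not yet been persisted as terminal at the time of the check; without the exclusion it could count against its own assignee and block dispatch.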