- settings: batch_size=4 -> 1
- tensorrt_engine: BATCH_SIZE=4 -> 1
- preprocessor: remove the padding logic and use batch=1 directly
- Preprocessing latency drops from 17ms to 5ms
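
The preprocessor change can be pictured roughly as below. This is only a sketch, not the actual code: `preprocess_padded`, `preprocess_single`, `normalize`, and the `Frame` type are hypothetical names, and the zero-padding and `/ 255.0` scaling are assumptions about what the old path did.

```python
from typing import List

Frame = List[float]  # hypothetical: one flattened, already-decoded image


def normalize(frame: Frame) -> Frame:
    """Scale 0..255 pixel values into 0..1 (assumed normalization)."""
    return [px / 255.0 for px in frame]


def preprocess_padded(frames: List[Frame], batch_size: int = 4) -> List[Frame]:
    """Assumed old path: normalize, then zero-pad up to a fixed batch size."""
    if not frames or len(frames) > batch_size:
        raise ValueError("expected between 1 and batch_size frames")
    batch = [normalize(f) for f in frames]
    width = len(frames[0])
    # Fill the unused slots with all-zero frames so the batch shape is fixed.
    batch.extend([0.0] * width for _ in range(batch_size - len(batch)))
    return batch


def preprocess_single(frame: Frame) -> List[Frame]:
    """Assumed new path: no padding, a batch holding exactly one frame."""
    return [normalize(frame)]
```

With a fixed batch of one there are no filler frames to allocate or copy, which is the kind of work the padding removal drops.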