Shipping an anomaly model: latency, drift, and the operat

Aior · Apr 30, 2026

The deployment is 70% of the project

The model is the part everyone wants to talk about. The deployment is the part that determines whether the project survives. Below is the runbook we've converged on after a few dozen anomaly stations.

Latency budgets that actually exist on a line

Continuous flow, 200 mm/s, 100 mm part pitch → 500 ms part-to-part → model + IO budget < 200 ms.
Indexing station, 1 part/sec → ~700 ms budget after camera read.
Sample inspection, manual loading → 2-5 s budget, comfortable.

The budget covers the whole path: camera read, network transfer, preprocessing, inference, post-processing, threshold check, and result publish.

Practical numbers: PatchCore on a CPU is 200-500 ms at 512x512. EfficientAD on an RTX A2000 is 8-15 ms. FastFlow at INT8 on a Hailo-8 is 5-12 ms. Pick the model to fit the budget.
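
A quick way to sanity-check a station against its budget is to write the arithmetic down. The sketch below reproduces the continuous-flow example (200 mm/s belt, 100 mm pitch) and sums per-stage latencies against the 200 ms budget. The stage timings are made-up placeholders, not measurements, and the function names are just for illustration.

```python
def part_interval_ms(belt_speed_mm_s: float, part_pitch_mm: float) -> float:
    """Time between consecutive parts passing the camera, in milliseconds."""
    return part_pitch_mm / belt_speed_mm_s * 1000.0


def fits_budget(stage_ms: dict[str, float], budget_ms: float) -> tuple[float, bool]:
    """Sum per-stage latencies and compare the total against the budget."""
    total = sum(stage_ms.values())
    return total, total < budget_ms


# Hypothetical per-stage timings in ms. Replace with your own measurements,
# ideally worst-case or high-percentile values rather than averages.
stages = {
    "camera_read": 30.0,
    "network_transfer": 10.0,
    "preprocessing": 8.0,
    "inference": 15.0,
    "postprocessing": 5.0,
    "threshold_check": 1.0,
    "result_publish": 5.0,
}

interval = part_interval_ms(belt_speed_mm_s=200.0, part_pitch_mm=100.0)
total, ok = fits_budget(stages, budget_ms=200.0)
print(f"part-to-part {interval:.0f} ms, pipeline {total:.0f} ms, fits: {ok}")
```

Swapping the inference entry for a CPU PatchCore figure in the 200-500 ms range makes the check fail immediately, which is the point of doing the sum before choosing the model.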

Hot model reload, without pausing the line

The system has to swap models without dropping frames:

The model loader runs in a separate process.
The new model loads and warms up on shadow data while the old one keeps serving.
Atomic pointer swap once the new model is ready.
The old model is unloaded in the next idle window.
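
The swap sequence above can be sketched in a few lines. This is a simplified, single-process illustration: it uses a background thread instead of a separate loader process, and the "models" are plain callables. `ModelSlot`, `load_and_swap` and the toy loader are hypothetical names, not part of any library.

```python
import threading
from typing import Callable, Sequence

Model = Callable[[float], float]  # stand-in type: frame -> anomaly score


class ModelSlot:
    """Holds the live model; swap() replaces it with one reference assignment."""

    def __init__(self, model: Model) -> None:
        self._model = model
        self._lock = threading.Lock()
        self.retired: list[Model] = []  # old models waiting for an idle window

    def predict(self, frame: float) -> float:
        model = self._model  # take one reference so a swap cannot split a call
        return model(frame)

    def swap(self, new_model: Model) -> None:
        with self._lock:  # serialise concurrent swaps
            self.retired.append(self._model)
            self._model = new_model

    def unload_retired(self) -> int:
        """Call from an idle window: drop references to the old models."""
        with self._lock:
            count = len(self.retired)
            self.retired.clear()
        return count


def load_and_swap(slot: ModelSlot, loader: Callable[[], Model],
                  shadow_frames: Sequence[float]) -> None:
    new_model = loader()          # slow part, runs off the serving path
    for frame in shadow_frames:   # warm-up on shadow data; outputs are discarded
        new_model(frame)
    slot.swap(new_model)          # only now does live traffic see the new model


slot = ModelSlot(lambda x: x * 1.0)                 # "old" model
before = slot.predict(1.0)
worker = threading.Thread(
    target=load_and_swap,
    args=(slot, lambda: (lambda x: x * 2.0), [0.1, 0.2]),
)
worker.start()
worker.join()
after = slot.predict(1.0)                           # served by the "new" model
freed = slot.unload_retired()                       # next idle window
```

With a real separate loader process the idea is the same, but the hand-over would go through shared memory or a second serving endpoint rather than a Python reference.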

Drift detection, three signals

Score distribution drift: rolling 7-day quantiles on accepted parts. Alert when the 95th percentile moves more than 2σ from baseline.
Embedding-space drift: sample 10k recent good-part embeddings and compare them with the calibration set using MMD or KL divergence. Alert above a set threshold.
Operator override rate: per shift, per camera. Anything trending up gets investigated.
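
For the score signal, one possible reading of "2σ from baseline" is: keep a history of daily 95th-percentile scores from a known-good period, then compare the 95th percentile of the current rolling window against the mean and standard deviation of that history. That reading is an assumption, as are the function names; the sketch uses only the standard library.

```python
import statistics
from typing import Sequence


def p95(scores: Sequence[float]) -> float:
    """95th percentile: the last of the 19 cut points returned for n=20."""
    return statistics.quantiles(scores, n=20)[-1]


def score_drift_alert(baseline_daily_p95: Sequence[float],
                      window_scores: Sequence[float],
                      n_sigma: float = 2.0) -> bool:
    """True when the window's p95 is more than n_sigma from the baseline mean."""
    mean = statistics.fmean(baseline_daily_p95)
    sigma = statistics.stdev(baseline_daily_p95)
    return abs(p95(window_scores) - mean) > n_sigma * sigma


# Hypothetical baseline: daily p95 of accepted-part scores during calibration.
baseline = [0.40, 0.42, 0.41, 0.39, 0.43, 0.40, 0.41]

stable_window = [0.30 + 0.001 * i for i in range(100)]   # scores 0.300 .. 0.399
shifted_window = [s + 0.10 for s in stable_window]       # e.g. after a lighting change

stable_alert = score_drift_alert(baseline, stable_window)
shifted_alert = score_drift_alert(baseline, shifted_window)
```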

All three are required. Score drift catches lighting changes; embedding drift catches subtle camera or product changes; override rate catches what the first two miss.
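
For the embedding signal, here is a minimal sketch of a squared-MMD estimate with a Gaussian (RBF) kernel. It is pure Python so it runs anywhere, which makes it far too slow for 10k embeddings; at that size you would vectorise it or subsample. The kernel choice, `gamma`, and the toy 2-D data are assumptions for illustration, and the alert threshold still has to be calibrated per station.

```python
import math
import random
from typing import Sequence

Vector = Sequence[float]


def rbf(x: Vector, y: Vector, gamma: float) -> float:
    """Gaussian kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float) -> float:
    """Average kernel value over all pairs (x, y)."""
    return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd2(recent: Sequence[Vector], calibration: Sequence[Vector],
         gamma: float = 1.0) -> float:
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (mean_kernel(recent, recent, gamma)
            + mean_kernel(calibration, calibration, gamma)
            - 2.0 * mean_kernel(recent, calibration, gamma))


rng = random.Random(0)


def sample(n: int, shift: float) -> list[tuple[float, float]]:
    """Toy 2-D 'embeddings' drawn from a Gaussian centred at (shift, shift)."""
    return [(rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)) for _ in range(n)]


calibration_set = sample(100, shift=0.0)
same_distribution = sample(100, shift=0.0)
drifted = sample(100, shift=2.0)

mmd_same = mmd2(same_distribution, calibration_set)
mmd_drifted = mmd2(drifted, calibration_set)
print(f"no drift: {mmd_same:.3f}, drifted: {mmd_drifted:.3f}")
```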

The operator UI nobody plans for

The model flags, the operator decides. The UI between them determines trust.

What works:

Single colour-coded anomaly score (green/yellow/red) — not a probability.
Heatmap on click: for investigation, not shown by default.
"Why did this get this score?" button: shows the most anomalous patch next to a similar good-patch reference.
Override + reason code — feeds retraining.
No "confidence" displayed — just score and threshold.
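
As a rough illustration of the first and fourth points, the sketch below maps a raw score to a colour band relative to the reject threshold and records overrides with a reason code, from which a per-shift, per-camera override rate can be computed. The yellow band starting at 80% of the threshold, the field names, and the reason codes are all assumptions, not anything prescribed above.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Band(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def score_band(score: float, threshold: float, warn_fraction: float = 0.8) -> Band:
    """Map a raw anomaly score to a colour band relative to the reject threshold."""
    if score >= threshold:
        return Band.RED
    if score >= warn_fraction * threshold:  # assumed warning zone below threshold
        return Band.YELLOW
    return Band.GREEN


@dataclass(frozen=True)
class Override:
    part_id: str
    camera_id: str
    shift: str
    score: float
    threshold: float
    model_decision: str      # "accept" or "reject"
    operator_decision: str   # what the operator chose instead
    reason_code: str         # hypothetical codes, e.g. "GLARE", "DUST", "REAL_DEFECT"


def override_rate(overrides: list[Override],
                  inspected: dict[tuple[str, str], int]) -> dict[tuple[str, str], float]:
    """Overrides divided by inspected parts, keyed by (shift, camera_id)."""
    counts = Counter((o.shift, o.camera_id) for o in overrides)
    return {key: counts[key] / n for key, n in inspected.items() if n > 0}


log = [Override("P-001", "cam1", "A", 1.2, 1.0, "reject", "accept", "GLARE")]
rates = override_rate(log, {("A", "cam1"): 200, ("A", "cam2"): 150})
```

The override log doubles as labelled data for retraining, since each record keeps the score, the threshold in force, and the operator's reason.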

Fallback to rule-based

Every anomaly station has a parallel rule-based reject path. Dimension, colour, missing component — deterministic checks. ML catches what rules miss; rules catch what ML gets confidently wrong.
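
One simple way to combine the two paths is to reject when either one flags the part and to record which path fired. The sketch below assumes that OR logic; the part fields, tolerance values, and rule names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Part:
    width_mm: float
    mean_hue: float
    component_count: int
    anomaly_score: float


Rule = Callable[[Part], bool]  # returns True when the part violates the rule

# Hypothetical deterministic checks with made-up tolerances.
RULES: dict[str, Rule] = {
    "dimension": lambda p: not (49.5 <= p.width_mm <= 50.5),
    "colour": lambda p: not (100.0 <= p.mean_hue <= 130.0),
    "missing_component": lambda p: p.component_count < 4,
}


def decide(part: Part, threshold: float) -> tuple[str, list[str]]:
    """Reject if any rule or the ML score flags the part; keep the reasons."""
    reasons = [name for name, violated in RULES.items() if violated(part)]
    if part.anomaly_score >= threshold:
        reasons.append("ml_anomaly")
    return ("reject" if reasons else "accept"), reasons


good = decide(Part(50.0, 115.0, 4, 0.2), threshold=1.0)
ml_miss = decide(Part(50.0, 115.0, 3, 0.2), threshold=1.0)   # low score, part missing
rule_miss = decide(Part(50.0, 115.0, 4, 1.4), threshold=1.0)  # rules pass, ML flags
```

Logging the reasons also shows how often each path is the only one catching a reject, which is a useful input when tuning either side.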

What's your deployment stack?
