The thing nobody tells you about OCR
99 % accuracy sounds great on a slide. On a line that produces 60 000 parts a day, 99 % means 600 misreads a day. If each misread costs an operator 30 seconds to clear, that's five hours of lost throughput every day. We've watched a customer pull an OCR station out of a line on day three because of exactly this math.
The real bar for line-side OCR is closer to 99.95 %, and you don't get there with a model alone. You get there with a system.
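The arithmetic is worth rerunning with your own line's numbers. A minimal sketch, using the figures quoted above as inputs; the function name and signature are illustrative only:

```python
def misread_cost(parts_per_day: int, accuracy: float, seconds_per_misread: float):
    """Return (misreads per day, operator hours spent clearing them per day)."""
    misreads = parts_per_day * (1.0 - accuracy)
    hours_lost = misreads * seconds_per_misread / 3600.0
    return misreads, hours_lost


# Figures from the text: 60 000 parts a day, 30 seconds to clear each misread.
for accuracy in (0.99, 0.9995):
    misreads, hours = misread_cost(60_000, accuracy, 30)
    print(f"{accuracy:.2%}: {misreads:.0f} misreads/day, {hours:.2f} h lost")
```

At 99.95 % the same line sees about 30 misreads and roughly 15 minutes of clearing time, which is the difference between a station that stays in the line and one that gets pulled.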
Reading paper labels vs reading metal
The difference between the two is wider than people think:
- Paper labels: high contrast, consistent illumination, predictable font, repeatable position. A vanilla OCR model gets you 99.8 % out of the box.
- Laser-etched serials on metal: low contrast, specular highlights, inconsistent depth, occasional cutting fluid. A vanilla model gets you 85 % on a good day.
For the second case, the model is maybe 30 % of the solution. The other 70 % is lighting, framing, and post-validation.
The four engines we keep around
- Tesseract — fine for clean printed text. Don't fight it on metal.
- EasyOCR — surprisingly good zero-shot baseline, slow on CPU, fast enough on a 3060.
- PaddleOCR — our default for multilingual labels, especially when CJK characters or mixed Latin/Arabic show up.
- A fine-tuned CRNN — when accuracy matters more than dev time, and we have a few hundred labelled samples from the actual line.
We almost never run more than one engine in production. We do run all four during the feasibility phase, on the same images, side by side. The cheapest engine that hits 99 % on the test set is the right one.
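The side-by-side comparison can be sketched as a small harness. This is an illustration, not our actual tooling: each engine is assumed to be wrapped as a function from image path to decoded string, the cost figures are whatever relative measure you care about (latency, hardware, dev time), and `engine_a` / `engine_b` are stand-in stubs rather than real OCR calls.

```python
from typing import Callable, Optional, Sequence, Tuple

Reader = Callable[[str], str]  # image path -> decoded text (assumed wrapper shape)


def accuracy(read: Reader, samples: Sequence[Tuple[str, str]]) -> float:
    """Fraction of (image_path, expected_text) samples read exactly right."""
    hits = sum(1 for path, expected in samples if read(path) == expected)
    return hits / len(samples)


def pick_engine(engines, samples, target: float = 0.99) -> Optional[str]:
    """engines: iterable of (name, relative_cost, reader).

    Returns the name of the cheapest engine that meets the target accuracy
    on the shared test set, or None if no engine qualifies.
    """
    for name, _cost, read in sorted(engines, key=lambda e: e[1]):
        if accuracy(read, samples) >= target:
            return name
    return None


# Stand-in test set and stub engines, for illustration only.
samples = [(f"img_{i:03d}.png", f"ABC{i:07d}") for i in range(200)]
truth = dict(samples)


def engine_a(path: str) -> str:
    # Cheap stub: misreads every image whose index ends in 7 (90 % accuracy).
    return "?" if path.endswith("7.png") else truth[path]


def engine_b(path: str) -> str:
    # Expensive stub: reads everything correctly.
    return truth[path]


engines = [("engine_b", 3.0, engine_b), ("engine_a", 1.0, engine_a)]
print(pick_engine(engines, samples))  # prints "engine_b"
```

Exact-match scoring is deliberate here: a serial with one wrong character is a wrong serial, so character-level accuracy would flatter every engine.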
The validation layer that saves you
Every production OCR we ship has at least one of:
- Checksum digit validation: most serial schemes have one. Use it. A misread that fails the checksum becomes a clean reject instead of a wrong-but-confident result, and wrong-but-confident is the failure mode that hurts.
- Regex shape match: "must be 3 letters + 7 digits" catches half of all errors before they leave the camera.
- Database existence check: if the read serial isn't in the production schedule, kick it back.
- Confidence threshold + human review queue: anything below threshold doesn't get accepted; it gets queued for an operator to check. A small queue is fine; silent wrong reads are not.
The pattern that works: never trust the OCR. Always cross-check with at least one independent signal.
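Chained together, the checks above might look like the sketch below. Everything specific in it is an assumption for illustration: the checksum rule (last digit equals the sum of the other six digits modulo 10) is made up, the schedule is a plain in-memory set standing in for a database lookup, and the 0.90 confidence threshold is arbitrary. Substitute your own serial scheme and data source.

```python
import re

# Shape from the example in the text: 3 letters + 7 digits.
SERIAL_SHAPE = re.compile(r"[A-Z]{3}[0-9]{7}")


def checksum_ok(serial: str) -> bool:
    """Hypothetical scheme: last digit == sum of the other six digits mod 10."""
    digits = [int(c) for c in serial[3:]]
    return sum(digits[:-1]) % 10 == digits[-1]


def validate(text: str, confidence: float, scheduled: set,
             min_confidence: float = 0.90) -> str:
    """Classify one OCR read as 'accept', 'review' or 'reject'."""
    if not SERIAL_SHAPE.fullmatch(text):
        return "reject"   # wrong shape: cannot be a valid serial
    if not checksum_ok(text):
        return "reject"   # misread caught by the check digit
    if text not in scheduled:
        return "reject"   # well-formed but not expected on this line
    if confidence < min_confidence:
        return "review"   # plausible but uncertain: queue for an operator
    return "accept"


scheduled = {"ABC1284566"}  # stand-in for the production schedule
print(validate("ABC1284566", 0.98, scheduled))  # accept
print(validate("ABC1234566", 0.98, scheduled))  # reject (8 misread as 3)
print(validate("ABC1284566", 0.60, scheduled))  # review
```

The cheap, deterministic checks run first so that the review queue only receives reads that are structurally plausible.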
Pre-processing tricks that disproportionately help
- Rectify the text to horizontal before OCR — most engines drop accuracy 2–5 % per 10° of rotation.
- Normalise contrast locally, at roughly character scale (CLAHE, not global histogram equalisation).
- If the surface is curved, unwrap it with a calibrated lens map before reading.
These are the unsexy bits that move the dial from 95 % to 99.5 %.
What patterns are you using on metal serials? We're always interested in laser-etched failure cases.