Real-Time Safety Monitoring Pipeline

Computer Vision

MLOps

Author

Suraj Jaiswal

Published

April 1, 2025

An edge-deployed, real-time compliance monitoring system for a chemical factory where workers work on containers of hazardous chemicals. The system monitors several safety use cases simultaneously using a modular deep learning pipeline.

Use Cases

PPE compliance: Detect whether each worker wears a safety vest and a hard hat
Harness & lanyard: Verify that workers on containers are tethered to the ceiling via a harness and lanyard
Container tracking: Detect incoming and outgoing trucks carrying containers and extract ISO numbers via OCR

Method

Component	Role
YOLOX	Person and container detection
MobileNet	Safety vest, hard hat, harness classification
OCSort	Stable multi-object tracking for unique ID assignment
Segmentation	Inclusion/exclusion zone identification
OCR	Parse ISO numbers from container surfaces
Qwen 7B (quantized VLM)	Lanyard detection (thin rope; classical models were insufficient)

Implementation

Dataset Collection:

- Recorded video from live CCTV RTSP streams
- Ran GroundingDINO auto-annotation over videos to extract person and container crops
- Parent person crop used to generate safety vest, hard hat, and harness instances
- Validated and cleaned dataset before training

Inference Flow — PPE:

Video → extract frame → YOLOX (person) → OCSort (tracking)
→ parallel: (a) safety_vest  (b) hard_hat  (c) harness
→ if harness detected → Qwen 7B VLM for lanyard check
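
The parallel classification step in the PPE flow could be sketched roughly as follows. This is a minimal illustration, not the author's implementation: `classify_ppe` and the stub classifiers are hypothetical stand-ins for the MobileNet models, which in the described deployment are served through Triton rather than called in-process.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical signature: takes a person crop, returns True if the item is worn.
Classifier = Callable[[object], bool]


def classify_ppe(crop: object, classifiers: Dict[str, Classifier]) -> Dict[str, bool]:
    """Run every classifier on the same person crop concurrently."""
    workers = max(1, len(classifiers))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(fn, crop) for name, fn in classifiers.items()}
        # result() blocks until the corresponding classifier has finished.
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    # Stub classifiers for illustration only; real ones would wrap model calls.
    stubs = {
        "safety_vest": lambda crop: True,
        "hard_hat": lambda crop: False,
        "harness": lambda crop: True,
    }
    print(classify_ppe("person_crop_placeholder", stubs))
```

Threads suit this step when each classifier call mostly waits on a remote inference request; for CPU-bound in-process models a different executor would likely be a better fit.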

Inference Flow — Container ISO:

Video → extract frames → YOLOX (container) → OCSort (tracking)
→ OCR on container crop → extract ISO number
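
OCR reads of painted container markings can contain misread characters. The write-up does not say how extracted numbers are validated; one option, assuming the ISO numbers are ISO 6346 container codes (four letters, six serial digits, one check digit), is to verify the check digit before accepting a read. The function names below are illustrative.

```python
import re
import string


def _letter_values() -> dict:
    """Map A-Z to ISO 6346 values: start at 10 and skip multiples of 11."""
    values, value = {}, 10
    for letter in string.ascii_uppercase:
        if value % 11 == 0:
            value += 1
        values[letter] = value
        value += 1
    return values


LETTER_VALUES = _letter_values()
CODE_PATTERN = re.compile(r"^[A-Z]{4}[0-9]{7}$")


def normalize(text: str) -> str:
    """Uppercase the OCR text and drop spaces, dashes, and other separators."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def check_digit(first_ten: str) -> int:
    """Compute the check digit from the first ten characters of the code."""
    total = 0
    for position, char in enumerate(first_ten):
        value = LETTER_VALUES[char] if char.isalpha() else int(char)
        total += value * (2 ** position)
    return (total % 11) % 10


def is_valid_container_code(text: str) -> bool:
    """Return True if the text is a well-formed code with a matching check digit."""
    code = normalize(text)
    if not CODE_PATTERN.match(code):
        return False
    return check_digit(code[:10]) == int(code[10])


if __name__ == "__main__":
    print(is_valid_container_code("CSQU 305438 3"))
    print(is_valid_container_code("CSQU 305438 4"))
```

A read that fails the check could be retried on a later frame of the same tracked container instead of being logged.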

The Qwen 7B VLM is called only when a harness is detected, which keeps the number of VLM calls low (each call takes 2–3 seconds).
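
A minimal sketch of this gating logic is shown here. The harness condition comes from the pipeline description; the per-track result cache and its `min_interval_s` setting are assumptions added for illustration, and `vlm_check` is a hypothetical callable standing in for the request to the VLM container.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class LanyardGate:
    # Hypothetical callable: takes a person crop, returns True if a lanyard is attached.
    vlm_check: Callable[[object], bool]
    # Assumed setting: reuse a track's last answer for this many seconds.
    min_interval_s: float = 5.0
    _last: Dict[int, Tuple[float, bool]] = field(
        default_factory=dict, init=False, repr=False
    )

    def check(
        self, track_id: int, crop: object, harness_detected: bool, now: float
    ) -> Optional[bool]:
        """Return the lanyard result, or None when the VLM is skipped entirely."""
        if not harness_detected:
            return None  # no harness, so no VLM call
        cached = self._last.get(track_id)
        if cached is not None and now - cached[0] < self.min_interval_s:
            return cached[1]  # recent answer for this tracked worker
        result = self.vlm_check(crop)
        self._last[track_id] = (now, result)
        return result
```

Keying the cache on the tracker ID relies on stable tracking: if IDs switch often, the cache saves fewer calls.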

Deployment Architecture

Models are served with Triton Server on an edge device, with a separate Docker container per module:

Container	Function
`s3_sync`	Syncs model versions and run configs with AWS S3
`triton_hosting`	Queue-based, asynchronous, scalable processing of model requests
`vlm_inference`	Hosts the quantized Qwen 7B VLM (called conditionally)
`use_case_aggregation`	Runs main inference loop, saves JSON annotation logs
`violation_detection`	Checks intervals: flags a violation if PPE is absent for more than N seconds and pushes it to the frontend DB
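
The interval check performed by `violation_detection` might look roughly like the sketch below. It assumes per-frame observations keyed by tracker ID and PPE item, with timestamps in seconds; the class name, method names, and the choice to flag each continuous absence only once are illustrative. `threshold_s` plays the role of N.

```python
from typing import Dict, Set, Tuple

Key = Tuple[int, str]  # (track_id, PPE item name)


class IntervalViolationDetector:
    def __init__(self, threshold_s: float) -> None:
        self.threshold_s = threshold_s
        self._absent_since: Dict[Key, float] = {}
        self._flagged: Set[Key] = set()

    def update(self, track_id: int, item: str, present: bool, timestamp: float) -> bool:
        """Return True once per continuous absence that exceeds the threshold."""
        key = (track_id, item)
        if present:
            # Item seen again: reset the timer and allow future flags.
            self._absent_since.pop(key, None)
            self._flagged.discard(key)
            return False
        # Remember when this absence started (first missing observation).
        start = self._absent_since.setdefault(key, timestamp)
        if timestamp - start > self.threshold_s and key not in self._flagged:
            self._flagged.add(key)
            return True
        return False


if __name__ == "__main__":
    detector = IntervalViolationDetector(threshold_s=3.0)
    for t, present in [(0.0, False), (2.0, False), (4.0, False), (5.0, False)]:
        print(t, detector.update(7, "hard_hat", present, t))
```

Requiring a sustained absence means a single misclassified frame does not raise a violation, which is the mechanism behind the false-positive reduction the interval approach is meant to provide.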

Outcomes

Detected safety vest, hard hat, harness, and lanyard compliance in real time
Container ISO numbers accurately extracted via OCR
Interval-based violation detection reduced false positives
Modular pipeline handled multiple workers and containers simultaneously with stable tracking

Tech Stack: Python, YOLOX, MobileNet, OCSort, GroundingDINO, Qwen 7B, Triton Server, Docker, AWS S3, AWS IoT Core, OCR

Conclusions

A modular, edge-deployed pipeline combining detection (YOLOX), tracking (OCSort), classification (MobileNet), and VLM-based reasoning (Qwen 7B) proved effective for real-time, multi-use-case industrial compliance monitoring. Running the VLM only when a harness is detected reduced latency and API costs without compromising accuracy.

Known limitations: Lanyard detection is challenging because the lanyard is thin and flexible. VLM inference latency (2–3 s per call) is a bottleneck. Performance degrades in poor lighting or with suboptimal CCTV placement.