<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Suraj Jaiswal</title>
<link>https://jaiswalsuraj487.github.io/projects/</link>
<atom:link href="https://jaiswalsuraj487.github.io/projects/index.xml" rel="self" type="application/rss+xml"/>
<description>Senior AI Engineer specializing in GenAI, NLP, and Computer Vision. Published at NeurIPS and ACM. Open to ML/AI Engineer roles.</description>
<generator>quarto-1.6.33</generator>
<lastBuildDate>Mon, 31 Mar 2025 18:30:00 GMT</lastBuildDate>
<item>
  <title>Real-Time Safety Monitoring Pipeline</title>
  <dc:creator>Suraj Jaiswal</dc:creator>
  <link>https://jaiswalsuraj487.github.io/projects/data/safety_monitoring_pipeline.html</link>
  <description><![CDATA[ 




<p>An edge-deployed, real-time compliance monitoring system for a chemical factory where workers operate on hazardous chemical containers. The system enforces multiple safety use cases simultaneously using a modular deep learning pipeline.</p>
<hr>
<section id="use-cases" class="level2">
<h2 class="anchored" data-anchor-id="use-cases">Use Cases</h2>
<ul>
<li><strong>PPE compliance:</strong> Detect whether workers are wearing a safety vest and hard hat</li>
<li><strong>Harness &amp; lanyard:</strong> Verify that workers on containers are tethered to the ceiling via a harness and lanyard</li>
<li><strong>Container tracking:</strong> Detect ingoing/outgoing trucks with containers and extract ISO numbers via OCR</li>
</ul>
<hr>
</section>
<section id="method" class="level2">
<h2 class="anchored" data-anchor-id="method">Method</h2>
<table class="caption-top table">
<colgroup>
<col style="width: 50%">
<col style="width: 50%">
</colgroup>
<thead>
<tr class="header">
<th>Component</th>
<th>Role</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>YOLOX</strong></td>
<td>Person and container detection</td>
</tr>
<tr class="even">
<td><strong>MobileNet</strong></td>
<td>Safety vest, hard hat, harness classification</td>
</tr>
<tr class="odd">
<td><strong>OCSort</strong></td>
<td>Stable multi-object tracking for unique ID assignment</td>
</tr>
<tr class="even">
<td><strong>Segmentation</strong></td>
<td>Inclusion/exclusion zone identification</td>
</tr>
<tr class="odd">
<td><strong>OCR</strong></td>
<td>Parse ISO numbers from container surfaces</td>
</tr>
<tr class="even">
<td><strong>Qwen 7B (quantized VLM)</strong></td>
<td>Lanyard detection (thin rope — classical models insufficient)</td>
</tr>
</tbody>
</table>
<hr>
</section>
<section id="implementation" class="level2">
<h2 class="anchored" data-anchor-id="implementation">Implementation</h2>
<p><strong>Dataset Collection:</strong></p>
<ul>
<li>Recorded video from live CCTV RTSP streams</li>
<li>Ran GroundingDINO auto-annotation over the videos to extract person and container crops</li>
<li>Used each parent person crop to generate safety vest, hard hat, and harness instances</li>
<li>Validated and cleaned the dataset before training</li>
</ul>
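<p>The auto-annotation pass can be sketched as a single harvesting loop. This is a minimal illustration: <code>detect</code> stands in for GroundingDINO (its real API is not shown here), and representing a crop as a <code>(frame, box)</code> pair is an assumption for the sketch.</p>

```python
def harvest_crops(frames, detect, classes=("person", "container")):
    # Auto-annotation pass: run an open-vocabulary detector over sampled
    # frames and bucket the resulting boxes by class for later manual
    # validation. ``detect`` is an injected stand-in for GroundingDINO.
    crops = {c: [] for c in classes}
    for frame in frames:
        for cls, box in detect(frame, classes):
            if cls in crops:
                crops[cls].append((frame, box))
    return crops

# Stub detector returning one fake person box per frame:
stub = lambda frame, classes: [("person", (0, 0, 64, 128))]
crops = harvest_crops(["frame_0", "frame_1"], stub)
```

<p>The person crops collected this way then seed the child datasets (vest, hard hat, harness) before manual cleaning.</p>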
<p><strong>Inference Flow — PPE:</strong></p>
<pre><code>Video → extract frame → YOLOX (person) → OCSort (tracking)
→ parallel: (a) safety_vest  (b) hard_hat  (c) harness
→ if harness detected → Qwen 7B VLM for lanyard check</code></pre>
<p><strong>Inference Flow — Container ISO:</strong></p>
<pre><code>Video → extract frames → YOLOX (container) → OCSort (tracking)
→ OCR on container crop → extract ISO number</code></pre>
<blockquote class="blockquote">
<p>Qwen 7B VLM is called <strong>only when harness is detected</strong> to minimize API calls (each VLM call takes 2–3 seconds).</p>
</blockquote>
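<p>The conditional gating described above can be sketched per tracked person. All callables here are illustrative stand-ins for the MobileNet heads and the Qwen 7B call, not the production API:</p>

```python
def check_worker(crop, classifiers, vlm_lanyard_check):
    # Run the three PPE classifiers on a tracked person crop, then invoke
    # the slow (2-3 s) VLM lanyard check only when a harness is present.
    results = {name: clf(crop) for name, clf in classifiers.items()}
    # Gate the expensive VLM call on the harness classifier's output.
    results["lanyard"] = vlm_lanyard_check(crop) if results["harness"] else None
    return results

# Stub predictors in place of the MobileNet heads and Qwen 7B:
stubs = {"safety_vest": lambda c: True,
         "hard_hat": lambda c: True,
         "harness": lambda c: False}
out = check_worker("person_crop", stubs, lambda c: True)
# With no harness detected, "lanyard" stays None: the VLM is never invoked.
```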
<hr>
</section>
<section id="deployment-architecture" class="level2">
<h2 class="anchored" data-anchor-id="deployment-architecture">Deployment Architecture</h2>
<p>Hosted on <strong>Triton Server</strong> with separate Docker containers per module on an edge device:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 50%">
<col style="width: 50%">
</colgroup>
<thead>
<tr class="header">
<th>Container</th>
<th>Function</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><code>s3_sync</code></td>
<td>Syncs model versions and run configs with AWS S3</td>
</tr>
<tr class="even">
<td><code>triton_hosting</code></td>
<td>Queue-based, asynchronous, and scalable model request processing</td>
</tr>
<tr class="odd">
<td><code>vlm_inference</code></td>
<td>Hosts the quantized Qwen 7B VLM (called conditionally)</td>
</tr>
<tr class="even">
<td><code>use_case_aggregation</code></td>
<td>Runs main inference loop, saves JSON annotation logs</td>
</tr>
<tr class="odd">
<td><code>violation_detection</code></td>
<td>Checks intervals — flags violations if PPE absent &gt; N seconds, pushes to frontend DB</td>
</tr>
</tbody>
</table>
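<p>The interval-based check in <code>violation_detection</code> can be sketched as a small stateful tracker. The class and threshold names below are illustrative, not the production module:</p>

```python
class ViolationChecker:
    # Flag a tracked worker only after PPE has been absent for N consecutive
    # seconds, suppressing single-frame false positives.
    def __init__(self, threshold_s=5.0):
        self.threshold_s = threshold_s
        self.absent_since = {}  # track_id -> timestamp when PPE went missing

    def update(self, track_id, ppe_present, t):
        # Returns True when the absence interval exceeds the threshold.
        if ppe_present:
            self.absent_since.pop(track_id, None)  # reset on compliance
            return False
        start = self.absent_since.setdefault(track_id, t)
        return (t - start) >= self.threshold_s

vc = ViolationChecker(threshold_s=5.0)
vc.update(1, False, 0.0)   # absence starts; not yet a violation
vc.update(1, False, 6.0)   # absent for 6 s > 5 s -> flagged
```

<p>In the deployed system, a flagged interval would then be pushed to the frontend DB as described in the table above.</p>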
<hr>
</section>
<section id="outcomes" class="level2">
<h2 class="anchored" data-anchor-id="outcomes">Outcomes</h2>
<ul>
<li>Detected safety vest, hard hat, harness, and lanyard compliance in real time</li>
<li>Accurately extracted container ISO numbers via OCR</li>
<li>Interval-based violation detection reduced false positives</li>
<li>Modular pipeline handled multiple workers and containers simultaneously with stable tracking</li>
</ul>
<p><strong>Tech Stack:</strong> <code>Python</code> <code>YOLOX</code> <code>MobileNet</code> <code>OCSort</code> <code>GroundingDINO</code> <code>Qwen 7B</code> <code>Triton Server</code> <code>Docker</code> <code>AWS S3</code> <code>AWS IoT Core</code> <code>OCR</code></p>
<hr>
</section>
<section id="conclusions" class="level2">
<h2 class="anchored" data-anchor-id="conclusions">Conclusions</h2>
<p>A modular, edge-deployed pipeline combining detection (YOLOX), tracking (OCSort), classification (MobileNet), and VLM-based reasoning (Qwen 7B) is highly effective for real-time multi-use-case industrial compliance. Conditional VLM inference significantly reduced latency and API costs without compromising accuracy.</p>
<p><strong>Known limitations:</strong> Lanyard detection is challenging due to its thin, flexible nature. VLM inference latency (2–3s/call) is a bottleneck. Performance degrades in poor lighting or suboptimal CCTV placement.</p>


</section>

 ]]></description>
  <category>Computer Vision</category>
  <category>MLOps</category>
  <guid>https://jaiswalsuraj487.github.io/projects/data/safety_monitoring_pipeline.html</guid>
  <pubDate>Mon, 31 Mar 2025 18:30:00 GMT</pubDate>
</item>
<item>
  <title>LLM-Based Code Evaluator</title>
  <dc:creator>Suraj Jaiswal</dc:creator>
  <link>https://jaiswalsuraj487.github.io/projects/data/llm_code_evaluator.html</link>
  <description><![CDATA[ 




<p>Built an automated code evaluation system using LLMs to assess code quality, correctness, and adherence to requirements — replacing slow manual review workflows.</p>
<p><strong>Key Highlights:</strong></p>
<ul>
<li><strong>3× throughput improvement</strong> over manual evaluation baseline</li>
<li><strong>90% reduction</strong> in manual review time</li>
<li>Multi-threaded Streamlit UI enabling parallel evaluation of multiple submissions</li>
</ul>
<p><strong>Technical Approach:</strong></p>
<ul>
<li>OpenAI API for LLM-based code assessment with structured prompt engineering</li>
<li>Multi-threaded execution for parallel evaluation runs</li>
<li>Configurable scoring rubrics per evaluation criteria (correctness, style, efficiency)</li>
<li>Streamlit frontend for reviewers to inspect and override evaluations</li>
</ul>
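<p>The multi-threaded fan-out can be sketched with <code>concurrent.futures</code>. The scorer callable below is a stub standing in for the real OpenAI API call and rubric prompt:</p>

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_submissions(submissions, evaluate_one, max_workers=8):
    # Score many submissions concurrently; ThreadPoolExecutor.map preserves
    # input order, so results line up with submissions. ``evaluate_one``
    # would wrap the LLM call and rubric prompt in the real system.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate_one, submissions))

# Stub scorer in place of the LLM-backed evaluator:
scores = evaluate_submissions(["sub_a", "sub_b"],
                              lambda s: {"id": s, "score": 1})
```

<p>Because the work is I/O-bound (API latency), threads are enough here; no process pool is needed.</p>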
<p><strong>Tech Stack:</strong> <code>Python</code> <code>OpenAI API</code> <code>Streamlit</code> <code>Prompt Engineering</code> <code>Multi-threading</code></p>
<p><strong>Impact:</strong> Scaled code review capacity by 3× while cutting reviewer time by 90%, enabling faster hiring and assessment pipelines.</p>



 ]]></description>
  <category>GenAI</category>
  <category>NLP</category>
  <guid>https://jaiswalsuraj487.github.io/projects/data/llm_code_evaluator.html</guid>
  <pubDate>Fri, 28 Feb 2025 18:30:00 GMT</pubDate>
</item>
<item>
  <title>GenAI Bill-of-Materials Automation Pipeline</title>
  <dc:creator>Suraj Jaiswal</dc:creator>
  <link>https://jaiswalsuraj487.github.io/projects/data/genai_bom_pipeline.html</link>
  <description><![CDATA[ 




<p>Designed and deployed a GenAI pipeline at <a href="https://www.tigeranalytics.com/">Tiger Analytics</a> to automate Bill-of-Materials (BoM) generation for lighting parts — replacing a largely manual, time-intensive process.</p>
<p><strong>Problem:</strong> Generating structured BoMs for lighting components required engineers to manually extract and organize part specifications from unstructured documents, taking significant time per product.</p>
<p><strong>Solution:</strong></p>
<ul>
<li>LLM-based structured extraction pipeline using <strong>AWS Bedrock</strong> to parse product datasheets and specifications</li>
<li>Experiment tracking and model versioning with <strong>MLflow</strong></li>
<li>Containerized with <strong>Docker</strong> for reproducible, production-ready deployment</li>
<li>Output structured BoMs with part names, specifications, quantities, and supplier info</li>
</ul>
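<p>The post-processing step of structured extraction can be sketched as validating the model's JSON reply against the expected BoM schema. The field names below are illustrative; the production schema and the Bedrock call itself are not shown here:</p>

```python
import json

BOM_FIELDS = ("part_name", "specification", "quantity", "supplier")

def parse_bom_response(llm_output: str) -> list:
    # Parse the model's JSON reply into BoM rows, keeping only the expected
    # fields and silently dropping malformed entries so one bad row does not
    # poison the whole BoM.
    rows = []
    for entry in json.loads(llm_output):
        if all(field in entry for field in BOM_FIELDS):
            rows.append({f: entry[f] for f in BOM_FIELDS})
    return rows

raw = ('[{"part_name": "LED driver", "specification": "24V 30W", '
       '"quantity": 2, "supplier": "Acme"}]')
rows = parse_bom_response(raw)
```

<p>Enforcing the schema after generation is what keeps the output consistent across PDFs, HTML specs, and Excel sheets.</p>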
<p><strong>Key Results:</strong></p>
<ul>
<li><strong>80% reduction</strong> in manual BoM generation time</li>
<li>Consistent structured output across diverse document formats (PDFs, HTML specs, Excel sheets)</li>
</ul>
<p><strong>Tech Stack:</strong> <code>Python</code> <code>AWS Bedrock</code> <code>MLflow</code> <code>Docker</code> <code>Prompt Engineering</code> <code>Structured Extraction</code></p>
<p><strong>Impact:</strong> Freed up engineering time from repetitive data entry, enabling faster product configuration and procurement workflows.</p>



 ]]></description>
  <category>GenAI</category>
  <category>MLOps</category>
  <guid>https://jaiswalsuraj487.github.io/projects/data/genai_bom_pipeline.html</guid>
  <pubDate>Fri, 31 Jan 2025 18:30:00 GMT</pubDate>
</item>
<item>
  <title>Medical RAG System for Automated Billing Code Generation</title>
  <dc:creator>Suraj Jaiswal</dc:creator>
  <link>https://jaiswalsuraj487.github.io/projects/data/medical_rag_coding.html</link>
  <description><![CDATA[ 




<p>Built a Retrieval-Augmented Generation (RAG) system at <a href="https://www.linkedin.com/company/neuroreef-labs/">NeuroReef Labs</a> to automate medical billing code generation from clinical notes and patient visit records.</p>
<p><strong>Problem:</strong> Medical coders manually assign ICD-10, CPT, SNOMED, and HCC codes to every patient visit — a slow, error-prone process that delays billing and reimbursement.</p>
<p><strong>Solution:</strong></p>
<ul>
<li>RAG pipeline over clinical documentation to retrieve relevant coding guidelines and past examples</li>
<li>LLM generates billing codes grounded in retrieved context, reducing hallucinations</li>
<li><strong>HyDE (Hypothetical Document Embeddings)</strong> used in a medical chatbot over Athena EHR data for improved retrieval quality</li>
<li>Applied <strong>few-shot prompt tuning</strong> for medical guideline-consistent outputs</li>
</ul>
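<p>The HyDE step can be sketched in a few lines: embed an LLM-generated hypothetical answer instead of the raw query, then search with that embedding. <code>generate</code>, <code>embed</code>, and <code>index</code> are injected stand-ins (the toy demo uses string length as a scalar "embedding"), not the production stack:</p>

```python
def hyde_retrieve(query, generate, embed, index, k=1):
    # HyDE: a hypothetical answer is usually closer in embedding space to
    # the real guideline documents than the short query itself.
    hypothetical_doc = generate(query)  # e.g. a plausible clinical note
    return index.search(embed(hypothetical_doc), k)

# Toy vector store: nearest neighbors by distance between scalar embeddings.
class ToyIndex:
    def __init__(self, docs, embed):
        self.docs = docs
        self.embed = embed

    def search(self, vec, k):
        return sorted(self.docs, key=lambda d: abs(self.embed(d) - vec))[:k]

index = ToyIndex(["aa", "aaaa"], len)
hits = hyde_retrieve("ab", lambda q: q + q, len, index, k=1)
```

<p>In the real pipeline, the retrieved guidelines and past examples then ground the code-generating LLM call.</p>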
<p><strong>Codes Automated:</strong></p>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Code Type</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>ICD-10</td>
<td>Diagnosis classification</td>
</tr>
<tr class="even">
<td>CPT</td>
<td>Procedure billing</td>
</tr>
<tr class="odd">
<td>SNOMED</td>
<td>Clinical terminology</td>
</tr>
<tr class="even">
<td>HCC</td>
<td>Risk adjustment for Medicare</td>
</tr>
</tbody>
</table>
<p><strong>Tech Stack:</strong> <code>Python</code> <code>OpenAI API</code> <code>RAG</code> <code>HyDE</code> <code>HuggingFace</code> <code>Athena EHR</code> <code>Prompt Engineering</code></p>
<p><strong>Impact:</strong> Reduced manual coding effort for medical billing, improving accuracy and turnaround time for reimbursement workflows.</p>



 ]]></description>
  <category>NLP</category>
  <category>GenAI</category>
  <category>Healthcare</category>
  <guid>https://jaiswalsuraj487.github.io/projects/data/medical_rag_coding.html</guid>
  <pubDate>Wed, 31 Jul 2024 18:30:00 GMT</pubDate>
</item>
</channel>
</rss>
