This directory contains the core engineering and operational pipeline code for the Edge AI document-detection system evaluated throughout the research project.
The codebase combines computer vision training workflows, edge deployment orchestration, OCR-triggered processing, and downstream PHI/PII evaluation pipelines in a single operational architecture.
| Directory | Purpose |
|---|---|
| core | Shared orchestration workflows supporting training, inference, and experimental execution |
| training | Training subsystem documentation and workflow architecture |
| inference | Real-time inference subsystem documentation |
| edge_pipeline | GStreamer, DeepStream, and edge runtime deployment workflows |
| evaluation | Evaluation subsystem documentation and metric-generation workflows |
| presidio | OCR and PHI/PII extraction pipelines using Microsoft Presidio |
| utilities | Supporting utility scripts and experimental helper workflows |
Dataset Preparation
↓
YOLOv5 Training
↓
Cross-Validation Evaluation
↓
Edge Deployment
↓
Real-Time Inference
↓
Detection Stability Logic
↓
OCR Extraction
↓
Microsoft Presidio
↓
PHI/PII Triage Workflow
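
The "Detection Stability Logic" stage in the flow above is not specified in detail here. One plausible reading is a debounce: OCR is triggered only after a document has been detected confidently in several consecutive frames. The following is a minimal sketch under that assumption. The class name `StabilityGate` and the parameters `min_consecutive` and `conf_threshold` are hypothetical and are not taken from the repository's scripts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StabilityGate:
    """Fire once after `min_consecutive` confident detections in a row (hypothetical helper)."""

    min_consecutive: int = 5      # assumed frame count, not a value from the project
    conf_threshold: float = 0.5   # assumed confidence cutoff
    _streak: int = field(default=0, init=False, repr=False)
    _fired: bool = field(default=False, init=False, repr=False)

    def update(self, confidence: Optional[float]) -> bool:
        """Feed one frame's best detection confidence (None if nothing was detected).

        Returns True exactly once per stable detection episode.
        """
        if confidence is None or confidence < self.conf_threshold:
            # Streak broken: reset so the next document can trigger again.
            self._streak = 0
            self._fired = False
            return False
        self._streak += 1
        if self._streak >= self.min_consecutive and not self._fired:
            self._fired = True
            return True
        return False


# Example: per-frame confidences, with None meaning "no document in frame".
gate = StabilityGate(min_consecutive=3)
frames = [None, 0.9, 0.8, 0.95, 0.9, None, 0.9]
triggers = [gate.update(c) for c in frames]
```

With this gate, the OCR stage would run only on the frame where `update` returns True, so a document held in view does not trigger repeated OCR passes.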
`core/doc_detector_yolov5_doccorner.py` is the primary orchestration script. It supports:
- dataset preparation
- YOLOv5 training
- five-fold cross-validation
- inference workflows
- metric generation
- visualization export
- optional corner regression
- optional OpenCV-based refinement
- GStreamer runtime execution
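
As an illustration of the five-fold cross-validation step listed above, the sketch below partitions a list of image names into five train/validation splits using only the standard library. It is not the script's actual splitting code. The function name `k_fold_splits`, the seed, and the file names are assumptions made for the example.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def k_fold_splits(
    items: Sequence[str], k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train, val) lists for each of k folds; every item is validated exactly once."""
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the folds reproducible
    # Striding spreads any remainder so fold sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, val


# Example with hypothetical image names.
images = [f"img_{i:03d}.jpg" for i in range(23)]
splits = list(k_fold_splits(images, k=5))
```

Each validation fold is disjoint from its training set, which is the property that keeps per-fold metrics from leaking training data.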
`presidio/pii_phi_pipeline.py` is the primary OCR and PHI/PII extraction workflow. It supports:
- OCR preprocessing
- Tesseract OCR execution
- Microsoft Presidio integration
- PHI/PII entity analysis
- downstream triage workflows
- JSON report generation
- document classification routing
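
To make the triage and JSON report steps above concrete, here is a small sketch that turns a list of entity findings into a JSON summary with a coarse routing decision. It deliberately works on plain `(entity_type, score)` tuples instead of calling Presidio, so it runs without any dependencies. The function name `build_triage_report`, the `min_score` threshold, and the `review`/`pass` routing rule are all assumptions for illustration and may differ from what the pipeline actually does.

```python
import json
from typing import Dict, Iterable, Tuple

Finding = Tuple[str, float]  # (entity_type, confidence score)


def build_triage_report(
    doc_id: str, findings: Iterable[Finding], min_score: float = 0.5
) -> str:
    """Summarise analyzer findings as a JSON report with a coarse routing decision."""
    # Drop low-confidence findings before counting.
    kept = [(etype, score) for etype, score in findings if score >= min_score]
    counts: Dict[str, int] = {}
    for etype, _ in kept:
        counts[etype] = counts.get(etype, 0) + 1
    report = {
        "document": doc_id,
        "entity_counts": counts,
        # Hypothetical routing rule: any retained entity sends the document to review.
        "route": "review" if kept else "pass",
    }
    return json.dumps(report, sort_keys=True)


# Example findings using entity type names in the style Presidio reports.
example = build_triage_report(
    "scan_001", [("PERSON", 0.85), ("US_SSN", 0.3), ("PERSON", 0.6)]
)
```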
`evaluation/make_coco_gt_from_yolo.py` converts YOLO-format labels into COCO-style ground-truth annotations for evaluation.
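
The core of such a conversion is a coordinate change. A YOLO label line stores `class cx cy w h` with the box centre and size normalized to the image dimensions, while a COCO annotation stores `[x_min, y_min, width, height]` in pixels. The sketch below shows that step for a single line. The function name `yolo_line_to_coco_bbox` is hypothetical, and keeping the YOLO class index unchanged as `category_id` is an assumption; the actual script may offset or remap it.

```python
from typing import Dict


def yolo_line_to_coco_bbox(line: str, img_w: int, img_h: int) -> Dict[str, object]:
    """Convert one YOLO label line to a COCO-style annotation fragment."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:])
    # Scale the normalized size to pixels, then shift from centre to top-left corner.
    box_w, box_h = w * img_w, h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return {
        "category_id": class_id,  # assumption: class index carried over unchanged
        "bbox": [x_min, y_min, box_w, box_h],
        "area": box_w * box_h,
        "iscrowd": 0,
    }


# Example: a box centred in a 640x480 image, half as wide and a quarter as tall.
ann = yolo_line_to_coco_bbox("0 0.5 0.5 0.5 0.25", img_w=640, img_h=480)
```

A full ground-truth file would also need the `images` and `categories` sections plus unique annotation and image ids, which are omitted here.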
`utilities/batch_pii_phi_eval_4class.py` is a batch-processing utility for four-class PHI/PII evaluation, confusion-matrix generation, and synthetic evaluation orchestration.
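
For readers unfamiliar with the confusion-matrix step, the sketch below counts true-versus-predicted labels for a four-class problem in plain Python. The four class names are not given in this README, so the labels `A` to `D` are placeholders, and the function names are hypothetical.

```python
from typing import List, Sequence

LABELS = ["A", "B", "C", "D"]  # placeholder names; the real four classes are not listed here


def confusion_matrix(
    y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str]
) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels, in the order of `labels`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Example with made-up labels, not project results.
y_true = ["A", "A", "B", "C", "D", "D"]
y_pred = ["A", "B", "B", "C", "D", "A"]
cm = confusion_matrix(y_true, y_pred, LABELS)
```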
Hardware and software environment:
- NVIDIA Jetson AGX Orin 64GB
- Ubuntu 20.04
- CUDA
- TensorRT
- DeepStream 6.3
- OpenCV
- GStreamer
- YOLOv5
- PyTorch
Engineering focus areas:
- Real-time edge inference
- Computer vision deployment
- Privacy-preserving architectures
- Operational AI orchestration
- Embedded AI infrastructure
- OCR-triggered workflow processing
- PHI/PII triage integration
The code architecture supports dissertation research on whether localized Edge AI computer vision systems can act as upstream privacy-preserving controls, triggering downstream OCR and PHI/PII analysis workflows only after a document detection event occurs.
The repository emphasizes operational AI engineering, embedded computer vision deployment, and privacy-preserving Edge AI infrastructure for healthcare-oriented environments.