I build systems that turn messy real-world text into structure: entity extraction, retrieval-augmented generation, and document understanding. I care about models that ship, that are reproducible, and that hold up on data they have never seen.
Currently: designing a context-aware transformer that fuses sentence-, section-, and document-level reasoning for employer and entity extraction (repo coming soon).
Open to ML / NLP engineering roles.
Stack: Python · PyTorch · Hugging Face Transformers · LangChain · LlamaIndex · AWS Bedrock · Docker · Streamlit
AI Agent Document Analyzer: offline RAG over PDFs with LangChain, LlamaIndex, and Ollama.
PDF Q&A Chatbot: multilingual RAG chatbot on AWS Bedrock, served with Streamlit.
Document Extractor LLM: Dockerized Streamlit app that extracts structured fields from documents with LLMs.
Hybrid Recommendation System: collaborative plus content-based filtering, served through a Flask API and Streamlit UI.
E-Commerce Fraud Detection: XGBoost on 590K transactions, ROC-AUC 0.89, F1 0.76.
Anomaly Detection (thesis): optical flow plus an LSTM to detect heavy-object anomalies in real-time waste-sorting footage.
