vsancnaj/README.md

Valentina Sanchez · ML Engineer · NLP & LLMs · New York

I build systems that turn messy real-world text into structure: entity extraction, retrieval-augmented generation, and document understanding. I care about models that ship, reproduce, and hold up on data they have never seen.

Currently: designing a context-aware transformer that fuses sentence-, section-, and document-level reasoning for employer and entity extraction. (repo coming soon)

Open to ML / NLP engineering roles.

Stack: Python · PyTorch · Hugging Face Transformers · LangChain · LlamaIndex · AWS Bedrock · Docker · Streamlit


Projects

AI Agent Document Analyzer: offline RAG over PDFs with LangChain, LlamaIndex, and Ollama.

PDF Q&A Chatbot: multilingual RAG chatbot on AWS Bedrock, served with Streamlit.

Document Extractor LLM: Dockerized Streamlit app that extracts structured fields from documents with LLMs.

Hybrid Recommendation System: collaborative plus content-based filtering, served through a Flask API and Streamlit UI.

E-Commerce Fraud Detection: XGBoost on 590K transactions, ROC-AUC 0.89, F1 0.76.

Anomaly Detection (thesis): optical flow plus LSTM detecting heavy-object anomalies in real-time waste-sorting footage.


LinkedIn · vsanchezn.cs@gmail.com

Pinned

  1. anomaly_detection_thesis Public

    This repo enhances waste-to-energy processing using LSTM classifiers to analyze high-speed waste motion from cluttered environments. We use optical flow from WasteAnt datasets to detect anomalies t…

    Jupyter Notebook 1

  2. Vesta-E-Commerce-Fraud-Detection Public

    This repository aims to improve fraud detection accuracy by examining transaction patterns and feature correlations.

    Jupyter Notebook 2

  3. AI-Agent-Document-Analyzer Public

    This project is an AI-powered document analysis bot designed to process and extract information from PDF documents.

    Python 1 1

  4. Hybrid-Recommendation-System Public

    A hybrid recommendation system combining Item-Based Collaborative Filtering and Content-Based Filtering to suggest skincare products based on user preferences, product ingredients, and ratings. Fea…

    Jupyter Notebook 4

  5. document-extractor-llm Public

    A Streamlit app using Large Language Models (LLMs) for efficient document parsing and data extraction. Dockerized for easy deployment, leveraging OpenAI, Chroma, and RAG for advanced information re…

    Jupyter Notebook 1

  6. PDF-Question-Answering-Chatbot-with-AWS-Bedrock-and-Streamlit Public

    Python