YuZh98/README.md

Hi, I'm Hugh 👋

Statistics PhD by training, tool builder by compulsion. My papers propose scalable algorithms and prove theorems, and my side projects are tools I build to help other people get their work done faster. I find it hard to leave a solvable problem alone, and whenever I run into repetitive work, I'd rather automate it than let it eat into my time.


Tools I built because I needed them

latex2arxiv: Submit to arXiv without the headache. One command cleans your LaTeX project, catches rejection-causing errors, and walks you through the upload.


Takes any LaTeX project (zip, directory, or git URL) and outputs a submission-ready zip. Prunes unreachable files, strips draft markup and revision commands, normalizes BibTeX, and runs pre-flight checks that surface errors arXiv silently fails on. Pass --guide and it writes a step-by-step upload walkthrough with copy-paste title/authors/abstract. Gate your paper repo on compliance with --dry-run in CI. Also ships as a VS Code extension and an MCP server so AI agents can run the full pipeline without leaving the chat.

Python CLI PyPI Homebrew VS Code GitHub Actions pre-commit MCP
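
To make "prunes unreachable files" concrete, here is a minimal Python sketch of the general idea: start from the main file, follow `\input` and `\include` commands, and treat everything never visited as a pruning candidate. This is an illustration only, not latex2arxiv's actual code. The function name `reachable_tex_files`, the in-memory `project` dict, and the file names are all hypothetical, and the sketch ignores commented-out commands, figures, and bibliography files.

```python
import re

# Hypothetical illustration; not taken from latex2arxiv's source.
# Matches \input{...} and \include{...} and captures the target path.
INCLUDE_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def reachable_tex_files(sources: dict[str, str], main: str) -> set[str]:
    """Return the .tex files reachable from `main` by following includes."""
    seen: set[str] = set()
    stack = [main]
    while stack:
        name = stack.pop()
        # Skip files already visited and targets missing from the project.
        if name in seen or name not in sources:
            continue
        seen.add(name)
        for target in INCLUDE_RE.findall(sources[name]):
            # LaTeX lets authors omit the .tex extension.
            if not target.endswith(".tex"):
                target += ".tex"
            stack.append(target)
    return seen


# Made-up project: file name -> file contents.
project = {
    "main.tex": r"\input{sections/intro} \include{sections/method.tex}",
    "sections/intro.tex": "Introduction text.",
    "sections/method.tex": r"\input{sections/proofs}",
    "sections/proofs.tex": "Proofs.",
    "old_draft.tex": "Unused draft.",
}

keep = reachable_tex_files(project, "main.tex")
prune = sorted(set(project) - keep)
print(prune)
```

In this toy project only `old_draft.tex` is never referenced, so it is the only pruning candidate.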


academic-application-tracker: Local Streamlit dashboard that answers "what do I do today?" for academics juggling dozens of applications, deadlines, and recommendation letters.


Academic job searching is chaos: overlapping deadlines, multiple recommenders per position, and materials checklists that differ by institution. I built a Streamlit dashboard that cuts through it: urgency-banded deadlines, per-position recommender state, a materials readiness panel, an interview log, and auto-computed daily action items. The database auto-exports plaintext markdown backups on every write. 800+ tests at 97% coverage, because I actually use it on my own applications.

Python Streamlit SQLite pytest
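
As a rough picture of what "urgency-banded deadlines" can mean, the sketch below maps each deadline to a band by days remaining. The band names, the 3-day and 14-day thresholds, and the `urgency_band` function are assumptions made for illustration; the tracker's real rules may differ.

```python
from datetime import date


def urgency_band(deadline: date, today: date) -> str:
    """Classify a deadline by how many days remain (hypothetical thresholds)."""
    days_left = (deadline - today).days
    if days_left < 0:
        return "overdue"
    if days_left <= 3:
        return "urgent"
    if days_left <= 14:
        return "soon"
    return "later"


# Made-up deadlines, evaluated against a fixed "today" so the result is reproducible.
today = date(2025, 11, 1)
deadlines = {
    "Position A": date(2025, 10, 30),
    "Position B": date(2025, 11, 3),
    "Position C": date(2025, 11, 12),
    "Position D": date(2026, 1, 15),
}
bands = {name: urgency_band(d, today) for name, d in deadlines.items()}
print(bands)
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the banding logic easy to unit-test.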


python-project-scaffold: Skip the 30-minute setup ritual and start at your first feature commit.


Every new Python project starts with the same 30-minute ritual: wire up ruff, pyright, pytest, CI matrix, coverage gate, pre-commit, Dependabot, ADRs... I automated all of it. One click on "Use this template" plus one run of python3 scripts/init-project.py and you have a green-CI repo ready for your first feature. Ships with a /new-project Claude Code skill that creates the GitHub repo and sets up branch protection, because even the setup should be one command.

Python GitHub Actions Claude Code pre-commit


Research

Bayesian inference for structured data is my obsession. When your data is a ranking, a graph partition, or an integer array under hard constraints, standard inference breaks. My work builds algorithms that don't: I prove they converge, derive consistency conditions, and ship the code to show they run fast.

Three first-author papers:

  • JCGS 2025 (published): blocked Gibbs sampler with anti-correlation Gaussian data augmentation; 23–67× faster than NUTS (the industry-standard sampler) with a geometric ergodicity proof
  • JASA (major revision): Bayesian regression over combinatorial response data via integer programming duality
  • Bernoulli (major revision, 2nd round): first consistency guarantee for graph-based clustering under model misspecification

Research code: VAE-fMRI-Alzheimer, a 3D-convolutional VAE for Alzheimer's fMRI. CUDA training on HiPerGator, 36 unit tests, 18 tutorial notebooks.


Currently building

🦀 LOBSTER-tools, a LOBSTER limit-order-book parser in Rust. Reconstructs L1/L2 book state and event-level features from raw message+orderbook CSVs. Built for high-throughput backtesting workflows. (Repo goes public when v0.1 ships.)
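
For readers unfamiliar with the input format, here is a small Python sketch (not the Rust implementation, which is not public yet) of reading the best bid and ask from one row of a LOBSTER orderbook file. It relies on the documented LOBSTER layout: each row lists ask price, ask size, bid price, bid size per level, with prices stored as dollars times 10,000. The `L1Quote` class, the `parse_l1` function, and the sample row are hypothetical.

```python
from dataclasses import dataclass

PRICE_SCALE = 10_000  # LOBSTER stores prices as dollars x 10,000


@dataclass(frozen=True)
class L1Quote:
    """Top-of-book snapshot (hypothetical type for this sketch)."""
    ask_price: float
    ask_size: int
    bid_price: float
    bid_size: int

    @property
    def spread(self) -> float:
        return self.ask_price - self.bid_price

    @property
    def mid(self) -> float:
        return (self.ask_price + self.bid_price) / 2


def parse_l1(row: str) -> L1Quote:
    """Parse level 1 from a row: ask price, ask size, bid price, bid size, ..."""
    fields = row.strip().split(",")
    ask_p, ask_s, bid_p, bid_s = (int(x) for x in fields[:4])
    return L1Quote(ask_p / PRICE_SCALE, ask_s, bid_p / PRICE_SCALE, bid_s)


# Made-up two-level row; only the first four fields are used for L1.
quote = parse_l1("1186600,9484,1186500,8800,1186700,22700,1186400,14930")
print(quote, round(quote.spread, 4))
```

Deeper levels follow the same four-column pattern, so an L2 parser would read the remaining fields in groups of four.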


Stack

Python R C++ Rust JAX PyTorch GitHub Actions PyPI Homebrew HiPerGator/Slurm


📫 hugh.stats@gmail.com · Google Scholar · ORCID · LinkedIn · Website

Popular repositories

  1. latex2arxiv (Public)

    Submit to arXiv without the headache. One command cleans your LaTeX project, catches rejection-causing errors, and walks you through the upload.

    Python 3 1

  2. VAE-fMRI-Alzheimer (Public)

    Developing models to extract information about Alzheimer's disease from fMRIs

    Jupyter Notebook 2 1

  3. Anti-correlation-Gaussian (Public)

    Codes for the paper "Gibbs Sampling using Anti-correlation Gaussian Data Augmentation, with Applications to L1-ball-type Models"

    R 1

  4. Study_Notes (Public)

    My study notes for online courses, school courses, and books.

    Jupyter Notebook 1 1

  5. Combinatorial_Data_Modeling (Public archive)

    C++ 1

  6. combinatorial-regression (Public)

    Statistical Modeling for Combinatorial Response Data

    Jupyter Notebook 1