SimoneDG29/Github-Secret-History-Scanner


🔍 GitHub Secret History Scanner


A CLI tool that scans the entire Git history of a repository to detect leaked secrets before they can be exploited.

Designed for DevSecOps workflows, CI/CD pipelines, and security auditing.


🚀 Features

  • 🔍 Scan full Git history (not just the working tree)
  • 🧠 Regex + entropy-based secret detection
  • 🧩 Custom detection rules via YAML (--rules)
  • 🚫 Ignore files and directories via .scannerignore or --ignore-file
  • 📊 Structured JSON output
  • 🪵 Configurable logging (--log-level)
  • ⚡ Concurrent scanning for improved performance
  • 🧱 Modular architecture (scanner, detector, reporter)

🎯 Use Cases

  • Prevent secrets from being committed
  • Audit existing repositories for leaked credentials
  • Integrate into CI/CD pipelines for automated security checks

📦 Installation

From source

go build -o gh-secret-scanner .

🐳 Docker

docker build -t gh-secret-scanner .

Run against a local Git repository:

docker run --rm \
  -v "$(pwd)":/repo \
  gh-secret-scanner scan --repo /repo

⚠️ Make sure the mounted directory is a valid Git repository.


⚙️ Usage

gh-secret-scanner scan [path-or-url]

Options

--repo <path>            Repository path or URL
--rules <file>           Custom rules YAML file
--ignore-file <file>     Ignore patterns file
--min-entropy <value>    Minimum entropy threshold
--log-level <level>      debug | info | warn | error
--fail-on-findings       Exit with error if secrets are found

🧩 Custom Rules

Define detection rules in YAML:

rules:
  - id: aws-access-key
    description: AWS Access Key ID
    pattern: 'AKIA[0-9A-Z]{16}'
    confidence: 0.7
  - id: github-token
    description: GitHub Personal Access Token
    pattern: 'ghp_[A-Za-z0-9]{36}'
    confidence: 0.7

Use them with:

gh-secret-scanner scan --rules rules.yaml

If --rules is not specified, the default rules are used.
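To make the rule format concrete, here is a minimal Go sketch of how rules like the two above could be applied to a line of text. The Rule and Match types and the matchLine function are hypothetical names chosen for illustration; they are not taken from the project's source, and the real detector may work differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// Rule mirrors the fields of the YAML example above.
// The type and field names are assumptions made for this sketch.
type Rule struct {
	ID          string
	Description string
	Pattern     string
	Confidence  float64
}

// Match is one rule hit within a line of text.
type Match struct {
	RuleID     string
	Secret     string
	Confidence float64
}

// matchLine compiles each rule's pattern and returns every hit in line.
// A real scanner would likely compile the patterns once, not per line.
func matchLine(rules []Rule, line string) ([]Match, error) {
	var out []Match
	for _, r := range rules {
		re, err := regexp.Compile(r.Pattern)
		if err != nil {
			return nil, fmt.Errorf("rule %s: %w", r.ID, err)
		}
		for _, s := range re.FindAllString(line, -1) {
			out = append(out, Match{RuleID: r.ID, Secret: s, Confidence: r.Confidence})
		}
	}
	return out, nil
}

func main() {
	rules := []Rule{
		{ID: "aws-access-key", Description: "AWS Access Key ID", Pattern: `AKIA[0-9A-Z]{16}`, Confidence: 0.7},
		{ID: "github-token", Description: "GitHub Personal Access Token", Pattern: `ghp_[A-Za-z0-9]{36}`, Confidence: 0.7},
	}
	matches, err := matchLine(rules, "aws_key = AKIAABCDEFGHIJKLMNOP")
	if err != nil {
		panic(err)
	}
	for _, m := range matches {
		fmt.Printf("%s matched %s\n", m.RuleID, m.Secret)
	}
}
```

Both patterns are valid in Go's regexp (RE2) syntax, so they can be copied from the YAML file unchanged.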


🚫 Ignore Patterns

Exclude files and directories from scanning using glob patterns.

How it works

  • Use --ignore-file <file> to specify a custom file
  • If not provided, .scannerignore is used (if present)
  • Supports glob patterns (one per line)
  • Ignores empty lines and comments (#)

Example .scannerignore

# Ignore minified files
*.min.js

# Ignore dependencies
vendor/**
node_modules/**

# Ignore test data
testdata/**
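The parsing rules listed above (one pattern per line, blank lines and # comments skipped) can be sketched in Go as follows. The function names are hypothetical, and the handling of `dir/**` and of patterns without a slash is an assumption: the standard library's path.Match has no `**` support, so this sketch treats a trailing `/**` as "everything under that directory". The project's own matcher may behave differently.

```go
package main

import (
	"bufio"
	"fmt"
	"path"
	"strings"
)

// parseIgnore reads patterns one per line, skipping blank lines and
// lines that start with '#'.
func parseIgnore(content string) []string {
	var patterns []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		patterns = append(patterns, line)
	}
	return patterns
}

// isIgnored reports whether a slash-separated file path matches any pattern.
// Assumptions: "dir/**" ignores everything under dir, and a pattern that
// contains no slash is tested against the file's base name only.
func isIgnored(patterns []string, file string) bool {
	for _, p := range patterns {
		if strings.HasSuffix(p, "/**") {
			dir := strings.TrimSuffix(p, "/**")
			if strings.HasPrefix(file, dir+"/") {
				return true
			}
			continue
		}
		target := file
		if !strings.Contains(p, "/") {
			target = path.Base(file)
		}
		if ok, err := path.Match(p, target); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := parseIgnore("# Ignore minified files\n*.min.js\n\n# Ignore dependencies\nvendor/**\n")
	for _, f := range []string{"static/app.min.js", "vendor/lib/x.go", "cmd/main.go"} {
		fmt.Printf("%-20s ignored=%v\n", f, isIgnored(patterns, f))
	}
}
```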

🧪 Example

gh-secret-scanner scan . \
  --min-entropy 3.5 \
  --fail-on-findings
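To give a sense of what a threshold such as 3.5 means, the sketch below computes Shannon entropy in bits per character, which is the usual basis for entropy-based secret detection. It is an illustration only: whether the project counts bytes or runes, and whether it compares with >= or >, is not stated in this README, so both choices here are assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the entropy of s in bits per character,
// computed from the relative frequency of each rune in s.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	const threshold = 3.5 // same value as --min-entropy in the example above
	for _, s := range []string{"aaaaaaaaaaaaaaaa", "password", "AKIAIOSFODNN7EXAMPLE"} {
		h := shannonEntropy(s)
		fmt.Printf("%-22s %.2f bits/char, at or above threshold: %v\n", s, h, h >= threshold)
	}
}
```

A string made of one repeated character scores 0, and a string of n distinct, equally frequent characters scores log2(n), so raising the threshold trades fewer false positives for a higher chance of missing short or low-variety secrets.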

📊 Example Output

{
  "repository": "https://github.com/example/repo.git",
  "findings": [
    {
      "repository": "https://github.com/example/repo.git",
      "commit_hash": "abc123",
      "file_path": "config.yml",
      "secret": "AKIAIOSFODNN7EXAMPLE",
      "rule_id": "aws-access-key",
      "line": 12,
      "confidence": 0.9
    }
  ]
}
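For consumers of this output, the following Go sketch decodes a report with the same JSON keys as the example above. Only the JSON field names come from the example; the Go type names (Report, Finding) and the parseReport helper are assumptions made for illustration and may not match the project's internal/models package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding mirrors one entry of the "findings" array shown above.
type Finding struct {
	Repository string  `json:"repository"`
	CommitHash string  `json:"commit_hash"`
	FilePath   string  `json:"file_path"`
	Secret     string  `json:"secret"`
	RuleID     string  `json:"rule_id"`
	Line       int     `json:"line"`
	Confidence float64 `json:"confidence"`
}

// Report mirrors the top-level JSON object.
type Report struct {
	Repository string    `json:"repository"`
	Findings   []Finding `json:"findings"`
}

// parseReport decodes scanner output into a Report.
func parseReport(data []byte) (Report, error) {
	var r Report
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	data := []byte(`{
	  "repository": "https://github.com/example/repo.git",
	  "findings": [
	    {"commit_hash": "abc123", "file_path": "config.yml",
	     "rule_id": "aws-access-key", "line": 12, "confidence": 0.9}
	  ]
	}`)
	report, err := parseReport(data)
	if err != nil {
		panic(err)
	}
	for _, f := range report.Findings {
		fmt.Printf("%s:%d [%s] commit %s (confidence %.1f)\n",
			f.FilePath, f.Line, f.RuleID, f.CommitHash, f.Confidence)
	}
}
```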

🔄 CI/CD Integration (GitHub Actions)

name: Secret History Scan

on:
  pull_request:
  push:
    branches: [main]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: "1.22"

      - name: Build scanner
        run: go build -o gh-secret-scanner .

      - name: Run secret scan
        run: ./gh-secret-scanner scan --repo . --fail-on-findings

🧪 Testing

The project includes an automated test suite: unit tests for the detection components and integration tests for the scanner.

Coverage

The following components are covered by tests:

  • Detector: regex-based secret detection
  • Entropy: Shannon entropy calculation
  • YAML Rules: custom rule loading and validation
  • Ignore Patterns: glob-based file exclusion logic
  • Scanner (integration tests): scanning real Git repositories using temporary test repos

Running Tests

go test ./... -v

CI Integration

Tests are automatically executed in CI using GitHub Actions on every push and pull request.

Notes

  • Tests use temporary directories and repositories (t.TempDir()) to remain isolated
  • No external dependencies or network access are required
  • Designed to be deterministic and reproducible

🏗️ Project Structure

cmd/                CLI commands (Cobra)
configs/            Default and custom YAML rules
internal/scanner/   Git history traversal
internal/detector/  Detection rules and matching
internal/entropy/   Entropy scoring
internal/reporter/  Output formatting (JSON)
internal/logging/   Logging configuration
internal/models/    Data models (e.g., findings)
tests/              Automated test suite

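📌 Notes

  • For full history scanning, make sure your Git clone is not shallow (an existing shallow clone can be completed with git fetch --unshallow)
  • In GitHub Actions, always set this on actions/checkout:
fetch-depth: 0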
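A shallow clone can also be detected programmatically before scanning. Git records the boundary commits of a shallow clone in a file named shallow inside the .git directory, so the sketch below simply checks for that file. The isShallow function is hypothetical and not part of the tool, and it assumes .git is a regular directory (not a linked worktree or submodule).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isShallow reports whether the repository at repoPath looks like a
// shallow clone, based on the presence of .git/shallow.
// Assumption: repoPath is a working tree whose .git entry is a directory.
func isShallow(repoPath string) (bool, error) {
	_, err := os.Stat(filepath.Join(repoPath, ".git", "shallow"))
	if err == nil {
		return true, nil
	}
	if os.IsNotExist(err) {
		return false, nil
	}
	return false, err
}

func main() {
	shallow, err := isShallow(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not inspect repository:", err)
		os.Exit(1)
	}
	if shallow {
		fmt.Println("shallow clone: older commits are missing, so a history scan would be incomplete")
		return
	}
	fmt.Println("no shallow marker found")
}
```

The same check is available on the command line as git rev-parse --is-shallow-repository.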

📄 License

MIT
