feat: Automate Google Search Console indexing via Indexing API + smart sitemap diff #840

Open
dhananjay6561 wants to merge 3 commits into main from feat/automate-indexing

@dhananjay6561 dhananjay6561 commented Apr 24, 2026

What & Why

Google doesn't support IndexNow (we already have that for Bing/Yandex). Without this PR, Google discovers new/updated docs on its own crawl schedule — days or weeks late. This PR pushes URLs directly to Google the moment a deploy completes.

How it works

After every deploy to main:

  1. Diffs new sitemap.xml against the cached sitemap from the last deploy
  2. Submits only new/changed URLs → URL_UPDATED to Google Indexing API
  3. Submits removed URLs → URL_DELETED
  4. Pings GSC Sitemap API as a secondary signal
  5. Caches the sitemap for the next deploy's diff — only when all submissions completed (if quota was hit or anything failed, the baseline is preserved so skipped URLs are automatically picked up on the next run)
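The diff in steps 1 to 3 can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/google-index.js: the function names `parseSitemap` and `diffSitemaps` are hypothetical, and it assumes a flat sitemap whose `<url>` entries each contain a `<loc>` and an optional `<lastmod>`. A URL counts as updated when it is new or its `<lastmod>` differs from the cached copy, and as deleted when it is missing from the new sitemap.

```javascript
// Parse a sitemap.xml string into a Map of URL -> lastmod ("" when absent).
// A regex keeps the sketch dependency-free; it does not decode XML entities,
// so a real script may prefer a proper XML parser.
function parseSitemap(xml) {
  const entries = new Map();
  const urlBlock = /<url>([\s\S]*?)<\/url>/g;
  let match;
  while ((match = urlBlock.exec(xml)) !== null) {
    const loc = /<loc>\s*([^<]+?)\s*<\/loc>/.exec(match[1]);
    if (!loc) continue; // skip malformed entries without a <loc>
    const lastmod = /<lastmod>\s*([^<]+?)\s*<\/lastmod>/.exec(match[1]);
    entries.set(loc[1], lastmod ? lastmod[1] : "");
  }
  return entries;
}

// Compare the cached sitemap with the freshly built one.
// updated: URLs that are new or whose lastmod changed (-> URL_UPDATED)
// deleted: URLs present before but gone now (-> URL_DELETED)
function diffSitemaps(previousXml, currentXml) {
  const previous = parseSitemap(previousXml);
  const current = parseSitemap(currentXml);
  const updated = [];
  const deleted = [];
  for (const [url, lastmod] of current) {
    if (!previous.has(url) || previous.get(url) !== lastmod) updated.push(url);
  }
  for (const url of previous.keys()) {
    if (!current.has(url)) deleted.push(url);
  }
  return { updated, deleted };
}
```

With an empty cached sitemap (for example on the very first run), every URL in the new sitemap lands in `updated`, which is worth keeping in mind given the daily quota.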

Files changed

| File | Change |
| --- | --- |
| docusaurus.config.js | Added lastmod: "date", which makes Docusaurus emit git-based last-modified dates in the sitemap so the diff can detect which pages actually changed |
| scripts/google-index.js | New script: handles auth, sitemap diffing, URL_UPDATED/URL_DELETED submissions, retry logic (3x with backoff on 429/5xx/network errors), burst rate limiting (10 req/s), GSC sitemap ping, and quota-safe baseline gating |
| .github/workflows/main.yml | Added fetch-depth: 0 on checkout (required for correct lastmod dates), a sitemap cache restore step, and a Google Indexing API step after deploy |
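The retry behaviour described for scripts/google-index.js (retry on 429, 5xx, and network errors, with backoff) might look something like the sketch below. It is an illustration under stated assumptions rather than the script itself: `withRetry`, `isRetryable`, and the `send` callback are hypothetical names, "3x" is read as three attempts in total, and the 500 ms base delay with doubling is a guess at the backoff shape.

```javascript
// Default sleep; injectable so tests do not have to wait in real time.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// 429 (rate limited) and 5xx (server error) are worth retrying;
// other statuses are returned to the caller as-is.
function isRetryable(status) {
  return status === 429 || (status >= 500 && status <= 599);
}

// `send` is a hypothetical async function that performs one request and
// resolves to an object with a numeric `status`, or rejects on network failure.
async function withRetry(send, options = {}) {
  const { attempts = 3, baseDelayMs = 500, sleep = defaultSleep } = options;
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await send();
      if (!isRetryable(response.status)) return response; // success or permanent failure
      lastError = new Error(`retryable status ${response.status}`);
    } catch (err) {
      lastError = err; // rejected promise is treated as a network error
    }
    // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
    if (attempt < attempts) await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
  throw lastError;
}
```

Returning non-retryable responses (for example a 403 from a missing Owner permission) instead of throwing leaves it to the caller to decide whether that URL counts as a failure.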

Deployment / Setup (one-time)

  1. Google Cloud Console → enable Web Search Indexing API on the service account's project
  2. GSC → Settings → Users & permissions → add service account email as Owner
  3. GitHub Secrets → add GOOGLE_SERVICE_ACCOUNT_JSON (paste the full service account key JSON)
  4. Optionally add GSC_SITE_URL secret if the GSC property URL differs from https://keploy.io/

That's it — next push to main triggers everything automatically.
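For reference, the two secrets from steps 3 and 4 could be read along these lines. This is a hedged sketch: `loadConfig` is a hypothetical helper, not necessarily how scripts/google-index.js does it. The only facts relied on are that a service account key JSON carries `client_email` and `private_key` fields, and that the property URL falls back to https://keploy.io/ when GSC_SITE_URL is unset.

```javascript
// Read and validate the secrets described in the setup steps.
// `env` is passed in (rather than reading process.env directly) to keep
// the function easy to test.
function loadConfig(env) {
  const raw = env.GOOGLE_SERVICE_ACCOUNT_JSON;
  if (!raw) throw new Error("GOOGLE_SERVICE_ACCOUNT_JSON is not set");

  let key;
  try {
    key = JSON.parse(raw);
  } catch {
    throw new Error("GOOGLE_SERVICE_ACCOUNT_JSON is not valid JSON");
  }
  if (key === null || typeof key !== "object") {
    throw new Error("GOOGLE_SERVICE_ACCOUNT_JSON must be a JSON object");
  }
  for (const field of ["client_email", "private_key"]) {
    if (typeof key[field] !== "string" || key[field] === "") {
      throw new Error(`service account key is missing "${field}"`);
    }
  }

  return {
    clientEmail: key.client_email,
    privateKey: key.private_key,
    // Optional override; defaults to the property URL named in the PR.
    siteUrl: env.GSC_SITE_URL || "https://keploy.io/",
  };
}
```

In a workflow step this would be called as `loadConfig(process.env)`, failing fast with a readable message if the secret was pasted incompletely.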

Quota

Google's default Indexing API quota is 200 publish requests per day. Smart diffing means a typical deploy (10–20 changed pages) uses only 10–20 of those 200.

If quota is exceeded, skipped URLs are counted as failures → the script exits non-zero → the sitemap baseline is not advanced → the next deploy re-diffs from the same old baseline and automatically retries all missed URLs. Nothing is silently dropped.
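The gating rule above can be expressed as a small sketch. The names `planRun` and `summarizeRun` are hypothetical, and treating all notifications as one queue against a single remaining-quota number is an assumption made for illustration; the actual script may track quota differently.

```javascript
// Split the pending notifications into what fits in the remaining quota
// and what has to be skipped on this run.
// Each notification is shaped like { url, type }, where type is
// "URL_UPDATED" or "URL_DELETED".
function planRun(notifications, quotaRemaining) {
  const limit = Math.max(0, quotaRemaining);
  return {
    toSubmit: notifications.slice(0, limit),
    skipped: notifications.slice(limit),
  };
}

// Decide the outcome of a run. `results` holds one { url, ok } entry per
// submitted notification. Skipped URLs count as failures, so any skip or
// error keeps the old baseline and makes the process exit non-zero.
function summarizeRun(results, skipped) {
  const failed = results.filter((r) => !r.ok).length + skipped.length;
  return {
    failed,
    advanceBaseline: failed === 0,
    exitCode: failed === 0 ? 0 : 1,
  };
}
```

Because the baseline only advances on a fully clean run, a later deploy re-diffs against the same cached sitemap and resubmits the URLs that already succeeded alongside the missed ones, so a partially failed run costs some extra quota on the retry.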

Copilot AI left a comment

Pull request overview

Adds automated Google indexing for docs deploys by diffing the current sitemap against a cached previous sitemap and notifying Google via the Indexing API, with a secondary GSC sitemap ping. This is intended to reduce Google discovery lag for new/updated/removed docs pages after each main deploy.

Changes:

  • Emit per-page <lastmod> values in the generated sitemap using git commit dates to enable reliable change detection.
  • Add a new Node script to diff sitemaps and submit URL_UPDATED / URL_DELETED notifications to the Google Indexing API, plus a GSC sitemap ping.
  • Extend the main deploy workflow to restore/cache the previous sitemap and run the Google indexing step post-deploy.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| scripts/google-index.js | New script to diff sitemaps and submit Indexing API notifications with retry + rate limiting, plus GSC sitemap ping. |
| docusaurus.config.js | Enables git-based <lastmod> emission in sitemap.xml to support smart diffing. |
| .github/workflows/main.yml | Fetch full git history for correct <lastmod>, restore/cache previous sitemap, and run Google indexing after deploy. |

Comment thread scripts/google-index.js Outdated
Comment thread scripts/google-index.js Outdated
Comment thread scripts/google-index.js Outdated
Comment thread scripts/google-index.js Outdated
Comment thread .github/workflows/main.yml Outdated
Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


Comment thread .github/workflows/main.yml Outdated
Comment thread scripts/google-index.js Outdated
Comment thread scripts/google-index.js Outdated
Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
