feat: Automate Google Search Console indexing via Indexing API + smart sitemap diff #840
Open
dhananjay6561 wants to merge 3 commits into main
Pull request overview
Adds automated Google indexing for docs deploys by diffing the current sitemap against a cached previous sitemap and notifying Google via the Indexing API, with a secondary GSC sitemap ping. This is intended to reduce Google discovery lag for new/updated/removed docs pages after each main deploy.
Changes:
- Emit per-page `<lastmod>` values in the generated sitemap using git commit dates to enable reliable change detection.
- Add a new Node script to diff sitemaps and submit `URL_UPDATED`/`URL_DELETED` notifications to the Google Indexing API, plus a GSC sitemap ping.
- Extend the main deploy workflow to restore/cache the previous sitemap and run the Google indexing step post-deploy.
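The diff step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/google-index.js`: the function names `parseSitemap`, `diffSitemaps`, and `toNotifications` are hypothetical, and the regex-based parsing assumes a flat `<urlset>` without XML entities in URLs. The `url` and `type` fields match the request body of the Indexing API's `urlNotifications:publish` method.

```javascript
// Hypothetical helper: map each <loc> to its <lastmod> (or null if absent).
// A regex is enough for a sketch; a real script may prefer an XML parser.
function parseSitemap(xml) {
  const entries = new Map();
  const urlBlocks = xml.match(/<url>[\s\S]*?<\/url>/g) || [];
  for (const block of urlBlocks) {
    const loc = (block.match(/<loc>\s*([^<]+?)\s*<\/loc>/) || [])[1];
    const lastmod = (block.match(/<lastmod>\s*([^<]+?)\s*<\/lastmod>/) || [])[1] || null;
    if (loc) entries.set(loc, lastmod);
  }
  return entries;
}

// New URLs and URLs whose <lastmod> changed are "updated";
// URLs present only in the previous sitemap are "deleted".
function diffSitemaps(previousXml, currentXml) {
  const previous = parseSitemap(previousXml);
  const current = parseSitemap(currentXml);
  const updated = [];
  const deleted = [];
  for (const [loc, lastmod] of current) {
    if (!previous.has(loc) || previous.get(loc) !== lastmod) updated.push(loc);
  }
  for (const loc of previous.keys()) {
    if (!current.has(loc)) deleted.push(loc);
  }
  return { updated, deleted };
}

// Turn the diff into Indexing API notification bodies.
function toNotifications({ updated, deleted }) {
  return [
    ...updated.map((url) => ({ url, type: 'URL_UPDATED' })),
    ...deleted.map((url) => ({ url, type: 'URL_DELETED' })),
  ];
}
```

Comparing `<lastmod>` values rather than page content keeps the diff cheap, which is presumably why the PR enables git-based dates in the sitemap.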
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `scripts/google-index.js` | New script to diff sitemaps and submit Indexing API notifications with retry + rate limiting, plus GSC sitemap ping. |
| `docusaurus.config.js` | Enables git-based `<lastmod>` emission in `sitemap.xml` to support smart diffing. |
| `.github/workflows/main.yml` | Fetch full git history for correct `<lastmod>`, restore/cache previous sitemap, and run Google indexing after deploy. |
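For readers unfamiliar with the "retry" behaviour mentioned for `scripts/google-index.js`, here is a minimal sketch of retry with exponential backoff. It is an illustration under assumptions, not the PR's code: the names `isRetryable`, `backoffDelayMs`, and `sendWithRetry` are hypothetical, the 1-second base delay is invented, and treating "3x" as three retries after the first attempt is one possible reading. Retrying on 429, 5xx, and network errors follows the PR description.

```javascript
// Assumption: 429 and 5xx responses are transient and worth retrying.
function isRetryable(status) {
  return status === 429 || (status >= 500 && status <= 599);
}

// Assumption: exponential backoff from a hypothetical 1s base (1s, 2s, 4s, ...).
function backoffDelayMs(attempt, baseMs = 1000) {
  return baseMs * 2 ** attempt;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `send` is any async function that resolves to an object with a numeric `status`.
// Resolves to { ok, status }; status is null if the last attempt threw.
async function sendWithRetry(send, { maxRetries = 3, baseMs = 1000, wait = sleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    let status = null;
    try {
      const res = await send();
      status = res.status;
      if (!isRetryable(status)) {
        return { ok: status >= 200 && status < 300, status };
      }
    } catch (err) {
      // Network error: fall through and retry like a transient status.
    }
    if (attempt >= maxRetries) return { ok: false, status };
    await wait(backoffDelayMs(attempt, baseMs));
  }
}
```

Passing `wait` as an option keeps the delay injectable, so the loop can be exercised in tests without real sleeps.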
…ssages, non-fatal npm install
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
… are retried next deploy
What & Why
Google doesn't support IndexNow (we already have that for Bing/Yandex). Without this PR, Google discovers new/updated docs on its own crawl schedule — days or weeks late. This PR pushes URLs directly to Google the moment a deploy completes.
How it works
After every deploy to `main`:
- Diff the new `sitemap.xml` against the cached sitemap from the last deploy
- Submit new and changed URLs as `URL_UPDATED` to the Google Indexing API
- Submit removed URLs as `URL_DELETED`

Files changed
| File | Description |
|---|---|
| `docusaurus.config.js` | `lastmod: "date"`: makes Docusaurus emit git-based last-modified dates in the sitemap so the diff can detect which pages actually changed |
| `scripts/google-index.js` | `URL_UPDATED`/`URL_DELETED` submissions, retry logic (3x with backoff on 429/5xx/network errors), burst rate limiting (10 req/s), GSC sitemap ping, and quota-safe baseline gating |
| `.github/workflows/main.yml` | `fetch-depth: 0` on checkout (required for correct `lastmod` dates), sitemap cache restore step, and Google Indexing API step after deploy |

Deployment / Setup (one-time)
- Add the `GOOGLE_SERVICE_ACCOUNT_JSON` secret (paste the full service account key JSON)
- Optionally add a `GSC_SITE_URL` secret if the GSC property URL differs from `https://keploy.io/`

That's it. The next push to `main` triggers everything automatically.

Quota
Google's default limit is 200 URL_UPDATED/day. Smart diffing means a typical deploy (10–20 changed pages) uses 10–20 of those 200.
If quota is exceeded, skipped URLs are counted as failures → the script exits non-zero → the sitemap baseline is not advanced → the next deploy re-diffs from the same old baseline and automatically retries all missed URLs. Nothing is silently dropped.
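The gating described above can be sketched as a few small functions. This is a hedged illustration only: `planSubmission`, `exitCodeFor`, and `shouldAdvanceBaseline` are hypothetical names, and the way the real script tracks remaining quota is not shown in this PR description. The 200/day figure is taken from the text above.

```javascript
// Assumption: 200 is the default daily limit quoted in the PR description.
const DAILY_QUOTA = 200;

// Hypothetical planner: submit as many URLs as the remaining quota allows
// and report the rest as skipped.
function planSubmission(urls, quotaRemaining = DAILY_QUOTA) {
  const budget = Math.max(0, quotaRemaining);
  return { toSubmit: urls.slice(0, budget), skipped: urls.slice(budget) };
}

// Skipped URLs count as failures, so any miss yields a non-zero exit code.
function exitCodeFor({ failedCount, skippedCount }) {
  return failedCount + skippedCount > 0 ? 1 : 0;
}

// The cached sitemap baseline is only replaced after a fully successful run,
// so the next deploy re-diffs against the old baseline and retries the misses.
function shouldAdvanceBaseline(exitCode) {
  return exitCode === 0;
}
```

For example, a diff of 250 URLs against a fresh quota would submit 200 and skip 50, produce exit code 1, and leave the baseline untouched. A typical 15-URL diff would submit everything and advance the baseline.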