Skip to content

dutifuldev/xtap-sync

Repository files navigation

xtap-sync

xtap-sync syncs local xTap JSONL exports into a Git repository. It reads flat tweets-YYYY-MM-DD.jsonl files from an xTap output directory, deduplicates tweet records by id, and writes normalized archive files under data/tweets/YYYY/MM/ in the target repository.

The target repository is configurable. It can be a private data store, a public archive, or any Git checkout that git push can update.

Install

go install github.com/dutifuldev/xtap-sync/cmd/xtap-sync@latest

From a local checkout:

go install ./cmd/xtap-sync

Sync Once

xtap-sync sync --source "$HOME/Downloads/xtap" --repo "$HOME/repos/my-xtap-data"

If the target repository has an origin remote, xtap-sync fetches that branch, merges remote records with local xTap records, commits changed archive files, and pushes. Use --no-push for local-only testing:

xtap-sync sync --source "$HOME/Downloads/xtap" --repo "$HOME/repos/my-xtap-data" --no-push

Configuration

By default, xtap-sync looks for:

$XDG_CONFIG_HOME/xtap-sync/config.json

If XDG_CONFIG_HOME is unset, it uses:

$HOME/.config/xtap-sync/config.json

Example:

{
  "source_dir": "~/Downloads/xtap",
  "repo_dir": "~/repos/my-xtap-data",
  "remote": "origin",
  "branch": "main",
  "commit_message": "sync xTap tweets",
  "push": true,
  "service_label": "dev.xtap-sync",
  "interval": "1h"
}

Command-line flags override the config file. You can also choose another config file:

xtap-sync sync --config ./xtap-sync.json

Useful environment variables:

  • XTAP_SYNC_CONFIG: default config file path
  • XTAP_SYNC_SOURCE_DIR: default xTap output directory
  • XTAP_SYNC_REPO_DIR: default target Git checkout
  • XTAP_OUTPUT_DIR: fallback xTap output directory

Background Sync

Install an hourly background sync from a config file:

xtap-sync install-service --config "$HOME/.config/xtap-sync/config.json"

Without a config file:

xtap-sync install-service --source "$HOME/Downloads/xtap" --repo "$HOME/repos/my-xtap-data"

On macOS, this installs a LaunchAgent under:

$HOME/Library/LaunchAgents/

LaunchAgent logs are written to:

$HOME/Library/Logs/xtap-sync/

On Linux, this installs a systemd user service and timer under:

$XDG_CONFIG_HOME/systemd/user/

or, if XDG_CONFIG_HOME is unset:

$HOME/.config/systemd/user/

For unattended syncs on a headless Linux host, make sure the user manager can run without an active login session:

loginctl enable-linger "$USER"

Remove it with:

xtap-sync uninstall-service

Verify

Check whether all source tweet IDs are present in the target repository:

xtap-sync verify --config "$HOME/.config/xtap-sync/config.json"

The command reports source IDs, repository IDs, missing IDs, and extra repository IDs. It exits non-zero when source tweets are missing from the target repository.

Storage Backends

xtap-sync currently syncs to Git repositories, including GitHub-hosted repositories. If GitHub storage becomes a bottleneck, a future backend could sync the same normalized archive files to object storage such as Hugging Face buckets.

Data Policy

Only data/tweets/YYYY/MM/tweets-YYYY-MM-DD.jsonl files are managed in the target repository. xTap source files are read from a flat tweets-YYYY-MM-DD.jsonl output directory. Media folders, partial downloads, logs, credentials, and common binary media files are ignored.

License

MIT

About

Sync local xTap JSONL exports into a Git-backed tweet archive.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors