Replace the oplog local on-disk queue with Pebble #987
Open
SisyphusSQ wants to merge 4 commits into
Background
Related issue: #982
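Currently, the full_sync.reader.oplog_store_disk=true path relies on go-diskqueue as the local temporary oplog queue. This PR replaces that path with a Pebble-backed spool, keeps the existing user configuration unchanged, and fills in checkpoint handling, restart recovery, metrics, and test coverage.
Main changes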
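- collector/spool internal interface and Pebble implementation, supporting FIFO writes, batch reads, advancing, depth statistics, close and reopen, deletion, and fail-fast on the old format.
- The Persister disk spill/replay state machine: during full sync, oplogs are written to the Pebble spool; once full sync completes, they are replayed in order into the pending queue; once replay completes, the spool is deleted and the Persister switches back to in-memory apply.
- OplogDiskQueue/OplogDiskQueueFinishTs are marked complete only after spool replay has finished and the worker checkpoint has caught up; on restart, an unfinished spool can be reopened.
- /persist adds spool_depth, spool_read_seq, and spool_write_seq; Prometheus adds spool_depth, spool_write_seq, spool_read_seq, spool_write_total, spool_read_total, and spool_errors_total.
- full_sync.reader.oplog_store_disk applies only to the oplog fetch method, so the change stream path cannot use it by mistake.
Not included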
- go-diskqueue local files; when the old format is detected, the collector fails explicitly.
Verification
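Local targeted tests and the build passed:
In addition to the local tests and build, two kinds of runtime verification were performed: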
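- collector.
- full_sync.reader.oplog_store_disk=true.
- spool_depth dropped to 0 and spool_errors stayed at 0.
- collector.
- write_success > 0.
- full_sync_collections_progress_ratio = 1.
- spool_depth summed to 0.
The verification above covers opening, writing, replaying, draining, and deleting the Pebble spool, as well as checkpoint advancement and incremental resume. Write consistency on the target side is outside the scope of this verification.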
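For reviewers who want a concrete picture of the spool semantics described in this PR (FIFO write, batch read, advance, depth), here is a minimal in-memory sketch. It is not the collector/spool code and does not use Pebble. The type memSpool, its method names, and the definition of depth as write sequence minus read sequence are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// memSpool is a hypothetical in-memory stand-in for a Pebble-backed spool.
// Entries are keyed by a monotonically increasing sequence number.
type memSpool struct {
	entries  map[uint64][]byte // seq -> payload for entries not yet advanced past
	writeSeq uint64            // seq of the last appended entry (0 means none)
	readSeq  uint64            // seq of the last entry acknowledged via Advance
}

func newMemSpool() *memSpool {
	return &memSpool{entries: make(map[uint64][]byte)}
}

// Append stores a copy of payload under the next sequence number.
func (s *memSpool) Append(payload []byte) uint64 {
	s.writeSeq++
	s.entries[s.writeSeq] = append([]byte(nil), payload...)
	return s.writeSeq
}

// ReadBatch returns up to max unacknowledged entries in FIFO order.
// It does not consume them; the caller must call Advance afterwards.
func (s *memSpool) ReadBatch(max int) [][]byte {
	var out [][]byte
	for seq := s.readSeq + 1; seq <= s.writeSeq && len(out) < max; seq++ {
		out = append(out, s.entries[seq])
	}
	return out
}

// Advance acknowledges the n oldest entries and drops them from the spool.
func (s *memSpool) Advance(n int) error {
	if n < 0 || uint64(n) > s.Depth() {
		return errors.New("advance beyond write position")
	}
	for i := 0; i < n; i++ {
		s.readSeq++
		delete(s.entries, s.readSeq)
	}
	return nil
}

// Depth is the number of entries written but not yet acknowledged
// (assumed here to equal writeSeq - readSeq).
func (s *memSpool) Depth() uint64 {
	return s.writeSeq - s.readSeq
}

func main() {
	s := newMemSpool()
	for _, p := range []string{"op1", "op2", "op3"} {
		s.Append([]byte(p))
	}
	batch := s.ReadBatch(2)
	fmt.Println(len(batch), string(batch[0]), s.Depth()) // prints "2 op1 3"
	if err := s.Advance(len(batch)); err != nil {
		panic(err)
	}
	fmt.Println(s.Depth()) // prints "1"
}
```

Separating ReadBatch from Advance means a crash between the two re-delivers the batch rather than losing it. That matches the restart-recovery goal stated above, though the real implementation may order these steps differently.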