improve performance of reading sandbox stats (oom, peak memory) #133

Merged
GuillaumeGomez merged 1 commit into rust-lang:main from syphar:direct-cgroup
May 20, 2026

Conversation

@syphar syphar commented May 17, 2026

After each command in the build we see (and need) this:

[image]

Most `exec cat` calls take <100 ms, but I regularly see calls taking between 2 and 10 seconds. That is quite a waste, given that we currently do 4 calls after each container command (two for OOM, two for peak memory).

The first optimization is that we remember which cgroup version we have, which saves one call in each of the later checks.
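A minimal sketch of that caching idea, not the code from this PR: the `StatsReader` type, its `version` method and the enum variants `V1`/`V2` are hypothetical names chosen for illustration. The probe closure stands in for whatever call detects the cgroup version, and it only runs while nothing is cached yet.

```rust
/// Illustrative stand-in for the enum in this PR; the variant names are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CgroupVersion {
    V1,
    V2,
}

/// Hypothetical reader that remembers the detected cgroup version.
struct StatsReader {
    version: Option<CgroupVersion>,
}

impl StatsReader {
    fn new() -> Self {
        Self { version: None }
    }

    /// Returns the cached version, or runs `probe` once and caches a successful result.
    /// A failed probe is not cached, so the next call tries again.
    fn version<E>(
        &mut self,
        probe: impl FnOnce() -> Result<CgroupVersion, E>,
    ) -> Result<CgroupVersion, E> {
        if let Some(version) = self.version {
            return Ok(version);
        }
        let version = probe()?;
        self.version = Some(version);
        Ok(version)
    }
}
```

With this shape, the OOM check and the peak-memory check can share one reader, so only the first of them pays for the detection call.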

Then I discovered that we can also get the same information directly from special files on the host, so I added this. From what I see, the locations are likely specific to the setup / engine in use, so I left the `exec cat` approach in as a fallback. I checked that the "fast" way would be used by docs.rs.
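Roughly, the host-side read with fallback could look like the sketch below. It is an illustration under assumptions, not the PR's implementation: the function names are hypothetical, and the caller is assumed to already know the (setup-specific) host path of the container's cgroup directory. The file formats are the kernel's own: on cgroup v2, `memory.peak` holds a single integer and `memory.events` holds `name value` lines including `oom_kill`; on v1 the rough equivalents are `memory.max_usage_in_bytes` and `memory.oom_control`.

```rust
use std::fs;
use std::path::Path;

/// Extracts the `oom_kill` counter from keyed "name value" lines,
/// as found in cgroup v2 `memory.events`.
fn parse_oom_kill(contents: &str) -> Option<u64> {
    contents.lines().find_map(|line| {
        let mut parts = line.split_whitespace();
        match (parts.next(), parts.next()) {
            (Some("oom_kill"), Some(value)) => value.parse().ok(),
            _ => None,
        }
    })
}

/// Parses a single-integer file such as cgroup v2 `memory.peak`.
fn parse_bytes(contents: &str) -> Option<u64> {
    contents.trim().parse().ok()
}

/// Tries the host file first; if it cannot be read, runs `fallback`
/// (standing in for the slower `exec cat` inside the container).
fn read_stat(host_file: &Path, fallback: impl FnOnce() -> Option<String>) -> Option<String> {
    fs::read_to_string(host_file).ok().or_else(fallback)
}
```

Because the fallback is only invoked when the host file is unreadable, setups where the path guess is wrong behave as before, just without the speedup.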

I don't like the added complexity, but the performance impact will likely be noticeable.

While we're at it, I extracted the bigger cgroup stats code into a new module; the container lifetime stuff stays in `Container`.

Generally: I think most randomly slow things are I/O bound, and increasing the bandwidth / IOPS will hopefully help.

@syphar syphar self-assigned this May 17, 2026
@syphar syphar changed the title from "WIP: improve performance of reading sandbox stats (oom, peak memory)" to "improve performance of reading sandbox stats (oom, peak memory)" May 18, 2026
@syphar syphar marked this pull request as ready for review May 18, 2026 06:05
@syphar syphar requested a review from GuillaumeGomez May 19, 2026 11:35
Comment thread src/cmd/sandbox/docker.rs
/// Most of the time: v2
/// on old systems like the docs.rs builder: v1
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub(super) enum CgroupVersion {
Member

Instead of implementing all that, why not use a crate that already has this feature, like sysinfo?

Member Author

I did check two crates, they either didn't provide these specific metrics, or you couldn't get them for existing cgroups (only when the library also created them).

I'll check once more for crates. Sysinfo wasn't on the initial list.

Member Author

  • cgroups-rs has the needed functionality, but it is badly documented and has compile errors when building on non-Linux targets (inconsistent gating of the Linux-specific logic). We could probably make it work by gating it to Linux in rustwide itself (annoying when working on my machine), or by fixing the library itself: using a git fork in rustwide, or adding a patch if it's small.
  • In sysinfo the struct is called CgroupLimits, but the names (and the stats used) don't look like limits but like current values. In any case, the memory peak is missing, and so is the OOM counter. Is that something you would expect to see there?
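
For reference, gating to Linux inside rustwide could be sketched like this. It uses only the standard library; `read_host_counter` is a hypothetical name, and nothing here is taken from cgroups-rs. The point is that both builds expose the same signature, so callers compile everywhere and simply fall back when they get `None`.

```rust
use std::path::Path;

/// Linux build: read a single-integer cgroup file from the host.
#[cfg(target_os = "linux")]
fn read_host_counter(path: &Path) -> Option<u64> {
    std::fs::read_to_string(path).ok()?.trim().parse().ok()
}

/// Non-Linux build: same signature, always `None`,
/// so the caller takes its fallback path.
#[cfg(not(target_os = "linux"))]
fn read_host_counter(_path: &Path) -> Option<u64> {
    None
}
```

If an external crate were used instead, the dependency itself would presumably also need to be limited to Linux, for example via a `[target.'cfg(target_os = "linux")'.dependencies]` table in Cargo.toml.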

what do you think?

I feel like with a stable cgroup

Member Author

I'll check more libraries later; feel free to point me to the ones you had in mind.

Member

Seems like sysinfo will need an update. For now let's go. :)

@GuillaumeGomez GuillaumeGomez merged commit 590df9f into rust-lang:main May 20, 2026
10 of 18 checks passed

2 participants