Add harm runtime#50
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThis PR widens relocation types to 64-bit, adds memory abstraction traits plus mmap-backed and foreign memory implementations, introduces a LabelRegistry and Builder to apply Rel64 relocations, and adds an Assembler that emits, builds, and compiles machine code (including executable mmap conversion). Cargo features and dev-deps updated. ChangesRuntime Infrastructure and Relocation Types
Sequence DiagramsequenceDiagram
participant User
participant Assembler
participant LabelRegistry
participant Memory
participant Builder
participant PositionedMemory
participant ExecutableMemory
User->>Assembler: new(ForeignMemoryBuffer)
Assembler->>LabelRegistry: new()
LabelRegistry-->>Assembler: registry
User->>Assembler: append(InstructionSeq)
Assembler->>Memory: try_extend(bytes)
Memory-->>Assembler: ok
User->>Assembler: current_named_label("target")
Assembler->>LabelRegistry: get_forward_named_label("target")
LabelRegistry-->>Assembler: LabelId
Assembler->>Memory: current position
Assembler->>LabelRegistry: define_label(id, offset)
LabelRegistry-->>Assembler: done
User->>Assembler: compile()
Assembler->>Assembler: build()
Assembler->>Memory: into_positioned_memory()
Memory->>PositionedMemory: convert
PositionedMemory-->>Assembler: PM
Assembler->>Builder: new(PM, base_addr)
Assembler->>Builder: build(named_labels, defined_labels, relocations)
Builder->>Builder: resolve named→u64
Builder->>Builder: compute label addresses
Builder->>Builder: apply each Rel64
Builder-->>Assembler: (PM, label_map)
Assembler->>PositionedMemory: into_executable_memory()
PositionedMemory->>ExecutableMemory: convert (via memmap2/make_exec)
ExecutableMemory-->>Assembler: ExecMem
Assembler-->>User: (ExecMem, label_map)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~55 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
ee98019 to
c0daa2c
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@harm-runtime/src/labels.rs`:
- Around line 49-83: The current label APIs (define_label, define_named_label,
name_label) panic or silently overwrite; change them to return Result<LabelId,
LabelRegistryError> (or Result<(), ...> for define_label/name_label) and
introduce a LabelRegistryError enum with variants Unregistered(LabelId),
AlreadyDefined(LabelId), and NameConflict(String). Implement define_label to
check self.labels: return Err(Unregistered) if the id isn't registered,
Err(AlreadyDefined) if it's already an Offset, otherwise set LabelInfo::Offset
and return Ok. Change define_named_label to refuse to overwrite an existing
Offset or rebind an existing name: if name exists map to id and that id is
AlreadyDefined return Err(AlreadyDefined); if name absent but id exists check
and insert only if not already defined, else return Err(Unregistered) or
NameConflict accordingly; return Ok(id) on success. Change name_label to return
Result and only insert name if id is registered and the name isn't already bound
(return NameConflict if it is), otherwise return Unregistered; update all
Assembler call sites to propagate or handle these Results.
In `@harm-runtime/src/memory.rs`:
- Around line 30-37: The align method can panic when alignment == 0 due to pos %
alignment; modify align (the fn align(&mut self, alignment: usize) -> Result<(),
Self::ExtendError>) to explicitly guard against alignment == 0 (and optionally
treat alignment <= 1 as a no-op) before computing pos % alignment; if alignment
<= 1 simply return Ok(()) (or return an appropriate error) and otherwise proceed
to compute remn from self.pos() and call self.try_extend(... ) as before to
perform padding.
In `@harm-runtime/src/memory/memmap2.rs`:
- Around line 132-166: The into_fixed_memory implementations for Vec<u8> and for
&mut Vec<u8> must explicitly reject zero-length vectors to avoid
platform-specific failures from memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.
- Around line 175-190: The AArch64 test_mmap writes JIT code into memory (via
Mmap2Buffer::allocate -> into_fixed_memory -> into_executable_memory) but
doesn't flush the I-cache, so exec.as_ptr() may point to stale instructions; add
an I-cache flush before transmuting exec.as_ptr() to a function pointer by
adding the clear-cache crate to Cargo.toml and calling
clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len())) (or, if
you prefer FFI, declare and call libc __clear_cache with the same begin/end)
immediately before the unsafe transmute/func() call in test_mmap.
In `@harm-runtime/src/runtime.rs`:
- Around line 26-28: The public build() currently panics via todo!(); change its
API to avoid panics by returning a Result and introduce a BuildError enum (e.g.,
BuildError::Unimplemented) so callers receive a structured error until
implementation is added; update the function signature pub fn build(self) ->
Result<(), BuildError> (or gate behind a feature flag) and replace todo!() with
an Err(BuildError::Unimplemented) so no downstream caller is crashed by a panic.
| #[inline] | ||
| fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> { | ||
| let mut mem = Mmap2FixedMemory::allocate(self.len())?; | ||
| mem.as_mut().copy_from_slice(&self); | ||
| Ok(mem) | ||
| } | ||
| } | ||
|
|
||
| impl Memory<Mmap2FixedMemory> for &mut Vec<u8> { | ||
| type ExtendError = Infallible; | ||
| type FixedMemoryError = std::io::Error; | ||
|
|
||
| #[inline] | ||
| fn pos(&self) -> usize { | ||
| self.len() | ||
| } | ||
|
|
||
| #[inline] | ||
| fn capacity(&self) -> Option<usize> { | ||
| None | ||
| } | ||
|
|
||
| #[inline] | ||
| fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError> { | ||
| self.extend(bytes); | ||
| Ok(()) | ||
| } | ||
|
|
||
| #[inline] | ||
| fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> { | ||
| let mut mem = Mmap2FixedMemory::allocate(self.len())?; | ||
| // The memmap2 spec doesn't say that the length can be different... | ||
| mem.as_mut().copy_from_slice(self); | ||
| Ok(mem) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "*.rs" | head -20Repository: monoid/harm
Length of output: 1026
🏁 Script executed:
rg "Mmap2FixedMemory" -t rust -A 5Repository: monoid/harm
Length of output: 4285
🌐 Web query:
memmap2 map_anon(0) behavior zero-length platform-dependent
💡 Result:
memmap2::MmapMut::map_anon(0) (or MmapOptions::new().len(0).map_anon()) has no stable, cross-platform meaning: memmap2 does not guarantee that a zero-length anonymous mapping is supported; it just forwards the request to the OS, so the result is platform/OS-kernel dependent.
memmap2explicitly notes that if you don’t set a length, you’ll create a zero-length anonymous mapping attempt. Whether that succeeds is then up to the underlying system call. [1]- On Linux/POSIX-style
mmap, the length must be greater than 0; Linux documents this requirement, andmmap(…, length=0, …)fails (EINVAL on modern Linux). [2]
Practical takeaway: if you need “empty buffer” semantics, special-case len == 0 and use an empty Vec<u8> / slice instead of trying to create a 0-byte memmap.
Sources: [1] [2]
🏁 Script executed:
rg "fn allocate" harm-runtime/src/memory/memmap2.rs -A 10Repository: monoid/harm
Length of output: 616
🏁 Script executed:
cat -n harm-runtime/src/memory/memmap2.rs | head -100Repository: monoid/harm
Length of output: 2897
Reject zero-length vector conversions explicitly.
memmap2::MmapMut::map_anon(0) behavior is platform-dependent and fails on Linux/POSIX with EINVAL. Explicitly checking for zero-length in both Vec<u8> and &mut Vec<u8> conversions avoids silent platform-specific failures.
Suggested fix
fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
+ if self.is_empty() {
+ return Err(std::io::Error::new(
+ std::io::ErrorKind::InvalidInput,
+ "cannot convert empty vector to fixed memory",
+ ));
+ }
let mut mem = Mmap2FixedMemory::allocate(self.len())?;
mem.as_mut().copy_from_slice(&self);
Ok(mem)
}Apply the same check to the &mut Vec<u8> implementation.
🤖 Prompt for AI Agents
In `@harm-runtime/src/memory/memmap2.rs` around lines 132 - 166, The
into_fixed_memory implementations for Vec<u8> and for &mut Vec<u8> must
explicitly reject zero-length vectors to avoid platform-specific failures from
memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.
c0daa2c to
686ad9a
Compare
|
@coderabbitai please, review. |
|
Tip For best results, initiate chat on the files or code changes.
[review] |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
harm/src/reloc.rs (1)
463-472:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject out-of-range PC-relative deltas instead of wrapping them.
wrapping_sub(...).cast_signed()turns any unsigned under/overflow into a plausiblei64. ForPREL64and the page-based relocations, that silently encodes the wrong target whenS - Pfalls outside the signed 64-bit range instead of surfacing overflow.Possible fix
pub enum Rel64Error { NotEnoughMemory { offset: usize }, InvalidOffset { offset: usize }, + Overflow, InvalidValue(::core::num::TryFromIntError), InvalidBits(BitError), }pub fn calc_delta(base: u64, value: u64, offset: usize) -> Result<Offset64, Rel64Error> { let offset64 = offset .try_into() .map_err(|_e| Rel64Error::InvalidOffset { offset })?; let instruction_addr = base .checked_add(offset64) .ok_or(Rel64Error::InvalidOffset { offset })?; - Ok(value.wrapping_sub(instruction_addr).cast_signed()) + let delta = i128::from(value) - i128::from(instruction_addr); + delta.try_into().map_err(|_| Rel64Error::Overflow) }pub fn calc_page_offset(base: u64, value: u64, offset: usize) -> Result<Offset64, Rel64Error> { const PAGE_MASK: u64 = !0xfff; let offset64 = offset .try_into() .map_err(|_e| Rel64Error::InvalidOffset { offset })?; let instruction_addr = base .checked_add(offset64) .ok_or(Rel64Error::InvalidOffset { offset })?; - Ok((value & PAGE_MASK) - .wrapping_sub(instruction_addr & PAGE_MASK) - .cast_signed()) + let delta = i128::from(value & PAGE_MASK) - i128::from(instruction_addr & PAGE_MASK); + delta.try_into().map_err(|_| Rel64Error::Overflow) }Also applies to: 479-491
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@harm/src/reloc.rs` around lines 463 - 472, The current calc_delta function uses value.wrapping_sub(instruction_addr).cast_signed(), which hides overflow by wrapping; change it to detect unsigned subtraction overflow and reject results outside the signed 64-bit range: compute value.checked_sub(instruction_addr) (or detect via value >= instruction_addr), return Rel64Error::Overflow (or a suitable Rel64Error variant) if subtraction would underflow/overflow, and only then cast the u64 difference into i64 via a checked conversion; apply the same fix to the other similar helper(s) referenced around lines 479-491 (the other calc_delta-like function) so PREL64 and page-based relocations return an error instead of silently wrapping.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@harm-runtime/src/lib.rs`:
- Around line 7-13: The crate advertises an "alloc" feature but never enables
no_std and modules builder, labels, and runtime unconditionally import
std::collections::HashMap; fix by adding a crate-level conditional no_std
attribute (e.g. #![cfg_attr(not(feature = "std"), no_std)]) and then either (A)
change those modules (builder, labels, runtime) to conditionally use
std::collections::HashMap only when the "std" feature is enabled and use alloc
equivalents (alloc::collections or hashbrown behind the "alloc" feature)
otherwise, or (B) gate the entire modules builder, labels, runtime behind
cfg(feature = "std") so they are not compiled in no_std builds; ensure the
feature flags in Cargo.toml reflect this (do not advertise alloc as enabling
no_std unless the code paths compile without std).
In `@harm-runtime/src/memory.rs`:
- Around line 30-31: The trait method try_extend currently allows partial
writes; update the contract and API so callers can rely on atomic add semantics:
change try_extend in the Memory trait to explicitly guarantee all-or-nothing
behavior (no bytes are appended on Err) and document that ExtendError implies no
mutation, or alternatively add a new capacity-checked method (e.g.
try_reserve_and_extend or try_extend_exact) that first checks capacity and only
performs the write when it can complete fully; update implementations to honor
the new guarantee and adjust callers like Assembler::append to use the
all-or-nothing try_extend (or the new capacity-checked API) so buffer and
relocation bookkeeping remain in sync.
In `@harm-runtime/src/memory/foreign_memory.rs`:
- Line 18: Remove the unused lifetime parameter on the impl block for
ForeignMemoryBuffer: change the impl<'mem> ForeignMemoryBuffer { ... }
declaration to impl ForeignMemoryBuffer { ... } so the extraneous 'mem lifetime
is not declared; ensure no methods or associated items rely on a lifetime
parameter named 'mem before removing it.
In `@harm-runtime/src/runtime.rs`:
- Around line 58-68: Builder::build currently uses wrapping_add_signed when
computing label addresses and relocation addends, causing labels near u64::MAX
to wrap instead of surfacing an overflow; update the arithmetic in
Builder::build (and any helper in builder.rs that computes label_address or
relocation addend) to use checked/overflow-aware addition (e.g.,
checked_add/checked_sub or explicit overflow checks) and return
BuilderError::AddressOverflow on overflow instead of allowing wrapping; ensure
the same change is applied to all places in builder.rs that call
wrapping_add_signed so label computations and relocation addends propagate the
AddressOverflow error.
- Line 6: runtime.rs (and the similar modules builder.rs and labels.rs)
currently import std::collections::HashMap which breaks no_std+alloc builds;
replace the unconditional import with an alloc-compatible alternative or gate
it: either change uses of HashMap to an alloc-friendly collection (e.g., use
alloc::collections::BTreeMap and update type names/usages accordingly in
runtime.rs, builder.rs, labels.rs) or wrap the existing
std::collections::HashMap imports and any functions/types that depend on them
with #[cfg(any(feature = "alloc", feature = "std"))] and provide a
no_std-friendly implementation or stub for the alloc case. Target the unique
symbol "HashMap" in those files and update all type references and
constructors/usages to match the chosen collection or add the cfg attributes
consistently across runtime.rs, builder.rs, and labels.rs.
---
Outside diff comments:
In `@harm/src/reloc.rs`:
- Around line 463-472: The current calc_delta function uses
value.wrapping_sub(instruction_addr).cast_signed(), which hides overflow by
wrapping; change it to detect unsigned subtraction overflow and reject results
outside the signed 64-bit range: compute value.checked_sub(instruction_addr) (or
detect via value >= instruction_addr), return Rel64Error::Overflow (or a
suitable Rel64Error variant) if subtraction would underflow/overflow, and only
then cast the u64 difference into i64 via a checked conversion; apply the same
fix to the other similar helper(s) referenced around lines 479-491 (the other
calc_delta-like function) so PREL64 and page-based relocations return an error
instead of silently wrapping.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c8390c73-17d7-41ee-ad13-26656c1b6036
⛔ Files ignored due to path filters (1)
.DS_Storeis excluded by!**/.DS_Store
📒 Files selected for processing (13)
harm-runtime/Cargo.tomlharm-runtime/src/builder.rsharm-runtime/src/labels.rsharm-runtime/src/lib.rsharm-runtime/src/memory.rsharm-runtime/src/memory/foreign_memory.rsharm-runtime/src/memory/memmap2.rsharm-runtime/src/runtime.rsharm/src/reloc.rsharm/src/reloc/addr.rsharm/src/reloc/control.rsharm/src/reloc/data.rsharm/src/reloc/movs.rs
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
harm-runtime/src/memory.rs (1)
28-29:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftMake
try_extendatomic-on-error in the trait contract.At Line 28, the docs describe capacity failure but still don’t guarantee rollback/no mutation on
Err. This leaves callers unable to safely assume append atomicity.Proposed contract tightening
pub trait Memory { type ExtendError; @@ - /// Append data to the memory. Should fail when it reaches memory's capacity. + /// Append data to memory. + /// + /// Implementations must be all-or-nothing: + /// - `Ok(())`: all bytes were appended. + /// - `Err(_)`: no bytes were appended and position is unchanged. + /// + /// Should fail when the write would exceed fixed capacity. fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError>;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harm-runtime/src/memory.rs` around lines 28 - 29, The try_extend trait method currently documents capacity failure but doesn't guarantee rollback on error; update the Memory trait's try_extend contract to require atomicity on error (no mutation of memory/state when Err is returned) and specify that implementations must either pre-check capacity or buffer the incoming bytes and only commit on success, and clarify behavior of ExtendError when partial data would have been added; update implementations of try_extend to follow this (e.g., in methods named try_extend) and add unit tests asserting no state change on Err.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@harm-runtime/src/memory.rs`:
- Around line 71-99: The tests call Memory::align(...) but ignore its Result;
change each call in the test functions (e.g., test_align and
test_align_corner_case) to assert success by unwrapping the Result with
.expect("Memory::align failed") (or similar) so failures are explicit and the
compiler warning about unused Results is resolved; update every occurrence of
Memory::align(&mut data, ...) in these tests to use .expect(...) instead of
dropping the Result.
---
Duplicate comments:
In `@harm-runtime/src/memory.rs`:
- Around line 28-29: The try_extend trait method currently documents capacity
failure but doesn't guarantee rollback on error; update the Memory trait's
try_extend contract to require atomicity on error (no mutation of memory/state
when Err is returned) and specify that implementations must either pre-check
capacity or buffer the incoming bytes and only commit on success, and clarify
behavior of ExtendError when partial data would have been added; update
implementations of try_extend to follow this (e.g., in methods named try_extend)
and add unit tests asserting no state change on Err.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 860f9cae-0159-4ae6-86e6-75a7704c1d6c
📒 Files selected for processing (6)
harm-runtime/src/builder.rsharm-runtime/src/labels.rsharm-runtime/src/lib.rsharm-runtime/src/memory.rsharm-runtime/src/memory/foreign_memory.rsharm-runtime/src/memory/memmap2.rs
🚧 Files skipped from review as they are similar to previous changes (4)
- harm-runtime/src/memory/foreign_memory.rs
- harm-runtime/src/builder.rs
- harm-runtime/src/memory/memmap2.rs
- harm-runtime/src/labels.rs
Do not update memory pos if appending fails.
ea975a7 to
a41c666
Compare
|
@coderabbitai please, review. |
|
✅ Actions performedReview triggered.
|
Memory buffers for collecting instructions, applying relocation and execution.
Ref #31.
Summary by CodeRabbit
New Features
Refactor
Tests
Chores