fix: validate end_effector_frame to prevent OOB access on invalid frame name#53
Open
Nabil-Miri wants to merge 1 commit intolearnsyslab:mainfrom
Open
fix: validate end_effector_frame to prevent OOB access on invalid frame name#53Nabil-Miri wants to merge 1 commit intolearnsyslab:mainfrom
Nabil-Miri wants to merge 1 commit intolearnsyslab:mainfrom
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
If the
end_effector_frameparameter is set to a name that doesn't exist in the robot's URDF (typo, wrong frame, frame not yet published), the controller currently accepts the bad name silently and then misbehaves on the first control cycle. It is possible that the robot falls and trigger protective stop. I am not sure why the ur driver accepts such input. Depending on what happens to be in memory at that moment, the result is one of:ros2_control_nodeprocess crashes (segfault), orThis affects
CartesianController,PoseBroadcaster, andTwistBroadcaster. The root cause is thatpinocchio::Model::getFrameId("bad_name")doesn't throw on an invalid name — it returns a special "not found" value that, if used unchecked, indexes past the end of the internal data buffer.This PR adds a check at controller configuration time: if the frame name isn't found, the controller refuses to configure and prints a clear error message. It also adds two extra safety guards in
CartesianControllerto catch NaN/Inf if it sneaks in through other paths (e.g. corrupt joint states or a runtimetcp_offsetmessage).Reproducer
In any controllers YAML:
Upstream
mainactivates the controller with no warning:…and then exhibits one of several failure modes on the first
update()cycles.Observed failure modes (same config, real UR5e)
Three reproductions on identical hardware/config produced three distinct failures:
Run A — SIGSEGV in
pinocchio::computeFrameJacobianThe
ros2_control_nodeprocess dies on the firstupdate()cycle. Stack trace:The robot loses comms with the controller manager; the safety layer trips a protective stop on watchdog.
Run B — NaN torques, arm falls
The controller activates with no warning, then on
update():The out-of-bounds read decoded as a near-zero pose; the Jacobian against a non-existent frame collapsed to all zeros; orientation error contained denormalized garbage that propagated to NaN through
atan2/ quaternion log; finaltau_taskis NaN. The UR firmware rejected the NaN torques, leaving the arm unsupported — it fell under its own weight until the safety layer caught it.Root cause
src/cartesian_controller.cpp,on_configure:end_effector_frame_id = model_.getFrameId(params_.end_effector_frame); // ... no check that end_effector_frame_id < model_.nframesPer Pinocchio's documented behavior,
getFrameId(name)returnsnframeswhen no frame matches, and does not throw. The unguarded subsequent accesses inupdate():are undefined behavior. The same unguarded pattern exists in
pose_broadcaster.cppandtwist_broadcaster.cpp.Fix
Validate the frame ID in
on_configureand returnCallbackReturn::ERRORwith a clear message if the frame is unknown. This converts a runtime crash / silent misbehavior into a deterministic, actionable error:CartesianControllergets two additional guards (defense-in-depth):on_activate: refuse activation if the resolved end-effector pose is non-finite (catches NaN propagating from a parent transform).update(): throttled NaN/Inf check on the resolved end-effector pose; on detection, log and hold the previous torque command for that cycle.PoseBroadcasterandTwistBroadcasterget the sameon_configureframe-id validation.