fix: validate end_effector_frame to prevent OOB access on invalid frame name #53

Open
Nabil-Miri wants to merge 1 commit into learnsyslab:main from Nabil-Miri:fix/validate-ee-frame

Nabil-Miri commented May 8, 2026

Summary

If the end_effector_frame parameter is set to a name that doesn't exist in the robot's URDF (typo, wrong frame, frame not yet published), the controller currently accepts the bad name silently and then misbehaves on the first control cycle. The robot may fall and trigger a protective stop. I am not sure why the UR driver accepts such input. Depending on what happens to be in memory at that moment, the result is one of:

  • the ros2_control_node process crashes (segfault), or
  • the controller commands invalid or garbage torques: the robot either rejects them (arm goes limp and falls until the safety layer catches it) or applies them (arm moves unexpectedly until a protective stop trips).

This affects CartesianController, PoseBroadcaster, and TwistBroadcaster. The root cause is that pinocchio::Model::getFrameId("bad_name") doesn't throw on an invalid name — it returns a special "not found" value that, if used unchecked, indexes past the end of the internal data buffer.

This PR adds a check at controller configuration time: if the frame name isn't found, the controller refuses to configure and prints a clear error message. It also adds two extra safety guards in CartesianController to catch NaN/Inf if it sneaks in through other paths (e.g. corrupt joint states or a runtime tcp_offset message).

Reproducer

In any controllers YAML:

cartesian_impedance_controller:
  ros__parameters:
    task:
      end_effector_frame: this_frame_does_not_exist

Then activate the controller:

ros2 control switch_controllers --activate cartesian_impedance_controller

Upstream main activates the controller with no warning:

[INFO] [cartesian_impedance_controller]: Controller activated.
[INFO] [controller_manager]: Successfully switched controllers!

…and then exhibits one of several failure modes on the first update() cycles.

Observed failure modes (same config, real UR5e)

Three reproductions on identical hardware/config produced three distinct failures; two of them are detailed below:

Run A — SIGSEGV in pinocchio::computeFrameJacobian

The ros2_control_node process dies on the first update() cycle. Stack trace:

[ros2_control_node-1] [INFO] [cartesian_impedance_controller]: Controller activated.
[ros2_control_node-1] [INFO] [controller_manager]: Successfully switched controllers!
[ros2_control_node-1] Stack trace (most recent call last):
[ros2_control_node-1] #4 controller_manager::ControllerManager::update(...)
[ros2_control_node-1] #3 controller_interface::ControllerInterfaceBase::trigger_update(...)
[ros2_control_node-1] #2 crisp_controllers::CartesianController::update(...)
[ros2_control_node-1] #1 pinocchio::computeFrameJacobian<double, 0, ...>(...)
[ros2_control_node-1] #0 Eigen::internal::call_dense_assignment_loop<
[ros2_control_node-1]      Eigen::Matrix<double, 3, 3, 0, 3, 3>, ...>(...)
[ros2_control_node-1] Segmentation fault (Signal sent by the kernel [(nil)])
[ERROR] [ros2_control_node-1]: process has died [pid 2290956, exit code -11, ...]

The robot loses communication with the controller manager, and the safety layer trips a protective stop on watchdog timeout.

Run B — NaN torques, arm falls

The controller activates with no warning, then on update():

[cartesian_impedance_controller]: end_effector_pos
0
0
0
[cartesian_impedance_controller]: J: 0 0 0 0 0 0
                                     0 0 0 0 0 0
                                     0 0 0 0 0 0
                                     0 0 0 0 0 0
                                     0 0 0 0 0 0
                                     0 0 0 0 0 0
[cartesian_impedance_controller]: error: -3.29056e-310  0  0  nan  nan  nan
[cartesian_impedance_controller]: tau_task: nan nan nan nan nan nan

The out-of-bounds read decoded as a near-zero pose; the Jacobian against a non-existent frame collapsed to all zeros; orientation error contained denormalized garbage that propagated to NaN through atan2 / quaternion log; final tau_task is NaN. The UR firmware rejected the NaN torques, leaving the arm unsupported — it fell under its own weight until the safety layer caught it.

Root cause

src/cartesian_controller.cpp, on_configure:

end_effector_frame_id = model_.getFrameId(params_.end_effector_frame);
// ... no check that end_effector_frame_id < model_.nframes

Per Pinocchio's documented behavior, getFrameId(name) returns nframes when no frame matches, and does not throw. The unguarded subsequent accesses in update():

end_effector_pose = data_.oMf[end_effector_frame_id];                 // out-of-bounds
pinocchio::computeFrameJacobian(..., end_effector_frame_id, ...);     // out-of-bounds

are undefined behavior. The same unguarded pattern exists in pose_broadcaster.cpp and twist_broadcaster.cpp.
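To see why the returned value is unsafe, note that an index equal to the container size is one past the last valid element. The sketch below uses a hypothetical `MockModel` whose `getFrameId` mimics the sentinel behavior described above (it is a stand-in, not Pinocchio itself), and uses `std::vector::at` to make the out-of-range access observable rather than undefined.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in that mimics the sentinel behavior described above.
struct MockModel {
  std::vector<std::string> frame_names;

  // Returns the frame index, or frame_names.size() when no frame matches.
  // Like the real call, it never throws on an unknown name.
  std::size_t getFrameId(const std::string& name) const {
    for (std::size_t i = 0; i < frame_names.size(); ++i) {
      if (frame_names[i] == name) return i;
    }
    return frame_names.size();
  }
};

// True if reading per-frame data at `id` would be out of bounds.
// at() throws here, whereas operator[] would be undefined behavior.
bool access_is_out_of_bounds(const std::vector<double>& per_frame_data,
                             std::size_t id) {
  try {
    (void)per_frame_data.at(id);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

With three frames, an unknown name resolves to index 3, which is exactly the size of any per-frame buffer and therefore the first invalid slot.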

Fix

Validate the frame ID in on_configure and return CallbackReturn::ERROR with a clear message if the frame is unknown. This converts a runtime crash / silent misbehavior into a deterministic, actionable error:

[ERROR] [cartesian_impedance_controller]: end_effector_frame
'this_frame_does_not_exist' is not present in the robot model.
Refusing to configure: activating with an invalid frame results in
undefined behavior (out-of-bounds access into pinocchio::Data,
manifesting as a segfault or NaN/Inf in computed torques).
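A minimal sketch of the configure-time check, under stated assumptions: `FrameLookup` is a hypothetical stand-in for the Pinocchio model, `CallbackReturn` is a local enum mirroring the lifecycle return values, and `configure_end_effector` is an illustrative helper. The actual patch may be structured differently.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Local stand-in for the lifecycle callback return type (assumption).
enum class CallbackReturn { SUCCESS, FAILURE, ERROR };

// Hypothetical stand-in for the model: unknown names map to the frame count.
struct FrameLookup {
  std::vector<std::string> frame_names;

  std::size_t getFrameId(const std::string& name) const {
    for (std::size_t i = 0; i < frame_names.size(); ++i) {
      if (frame_names[i] == name) return i;
    }
    return frame_names.size();
  }
};

// Resolve the frame once at configure time and refuse to continue if the
// lookup returned the "not found" sentinel. frame_id_out is only written
// on success, so a stale or invalid ID never reaches update().
CallbackReturn configure_end_effector(const FrameLookup& model,
                                      const std::string& frame_name,
                                      std::size_t& frame_id_out) {
  const std::size_t id = model.getFrameId(frame_name);
  if (id >= model.frame_names.size()) {
    std::cerr << "end_effector_frame '" << frame_name
              << "' is not present in the robot model. Refusing to configure.\n";
    return CallbackReturn::ERROR;
  }
  frame_id_out = id;
  return CallbackReturn::SUCCESS;
}
```

Doing the check at configure time keeps the real-time update() path free of string lookups. With Pinocchio itself, `model.existFrame(name)` should express the same condition.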

CartesianController gets two additional guards (defense-in-depth):

  • on_activate: refuse activation if the resolved end-effector pose is non-finite (catches NaN propagating from a parent transform).
  • update(): throttled NaN/Inf check on the resolved end-effector pose; on detection, log and hold the previous torque command for that cycle.
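The update() guard can be sketched as follows, assuming the resolved pose is available as seven doubles (position plus quaternion) and the torque command as a 6-vector. `TorqueGuard` and its members are illustrative names, and the throttled logging is omitted for brevity.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec6 = std::array<double, 6>;
using Pose7 = std::array<double, 7>;  // x, y, z, qx, qy, qz, qw (assumed layout)

// True only if every component is neither NaN nor +/-Inf.
template <std::size_t N>
bool all_finite(const std::array<double, N>& values) {
  for (double v : values) {
    if (!std::isfinite(v)) return false;
  }
  return true;
}

// Hypothetical guard: accept a new torque command only when the pose it was
// computed from is finite; otherwise hold the last accepted command.
class TorqueGuard {
 public:
  Vec6 filter(const Pose7& pose, const Vec6& new_command) {
    if (all_finite(pose)) {
      last_command_ = new_command;
    }
    return last_command_;
  }

 private:
  Vec6 last_command_{};  // zero torque until the first valid cycle
};
```

Holding the previous command for a cycle avoids sending NaN to the hardware; whether holding, zeroing, or deactivating is the safest reaction depends on the robot and is a design choice for the controller.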

PoseBroadcaster and TwistBroadcaster get the same on_configure frame-id validation.
