Skip RBS rewrite when no markers are present #2616

Open: mattkubej wants to merge 1 commit into main from rbs-marker-guard

Contributor

@mattkubej mattkubej commented May 7, 2026

Summary

Avoid running Spoom's RBS-to-Sorbet-sig translator for typed Ruby files that cannot contain runtime-rewriteable RBS syntax.

Tapioca currently calls Spoom::Sorbet::Translate.rbs_comments_to_sorbet_sigs for every typed Ruby file loaded through the RBS rewriter. A large portion of typed files in Core do not contain RBS signatures or supported RBS annotations, so the translator often parses source only to return it unchanged.

This keeps the existing typed: gate and adds a conservative marker gate. Translation now runs only when source contains one of:

  • #:
  • #|
  • # @abstract
  • # @interface
  • # @sealed
  • # @final
  • # @requires_ancestor:
  • # @override
  • # @overridable
  • # @without_runtime

False positives are safe because they still use the existing translator. False negatives would be unsafe, so the annotation list is intentionally aligned with Spoom's currently supported runtime-rewrite annotations.
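A minimal sketch of the two-stage gate described above, for illustration only. The marker list and the `Regexp.union` constant mirror what is quoted in this PR; the `TYPED_FILE_PATTERN` value, the `rewrite_if_needed` helper, and the block standing in for the Spoom translator are hypothetical and are not taken from Tapioca's source.

```ruby
# Assumption: some `# typed:` sigil pattern already exists; this value is a stand-in.
TYPED_FILE_PATTERN = /^#\s*typed:\s*\w+/

RBS_ANNOTATION_MARKERS = [
  "# @abstract",
  "# @interface",
  "# @sealed",
  "# @final",
  "# @requires_ancestor:",
  "# @override",
  "# @overridable",
  "# @without_runtime",
].freeze

# Regexp.union escapes String arguments, so the markers are matched literally.
RBS_REWRITE_PATTERN = Regexp.union(["#:", "#|"] + RBS_ANNOTATION_MARKERS).freeze

# Hypothetical helper: the block stands in for the expensive translator call.
def rewrite_if_needed(source, &translator)
  return source unless source.match?(TYPED_FILE_PATTERN)   # existing typed: gate
  return source unless source.match?(RBS_REWRITE_PATTERN)  # new marker gate

  translator.call(source)
end

translator_calls = 0
fake_translator = ->(src) { translator_calls += 1; src }

rewrite_if_needed("# typed: true\nclass Foo; end\n", &fake_translator)
rewrite_if_needed("# typed: true\n#: -> void\ndef foo; end\n", &fake_translator)
puts translator_calls
```

Only the second call reaches the stand-in translator. A file that merely mentions a marker inside a string literal would also reach it, which is the safe false-positive case.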

@mattkubej mattkubej force-pushed the rbs-marker-guard branch from c06244a to 3ad4466 Compare May 7, 2026 23:24
@mattkubej mattkubej added the enhancement New feature or request label May 8, 2026
@mattkubej mattkubej mentioned this pull request May 8, 2026
@mattkubej mattkubej marked this pull request as ready for review May 11, 2026 16:16
@mattkubej mattkubej requested a review from a team as a code owner May 11, 2026 16:16
Comment on lines +63 to +68
def possible_rbs_runtime_rewrite_syntax?(source)
  return true if source.include?("#:") || source.include?("#|")
  return false unless source.include?("# @")

  RBS_ANNOTATION_MARKERS.any? { |marker| source.include?(marker) }
end
Member

This seems to scan the source string multiple times. Can we turn the search into a Regexp.union and do a single scan through the source?

Member

I think something like:

SEARCH_REGEXP = Regexp.union(TYPED_FILE_PATTERN, Regexp.escape("#:"), Regexp.escape("#|"), *RBS_ANNOTATION_MARKERS.map { Regexp.escape(it) })

def should_rewrite?
  source.index(SEARCH_REGEXP)
end

should give you the decision to rewrite or not with a single pass.

Contributor Author

Good call on the redundant scans. I just pushed an amended commit.

However, I kept typed_file? as a separate stage rather than folding it into the union. A single union would introduce OR semantics, so a typed file with no markers would still match, and I think we want to avoid the expensive Spoom translator run in that case.
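To make the OR-versus-AND point concrete, here is a small sketch. The sigil pattern and the abbreviated marker list are assumptions chosen for illustration, not the constants from the PR.

```ruby
# Assumed sigil pattern and a shortened marker list, for illustration only.
typed_pattern  = /^#\s*typed:\s*\w+/
marker_pattern = Regexp.union("#:", "#|", "# @abstract")

typed_without_markers = "# typed: true\nclass Foo; end\n"

# Folding both into one union ORs the alternatives together.
folded = Regexp.union(typed_pattern, marker_pattern)
folded_decision = typed_without_markers.match?(folded)

# Keeping the stages separate ANDs them.
staged_decision = typed_without_markers.match?(typed_pattern) &&
  typed_without_markers.match?(marker_pattern)

puts folded_decision # => true, the translator would still run
puts staged_decision # => false, the translator is skipped
```

With the folded pattern, the `# typed:` sigil alone is enough to match, so every typed file would still reach the translator.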

Member

Ah, right, I forgot that the typed check wasn't being OR'ed. Good call.

Member

@paracycle paracycle left a comment

Looks good to me, but I have one question.

"# @overridable",
"# @without_runtime",
].freeze #: Array[String]
RBS_REWRITE_PATTERN = Regexp.union(["#:", "#|"] + RBS_ANNOTATION_MARKERS).freeze #: Regexp
Member

Are you sure that none of these have special Regexp characters, or does Regexp.union do the right thing? I am surprised that we don't need Regexp.escape here.

Contributor Author

Just sanity checked this. Looks like Regexp.union runs each String arg through Regexp.escape before joining.

Looking at the Ruby implementation, it looks like Regexp.escape and Regexp.quote share the same C function.

A check comparing the union's source against a manually escaped and joined version:

markers = ["#:", "#|", "# @abstract", "# @interface", "# @sealed", "# @final", "# @requires_ancestor:", "# @override", "# @overridable", "# @without_runtime"]
pattern = Regexp.union(markers)
pattern.source == markers.map { |m| Regexp.escape(m) }.join("|")
# => true
pattern.source
# => "\#:|\#\||\#\ @abstract|\#\ @interface|\#\ @sealed|\#\ @final|\#\ @requires_ancestor:|\#\ @override|\#\ @overridable|\#\ @without_runtime"

Documentation doesn't explicitly state this behavior, but the example seems to highlight this escaping behavior as well: https://docs.ruby-lang.org/en/3.4/Regexp.html#method-c-union
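A related consequence, sketched here as an aside rather than as anything from the PR's diff: because Regexp.union already escapes String arguments, escaping them by hand first would escape them twice. The variable names below are illustrative.

```ruby
auto_escaped   = Regexp.union("#|")                 # union escapes the String itself
double_escaped = Regexp.union(Regexp.escape("#|"))  # escaped once by hand, once by union

sample = "x = 1 #| continuation"

puts auto_escaped.match?(sample)   # => true
puts double_escaped.match?(sample) # => false

# The double-escaped pattern only matches text that contains literal backslashes.
puts double_escaped.match?("\\#\\|") # => true
```

So passing raw marker strings to Regexp.union, as the PR does, is the form that matches the markers literally.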
