feat(ast-grep): Add experimental ast-grep Wasm plugin#435

Draft
kdy1 wants to merge 5 commits into
mainfrom
kdy1/ast-grep

Conversation

@kdy1
Member

@kdy1 kdy1 commented Apr 7, 2025

I'm trying to see if we can support ast-grep through a Wasm plugin, so that users can modify the AST with ast-grep instead of building their own plugin.

@kdy1 kdy1 self-assigned this Apr 7, 2025
@changeset-bot

changeset-bot Bot commented Apr 7, 2025

⚠️ No Changeset found

Latest commit: dbb1c38

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@kdy1
Member Author

kdy1 commented Apr 7, 2025

cc @HerringtonDarkholme for visibility

@HerringtonDarkholme

Hi Donny!

For some background, ast-grep currently relies on tree-sitter as its parser, so using it with SWC would mean including another parser. Also, tree-sitter cannot be compiled to WASM using Rust; it has to be compiled with emscripten.

However, if ast-grep can be changed to be parser independent, say, by providing traits for the AST node and the parser, it is possible to compile it together with SWC into a standalone WASM module.
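To illustrate the "trait for AST node" idea, here is a minimal sketch. The `AstNode` trait, `find_all` function, and `ToyNode` backend are hypothetical names invented for this example; they are not ast-grep's or SWC's real API. The point is only that a search written against a trait does not care which parser produced the tree.

```rust
/// Hypothetical parser-independent node interface (not ast-grep's real trait).
pub trait AstNode: Sized {
    /// Node kind, e.g. "call" or "identifier".
    fn kind(&self) -> &str;
    /// Direct children, in source order.
    fn children(&self) -> Vec<Self>;
}

/// Collects every node of the given kind in pre-order, for any backend.
pub fn find_all<N: AstNode>(root: N, kind: &str) -> Vec<N> {
    let mut found = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        // Push children in reverse so they are popped in source order.
        for child in node.children().into_iter().rev() {
            stack.push(child);
        }
        if node.kind() == kind {
            found.push(node);
        }
    }
    found
}

/// Toy backend standing in for a real parser's tree (assumption for the demo).
#[derive(Clone, Debug, PartialEq)]
pub struct ToyNode {
    pub kind: String,
    pub children: Vec<ToyNode>,
}

impl AstNode for ToyNode {
    fn kind(&self) -> &str {
        &self.kind
    }
    fn children(&self) -> Vec<Self> {
        self.children.clone()
    }
}

/// Convenience constructor for the toy backend.
pub fn toy(kind: &str, children: Vec<ToyNode>) -> ToyNode {
    ToyNode { kind: kind.to_string(), children }
}
```

A second backend would only need its own `impl AstNode`; `find_all` stays unchanged.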

It will take quite a lot of effort, but it is not impossible! I can do some experiments in the coming weekend and give you a rough estimate of how much time it will take.

@HerringtonDarkholme

Hi Donny, I'm working on ast-grep's parser abstraction in this branch now: ast-grep/ast-grep#1940
The PR is still WIP, but it looks promising that ast-grep can be made independent of tree-sitter.

@HerringtonDarkholme

Hi Donny, ast-grep is now fully independent of the tree-sitter parser. See https://github.com/ast-grep/ast-grep/blob/290b31e6e44a9891f99d243b57c4ae1bdbaa340f/crates/core/Cargo.toml#L20

However, integrating swc still needs significant changes in swc's plugin system.

  1. First, ast-grep needs implementations of two core traits: SgNode and Doc. https://github.com/ast-grep/ast-grep/blob/290b31e6e44a9891f99d243b57c4ae1bdbaa340f/crates/core/src/source.rs#L28-L67. For now, swc's AST has many structs/enums that would need changes to implement these traits. If swc can provide a "type-less" AST view, they would be easier to implement.

  2. Second, swc's plugins are based on Visitor or Fold, but ast-grep's Matcher and Replacer only process one node at a time. It is possible to use find_all and replace_all on the root program, or to use swc's Visitor to match an ast-grep rule against every node. The difference is who traverses the AST: the first approach uses ast-grep and the second uses swc.

  3. Third, swc's plugins transform code by returning new AST nodes from Visitor/Transformation methods, but ast-grep's changes are string-based. This also needs changes in swc.
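To make the third point concrete, the sketch below shows what a string-based change looks like: a list of byte-range edits applied to the source text, with no AST node returned. `TextEdit` and `apply_edits` are hypothetical names for this illustration and are not taken from ast-grep or swc.

```rust
/// Hypothetical edit: replace the bytes in `start..end` with `text`.
#[derive(Debug, Clone)]
pub struct TextEdit {
    pub start: usize,
    pub end: usize,
    pub text: String,
}

/// Applies non-overlapping edits to `source` and returns the patched text.
/// Returns `None` if edits overlap, run past the end of the source,
/// or do not fall on UTF-8 character boundaries.
pub fn apply_edits(source: &str, mut edits: Vec<TextEdit>) -> Option<String> {
    // Process edits from left to right regardless of the order they arrive in.
    edits.sort_by_key(|e| e.start);
    let mut out = String::with_capacity(source.len());
    let mut cursor = 0;
    for edit in &edits {
        if edit.start < cursor || edit.end < edit.start || edit.end > source.len() {
            return None;
        }
        // Copy the untouched text before this edit, then the replacement.
        out.push_str(source.get(cursor..edit.start)?);
        out.push_str(&edit.text);
        cursor = edit.end;
    }
    // Copy whatever follows the last edit.
    out.push_str(source.get(cursor..)?);
    Some(out)
}
```

A Visitor/Fold-style plugin would instead hand back a rewritten node for each visited node, so bridging the two models means either re-parsing the patched text or translating each text edit into an AST replacement.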

@HerringtonDarkholme

HerringtonDarkholme commented May 4, 2025

This is an example of using oxc as parser. ast-grep/ast-grep#1970

oxc is used because it has an untyped AST which makes the integration faster. But the general idea is three steps:

  1. Implement the Language trait https://github.com/ast-grep/ast-grep/blob/26cffdd127b7cf659a14f1a030e971c520709a81/crates/oxc/src/binding.rs#L183 The Language trait is used to parse Patterns and to map a numeric kind_id to a human-readable kind string and vice versa.
  2. Implement the Doc trait https://github.com/ast-grep/ast-grep/blob/26cffdd127b7cf659a14f1a030e971c520709a81/crates/oxc/src/binding.rs#L502-L516 The Doc trait is used to maintain the source code representation and parsing.
  3. Implement the tree node traversal methods in SgNode https://github.com/ast-grep/ast-grep/blob/26cffdd127b7cf659a14f1a030e971c520709a81/crates/oxc/src/binding.rs#L405
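The three steps can be sketched on a toy language as follows. Everything here (`KindTable`, `ToyLang`, `Node`, `ToyDoc`) is a hypothetical stand-in chosen for illustration; the real Language, Doc and SgNode traits in the linked files have different and larger signatures, and the sketch leaves out pattern parsing entirely.

```rust
/// Step 1 (simplified): map numeric kind ids to readable names and back.
pub trait KindTable {
    fn kind_to_id(&self, kind: &str) -> Option<u16>;
    fn id_to_kind(&self, id: u16) -> Option<&'static str>;
}

pub struct ToyLang;

const KINDS: [&str; 3] = ["program", "number", "identifier"];

impl KindTable for ToyLang {
    fn kind_to_id(&self, kind: &str) -> Option<u16> {
        KINDS.iter().position(|k| *k == kind).map(|i| i as u16)
    }
    fn id_to_kind(&self, id: u16) -> Option<&'static str> {
        KINDS.get(id as usize).copied()
    }
}

/// Step 3 (simplified): an untyped node with a kind id, a byte range and children.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub kind_id: u16,
    pub start: usize,
    pub end: usize,
    pub children: Vec<Node>,
}

/// Step 2 (simplified): the document owns the source text and the parsed tree.
pub struct ToyDoc {
    pub source: String,
    pub root: Node,
}

impl ToyDoc {
    /// Toy "parser": space-separated tokens; all-digit tokens are numbers,
    /// everything else is an identifier.
    pub fn parse(source: &str) -> ToyDoc {
        let mut children = Vec::new();
        let mut offset = 0;
        for token in source.split(' ') {
            let (start, end) = (offset, offset + token.len());
            offset = end + 1; // step over the separating space
            if token.is_empty() {
                continue;
            }
            let kind_id = if token.bytes().all(|b| b.is_ascii_digit()) { 1 } else { 2 };
            children.push(Node { kind_id, start, end, children: Vec::new() });
        }
        let root = Node { kind_id: 0, start: 0, end: source.len(), children };
        ToyDoc { source: source.to_string(), root }
    }

    /// Source text covered by a node.
    pub fn text(&self, node: &Node) -> &str {
        &self.source[node.start..node.end]
    }
}
```

With an untyped tree like this, one `Node` type covers every syntax category, which is the property that made the oxc integration faster according to the comment above.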

@kdy1 kdy1 force-pushed the kdy1/ast-grep branch from 5d6dfa6 to 3fe3733 Compare May 7, 2025 15:25
@0xdevalias

However, integrating swc still needs significant changes in swc's plugin system.

A few notes / context that may be relevant for anyone who stumbles upon this and is curious:

Regarding the comments about swc's AST implementation making it complicated to use as an ast-grep backend: those would have been based on the current/older swc_ecma_* parser and AST. I recently discovered (see the notes in the swc section of pionxzh/wakaru#152 (comment)) that there is a newer swc_es_* parser and AST, so I was curious whether that might make this sort of integration any easier.

Asking GitHub Copilot (GPT-5.4) to explore that a bit:

Short answer: yes, something meaningful has changed in swc since that April/May 2025 discussion — but mostly on the AST/parsing side, not yet enough to make ast-grep integration “just work.” The newer experimental swc_es_* stack looks materially more promising for ast-grep than the old swc_ecma_* stack, because it introduces a newer arena-backed, handle-based AST plus its own visitor/fold ecosystem.

What got better for ast-grep

The new swc_es_* design is closer to the kind of “type-lighter view” discussed in the PR comments.

Why:

  • swc_es_ast uses typed IDs and an arena store instead of the old pervasive owned tree style. That makes it easier to define a uniform node interface over “node ids + store” than over many nested Rust structs/enums.
  • The new AST seems intentionally simpler and more regular in places. For example, statements, declarations, expressions, TS types, etc. are separated into enums plus ID-backed references, which is more amenable to generic traversal/adaptation.
  • A dedicated swc_es_visit already exists, so traversal over this arena-backed AST is a first-class concept rather than something ast-grep would have to invent itself.

That means: if someone were going to build an ast-grep adapter for SWC, swc_es_ast is a better substrate than swc_ecma_ast.
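As a rough illustration of why "node ids + store" lends itself to a uniform node interface, here is a minimal arena sketch. `NodeId`, `NodeData`, `Store` and `NodeRef` are hypothetical names for this example and are not the actual swc_es_ast types, whose layout was not checked here.

```rust
/// Typed handle into the arena (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(u32);

/// Node payloads refer to other nodes by id instead of owning them.
#[derive(Debug)]
pub enum NodeData {
    Number(f64),
    Ident(String),
    Binary { op: char, left: NodeId, right: NodeId },
}

/// Arena that owns every node.
#[derive(Default)]
pub struct Store {
    nodes: Vec<NodeData>,
}

impl Store {
    pub fn alloc(&mut self, data: NodeData) -> NodeId {
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(data);
        id
    }
    pub fn get(&self, id: NodeId) -> &NodeData {
        &self.nodes[id.0 as usize]
    }
}

/// Uniform view over any node: an id plus the store it lives in.
#[derive(Clone, Copy)]
pub struct NodeRef<'a> {
    pub id: NodeId,
    pub store: &'a Store,
}

impl<'a> NodeRef<'a> {
    pub fn kind(&self) -> &'static str {
        match self.store.get(self.id) {
            NodeData::Number(_) => "number",
            NodeData::Ident(_) => "identifier",
            NodeData::Binary { .. } => "binary_expression",
        }
    }
    pub fn children(&self) -> Vec<NodeRef<'a>> {
        match self.store.get(self.id) {
            NodeData::Binary { left, right, .. } => vec![
                NodeRef { id: *left, store: self.store },
                NodeRef { id: *right, store: self.store },
            ],
            _ => Vec::new(),
        }
    }
}

/// Generic traversal written only against the uniform view.
pub fn count_kind(node: NodeRef<'_>, kind: &str) -> usize {
    let own = usize::from(node.kind() == kind);
    own + node.children().into_iter().map(|c| count_kind(c, kind)).sum::<usize>()
}
```

Because `NodeRef` is a small `Copy` value, an adapter can hand it out freely during traversal, whereas a tree of nested owned structs would need a separate wrapper per node type.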

What probably has not been solved yet

The PR comment’s objections were not only “parser/AST shape” objections.

  1. SgNode / Doc still need an adapter layer
    I did not find evidence in swc that the new AST already implements ast-grep-style generic source/node traits, or that there is an existing adapter for SgNode/Doc. The new AST being cleaner helps, but it does not remove the need to define:

    • parent/child traversal semantics
    • node kind taxonomy
    • source text / byte-range / span-backed document operations
    • possibly “typeless” matching views across many node categories
  2. String-based replacement vs AST-returning transforms still looks like the big architectural mismatch
    The original comment said ast-grep replacement is string-based while SWC plugins transform by returning/replacing AST nodes. I do not see evidence that SWC’s plugin model has shifted to a text-patch-oriented transform model. The new swc_es_visit still exposes Visit/Fold-style AST traversal and rewrite hooks, which is philosophically closer to traditional compiler transforms than to ast-grep’s textual rewrite engine.

Would swc_es_* work better?

Yes, probably better — but still as an integration project, not an off-the-shelf fit.

If I had to rank feasibility today:

  1. Best candidate: build ast-grep integration against swc_es_ast / swc_es_parser
  2. Less attractive: integrate against swc_ecma_ast
  3. Still hard either way: use it as a normal SWC Wasm plugin without adapting replacement semantics

The strongest reasons swc_es_* looks better are:

  • arena-backed handles
  • cleaner next-gen AST boundary
  • dedicated new visitor/fold crate
  • end-to-end pipeline crates around it (parser, visit, transforms, codegen, semantics)

Originally posted by @0xdevalias in pionxzh/wakaru#111 (comment)
