This repository is the official implementation of ComAct: Reframing Professional Software Manipulation via COM-as-Action Paradigm. We identify the Component Object Model (COM) as a unified executable action space across professional software, which reframes GUI-fragile, API-fragmented automation as deterministic program synthesis. See the project page for demos and full results.
2026-06-20: 🎉 Paper and project page are publicly available.
🚧 Coming soon: a detailed guide to preparing the dataset, setting up the VM environment, and running agent training and evaluation end-to-end.
Here's a quick orientation:
| Component | Description | Link |
|---|---|---|
| Benchmark | 1,000 tasks across SolidWorks, Inventor, and AutoCAD, covering 7 engineering activities | 🤗 ComCADBench |
| Environment | Parallel Dockerized Windows VMs for training & evaluation | comforge/ |
| Agent | 9B self-correcting agent trained in 3 stages: text-to-code SFT → agentic SFT → GRPO | 🤗 ComActor |
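
To make the COM-as-action idea more concrete, here is a minimal sketch of what "an action is a short program against an object model" could look like, including the error feedback a self-correcting agent might consume. It is not the repository's code: `StubApp`, `StubSketch`, `AddCircle`, `ActionResult`, and `run_action` are hypothetical names, and the stub stands in for a real COM application so the example runs on any machine. On Windows with pywin32 installed, the `app` object would instead come from something like `win32com.client.Dispatch(prog_id)`.

```python
from dataclasses import dataclass


class StubSketch:
    """Hypothetical stand-in for a COM sketch object; records geometry calls."""

    def __init__(self):
        self.entities = []

    def AddCircle(self, cx, cy, radius):
        # Mimic an object model that rejects invalid geometry.
        if radius <= 0:
            raise ValueError("radius must be positive")
        self.entities.append(("circle", cx, cy, radius))


class StubApp:
    """Hypothetical stand-in for a COM application object."""

    def __init__(self):
        self.ActiveSketch = StubSketch()


@dataclass
class ActionResult:
    ok: bool
    feedback: str


def run_action(app, program: str) -> ActionResult:
    """Run one generated program against `app`; report success or the error text."""
    try:
        # The program only sees the name `app` (assumption: trusted sandbox).
        exec(program, {"app": app})
    except Exception as exc:
        return ActionResult(False, f"{type(exc).__name__}: {exc}")
    return ActionResult(True, "ok")


app = StubApp()

# A faulty action fails deterministically and yields readable feedback.
bad = run_action(app, "app.ActiveSketch.AddCircle(0, 0, -5)")

# A corrected action succeeds and changes the model state.
good = run_action(app, "app.ActiveSketch.AddCircle(0, 0, 5)")
```

Because the action is a program rather than a sequence of clicks, the same input always produces the same call sequence, and a failure surfaces as an exception message that can be fed back into the next attempt.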
We acknowledge the outstanding open-source contributions from Qwen3.5, ms-swift, Text2CAD, Fusion 360 Gallery, and SketchGraphs.
For any questions or feedback, please:
- Open an issue in the GitHub repository
- Reach out to us at julyai@whu.edu.cn
If you find our paper and code useful, please cite us:
@misc{ai2026comactreframingprofessionalsoftware,
  title={ComAct: Reframing Professional Software Manipulation via COM-as-Action Paradigm},
  author={Jiaxin Ai and Tao Hu and Xuemeng Yang and Shu Zou and Hairong Zhang and Daocheng Fu and Yu Yang and Hongbin Zhou and Nianchen Deng and Pinlong Cai and Zhongyuan Wang and Botian Shi and Kaipeng Zhang and Licheng Wen},
  year={2026},
  eprint={2606.13239},
  archivePrefix={arXiv},
  primaryClass={cs.SE},
  url={https://arxiv.org/abs/2606.13239},
}
