Your Question
I am running slime with the Qwen3.5 397B model, and weight transfer is taking too long. Profiling suggests that much of the delay comes from ramping down in-flight requests on the SGLang servers. Is it possible to avoid this ramp-down during weight transfer, similar to what the PipelineRL paper (https://arxiv.org/pdf/2509.19128) does for vLLM? Please let me know if this is already supported or if there is an open PR that adds it.
What I've Tried
I looked through the documentation and FAQ but could not find an implementation of this or a direct way to enable it in slime.
Environment (if relevant)
Additional Context
No response
Pre-submission Checklist