[DRAFT] [LOGIC] Standardize prompt_injection.json attack structure for orchestrator compatibility #19

Draft
joe-gemini-bot[bot] wants to merge 1 commit into master from bot/upgrade-1779224994

@joe-gemini-bot
Contributor

Problem / Gap

The dataset modelfang/datasets/prompt_injection.json uses inconsistent keys for prompt data: most attacks have a prompts array, but pi_injection_chain uses attack_chain (array of objects) and pi_context_injection uses a single prompt string. Any orchestrator that expects a uniform prompts list will crash or skip these high-severity attacks, effectively disabling the most powerful injection vectors.

Solution & Insight

  • For pi_injection_chain: keep the existing attack_chain for multi-turn logic, and add a prompts array containing the first prompt of the chain for backward compatibility with single-turn selection.
  • For pi_context_injection: rename the key prompt to prompts and wrap the existing string in an array.

With both changes, every attack object contains a prompts key whose value is an array of strings, so the orchestrator can safely iterate over all attacks.
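
The two changes above can be sketched as a small normalization helper. This is an illustration only, not the code in this PR: the function name normalize_attack is hypothetical, and it assumes each attack_chain step is an object with a "prompt" field, which the description does not confirm.

```python
import copy
from typing import Any, Dict


def normalize_attack(attack: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `attack` that always carries a `prompts` list of strings."""
    fixed = copy.deepcopy(attack)  # never mutate the caller's data

    # Already in the uniform shape: nothing to do.
    if isinstance(fixed.get("prompts"), list):
        return fixed

    if "attack_chain" in fixed:
        # Keep attack_chain for multi-turn logic; expose only the first prompt.
        # Assumption: each chain step is an object with a "prompt" field.
        chain = fixed["attack_chain"]
        first = chain[0] if chain else None
        if isinstance(first, dict):
            first = first.get("prompt")
        fixed["prompts"] = [first] if isinstance(first, str) else []
    elif "prompt" in fixed:
        # Rename prompt -> prompts and wrap the single string in a list.
        fixed["prompts"] = [fixed.pop("prompt")]
    else:
        fixed["prompts"] = []

    return fixed


# Hypothetical sample entries; the real dataset fields may differ.
chain_attack = {
    "id": "pi_injection_chain",
    "attack_chain": [{"prompt": "step one"}, {"prompt": "step two"}],
}
single_attack = {"id": "pi_context_injection", "prompt": "single string"}

print(normalize_attack(chain_attack)["prompts"])   # ['step one']
print(normalize_attack(single_attack)["prompts"])  # ['single string']
```

Working on a deep copy keeps the original attack_chain untouched, so multi-turn consumers still see the full chain while single-turn consumers read prompts.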

Impact

Prevents TypeError/KeyError crashes when the orchestrator processes these attacks, enabling the full dataset to be used in red-teaming evaluations.
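
One way to confirm the invariant before an evaluation run is a small schema check. The sketch below is hypothetical: find_malformed_attacks is not part of the repository, and the "id" field used for reporting is an assumption about how attacks are identified in the dataset.

```python
from typing import Any, Dict, List


def find_malformed_attacks(attacks: List[Dict[str, Any]]) -> List[str]:
    """Return identifiers of attacks whose `prompts` is not a list of strings."""
    malformed = []
    for index, attack in enumerate(attacks):
        prompts = attack.get("prompts")
        is_valid = isinstance(prompts, list) and all(
            isinstance(item, str) for item in prompts
        )
        if not is_valid:
            # Assumption: attacks carry an "id"; fall back to the list position.
            malformed.append(str(attack.get("id", f"<index {index}>")))
    return malformed


# Hypothetical entries showing the old and new shapes side by side.
sample = [
    {"id": "ok_attack", "prompts": ["a", "b"]},
    {"id": "pi_context_injection", "prompt": "old single-string shape"},
    {"id": "pi_injection_chain", "attack_chain": [{"prompt": "step"}]},
]
print(find_malformed_attacks(sample))
# ['pi_context_injection', 'pi_injection_chain']
```

Note that this check accepts an empty prompts list; a stricter variant could also require at least one prompt.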


Validated by Triple-AI: Scanner (NVIDIA NIM (google/gemma-4-31b-it)) → Executor (NVIDIA NIM (deepseek-ai/deepseek-v4-pro)) → Reviewer (NVIDIA NIM (moonshotai/kimi-k2.6))

Co-authored-by: HOLYKEYZ <ayandajoseph390@gmail.com>

This is a DRAFT PR — review and merge when ready.

…trator compatibility

Co-authored-by: HOLYKEYZ <ayandajoseph390@gmail.com>