[DRAFT] [LOGIC] Standardize prompt_injection.json attack structure for orchestrator compatibility #19
Draft
joe-gemini-bot[bot] wants to merge 1 commit into
…trator compatibility Co-authored-by: HOLYKEYZ <ayandajoseph390@gmail.com>
Problem / Gap

The dataset `modelfang/datasets/prompt_injection.json` uses inconsistent keys for prompt data: most attacks have a `prompts` array, but `pi_injection_chain` uses `attack_chain` (an array of objects) and `pi_context_injection` uses a single `prompt` string. Any orchestrator that expects a uniform `prompts` list will crash on or skip these high-severity attacks, effectively disabling the most powerful injection vectors.

Solution & Insight

- `pi_injection_chain`: keep the existing `attack_chain` for multi-turn logic, but add a `prompts` array containing the first prompt of the chain to ensure backward compatibility with single-turn selection.
- `pi_context_injection`: change the key `prompt` to `prompts` and wrap the existing string in an array.

Now every attack object contains a `prompts` key that is an array of strings, allowing the orchestrator to safely iterate over all attacks.

Impact

Prevents `TypeError`/`KeyError` crashes when the orchestrator processes these attacks, enabling the full dataset to be used in red-teaming evaluations.

Validated by Triple-AI: Scanner (NVIDIA NIM (google/gemma-4-31b-it)) → Executor (NVIDIA NIM (deepseek-ai/deepseek-v4-pro)) → Reviewer (NVIDIA NIM (moonshotai/kimi-k2.6))
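To make the two key changes concrete, here is a minimal Python sketch of the normalization described above. It is an illustration only, not the code in this PR: the helper names `normalize_attack` and `has_uniform_prompts`, the `id` field, the placeholder prompt text, and the assumption that each `attack_chain` step stores its text under a `prompt` key are all hypothetical, since the PR description does not show the dataset's exact layout.

```python
from typing import Any, Dict


def normalize_attack(attack: Dict[str, Any], step_key: str = "prompt") -> Dict[str, Any]:
    """Return a copy of `attack` that always carries a `prompts` list of strings.

    `step_key` is an assumed name for the text field inside each
    `attack_chain` step; adjust it to match the real dataset.
    """
    fixed = dict(attack)  # shallow copy, so the input dict is left untouched

    # Case 1 (pi_context_injection): a single `prompt` string is renamed
    # to `prompts` and wrapped in a list.
    if "prompts" not in fixed and isinstance(fixed.get("prompt"), str):
        fixed["prompts"] = [fixed.pop("prompt")]

    # Case 2 (pi_injection_chain): keep `attack_chain` for multi-turn use
    # and expose the first step's text as a single-turn `prompts` list.
    chain = fixed.get("attack_chain")
    if "prompts" not in fixed and isinstance(chain, list) and chain:
        fixed["prompts"] = [chain[0][step_key]]

    return fixed


def has_uniform_prompts(attack: Dict[str, Any]) -> bool:
    """Check the invariant: `prompts` is a non-empty list of strings."""
    prompts = attack.get("prompts")
    return (
        isinstance(prompts, list)
        and len(prompts) > 0
        and all(isinstance(p, str) for p in prompts)
    )


# Placeholder entries that mimic the two irregular shapes (not real dataset content).
chain_attack = {
    "id": "pi_injection_chain",
    "attack_chain": [{"prompt": "step one"}, {"prompt": "step two"}],
}
context_attack = {"id": "pi_context_injection", "prompt": "single string"}

normalized = [normalize_attack(a) for a in (chain_attack, context_attack)]
```

A check like `all(has_uniform_prompts(a) for a in normalized)` could run in CI against the dataset so that a future entry with a different key shape is caught before it reaches the orchestrator.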
Co-authored-by: HOLYKEYZ <ayandajoseph390@gmail.com>
This is a DRAFT PR: review and merge when ready.