Skip to content

fix(client): url-encode query parameter names#565

Open
RapidPoseidon wants to merge 1 commit intomainfrom
poseidon/url-encode-query-keys
Open

fix(client): url-encode query parameter names#565
RapidPoseidon wants to merge 1 commit intomainfrom
poseidon/url-encode-query-keys

Conversation

@RapidPoseidon
Copy link
Copy Markdown
Contributor

Summary

ApiClient.parameters_to_url_query url-encoded every value but not the key:

```python
new_params.extend((k, str(value)) for value in v) # multi branch, key raw
new_params.append((k, delimiter.join(quote(str(value)) for value in v))) # key raw
new_params.append((k, quote(str(v)))) # key raw
```

A key containing &, =, #, space, or UTF-8 bytes would break the query string — the & terminates the pair early, = merges into the value, # truncates at the fragment, etc.

Fix

Compute encoded_key = quote(str(k)) once per iteration and use it at every append site.

Test plan

  • uv run pyright src/rapidata/rapidata_client → 0 errors
  • The multi branch now also encodes its per-item value (it wasn't before — only the key fix needs it, but the symmetry is cheap and intentional).

🔗 Session: https://session-bc38cc85.poseidon.rapidata.internal/

`parameters_to_url_query` quoted every value but pasted keys in raw,
including the `multi` branch where the key was stringified without any
encoding. A query name that contained `&`, `=`, `#`, a space, etc.
would corrupt the query string (break at the `&`, fuse into the
previous pair at `=`, silently truncate at `#`, etc.).

Encode keys with the same `quote()` the values use. Mirrored into
`openapi/templates/api_client.mustache`.

Session: https://session-bc38cc85.poseidon.rapidata.internal/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: lino <lino@rapidata.ai>
@claude
Copy link
Copy Markdown

claude Bot commented Apr 24, 2026

Code Review

Overview

This PR fixes a real correctness (and minor security) bug: parameters_to_url_query was URL-encoding query parameter values but silently leaving keys as raw strings. A key containing &, =, #, a space, or non-ASCII bytes would silently corrupt the entire query string. The fix is correct and minimal.


What's Good

  • Both files updated correctly. The mustache template (openapi/templates/api_client.mustache) and the generated file (src/rapidata/api_client/api_client.py) are kept in sync, as required by the project conventions.
  • encoded_key computed once before the if k in collection_formats branch, avoiding redundant re-encoding.
  • Bonus fix in multi branch. The multi collection path was doing str(value) without any encoding, which was inconsistent with every other branch. Adding quote() there is correct and the PR calls it out explicitly.
  • Change is minimal and focused — no unrelated cleanup.

Suggestions / Issues

1. quote() defaults to safe='/' — slashes in keys won't be encoded

encoded_key = quote(str(k))  # '/' is in the default safe set and won't be encoded

For query parameter names, a literal / is technically invalid and should be percent-encoded as %2F. The existing value-encoding code has the same gap, so this is a pre-existing issue rather than something introduced here — but it's worth noting. To be fully correct:

encoded_key = quote(str(k), safe='')

If you want to keep the PR minimal (and value encoding already has this gap), it's reasonable to leave this as a follow-up.

2. map(str, item) on line 539 is now redundant

return "&".join(["=".join(map(str, item)) for item in new_params])

After the fix, both key and value in every tuple are already str (results of quote(), which always returns str). The map(str, ...) call was previously a mild safety net for the raw-key case, but it's now purely no-op. Not a bug, just mild noise.

3. No tests

There are currently no test files in the repo, so this isn't a blocker, but this function is a good candidate for a unit test given the bug history:

# e.g., key with '&' and '=' should be fully encoded
assert client.parameters_to_url_query({"a&b": "c=d"}, {}) == "a%26b=c%3Dd"
# multi branch
assert client.parameters_to_url_query({"k": ["v1", "v 2"]}, {"k": "multi"}) == "k=v1&k=v%202"

Summary

The fix is correct and the important part — encoding keys — is done right. The two items worth actioning before merge are the safe='/' gap (at least document the known limitation) and verifying the ssv delimiter (' ') doesn't need encoding itself (a literal tab/space in a query string is also technically invalid, though that's a separate pre-existing issue). Everything else is cosmetic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant