Revert pr 2025#2041
Conversation
There was a problem hiding this comment.
Code Review
This pull request simplifies SQL query generation in aggregation_utils.py by removing single-quote escaping on import_names and hardcoding the 'dc/base/' prefix and 'dc/base/GeneratedGraphs' provenance. The reviewer points out that these changes introduce critical SQL injection vulnerabilities because user-controlled input is no longer escaped. Additionally, hardcoding the base prefix breaks functionality for non-base Data Commons instances where self.is_base_dc is False.
| safe_names = [name.replace("'", "''") for name in import_names] | ||
| prefix = "dc/base/" if self.is_base_dc else "" | ||
| provenances = [f"'{prefix}{name}'" for name in safe_names] | ||
| provenances = [f"'dc/base/{name}'" for name in import_names] |
There was a problem hiding this comment.
Reverting this change introduces a SQL injection vulnerability because single quotes in import_names (which are sourced from user-controlled HTTP request payloads) are no longer escaped. Additionally, hardcoding the 'dc/base/' prefix breaks functionality when self.is_base_dc is False.
safe_names = [name.replace("'", "''") for name in import_names]
prefix = "dc/base/" if self.is_base_dc else ""
provenances = [f"'{prefix}{name}'" for name in safe_names]
gen_graphs_prov = 'dc/base/GeneratedGraphs' if self.is_base_dc else 'GeneratedGraphs'| safe_names = [name.replace("'", "''") for name in import_names] | ||
| prefix = "dc/base/" if self.is_base_dc else "" | ||
| provenances = [f"'{prefix}{name}'" for name in safe_names] | ||
| provenances = [f"'dc/base/{name}'" for name in import_names] |
There was a problem hiding this comment.
Reverting this change introduces a SQL injection vulnerability because single quotes in import_names are no longer escaped. Additionally, hardcoding the 'dc/base/' prefix breaks functionality when self.is_base_dc is False.
safe_names = [name.replace("'", "''") for name in import_names]
prefix = "dc/base/" if self.is_base_dc else ""
provenances = [f"'{prefix}{name}'" for name in safe_names]
gen_graphs_prov = 'dc/base/GeneratedGraphs' if self.is_base_dc else 'GeneratedGraphs'| safe_names = [name.replace("'", "''") for name in import_names] | ||
| prefix = "dc/base/" if self.is_base_dc else "" | ||
| provenances = [f"'{prefix}{name}'" for name in safe_names] | ||
| provenances = [f"'dc/base/{name}'" for name in import_names] |
There was a problem hiding this comment.
Reverting this change introduces a SQL injection vulnerability because single quotes in import_names are no longer escaped. Additionally, hardcoding the 'dc/base/' prefix breaks functionality when self.is_base_dc is False.
safe_names = [name.replace("'", "''") for name in import_names]
prefix = "dc/base/" if self.is_base_dc else ""
provenances = [f"'{prefix}{name}'" for name in safe_names]
gen_graphs_prov = 'dc/base/GeneratedGraphs' if self.is_base_dc else 'GeneratedGraphs'| # Format import names for the SQL IN clause | ||
| imports_str = ", ".join([f"'{name}'" for name in safe_names]) | ||
| provenance_dcid_expr = "CONCAT('dc/base/', raw.import_name)" if self.is_base_dc else "raw.import_name" | ||
| imports_str = ", ".join([f"'{name}'" for name in import_names]) |
There was a problem hiding this comment.
Reverting this change introduces a SQL injection vulnerability because single quotes in import_names are no longer escaped. Additionally, hardcoding the 'dc/base/' prefix breaks functionality when self.is_base_dc is False.
safe_names = [name.replace("'", "''") for name in import_names]
imports_str = ", ".join([f"'{name}'" for name in safe_names])
provenance_dcid_expr = "CONCAT('dc/base/', raw.import_name)" if self.is_base_dc else "raw.import_name"| 'linkedContainedInPlace' as predicate, | ||
| ancestor_place as object_id, | ||
| '{gen_graphs_prov}' as provenance | ||
| 'dc/base/GeneratedGraphs' as provenance |
There was a problem hiding this comment.
| 'linkedMemberOf' as predicate, | ||
| ancestor as object_id, | ||
| '{gen_graphs_prov}' as provenance | ||
| 'dc/base/GeneratedGraphs' as provenance |
There was a problem hiding this comment.
| 'linkedMember' as predicate, | ||
| subject_id as object_id, | ||
| '{gen_graphs_prov}' as provenance | ||
| 'dc/base/GeneratedGraphs' as provenance |
There was a problem hiding this comment.
| JSON_VALUE(v, '$.key') as date_val, | ||
| SAFE_CAST(JSON_VALUE(v, '$.value') AS FLOAT64) as value_num, | ||
| {provenance_dcid_expr} as provenance_dcid, | ||
| CONCAT('dc/base/', raw.import_name) as provenance_dcid, |
There was a problem hiding this comment.
No description provided.