Codex/rebase local branch on main#3
Pull request overview
Expands FrameX’s unified read_file / write_file surface to cover additional real-world storage formats (notably SQLite and ORC), with accompanying docs, tests, and a patch version bump.
Changes:
- Add support for ORC, SQLite, HTML/XML export, `.txt`, and fixed-width text (`.fwf`, etc.) in `read_file`/`write_file`, including compression-path handling and dataframe-like inputs.
- Add test coverage for the new formats and new SQLite query/table workflows.
- Update docs/website to reflect the expanded I/O capabilities and bump version to `0.1.2`.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `framex/io/file.py` | Implements new format inference and read/write paths (ORC/SQLite/HTML/XML/fixed-width/text) plus dataframe-like coercion for `write_file`. |
| `tests/test_io.py` | Adds roundtrip/export tests for the new formats and SQLite query/table behaviors. |
| `docs/documents/sqlite_guide.md` | New guide documenting SQLite read/write/query workflows. |
| `docs/documents/features.md` | Updates feature list to include new I/O formats and export-only formats. |
| `docs/documents/faq.md` | Updates supported formats list and adds SQLite usage FAQ section. |
| `docs/documents/api_reference.md` | Documents newly supported `read_file`/`write_file` formats and adds SQLite examples/parameter notes. |
| `website/src/app/page.js` | Updates homepage capabilities and adds a quick link to the SQLite guide. |
| `pyproject.toml` | Bumps package version to 0.1.2. |
| `framex/_version.py` | Bumps internal `__version__` to 0.1.2. |
| `README.md` | Updates displayed project version to 0.1.2. |
        raise ValueError("SQLite file has no user tables")
    table = str(row[0])
    query = f'SELECT * FROM "{table}"'
    pd = get_pandas_module()
    return DataFrame(pd.read_sql_query(query, conn, **kwargs))
The SQLite table value is interpolated into a SQL string via an f-string. If a caller passes a table name containing a double quote, this can break out of the quoted identifier and enable SQL injection. Escape embedded quotes (e.g., replace " with "") and/or validate table against a safe identifier pattern before constructing the query string.
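A minimal sketch of the suggested fix, assuming the table name reaches the query builder as a plain string. The helper names `quote_identifier` and `build_select_all` are hypothetical and do not exist in FrameX; the `strict` flag is likewise only illustrative.

```python
import re
import sqlite3

# Hypothetical allow-list for callers that want to reject unusual names outright.
_SAFE_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Quote a SQLite identifier, doubling any embedded double quotes."""
    if "\x00" in name:
        raise ValueError("identifier must not contain NUL bytes")
    return '"' + name.replace('"', '""') + '"'


def build_select_all(table: str, strict: bool = False) -> str:
    """Build a SELECT * query for a table name (hypothetical helper)."""
    if strict and not _SAFE_IDENT.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    return f"SELECT * FROM {quote_identifier(table)}"


# Usage: a table whose name contains a double quote stays inside the identifier.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "odd""name" (a INTEGER)')
conn.execute('INSERT INTO "odd""name" VALUES (1)')
rows = conn.execute(build_select_all('odd"name')).fetchall()
conn.close()
```

Doubling the quote keeps legitimate but unusual table names working, while the optional allow-list is the stricter choice if such names never need to be supported.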
    import pyarrow.feather as pfeather
    import pyarrow.orc as porc
    import pyarrow.parquet as pq
pyarrow.orc is imported at module import time. In environments where PyArrow is built without ORC support, this can raise ImportError and break all read_file/write_file usage even when ORC isn’t used. Consider lazy-importing ORC inside the fmt == "orc" branches (and raising a clear error if unavailable) to keep the module usable without ORC support.
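A minimal sketch of the lazy import, assuming the ORC branches can call a small loader instead of relying on a module-level `porc`. The function name `load_orc_module` and the error wording are hypothetical, not part of FrameX.

```python
import importlib


def load_orc_module():
    """Import pyarrow.orc on first use (hypothetical helper).

    Raises ImportError with a clearer message when PyArrow is missing
    or was built without ORC support.
    """
    try:
        return importlib.import_module("pyarrow.orc")
    except ImportError as exc:
        raise ImportError(
            "ORC support requires a PyArrow build with ORC enabled"
        ) from exc
```

Inside the `fmt == "orc"` branches, the call would then look like `porc = load_orc_module()` followed by the existing read or write logic, so other formats never touch the ORC import.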
    def test_write_file_orc_roundtrip(self, tmp_path):
        df = DataFrame({"a": [1, 2], "b": ["x", "y"]})
        path = tmp_path / "out.orc"
        fx.write_file(df, path)
        out = fx.read_file(path)
        assert out.num_rows == 2
        assert out.columns == ["a", "b"]
This ORC round-trip test will fail hard if the runtime PyArrow build lacks ORC support. To keep CI portable, consider conditionally skipping the test when ORC support isn’t available (e.g., guard on successful import pyarrow.orc / a simple write/read capability check).
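A minimal sketch of such a guard, using only the standard library for the capability check. The function name `orc_available` is hypothetical; the commented decorator shows how it might be wired into the test, assuming `pytest` is imported in `tests/test_io.py`.

```python
import importlib


def orc_available() -> bool:
    """Return True if pyarrow.orc can be imported in this environment."""
    try:
        importlib.import_module("pyarrow.orc")
    except ImportError:
        return False
    return True


# Possible use in tests/test_io.py (assumption, not the PR's code):
#
#   @pytest.mark.skipif(not orc_available(),
#                       reason="PyArrow built without ORC support")
#   def test_write_file_orc_roundtrip(self, tmp_path):
#       ...
```

Calling `pytest.importorskip("pyarrow.orc")` at the top of the test body is a shorter alternative with the same effect.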
     [project]
     name = "pyframe-xpy"
    -version = "0.1.1"
    +version = "0.1.2"
     description = "High-performance parallel dataframe and array processing with Arrow-backed storage"
The PR title indicates this is a rebase onto main, but the diff includes a version bump plus new I/O formats (ORC/SQLite/HTML/XML/etc.) and new docs/tests. Please update the PR title/description to reflect the actual feature change so reviewers can scope risk appropriately.
No description provided.