
Support writing Arrow RecordBatchReader or Scanner to Iceberg tables #2152

@chitralverma

Description


Feature Request / Improvement

Summary

Please consider adding support in pyiceberg for writing data to Iceberg tables using streamable Arrow-native types such as:

  • pyarrow.RecordBatchReader
  • Iterator[pyarrow.RecordBatch]
  • pyarrow.RecordBatch
  • pyarrow.dataset.Scanner
  • pyarrow.Table (existing or fallback)

Operations could include:

table.append(record_batch_reader)
table.overwrite(scanner)
table.upsert(scanner, primary_keys=["id"])
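
A minimal sketch of how this could look end to end. The append-a-reader call is the proposed behavior (not current pyiceberg API), and the catalog name and table identifier are illustrative:

import pyarrow as pa
from pyiceberg.catalog import load_catalog

schema = pa.schema([("id", pa.int64()), ("name", pa.string())])

def batches():
    # Yield batches one at a time so nothing is fully materialized
    for i in range(100):
        yield pa.record_batch([pa.array([i]), pa.array([f"row-{i}"])], schema=schema)

reader = pa.RecordBatchReader.from_batches(schema, batches())

catalog = load_catalog("default")        # assumes a configured catalog named "default"
table = catalog.load_table("db.events")  # illustrative table identifier
table.append(reader)                     # proposed: accept a reader directly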

Motivation

Currently, writing data to Iceberg from Python requires materializing the entire dataset in memory (e.g., as a pyarrow.Table) before converting it to Parquet. This limits scalability and performance, especially for:

  • Large datasets that exceed memory
  • Incremental / streaming ingestion
  • Lazy pipelines using DuckDB, ADBC, or Scanner.from_batches(...)

RecordBatchReader and Scanner are both streamable abstractions ideal for these use cases.
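
For illustration, here is how such streamable sources typically arise today. fetch_record_batch is DuckDB's API for exporting results as a pyarrow.RecordBatchReader; the query path and batch size are just examples:

import duckdb
import pyarrow.dataset as ds

con = duckdb.connect()
# Stream query results as Arrow batches instead of one big table
res = con.execute("SELECT * FROM read_parquet('data/*.parquet')")
reader = res.fetch_record_batch(65_536)

# Wrap the same stream in a lazy Scanner (schema must be known up front)
scanner = ds.Scanner.from_batches(iter(reader), schema=reader.schema)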

Benefits

  • Enables lazy and streaming ingestion without requiring full materialization
  • Avoids intermediate Parquet files and temporary storage in large data pipelines
  • Enables clean integration with Arrow-native tools (e.g., ADBC, DuckDB, the pyarrow ecosystem)
  • Reduces unnecessary disk I/O and memory pressure

Related Context

  • delta-rs supports Arrow-native ingestion via ArrowStreamExportable, ArrowArrayExportable, and sequences of arrays
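
For comparison, a minimal sketch of the delta-rs equivalent: its write_deltalake function accepts a pyarrow.RecordBatchReader among other Arrow inputs (the table path and schema here are illustrative):

import pyarrow as pa
from deltalake import write_deltalake

schema = pa.schema([("id", pa.int64())])
reader = pa.RecordBatchReader.from_batches(
    schema,
    (pa.record_batch([pa.array([i])], schema=schema) for i in range(10)),
)
write_deltalake("./delta_table", reader, mode="append")  # streams batches into the table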

This feature would unlock efficient Python-native data ingestion workflows for Iceberg and align pyiceberg more closely with the rest of the Arrow ecosystem.
