How can I distribute PyAV jobs across many servers with Spark? #659

## Overview

I am using Spark ([pyspark](https://github.com/apache/spark/tree/master/python)) to distribute big data jobs across servers. I would like to use it to distribute PyAV encoding and filtering work across many servers.

## Expected behavior

The distributed function should run PyAV features that need to be aware of the previous and/or next frames in the GOP (group of pictures).

How can I implement those features so that each job keeps this GOP context?
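
One direction that might work is to make the GOP, rather than the single frame, the unit of work, so that every job receives all the frames it depends on. The following is only a minimal sketch of the grouping step: `split_into_gops` is a hypothetical helper, not part of PyAV or Spark, and the tuples are toy stand-ins for demuxed packets. It also assumes closed GOPs; with open GOPs some frames reference the previous GOP, so this split would not be sufficient.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def split_into_gops(
    items: Iterable[T], is_keyframe: Callable[[T], bool]
) -> Iterator[List[T]]:
    """Yield consecutive groups of items, starting a new group at each keyframe."""
    current: List[T] = []
    for item in items:
        # A keyframe closes the group collected so far and opens a new one.
        if is_keyframe(item) and current:
            yield current
            current = []
        current.append(item)
    if current:
        yield current


# Toy stand-ins for demuxed packets: (pts, is_keyframe).
packets = [(0, True), (1, False), (2, False), (3, True), (4, False)]
gops = list(split_into_gops(packets, lambda p: p[1]))
print([[pts for pts, _ in gop] for gop in gops])  # [[0, 1, 2], [3, 4]]
```

Each resulting list could then become one element of a Spark RDD (for example through `SparkContext.parallelize`), so that a worker always sees a complete GOP.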

## Actual behavior

A single frame isolated in its own job cannot be processed on its own, because it lacks the neighbouring frames of its GOP.

## Investigation

Serializing the Frame object with pickle to give Spark the context is not enough; see #652.
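
An alternative I am considering is to ship a whole GOP as plain data (encoded packet bytes plus timing information), which pickle handles natively, and to decode inside the worker. This is a sketch under assumptions: `make_gop_payload` and the dict layout are hypothetical, and I have not verified how best to obtain the encoded bytes from PyAV or how to rebuild a decoder from them on the worker side.

```python
import pickle
from typing import Any, Dict, Iterable


def make_gop_payload(
    index: int, start_pts: int, packets: Iterable[bytes]
) -> Dict[str, Any]:
    """Bundle one GOP as plain, picklable data (hypothetical layout)."""
    return {
        "index": index,          # position of the GOP in the stream
        "start_pts": start_pts,  # timestamp of the first packet
        "packets": [bytes(p) for p in packets],  # encoded packet data
    }


# Dummy bytes stand in for real encoded packets.
payload = make_gop_payload(0, 0, [b"\x00\x01", b"\x02"])
restored = pickle.loads(pickle.dumps(payload))
assert restored == payload
```

Because the payload contains only built-in types, it survives the pickle round trip that Spark performs when it sends data to workers.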

## Research

I have done the following:

- [x] Checked the [PyAV documentation](https://pyav.org/docs)
- [x] Searched on [Google](https://www.google.com/search?q=pyav+how+do+I+foo)
- [x] Searched on [Stack Overflow](https://stackoverflow.com/search?q=pyav)
- [x] Looked through [old GitHub issues](https://github.com/PyAV-Org/PyAV/issues?&q=is%3Aissue)
- [x] Asked on [PyAV Gitter](https://gitter.im/PyAV-Org)
- [ ] ... and waited 72 hours for a response.

https://gitter.im/mikeboers/PyAV?at=5eab0dd59f0c955d7d97bbb1

## Additional context

@koenvo, maybe you can help me with this, as you proposed in https://github.com/PyAV-Org/PyAV/issues/652#issuecomment-622821432

