How can I distribute PyAV jobs across many servers with Spark? #659

@vtexier

Overview

I am using Spark (pyspark) to distribute big data jobs across servers. I would like to use Spark to distribute PyAV encoding and filtering work across many servers.

Expected behavior

The distributed function will run PyAV features that need to be aware of the previous and/or next frames in the GOP.

How can those features be implemented so that each job is aware of the GOP?
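
One possible direction, sketched here under assumptions and not taken from the PyAV or Spark documentation: make the GOP, not the single frame, the unit of work, so that every job receives a keyframe together with the frames that depend on it. The `Packet` class and the `split_into_gops` helper below are hypothetical stand-ins for illustration; in PyAV, demuxed packets expose an `is_keyframe` attribute that could play the same role. The sketch also assumes closed GOPs, meaning no frame references data outside its own group.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Packet:
    """Hypothetical stand-in for a demuxed packet (not the PyAV class)."""
    pts: int
    is_keyframe: bool


def split_into_gops(packets: Iterable[Packet]) -> List[List[Packet]]:
    """Group packets so that each group starts at a keyframe.

    Assumes closed GOPs. Packets that appear before the first keyframe
    are dropped, since they cannot be decoded on their own.
    """
    gops: List[List[Packet]] = []
    for packet in packets:
        if packet.is_keyframe:
            gops.append([packet])      # a keyframe opens a new GOP
        elif gops:
            gops[-1].append(packet)    # dependent frame joins the current GOP
    return gops


# Illustrative stream: keyframes at pts 0 and 3.
stream = [Packet(pts=i, is_keyframe=(i % 3 == 0)) for i in range(6)]
gops = split_into_gops(stream)
```

Each element of `gops` could then become one element of a Spark RDD (for example through `SparkContext.parallelize`), as long as whatever is placed in the RDD can be pickled.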

Actual behavior

An isolated frame sent to a separate job cannot be processed on its own.

Investigation

Giving Spark the context by serializing the Frame object with pickle is not enough; see #652.
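
A possible workaround, offered as a sketch and not as something PyAV or #652 prescribes: avoid shipping Frame objects at all, and send each worker a small description of the segment it should open and decode itself. The `make_gop_task` helper, its field names, and the `input.mp4` path are hypothetical and chosen only for illustration.

```python
import pickle


def make_gop_task(path: str, start_pts: int, end_pts: int) -> dict:
    """Describe one segment with plain values only (hypothetical layout)."""
    if end_pts <= start_pts:
        raise ValueError("end_pts must be greater than start_pts")
    return {"path": path, "start_pts": start_pts, "end_pts": end_pts}


task = make_gop_task("input.mp4", start_pts=0, end_pts=250)

# A dict of str/int values survives a pickle round-trip unchanged,
# which is what sending it to a pyspark worker relies on.
restored = pickle.loads(pickle.dumps(task))
```

With this layout the worker, not the driver, would be responsible for opening the file and seeking to the keyframe at `start_pts`; this assumes every worker can read the same path.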

Research

I have done the following:

https://gitter.im/mikeboers/PyAV?at=5eab0dd59f0c955d7d97bbb1

Additional context

@koenvo, maybe you could help me with this, as you proposed in #652 (comment)
