Increase graph arg properties dims size to 8 in v4 of struct, bump ext v1.19 #53

Open
jburcham-intel wants to merge 1 commit into main from tensor_dims_size_v2

Conversation

@jburcham-intel

Contributor

No description provided.

@jburcham-intel force-pushed the tensor_dims_size_v2 branch 3 times, most recently from 17200d4 to 0c53f56, on May 19, 2026 21:08
Comment thread ze_graph_ext.h

///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
/// @brief Identical to ze_graph_argument_properties_3_t except dims supports 8D (v1.19)
typedef struct _ze_graph_argument_properties_4_t

If we go with a new API, perhaps it's an opportunity to make it more extensible and composable.

Comment thread ze_graph_ext.h
// version 3
uint32_t dims_count; ///< [out] size of shape array
char debug_friendly_name[ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] debug friendly name
char associated_tensor_names[ZE_MAX_GRAPH_TENSOR_NAMES_SIZE][ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] tensor name array

@lmielick, May 20, 2026

This is an unnecessarily large array. Could we introduce a way to pass variable-length strings, e.g. using a variable-length pointer array passed in another struct chained via pNext? Perhaps one such struct for each class of names?

Contributor

Then you might get a huge chain of structures; I'm not sure it is worth it. Better to waste a few more bytes in a single structure.

This is 8KB
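
As a rough check of the 8 KB figure, the sketch below reproduces the arithmetic for the `associated_tensor_names` array. The two limits are assumptions: the thread does not show the values of `ZE_MAX_GRAPH_TENSOR_NAMES_SIZE` and `ZE_MAX_GRAPH_ARGUMENT_NAME`, so 32 and 256 are used here only because their product matches the quoted size. All `example_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limits; the real definitions are not shown in this thread. */
#define EXAMPLE_MAX_ARGUMENT_NAME 256
#define EXAMPLE_MAX_TENSOR_NAMES  32

/* Stand-in for just the name table of the argument properties struct. */
typedef struct example_tensor_names_t {
    char associated_tensor_names[EXAMPLE_MAX_TENSOR_NAMES][EXAMPLE_MAX_ARGUMENT_NAME];
} example_tensor_names_t;

/* 32 names * 256 bytes each = 8192 bytes = 8 KiB per argument. */
static_assert(sizeof(example_tensor_names_t) == 8192,
              "name table should occupy 8 KiB under the assumed limits");

/* Bytes spent on name tables alone when querying arg_count arguments. */
size_t example_name_table_bytes(size_t arg_count)
{
    return arg_count * sizeof(example_tensor_names_t);
}
```

Under these assumptions the cost is fixed per argument regardless of how many tensor names are actually populated, which is the point of contention in this thread.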

Comment thread ze_graph_ext.h
char name[ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] name from input IR
ze_graph_argument_type_t type; ///< [out] type of graph argument
uint32_t dims[ZE_MAX_GRAPH_ARGUMENT_DIMENSIONS_SIZE_8]; ///< [out] tensor dimensions upto 8D
ze_graph_argument_precision_t networkPrecision; ///< [out] precision from input IR

If I'm not mistaken, separating network and device properties is legacy that we no longer need.
Could we perhaps retire one of these or make it optional via pNext chaining?

If memory serves right, at least three fields became obsolete once the NPU plugin transitioned to OV API 2.0 (a few years ago). networkPrecision should always be equal to devicePrecision (may be worth double-checking), and the NPU plugin uses only devicePrecision nowadays. Therefore, we could consider dropping networkPrecision, and maybe also renaming devicePrecision to precision. The layout fields (networkLayout & deviceLayout) are also redundant. The plugin stores the layouts found in the ov::Model object given by OV into its own blob metadata for cosmetic purposes. So, I think these are also safe to drop.

Looking at the plugin code, I also noticed quantReverseScale & quantZeroPoint are not used anywhere. But I don't know the story behind these (why they were introduced and if they're still needed), so I'm not sure if these are safe to drop.

Comment thread ze_graph_ext.h
ze_graph_argument_layout_t deviceLayout; ///< [out] layout from compiled executable

// version 2
float quantReverseScale; ///< [out] Quantized tensor reverse scale value for input argument

Are these fields introduced in version 2 mandatory and always valid?
Perhaps we could make quantization params a separate struct and chain it?

Contributor

Quantization params have never been used. We can remove them.

Comment thread ze_graph_ext.h
uint8_t quantZeroPoint; ///< [out] Quantized tesnor zero point value for input argument

// version 3
uint32_t dims_count; ///< [out] size of shape array

With a new API, could we put this adjacent to dims, or just require zero padding instead?
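
For illustration, the zero-padding alternative could look roughly like the sketch below, where the rank is recovered from the `dims` array instead of being stored in `dims_count`. The helper and the `EXAMPLE_MAX_DIMS` constant are hypothetical, and the approach assumes that a valid dimension is never 0 and that unused trailing slots are always zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 8-entry dims limit discussed in this PR. */
#define EXAMPLE_MAX_DIMS 8

/* Returns the number of leading non-zero entries in a zero-padded dims array.
 * Assumes no valid dimension is 0 and all unused slots are zero. */
uint32_t example_rank_from_padding(const uint32_t dims[EXAMPLE_MAX_DIMS])
{
    assert(dims != NULL);

    uint32_t rank = 0;
    while (rank < EXAMPLE_MAX_DIMS && dims[rank] != 0) {
        ++rank;
    }
    return rank;
}
```

One trade-off of this design: a tensor with a genuinely zero-sized dimension could not be told apart from padding, so an explicit count placed next to `dims` is the safer option if such shapes can occur.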

Comment thread ze_graph_ext.h
void* pNext; ///< [in,out][optional] must be null or a pointer to an extension-specific
char name[ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] name from input IR
ze_graph_argument_type_t type; ///< [out] type of graph argument
uint32_t dims[ZE_MAX_GRAPH_ARGUMENT_DIMENSIONS_SIZE_8]; ///< [out] tensor dimensions upto 8D

Perhaps we can use the old struct and chain a new struct like _ze_graph_argument_extended_tensor_t?
Not beautiful, but it avoids repetition.
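
A minimal sketch of this chaining idea follows, assuming a Level Zero style header of `stype` plus `pNext` at the start of every struct. None of these types exist in ze_graph_ext.h: every `example_` name, the structure-type tags, and the legacy limit of 5 dims are hypothetical and serve only to show how a driver might fill an extended 8D struct when the caller chains one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_DIMS_LEGACY 5 /* assumed size of the existing dims array */
#define EXAMPLE_DIMS_EXT    8 /* 8D limit proposed in this PR */

/* Hypothetical structure-type tags used to identify chained structs. */
typedef enum example_stype_t {
    EXAMPLE_STYPE_ARG_PROPERTIES = 1,
    EXAMPLE_STYPE_ARG_EXTENDED_TENSOR = 2
} example_stype_t;

/* Common header placed first in every chainable struct. */
typedef struct example_base_t {
    example_stype_t stype;
    void *pNext;
} example_base_t;

/* Stand-in for the existing properties struct (most fields omitted). */
typedef struct example_arg_properties_t {
    example_base_t base;
    uint32_t dims[EXAMPLE_DIMS_LEGACY];
    uint32_t dims_count;
} example_arg_properties_t;

/* Optional extension carrying the larger shape. */
typedef struct example_arg_extended_tensor_t {
    example_base_t base;
    uint32_t dims[EXAMPLE_DIMS_EXT];
    uint32_t dims_count;
} example_arg_extended_tensor_t;

/* Walks a pNext chain and returns the first struct tagged `wanted`, or NULL. */
static void *example_find_in_chain(void *head, example_stype_t wanted)
{
    for (example_base_t *p = head; p != NULL; p = p->pNext) {
        if (p->stype == wanted) {
            return p;
        }
    }
    return NULL;
}

/* Driver-side sketch: writes the shape into every struct that can hold it.
 * Returns 0 on success, -1 if the caller supplied no struct large enough. */
int example_report_shape(example_arg_properties_t *props,
                         const uint32_t *shape, uint32_t rank)
{
    assert(props != NULL && shape != NULL);
    if (rank > EXAMPLE_DIMS_EXT) {
        return -1;
    }

    example_arg_extended_tensor_t *ext =
        example_find_in_chain(props->base.pNext, EXAMPLE_STYPE_ARG_EXTENDED_TENSOR);
    if (ext != NULL) {
        memset(ext->dims, 0, sizeof ext->dims);
        memcpy(ext->dims, shape, rank * sizeof shape[0]);
        ext->dims_count = rank;
    }

    if (rank <= EXAMPLE_DIMS_LEGACY) {
        /* The legacy fields are only written when the shape fits. */
        memset(props->dims, 0, sizeof props->dims);
        memcpy(props->dims, shape, rank * sizeof shape[0]);
        props->dims_count = rank;
    } else if (ext == NULL) {
        return -1; /* legacy-only caller cannot receive a shape above 5D */
    }
    return 0;
}
```

In this arrangement existing callers keep the old layout untouched, and only callers that need more than the legacy number of dimensions pay for the extra struct.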

…t to v1.19

Define new graph args struct v4 with dims size of 8, bump graph ext to v1.19
@jburcham-intel

Contributor Author

I'm 100% in favor of cleaning up our extensions. I'm planning to engage with the ZE spec owners to see what we can get adopted into the spec (most likely still an experimental extension, but no longer private). As part of that effort, we'll need to clean up and have a rock-solid API. They're not going to take our historical baggage. But, I would like to keep general cleanup separate from this change. We'll really need to think through backwards compatibility, etc.

@lmielick

> But, I would like to keep general cleanup separate from this change. We'll really need to think through backwards compatibility, etc.

Let's have an offline discussion. But we won't have a clean API if we don't do the dishes.
This PR is introducing a new API. Carrying all the legacy is not going to help.
We have graphs with 1000s of arguments in some scenarios like WS, so the size of this struct actually matters.

4 participants