Increase graph arg properties dims size to 8 in v4 of struct, bump ext v1.19 by jburcham-intel · Pull Request #53 · intel/level-zero-npu-extensions

jburcham-intel · 2026-05-19T20:24:43Z

No description provided.

lmielick · 2026-05-20T07:06:21Z

+
+///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
+/// @brief Identical to ze_graph_argument_properties_3_t except dims supports 8D (v1.19)
+typedef struct _ze_graph_argument_properties_4_t


If we go with new API perhaps it's on opportunity to make it more extensible and composable

lmielick · 2026-05-20T07:07:41Z

+    // version 3
+    uint32_t dims_count;                                            ///< [out] size of shape array
+    char debug_friendly_name[ZE_MAX_GRAPH_ARGUMENT_NAME];           ///< [out] debug friendly name
+    char associated_tensor_names[ZE_MAX_GRAPH_TENSOR_NAMES_SIZE][ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] tensor name array


This is unnecessarily large array, could we introduce a way to pass variable length strings e.g. using variable length pointer array passed as another struct chained via pNext? perhaps for each class of names?

Then you might get a huge chain of structures, not sure if it is worth it. Better to waste few more bytes in single structure

This is 8KB

lmielick · 2026-05-20T07:10:19Z

+    char name[ZE_MAX_GRAPH_ARGUMENT_NAME];                          ///< [out] name from input IR
+    ze_graph_argument_type_t type;                                  ///< [out] type of graph argument
+    uint32_t dims[ZE_MAX_GRAPH_ARGUMENT_DIMENSIONS_SIZE_8];         ///< [out] tensor dimensions upto 8D
+    ze_graph_argument_precision_t networkPrecision;                 ///< [out] precision from input IR


If I'm not mistaken separating network and device properties is legacy we no longer need.
Could we perhaps retire one of these or make it optional via pNext chaining?

If memory serves right, at least three fields became obsolete once the NPU plugin transitioned to OV API 2.0 (a few years ago). networkPrecision should always be equal to devicePrecision (may be worth double cheking), and the NPU plugin is using only devicePrecision nowadays. Therefore, we could consider dropping networkPrecision, and maybe also renaming devicePrecision to precision. The layout fields (networkLayout & deviceLayout) are also redundant. The plugin stores the layouts found in the ov::Model object given by OV into its own blob metadata for cosmetic purposes. So, I think these are also safe to drop.

Looking at the plugin code, I also noticed quantReverseScale & quantZeroPoint are not used anywhere. But I don't know the story behind these (why they were introduced and if they're still needed), so I'm not sure if these are safe to drop.

lmielick · 2026-05-20T07:11:10Z

+    ze_graph_argument_layout_t deviceLayout;                        ///< [out] layout from compiled executable
+
+    // version 2
+    float quantReverseScale;                                        ///< [out] Quantized tensor reverse scale value for input argument


Are these fields introduced in version 2 mandatory and always valid?
Perhaps we could make quantization params a separate struct and chain it?

Quantization params have never been used. We can remove it

lmielick · 2026-05-20T07:11:55Z

+    uint8_t quantZeroPoint;                                         ///< [out] Quantized tesnor zero point value for input argument
+
+    // version 3
+    uint32_t dims_count;                                            ///< [out] size of shape array


With new API could we put this adjacent to dims or just require zero padding instead?

lmielick · 2026-05-20T07:13:14Z

+    void* pNext;                                                    ///< [in,out][optional] must be null or a pointer to an extension-specific
+    char name[ZE_MAX_GRAPH_ARGUMENT_NAME];                          ///< [out] name from input IR
+    ze_graph_argument_type_t type;                                  ///< [out] type of graph argument
+    uint32_t dims[ZE_MAX_GRAPH_ARGUMENT_DIMENSIONS_SIZE_8];         ///< [out] tensor dimensions upto 8D


Perhaps we can use old struct and chain new struct like _ze_graph_argument_extended_tensor_t?
Not beautiful, but does not bring repetition.

…t to v1.19 Define new graph args struct v4 with dims size of 8, bump graph ext to v1.19

jburcham-intel · 2026-05-20T22:06:11Z

I'm 100% in favor of cleaning up our extensions. I'm planning to engage with the ZE spec owners to see what we can get adopted into the spec (most likely still an experimental extension, but no longer private). As part of that effort, we'll need to clean up and have a rock-solid API. They're not going to take our historical baggage. But, I would like to keep general cleanup separate from this change. We'll really need to think through backwards compatibility, etc.

lmielick · 2026-05-25T13:43:26Z

But, I would like to keep general cleanup separate from this change. We'll really need to think through backwards compatibility, etc.

Let's have an offline discussion. But we we won't have clean API if we don't do the dishes.
This PR is introducing a new API. Carrying all the legacy is not going to help.
We have graphs with 1000s of argument in some scenarios like WS so size of this strut actually matters.

jburcham-intel force-pushed the tensor_dims_size_v2 branch 3 times, most recently from 17200d4 to 0c53f56 Compare May 19, 2026 21:08

lmielick reviewed May 20, 2026

View reviewed changes

Increase graph arg properties dims size to 8 in v4 of struct, bump ex…

4e18233

…t to v1.19 Define new graph args struct v4 with dims size of 8, bump graph ext to v1.19

jburcham-intel force-pushed the tensor_dims_size_v2 branch from 0c53f56 to 4e18233 Compare May 20, 2026 22:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Increase graph arg properties dims size to 8 in v4 of struct, bump ext v1.19#53

Increase graph arg properties dims size to 8 in v4 of struct, bump ext v1.19#53
jburcham-intel wants to merge 1 commit into
mainfrom
tensor_dims_size_v2

jburcham-intel commented May 19, 2026

Uh oh!

lmielick May 20, 2026

Uh oh!

lmielick May 20, 2026 •

edited

Loading

Uh oh!

jwludzik Jun 2, 2026

Uh oh!

lmielick Jun 11, 2026

Uh oh!

lmielick May 20, 2026

Uh oh!

razvanapetroaie May 20, 2026

Uh oh!

lmielick May 20, 2026

Uh oh!

jwludzik May 20, 2026

Uh oh!

lmielick May 20, 2026

Uh oh!

lmielick May 20, 2026

Uh oh!

jburcham-intel commented May 20, 2026

Uh oh!

lmielick commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

jburcham-intel commented May 19, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lmielick May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jburcham-intel commented May 20, 2026

Uh oh!

lmielick commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

lmielick May 20, 2026 •

edited

Loading