Increase graph arg properties dims size to 8 in v4 of struct, bump ext v1.19#53
Increase graph arg properties dims size to 8 in v4 of struct, bump ext v1.19#53jburcham-intel wants to merge 1 commit into
Conversation
17200d4 to
0c53f56
Compare
|
|
||
| /////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// | ||
| /// @brief Identical to ze_graph_argument_properties_3_t except dims supports 8D (v1.19) | ||
| typedef struct _ze_graph_argument_properties_4_t |
There was a problem hiding this comment.
If we go with new API perhaps it's on opportunity to make it more extensible and composable
| // version 3 | ||
| uint32_t dims_count; ///< [out] size of shape array | ||
| char debug_friendly_name[ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] debug friendly name | ||
| char associated_tensor_names[ZE_MAX_GRAPH_TENSOR_NAMES_SIZE][ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] tensor name array |
There was a problem hiding this comment.
This is unnecessarily large array, could we introduce a way to pass variable length strings e.g. using variable length pointer array passed as another struct chained via pNext? perhaps for each class of names?
There was a problem hiding this comment.
Then you might get a huge chain of structures, not sure if it is worth it. Better to waste few more bytes in single structure
| char name[ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] name from input IR | ||
| ze_graph_argument_type_t type; ///< [out] type of graph argument | ||
| uint32_t dims[ZE_MAX_GRAPH_ARGUMENT_DIMENSIONS_SIZE_8]; ///< [out] tensor dimensions upto 8D | ||
| ze_graph_argument_precision_t networkPrecision; ///< [out] precision from input IR |
There was a problem hiding this comment.
If I'm not mistaken separating network and device properties is legacy we no longer need.
Could we perhaps retire one of these or make it optional via pNext chaining?
There was a problem hiding this comment.
If memory serves right, at least three fields became obsolete once the NPU plugin transitioned to OV API 2.0 (a few years ago). networkPrecision should always be equal to devicePrecision (may be worth double cheking), and the NPU plugin is using only devicePrecision nowadays. Therefore, we could consider dropping networkPrecision, and maybe also renaming devicePrecision to precision. The layout fields (networkLayout & deviceLayout) are also redundant. The plugin stores the layouts found in the ov::Model object given by OV into its own blob metadata for cosmetic purposes. So, I think these are also safe to drop.
Looking at the plugin code, I also noticed quantReverseScale & quantZeroPoint are not used anywhere. But I don't know the story behind these (why they were introduced and if they're still needed), so I'm not sure if these are safe to drop.
| ze_graph_argument_layout_t deviceLayout; ///< [out] layout from compiled executable | ||
|
|
||
| // version 2 | ||
| float quantReverseScale; ///< [out] Quantized tensor reverse scale value for input argument |
There was a problem hiding this comment.
Are these fields introduced in version 2 mandatory and always valid?
Perhaps we could make quantization params a separate struct and chain it?
There was a problem hiding this comment.
Quantization params have never been used. We can remove it
| uint8_t quantZeroPoint; ///< [out] Quantized tesnor zero point value for input argument | ||
|
|
||
| // version 3 | ||
| uint32_t dims_count; ///< [out] size of shape array |
There was a problem hiding this comment.
With new API could we put this adjacent to dims or just require zero padding instead?
| void* pNext; ///< [in,out][optional] must be null or a pointer to an extension-specific | ||
| char name[ZE_MAX_GRAPH_ARGUMENT_NAME]; ///< [out] name from input IR | ||
| ze_graph_argument_type_t type; ///< [out] type of graph argument | ||
| uint32_t dims[ZE_MAX_GRAPH_ARGUMENT_DIMENSIONS_SIZE_8]; ///< [out] tensor dimensions upto 8D |
There was a problem hiding this comment.
Perhaps we can use old struct and chain new struct like _ze_graph_argument_extended_tensor_t?
Not beautiful, but does not bring repetition.
…t to v1.19 Define new graph args struct v4 with dims size of 8, bump graph ext to v1.19
0c53f56 to
4e18233
Compare
|
I'm 100% in favor of cleaning up our extensions. I'm planning to engage with the ZE spec owners to see what we can get adopted into the spec (most likely still an experimental extension, but no longer private). As part of that effort, we'll need to clean up and have a rock-solid API. They're not going to take our historical baggage. But, I would like to keep general cleanup separate from this change. We'll really need to think through backwards compatibility, etc. |
Let's have an offline discussion. But we we won't have clean API if we don't do the dishes. |
No description provided.