DiscoveryEngine Answer search result structData PHPDoc is too narrow #8880

@glaziermag

Description

GoogleCloudDiscoveryengineV1AnswerStepActionObservationSearchResult::$structData is generated with PHPDoc type array[], but the Discovery Engine REST schema defines this field as a JSON object whose property values may be any JSON value.

In repo main:

/**
 * Data representation. The structured JSON data for the document. It's
 * populated from the struct data from the Document, or the Chunk in search
 * result.
 *
 * @var array[]
 */
public $structData;

The same PHPDoc is present in the current Packagist release google/apiclient-services v0.445.0 at source commit d76b09227d898db351457010c88f39eedfb815aa.

Expected type

The public Discovery Engine v1 discovery document describes the field as:

{
  "type": "object",
  "description": "Data representation. The structured JSON data for the document. It's populated from the struct data from the Document, or the Chunk in search result.",
  "additionalProperties": {
    "type": "any",
    "description": "Properties of the object."
  }
}

So the PHPDoc should permit a string-keyed map whose values can be any JSON value, scalars included, e.g. array<string, mixed> or another object-map form, not array[] (an array of arrays).
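For illustration only, here is a rough sketch of what the broader annotation could look like. The class name SearchResultSketch is a hypothetical, trimmed stand-in for the generated model, and whether the generator can emit array<string, mixed> is an assumption I have not verified; the generic array syntax itself is understood by common PHP static analyzers.

```php
<?php
declare(strict_types=1);

// Hypothetical, trimmed stand-in for the generated model class; only the
// property under discussion is shown, and the class name is illustrative.
class SearchResultSketch
{
    /**
     * Data representation. The structured JSON data for the document.
     *
     * @var array<string, mixed>
     */
    public $structData;
}

$result = new SearchResultSketch();

// Scalar members, as seen in real responses, fit array<string, mixed>
// but would not fit array[] (every value would have to be an array).
$result->structData = ['title' => 'Structured source', 'version' => 1];
```

With this annotation, reading $result->structData['title'] as a string would no longer contradict the documented type.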

Valid response shape

I observed this in saved HTTP 200 servingConfigs:answer REST responses. A minimized AnswerQueryResponse-shaped fixture is:

{
  "answer": {
    "state": "SUCCEEDED",
    "steps": [
      {
        "actions": [
          {
            "observation": {
              "searchResults": [
                {
                  "document": "projects/PROJECT/locations/global/collections/default_collection/dataStores/DATA_STORE/branches/0/documents/DOCUMENT",
                  "structData": {
                    "title": "Structured source",
                    "facts": "Scalar string values are valid Struct fields.",
                    "version": 1
                  }
                }
              ]
            }
          }
        ]
      }
    ]
  }
}

The saved corpus had scalar string/integer structData members in 36 of 50 HTTP 200 AnswerQueryResponse fixtures. Other generated surfaces I checked use broader map/object types for this same shape, for example Java uses Map<String, Object> and the Python stubs use dict[str, typing.Any].

Impact

The runtime setter is untyped, so this is not a runtime deserialization crash. The bug is in the public generated model type documentation: IDEs and static-analysis users see array[], which is narrower than valid REST output and can incorrectly reject scalar structData properties.

Repro

Offline repro logic:

  1. Load the minimized fixture above and assert structData.title/facts are strings and structData.version is an integer.
  2. Load GoogleCloudDiscoveryengineV1AnswerStepActionObservationSearchResult.php.
  3. Assert the generated PHPDoc for $structData is @var array[].
  4. Load the Discovery Engine v1 discovery schema and assert the field is object with additionalProperties.type = any.

This reproduces on repo main and on the latest published Packagist source.
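The repro steps can be sketched as a small self-contained PHP script. This is a minimal sketch, not the exact script I ran: docblockVarType is a hypothetical helper, and the model source and schema are inlined excerpts rather than the real files, which a full run would load with file_get_contents().

```php
<?php
declare(strict_types=1);

// Hypothetical helper: extract the @var type from the docblock directly above
// a public property in a PHP source string. Returns null if nothing matches.
function docblockVarType(string $phpSource, string $property): ?string
{
    $pattern = '~/\*\*(?:(?!\*/).)*?@var\s+(\S+)(?:(?!\*/).)*?\*/\s*public\s+\$'
        . preg_quote($property, '~') . '\b~s';
    return preg_match($pattern, $phpSource, $m) === 1 ? $m[1] : null;
}

// Step 1: structData from the minimized fixture.
$structData = json_decode(
    '{"title": "Structured source", "facts": "Scalar string values are valid Struct fields.", "version": 1}',
    true
);

// Steps 2-3: inlined excerpt of the generated model (assumption: the real
// file contains this docblock, as quoted in this issue).
$modelSource = <<<'PHP'
/**
 * Data representation. The structured JSON data for the document.
 *
 * @var array[]
 */
public $structData;
PHP;

// Step 4: inlined excerpt of the discovery schema for the field.
$schema = json_decode(
    '{"type": "object", "additionalProperties": {"type": "any"}}',
    true
);

$varType = docblockVarType($modelSource, 'structData');

// array[] requires every value to be an array; collect the members that are not.
$nonArrayKeys = array_keys(array_filter($structData, fn ($v) => !is_array($v)));

echo "documented type: {$varType}\n";
echo 'schema: ' . $schema['type'] . ' of ' . $schema['additionalProperties']['type'] . "\n";
echo 'members that violate array[]: ' . implode(', ', $nonArrayKeys) . "\n";
```

Run against the excerpts above, the script should report array[] as the documented type while listing every fixture member as a non-array value, which is the mismatch described in this issue.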
