Skip to content

GH-49716: [C++] FixedShapeTensorType::Deserialize should strictly validate serialized metadata#49718

Open
rok wants to merge 3 commits intoapache:mainfrom
rok:gh-49716-fst-deserialize
Open

GH-49716: [C++] FixedShapeTensorType::Deserialize should strictly validate serialized metadata#49718
rok wants to merge 3 commits intoapache:mainfrom
rok:gh-49716-fst-deserialize

Conversation

@rok
Copy link
Copy Markdown
Member

@rok rok commented Apr 13, 2026

Rationale for this change

FixedShapeTensorType::Deserialize should validate input from unknown sources.

What changes are included in this PR?

Adds stricter deserialization valideation.

Are these changes tested?

Yes. New tests are added.

Are there any user-facing changes?

Stricter validation should not be noticed if metadata is correct as per spec of fixed_shape_tensor.

@rok
Copy link
Copy Markdown
Member Author

rok commented Apr 13, 2026

@pitrou

@pitrou
Copy link
Copy Markdown
Member

pitrou commented Apr 13, 2026

I suppose VariableShapeTensorType must be similarily improved, right?

@pitrou pitrou added the Critical Fix Bugfixes for security vulnerabilities, crashes, or invalid data. label Apr 13, 2026
@pitrou
Copy link
Copy Markdown
Member

pitrou commented Apr 13, 2026

@raulcd I think this should be part of 24.0.0.

@rok
Copy link
Copy Markdown
Member Author

rok commented Apr 13, 2026

I suppose VariableShapeTensorType must be similarily improved, right?

shape is not stored in metadata of VariableShapeTensorType but rather in a field of the struct array (which is storage of variable shape tensor) and could be created invalid. But that wouldn't be checked at deserialization time typically. Is this a check we should do at deserialization time?

We do have uniform_shape which we currently parse and partly validate, but don't really validate. The idea was to potentially use it for optimizations later on/

I added a validation step for permutation.

@rok
Copy link
Copy Markdown
Member Author

rok commented Apr 13, 2026

@pitrou I pushed requested changes, please do another pass.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Apr 13, 2026
for (auto& x : document["shape"].GetArray()) {
for (const auto& x : document["shape"].GetArray()) {
if (!x.IsInt64()) {
return Status::Invalid("shape must contain integers");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we show the actual type for easy to debug on failure?

for (auto& x : document["permutation"].GetArray()) {
const auto& json_permutation = document["permutation"];
if (!json_permutation.IsArray()) {
return Status::Invalid("permutation must be an array");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

}
for (const auto& x : json_permutation.GetArray()) {
if (!x.IsInt64()) {
return Status::Invalid("permutation must contain integers");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

for (auto& x : document["dim_names"].GetArray()) {
const auto& json_dim_names = document["dim_names"];
if (!json_dim_names.IsArray()) {
return Status::Invalid("dim_names must be an array");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

}
for (const auto& x : json_dim_names.GetArray()) {
if (!x.IsString()) {
return Status::Invalid("dim_names must contain strings");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting change review Awaiting change review Component: C++ Critical Fix Bugfixes for security vulnerabilities, crashes, or invalid data.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants