@@ -75,6 +75,12 @@ public class ExprNodeGenericFuncDesc extends ExprNodeDesc implements
public ExprNodeGenericFuncDesc() {;
}

/**
* Constructor.
*
* @param children the children; a copy is made, so later changes to the passed list
* do not affect the children of this instance
*/
Comment on lines +78 to +83
Member

The doc seems a bit repetitive. We could mention the copy just once, above the field declaration:

private List<ExprNodeDesc> children;

Contributor Author

I would prefer to see such information in the IDE when hovering over the constructor.

/* If the function has an explicit name like func(args) then call a
* constructor that explicitly provides the function name in the
* funcText argument.
@@ -86,6 +92,12 @@ public ExprNodeGenericFuncDesc(TypeInfo typeInfo, GenericUDF genericUDF,
genericUDF, funcText, children);
}

/**
* Constructor.
*
* @param children the children; a copy is made, so later changes to the passed list
* do not affect the children of this instance
*/
public ExprNodeGenericFuncDesc(ObjectInspector oi, GenericUDF genericUDF,
String funcText,
List<ExprNodeDesc> children) {
@@ -94,16 +106,28 @@ public ExprNodeGenericFuncDesc(ObjectInspector oi, GenericUDF genericUDF,
ObjectInspectorUtils.getWritableObjectInspector(oi);
assert (genericUDF != null);
this.genericUDF = genericUDF;
- this.children = children;
+ this.children = children == null ? new ArrayList<>() : new ArrayList<>(children);
Member

Why do you need a new ArrayList<>(children) here? Why can't it just be:

    this.children = children == null ? List.of() : children;

Contributor Author

The new ArrayList<>(children) is required because otherwise the NPE occurs. I've seen that some callers of getChildren modify the list, e.g., DynamicPartitionPruningOptimization, so I opted for new ArrayList<>() instead of List.of().

Contributor

@ayushtkn if we don't explicitly convert it to an ArrayList, Kryo cannot determine the actual runtime List type of ExprNodeGenericFuncDesc.children and uses AbstractMapBasedMultimap$WrappedCollection, which throws an NPE in the deserializer in the Tez task.

The explicit conversion ensures Kryo knows it's an ArrayList and won't use AbstractMapBasedMultimap$WrappedCollection, avoiding this NPE.
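
A minimal JDK-only sketch of the copy described above. It does not involve Kryo or Guava and does not reproduce the NPE; `FuncNode` is a hypothetical stand-in for ExprNodeGenericFuncDesc, with String children in place of ExprNodeDesc. It only shows that, whatever List implementation the caller passes in, the field ends up holding a plain java.util.ArrayList that is independent of the caller's list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ExprNodeGenericFuncDesc; not the actual Hive class.
final class FuncNode {
  private List<String> children;

  FuncNode(List<String> children) {
    // Copy into a plain ArrayList so the field never holds a caller-specific List implementation.
    this.children = children == null ? new ArrayList<>() : new ArrayList<>(children);
  }

  List<String> getChildren() {
    return children;
  }
}

final class DefensiveCopyDemo {
  public static void main(String[] args) {
    // A wrapper list, used here as an assumed analogue of a library-internal List view.
    List<String> view = Collections.unmodifiableList(Arrays.asList("foo", "bar"));
    FuncNode node = new FuncNode(view);

    // The runtime class of the argument is a JDK-internal wrapper type ...
    System.out.println(view.getClass().getName());
    // ... while the stored children are always a java.util.ArrayList.
    System.out.println(node.getChildren().getClass().getName());
  }
}
```

The same constructor also maps a null argument to an empty, mutable list, which matches the null handling in the changed line.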

Contributor

@nareshpr Mar 6, 2026

@thomasrebele I suspect it's more of a Kryo-Guava deserializer issue when the children object is not null. Do you think we need to convert null to an empty list?

Contributor Author

It is safer to avoid null for children, as there are several places without a null check, e.g., in getExprString. The children are exposed to other classes by getChildren(), so it's simpler to just use an empty list instead of adding null checks everywhere.

Member

Another approach would be to check whether children == null and throw an NPE or IllegalArgumentException. Anyway, I don't think we currently have any such calls in the code, so choosing between an exception, null, or new ArrayList<>() is a rather subtle detail.
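
A minimal sketch of the fail-fast alternative mentioned above, assuming the NPE variant is chosen. `StrictFuncNode` and the message text are hypothetical and not part of the PR; children are Strings here in place of ExprNodeDesc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in; illustrates rejecting null instead of replacing it with an empty list.
final class StrictFuncNode {
  private final List<String> children;

  StrictFuncNode(List<String> children) {
    // Objects.requireNonNull throws NullPointerException with this message when children is null.
    this.children = new ArrayList<>(Objects.requireNonNull(children, "children must not be null"));
  }

  List<String> getChildren() {
    return children;
  }
}
```

With this variant a null argument fails at construction time, rather than later at a call site that lacks a null check.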

Contributor Author

Indeed, though I'll leave the null-check for another PR.

Member

@deniskuzZ Mar 10, 2026

@thomasrebele, why not use jdk21 List.of() instead?

Contributor Author

I've seen that some callers of getChildren modify the list, e.g., DynamicPartitionPruningOptimization, so I opted for new ArrayList<>() instead of List.of().
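
A minimal JDK-only sketch of the difference referred to above: a list returned by List.of() rejects modification with UnsupportedOperationException, while an ArrayList accepts it. `MutabilityDemo` and `canAppend` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class MutabilityDemo {
  // Returns true if adding to the list succeeds, false if the list rejects modification.
  static boolean canAppend(List<String> list) {
    try {
      list.add("extra");
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // A caller that modifies the result of getChildren() needs the first kind of list.
    System.out.println(canAppend(new ArrayList<>()));
    System.out.println(canAppend(List.of()));
  }
}
```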

Contributor Author

I've created HIVE-29505 to make the children immutable. So I propose to postpone using List.of() until HIVE-29505 has been implemented.

this.funcText = funcText;
}

/**
* Constructor.
*
* @param children the children; a copy is made, so later changes to the passed list
* do not affect the children of this instance
*/
// Backward-compatibility interfaces for functions without a user-visible name.
public ExprNodeGenericFuncDesc(TypeInfo typeInfo, GenericUDF genericUDF,
List<ExprNodeDesc> children) {
this(typeInfo, genericUDF, null, children);
}

/**
* Constructor.
*
* @param children the children; a copy is made, so later changes to the passed list
* do not affect the children of this instance
*/
public ExprNodeGenericFuncDesc(ObjectInspector oi, GenericUDF genericUDF,
List<ExprNodeDesc> children) {
this(oi, genericUDF, null, children);
@@ -125,8 +149,14 @@ public void setGenericUDF(GenericUDF genericUDF) {
this.genericUDF = genericUDF;
}

/**
* Sets the children.
*
* @param children the new children; a copy is made, so later changes to the passed list
* do not affect the children of this instance
*/
public void setChildren(List<ExprNodeDesc> children) {
- this.children = children;
+ this.children = children == null ? new ArrayList<>() : new ArrayList<>(children);
}

@Override
@@ -23,17 +23,21 @@
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

import com.google.common.collect.ArrayListMultimap;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
import org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat;
import org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeDescUtils;
import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
@@ -153,6 +157,25 @@ public void testUnsupportedDeserialization() throws Exception {
Assert.assertTrue(ExprNodeDescUtils.isSame(validExpr, desc));
}

@Test
public void testSerializeChildrenFromGuavaCollection() throws Exception {
ExprNodeDesc column = new ExprNodeColumnDesc(TypeInfoFactory.stringTypeInfo, "foo", null, false);
ExprNodeDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.stringTypeInfo, "bar");
ArrayListMultimap<Object, ExprNodeDesc> l = ArrayListMultimap.create();
l.put("coltype", column);
l.put("coltype", constant);

FunctionInfo inFunctionInfo = FunctionRegistry.getFunctionInfo("in");
Optional<Collection<ExprNodeDesc>> v = l.asMap().values().stream().findFirst();
Assert.assertTrue(v.isPresent());
List<ExprNodeDesc> children = (List<ExprNodeDesc>) v.get();
ExprNodeGenericFuncDesc node = ExprNodeGenericFuncDesc.newInstance(inFunctionInfo.getGenericUDF(), "in", children);

byte[] buf = SerializationUtilities.serializeObjectWithTypeInformation(node);
ExprNodeDesc desc = SerializationUtilities.deserializeObjectWithTypeInformation(buf, true);
Assert.assertTrue(ExprNodeDescUtils.isSame(node, desc));
}

private MapWork doSerDeser(Configuration configuration) throws Exception, IOException {
MapWork mapWork = mockMapWorkWithSomePartitionDescProperties();
ByteArrayOutputStream baos = new ByteArrayOutputStream();