HIVE-29578: Iceberg: add support for native views#6449

Open: difin wants to merge 1 commit into apache:master from difin:iceberg_native_views

Conversation

Contributor

@difin difin commented Apr 23, 2026

What changes were proposed in this pull request?

Added support for Iceberg native views in Hive for both HMS and REST catalogs.

There is a limitation in the current implementation: when Hive uses a REST catalog and creates a view on a partitioned Iceberg table, querying the view only works with CBO disabled. To be addressed in a follow-up PR.

Why are the changes needed?

To support Iceberg native views. This can be especially useful for REST Catalog clients.

Does this PR introduce any user-facing change?

Yes, new HQL syntax:

create view <view_name> as select * from <src_tbl> stored by iceberg;

How was this patch tested?

Added new unit and integration tests, and updated existing ones, with Iceberg native view test cases.

@difin difin force-pushed the iceberg_native_views branch from 4fdad42 to 252c608 Compare April 24, 2026 23:06
@difin difin force-pushed the iceberg_native_views branch from 252c608 to e10eba5 Compare April 24, 2026 23:31
@difin difin marked this pull request as ready for review April 24, 2026 23:31
@difin difin changed the title HIVE-29578: Iceberg: support for Iceberg native views HIVE-29578: Iceberg: support native views Apr 24, 2026
@difin difin changed the title HIVE-29578: Iceberg: support native views HIVE-29578: Iceberg: add support for native views Apr 24, 2026
@difin difin force-pushed the iceberg_native_views branch from e10eba5 to 96fa476 Compare April 25, 2026 20:46
@difin difin requested review from deniskuzZ and kasakrisz April 25, 2026 20:46
@difin difin force-pushed the iceberg_native_views branch from 96fa476 to 114412a Compare April 26, 2026 15:12
Comment thread iceberg/iceberg-handler/src/test/queries/positive/iceberg_native_view.q Outdated
Comment thread ql/src/java/org/apache/hadoop/hive/ql/ddl/view/create/CreateViewAnalyzer.java Outdated
result.setLastAccessTime(nowSec);
result.setRetention(Integer.MAX_VALUE);

boolean hiveEngineEnabled = false;
Contributor

What is hiveEngineEnabled and why is it false?

Contributor Author

@difin difin May 5, 2026

hiveEngineEnabled controls how HiveOperationsBase.storageDescriptor fills the storage descriptor: with HiveIcebergInputFormat / HiveIcebergOutputFormat / HiveIcebergSerDe when true, or with the usual placeholder FileInputFormat / FileOutputFormat / LazySimpleSerDe when false.

Why it’s false in toHiveView:

This path materializes an HMS VIRTUAL_VIEW for REST catalogs that expose Iceberg view metadata through the HMS API. That row isn’t meant to drive a Hive table scan the way a real Iceberg table commit does; execution still comes from the view definition / catalog, not from wiring Iceberg MR formats on the stub. HiveViewOperations does the same thing (hiveEngineEnabled = false).

So we keep a minimal SD consistent with normal virtual views and avoid implying this HMS object is an Iceberg-backed table for the Hive engine. For tables, HiveTableOperations still turns engine integration on/off via metadata + ConfigProperties.ENGINE_HIVE_ENABLED where that actually matters.
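
The choice described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in HiveOperationsBase.storageDescriptor: the StorageDescriptorSketch class, the SdFormats holder, and the formatsFor method are hypothetical, and the class names are the simple names quoted in this comment rather than fully qualified ones.

```java
/** Hypothetical sketch: how a flag like hiveEngineEnabled could select storage-descriptor formats. */
final class StorageDescriptorSketch {

  /** Holder for the three class names a storage descriptor carries (hypothetical type). */
  static final class SdFormats {
    final String inputFormat;
    final String outputFormat;
    final String serde;

    SdFormats(String inputFormat, String outputFormat, String serde) {
      this.inputFormat = inputFormat;
      this.outputFormat = outputFormat;
      this.serde = serde;
    }
  }

  static SdFormats formatsFor(boolean hiveEngineEnabled) {
    if (hiveEngineEnabled) {
      // Iceberg-backed table wired for the Hive engine.
      return new SdFormats("HiveIcebergInputFormat", "HiveIcebergOutputFormat", "HiveIcebergSerDe");
    }
    // Placeholder formats, as for an ordinary virtual view stub.
    return new SdFormats("FileInputFormat", "FileOutputFormat", "LazySimpleSerDe");
  }
}
```

With the flag off, as in toHiveView, formatsFor(false) yields the placeholder trio, so nothing on the HMS object suggests an Iceberg-backed table.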

create view v_ice as select * from src_ice stored by iceberg;

select * from v_ice;

Contributor

Could you please add

  • a logical view that does some transformation on its base table, plus a query from it?
  • views created both with and without a specified schema.

Contributor Author

@difin difin May 5, 2026

logical view which does some transformation on its base table and query from it?

This is not supported by Hive itself:

update v_ice set last_name = last_name + 'a' 
fname=iceberg_native_view.q

See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs.
 org.apache.hadoop.hive.ql.parse.SemanticException: You cannot update or delete records in a view
	at org.apache.hadoop.hive.ql.parse.RewriteSemanticAnalyzer.validateTargetTable(RewriteSemanticAnalyzer.java:265)
	at org.apache.hadoop.hive.ql.parse.RewriteSemanticAnalyzer.analyze(RewriteSemanticAnalyzer.java:84)
	at org.apache.hadoop.hive.ql.parse.RewriteSemanticAnalyzer.analyzeInternal(RewriteSemanticAnalyzer.java:73)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:358)
	at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
	at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:109)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:499)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:451)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:415)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:409)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:234)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:358)
	at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:790)
	at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:760)
	at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:115)
	at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:139)

create views when the schema is specified and not specified.

Done

Contributor

logical view which does some transformation on its base table

Sorry, I meant something like

select first_name || last_name from ... where <some filter condition>

because

select * from table;

as a view definition is a kind of edge case. It is ok for testing but not a typical use-case.

Contributor Author

Done

break;
}
}
boolean icebergNativeView = validateOptionalViewStorageClause(storageClause);
Contributor

Please do not hardcode anything like Iceberg into compiler code. The compiler is independent of the storage handler. I'm aware that we already have lots of code that violates this principle, and it already causes a lot of trouble.

Contributor Author

@difin difin May 5, 2026

Fixed - moved all Iceberg-specific code into HiveIcebergStorageHandler and kept generic interfaces in the Compiler.
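
The separation can be pictured with a minimal sketch like the one below. It is not the PR's code: the ViewCapableHandler interface and both handler classes are hypothetical stand-ins, and only the method name supportsNativeViewCatalog is taken from this PR, with its signature assumed.

```java
import java.util.Optional;

/** Hypothetical sketch: the compiler asks a generic handler contract instead of naming Iceberg. */
final class HandlerDispatchSketch {

  /** Stand-in for a storage handler contract; not Hive's actual interface. */
  interface ViewCapableHandler {
    // Default: the handler does not manage view metadata in its own catalog.
    default boolean supportsNativeViewCatalog() {
      return false;
    }
  }

  /** Stand-in for a handler that opts in, as an Iceberg handler might. */
  static final class IcebergLikeHandler implements ViewCapableHandler {
    @Override
    public boolean supportsNativeViewCatalog() {
      return true;
    }
  }

  /** Stand-in for a handler that keeps the default behavior. */
  static final class PlainHandler implements ViewCapableHandler { }

  /** Compiler-side check: no handler-specific names appear here. */
  static boolean usesNativeViewCatalog(Optional<ViewCapableHandler> handler) {
    return handler.map(ViewCapableHandler::supportsNativeViewCatalog).orElse(false);
  }
}
```

The compiler-side method depends only on the interface, so adding another handler with view support would not require touching it.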

private static final long serialVersionUID = 1L;

/** HMS table property set when the view is declared with {@code STORED BY ICEBERG} (native Iceberg view). */
public static final String ICEBERG_NATIVE_VIEW_PROPERTY = "hive.iceberg.native.view";
Contributor

Please remove this from here.

Contributor Author

Done

private final boolean ifNotExists;
private final boolean replace;
private final List<FieldSchema> partitionColumns;
private final boolean icebergNativeView;
Contributor

Please remove this from here.

Contributor Author

Done

Comment on lines +104 to +107
@Explain(displayName = "iceberg native view", displayOnlyOnTrue = true)
public boolean isIcebergNativeView() {
  return icebergNativeView;
}
Contributor

Please remove this from here.

Contributor Author

Done

sonarqubecloud Bot commented May 9, 2026

Comment on lines +448 to +449
VIEW_STORAGE_HANDLER_UNSUPPORTED(10448, "CREATE VIEW only supports STORED BY ICEBERG for native "
+ "Iceberg views; unsupported storage clause: {0}", true),
Contributor

Please rephrase this error message. Remove STORED BY and let Iceberg be a parameter.

Contributor Author

Done
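
For illustration, a parameterized message along the requested lines might look like the sketch below. The wording and the ViewErrorMessageSketch class are hypothetical, not the text that was committed, and the sketch assumes the {0}-style placeholders follow java.text.MessageFormat conventions.

```java
import java.text.MessageFormat;

/** Hypothetical sketch of an error message that takes the view format as a parameter. */
final class ViewErrorMessageSketch {

  // {0} is the supported native view format, {1} the rejected storage clause.
  // The wording is illustrative only.
  static final String TEMPLATE =
      "CREATE VIEW only supports {0} for native views; unsupported storage clause: {1}";

  static String format(String viewFormat, String storageClause) {
    return MessageFormat.format(TEMPLATE, viewFormat, storageClause);
  }
}
```

Keeping the format name out of the template means the message stays correct if another handler later supports views.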

Comment on lines +163 to +166
TableName.fromString(
view.name(), MetaStoreUtils.getDefaultCatalog(conf), Warehouse.DEFAULT_DATABASE_NAME);
result.setCatName(tableName.getCat());
result.setDbName(tableName.getDb());
Contributor

@kasakrisz kasakrisz May 11, 2026

What happens when a custom db is specified?

create view my_db.myview as...

Contributor Author

This method is only called when reading from a view through a REST catalog: it is called from HiveRESTCatalogClient.getTable(GetTableRequest tableRequest).
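
As a rough illustration of the question about custom databases, the sketch below shows one way a dotted name can be resolved so that defaults fill in only the missing leading parts. The ViewNameSketch class and its resolve method are hypothetical; whether TableName.fromString and view.name() behave this way is not confirmed in this thread.

```java
/** Hypothetical sketch of resolving a dotted view name against default catalog and database. */
final class ViewNameSketch {

  /** Returns {catalog, db, name}; defaults are used only for parts the name does not supply. */
  static String[] resolve(String name, String defaultCatalog, String defaultDb) {
    String[] parts = name.split("\\.");
    switch (parts.length) {
      case 1:
        return new String[] {defaultCatalog, defaultDb, parts[0]};
      case 2:
        return new String[] {defaultCatalog, parts[0], parts[1]};
      case 3:
        return parts;
      default:
        throw new IllegalArgumentException("Unexpected name: " + name);
    }
  }
}
```

Under this assumption, a name such as my_db.myview keeps my_db, and the default database applies only to an unqualified name.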

return conf;
}

private HiveCatalog verifyCatalog() {
Contributor

Why is this called verify? Based on the implementation of this method it is more like loading a catalog.
Could you please share some background of the use case of this method.

Contributor Author

Didn't notice, fixed.

conf, DB, VIEW, cols, "select 2 as id", null, null, true, false))
.isTrue();

assertThat(verifyCatalog().viewExists(TableIdentifier.of(DB, VIEW))).isTrue();
Contributor

I think it is worth checking whether the view definition is actually altered.

Contributor Author

Done

('fn7','ln7', 2);

----------------------------------------------------------------
-- Iceberg native view via TBLPROPERTIES before AS
Contributor

Does "before AS" add any value in this comment? AFAIK the grammar allows only this form.

Contributor Author

Removed "before as" part.

iceberg_create_locally_zordered_table.q,\
iceberg_merge_delete_files.q,\
iceberg_merge_files.q,\
iceberg_native_view.q,\
Contributor

Is there anything LLAP-specific in view DDLs? If not, can this test be run by the default Iceberg driver?

Contributor Author

No, this test is not LLAP-specific. Changed to run it by TestIcebergCliDriver instead of TestIcebergLLAPLocalCliDriver.

Comment on lines +93 to +102
@Test
public void testParseCreateViewTblpropertiesViewFormatIceberg() throws Exception {
  ASTNode tree = parseDriver.parse(
      "create view v1 tblproperties ('view-format'='iceberg') as select * from t", null).getTree();
  assertTrue(tree.dump(), tree.toStringTree().contains("tok_createview"));
  assertTrue(tree.dump(), tree.toStringTree().contains("tok_tableproperties"));
  assertTrue(tree.dump(), tree.toStringTree().contains("view-format"));
  assertTrue(tree.dump(), tree.toStringTree().contains("iceberg"));
}

Contributor

This test class is about default keyword testing. Could you please move this test to the class that covers view creation? Please create a new one if none exists. Since this grammar is not Iceberg-specific, it can be a generic one.

Contributor Author

Done, moved to a new class for testing view parsing.

Comment on lines +248 to +251
if (explicitViewFormat) {
  throw new SemanticException(ErrorMsg.VIEW_STORAGE_HANDLER_UNSUPPORTED.getMsg(
      "Native view metadata is not supported for storage handler: " + handlerClass));
}
Contributor

We have the knowledge about explicit view format at the caller.
Could you please move this check to the caller method.

Contributor Author

Done, moved to the caller.

Comment on lines +54 to +57
if (desc.usesNativeViewCatalog()) {
  executeNativeCatalogView();
  return 0;
}
Contributor

I think this decision should be moved to BaseHiveIcebergMetaHook or HiveIcebergMetaHook.preCreateTable like we do in case of Iceberg tables.

Please set the StorageHandlerClass in CreateViewOperation.createViewObject to the newly created ql.metadata.Table object because later this object is passed to HMS and the meta hook.

    if (desc.usesNativeViewCatalog()) {
      storageFormat.setStorageHandler(desc.getNativeViewStorageHandlerClass());
      view.setProperty(
          org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_STORAGE,
          desc.getNativeViewStorageHandlerClass());
    }

I'm not a big fan of this meta hook solution, but this is what we have for Iceberg tables, and IMHO it is better to be consistent.

Probably create or replace and if not exists don't have to be handled separately for Iceberg views if you make this change.

Please add some tests for create or replace and if not exists to the q test iceberg_native_view.q.
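
The behavior such tests would exercise can be summarized with a small in-memory sketch. Everything here is hypothetical (the ViewCreateSemanticsSketch class, its map-backed catalog, and the choice to let ifNotExists win when both flags are set); it only models the expected outcomes, not Hive's or Iceberg's implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of CREATE VIEW with OR REPLACE and IF NOT EXISTS. */
final class ViewCreateSemanticsSketch {

  // View name mapped to its SQL definition.
  private final Map<String, String> views = new HashMap<>();

  /** Returns true if the stored definition was created or replaced. */
  boolean create(String name, String sql, boolean replace, boolean ifNotExists) {
    if (views.containsKey(name)) {
      if (ifNotExists) {
        return false; // keep the existing definition untouched
      }
      if (!replace) {
        throw new IllegalStateException("View already exists: " + name);
      }
    }
    views.put(name, sql);
    return true;
  }

  String definition(String name) {
    return views.get(name);
  }
}
```

A q test would cover the same three outcomes: if not exists leaves an existing definition alone, create or replace swaps it, and a plain create on an existing view fails.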

}

/**
* Resolves {@code STORED BY identifier} for CREATE VIEW (short names such as {@code ICEBERG} or an FQCN).
Contributor

Was STORED BY left here intentionally?

* Keys should be removed when {@linkplain #clearNativeViewHmsTableProperties(Map)} is invoked for the same
* handler class recorded under {@link Constants.NATIVE_VIEW_STORAGE_HANDLER_CLASS_PARAM}.
*/
default Map<String, String> getNativeViewHmsTableProperties() {
Contributor

Can we call this simply getViewProperties? I would like to understand what native and hms mean in this context.

AFAIK, a native object is one that doesn't have a storage handler. To me, the word native in these method names is misleading. Could you please elaborate on this a bit?

getNativeViewHmsTableProperties
clearNativeViewHmsTableProperties
createOrReplaceNativeView
supportsNativeViewCatalog
