add "ann" as reserved keyword #2005
base: 4.x
Conversation
Good catch @Hazel-Datastax! We actually had to address something very similar to this for dsbulk. Should've occurred to me this part of the Java driver might have an issue as well.
So, there's definitely something weird going on here. In Apache Cassandra 5.x "ann" is very definitely an unreserved keyword. The CQL docs in the Cassandra repo talk about the distinction a bit: reserved keywords can never be used as an identifier, while unreserved keywords can in some situations... but those situations aren't specified. If an unreserved keyword is used in a spot that might introduce a conflict it presumably would have to be quoted... but it's not clear how the driver can identify such a situation. The dsbulk change I referenced above doesn't need to worry about this distinction. It includes its own ANTLR-derived parser (a subset of what's actually used in Cassandra) so it can identify these keyword cases using (essentially) the same grammar Apache Cassandra uses. I also note that the set "ann" is added to in this PR is explicitly for reserved keywords; each member of that set is a reserved keyword (as defined in the CQL docs above) and no unreserved keywords are included. Presumably that's true because the code can always quote reserved keywords when generating CQL strings... but unreserved keywords are a bit trickier. To make it even worse: I note the following against Apache Cassandra 5.0.0:
The string "ann" works just fine as a table name there. But when I try something similar on Astra I get results similar to what I think you're describing:
So we've clearly got inconsistencies in the behaviour here between Astra and Apache Cassandra. But to make matters worse Astra is internally inconsistent: some unreserved keywords (such as "filtering" and "function") are just fine to use as table names while I can't get "ann" to be used as a table name whether I quote it or not.
@adutra @aratno @tolbertam I'm curious about what you guys think of this. Short version:
My current thinking is that there isn't really much we can do here. Without better guidance as to when unreserved keywords should be quoted, the Java driver can't really interject, so it's up to the user to quote unreserved keywords when appropriate. If you have a full-blown CQL parser you could do better (see the referenced dsbulk issue above) but short of that you're kind of limited. Thoughts?
The token I agree that unreserved keywords lack a clear, well-defined meaning, but in any case, they can be table identifiers since the
So, I agree with @absurdfarce and I don't think it's correct to add About Astra vs C* 5.0 observed differences:
But in any case, and until we get more insights, the Astra behavior does not invalidate the fact that
Yeah, you're correct on Actually, now that I say that it's interesting to me that the single-quoted version failed with a similar error message. I need to go look at the Astra ANTLR grammar to see if there's something else going on here.
I compared the Astra ANTLR grammar to the OSS Cassandra grammar and there weren't any obvious differences. Most directly relevant to this case Astra defines K_ANN in basic_unreserved_keyword just like OSS C* does. Some follow-up testing did reveal an interesting case though:
With this request we were able to get past the ANTLR parsing and into actual functionality. I'm not at all sure where that response message is coming from (or why it's being returned in this specific case) but the fact that we got that far makes me wonder if there isn't a whitespace issue of some kind with "ann" in a way I wasn't expecting.
I found a corner case when using Data API (stargate/data-api#1806). I cannot use "ann" as my table name, but I can use it in CQL:
The reason is that the Java driver keeps a set containing all the reserved keywords. When the query builder builds the CREATE TABLE query, it calls tableName.asCql(true). Inside the asCql(true) method, the driver checks whether the string is in the reserved keywords set and double-quotes it if so. Unfortunately, the set doesn't contain "ann".
I guess "ann" was introduced later and the keywords set hasn't been updated accordingly.
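The check described above can be sketched roughly as follows. This is a simplified illustration, not the driver's actual code: the class name, the RESERVED sample set and the UNQUOTED_SAFE pattern are all assumptions made for the example, and the real keyword set is much larger. It shows why "ann" comes out unquoted when asCql(true) is used, while a reserved keyword such as "table" is double-quoted.

```java
import java.util.Set;
import java.util.regex.Pattern;

public class IdentifierQuotingSketch {

  // Hypothetical, deliberately tiny sample of reserved keywords.
  // Note that "ann" is NOT in it, mirroring the situation described above.
  public static final Set<String> RESERVED =
      Set.of("select", "from", "where", "table", "create");

  // Assumption: identifiers made of lower-case letters, digits and underscores
  // are the only ones that can be emitted without quotes.
  public static final Pattern UNQUOTED_SAFE = Pattern.compile("[a-z][a-z0-9_]*");

  // Wrap in double quotes, doubling any embedded double quote.
  public static String doubleQuote(String internal) {
    return '"' + internal.replace("\"", "\"\"") + '"';
  }

  // pretty == true: emit the shortest form, quoting only when needed.
  // pretty == false: always emit the double-quoted form.
  public static String asCql(String internal, boolean pretty) {
    if (!pretty) {
      return doubleQuote(internal);
    }
    boolean safe =
        UNQUOTED_SAFE.matcher(internal).matches() && !RESERVED.contains(internal);
    return safe ? internal : doubleQuote(internal);
  }

  public static void main(String[] args) {
    // Reserved keyword: quoted automatically.
    System.out.println("CREATE TABLE " + asCql("table", true) + " (id int PRIMARY KEY)");
    // Not in the reserved set: emitted bare, which is the corner case reported here.
    System.out.println("CREATE TABLE " + asCql("ann", true) + " (id int PRIMARY KEY)");
    // Forcing the quoted form sidesteps the keyword question entirely.
    System.out.println("CREATE TABLE " + asCql("ann", false) + " (id int PRIMARY KEY)");
  }
}
```

In this sketch, a caller who knows an identifier may collide with an unreserved keyword can request the always-quoted form (pretty set to false) rather than relying on the keyword set being complete.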