Feature request: managing schemas #50
Comments
I really like this idea. I will definitely take this on in the future, though I'm not sure exactly when I'd have the time. I could see it happening within the next couple of months. |
I'm very interested in this too, and I'm ready to help. Best, |
I love this idea and this would be an awesome feature to have. I think it fits in great with the other features of kafka-gitops. I've got some availability coming up and would be able to help out as well. @jrevillard I'd be happy to have your help as well! With a feature this big, I'd like to do a bit of planning & outlining before we get started on the code. I'd like to make a few examples of how to structure the YAML and discuss. |
Hello, I just saw that you have a first implementation, @Twb3! tball-dev@50fa5cc What's the status? Do you need help? Best, |
Looking forward to having this feature @Twb3! |
Hey guys, sorry I did not see this earlier. I did a quick POC for myself to see what's possible. I think I've got it mostly nailed down. I hope to propose a file structure soon. I just need to write it up. |
Schema Registry POC
Proposed Schema State File Structure
schemas:
  order-value:
    type: Avro
    file: order-schema.avsc
  order-2-value:
    type: Avro
    file: order-schema.avsc
  shipment-value:
    type: Avro
    file: shipment-schema.avsc
    references:
      - name: order-value
        subject: order-value
        version: 1
Each schema entry above is the name of the subject to be registered in the Schema Registry. I did this to keep it similar to how topic and service entries are the name of the resource to be created. Type is self-explanatory although this POC is restricted to only Avro, because I don't know how to parse PROTOBUF yet. (I couldn't find good examples in the confluent schema registry client either) File is a reference to the schema file located at
Config
Config is handled via environment variables:
Login module is currently hardcoded to
Things to discuss
Schema differences
To know if a schema needs to be updated, I am parsing the schema file and generating a diff using zjsonpatch. I chose zjsonpatch because it returns differences by JSON node rather than for the entire file. For example, the content of your schema could be identical, but you could have rearranged the order of nodes. The more I think about this as I type, the less necessary it seems. Ultimately, this part still needs work.
Deletion
Schema Registry allows us to soft-delete and permanently delete. I think we would want to always permanently delete, since we want our state file to represent exactly what is deployed. This is how my code currently works. I believe this also deletes all versions.
Validation
For validating schemas I did more than just check that the YAML is valid. I check that the schema file exists at |
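The schema-differences point above (identical content, rearranged nodes) can be illustrated with a small sketch. It is not the POC's zjsonpatch-based code; it swaps in Python's standard `json` module and simply compares the parsed values. The function name `schemas_equivalent` and the sample `Order` record are hypothetical, made up for illustration.

```python
import json


def schemas_equivalent(local_schema: str, registered_schema: str) -> bool:
    """Return True when two schema documents parse to the same JSON value.

    JSON objects become dicts, whose equality ignores key order. JSON arrays
    become lists, whose equality is order-sensitive, so reordering the
    entries of a "fields" array still counts as a change.
    """
    return json.loads(local_schema) == json.loads(registered_schema)


# Hypothetical Avro record, used only for illustration.
local = '{"type": "record", "name": "Order", "fields": [{"name": "id", "type": "string"}]}'
# Same content with the object keys rearranged.
reordered = '{"name": "Order", "fields": [{"type": "string", "name": "id"}], "type": "record"}'
# A real change: the field type differs.
changed = '{"type": "record", "name": "Order", "fields": [{"name": "id", "type": "long"}]}'

print(schemas_equivalent(local, reordered))  # True
print(schemas_equivalent(local, changed))    # False
```

A plain equality check like this only answers "does it need updating?"; a node-level diff such as the one zjsonpatch produces would still be useful for showing what changed in a plan output.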
Nice. Avro is a very good start. Just make sure that this will also work against Schema Registry in Confluent Cloud. |
Dear @Twb3, this seems really promising, thanks! Yes, Avro is a good start, and the final goal would be to support Thrift, Protocol Buffers, and JSON Schema. You say that you use Confluent's Schema Registry client to validate the Avro schemas, so I think this library would be capable of validating the other types too, wouldn't it? Concerning config, I could contribute Kerberos support, as I will need it :-) Best, |
@Twb3 How is this feature going? Is it stable enough to start using? I am very eager to have this as soon as possible. |
I don't know if you were aware of this: https://github.com/domnikl/schema-registry-gitops |
There is one thing that is complicated for me to answer: how do we deal with schema IDs and versions? Those IDs are generated server-side and are used by the Kafka clients to identify the correct schema. This means there is no way to ensure that a schema will get a particular ID/version, and therefore kafka-gitops cannot be the source of truth for this, can it? |
As promised, you can find more than a POC implementation in #76! Please comment, improve, etc. |
@HSA72 I apologize for not following up on this sooner. I have not had the opportunity recently to dedicate time to this feature. @jrevillard Thanks for posting that link to the schema registry gitops implementation! Looks promising. |
Nice I will take a look! |
@jrevillard I wasn't aware of that project - pretty nice. I still like our approach of putting it into this tool. Maybe their owner would like to help contribute as well? I'll let you and @Twb3 take the lead on this and then give suggestions and take a look at the POC shortly. |
Hi,
Right now there is nothing like this for schema management. It might be useful to also allow/declare what topics/subjects use what schemas:
This would allow the topic state, and schema state to be managed by one tool/file.
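One way to picture linking topics to schemas is a rough sketch that assumes Confluent's default TopicNameStrategy, under which a topic's subjects are named `<topic>-key` and `<topic>-value`. The helper names and the topic list below are hypothetical; the subject names are taken from the POC state file discussed in this thread, and `audit` is an invented topic with no schema.

```python
def subjects_for_topic(topic: str) -> dict:
    """Subject names for a topic, assuming the default TopicNameStrategy."""
    return {"key": f"{topic}-key", "value": f"{topic}-value"}


def topics_missing_value_schema(topics, schema_subjects):
    """List topics whose value subject is not declared in the schema state."""
    declared = set(schema_subjects)
    return [t for t in topics if subjects_for_topic(t)["value"] not in declared]


# Hypothetical topic state; "audit" has no matching schema entry.
topics = ["order", "order-2", "shipment", "audit"]
# Subjects as declared in the proposed schema state file.
schema_subjects = ["order-value", "order-2-value", "shipment-value"]

print(topics_missing_value_schema(topics, schema_subjects))  # ['audit']
```

A check along these lines could run during validation when topics and schemas are declared in the same state file. Other subject name strategies would need a different mapping.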
java -jar kafka-schema-gitops-1.0-SNAPSHOT-jar-with-dependencies.jar -i <input_yaml> validate # or execute