-
Notifications
You must be signed in to change notification settings - Fork 10
adding secret watcher to restart ptp4l process #111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thanks for your PR, |
pkg/daemon/daemon.go
Outdated
| chronydProcessName, // there can be only one chronyd process in the system | ||
| } | ||
|
|
||
| // saFileInfo tracks authentication file information for a profile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What specifically is sa short for in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
security association, sa_file is an option inside the ptp4lconf's global section where we can add a filepath to mount a specific secret inside the linuxptp-daemon-container.
basically I converted some of the code from the ptpconfig_controller here because the file watcher forces a restart on the pods in case a change is detected on the mounted secret, so instead we're having a restart on the ptp4l process.
98fdb45 to
a3fcf96
Compare
pkg/daemon/daemon.go
Outdated
| } | ||
| // Filter for relevant events (Write, Create - Kubernetes atomic updates) | ||
| // Ignore events on temporary/hidden files | ||
| if event.Op&(fsnotify.Write|fsnotify.Create) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should delete also be handled as a change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you mean "deleting the sa_file" then the controller actually removes the volumeMount in case the user delete it from the ptp4lconf or manually from the file system of the container. which forces a restart on the daemonset.
f594bd0 to
258876f
Compare
9563424 to
67338f9
Compare
67338f9 to
e9c61c0
Compare
e9c61c0 to
e3af397
Compare
greyerof
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor (nit) suggestions but overall looks good to me. Just one question: in case there's two keys added to an existing secret, will it receive one or two consecutive events?
pkg/daemon/daemon.go
Outdated
| tracker.processManager = pm | ||
|
|
||
| // Initialize fsnotify watcher for sa_file change detection | ||
| watcher, err := fsnotify.NewWatcher() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You can use the same "saFilesWatcher" name for this local var. It's a more descriptive name to use than just "watcher".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that looks better, I changed it to saFileWatch
pkg/daemon/daemon.go
Outdated
| var watcherEvents chan fsnotify.Event | ||
| var watcherErrors chan error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent with the other channels in the for-select loop (like "updateCh, stopCh)", please append the "Ch" suffix to the name of this vars. Also, adding the prefix "saFilesWatcher" might be a good fit to make them more descriptive:
var saFilesWatcherEventCh ...
var saFilesWatcherErrCh ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done !
e3af397 to
fb15fad
Compare
| defer dn.saFileWatcher.Close() | ||
| glog.Info("Using fsnotify for instant sa_file change detection") | ||
| } else { | ||
| glog.Warning("fsnotify unavailable, sa_file change detection disabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by fsnotify unavailable? That the fsnotify watcher in unavailable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case trying to initialize the fsnotify file watcher fails --> logs a warning fsnotify unavailable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
example would be when authentication is disabled and there is no ptp-secret-mount directory in the linuxptp daemon pod. (the watch directory PtpSecretMountDir doesn't exists )
pkg/daemon/daemon.go
Outdated
| continue // Ignore hidden files like .data | ||
| } | ||
| glog.Infof("Security file changed: %s (op: %s), restarting PTP processes", event.Name, event.Op.String()) | ||
| err := dn.applyNodePTPProfiles() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of concerned here that applyNodePTPProfiles could theoretically be called twice in quick succession, which could have race conditions that applyNodePTPProfiles isn't hardened for. Currently we are looking at a ticker and check for updates based off that ticker, so no possibility of it being called twice in a row. What would happen if a user made multiple changes to the secret in quick succession? Is there a possibility it could break something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my concern at the beginning but I guess the Run() function is using a single Go routine which prevents parallelism, and each call will stop all processes with dn.stopAllProcesses() at the end , so it rebuilds from scratch
…ted secret - Add fsnotify watcher to monitor /etc/ptp-secret-mount/ for instant detection of security file changes (sa_file) - Trigger PTP process restart on sa_file write/create/remove events - Promotes fsnotify from indirect to direct dependency - Reinitialize event and error channels - Prevent pod restart on watcher failure
fb15fad to
27eee79
Compare
PTP TLV Authentication: Instant Secret Change Detection via fsnotify
Summary
Implements instant detection of PTP authentication file changes using
fsnotify. When authentication secrets are updated in Kubernetes, the daemon immediately restarts PTP processes without requiring pod restarts.Changes
Added fsnotify Monitoring (
pkg/daemon/daemon.go)New constant:
Daemon initialization (
New()):fsnotify.Watcheron startup/etc/ptp-secret-mount/directoryDaemonstructEvent loop (
Run()):watcher.Eventschannel for file changesWrite,Create,Remove(ignores hidden files)applyNodePTPProfiles()immediately on relevant eventsError recovery:
fsnotify.WatcherBenefits