Conversation

@gtannous-spec gtannous-spec commented Nov 24, 2025

PTP TLV Authentication: Instant Secret Change Detection via fsnotify

Summary

Implements instant detection of PTP authentication file changes using fsnotify. When authentication secrets are updated in Kubernetes, the daemon immediately restarts PTP processes without requiring pod restarts.

Changes

Added fsnotify Monitoring (pkg/daemon/daemon.go)

New constant:

const PtpSecretMountDir = "/etc/ptp-secret-mount/"

Daemon initialization (New()):

  • Creates fsnotify.Watcher on startup
  • Adds watch on /etc/ptp-secret-mount/ directory
  • Stores watcher in Daemon struct

Event loop (Run()):

  • Listens on watcher.Events channel for file changes
  • Filters events: Write, Create, Remove (ignores hidden files)
  • Calls applyNodePTPProfiles() immediately on relevant events
  • Restarts ptp4l with updated authentication keys
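For illustration, the hidden-file part of the filter above could look roughly like this sketch. The helper names and the example paths are hypothetical and not taken from the PR; in the daemon, the operation check would be made against the Op bits of the fsnotify event.

```go
package main

import (
	"path/filepath"
	"strings"
)

// isHiddenPath reports whether the last element of name starts with a dot.
// Kubernetes secret volumes keep bookkeeping entries such as "..data" under
// dot-prefixed names, so a watcher can skip them.
func isHiddenPath(name string) bool {
	return strings.HasPrefix(filepath.Base(name), ".")
}

// shouldRestart is a hypothetical helper. watchedOp says whether the event
// was one of the watched kinds (write, create, remove); name is the event
// path. It returns true when the PTP profiles should be re-applied.
func shouldRestart(watchedOp bool, name string) bool {
	return watchedOp && !isHiddenPath(name)
}
```

With this shape, an event on a regular key file under the mount directory triggers a restart, while an event on `..data` is skipped.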

Error recovery:

  • Detects when fsnotify watcher crashes (error channel closed)
  • Recreates watcher with full reinitialization:
    • Creates new fsnotify.Watcher
    • Re-adds watch path
    • Reinitializes event and error channels
  • Prevents pod restart on watcher failures
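The recovery steps above can be sketched with plain channels. This is only an outline under assumptions: `eventSource` is a hypothetical stand-in for the fsnotify watcher (which also exposes an event channel and an error channel), and `recreate` stands for "create a new watcher and re-add the watch path".

```go
package main

// eventSource is a hypothetical stand-in for a file watcher. Like
// fsnotify.Watcher, it exposes one channel for events and one for errors.
type eventSource struct {
	Events chan string
	Errors chan error
}

// watchLoop handles events until stop is closed. When either channel of the
// current source is closed, it asks recreate for a fresh source and rebinds
// both channels, so the loop keeps running instead of exiting.
// It returns how many times the source was recreated.
func watchLoop(src *eventSource, recreate func() *eventSource,
	onChange func(string), onError func(error), stop <-chan struct{}) int {
	recreated := 0
	events, errs := src.Events, src.Errors
	rebind := func() {
		src = recreate()
		events, errs = src.Events, src.Errors
		recreated++
	}
	for {
		select {
		case ev, ok := <-events:
			if !ok { // event channel closed: watcher is gone
				rebind()
				continue
			}
			onChange(ev)
		case err, ok := <-errs:
			if !ok { // error channel closed: watcher is gone
				rebind()
				continue
			}
			onError(err)
		case <-stop:
			return recreated
		}
	}
}
```

The point of rebinding both channels is that a receive on a closed channel returns immediately, so keeping the old channels in the select would spin the loop.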

Benefits

  • Instant detection - No polling delay (event-driven)
  • No pod restarts - PTP processes restart in-place
  • Simple - Single directory watch, no hash tracking
  • Resilient - Auto-recovery on watcher crashes
  • Efficient - No background CPU/I/O for polling

@github-actions

Thanks for your PR,
Best regards.

chronydProcessName, // there can be only one chronyd process in the system
}

// saFileInfo tracks authentication file information for a profile
Collaborator

What specifically is sa short for in this context?

Author

Security association. sa_file is an option in the global section of the ptp4lconf where we can set a file path in order to mount a specific secret inside the linuxptp-daemon-container.
Basically, I ported some of the code from the ptpconfig_controller here, because the file watcher forces a restart of the pods when a change is detected on the mounted secret. Instead, we now restart only the ptp4l process.
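To make the sa_file option concrete, here is a rough sketch of pulling its value out of a ptp4l configuration string. The function name and the example path are hypothetical; the parsing only assumes the usual ptp4l layout of bracketed section headers, `option value` lines and `#` comments, and it is not the code used by the controller.

```go
package main

import (
	"bufio"
	"strings"
)

// saFilePath returns the value of the sa_file option found in the [global]
// section of a ptp4l configuration, and whether the option was set.
func saFilePath(conf string) (string, bool) {
	inGlobal := false
	scanner := bufio.NewScanner(strings.NewReader(conf))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		if strings.HasPrefix(line, "[") {
			inGlobal = line == "[global]" // track the current section
			continue
		}
		fields := strings.Fields(line)
		if inGlobal && len(fields) >= 2 && fields[0] == "sa_file" {
			return fields[1], true
		}
	}
	return "", false
}
```

For a config whose global section contains `sa_file /etc/ptp-secret-mount/example/keys.conf`, the function returns that path and `true`; an sa_file line under an interface section is ignored.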

}
// Filter for relevant events (Write, Create - Kubernetes atomic updates)
// Ignore events on temporary/hidden files
if event.Op&(fsnotify.Write|fsnotify.Create) == 0 {
Collaborator

Should delete also be handled as a change?

Author

If you mean deleting the sa_file, then the controller removes the volumeMount when the user deletes it from the ptp4lconf or manually from the container's file system, which forces a restart of the daemonset.

@josephdrichard josephdrichard added the ok-to-test ok to test label Nov 26, 2025
@github-actions github-actions bot removed the ok-to-test ok to test label Nov 26, 2025
@gtannous-spec gtannous-spec force-pushed the auth-tlv branch 3 times, most recently from f594bd0 to 258876f Compare November 29, 2025 20:43
@gtannous-spec gtannous-spec force-pushed the auth-tlv branch 2 times, most recently from 9563424 to 67338f9 Compare December 3, 2025 10:56
@josephdrichard josephdrichard added the ok-to-test ok to test label Dec 3, 2025
@github-actions github-actions bot removed the ok-to-test ok to test label Dec 3, 2025
@gtannous-spec gtannous-spec marked this pull request as ready for review December 3, 2025 23:33
@gtannous-spec gtannous-spec marked this pull request as draft December 3, 2025 23:33
@gtannous-spec gtannous-spec marked this pull request as ready for review December 3, 2025 23:48
@edcdavid edcdavid added the ok-to-test ok to test label Dec 8, 2025
Collaborator

@greyerof greyerof left a comment

Minor (nit) suggestions, but overall this looks good to me. Just one question: if two keys are added to an existing secret, will it receive one or two consecutive events?

tracker.processManager = pm

// Initialize fsnotify watcher for sa_file change detection
watcher, err := fsnotify.NewWatcher()
Collaborator

Nit: You can use the same "saFilesWatcher" name for this local var. It's a more descriptive name to use than just "watcher".

Author

Yes that looks better, I changed it to saFileWatch

Comment on lines 414 to 415
var watcherEvents chan fsnotify.Event
var watcherErrors chan error
Collaborator

To be consistent with the other channels in the for-select loop (like "updateCh" and "stopCh"), please append the "Ch" suffix to the names of these vars. Also, adding the prefix "saFilesWatcher" might be a good fit to make them more descriptive:

var saFilesWatcherEventCh ...
var saFilesWatcherErrCh ...

Author

Done!

@edcdavid edcdavid added ok-to-test ok to test and removed ok-to-test ok to test labels Dec 10, 2025
edcdavid previously approved these changes Dec 12, 2025
defer dn.saFileWatcher.Close()
glog.Info("Using fsnotify for instant sa_file change detection")
} else {
glog.Warning("fsnotify unavailable, sa_file change detection disabled")
Collaborator

What do you mean by fsnotify unavailable? That the fsnotify watcher is unavailable?

Author

If initializing the fsnotify file watcher fails, it logs the warning "fsnotify unavailable".

Author

An example would be when authentication is disabled and there is no ptp-secret-mount directory in the linuxptp daemon pod (the watch directory PtpSecretMountDir doesn't exist).

continue // Ignore hidden files like .data
}
glog.Infof("Security file changed: %s (op: %s), restarting PTP processes", event.Name, event.Op.String())
err := dn.applyNodePTPProfiles()
Collaborator

Kind of concerned here that applyNodePTPProfiles could theoretically be called twice in quick succession, which could have race conditions that applyNodePTPProfiles isn't hardened for. Currently we are looking at a ticker and check for updates based off that ticker, so no possibility of it being called twice in a row. What would happen if a user made multiple changes to the secret in quick succession? Is there a possibility it could break something?

Author

That was my concern at the beginning, but I guess the Run() function uses a single goroutine, which prevents parallelism, and each call stops all processes with dn.stopAllProcesses() at the end, so it rebuilds from scratch.
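The single-goroutine argument can be shown with a small sketch. `runLoop` and `apply` are hypothetical stand-ins for the for-select loop in Run() and for dn.applyNodePTPProfiles(); the sketch only demonstrates that a handler called inline from the loop finishes before the next event is picked up.

```go
package main

// runLoop reads events from a single goroutine and calls apply inline.
// Because apply runs inside the loop, a second event is not received until
// the first call has returned, so two calls can never overlap.
func runLoop(events <-chan string, stop <-chan struct{}, apply func(string)) {
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return
			}
			apply(ev)
		case <-stop:
			return
		}
	}
}
```

Two events sent back to back therefore produce two complete, sequential calls rather than interleaved ones. This says nothing about whether two restarts in a row are desirable, only that they do not run in parallel.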

…ted secret

- Add fsnotify watcher to monitor /etc/ptp-secret-mount/ for instant
  detection of security file changes (sa_file)
- Trigger PTP process restart on sa_file write/create/remove events
- Promotes fsnotify from indirect to direct dependency
- Reinitialize event and error channels
- Prevent pod restart on watcher failure
6 participants