Update fgt-asg-lambda.py#16
Conversation
| self.logger.info("Attach the interface to FortiGate VM instance") | ||
| if attach_id == None: | ||
| result = self.delete_interface(cur_intf_id) #Need to delete the interface if not able to attach | ||
| if not result: #if error occurs on the delete_interface function |
There was a problem hiding this comment.
The return False condition should not be the interface not been deleted. Should be something like this:
if attach_id == None:
result = self.delete_interface(cur_intf_id)
if not result:
# some function to check whether the interface be deleted
return False
There was a problem hiding this comment.
Hello Lix the delete_interface function is from:
def delete_interface(self, cur_intf_id):
self.logger.info(f"Delete interface: {cur_intf_id}.")
response = ""
try:
response = self.ec2_client.delete_network_interface(NetworkInterfaceId=cur_intf_id)
return True
except ClientError as e:
self.logger.error(f"Error deleting network interface {cur_intf_id}: {e.response['Error']['Code']}, response: {response}")
return False
I just added return True/False
True - if the interface got deleted.
False - if the interface had an error when deleting the interface
as of now (original code), when there's an error it will do nothing. just continue. I am not sure what function we need for this one "# some function to check whether the interface be deleted" can you please recommend? thank you!
There was a problem hiding this comment.
The main point is that it should return False when attach_id is None, not when deleting the interface fails. The logic should be
if attach_id == None:
result = self.delete_interface(cur_intf_id)
return False
As for the scenario of deleting the interface failing, we can leave it right now. In the future, we can add some functionality to handle it.
There was a problem hiding this comment.
thank you Xing, I did create below since the variable result isnt used, I removed it..
if attach_id == None:
self.delete_interface(cur_intf_id) #Need to delete the interface if not able to attach
return False
continue
| self.associate_pub_ip(cur_intf_id, intf_conf) | ||
| associate_pub_ip_id = self.associate_pub_ip(cur_intf_id, intf_conf) | ||
| if associate_pub_ip_id == None: #add a check if we have successfully associate the public Interface | ||
| return False |
There was a problem hiding this comment.
We can set to continue if associate public IP fails, since it is not required for the internal function.
There was a problem hiding this comment.
thank you corrected! I bring back to original code
if "enable_public_ip" in intf_conf and intf_conf["enable_public_ip"] :
self.associate_pub_ip(cur_intf_id, intf_conf)
There was a problem hiding this comment.
Hello @lix-fortinet
could you explain why we dont need to check if associate public IP fails. why it is not required for the internal function. :) thank you!
| }] | ||
| ) | ||
| # Get private IP | ||
| b_succ = False # Initialize b_succ value |
There was a problem hiding this comment.
The initial value should be True.
There was a problem hiding this comment.
Hi Lix, not want argue here much but, isnt it safer to initialize the value as False? and it will only be true if b_succ sends True value :) but you know this better for sure.
There was a problem hiding this comment.
Actually, in the current code, we do not need this initialization. Since the change password part will initialize the b_succ. But we modified the logic to change password to when self.fgt_lic_mgmt != "fmg", not released yet. So, when self.fgt_lic_mgmt == "fmg", then some of the following conditions will not be triggered, and b_succ will not be overwritten. Then everything passed but b_succ is false, which should return true.
There was a problem hiding this comment.
hello @lix-fortinet,
ok we can remove the initialize b_succ value here. thanks!
| # event["detail"]["LifecycleActionToken"] - Token for completing the hook (only in lifecycle events) | ||
| logger.info("=" * 60) | ||
| logger.info("fgt-asg-lambda invoked") | ||
| logger.info(f"Event detail-type: {event.get('detail-type', 'N/A')}") |
There was a problem hiding this comment.
Duplicate log with the following logs.
There was a problem hiding this comment.
Hello Lix, the logs were added by my colleague to see what is happening.
There was a problem hiding this comment.
Yes, but the detail-type was logged again in the following log.
There was a problem hiding this comment.
oh I see what you mean :) I have remove the duplicate one thank you!
| logger.info(f" detail_type: {detail_type}") | ||
| logger.info(f" EC2InstanceId: {fgt_vm_id}") | ||
| logger.info(f" AutoScalingGroupName: {event_detail.get('AutoScalingGroupName', 'N/A')}") | ||
| logger.info(f" LifecycleHookName: {event_detail.get('LifecycleHookName', 'N/A')}") |
There was a problem hiding this comment.
LifecycleHookName used multiple times. Could create a variable for it.
There was a problem hiding this comment.
hello @lix-fortinet I have modified this part. changes:
move hook_name variable above managed_hooks variable
# create variable for hook_name
hook_name = event_detail.get('LifecycleHookName', '')
# create variable for managed_hooks
managed_hooks = ['fgt_asg_launch_hook', 'fgt_asg_terminate_hook']
remove the hook_name on the complete lifecycle action part as we already have defined it.
if detail_type in ["EC2 Instance-launch Lifecycle Action", "EC2 Instance-terminate Lifecycle Action"]:
logger.info(f"Completing lifecycle hook: {hook_name}")
complete_lifecycle(logger, event_detail,result="CONTINUE" )
| lifecycle_token = event_detail.get('LifecycleActionToken', '') | ||
| logger.info(f" LifecycleActionToken: {lifecycle_token[:20] + '...' if lifecycle_token else 'N/A'}") | ||
|
|
||
| # --- EARLY EXIT: Skip unmanaged lifecycle hooks --- |
There was a problem hiding this comment.
Why do we need this early exit? What other hooks does it have besides managed hooks? Also, the hook name may contain a prefix.
There was a problem hiding this comment.
Hello Lix, On our environment we added some hooks that will check threat feeds. And for some reason, this hook also receives the event from threat feeds hook. We added this so that it will not process Life cycle action for threat feeds hook.
There was a problem hiding this comment.
As for the suggestion I will check with my colleague as he is the one that modifies it :)
There was a problem hiding this comment.
Got it. You can check main.tf line 141.
Fix based from Lix comment
|
Hello @lix-fortinet I have updated the fgt-asg-lambda.py based on our discussion . kindly please check if it looks good. thank you so much! |
Related to issue #15 regarding the fgt-asg-lambda.py. We have made following modifications for the fgt-asg-lamba.py script to address several operational issues identified during deployment. These changes improve reliability, error handling, and lifecycle hook management for FortiGate instances within an Auto Scaling Group (ASG).
Proposed Changes
Restrict Lifecycle Hook Handling
The script was updated to handle only the fgt_asg_lambda lifecycle hook, ignoring all other lifecycle events triggered by the ASG. This prevents unintended processing of hooks that are outside the script’s scope.
Add Success Validation for NetworkInterface and FGTConfig Operations
Explicit success checks were introduced for both the NetworkInterface configuration and the FGTConfig operation to ensure each step completes successfully before proceeding.
Conditional Lifecycle Hook Action Based on Operation Results
The lifecycle hook action (CONTINUE or ABANDON) will be determined by the outcome of the NetworkInterface and FGTConfig success checks. If either operation fails, the lifecycle hook will be abandoned to prevent a misconfigured instance from entering service.
Default Lifecycle Action Set to ABANDON
The default behavior of the lifecycle function is updated to issue an ABANDON action. This acts as a safe fallback to ensure that no instance is placed InService unless all configuration steps have been explicitly verified as successful.
Top-Level Exception Handling in Main Function
A try/except block will be added to the main function to capture any unexpected runtime errors. Upon catching an exception, the script will immediately issue an ABANDON lifecycle action to prevent a potentially misconfigured FortiGate instance from becoming InService.
Issues Resolved:
Unintended Lifecycle Hook Processing
The script was previously responding to “ALL” lifecycle hooks within the ASG, unintentionally completing them on behalf of other hooks. This fix ensures only the intended hook is handled.
Premature Configuration Attempt
The script was previously waiting for the EC2 Instance Launch Successful event before initiating FortiGate configuration, causing timing conflicts with other lifecycle hooks. The updated flow now initiates configuration at the EC2 Instance-Launch Lifecycle Action stage.
Instance Marked InService Before Configuration Completes
Previously, the instance would transition to InService before the FortiGate was fully configured, potentially resulting in a faulty instance being placed into service. The updated logic now executes FGTConfig function first and only sends a CONTINUE action upon confirmed success.
Unhandled Runtime Errors
Any unexpected errors in the main function were not being caught, potentially leaving lifecycle hooks in an incomplete state. The added exception handling ensures that any error results in an ABANDON action, maintaining ASG integrity.
I would like to bring the above proposed changes to your attention for review. These modifications have been successfully validated in our environment, and initial testing results are promising. Please let us know if the changes are acceptable . Thank you!