Commit bc5539d

alexmv authored and timabbott committed
tornado: Move SIGTERM shutdown handler into a callback.
A SIGTERM can show up at any point in the ioloop, even in places which are not prepared to handle it. This results in the process ignoring the `sys.exit` which the SIGTERM handler calls, with an uncaught SystemExit exception:

```
2021-11-09 15:37:49.368 ERR [tornado.application:9803] Uncaught exception
Traceback (most recent call last):
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/http1connection.py", line 238, in _read_message
    delegate.finish()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/httpserver.py", line 314, in finish
    self.delegate.finish()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/routing.py", line 251, in finish
    self.delegate.finish()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2097, in finish
    self.execute()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 2130, in execute
    **self.path_kwargs)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/gen.py", line 307, in wrapper
    yielded = next(result)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/tornado/web.py", line 1510, in _execute
    result = method(*self.path_args, **self.path_kwargs)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 150, in get
    request = self.convert_tornado_request_to_django_request()
  File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/handlers.py", line 113, in convert_tornado_request_to_django_request
    request = WSGIRequest(environ)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zulip-py3-venv/lib/python3.6/site-packages/django/core/handlers/wsgi.py", line 66, in __init__
    script_name = get_script_name(environ)
  File "/home/zulip/deployments/2021-11-08-05-10-23/zerver/tornado/event_queue.py", line 611, in <lambda>
    signal.signal(signal.SIGTERM, lambda signum, stack: sys.exit(1))
SystemExit: 1
```

Supervisor then terminates the process with a SIGKILL, which results in dropping data held in the tornado process, as it does not dump its queue.

The only command which is safe to run in the signal handler is `ioloop.add_callback_from_signal`, which schedules the callback to run during the course of the normal ioloop. This callback does an orderly shutdown of the server and the ioloop before exiting.
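The failure mode in the commit message can be reproduced without Tornado. The following is a minimal sketch using only the Python standard library. `run_guarded` is a hypothetical stand-in for a loop that logs and swallows whatever a callback raises; it is an assumption for illustration, not Tornado's actual code. The sketch shows that `sys.exit` called from a signal handler surfaces as a `SystemExit` inside whatever code happened to be running, so a broad exception handler there can absorb it and the process keeps going.

```python
import signal
import sys


def run_guarded(callback):
    """Hypothetical stand-in for an event loop's callback runner.

    Assumption: it catches and reports anything the callback raises,
    so that one bad callback cannot take the whole loop down.
    """
    try:
        callback()
    except BaseException as exc:
        return f"Uncaught exception {type(exc).__name__}: {exc}"
    return "callback finished"


def callback_interrupted_by_sigterm():
    # Simulate SIGTERM arriving while the callback is running.
    signal.raise_signal(signal.SIGTERM)
    # Give the interpreter a chance to run the Python-level handler.
    for _ in range(100):
        pass


def demo():
    # The old-style handler: exit directly from inside the signal handler.
    previous = signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(1))
    try:
        return run_guarded(callback_interrupted_by_sigterm)
    finally:
        # Restore whatever handler was installed before the demo.
        signal.signal(
            signal.SIGTERM,
            previous if previous is not None else signal.SIG_DFL,
        )


if __name__ == "__main__":
    # The process is still alive here: the SystemExit was swallowed.
    print(demo())
```

Because the exit request never reaches the top of the process, any `atexit` hooks (such as one that dumps queues) would not run before a supervisor escalates to SIGKILL.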
1 parent 847bf82 commit bc5539d

2 files changed: +16 -4 lines changed

zerver/management/commands/runtornado.py

Lines changed: 1 addition & 1 deletion
@@ -104,7 +104,7 @@ def inner_run() -> None:
                 from zerver.tornado.ioloop_logging import logging_data

                 logging_data["port"] = str(port)
-                setup_event_queue(port)
+                setup_event_queue(http_server, port)
                 add_client_gc_hook(missedmessage_hook)
                 setup_tornado_rabbitmq()

zerver/tornado/event_queue.py

Lines changed: 15 additions & 3 deletions
@@ -22,6 +22,7 @@
     List,
     Mapping,
     MutableMapping,
+    NoReturn,
     Optional,
     Sequence,
     Set,
@@ -603,12 +604,24 @@ def send_restart_events(immediate: bool = False) -> None:
             client.add_event(event)


-def setup_event_queue(port: int) -> None:
+def handle_sigterm(server: tornado.httpserver.HTTPServer) -> NoReturn:
+    logging.warning("Got SIGTERM, shutting down...")
+    server.stop()
+    tornado.ioloop.IOLoop.instance().stop()
+    sys.exit(1)
+
+
+def setup_event_queue(server: tornado.httpserver.HTTPServer, port: int) -> None:
+    ioloop = tornado.ioloop.IOLoop.instance()
+
     if not settings.TEST_SUITE:
         load_event_queues(port)
         atexit.register(dump_event_queues, port)
         # Make sure we dump event queues even if we exit via signal
-        signal.signal(signal.SIGTERM, lambda signum, stack: sys.exit(1))
+        signal.signal(
+            signal.SIGTERM,
+            lambda signum, frame: ioloop.add_callback_from_signal(handle_sigterm, server),
+        )
         add_reload_hook(lambda: dump_event_queues(port))

     try:
@@ -617,7 +630,6 @@ def setup_event_queue(port: int) -> None:
         pass

     # Set up event queue garbage collection
-    ioloop = tornado.ioloop.IOLoop.instance()
     pc = tornado.ioloop.PeriodicCallback(
         lambda: gc_event_queues(port), EVENT_QUEUE_GC_FREQ_MSECS, ioloop
     )
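The shape of this fix, where the signal handler only schedules work and the loop performs the shutdown as a normal callback, can be sketched with the standard library alone. This is an analogue rather than the code in this commit: it assumes asyncio's `loop.call_soon_threadsafe` as a stand-in for Tornado's `add_callback_from_signal`, and the names `on_sigterm`, `busy_callback`, and `demo` are hypothetical.

```python
import asyncio
import signal


def demo():
    events = []
    loop = asyncio.new_event_loop()

    def handle_sigterm():
        # Runs as an ordinary loop callback, so orderly cleanup is safe here.
        events.append("orderly shutdown")
        loop.stop()

    def on_sigterm(signum, frame):
        # The handler itself only schedules work; it never raises.
        events.append("signal received")
        loop.call_soon_threadsafe(handle_sigterm)

    def busy_callback():
        events.append("callback started")
        # Simulate SIGTERM arriving in the middle of a callback.
        signal.raise_signal(signal.SIGTERM)
        for _ in range(100):
            pass
        events.append("callback finished")

    previous = signal.signal(signal.SIGTERM, on_sigterm)
    try:
        loop.call_soon(busy_callback)
        loop.run_forever()  # returns once handle_sigterm calls loop.stop()
    finally:
        # Restore whatever handler was installed before the demo.
        signal.signal(
            signal.SIGTERM,
            previous if previous is not None else signal.SIG_DFL,
        )
        loop.close()
    return events


if __name__ == "__main__":
    for event in demo():
        print(event)
```

The interrupted callback always runs to completion before the shutdown callback starts, because the shutdown is queued behind it instead of being raised into it.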
