I have a feeling we need to improve answer generation, and my best guess is that we would be better off copying the logic of the TypeScript answer generation code.
This is a pretty big project (you almost have to start from scratch), but it is self-contained: almost all the current logic is in answers.py.