Skip to content

Commit

Permalink
refactor(docs): Simplify XQueue Migration ADR for better readability
Browse files Browse the repository at this point in the history
This commit updates the XQueue Migration ADR to be more accessible while
maintaining critical technical information. Key changes include:

- Streamline context section focusing on core system limitations
- Reorganize decision section with clearer structure
- Add implementation approach section for gradual migration
- Maintain technical details in consequences section
- Keep all original references for further documentation

Part of the XQueue migration initiative.

refactor: squashed migrations

- Delete all migrations
- Run again makemigrations
- Test in secure environment

refactor: change submission queue record name functionalities

- Change SubmissionQueueManager to ExternalGraderDetailManager
- Change create_submission_queue_record to create_external_grader_detail
- Change validation to queue name in create_submission
- Remove unnecessary test to check queue name error

refactor(submissions): Rename SubmissionQueueRecord to ExternalGraderDetail

- Rename model SubmissionQueueRecord to ExternalGraderDetail
- Update all references in Python files
- Create Django migration for model renaming
- Update related imports and references

feat: add ADR for SubmissionQueueRecord migration from XQueue

Add Architecture Decision Record (ADR) documenting the design and implementation
of the SubmissionQueueRecord model as part of the gradual migration from XQueue
to edx-submissions.

This ADR covers:
- Context of current XQueue architecture
- Technical details of new model implementation in edx submissions
- Migration and compatibility considerations
- Impact analysis and consequences

feat: add architecture decisions to documentation

- Create new decisions folder
- Configurate in index.rst

test: add validation tests for SubmissionQueueRecord state transitions

- Add test_clean_invalid_transitions to verify error cases:
  - pending -> retired (not allowed)
  - pending -> invalid_status
  - pulled -> pending (not allowed)
  - failed -> retired (not allowed)
- Enhance test_clean_valid_transitions with proper assertions

This improves test coverage for state transition validations.

fix: change validation SubmissionQueueRecord clean method

- Add an optional status can_transition_to
- Implement old status in clean method

This resolves a bug in the implementation of transition validation when using the clean method.

fix: remove try except in test_clean_new_instance

- Remove unnecessary try/except in test_clean_new_instance
- Add validations when new queue record call clean method
- Extend coverage measure in test_models

test: Add tests to extend coverage for SubmissionQueueRecord model

- Status transition rules
- Failure count tracking
- Queue length calculations
- Submission retrieval logic

The new tests improve the robustness of the submissions app by covering.

docs: Improve create_submission docstring formatting

- Fix Sphinx documentation warnings
- Correct formatting and indentation issues
- Clarify parameter and return value descriptions

Resolves documentation build warnings related to:
- Inline strong start-string without end-string
- Unexpected indentation

fix: correct docstring formatting in create_submission

- Add proper indentation and blank lines between sections
- Fix example code block formatting
- Replace >>> with ... for continued lines in example
- Ensure consistent spacing throughout docstring

This resolves the ReadTheDocs build failure and docstring formatting warning.

fix: code style and quality improvements

- Fix pylint warnings:
  - Remove protected access warnings in test files
  - Remove pointless string statements
  - Add missing final newlines
- Fix isort import ordering in:
  - api.py
  - test_models.py
  - test_api.py
- Fix pycodestyle issues:
  - Remove extra blank lines
  - Fix operator spacing
  - Add proper spacing before inline comments
  - Fix long lines in migration files
  - Fix line spacing in test files
- Fix docstring formatting in api.py create_submission method

Testing:
make test_quality passes without warnings

feat: add grading metadata fields to submission queue record

- Add grader_file_name CharField (128 chars max) with empty default
- Add points_possible PositiveIntegerField with default value of 1
- Update create_submission_queue_record to handle new fields

fix: Remove duplicate nested create_submission function definition

- Removes the redundant nested function definition

feat: add nullable pullkey and grader_reply to Submission model

feat(SubmissionQueueRecord): add pullkey and grader_reply fields

- Add pullkey CharField(128) for tracking submission processing key
- Add grader_reply TextField to store grader's response
- Fields support queue processing and feedback management to xwatcher

test: add comprehensive tests for Submission and SubmissionQueueRecord models

Add missing test suite for Submission model and add new tests for SubmissionQueueRecord:

Submission model tests (previously non-existent):
- Basic model functionality (creation, retrieval)
- String and repr representations
- Large answer handling
- Field mutability behavior
- Submission ordering
- Soft deletion mechanism
- JSON serialization of answers

SubmissionQueueRecord model tests:
- Default status and initialization
- Valid and invalid status transitions
- Failure count tracking
- Record processability logic
- Queue name validation
- One-to-one relationship with Submission
- Status time updates

Updates:
- Fix repr test to handle UUID and string conversions
- Document actual mutability behavior
- Add validation for large answers
- Ensure proper status transitions and timing
- Verify queue processing logic

The test suite ensures both models work correctly together
and maintain data integrity through their lifecycle.
Note: This commit adds the first test coverage for the Submission model,
which previously lacked any automated testing.

refactor: update submission queue record tests

- Update test cases to use new public create_submission_queue_record function
- Update assertions to use SubmissionQueueCanNotBeEmptyError instead of ValueError
- Adjust test function parameters to match new interface (queue_name and files)
- Add new test cases for empty queue_name validation

The changes ensure test coverage for:
- Direct queue record creation with valid and invalid inputs
- Integration with create_submission function
- Multiple submission queue records
- Error handling scenarios

fix: add blank line in models

- This fixed the pylint warning

feat(error): add SubmissionQueueCanNotBeEmptyError class error

- This class is to is raised when queue name is empty

fix: improve submission queue record validation

- Update condition for creating submission queue record to properly check for queue_name
- Use `event_data.get()` for safer dictionary access
- Change ValueError to SubmissionQueueCanNotBeEmptyError for more specific error handling
- Add proper validation in parent function to prevent None access errors

The changes ensure that:
1. Queue name validation is more robust
2. Error handling is more specific and clear
3. Null checks are properly implemented

Test coverage remains unchanged.

refactor(submission): simplify queue record creation with explicit field

- Replace dynamic unpacking of event_data with explicit queue_name field
- Remove unnecessary field expansion to improve code clarity and maintainability

test: add queue record unit and integration tests

- Add unit tests for _create_submission_queue_record helper function
  - Basic queue record creation
  - Queue name validation
  - Database error handling

- Add integration tests with create_submission
  - Queue record creation via submission creation
  - Multiple queue records with same name
  - Error handling with missing queue name
  - Database error integration testing

Organized tests into clearly separated unit and integration sections
for better maintainability and clarity.

feat: add submission queue record functionality

- Add _create_submission_queue_record helper function to handle queue record creation
- Modify create_submission to accept event_data as kwargs
- Update documentation to reflect new parameters and functionality
- Allow dynamic field assignment using event_data kwargs in queue record creation

The main changes:
- Added ability to create SubmissionQueueRecord with dynamic fields
- Improved error handling for queue record creation
- Made event_data more flexible using kwargs pattern
- Maintained backwards compatibility with existing submission creation

feat(submissions): add SubmissionQueueRecord model

- Add SubmissionQueueManager to process queries
- Replace multiple timestamps with a single status_time field
- Introduce explicit state machine for submission processing
- Add atomic state transitions with validation
- Optimize database indexes for queue operations

Key changes:
- Add STATUS_CHOICES and VALID_TRANSITIONS for state management
- Consolidate pull_time, push_time, return_time into status_time
- Add transaction-safe update_status() method
- Improve queue manager with safer submission retrieval
- Add validation for state transitions
- Optimize indexes for common query patterns

The changes maintain backwards compatibility with existing queue
processing while providing a more robust and maintainable approach
to state management.
  • Loading branch information
leoaulasneo98 committed Feb 21, 2025
1 parent 639e196 commit f1d19f4
Show file tree
Hide file tree
Showing 8 changed files with 983 additions and 39 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,118 @@
1. Creation of ExternalGraderDetail Model for XQueue Migration
###############################################################

Status
******

**Provisional** *2025-02-10*

Implemented by https://github.com/openedx/edx-submissions/pull/283

Context
*******

Currently, Open edX uses a separate system called XQueue to handle the grading of student submissions for certain
types of problems (like programming assignments). It implements a REST API with MySQL backend,
requiring HTTP communication between services that manages submissions. This system works, but it has some limitations:

- HTTP dependency creates unnecessary synchronous coupling
- Complex state management across services
- No native queue system (implemented through database)
- Unnecessary complexity in system architecture

Decision
********

As part of Phase 1 of the XQueue migration plan, we will create a new ExternalGraderDetail model in edx-submissions to
simplify the grading system architecture. This is the first step in a larger plan that will eventually include an event
bus implementation for handling the submissions workflow.

What's New

A new database model that will:

- Keep track of submission status more clearly
- Store all grading-related information in one place
- Make it easier to process submissions in order
- Handle errors and retries automatically

Key Features

- Better tracking of submission status (pending, being graded, completed, failed)
- Clearer connection between submissions and their grading results
- Improved error handling and retry capabilities
- Easier monitoring of the grading process
- Simplify the xwatcher and edx platform queue processing architecture
- Reduce inter-service communication overhead

Implementation Approach

We'll implement this change gradually:

- First, build the new system alongside the existing one
- Test thoroughly to ensure everything works as expected
- Slowly transition from the old system to the new one
- Keep the old system running until we're sure the new one works perfectly

Consequences
************

Positive:
---------

Model Structure:
* Clean data separation via OneToOneField relationship
* Explicit state management with VALID_TRANSITIONS
* Protected state changes using atomic transactions

Integration:
* Compatible with existing XWatcher interface
* Maintains current queue naming patterns
* Enables parallel system operation during migration

Development:
* Integrated status validation and retry
* Comprehensive status tracking

Negative:
---------

Technical Challenges:
* Required atomic updates for status and timestamps
* Additional database overhead from new indexes

Testing Needs:
* Comprehensive state transition testing required
* Integration testing with xwatcher

Neutral:
--------

Process Impact:
* New queue processing patterns to learn
* Additional monitoring requirements

Operations:
* State transition monitoring needed
* Temporary increased system complexity

References
**********

Current System Documentation:
* XQueue Repository: https://github.com/openedx/xqueue
* XQueue Watcher Repository: https://github.com/openedx/xqueue-watcher

Migration Documents:
* Current XQueue Documentation: https://github.com/openedx/edx-submissions/tree/master/docs

Related Repositories:
* edx-submissions: https://github.com/openedx/edx-submissions
* edx-platform: https://github.com/openedx/edx-platform

Future Event Integration:
* Open edX Events Framework: https://github.com/openedx/openedx-events
* Event Bus Documentation: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Event+Bus

Related Architecture Documents:
* Open edX Architecture Guidelines: https://openedx.atlassian.net/wiki/spaces/AC/pages/124125264/Architecture+Guidelines
7 changes: 7 additions & 0 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,14 @@ API Documentation

api

Architecture Decisions
----------------------

.. toctree::
:maxdepth: 1
:glob:

decisions/*

Indices and tables
==================
Expand Down
130 changes: 98 additions & 32 deletions submissions/api.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,26 +2,30 @@
Public interface for the submissions app.
"""

# Stdlib imports
import itertools
import logging
import operator
import warnings
from uuid import UUID

# Django imports
from django.conf import settings
from django.core.cache import cache
from django.db import DatabaseError, IntegrityError, transaction

# Local imports
# SubmissionError imported so that code importing this api has access
from submissions.errors import ( # pylint: disable=unused-import
ExternalGraderQueueCanNotBeEmptyError,
SubmissionError,
SubmissionInternalError,
SubmissionNotFoundError,
SubmissionRequestError
)
from submissions.models import (
DELETED,
ExternalGraderDetail,
Score,
ScoreAnnotation,
ScoreSummary,
Expand All @@ -48,64 +52,122 @@
TOP_SUBMISSIONS_CACHE_TIMEOUT = 300


def create_submission(student_item_dict, answer, submitted_at=None, attempt_number=None, team_submission=None):
"""Creates a submission for assessment.
def create_external_grader_detail(submission, event_data):
"""
Creates a ExternalGraderDetail for a given submission.
Args:
submission (Submission): The submission object to create a queue record for.
event_data (dict): Data to be included in the queue record. Must include a 'queue_name' key.
Returns:
ExternalGraderDetail: The created queue record.
Raises:
ExternalGraderQueueCanNotBeEmptyError: If event_data doesn't contain required queue_name.
SubmissionInternalError: If there's an error creating the queue record.
"""

if not event_data.get('queue_name'):
raise ExternalGraderQueueCanNotBeEmptyError("event_data must contain 'queue_name'")

try:
queue_record = ExternalGraderDetail.objects.create(
submission=submission,
queue_name=event_data['queue_name'],
grader_file_name=event_data.get('grader_file_name', ''),
points_possible=event_data.get('points_possible', 1),

)
return queue_record

except DatabaseError as error:
error_message = (
f"An error occurred while creating queue record for submission {submission.uuid} "
f"with event data: {event_data}"
)
logger.exception(error_message)
raise SubmissionInternalError(error_message) from error


def create_submission(
student_item_dict,
answer,
submitted_at=None,
attempt_number=None,
team_submission=None,
**event_data
):
"""
Creates a submission for assessment.
Generic means by which to submit an answer for assessment.
Args:
student_item_dict (dict): The student_item this
submission is associated with. This is used to determine which
course, student, and location this submission belongs to.
student_item_dict (dict): The student_item this submission is associated with.
This is used to determine which course, student, and location this
submission belongs to.
answer (JSON-serializable): The answer given by the student to be assessed.
submitted_at (datetime): The date in which this submission was submitted.
submitted_at (datetime, optional): The date on which this submission was submitted.
If not specified, defaults to the current date.
attempt_number (int): A student may be able to submit multiple attempts
attempt_number (int, optional): A student may be able to submit multiple attempts
per question. This allows the designated attempt to be overridden.
If the attempt is not specified, it will take the most recent
submission, as specified by the submitted_at time, and use its
attempt_number plus one.
team_submission (TeamSubmission, optional): The team submission this individual
submission is associated with, if any.
event_data (dict, optional): If provided, creates a ExternalGraderDetail
for this submission. Must contain at least a ``queue_name`` key.
Returns:
dict: A representation of the created Submission. The submission
contains five attributes: student_item, attempt_number, submitted_at,
created_at, and answer. 'student_item' is the ID of the related student
item for the submission. 'attempt_number' is the attempt this submission
represents for this question. 'submitted_at' represents the time this
submission was submitted, which can be configured, versus the
'created_at' date, which is when the submission is first created.
dict: A representation of the created Submission. The submission contains
five attributes: student_item, attempt_number, submitted_at, created_at,
and answer.
The returned dictionary includes:
- student_item: ID of the related student item for the submission
- attempt_number: Attempt this submission represents for this question
- submitted_at: Time this submission was submitted
- created_at: Time the submission was first created
- answer: The submitted answer
Raises:
SubmissionRequestError: Raised when there are validation errors for the
student item or submission. This can be caused by the student item
missing required values, the submission being too long, the
attempt_number is negative, or the given submitted_at time is invalid.
SubmissionInternalError: Raised when submission access causes an
internal error.
student item or submission. This can occur due to:
- Student item missing required values
- Submission being too long
- Attempt number is negative
- Submitted time is invalid
SubmissionInternalError: Raised when submission access causes an internal error.
ValueError: If event_data is provided but missing required queue_name.
Examples:
>>> student_item_dict = dict(
>>> student_id="Tim",
>>> item_id="item_1",
>>> course_id="course_1",
>>> item_type="type_one"
>>> )
>>> create_submission(student_item_dict, "The answer is 42.", datetime.utcnow, 1)
>>> student_item_dict = {
... "student_id": "Tim",
... "item_id": "item_1",
... "course_id": "course_1",
... "item_type": "type_one"
... }
>>> create_submission(student_item_dict, "The answer is 42.", datetime.utcnow(), 1)
{
'student_item': 2,
'attempt_number': 1,
'submitted_at': datetime.datetime(2014, 1, 29, 17, 14, 52, 649284 tzinfo=<UTC>),
'submitted_at': datetime.datetime(2014, 1, 29, 17, 14, 52, 649284, tzinfo=<UTC>),
'created_at': datetime.datetime(2014, 1, 29, 17, 14, 52, 668850, tzinfo=<UTC>),
'answer': u'The answer is 42.'
'answer': 'The answer is 42.'
}
"""

student_item_model = _get_or_create_student_item(student_item_dict)
if attempt_number is None:
first_submission = None
attempt_number = 1
try:
first_submission = Submission.objects.filter(student_item=student_item_model).first()
Expand Down Expand Up @@ -134,7 +196,11 @@ def create_submission(student_item_dict, answer, submitted_at=None, attempt_numb
submission_serializer = SubmissionSerializer(data=model_kwargs)
if not submission_serializer.is_valid():
raise SubmissionRequestError(field_errors=submission_serializer.errors)
submission_serializer.save()

submission_instance = submission_serializer.save()

if event_data.get("queue_name"):
create_external_grader_detail(submission_instance, event_data)

sub_data = submission_serializer.data
_log_submission(sub_data, student_item_dict)
Expand Down
9 changes: 9 additions & 0 deletions submissions/errors.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,15 @@ class SubmissionNotFoundError(SubmissionError):
"""


class ExternalGraderQueueCanNotBeEmptyError(SubmissionError):
"""
This error is raised when queue name is empty.
If the create submission call have an event data with queue name empty,
this error may be raised.
"""


class SubmissionRequestError(SubmissionError):
"""
This error is raised when there was a request-specific error
Expand Down
39 changes: 39 additions & 0 deletions submissions/migrations/0004_externalgraderdetail.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# Generated by Django 4.2.17 on 2025-02-21 13:18

from django.db import migrations, models
import django.db.models.deletion
import django.utils.timezone


class Migration(migrations.Migration):

dependencies = [
('submissions', '0003_ensure_ascii'),
]

operations = [
migrations.CreateModel(
name='ExternalGraderDetail',
fields=[
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
('queue_name', models.CharField(max_length=128)),
('grader_file_name', models.CharField(default='', max_length=128)),
('points_possible', models.PositiveIntegerField(default=1)),
('status', models.CharField(choices=[('pending', 'Pending'), ('pulled', 'Pulled'),
('retired', 'Retired'), ('failed', 'Failed')], default='pending',
max_length=20)),
('pullkey', models.CharField(blank=True, max_length=128, null=True)),
('grader_reply', models.TextField(blank=True, null=True)),
('status_time', models.DateTimeField(db_index=True, default=django.utils.timezone.now)),
('created_at', models.DateTimeField(db_index=True, default=django.utils.timezone.now)),
('num_failures', models.PositiveIntegerField(default=0)),
('submission', models.OneToOneField(on_delete=django.db.models.deletion.CASCADE,
related_name='queue_record', to='submissions.submission')),
],
options={
'ordering': ['-created_at'],
'indexes': [models.Index(fields=['queue_name', 'status', 'status_time'],
name='submissions_queue_n_fbabd8_idx')],
},
),
]
Loading

0 comments on commit f1d19f4

Please sign in to comment.