FC-73 Xqu add submission file (Feature: Submission File System - Enhanced File Management for Graded Submissions) #286

Draft
wants to merge 2 commits into
base: master
from

Conversation

leoaulasneo98

FC-73 Feature: Submission File System - Enhanced File Management for Graded Submissions

⚠️ Important: This PR builds on the SubmissionQueueRecord infrastructure deployed in previous PRs. Please ensure those changes are fully deployed before merging.

Description

This pull request implements a comprehensive file management system for submissions through the introduction of the SubmissionFile model and its associated SubmissionFileManager. This enhancement supports proper storage, retrieval, and processing of files attached to submissions within the Open edX ecosystem.

Motivation

The existing submission system requires enhanced file handling capabilities to:

  • Provide a seamless transition for existing services (xwatcher)
  • Establish reliable file storage with appropriate metadata
  • Integrate seamlessly with the new submission queue architecture

Previously, the XQueue server managed files for submissions that required document attachments. However, this was implemented inefficiently using plain text fields (s3_keys and s3_urls) without proper structure or validation. This approach led to inconsistent file handling, potential security issues, and maintenance challenges as the system scaled.

Key Improvements

Model Enhancements

  • Introduced SubmissionFile model with:
    • Secure file storage mechanisms
    • Proper association with submission queue records
    • Standardized URL generation for grader access
    • Original filename preservation

File Processing System

  • Developed SubmissionFileManager with capabilities for:
    • Processing multiple file formats (bytes, file objects)
    • Robust error handling for file processing issues
    • Consistent URL generation for xqueue compatibility
    • Efficient retrieval methods for external grader integration
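
To make the retrieval and URL-generation bullets more concrete, here is a minimal plain-Python sketch of building a filename-to-URL mapping for a grader. It is not the Django model from this PR: `StoredFile`, `files_for_grader`, and the `/queue_files/` URL prefix are hypothetical names chosen only for illustration.

```python
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    """Hypothetical stand-in for a SubmissionFile row (not the PR's model)."""

    original_filename: str
    content: bytes
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)

    @property
    def url(self) -> str:
        # Assumed URL layout; the real prefix depends on the deployment.
        return f"/queue_files/{self.uid}"


def files_for_grader(files):
    """Return a JSON string mapping each original filename to its URL."""
    return json.dumps({f.original_filename: f.url for f in files})
```

Keeping the original filename as the key and a generated identifier in the URL is one way to preserve what the learner uploaded without exposing storage paths; the actual implementation may differ.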

Error Handling

  • Implemented comprehensive error handling for:
    • Invalid file objects
    • IO and OS exceptions during file reading
    • Unicode decoding errors
    • Non-standard file formats

Technical Details

File Processing

  • Support for multiple input types including:
    • Raw bytes
    • Django ContentFile objects
    • UploadedFile instances
    • Custom file-like objects with read() methods
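
The input types above could be normalized to bytes along the following lines. This is a sketch under assumptions, not the PR's code: `read_file_content` and `FileProcessingError` are hypothetical names. Django's ContentFile and UploadedFile both expose `read()`, so they would take the file-like branch here.

```python
import io


class FileProcessingError(Exception):
    """Hypothetical error type used only in this sketch."""


def read_file_content(file_obj, encoding="utf-8"):
    """Return the content of file_obj as bytes.

    Accepts raw bytes or any object exposing a callable read().
    """
    if isinstance(file_obj, (bytes, bytearray)):
        return bytes(file_obj)
    if not callable(getattr(file_obj, "read", None)):
        raise FileProcessingError(
            f"Unsupported file object: {type(file_obj).__name__}"
        )
    try:
        data = file_obj.read()
    except (OSError, UnicodeDecodeError) as exc:
        # Wrap IO and decoding failures in one error type for callers.
        raise FileProcessingError(f"Could not read file: {exc}") from exc
    if isinstance(data, str):
        # Text-mode file objects return str; encode so callers always get bytes.
        data = data.encode(encoding)
    if not isinstance(data, (bytes, bytearray)):
        raise FileProcessingError("read() did not return bytes or str")
    return bytes(data)


# Usage: both calls return bytes.
read_file_content(b"raw bytes")
read_file_content(io.BytesIO(b"from a file-like object"))
```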

Integration Points

  • Seamless connection with ExternalGraderDetail for complete submission workflow
  • Compatible interface for existing xqueue consumers

Testing Strategy

Extensive test coverage includes:

  • Various file input types
  • Complete processing workflow verification
  • Edge case handling for invalid files
  • Error handling for exceptional conditions
  • File retrieval in grader-compatible format

Open edX Compliance

This implementation adheres to Open edX standards through:

  • Modular design with clear separation of concerns
  • Comprehensive test coverage of edge cases
  • Well-documented interfaces
  • Consistent error handling patterns

Performance Considerations

  • Efficient file storage with minimal overhead
  • Optimized database access patterns
  • Careful handling of file objects to prevent memory issues

BREAKING CHANGES: None. Designed for full compatibility with existing systems.

This commit updates the XQueue Migration ADR to be more accessible while
maintaining critical technical information. Key changes include:

- Streamline context section focusing on core system limitations
- Reorganize decision section with clearer structure
- Add implementation approach section for gradual migration
- Maintain technical details in consequences section
- Keep all original references for further documentation

Part of the XQueue migration initiative.

refactor: squashed migrations

- Delete all migrations
- Re-run makemigrations
- Test in secure environment

refactor: change submission queue record name functionalities

- Change SubmissionQueueManager to ExternalGraderDetailManager
- Change create_submission_queue_record to create_external_grader_detail
- Change validation to queue name in create_submission
- Remove unnecessary test to check queue name error

refactor(submissions): Rename SubmissionQueueRecord to ExternalGraderDetail

- Rename model SubmissionQueueRecord to ExternalGraderDetail
- Update all references in Python files
- Create Django migration for model renaming
- Update related imports and references

feat: add ADR for SubmissionQueueRecord migration from XQueue

Add Architecture Decision Record (ADR) documenting the design and implementation
of the SubmissionQueueRecord model as part of the gradual migration from XQueue
to edx-submissions.

This ADR covers:
- Context of current XQueue architecture
- Technical details of new model implementation in edx submissions
- Migration and compatibility considerations
- Impact analysis and consequences

feat: add architecture decisions to documentation

- Create new decisions folder
- Configure in index.rst

test: add validation tests for SubmissionQueueRecord state transitions

- Add test_clean_invalid_transitions to verify error cases:
  - pending -> retired (not allowed)
  - pending -> invalid_status
  - pulled -> pending (not allowed)
  - failed -> retired (not allowed)
- Enhance test_clean_valid_transitions with proper assertions

This improves test coverage for state transition validations.
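
A transition check of this kind can be sketched as a lookup table. The names `STATUS_CHOICES`, `VALID_TRANSITIONS`, and `can_transition_to` appear in this PR's commit messages, but the table contents below are an assumption; only the four rejected transitions listed above come from the text.

```python
STATUS_CHOICES = ("pending", "pulled", "failed", "retired")

# Assumed transition table for illustration. The PR text only confirms
# which transitions the tests expect to be rejected.
VALID_TRANSITIONS = {
    "pending": {"pulled", "failed"},
    "pulled": {"retired", "failed"},
    "failed": {"pending"},
    "retired": set(),
}


def can_transition_to(current_status, new_status):
    """Return True if moving from current_status to new_status is allowed."""
    return new_status in VALID_TRANSITIONS.get(current_status, set())
```

Under this assumed table, unknown statuses such as `invalid_status` are rejected automatically because they never appear in any allowed set.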

fix: change validation SubmissionQueueRecord clean method

- Add an optional status argument to can_transition_to
- Implement old status in clean method

This resolves a bug in the implementation of transition validation when using the clean method.

fix: remove try except in test_clean_new_instance

- Remove unnecessary try/except in test_clean_new_instance
- Add validations for when a new queue record calls the clean method
- Extend coverage measure in test_models

test: Add tests to extend coverage for SubmissionQueueRecord model

- Status transition rules
- Failure count tracking
- Queue length calculations
- Submission retrieval logic

The new tests improve the robustness of the submissions app by covering the areas listed above.

docs: Improve create_submission docstring formatting

- Fix Sphinx documentation warnings
- Correct formatting and indentation issues
- Clarify parameter and return value descriptions

Resolves documentation build warnings related to:
- Inline strong start-string without end-string
- Unexpected indentation

fix: correct docstring formatting in create_submission

- Add proper indentation and blank lines between sections
- Fix example code block formatting
- Replace >>> with ... for continued lines in example
- Ensure consistent spacing throughout docstring

This resolves the ReadTheDocs build failure and docstring formatting warning.

fix: code style and quality improvements

- Fix pylint warnings:
  - Remove protected access warnings in test files
  - Remove pointless string statements
  - Add missing final newlines
- Fix isort import ordering in:
  - api.py
  - test_models.py
  - test_api.py
- Fix pycodestyle issues:
  - Remove extra blank lines
  - Fix operator spacing
  - Add proper spacing before inline comments
  - Fix long lines in migration files
  - Fix line spacing in test files
- Fix docstring formatting in api.py create_submission method

Testing:
make test_quality passes without warnings

feat: add grading metadata fields to submission queue record

- Add grader_file_name CharField (128 chars max) with empty default
- Add points_possible PositiveIntegerField with default value of 1
- Update create_submission_queue_record to handle new fields

fix: Remove duplicate nested create_submission function definition

- Removes the redundant nested function definition

feat: add nullable pullkey and grader_reply to Submission model

feat(SubmissionQueueRecord): add pullkey and grader_reply fields

- Add pullkey CharField(128) for tracking submission processing key
- Add grader_reply TextField to store grader's response
- Fields support queue processing and feedback management for xwatcher

test: add comprehensive tests for Submission and SubmissionQueueRecord models

Add missing test suite for Submission model and add new tests for SubmissionQueueRecord:

Submission model tests (previously non-existent):
- Basic model functionality (creation, retrieval)
- String and repr representations
- Large answer handling
- Field mutability behavior
- Submission ordering
- Soft deletion mechanism
- JSON serialization of answers

SubmissionQueueRecord model tests:
- Default status and initialization
- Valid and invalid status transitions
- Failure count tracking
- Record processability logic
- Queue name validation
- One-to-one relationship with Submission
- Status time updates

Updates:
- Fix repr test to handle UUID and string conversions
- Document actual mutability behavior
- Add validation for large answers
- Ensure proper status transitions and timing
- Verify queue processing logic

The test suite ensures both models work correctly together
and maintain data integrity through their lifecycle.
Note: This commit adds the first test coverage for the Submission model,
which previously lacked any automated testing.

refactor: update submission queue record tests

- Update test cases to use new public create_submission_queue_record function
- Update assertions to use SubmissionQueueCanNotBeEmptyError instead of ValueError
- Adjust test function parameters to match new interface (queue_name and files)
- Add new test cases for empty queue_name validation

The changes ensure test coverage for:
- Direct queue record creation with valid and invalid inputs
- Integration with create_submission function
- Multiple submission queue records
- Error handling scenarios

fix: add blank line in models

- This fixed the pylint warning

feat(error): add SubmissionQueueCanNotBeEmptyError error class

- This error is raised when the queue name is empty

fix: improve submission queue record validation

- Update condition for creating submission queue record to properly check for queue_name
- Use `event_data.get()` for safer dictionary access
- Change ValueError to SubmissionQueueCanNotBeEmptyError for more specific error handling
- Add proper validation in parent function to prevent None access errors

The changes ensure that:
1. Queue name validation is more robust
2. Error handling is more specific and clear
3. Null checks are properly implemented

Test coverage remains unchanged.
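
The validation described in this commit might look roughly like the sketch below. `SubmissionQueueCanNotBeEmptyError` is redefined here as a stand-in so the snippet is self-contained, and `extract_queue_name` is a hypothetical helper, not a function from the PR.

```python
class SubmissionQueueCanNotBeEmptyError(Exception):
    """Stand-in for the error class named in this PR."""


def extract_queue_name(event_data):
    """Return the queue name from event_data, or raise if missing or blank.

    Hypothetical helper: event_data may be None or lack the key entirely.
    """
    # .get() avoids a KeyError when the key is absent; "or {}" guards None.
    queue_name = (event_data or {}).get("queue_name")
    if not queue_name:
        raise SubmissionQueueCanNotBeEmptyError("queue_name must not be empty")
    return queue_name
```

Raising a dedicated exception instead of ValueError lets callers distinguish a missing queue name from other invalid input.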

refactor(submission): simplify queue record creation with explicit field

- Replace dynamic unpacking of event_data with explicit queue_name field
- Remove unnecessary field expansion to improve code clarity and maintainability

test: add queue record unit and integration tests

- Add unit tests for _create_submission_queue_record helper function
  - Basic queue record creation
  - Queue name validation
  - Database error handling

- Add integration tests with create_submission
  - Queue record creation via submission creation
  - Multiple queue records with same name
  - Error handling with missing queue name
  - Database error integration testing

Organized tests into clearly separated unit and integration sections
for better maintainability and clarity.

feat: add submission queue record functionality

- Add _create_submission_queue_record helper function to handle queue record creation
- Modify create_submission to accept event_data as kwargs
- Update documentation to reflect new parameters and functionality
- Allow dynamic field assignment using event_data kwargs in queue record creation

The main changes:
- Added ability to create SubmissionQueueRecord with dynamic fields
- Improved error handling for queue record creation
- Made event_data more flexible using kwargs pattern
- Maintained backwards compatibility with existing submission creation

feat(submissions): add SubmissionQueueRecord model

- Add SubmissionQueueManager to process queries
- Replace multiple timestamps with a single status_time field
- Introduce explicit state machine for submission processing
- Add atomic state transitions with validation
- Optimize database indexes for queue operations

Key changes:
- Add STATUS_CHOICES and VALID_TRANSITIONS for state management
- Consolidate pull_time, push_time, return_time into status_time
- Add transaction-safe update_status() method
- Improve queue manager with safer submission retrieval
- Add validation for state transitions
- Optimize indexes for common query patterns

The changes maintain backwards compatibility with existing queue
processing while providing a more robust and maintainable approach
to state management.

Add extensive test suite covering edge cases and error handling
  - Test file processing with different input types (bytes, file objects)
  - Implement error handling tests for IO errors and invalid files
  - Verify complete file processing workflow

feat: implement submission file workflow

- Create SubmissionFile model
- Develop SubmissionFileManager for centralized DB query handling
- Add comprehensive test suite for submission file actions
- Implement validation tests for file addition in create_external_grader_detail
- Document architectural decisions with ADR
- Add test queue folder in .gitignore
@openedx-webhooks

Thanks for the pull request, @leoaulasneo98!

This repository is currently maintained by @openedx/committers-edx-submissions.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

🔘 Update the status of your PR

Your PR is currently marked as a draft. After completing the steps above, update its status by clicking "Ready for Review", or removing "WIP" from the title, as appropriate.


Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@openedx-webhooks added the open-source-contribution (PR author is not from Axim or 2U) label on Feb 24, 2025

codecov bot commented Feb 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.01%. Comparing base (639e196) to head (1ab52ad).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #286      +/-   ##
==========================================
+ Coverage   93.53%   95.01%   +1.48%     
==========================================
  Files          18       19       +1     
  Lines        1995     2530     +535     
  Branches       90      105      +15     
==========================================
+ Hits         1866     2404     +538     
+ Misses        118      115       -3     
  Partials       11       11              
Flag Coverage Δ
unittests 95.01% <100.00%> (+1.48%) ⬆️


@mphilbrick211 added the FC (Relates to an Axim Funded Contribution project) label on Feb 24, 2025
Labels
FC (Relates to an Axim Funded Contribution project), open-source-contribution (PR author is not from Axim or 2U)
Projects
Status: Waiting on Author

3 participants