Archive channel tree command [DRAFT] #2654
Codecov Report
@@ Coverage Diff @@
## master #2654 +/- ##
=======================================
Coverage 85.39% 85.39%
=======================================
Files 298 298
Lines 15767 15767
=======================================
Hits 13465 13465
Misses 2302 2302
What's needed to help push this forward, @ivanistheone?
This is a minimal set of additions to make sure the JSON archive format really works with the treediffer preset="studio" defined in https://github.com/learningequality/treediffer/blob/master/src/treediffer/presets.py#L39-L80
7fefb6c to a3f3cdb
For context, this PR was due to a misunderstanding on my part: when I heard Jordan was working on channel diff, I rushed to get the archive channel command and associated detailed diff code ready so she could use it, but then I realized "channel diff" meant just the simpler "channel counts diff" and detailed diff wasn't in scope, hence the pause on it. That being said, it would be good to start archiving channel data, even if there is no frontend for it yet. @rtibbles Here is a mini-list of possible next steps:
Other related dev work:
I'm a bit out of the loop so I cannot speak to priority/timeframes, but I'm happy to help out in my free time on B. after A. (confirm this mgmt command is needed).
Use cases
These were discussed a bit with Jordan and @kollivier as useful, but I'm not sure if/when they would fit in the roadmap:

1/ channeldiff task + command
See standalone POC command-line code for this here: treediffer/examples/studiodiffferpoc.py

2/ channeldiff UI
run channeldiff task, then

3/ archival
Not sure if we need to tackle that right now, since it requires consideration of scalability and long-term user data retention. It would be nice to have a combined command archivechannel that does both archivechanneltree and archivechanneldb.

4/ PUBLISH/EXPORT Kolibri DB from studio JSON archive tree
Instead of export.py being based on direct access to the DB, Kolibri DB creation can be an independent task with input studio_tree_archive.json --> Kolibri DB (plus fetching perseus files if needed).

5/ content provenance
All the expensive "graph analytics", such as which channel imports from which, can be done easily based on the channel archives JSON.

6/ ROC data importer
Not needed for the ROC prototype, but good to have full Studio data (including provenance).
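To make use case 1/ a bit more concrete, here is a rough sketch of what a detailed diff over two archived trees could look like. This is not the treediffer API or its "studio" preset; the keys node_id, title, and children and the function names flatten and diff_trees are assumptions made up for illustration.

```python
def flatten(tree, parent_id=None, out=None):
    """Index every node of a nested archive dict by its node_id."""
    if out is None:
        out = {}
    # "node_id", "title" and "children" are hypothetical archive keys.
    out[tree["node_id"]] = {"title": tree.get("title"), "parent_id": parent_id}
    for child in tree.get("children", []):
        flatten(child, tree["node_id"], out)
    return out


def diff_trees(old_tree, new_tree):
    """Compare two archived trees and report node-level changes."""
    old, new = flatten(old_tree), flatten(new_tree)
    common = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        # Same node_id, different parent: the node was moved.
        "moved": sorted(n for n in common if old[n]["parent_id"] != new[n]["parent_id"]),
        # Same node_id, different title: the node was edited.
        "modified": sorted(n for n in common if old[n]["title"] != new[n]["title"]),
    }
```

A real diff would compare many more fields than the title; the point is only that once both versions exist as JSON archives, the comparison needs no access to the Studio DB.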
Description
This is a POC for "channel archiving" command that exports the complete channel tree as JSON.
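For readers unfamiliar with the idea, the following is a minimal sketch of exporting a tree as nested JSON. It is not the code in this PR: the Node dataclass stands in for Studio's content node model, and the field names and the helpers serialize_node and write_archive are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a Studio content node."""
    node_id: str
    title: str
    kind: str
    children: List["Node"] = field(default_factory=list)


def serialize_node(node):
    """Recursively convert a node and its descendants to plain dicts."""
    return {
        "node_id": node.node_id,
        "title": node.title,
        "kind": node.kind,
        "children": [serialize_node(child) for child in node.children],
    }


def write_archive(root, path):
    """Write the whole tree rooted at `root` to `path` as JSON."""
    data = serialize_node(root)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
    return data
```

The actual command would presumably walk database rows through a serializer rather than in-memory dataclasses, but the output shape (one nested JSON document per channel tree) is the same idea.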
Steps to Test
./contentcuration/manage.py archivechanneltree {channel_id}
for a {channel_id} that exists in the local DB.
Implementation Notes
At a high level, how did you implement this?
Function archive_channel_tree(channel_id, tree='main') in contentcuration/contentcuration/utils/archive.py
Management command archivechanneltree that calls this function.
Does this introduce any tech-debt items?
Since we're using a new serializer for this task, the fields of that serializer would have to be kept up to date as Studio data models evolve.
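One way to keep this debt visible would be a small check that compares the model's field names against the fields the archive serializer emits. The sketch below is only an illustration under assumptions: find_field_drift and its arguments are hypothetical, and in a Django project the model-side list could presumably be built from Model._meta.get_fields().

```python
def find_field_drift(model_fields, archived_fields, intentionally_skipped=()):
    """Report fields that exist on only one side.

    model_fields: names of fields currently defined on the data model.
    archived_fields: names of fields the archive serializer emits.
    intentionally_skipped: model fields deliberately left out of archives.
    """
    model = set(model_fields)
    archived = set(archived_fields)
    skipped = set(intentionally_skipped)
    return {
        # New model fields the serializer has not caught up with yet.
        "missing_from_archive": sorted(model - archived - skipped),
        # Serializer fields that no longer exist on the model.
        "stale_in_serializer": sorted(archived - model),
    }
```

Run as a unit test, a check along these lines would fail whenever a model field is added or removed without a matching decision about the archive format.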
Checklist
Comments
This is strictly a POC and not finished; it would need to be continued in order to make sure channel archives contain all the info needed for all possible use cases (e.g. is the info enough to "restore" a channel from archive?).
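The "restore" question could be tested with a round trip: build the archive from flat rows, rebuild rows from the archive, and compare. The sketch below assumes a simplified row shape (id, parent_id, sort_order, plus any other columns) that is not Studio's actual schema; all function names are hypothetical.

```python
def rows_to_tree(rows):
    """Nest flat rows (each with id, parent_id, sort_order) into one tree."""
    by_parent = {}
    for row in rows:
        by_parent.setdefault(row["parent_id"], []).append(row)

    def build(row):
        children = sorted(by_parent.get(row["id"], []), key=lambda r: r["sort_order"])
        node = {k: v for k, v in row.items() if k != "parent_id"}
        node["children"] = [build(child) for child in children]
        return node

    (root,) = by_parent[None]  # exactly one row may have no parent
    return build(root)


def tree_to_rows(node, parent_id=None):
    """Flatten an archived tree back into rows, re-deriving parent_id."""
    row = {k: v for k, v in node.items() if k != "children"}
    row["parent_id"] = parent_id
    rows = [row]
    for child in node["children"]:
        rows.extend(tree_to_rows(child, node["id"]))
    return rows


def roundtrip_ok(rows):
    """True if archiving and then restoring reproduces the original rows."""
    restored = tree_to_rows(rows_to_tree(rows))
    by_id = lambda r: r["id"]
    return sorted(restored, key=by_id) == sorted(rows, key=by_id)
```

If the archive format drops a column that the restore step needs, the comparison fails, which is the kind of gap this comment is worried about.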
exportchannel command) from the need to access studio DB (assuming all the necessary info is present in the archived