Commit c17d702

Adds a Delta Report column to highlight change events that only impact the read_only field; Document Delta report - first draft.
1 parent a04a691 commit c17d702

File tree

2 files changed

+58
-4
lines changed

README.md

+55-2
@@ -277,10 +277,63 @@ Run Python from the virtual environment (see Python Virtual Environment document
277277

278278
Where `--dso_type` is `[communities|collections|items|people|bitstreams]`
279279

280+
## Jupiter Delta Report
280281

281-
## Jupiter Delta
282+
### Jupiter Delta Report: Overview
283+
284+
Because not all ERA content can be frozen, the delta report lists every change to ERA content from a specified date until the time the report is run. It draws on the same version data that powers the ERA/Jupiter versions UI. Each change event is recorded as a row in a Google Sheet containing both the changed Jupiter fieldname/value data and the Scholaris mappings for that data.
285+
286+
The assumption is that the number of actionable changes is small, so a human can use the report to:
287+
288+
* see that a Jupiter object has changed
289+
* identify the Scholaris fieldnames to update and the Scholaris value
290+
291+
### Delta Report: How to use
292+
293+
The report can be presented and shared as a Google Sheet so that users can sort, search, and modify it as needed.
294+
295+
An early example: <https://docs.google.com/spreadsheets/d/1GWxwEtM0EOyoPP5RUrf1oQCtIVz1re-8n0Be4Kybt8E/edit?gid=0#gid=0>
296+
297+
The header includes:
298+
299+
```csv
300+
type,change_id,jupiter_id,is_jupiter_currently_readonly,read_only_event,changed at,event,jupiter delta,scholaris mapped delta,jupiter delta formatted,scholaris mapped delta formatted
301+
```
302+
303+
Where:
304+
305+
* type: item|thesis
306+
* change_id: change event id (PaperTrail::Version ID)
307+
* jupiter_id: ERA ID
308+
* is_jupiter_currently_readonly: "True" if the ERA object is currently read only; otherwise empty
309+
* read_only_event: "True" if this change event updated only the `read_only` field and the object's `updated_at` timestamp; otherwise empty
310+
* changed at: change event timestamp
311+
* event: the type of the change record: update|destroy
312+
* jupiter delta: jupiter change event details (what field plus old => new values)
313+
* jupiter delta formatted: an easier-to-read (pretty-printed JSON) version of jupiter delta
314+
* scholaris mapped delta: the jupiter delta mapped to Scholaris fieldname and new value equivalents
315+
* scholaris mapped delta formatted: an easier-to-read (pretty-printed JSON) version of scholaris mapped delta
316+
317+
Process thoughts:
318+
319+
* In ERA, changing an object to "read only" creates a new version change event. The read_only_event column indicates whether a change event updated only the read-only status, so these events can be filtered or sorted out if they are too numerous.
320+
* Sort by jupiter_id and changed at, ascending: this groups each object's updates in time order; if the same field changed multiple times, use the most recent value.
321+
* Event "destroy" means the object has been deleted and there will be no Scholaris mapping
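
The filtering and sorting described above can also be done outside the spreadsheet. The following is a minimal sketch, not part of `jupiter_delta.rb`: the sample rows, the `actionable_changes` helper, and the timestamp format are all assumptions made for illustration, and only a subset of the report columns is used.

```ruby
require 'csv'
require 'time'

# Hypothetical sample rows; a real report has more columns.
SAMPLE = <<~ROWS
  type,change_id,jupiter_id,read_only_event,changed at,event
  Item,3,abc,,2025-01-03 10:00:00 UTC,update
  Item,1,abc,,2025-01-01 10:00:00 UTC,update
  Item,2,abc,True,2025-01-02 10:00:00 UTC,update
  Thesis,4,xyz,,2025-01-02 09:00:00 UTC,destroy
ROWS

# Drop read-only-only events, sort by object id and timestamp (ascending),
# then group the remaining change events per Jupiter object.
def actionable_changes(csv_text)
  rows = CSV.parse(csv_text, headers: true).map(&:to_h)
  rows
    .reject { |r| r['read_only_event'] == 'True' }
    .sort_by { |r| [r['jupiter_id'], Time.parse(r['changed at'])] }
    .group_by { |r| r['jupiter_id'] }
end

grouped = actionable_changes(SAMPLE)
grouped.each do |jupiter_id, changes|
  puts "#{jupiter_id}: #{changes.map { |c| c['change_id'] }.join(', ')}"
end
```

Within each group the last row is the most recent change, which is the one to use when the same field changed more than once.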
322+
323+
Question:
324+
325+
* How best to present the report so that, given a Jupiter field, one can easily find the Scholaris equivalent?
326+
327+
### Delta Report: How to generate
328+
329+
See script details: `jupiter_output_scripts/jupiter_delta.rb`
330+
331+
Rough outline:
332+
333+
* Needs the Ruby Class used in step 1 of SAF package generation
334+
* Run `jupiter_output_scripts/jupiter_delta.rb`
335+
* Upload the CSV to Google Sheets for sharing
282336

283-
`jupiter_output_scripts/jupiter_delta.rb`
284337

285338
## Jupiter Statistics to Scholaris
286339

jupiter_output_scripts/jupiter_delta.rb

+3-2
@@ -74,7 +74,7 @@ def process_change_event(change_event, obj)
7474

7575
def perform()
7676
CSV.open(@output_file, 'wb') do |csv|
77-
csv << ['type', 'change_id', 'jupiter_id', 'is_jupiter_currently_readonly', 'changed at', 'event', 'jupiter delta', 'scholaris mapped delta', 'jupiter delta formatted', 'scholaris mapped delta formatted']
77+
csv << ['type', 'change_id', 'jupiter_id', 'is_jupiter_currently_readonly', 'read_only_event', 'changed at', 'event', 'jupiter delta', 'scholaris mapped delta', 'jupiter delta formatted', 'scholaris mapped delta formatted']
7878
PaperTrail::Version.where(created_at: @date..).find_each do |row|
7979
# How to communicate key/value mapping differences from Jupiter to DSpace?
8080
# First part, add documentation describing how to use the output
@@ -86,8 +86,9 @@ def perform()
8686
# structure in a new column listing the Scholaris key/value pairs to update
8787
obj = row.item
8888
read_only = "True" if obj && obj.read_only?
89+
read_only_event = "True" if row.object_changes && row.object_changes.keys.to_set == ["updated_at", "read_only"].to_set
8990
scholaris_mapping = process_change_event(row, obj)
90-
csv << [row.item_type, row.id, row.item_id, read_only, row.created_at, row.event, row.object_changes, scholaris_mapping, JSON.pretty_generate(row.object_changes), JSON.pretty_generate(scholaris_mapping)]
91+
csv << [row.item_type, row.id, row.item_id, read_only, read_only_event, row.created_at, row.event, row.object_changes, scholaris_mapping, JSON.pretty_generate(row.object_changes), JSON.pretty_generate(scholaris_mapping)]
9192
end
9293
end
9394
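
The `read_only_event` check added in this commit compares the set of changed keys against `updated_at` and `read_only`. The standalone sketch below isolates that comparison so it can be tried without Rails or PaperTrail. The helper name `read_only_event?` and the sample hashes are hypothetical, and it assumes `object_changes` is a Hash of field name to `[old, new]` pairs (or `nil`), as the script's use of `.keys` suggests. Unlike the script, which emits `"True"` or an empty cell, the helper returns a boolean.

```ruby
require 'set'

# Keys that, taken together, mark a change event as read-only-only.
READ_ONLY_KEYS = %w[updated_at read_only].to_set

# Hypothetical helper: true only when the change event touched exactly
# the read_only flag and the updated_at timestamp.
def read_only_event?(object_changes)
  return false if object_changes.nil?

  object_changes.keys.to_set == READ_ONLY_KEYS
end

only_flag = {
  'read_only'  => [false, true],
  'updated_at' => ['2025-01-01T10:00:00Z', '2025-01-02T10:00:00Z']
}
with_title = only_flag.merge('title' => ['Old title', 'New title'])

puts read_only_event?(only_flag)   # exactly the two keys
puts read_only_event?(with_title)  # an extra field also changed
puts read_only_event?(nil)         # no recorded changes
```

Comparing sets rather than arrays makes the check independent of the order in which the keys were recorded.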
