This repository was archived by the owner on Jan 8, 2022. It is now read-only.

Commit fd787c0

Fix bug with item exporter such that the list is always an AF::Relation
Previously this could be an array, which does not respond to `#find_each` and winds up loading tons of data into memory. Monkeypatch AF to allow POSTing to Solr. Confirmed working. Also prefer `#each` over `#find_each`, which, when chained with `#where`, has unpredictable results. (Instead of operating on the ~4K items in the relation returned by `#where`, `#find_each` operated on all instances of the model.) To support this, add another AF monkeypatch to use HTTP POST.
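
To illustrate the first point of the message, here is a minimal plain-Ruby sketch. `FakeRelation` and `each_item` are hypothetical names invented for this example and are not part of ActiveFedora or Hydrus; the stand-in only mimics the shape of a relation that offers batched iteration.

```ruby
# Hypothetical stand-in for a relation; NOT ActiveFedora's implementation.
class FakeRelation
  def initialize(records)
    @records = records
  end

  # Yields records one at a time, walking them in slices of batch_size.
  def find_each(batch_size: 2)
    @records.each_slice(batch_size) do |batch|
      batch.each { |record| yield record }
    end
  end
end

# Hypothetical helper: use find_each when the list offers it, else fall back to each.
def each_item(list, &block)
  if list.respond_to?(:find_each)
    list.find_each(&block)
  else
    list.each(&block)
  end
end

array_list    = %w[a b c]
relation_list = FakeRelation.new(%w[a b c])

# A plain Array has no find_each, so calling it directly raises NoMethodError.
warn "Array responds to find_each? #{array_list.respond_to?(:find_each)}"
warn "Relation responds to find_each? #{relation_list.respond_to?(:find_each)}"

each_item(array_list)    { |item| warn "array item: #{item}" }
each_item(relation_list) { |item| warn "relation item: #{item}" }
```

The commit itself takes a simpler route than this helper: it makes `list` a relation in both branches and iterates with `#each`.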
1 parent 6b7396d commit fd787c0

1 file changed: +43 -3 lines changed


bin/export-items

@@ -8,13 +8,53 @@ COLLECTION_DRUIDS_LIST = 'collection_druids.txt'
 
 require_relative '../config/environment'
 
+# Monkey-patch AF to allow using HTTP POST (for querying items by their collection)
+module ActiveFedora
+  class SolrService
+    def self.query(query, args={})
+      raw = args.delete(:raw)
+      args = args.merge(:q=>query, :qt=>'standard')
+      result = SolrService.instance.conn.post('select', :data=>args)
+      return result if raw
+      result['response']['docs']
+    end
+  end
+
+  module FinderMethods
+    def find_in_batches conditions, opts={}
+      data = { :q => create_query(conditions) }
+      opts[:qt] = @klass.solr_query_handler
+      #set default sort to created date ascending
+      unless opts[:sort].present?
+        opts[:sort]= @klass.default_sort_params
+      end
+
+      batch_size = opts.delete(:batch_size) || 1000
+
+      counter = 0
+      begin
+        counter += 1
+        response = ActiveFedora::SolrService.instance.conn.paginate counter, batch_size, "select", { :method => :post, :params => opts, :data => data }
+        docs = response["response"]["docs"]
+        yield docs
+      end while docs.has_next?
+    end
+  end
+end
+
 collection_druids = File.exist?(COLLECTION_DRUIDS_LIST) ?
   File.read(COLLECTION_DRUIDS_LIST).split.map { |bare_druid| "druid:#{bare_druid}" } :
   []
 
-list = collection_druids.any? ?
-  Hydrus::Collection.find(collection_druids).flat_map(&:items) :
+list = if collection_druids.any?
+  where_collection_in_list_query = ActiveFedora::SolrService.construct_query_for_rel(
+    collection_druids.map { |druid| [:is_member_of_collection, "info:fedora/#{druid}"] },
+    ' OR '
+  )
+  Hydrus::Item.where(where_collection_in_list_query)
+else
   Hydrus::Item.all
+end
 
 def contributors(item)
   item.contributors.
@@ -63,7 +103,7 @@ end
 
 warn "Exporting #{list.count} items"
 count = 0
-list.find_each do |item|
+list.each do |item|
   count += 1
   warn count
   begin
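
The collection filter in the diff above can be sketched in plain Ruby as follows. This is an illustration under stated assumptions: `membership_query` is a hypothetical stand-in for `ActiveFedora::SolrService.construct_query_for_rel`, the druid values are made up, and the field name and quoting shown are assumptions rather than the library's actual output.

```ruby
# Made-up bare druids, standing in for the contents of collection_druids.txt.
raw = "bb123cd4567\ncc234de5678\n"

# Same transformation as in the script: split on whitespace, add the "druid:" prefix.
collection_druids = raw.split.map { |bare_druid| "druid:#{bare_druid}" }

# Hypothetical stand-in for query construction: one clause per collection,
# joined with ' OR '. Field name and quoting are illustrative assumptions.
def membership_query(druids, field: 'is_member_of_collection')
  druids.map { |druid| %(#{field}:"info:fedora/#{druid}") }.join(' OR ')
end

puts membership_query(collection_druids)
```

A query built this way grows by one clause per collection in the list.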
