cfg_* procedure(s) for preferable .gitattributes for various known dataset types #71

yarikoptic · 2019-07-17T14:52:47Z

ATM we have cfg_bids, which

- sets up .gitattributes to have some files directly in git
- sets up metadata extraction configuration

But besides BIDS I keep running into the need to establish .gitattributes for the following dataset types, where I think something analogous to the BIDS setup should be done:

.feat and .gfeat FSL outputs

.gitattributes - maybe use cfg_text2git?

On a sample .gfeat directory of 9GB, with a regular cfg_text2git I ended up with 260KB of .git/objects.

That allowed me to quickly install that dataset elsewhere and then run datalad get **/*.png

metadata
- datalad: eventually might configure the extractor
- git-annex: we might like to annotate files with annex metadata (file types, maybe), so that on shells without ** people could quickly get all the supplementary data files needed to browse the results
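
The git-annex idea above could be sketched roughly as follows. This is only an illustration: the function name `annex_metadata_commands` and the metadata field name `filetype` are assumptions made for this example, not an existing convention, and the function only builds `git annex metadata --set` command lines without running them.

```python
from collections import defaultdict
from pathlib import Path


def annex_metadata_commands(root, field="filetype"):
    """Build `git annex metadata --set` invocations that tag files by extension.

    `field` is a hypothetical metadata field name chosen for this sketch.
    Returns one argv list per extension; nothing is executed here.
    """
    root = Path(root)
    by_ext = defaultdict(list)
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip git's own bookkeeping directory.
        if ".git" in rel.parts:
            continue
        # Annexed files without content are dangling symlinks, so accept
        # symlinks as well as regular files; ignore extension-less names.
        if (path.is_file() or path.is_symlink()) and path.suffix:
            by_ext[path.suffix.lstrip(".").lower()].append(rel.as_posix())
    return [
        ["git", "annex", "metadata", "--set", f"{field}={ext}", *files]
        for ext, files in sorted(by_ext.items())
    ]
```

With such tags in place, something like `git annex get --metadata filetype=png` would fetch the images without relying on `**` globbing. Since annex metadata is attached to annexed content, files committed directly to git would not be covered by this.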

fmriprep

.gitattributes

I had

*.md annex.largefiles=nothing
*.html annex.largefiles=nothing
*.json annex.largefiles=nothing
CITATION.* annex.largefiles=(not(mimetype=text/*))

which resulted in 32MB of .git/objects for a ~500GB dataset (~250 subjects).

metadata
- configure extractors (nifti1, bids, maybe more once FreeSurfer etc. are supported)
- interesting use case, since the BIDS(-derivative) dataset is not at the top of this dataset, which has two directories -- fmriprep and freesurfer, so the bids extractor should be informed to look into fmriprep/

HOWTO

Pretty much all those scenarios are very similar and require only slightly different specifications. I see two implementation possibilities:

breed cfg_* scripts

- extract common code from cfg_bids into some cfg_common.py helper
- reuse it from within the individual cfg_bids, cfg_feat, cfg_fmriprep
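
A minimal sketch of what such a shared helper might look like, assuming it only needs to append missing rules to .gitattributes. The names `RULES` and `add_gitattributes` are hypothetical, and only the fmriprep rule set is taken from this issue; rule sets for other types would still need to be worked out. Saving/committing the change and configuring metadata extractors are left out.

```python
from pathlib import Path

# Hypothetical per-type rule table; only the fmriprep entries come from
# the .gitattributes quoted in this issue.
RULES = {
    "fmriprep": [
        "*.md annex.largefiles=nothing",
        "*.html annex.largefiles=nothing",
        "*.json annex.largefiles=nothing",
        "CITATION.* annex.largefiles=(not(mimetype=text/*))",
    ],
}


def add_gitattributes(dataset_path, rules):
    """Append rules missing from <dataset_path>/.gitattributes.

    Returns the list of rules that were actually added, so calling it
    twice with the same rules is a no-op the second time.
    """
    attrs = Path(dataset_path) / ".gitattributes"
    existing = attrs.read_text().splitlines() if attrs.exists() else []
    missing = [rule for rule in rules if rule not in existing]
    if missing:
        # Rewrite the file with the old lines first, then the new rules.
        attrs.write_text("\n".join(existing + missing) + "\n")
    return missing
```

An individual script such as cfg_fmriprep could then shrink to little more than `add_gitattributes(path, RULES["fmriprep"])`, with `path` being however the procedure receives the dataset location.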

create (optionally parametrized) cfg_neuroimaging_dataset

which would sense the type of the dataset (or have it forced via an explicit parameter such as "bids") and act accordingly if it can figure the type out, crashing if detection fails and no explicit parameter is specified
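
The sense-or-force logic might look something like the sketch below. The detection heuristics are assumptions for illustration only: a top-level fmriprep/ directory for fmriprep output (as in the layout described above), a top-level dataset_description.json for BIDS, and a .feat/.gfeat directory for FSL output. The names `KNOWN_TYPES` and `sense_dataset_type` are hypothetical.

```python
from pathlib import Path

# Hypothetical set of type labels this procedure would know about.
KNOWN_TYPES = ("bids", "feat", "fmriprep")


def sense_dataset_type(path, forced=None):
    """Return the dataset type, honoring an explicit `forced` value.

    Raises ValueError for an unknown forced type and RuntimeError when
    nothing is forced and the heuristics below cannot decide.
    """
    if forced is not None:
        if forced not in KNOWN_TYPES:
            raise ValueError(f"unknown dataset type: {forced!r}")
        return forced

    root = Path(path)
    # fmriprep output as described above: fmriprep/ sits below the top level.
    if (root / "fmriprep").is_dir():
        return "fmriprep"
    # A BIDS dataset carries dataset_description.json at its top level.
    if (root / "dataset_description.json").is_file():
        return "bids"
    # FSL output: the dataset itself, or a direct child, is a .feat/.gfeat dir.
    if root.suffix in (".feat", ".gfeat") or any(
        child.is_dir() and child.suffix in (".feat", ".gfeat")
        for child in root.iterdir()
    ):
        return "feat"
    raise RuntimeError(f"cannot determine dataset type of {root}; specify it explicitly")
```

Checking for fmriprep/ before dataset_description.json is a deliberate choice here, so that a derivative dataset that also happens to ship a top-level description file is not mistaken for raw BIDS; whether that ordering is right in general is an open question.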
