Add package 2024_Barquera_ChichenItza #211

RodrigoBarquera · 2024-09-05T14:01:01Z

PR Checklist for a new package submission

The package does not exist already in the community archive, also not with a different name.
The package title in the POSEIDON.yml conforms to the general title structure suggested here: <Year>_<Last name of first author>_<Region, time period or special feature of the paper>, e.g. 2021_Zegarac_SoutheasternEurope, 2021_SeguinOrlando_BellBeaker or 2021_Kivisild_MedievalEstonia.
The package is stored in a directory that is named like the package title.

The Publication column in the .janno file is filled and the respective .bib file has complete entries for the listed keys.
The .janno file does not include any empty columns or columns only filled with n/a.
The order of columns in the .janno file adheres to the standard order as defined in the Poseidon schema here.
The .janno and the .ssf files are not fully quoted, so they only use single- or double quotes ("...", '...') to enclose text fields where it is strictly necessary (i.e. their entry includes a TAB).

The package passes a validation with trident validate --fullGeno.

Large genotype data files are properly tracked with Git LFS and not directly pushed to the repository. For an instruction on how to set up Git LFS please look here. If you accidentally pushed the files the wrong way you can fix it with git lfs migrate import --no-rewrite path/to/file.bed (see here).
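The checklist item about empty or n/a-only columns can be spot-checked before submission. The following is a minimal sketch, not part of trident: it assumes the .janno file is a plain tab-separated table with a header row, and the helper name `empty_or_na_columns` and the two sample rows are invented for illustration.

```python
import csv
import io


def empty_or_na_columns(janno_text):
    """Return the names of columns whose every value is empty or 'n/a'.

    Hypothetical helper: assumes tab-separated text with a header line.
    """
    reader = csv.DictReader(io.StringIO(janno_text), delimiter="\t")
    rows = list(reader)
    flagged = []
    for name in reader.fieldnames or []:
        # Missing cells are treated like empty strings.
        values = [(row.get(name) or "").strip() for row in rows]
        if all(value in ("", "n/a") for value in values):
            flagged.append(name)
    return flagged


# Made-up two-row table; only Collection_ID is filled with n/a throughout.
sample = (
    "Poseidon_ID\tGenetic_Sex\tGroup_Name\tCollection_ID\n"
    "ind1\tF\tGroupA\tn/a\n"
    "ind2\tM\tGroupA\tn/a\n"
)

print(empty_or_na_columns(sample))  # ['Collection_ID']
```

Any column reported here would need to be filled or dropped; `trident validate --fullGeno` remains the authoritative check.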

stschiff · 2024-10-09T07:35:03Z

Hi @RodrigoBarquera, this is great. Super that you even entered the relationship columns, which I know is a lot of work.

Sorry for taking so long to give feedback, but I have some points:

We actually would like the Collection_ID column to reflect the ID from the actual collection. I see that you've used the column Alternative_IDs for that. I suggest that you simply rename the Alternative_IDs column to Collection_ID and remove the empty Collection_ID column.
You have only given date information for the few samples that you've C14-dated. But I'm sure you can also give dates for all samples that have no C14 date, right? The Date_Type column accepts the value contextual for that, and it would be good to fill it. We generally aspire to have at least contextual dates for every single sample, to facilitate meta-analyses through space and time. Note that with contextual dates, you should only fill the columns Date_BC_AD_Start, Date_BC_AD_Median and Date_BC_AD_End, where the median can just be the mid-point of the interval.
I see that you've left the columns Endogenous, Nr_SNPs, Coverage_on_Target_SNPs, Damage, Contamination, Contamination_Err, Contamination_Meas and Contamination_Note empty. I'm sure this information is available in your paper, right? Do you need help with these? We have three student assistants now who can help with this. Let us know! I would be willing to leave these empty for now, but if it's just about needing help, let us help.
The Genetic_Source_Accession_IDs should be filled. They can all have the exact same Project Accession ID entry from the ENA.

Again, let us know if you need help with this and we can ask someone from our team.
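The mid-point rule for contextual dates described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `contextual_date` is hypothetical, years are assumed to be integers with BC years written as negative numbers, and odd-length intervals are rounded down. The column names are the ones listed in the comment.

```python
def contextual_date(start, end):
    """Build the date fields for a contextually dated sample.

    Assumes integer years, negative for BC, and start <= end.
    """
    if start > end:
        raise ValueError("start year must not be later than end year")
    # Mid-point of the interval; floor division keeps the result an integer.
    median = (start + end) // 2
    return {
        "Date_Type": "contextual",
        "Date_BC_AD_Start": start,
        "Date_BC_AD_Median": median,
        "Date_BC_AD_End": end,
    }


# A 500 to 900 AD interval gives a median of 700.
print(contextual_date(500, 900)["Date_BC_AD_Median"])  # 700
```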

stschiff · 2024-12-03T09:33:24Z

@RodrigoBarquera did you have a chance to look into this, or do you need help from one of us?

nevrome · 2025-01-14T16:10:12Z

@RodrigoBarquera Another reminder. Please let us know if you would like to hand this over to another assignee.

RodrigoBarquera · 2025-02-11T15:42:37Z

Hi Stephan! :)

With the help of Thiseas, I finally managed to fill in the missing fields. Two things were left unchanged. First, the dating for all samples: within the same context we found individuals spanning 500 years, so we cannot date the remaining samples any more precisely by context. Second, the contamination estimates: we don't have them consistently for all libraries of all individuals. If it is absolutely necessary, I can compile them and compute a weighted average, but since eager 1 and 2 were both used at the time, I would need to rerun everything on eager 2 to be consistent. This would have to be from the TF data, since the SG is in all cases too low to give an accurate estimate.
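For reference, a weighted average of per-library contamination estimates could look like the sketch below. The thread does not say which weights would be used, so weighting each library by its number of covered SNPs is purely an assumption here, as are the function name `weighted_mean_contamination` and the example numbers, which are invented and not taken from the package.

```python
def weighted_mean_contamination(estimates):
    """Combine per-library estimates into one value per individual.

    estimates: list of (contamination_estimate, weight) pairs.
    The choice of weight (for example SNPs covered) is an assumption.
    """
    total_weight = sum(weight for _, weight in estimates)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(value * weight for value, weight in estimates) / total_weight


# Two made-up libraries: the better-covered one dominates the average.
libraries = [(0.01, 200_000), (0.03, 100_000)]
print(round(weighted_mean_contamination(libraries), 4))
```

Such an average is only meaningful if all per-library estimates come from the same method, which is the consistency concern raised in the comment.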

stschiff · 2025-02-12T10:22:51Z

Thanks, @RodrigoBarquera. OK, maybe then forget about contamination, that's OK. But for the dates, please write here what the span is. Even something like 1000 - 2000 AD would be better than nothing, given that our repository has data from the last 40,000 years or so. I am happy to input them for you.

stschiff · 2025-02-12T10:23:56Z

Never mind, I just looked at your paper and saw 500-900 AD. That's perfect. I will enter this.

stschiff · 2025-02-12T10:29:54Z

OK, so if nobody objects, I would like to take over this PR now, by merging into a local branch and making the final touches.

stschiff · 2025-02-12T15:31:39Z

This PR is being further worked on in #250

added new package named 2024_Barquera_ChichenItza

dd80eab

nevrome changed the title ~~added new package named 2024_Barquera_ChichenItza~~ Add package 2024_Barquera_ChichenItza Sep 6, 2024

stschiff self-assigned this Sep 9, 2024

nevrome assigned RodrigoBarquera and unassigned stschiff Jan 14, 2025

nevrome requested a review from stschiff January 14, 2025 16:14

stschiff removed their request for review January 31, 2025 16:07

stschiff marked this pull request as draft January 31, 2025 16:08

Add files via upload

2fa794e

nevrome assigned stschiff and unassigned RodrigoBarquera Feb 12, 2025

stschiff changed the base branch from master to 2024_Barquera February 12, 2025 15:29

stschiff marked this pull request as ready for review February 12, 2025 15:29

stschiff merged commit 7663b11 into poseidon-framework:2024_Barquera Feb 12, 2025
1 check passed

stschiff mentioned this pull request Feb 12, 2025

Add 2024_Barquera_ChichenItza #250

Open

20 tasks
