Protein Close Residues Lifting (Graph to Hypergraph) #37

Open
wants to merge 40 commits into
base: main


bertranMiquel
Contributor

The UniProt dataset is a custom dataset built with data fetched from the UniProt API.
A query returns a list of proteins, and the structure of each protein is then fetched from the AlphaFold API. Each protein is represented as a graph whose nodes are the residues and whose edges are connections between residues. These connections are usually defined by how close the residues are to each other. In this example, we connect the residues in two ways:

  • Sequentialwise: Connects residues that are adjacent in the sequence (one after another). These edges reflect the peptide bonds that link the amino acids of a protein chain in a specific order.
  • Closewise: Connects two different residues when they are closer to each other than a distance threshold and the angle between their CarbonAlpha (CA) to CarbonBeta (CB) directions is less than 90 degrees. Residues are therefore connected only when they are in close proximity and have a similar orientation, which indicates that their spatial arrangement is biologically plausible.
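
The two edge rules above can be sketched roughly as follows. This is only an illustrative sketch, not the code in this pull request: the function name `build_edges`, the `threshold` value, and the choice to measure distance between CA atoms are assumptions, and it assumes every residue has a CB coordinate (glycine would need a virtual CB or special handling).

```python
import numpy as np

def build_edges(ca, cb, threshold=6.0):
    """Return (sequence_edges, close_edges) as lists of (i, j) pairs with i < j.

    ca, cb: (n, 3) arrays of CA and CB coordinates, one row per residue.
    threshold: hypothetical distance cutoff, in the same units as the coordinates.
    """
    ca = np.asarray(ca, dtype=float)
    cb = np.asarray(cb, dtype=float)
    n = len(ca)

    # Sequentialwise: link each residue to the next one in the chain.
    seq_edges = [(i, i + 1) for i in range(n - 1)]

    # Pairwise CA-CA distances (assumption: closeness is measured on CA atoms).
    dist = np.linalg.norm(ca[:, None, :] - ca[None, :, :], axis=-1)

    # Unit CA->CB direction per residue; an angle below 90 degrees
    # between two directions is equivalent to a positive dot product.
    direction = cb - ca
    direction /= np.linalg.norm(direction, axis=1, keepdims=True)
    cos = direction @ direction.T

    close = (dist < threshold) & (cos > 0.0)
    # Closewise: keep each unordered pair of distinct residues once.
    close_edges = [(i, j) for i in range(n) for j in range(i + 1, n) if close[i, j]]
    return seq_edges, close_edges
```

Note that in this sketch a pair of sequence neighbours can also appear among the close edges; whether the two sets should be merged or kept disjoint is a design choice this description does not specify.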

The target variable is the mass of the protein.

This representation can be improved by lifting it to a hypergraph.
Following Jiang et al. (2021), we create a hypergraph by grouping connected residues that are close to each other (closer than a distance parameter).
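
One possible reading of this lifting is sketched below. It is a hedged illustration only, not the implementation in this pull request or in Jiang et al. (2021): the function name `lift_to_hypergraph`, the `max_dist` parameter, and the rule "one hyperedge per residue, containing the residue and its graph neighbours within `max_dist`" are all assumptions.

```python
import numpy as np

def lift_to_hypergraph(coords, edges, max_dist=10.0):
    """Group each residue with its nearby graph neighbours into a hyperedge.

    coords: (n, 3) array with one coordinate per residue (e.g. CA atoms).
    edges: iterable of (i, j) pairs from the input graph.
    max_dist: hypothetical distance parameter for keeping a neighbour.
    Returns (hyperedges, incidence) where incidence has shape (n, n_hyperedges).
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)

    # Keep only connected pairs that are also closer than max_dist.
    neighbors = {i: set() for i in range(n)}
    for i, j in edges:
        if np.linalg.norm(coords[i] - coords[j]) < max_dist:
            neighbors[i].add(j)
            neighbors[j].add(i)

    # One candidate hyperedge per residue; skip singletons and duplicates.
    hyperedges, seen = [], set()
    for i in range(n):
        group = frozenset(neighbors[i] | {i})
        if len(group) > 1 and group not in seen:
            seen.add(group)
            hyperedges.append(sorted(group))

    # Node-by-hyperedge incidence matrix.
    incidence = np.zeros((n, len(hyperedges)), dtype=int)
    for e, group in enumerate(hyperedges):
        incidence[group, e] = 1
    return hyperedges, incidence
```

With this rule a residue can belong to several hyperedges, which is what distinguishes the lifted structure from the pairwise graph; a residue with no neighbour inside `max_dist` ends up in no hyperedge.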

This pull request is submitted by the team formed by Bertran Miquel Oliver, Manel Gil Sorribes, and Alexis Molina.

@levtelyatnikov
Member

Please consider making the tests pass so that this submission is valid for the challenge.

@levtelyatnikov
Member

levtelyatnikov commented Jul 8, 2024

Hello, @bertranMiquel ! Thank you for your submission. As we near the end of the challenge, I am collecting participant info for the purpose of selecting and announcing winners. Please email me (or have one member of your team email me) at [email protected] so I can share access to the voting form. In your email, please include:

Before July 12, make sure that your submission respects all Submission Requirements laid out on the challenge page. Any submission that fails to meet these criteria will be automatically disqualified.

@bertranMiquel
Contributor Author

bertranMiquel commented Jul 9, 2024 via email

@gbg141 added the labels award-category-2 (Lifting to Combinatorial, Hypergraph or Graph Domain) and award-category-3 (Feature-based Lifting, including those that simultaneously leverage the connectivity) on Jul 9, 2024
@gbg141 added the label Winner (Awarded submission) and removed the label challenge-icml-2024 on Oct 31, 2024
3 participants