-
Okay, having read the paper again: can I build the null distribution by pooling all (right?) gene-GO pair cosine similarities from my 3 replicate random graphs, and then obtain the p-value for a particular gene-GO pair of interest under the hypothesis that its cosine similarity comes from that same null distribution? Which distribution are you using for H0, though? Normal, t?
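To make the question concrete, here is a minimal sketch of the distribution-free reading of it: pool the similarities from the random graphs and use them directly as an empirical null, with no normal or t assumption. The function name `empirical_p_value`, the add-one correction, and the toy numbers are my own assumptions for illustration; I am not claiming this is what GeneWalk does internally.

```python
def empirical_p_value(observed, null_values):
    """One-sided empirical p-value: fraction of null similarities that are
    at least as large as the observed one, with an add-one correction so
    the estimate is never exactly zero."""
    at_least_as_large = sum(1 for x in null_values if x >= observed)
    return (1 + at_least_as_large) / (1 + len(null_values))


# Toy numbers only: similarities pooled from three hypothetical random graphs.
null_similarities = [0.10, 0.25, 0.31, 0.05, 0.42, 0.18, 0.95, 0.22, 0.37]
observed_similarity = 0.90  # gene-GO pair of interest in the real graph

p = empirical_p_value(observed_similarity, null_similarities)
print(f"empirical p-value: {p:.3f}")
```

With this approach the smallest attainable p-value is 1 / (1 + number of null similarities), so the size of the pooled null set limits the resolution.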
-
Dear GeneWalk developers,
thank you for this easy-to-use tool!
In your paper you state: "Currently, only connected GO terms are considered for identification of function relevance, but we imagine that GeneWalk could be extended to predict novel gene functions because of high similarity scores between a gene and unconnected GO terms".
This is exactly what I am interested in. I haven't tried it yet, but I guess I could open the node_vectors_X.pkl files, extract the vector embedding of a gene of interest, and compare it with different GO-BP vector embeddings via cosine similarity? I could do that for all calculated GeneWalk graphs (I used the default of 3) and for the randomized graphs, and then perform a t-test with the H0 that the cosine similarities do not differ between the graphs and their randomized counterparts. Is that roughly what you imagined one could do, or is it more complicated? Do you have another suggestion for how to do it, or do you already have code in place that you are open to sharing?
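Roughly, what I have in mind would look like the sketch below. It uses toy vectors in place of the contents of the node_vectors_X.pkl files, because I have not checked how those files are laid out; the dictionary structure, the node names `GENE_A` and `GO_TERM`, and the helper names are all hypothetical. It computes the Welch t statistic and its degrees of freedom by hand; the p-value would still have to be looked up from a t distribution.

```python
import math
from statistics import mean, variance


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean a
    vb = variance(sample_b) / nb  # squared standard error of mean b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Assumption: each run maps a node name to its embedding vector (toy values).
real_runs = [
    {"GENE_A": [1.0, 0.2, 0.0], "GO_TERM": [0.9, 0.3, 0.1]},
    {"GENE_A": [0.8, 0.1, 0.3], "GO_TERM": [0.7, 0.2, 0.2]},
    {"GENE_A": [0.9, 0.4, 0.1], "GO_TERM": [1.0, 0.3, 0.3]},
]
random_runs = [
    {"GENE_A": [1.0, 0.0, 0.1], "GO_TERM": [0.1, 1.0, 0.2]},
    {"GENE_A": [0.2, 0.9, 0.0], "GO_TERM": [0.8, 0.1, 0.5]},
    {"GENE_A": [0.1, 0.3, 1.0], "GO_TERM": [0.9, 0.2, 0.3]},
]

real_sims = [cosine_similarity(r["GENE_A"], r["GO_TERM"]) for r in real_runs]
rand_sims = [cosine_similarity(r["GENE_A"], r["GO_TERM"]) for r in random_runs]

t_stat, dof = welch_t(real_sims, rand_sims)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

With only 3 replicates per group the test has very little power, which is part of why I am asking whether a different approach would be more appropriate.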