Commit aea6c06

fix: updated task definitions (#12)
* fix: updated missing requirements and fixed task 3 description
* fix: added additional references from AM
1 parent 939ad1c commit aea6c06

1 file changed: +25 -25 lines changed

docs/tasks.md

Lines changed: 25 additions & 25 deletions
@@ -9,8 +9,8 @@ hide:
 > **Date** 2024-10-14 to -15<br>
 > **Location** BioLabs Heidelberg and Online<br>

-- General software requirements: [Streamlit](https://streamlit.io/), [Ollama](https://ollama.com/)
-- General database requirements: vector database (our internal stack is preferential towards FAISS, but ChromeDB, Milvus, can be used just as well).
+- Infrastructure: [Amazon Web Services](https://aws.amazon.com/de/) and [Code Ocean](https://codeocean.com/)
+- Application requirements: [Streamlit](https://streamlit.io/), [Ollama](https://ollama.com/), [LangChain](https://www.langchain.com/), and [FAISS](https://github.com/facebookresearch/faiss)

 ## AI agents for computational modelling and simulation
 - Special software requirements: https://github.com/copasi/basico
@@ -19,47 +19,47 @@ hide:
 - Other data: [PubMed](https://pubmed.ncbi.nlm.nih.gov/) for original articles

 ### Task type 1
-- Description: Simulation of a mathematical model and reporting of the biomarker trajectories and predicted clinical efficacy
-- Input: simulation parameters such as initial concentrations
-- Output: time-course of simulation species
+- **Description**: Simulation of a mathematical model and reporting of the biomarker trajectories and predicted clinical efficacy
+- **Input**: simulation parameters such as initial concentrations
+- **Output**: time-course of simulation species

 ### Task type 2
-- Description: Creating a mathematical model from scratch
-- Input: Original article describing the mathematical model and list of equations
-- Output: SBML model with annotated species
+- **Description**: Creating a mathematical model from scratch
+- **Input**: Original article describing the mathematical model and list of equations
+- **Output**: SBML model with annotated species

 ## AI agents for omics and foundation models
 - Special software requirements: [scGPT](https://www.nature.com/articles/s41592-024-02201-0)
 - Data for analysis: [cell by gene](https://cellxgene.cziscience.com/)
 - Other tools/analyses: [differential gene set enrichment analysis using GO](https://amigo.geneontology.org/amigo), UMAP

 ### Task type 1
-- Description: Integration of multiple scRNA seq datasets, correction for batch effects, annotation of cells, and reporting of the results as a UMAP
-- Input: multiple cellxgene datasets for a particular disease (e.g., Rheumatoid Arthritis)
-- Output: UMAP visualization with cell annotation
+- **Description**: Integration of multiple scRNA seq datasets, correction for batch effects, annotation of cells, and reporting of the results as a UMAP
+- **Input**: multiple cellxgene datasets for a particular disease (e.g., Rheumatoid Arthritis)
+- **Output**: UMAP visualization with cell annotation

 ### Task type 2
-- Description: Simulation of gene perturbation and reporting of the predicted differentially expressed genes using pathway enrichment analysis
-- Input: cell x gene dataset for a particular disease, knockout gene list
-- Output: list of differentially expressed genes and pathway enrichment analysis visualization
+- **Description**: Simulation of gene perturbation and reporting of the predicted differentially expressed genes using pathway enrichment analysis
+- **Input**: cell x gene dataset for a particular disease, knockout gene list
+- **Output**: list of differentially expressed genes and pathway enrichment analysis visualization

 ## AI agent for Biomedical knowledge graph reasoning and construction
-- Special software requirements: [LLMGraphTransformer](https://api.python.langchain.com/en/latest/graph_transformers/langchain_experimental.graph_transformers.llm.LLMGraphTransformer.html) and [ULTRA](https://github.com/DeepGraphLearning/ULTRA)
-- Biomedical knowledge graph dataset: [PrimeKG](https://github.com/mims-harvard/PrimeKG) specifically the subset used in [STARK](https://github.com/snap-stanford/stark)
+- Special software requirements: [PyTorch Geometric](https://github.com/pyg-team/pytorch_geometric) and [available models through PyG](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html), [LLMGraphTransformer](https://api.python.langchain.com/en/latest/graph_transformers/langchain_experimental.graph_transformers.llm.LLMGraphTransformer.html), and schema-agnostic graph foundation model (e.g., [ULTRA](https://github.com/DeepGraphLearning/ULTRA))
+- Biomedical knowledge graph dataset: [PrimeKG](https://github.com/mims-harvard/PrimeKG) specifically the subset used in [STARK](https://github.com/snap-stanford/stark) for textual Q&A
 - Other data: [PubMed](https://pubmed.ncbi.nlm.nih.gov/) for original articles
 - Graph database: [NetworkX](https://networkx.org/)

 ### Task type 1
-- Description: Knowledge graph Q&A and retrieval of K-hop subgraph explanations
-- Input: Natural language question (see subset used in https://arxiv.org/abs/2404.13207 for PrimeKG)
-- Output: Ranked nodes answers and visualization of k-hop subgraphs
+- **Description**: Knowledge graph Q&A and retrieval of the K-hop subgraph explanations
+- **Input**: Natural language question (see subset used in https://arxiv.org/abs/2404.13207 for PrimeKG)
+- **Output**: Ranked nodes answers and visualization of k-hop subgraphs

 ### Task type 2
-- Description: Disease knowledge graph construction from text using a LLM to graph model and link prediction model to fill in gaps
-- Input: List of disease MeSH terms and associated articles from PubMed and list of nodes and edges (same as in PrimeKG)
-- Output: NetworkX representation of the knowledge graph and visualization
+- **Description**: Disease knowledge graph construction from text using a text-to-graph model to construct the initial knowledge graph and a link prediction model to fill in gaps in the reconstructed knowledge graph
+- **Input**: List of disease MeSH terms and associated articles from PubMed and list of nodes and edges (same as in PrimeKG)
+- **Output**: NetworkX representation of the knowledge graph and visualization

 ### Task type 3
-- Description: Same as type 1 but including protein embeddings from https://www.uniprot.org/help/embeddings and additional vector similarity search of drug targets embeddings
-- Input: Natural language question (see subset used in https://arxiv.org/abs/2404.13207 for PrimeKG)
-- Output: Ranked nodes answers and visualization of k-hop subgraphs
+- **Description**: Same as type 1 but including protein embeddings from https://www.uniprot.org/help/embeddings and additional vector similarity search of drug targets embeddings
+- **Input**: Natural language question (see subset used in https://arxiv.org/abs/2404.13207 for PrimeKG)
+- **Output**: Ranked nodes answers and visualization of k-hop subgraphs
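
The knowledge graph tasks above ask for "k-hop subgraph" retrieval around an answer node. As a rough illustration of what that means, here is a minimal sketch using only the Python standard library. The function name `k_hop_subgraph`, the edge-list representation, and the toy node names are assumptions made for this example; they are not taken from the commit, PrimeKG, or any of the listed libraries.

```python
from collections import deque


def k_hop_subgraph(edges, seed, k):
    """Return (nodes, edges) reachable within k hops of `seed`.

    Hypothetical helper for illustration: the graph is treated as
    undirected and given as a list of (source, target) pairs.
    """
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Breadth-first search that stops expanding at depth k.
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    # Keep only edges whose endpoints both lie inside the k-hop ball.
    nodes = set(distance)
    sub_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return nodes, sub_edges


# Toy, made-up graph; node names are placeholders, not PrimeKG entries.
toy_edges = [
    ("drug_A", "protein_X"),
    ("protein_X", "disease_D"),
    ("disease_D", "phenotype_P"),
    ("drug_B", "protein_Y"),
]
nodes_1, edges_1 = k_hop_subgraph(toy_edges, "drug_A", 1)
nodes_2, edges_2 = k_hop_subgraph(toy_edges, "drug_A", 2)
```

Since the task list names NetworkX as the graph store, the same idea could likely be expressed with `networkx.ego_graph` and its `radius` argument; the hand-written search is used here only to keep the sketch dependency-free.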

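
Task type 3 adds a "vector similarity search" over drug target embeddings. The sketch below shows the underlying idea as a brute-force cosine-similarity ranking in plain Python. It is a stand-in for illustration only and does not use the FAISS API listed in the application requirements; the function names, the three-dimensional vectors, and the `target_*` labels are all made up for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, embeddings, k):
    """Rank named embeddings by cosine similarity to `query`, best first.

    Hypothetical helper: `embeddings` maps a label to its vector.
    """
    scored = [
        (name, cosine_similarity(query, vector))
        for name, vector in embeddings.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up toy embeddings; real protein embeddings have far more dimensions.
toy_embeddings = {
    "target_1": [1.0, 0.0, 0.0],
    "target_2": [0.9, 0.1, 0.0],
    "target_3": [0.0, 1.0, 0.0],
}
ranked = top_k_similar([1.0, 0.0, 0.0], toy_embeddings, 2)
```

A dedicated vector index would replace the exhaustive loop once the number of embeddings grows; the ranking logic it approximates is the same.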