Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ecMouseGEM #84

Draft
wants to merge 34 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
666739a
feat: reuse scripts from ecHumanGEM
mihai-sysbio Feb 15, 2021
08be1e6
style: update reference to Human-GEM
mihai-sysbio Feb 15, 2021
ff58930
feat: a first attempt towards ecMouseGEM
mihai-sysbio Feb 15, 2021
bf9cc4a
fix: id for the carbon exchange reaciton
mihai-sysbio Feb 15, 2021
62ec7cf
fix: set up empty model folder
mihai-sysbio Feb 16, 2021
5bed492
fix: formatting in PR body
mihai-sysbio Feb 16, 2021
1e79be2
chore: update protdatabase file (swissprot entries)
IVANDOMENZAIN Feb 17, 2021
e98e352
chore: update protDatabase file with KEGG info for M. musculus
IVANDOMENZAIN Feb 17, 2021
dd43816
feat: try out specific Mouse-GEM branch
mihai-sysbio Mar 23, 2021
a2361c4
fix: add a temporary model version
IVANDOMENZAIN Mar 24, 2021
32d04b6
fix: ignore multiple protein matches for a given gene in getEnzymeCodes
IVANDOMENZAIN Mar 25, 2021
27c6228
fix: bug in DB assignment to multiple conflicting proteins cell
IVANDOMENZAIN Mar 25, 2021
6ebc216
fix: avoid conflicting grRules in preprocessModel
IVANDOMENZAIN Mar 25, 2021
47fbc33
fix: correct b vector
IVANDOMENZAIN Mar 26, 2021
6727f77
fix: correct cSource uptake reaction name
IVANDOMENZAIN Mar 26, 2021
0f5c130
fix: remove getConstrained model step
IVANDOMENZAIN Mar 26, 2021
d0e5c3a
fix:generate non-empty ecModel_batch structure
IVANDOMENZAIN Mar 26, 2021
7029936
fix: skip SBML saving steps
IVANDOMENZAIN Mar 26, 2021
a5b49f3
fix: non-empty ecModel_batch
IVANDOMENZAIN Mar 26, 2021
0b7f14e
fix: change target branch for PRs
IVANDOMENZAIN Apr 1, 2021
eb46777
feat: crash the pipeline if PR creation fails
mihai-sysbio Apr 2, 2021
a6d20d1
fix: pr message in hub pull request
IVANDOMENZAIN Apr 2, 2021
4fdabbf
fix: correct default ub in GEM
IVANDOMENZAIN Apr 2, 2021
cc7a0be
fix: enable generation of ecModel_batch
IVANDOMENZAIN Apr 2, 2021
c104e03
fix: update calibration parameters ecMouseGEM
IVANDOMENZAIN Apr 5, 2021
f047680
fix: skip SBML saving steps
IVANDOMENZAIN Apr 5, 2021
7a911c9
fix: enable generation of SBML files
IVANDOMENZAIN Apr 6, 2021
2306ec6
fix: enable all existing GEMs
mihai-sysbio Jul 29, 2021
b95d491
feat: promote Mouse-GEM to a typical gh setup
mihai-sysbio Jul 29, 2021
4079b0d
feat: bump up the GECKO version
mihai-sysbio Jul 29, 2021
2923b5f
fix: use main as default branch
mihai-sysbio Jul 29, 2021
9680f75
chore: update path for yeast-GEM model
mihai-sysbio Jul 29, 2021
dcb8e10
Merge branch 'master' into feat/ecMouseGEM
mihai-sysbio Jul 29, 2021
f27cd91
fix: use correct new_version variable
mihai-sysbio Jul 30, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 12 additions & 2 deletions config.ini
Original file line number Diff line number Diff line change
Expand Up @@ -28,15 +28,15 @@ install_dir = gurobi/

[GECKO]
url = https://github.com/SysBioChalmers/GECKO
version = v2.0.1
version = v2.0.2
install_dir = GECKO/
branch = master

[ecYeastGEM]
type = 'gem'
url = https://github.com/SysBioChalmers/yeast-GEM
download_url =
model_filename = /ModelFiles/mat/yeastGEM.mat
model_filename = /model/yeast-GEM.mat
mat_model = .model
install_dir = yeast-GEM
version = v8.3.1
Expand Down Expand Up @@ -81,3 +81,13 @@ mat_model = .ihuman
install_dir = Human-GEM
version = v0
database_tag = hsa

[ecMouseGEM]
type = 'gem'
url = https://github.com/SysBioChalmers/Mouse-GEM
download_url =
model_filename = /model/Mouse-GEM.mat
mat_model = .mouseGEM
install_dir = Mouse-GEM
version = v1.1.0
database_tag = mmu
4 changes: 2 additions & 2 deletions ecHumanGEM/scripts/changeMedia_batch.m
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
function exchModel = changeMedia_batch(model,cSource,irrev,measuredMets,fluxes)
% changeMedia_batch
%
% Set a Hams culture medium for a humanGEM based model. This function works
% Set a Hams culture medium for a Human-GEM based model. This function works
% for either standard or enzyme constrained-GEMs
%
% model An ihuman-based GEM
% model An Human-GEM based model
% irrev (logical) TRUE if model comes in an irreversible format.
% Default = false
% measuredMets (string) metNames for measured compounds. Optional
Expand Down
Empty file added ecMouseGEM/model/.keep
Empty file.
Binary file added ecMouseGEM/scripts/ProtDatabase.mat
Binary file not shown.
159 changes: 159 additions & 0 deletions ecMouseGEM/scripts/changeMedia_batch.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,159 @@
function exchModel = changeMedia_batch(model,cSource,irrev,measuredMets,fluxes)
% changeMedia_batch
%
% Set a Hams culture medium for a Human-GEM based model. This function works
% for either standard or enzyme constrained-GEMs
%
% model An Human-GEM based model
% irrev (logical) TRUE if model comes in an irreversible format.
% Default = false
% measuredMets (string) metNames for measured compounds. Optional
% fluxes (vector) Measured fluxes [mmol/gDw h]. Optional
% NOTE: NEGATIVE = CONSUME
% POSITIVE = PRODUCE
%
% exchModel (struct) Model with Ham's media constraints
%
% Ivan Domenzain. Last edited: 2020-11-20

if nargin<5
fluxes = [];
if nargin<4
measuredMets = [];
if nargin<3 || isempty(irrev)
irrev = true;
end
end
end
%Remove unconstrained field, if available
if isfield(model,'unconstrained')
model = rmfield(model,'unconstrained');
end
%Check if boundary metabolites are present, if so then remove them
boundaryIndx = find(strcmpi(model.compNames,'Boundary'));
if ~isempty(boundaryIndx)
boundary = find(model.metComps==boundaryIndx);
model = removeMets(model,boundary,false,false,false,true);
end
%Ham's media composition
mediaComps ={'glucose'
'arginine'
'histidine'
'lysine'
'methionine'
'phenylalanine'
'tryptophan'
'tyrosine'
'alanine'
'glycine'
'serine'
'threonine'
'aspartate'
'glutamate'
'asparagine'
'glutamine'
'isoleucine'
'leucine'
'proline'
'valine'
'cysteine'
'thiamin'
'hypoxanthine'
'folate'
'biotin'
'pantothenate'
'choline'
'inositol'
'nicotinamide'
'pyridoxine'
'riboflavin'
'thymidine'
'aquacob(III)alamin'
'lipoic acid'
'sulfate'
'linoleate'
'linolenate'
'O2'
'H2O'
'retinoate'
'Fe2+'
'Pi'
'alpha-tocopherol'
'gamma-tocopherol'};

%Default flux bounds [LB, UB]
fluxBounds = [-ones(length(mediaComps),1), ones(length(mediaComps),1)]*1000;
%Check if provided mets are part of media's formulation
if ~isempty(measuredMets)
%Modify fluxBounds with the correspondent provided flux measurements
[iA,iB] = ismember(measuredMets,mediaComps);
fluxBounds(iB(iA),:) = fluxes(iA).*ones(sum(iA),2); % force flux to be equal to measured value (LB = UB = measured val)
if any(~iA)
%If measured mets are not in media formulation, then add them
mediaComps = [mediaComps; measuredMets(~iA)];
fluxBounds = [fluxBounds; fluxes(~iA).*ones(sum(~iA),2)];
end
end

if ~irrev
modelStr = 'model';
%Set uptake fluxes for media mets
[exchModel,unusedMets] = setExchangeBounds(model,mediaComps,fluxBounds(:,1),fluxBounds(:,2),true);
else
modelStr = 'ecModel';
[exchRxns,exchIndxs] = getExchangeRxns(model);
%Exclude protein pool exchange
exchIndxs = exchIndxs(1:end-1);
exchRxns = exchRxns(1:end-1);
%Differentiate between uptakes and production reactions
uptkIndxs = exchIndxs(find(contains(exchRxns,'_REV')));
prodIndxs = exchIndxs(find(~contains(exchRxns,'_REV')));
%Open all production reactions
exchModel = setParam(model,'ub',prodIndxs,1000);
%close all uptakes
exchModel = setParam(exchModel,'ub',uptkIndxs,0);
%Open uptake of media components one by one
unusedMets = [];
for i=1:length(mediaComps)
%Get metabolite indx
metIndx = getIndexes(model,strcat(mediaComps{i},'[s]'),'metcomps');
%Get rxns for metabolite
metRxns = find(model.S(metIndx,:));
%Get the uptake reaction for the metabolite
metExchRxn = intersect(metRxns,uptkIndxs);
if isempty(metExchRxn)
unusedMets = [unusedMets; mediaComps(i)];
else
%Open it!
if fluxBounds(i,1) < 0
% if the metabolite is being consumed
exchModel.ub(metExchRxn) = abs(fluxBounds(i,1));
else
exchModel.ub(metExchRxn) = 0;
end
%If the metabolite was measured, also set its production rate
metExchRxn = intersect(metRxns,prodIndxs);
if fluxBounds(i,2) > 0
exchModel.ub(metExchRxn) = abs(fluxBounds(i,2));
else
exchModel.ub(metExchRxn) = 0;
end
end
end
end

% report unused metabolites
if ~isempty(unusedMets)
fprintf('WARNING: The following metabolites are either not in the model or do not have exchange reactions:\n');
fprintf('\t%s\n',unusedMets{:});
end

%Check if model is feasible
sol = solveLP(exchModel);
if ~isempty(sol.x)
disp(['Constrained ' modelStr ' is feasible'])
else
disp(['Constrained ' modelStr ' is unfeasible'])
exchModel = [];
end
end
152 changes: 152 additions & 0 deletions ecMouseGEM/scripts/constrainEnzymes.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
function [model,enzUsages,modifications,GAM,massCoverage] = constrainEnzymes(model,f,GAM,Ptot,pIDs,data,gRate)
% constrainEnzymes
%
% Main function for overlaying proteomics data on an enzyme-constrained
% model. If chosen, also scales the protein content, optimizes GAM, and
% flexibilizes the proteomics data.
%
% model ecModel.
% f (Opt) Estimated mass fraction of enzymes in model.
% GAM (Opt) Growth-associated maintenance value. If not
% provided, it will be fitted to chemostat data.
% Ptot (Opt) Total protein content, provide if desired content
% is different from the one reported in getModelParameters [gProt/gDw]
% pIDs (Opt) Protein IDs from proteomics data.
% data (Opt) Protein abundances from proteomics data [mmol/gDW].
% gRate (Opt) Experimental growth rate at which the proteomics
% data were obtained [1/h]
%
% model ecModel with calibrated enzyme usage upper bounds
% enzUsages Calculated enzyme usages after final calibration
% (enzyme_i demand/enzyme_i upper bound)
% modifications Table with all the modified values
% (Protein ID/old value/Flexibilized value)
% GAM Fitted GAM value for the ecModel
% massCoverage Ratio between measured and total mass of protein in the model
%
% Usage: [model,enzUsages,modifications, GAM,massCoverage] = constrainEnzymes(model,f,GAM,Ptot,pIDs,data,gRate,c_UptakeExp)
%
% Benjamin J. Sanchez. Last update 2018-12-11
% Ivan Domenzain. Last update 2020-03-02
%

%get model parameters
cd ..
parameters = getModelParameters;
sigma = parameters.sigma;
c_source = parameters.c_source;
%Constrain glucose uptake rate if experimental value is available
if isfield(parameters,'c_UptakeExp')
model = setParam(model,'ub',[c_source '_REV'],parameters.c_UptakeExp);
end
cd limit_proteins
%Compute f if not provided:
if nargin < 2
[f,~] = measureAbundance(model.enzymes);
else
if isempty(f)
[f,~] = measureAbundance(model.enzymes);
end
end
%Leave GAM empty if not provided (will be fitted later):
if nargin < 3
GAM = [];
end
%Load Ptot if not provided:
if nargin < 4
Ptot = parameters.Ptot;
end
%No UB will be changed if no data is available -> pool = all enzymes(FBAwMC)
if nargin < 5
pIDs = cell(0,1);
data = zeros(0,1);
end
%Remove zeros or negative values
data = cleanDataset(data);
%Assign concentrations as UBs [mmol/gDW]:
model.concs = nan(size(model.enzymes)); %OBS: min value is zero!!
fprintf('Matching data to enzymes in model...')
for i = 1:length(model.enzymes)
match = false;
for j = 1:length(pIDs)
if strcmpi(pIDs{j},model.enzymes{i}) && ~match
model.concs(i) = data(j)*model.MWs(i); %g/gDW
rxn_name = ['prot_' model.enzymes{i} '_exchange'];
pos = strcmpi(rxn_name,model.rxns);
model.ub(pos) = data(j);
match = true;
end
end
end
%Count mass of non-measured enzymes:
measured = ~isnan(model.concs);
concs_measured = model.concs(measured);
Pmeasured = sum(concs_measured);
%Get protein content in biomass pseudoreaction:
Pbase = Ptot;
if Pmeasured > 0
%Calculate fraction of non measured proteins in model out of remaining mass:
[fn,~] = measureAbundance(model.enzymes(~measured));
fm = Pmeasured/Ptot;
f = fn/(1-fm);
%Discount measured mass from global constrain:
fs = (Ptot - Pmeasured)/Pbase*f*sigma;
else
fs = f*sigma;
end
%Constrain the rest of enzymes with the pool assumption:
if sum(strcmp(model.rxns,'prot_pool_exchange')) == 0
model = constrainPool(model,~measured,full(fs*Pbase));
end
fprintf(' Done!\n')
%Display some metrics:
disp(['Total protein amount measured = ' num2str(Pmeasured) ' g/gDW'])
disp(['Total enzymes measured = ' num2str(sum(measured)) ' enzymes'])
disp(['Enzymes in model with 0 g/gDW = ' num2str(sum(concs_measured==0)) ' enzymes'])
disp(['Total protein amount not measured = ' num2str(Ptot - Pmeasured) ' g/gDW'])
disp(['Total enzymes not measured = ' num2str(sum(~measured)) ' enzymes'])
disp(['Total protein in model = ' num2str(Ptot) ' g/gDW'])
enzUsages = [];
if nargin > 7
[tempModel,enzUsages,modifications] = flexibilizeProteins(model,gRate,c_UptakeExp,c_source);
Pmeasured = sum(tempModel.concs(~isnan(tempModel.concs)));
model = updateProtPool(tempModel,Ptot,f*sigma);
end
massCoverage = Pmeasured/Ptot;
if isempty(enzUsages)
enzUsages = table({},zeros(0,1),'VariableNames',{'prot_IDs' 'usage'});
modifications = table({},zeros(0,1),zeros(0,1),zeros(0,1),'VariableNames',{'protein_IDs' 'previous_values' 'modified_values' 'flex_mass'});
else
plotHistogram(enzUsages.usage,'Enzyme usage [-]',[0,1],'Enzyme usages','usages')
end
%Plot histogram (if there are measurements):
plotHistogram(concs_measured,'Protein amount [mg/gDW]',[1e-3,1e3],'Modelled Protein abundances','abundances')
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function plotHistogram(variable,xlabelStr,xlimits,titleStr,option)
if iscell(variable)
cell2mat(variable);
end
if sum(variable) > 0
variable(variable==0) = 1E-15;
figure
if strcmpi(option,'abundances')
hist(variable*1e3,10.^(-3:0.5:3))
set(gca,'xscale','log')
else
hist(variable,(0:0.05:1))
end
xlim(xlimits)
xlabel(xlabelStr)
ylabel('Frequency');
title(titleStr)
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function data = cleanDataset(data)
for i=1:length(data)
if data(i)<=0
data(i) = NaN;
end
end
end
Loading