NAME
ElMo-Comp - Elementary Mode Computation
CONTENTS
Description
Commands And Options
ElMo-Comp input file format
ElMo-Comp output file format
Examples
Divide-and-conquer
Browsing EFMs
Yield computation
Notes on elementary flux mode computation
See also
License and citation
History
DESCRIPTION
ElMo-Comp is an implementation of the Nullspace Algorithm for computing the elementary flux modes (EFMs) of a given metabolic network of biochemical reactions. The software supports both serial and parallel execution. Parallelization is available as distributed-memory (MPI), shared-memory (OpenMP) or hybrid.
As input, the program accepts a description of a metabolic network in the human-readable format used by the earlier METATOOL (v5.1) software, which implements an analogous Nullspace Algorithm.
COMMANDS AND OPTIONS
elmocomp [options]
OPTIONS:
-b / --bit_efm_file <str>        bit-valued EFM output file name [no]
-c / --compress <str>            use reduced-memory execution (slower!) [no]
-e / --expand_reversible <str>   request split of all reversible reactions into irreversibles [yes]
-f / --file <str>                input file name (METATOOL format)
-i / --partition_id <int>        partition number in divide-and-conquer [0]
-j / --integer_nullspace <str>   enforce use of integer nullspace basis when possible [no]
-n / --num_threads <int>         number of threads [1]
-o / --output_file <str>         real-valued EFM output file name [no]
-p / --partition_size <int>      size of reaction partition subset in divide-and-conquer [0]
-r / --rank_test <str>           algebraic rank test [LU]
-u / --show_mem <str>            show memory usage [no]
-v / --verify <str>              validation of EFM after computing [no]
-x / --calc_expa <str>           calculate by processing only irreversible reactions [no]
-y / --comp_yield <str>          name of input file with substrate/target reactions for yield computation []
ELMO-COMP INPUT FILE FORMAT
The input file for ElMo-Comp is given in the METATOOL input file format. Examples of input files are provided in the 'data' folder of the software package available for download. The user may also request the computation of the yields of target reactions with respect to substrates. This is done with the '-y' option, followed by the name of a file containing the names of the substrate and target reactions.
ELMO-COMP OUTPUT FILE FORMAT
ElMo-Comp can output elementary flux modes as (1) a bit-valued matrix (a 64-fold compression relative to 64-bit floating-point values) and (2) a floating-point real-valued matrix. Storing elementary flux modes in a bit-valued matrix can significantly reduce memory and disk usage for large networks, and allows faster processing of modes when only the zero/non-zero pattern is required.
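The idea behind the bit-valued format can be illustrated with a minimal Python sketch. It packs the zero/non-zero pattern of one mode into bytes, one bit per reaction. The function names, the tolerance and the bit order are assumptions made for illustration only; the actual on-disk layout used by ElMo-Comp is not described here and may differ.

```python
def pack_support(mode, tol=1e-10):
    """Pack the zero/non-zero pattern of a flux vector, one bit per reaction.

    The bit order (least significant bit first) is an assumption of this
    sketch, not necessarily the layout written by ElMo-Comp.
    """
    packed = bytearray((len(mode) + 7) // 8)   # round up to whole bytes
    for j, flux in enumerate(mode):
        if abs(flux) > tol:                    # reaction j carries flux
            packed[j // 8] |= 1 << (j % 8)
    return bytes(packed)


def unpack_support(packed, n_reactions):
    """Recover the boolean zero/non-zero pattern from the packed bytes."""
    return [bool((packed[j // 8] >> (j % 8)) & 1) for j in range(n_reactions)]


mode = [0.0, 1.5, 0.0, -2.0, 0.0, 0.0, 0.0, 0.0, 3.0]
packed = pack_support(mode)
print(len(packed), unpack_support(packed, len(mode)))
```

Note that the flux values themselves are lost in this representation; only the set of active reactions is kept.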
EXAMPLES
Here we run the software on a metabolic network from the 'data' folder (data/iCT_Ecoli_59.txt), which has 47 metabolites and 59 reactions. The following command is run:
$ ./elmocomp -f data/iCT_Ecoli_59.txt
Alternatively, one may request the output to be saved in both bit-compressed and real-valued format, and call the program as:
$ ./elmocomp -f data/iCT_Ecoli_59.txt -b efm59.bin -o efm59.txt
Option '-b' generates the bit-valued EFM file efm59.bin and a meta-file efm59.bin.meta, which contains information about the stoichiometry matrix. The user may choose to save the output in only one of the available formats by omitting the other option from the command line, such as:
$ ./elmocomp -f data/iCT_Ecoli_59.txt -b efm59.bin
or
$ ./elmocomp -f data/iCT_Ecoli_59.txt -o efm59.txt
The bit-valued EFM output file may be navigated and explored from Octave/MATLAB using the scripts in the 'browse' folder, or with the post-processing tool ElMo-Proc.
The program 'elmocomp' parses the input file into several auxiliary files that can be easily read and processed (e.g. in MATLAB/Octave). The files stoich.elmo, reactions.elmo, metabolites.elmo and reversible.elmo contain the stoichiometry matrix, reactions, metabolites and reversibility information, respectively.
By default, the software splits reversible reactions into irreversible ones, so that the entire network consists of irreversible reactions. To keep the original network unsplit, use the option '-e no' as:
$ ./elmocomp -f data/iCT_Ecoli_59.txt -e no
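The effect of splitting can be sketched in a few lines of Python. Each reversible reaction keeps its original column as the forward direction and gains a negated copy as the backward direction. The function name and the choice to append backward columns at the end of the matrix are assumptions of this sketch; the column order used internally by ElMo-Comp may differ.

```python
def split_reversible(stoich, reversible):
    """Expand a stoichiometry matrix so that every column is irreversible.

    stoich     : list of rows (metabolites x reactions)
    reversible : list of booleans, one per reaction
    Returns the expanded matrix and, for each new column, the index of the
    original reaction it came from.
    """
    expanded = [list(row) for row in stoich]
    origin = list(range(len(reversible)))
    for j, is_reversible in enumerate(reversible):
        if is_reversible:
            # Backward direction: same reaction with all coefficients negated.
            for new_row, row in zip(expanded, stoich):
                new_row.append(-row[j])
            origin.append(j)
    return expanded, origin


stoich = [[1, -1, 0],
          [0, 1, -1]]
reversible = [False, True, False]
expanded, origin = split_reversible(stoich, reversible)
print(expanded)
print(origin)
```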
DIVIDE-AND-CONQUER
Divide-and-conquer splits the computation into subproblems and computes the EFMs of each subproblem separately. The user may specify 2^P partitions (a power of 2) using the command-line options '-p' and '-i'. Option '-p' specifies P, the number of reactions used for splitting, while option '-i' specifies the partition identifier (0...2^P-1). For example, the problem used above may be split into 4 (2^2) partitions:
$ ./elmocomp -f data/iCT_Ecoli_59.txt -p 2 -i 0 -b efm59.bin
$ ./elmocomp -f data/iCT_Ecoli_59.txt -p 2 -i 1 -b efm59.bin
$ ./elmocomp -f data/iCT_Ecoli_59.txt -p 2 -i 2 -b efm59.bin
$ ./elmocomp -f data/iCT_Ecoli_59.txt -p 2 -i 3 -b efm59.bin
Results will be available in files efm59.bin.00, efm59.bin.01, efm59.bin.10, efm59.bin.11.
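When many partitions are used, the list of expected result files can be generated with a short script. The Python sketch below assumes that the file suffix is the P-digit binary representation of the partition identifier, which is consistent with the four file names listed above but is not otherwise confirmed here; the function names are hypothetical.

```python
def partition_suffix(partition_id, partition_size):
    """Binary suffix for a partition, e.g. id 2 with P = 2 gives '10'.

    Assumption: the suffix is the partition id written with P binary digits.
    """
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    if not 0 <= partition_id < 2 ** partition_size:
        raise ValueError("partition_id must lie in 0 ... 2^P - 1")
    return format(partition_id, "0{}b".format(partition_size))


def partition_files(base, partition_size):
    """Names of all 2^P result files for a given output base name."""
    return ["{}.{}".format(base, partition_suffix(i, partition_size))
            for i in range(2 ** partition_size)]


print(partition_files("efm59.bin", 2))
```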
BROWSING EFMs
To load the full floating-point EFM output file efm59.txt and the stoichiometry matrix from the example above, the following MATLAB commands suffice:
load stoich.elmo;
load efm59.txt;
efm59=efm59';
To check that the mass-balance requirement is satisfied, evaluate the following expression, which should return zero or a value close to zero:
max(max(abs(stoich*efm59)))
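For readers without MATLAB or Octave, the same mass-balance test can be sketched in plain Python. The toy network and the function name below are illustrative assumptions and are not part of the ElMo-Comp package; each mode is given as a list of fluxes, one per reaction.

```python
def max_abs_residual(stoich, modes):
    """Largest |(S v)_i| over all metabolites i and all modes v.

    stoich : list of rows (metabolites x reactions)
    modes  : list of flux vectors, one entry per reaction
    """
    worst = 0.0
    for v in modes:
        for row in stoich:
            residual = sum(s * x for s, x in zip(row, v))
            worst = max(worst, abs(residual))
    return worst


# Toy linear pathway: -> A -> B ->
stoich = [[1, -1, 0],
          [0, 1, -1]]
modes = [[1.0, 1.0, 1.0],
         [2.0, 2.0, 2.0]]
print(max_abs_residual(stoich, modes))
```

A result of zero, or a value on the order of floating-point round-off, indicates that every mode satisfies the steady-state condition.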
To load and browse the bit-compressed EFM output file efm59.bin and its associated meta-file efm59.bin.meta, the directory 'browse' provides two MATLAB scripts, load_bit_ems.m and read_bit_ems.m, for processing the modes.
To load the bit-compressed EFM matrix, call the following command from MATLAB:
results=load_bit_ems('efm59.bin','efm59.bin.meta');
The variable results will contain several fields, such as:
results =
stoich: [47x59 double]
num_ems: 44354
bit_ems: [354832x1 double]
bytes_per_em: 8
To read one or more elementary modes, pass a structure named opts, whose field em_no contains an array of mode indices, to the function read_bit_ems:
opts.em_no=[1:100];
results=read_bit_ems(results,opts);
This expands the first hundred elementary modes and stores them in the field results.ems.
YIELD COMPUTATION
To request the computation of yields for target reactions, add the '-y' option followed by the name of a file containing the substrate and target reactions. For the 59-reaction example used here, a sample file may look as follows:
#comp_yields.elmo
substrates: GG1
targets: TRA1 BIO
A line starting with the # character is a comment; only the lists of substrate and target reactions are relevant. The user may list multiple substrate and target reactions, in which case, for each elementary mode, the yield of each target is computed as the ratio of the target reaction flux to the sum of the fluxes of all listed substrate reactions.
Sample command call is given as:
$ ./elmocomp -f data/iCT_Ecoli_59.txt -b efm59.bin -y comp_yields.elmo
The output is saved in the file yields.txt; this file name cannot be changed from within the program. The file contains the yield matrix of dimensions (# of EFMs) x (# of target reactions).
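The yield definition above can be illustrated with a minimal Python sketch that parses a specification in the format shown and computes the yield matrix for a few made-up modes. The function names, the extra reaction R4, the flux values and the handling of modes with zero substrate uptake (reported here as NaN) are all assumptions of this sketch and may differ from what ElMo-Comp does.

```python
def parse_yield_spec(text):
    """Parse 'substrates:' and 'targets:' lines; lines starting with # are comments."""
    spec = {"substrates": [], "targets": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, names = line.partition(":")
        if key.strip() in spec:
            spec[key.strip()] = names.split()
    return spec


def compute_yields(modes, reactions, substrates, targets):
    """Yield matrix: one row per mode, one column per target reaction."""
    index = {name: j for j, name in enumerate(reactions)}
    rows = []
    for v in modes:
        uptake = sum(v[index[s]] for s in substrates)
        # Assumption: a mode with zero total substrate flux has no defined yield.
        rows.append([v[index[t]] / uptake if uptake != 0 else float("nan")
                     for t in targets])
    return rows


spec = parse_yield_spec("#comp_yields.elmo\nsubstrates: GG1\ntargets: TRA1 BIO\n")
reactions = ["GG1", "TRA1", "BIO", "R4"]          # R4 is a hypothetical extra reaction
modes = [[2.0, 1.0, 0.5, 0.0],                    # illustrative flux values only
         [0.0, 1.0, 0.0, 3.0]]
yields = compute_yields(modes, reactions, spec["substrates"], spec["targets"])
print(yields)
```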
NOTES ON ELEMENTARY FLUX MODE COMPUTATION
Metabolic networks, pathways and elementary flux modes
The software published at this website computes elementary flux modes, a class of metabolic pathways that satisfy mass-balance and thermodynamic constraints as well as genetic independence (e.g. reaction minimality). The network is assumed to operate in quasi-steady state.
Nullspace Algorithm
The Nullspace Algorithm is derived from the Double Description Method, which is used to compute the extreme rays of a polyhedral cone. Its computational complexity is still an open problem.
Memory Requirements
SEE ALSO
EFMTools
METATOOL v5.1
AUTHORS
D.J. and D.B. (University of Minnesota, Twin Cities) implemented the software in C++, using some of the routines from the EFMTools and METATOOL v5.1 implementations of the Nullspace Algorithm.
LICENSE AND CITATION
The full ElMo-Comp package is distributed under GPL.
The inner workings of this software are described in the following papers, also listed in the section Publications and Datasets:
D. Jevremovic, D. Boley, C.P. Sosa, "Divide-and-conquer approach to the parallel computation of elementary flux modes in metabolic networks", 10th IEEE Workshop on High Performance Computational Biology, 2011
D. Jevremovic, C. Trinh, F. Srienc, D. Boley, "Parallelization of the Nullspace Algorithm for the Computation of Metabolic Pathways", Parallel Computing, vol. 37, no. 6-7 (2011), pp. 261-278
D. Jevremovic, C. Trinh, F. Srienc, D. Boley, "A Simple Rank Test to Distinguish Extreme Pathways from Elementary Modes in Metabolic Networks", technical report 08-029, 2008
HISTORY
ElMo-Comp implements a version of the Nullspace Algorithm based on rank-test evaluation of pruned candidate modes. I started writing it in Winter 2008, when an efficient and scalable parallelization was needed because METATOOL, popular at the time, could not be used for this purpose. The algorithmic content is similar to that implemented in METATOOL 5.1 and described in the associated papers, with improvements in memory access, optimization and the use of a reduced-rank test, which increased efficiency. In May 2010, I implemented routines for iterative compression of the metabolic network, as was done in the EFMTools software. While EFMTools comes with shared-memory parallelization, ElMo-Comp implements distributed-memory parallelization with good scalability and a low imbalance rate.
Last change: August 2012