mcl(1)                    USER COMMANDS                    mcl(1)



  NAME
      mcl  - The Markov Cluster Algorithm, aka the MCL algorithm.

  SYNOPSIS
      mcl <-|fname> [-I f (inflation)] [-o str (fname)]  [-scheme
      k (resource scheme)]

      These  options are sufficient in 95 percent of the cases or
      more. The first argument must be the name of  a  file  con-
      taining a graph/matrix in the mcl input format, or a hyphen
      to read from STDIN. With respect to clustering, only the -I
      option  and  -scheme  option  are  relevant and the rest is
      immaterial.

      As of the first 1.002 release, mcl will accept a very  gen-
      eral input format.  Graph indices no longer need be sequen-
      tial; you can use any set of indices you like, as  long  as
      they  are  in  a suitable range. Refer to the mcxio(5) sec-
      tion, and use mcl -z to find the range from  which  indices
      can be picked.

      As of the first 1.003 release, clmformat enables you to map
      a clustering onto a format better  suited  for  inspection,
       using  an  index file (the so-called tab file) mapping mcl
      indices onto descriptive labels.  Read the clmformat manual
      for  more  information - it is simple to use and the manual
      is small.

      A mechanism for pipelines is supported  (as  of  the  first
      1.003  release).   Refer  to the PIPELINES section for more
      information.  A prepackaged pipeline for BLAST data is pre-
      sent in the form of mclblastline.

      The  full  listing of mcl options is shown below, separated
      into parts corresponding with functional  aspects  such  as
      clustering, threading, verbosity, pruning and resource man-
      agement, automatic output naming, and dumping.  The -scheme
      parameter  provides  a  single  access point to the pruning
      options, and should  be  sufficient  in  most  cases.   mcl
      allows comprehensive tuning and access to its internals for
      those who are interested, so it has many options.

      Baseline clustering options
      [-I f (inflation)] [-o str (fname)]  [-scheme  k  (resource
      scheme)]

      Additional clustering options
      [-l  n  (initial  iteration  number)] [-L n (main iteration
      number)] [-i f (initial inflation)]  [-do  str  (carry  out
      action)]

      Input manipulation options
      [-c f (centering)] [-pi f (pre-inflation)] [-pp n (preprune
      count)] [-in-gq f (filter threshold)]

      Verbosity options
      [-v str (verbosity type on)] [-V str (verbosity type  off)]
      [--silent  (very)] [--verbose (very)] [-progress k (gauge)]
      [--show (print (small) matrices to screen)]

      Thread options
      [-te k (#expansion threads)] [-ti k  (#inflation  threads)]
      [-t k (#threads)] [--clone (when threading (experimental))]
      [-cloneat n (trigger)]

      Output file name options
      [-o str (fname)] [-ap str (use str as  file  name  prefix)]
      [-aa  str  (append  str  to suffix)] [-az (show output file
      name and exit)]

      Dump options
      [-di i:j (dump interval)] [-dm k (dump modulo)]  [-ds  stem
      (dump  file  stem)] [-dump str (type)] [-digits n (printing
      precision)]

      Output options
       [-sort str (sort mode)] [--expand-only (factor out computation)]
       [--inflate-first (rather than expand)]

      Info options
      [--jury-charter (explains jury)] [--version (show version)]
      [-how-much-ram k (RAM upper  bound)]  [-h  (most  important
      options)]  [-z  (show  current settings)] [-az (show output
      file  name  and  exit)]  [--show-schemes   (show   resource
      schemes)]

      Pruning options
      The  following  options  all pertain to the various pruning
      strategies that can be employed by mcl. They are  described
      in  the  PRUNING OPTIONS section, accompanied by a descrip-
      tion of the mcl pruning strategy.  If your graphs are  huge
      and  you  have  an  appetite for tuning, have a look at the
      following:

      [-p f (cutoff)] [-P n (1/cutoff)] [-S n (selection number)]
      [-R  n  (recovery  number)]  [-pct  f (recover percentage)]
      [-warn-pct  n  (prune  warn  percentage)]  [-warn-factor  n
      (prune  warn  factor)]  [--dense  (allow matrices to fill)]
      [--adapt (pruning)] [--rigid (pruning)]  [-ae  f  (adaptive
      pruning exponent)] [-af f (adaptive pruning factor)] [-nx x
      (x window index)] [-ny y (y window  index)]  [-nj  j  (jury
      window index)] [-nw w (nr of windows)] [-nl w (nr of itera-
      tions)] [--thick (expect dense input graph)]

      The first argument of mcl must be a  file  name,  but  some
      options  are  allowed  to  appear  as  the  first  argument
      instead. These are the options that cause mcl to print  out
      information  of  some  kind, after which it will gracefully
      exit. The full list of these options is

      -z, -h, --version, --show-settings, --show-schemes, --jury-
      charter, -how-much-ram k.

  DESCRIPTION
      mcl  implements  the  MCL  algorithm,  short for the Markov
      cluster algorithm, a cluster algorithm for graphs developed
      by  Stijn van Dongen at the Centre for Mathematics and Com-
      puter Science in Amsterdam, the Netherlands. The  algorithm
      simulates  flow  using  two  simple algebraic operations on
      matrices.  The inception of this flow process and the  the-
      ory  behind  it  are  described elsewhere (see REFERENCES).
      Frequently asked questions are answered  in  the  mclfaq(7)
      section.   The  program  described  here is a fast threaded
      implementation written by the algorithm's creator with con-
      tributions  by several others. Anton Enright co-implemented
      threading; see the HISTORY/CREDITS section for  a  complete
      account.   See  the APPLICABILITY section for a description
      of the type of graph mcl likes best, and for a  qualitative
      assessment  of  its  speed.   mcl is accompanied by several
      other utilities for analyzing  clusterings  and  performing
      matrix and graph operations; see the SEE ALSO section.
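       The two algebraic operations - expansion (matrix squaring)
       and inflation (entrywise powers followed by column rescaling)
       - can be illustrated with a small pure-Python sketch. This is
       a toy, not mcl's implementation: it uses dense arithmetic,
       does no pruning, and assumes self loops are already present
       in the input.

```python
def normalize(m):
    # rescale each column so that it sums to 1 (column stochastic)
    n = len(m)
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        for i in range(n):
            m[i][j] /= s
    return m

def expand(m):
    # expansion: matrix squaring; flow spreads over paths of length two
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, f):
    # inflation: entrywise power f, then renormalize columns;
    # f > 1 boosts strong flows and demotes weak ones
    return normalize([[x ** f for x in row] for row in m])

def mcl_toy(adj, inflation=2.0, iters=50):
    # alternate expansion and inflation until (approximate) convergence
    m = normalize([row[:] for row in adj])
    for _ in range(iters):
        m = inflate(expand(m), inflation)
    return m
```

       On two triangles joined by a single edge (with loops on the
       diagonal), the limit matrix concentrates the mass of each
       column within its own triangle - which is precisely the
       cluster interpretation.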

      The first argument is the input file name (see the mcxio(5)
      section for its expected format), or  a  single  hyphen  to
      read  from  stdin. The rationale for making the name of the
      input file a fixed parameter is that you typically do  sev-
      eral  runs  with different parameters. In command line mode
      it is pleasant if you do not have to skip over an immutable
      parameter all the time.

      The  -I f  option  is  the  main control, affecting cluster
      granularity. Using mcl is as simple as typing  (assuming  a
      file proteins contains a matrix/graph in mcl input format)

      mcl proteins -I 2.0

      The  above  will result in a clustering written to the file
      named out.proteins.I20s2. It is - of course -  possible  to
      explicitly specify the name of the output file using the -o
      option. Refer to the -ap option for a description of  mcl's
       procedure  in automatically constructing file names from its
       parameters.

      The mcl input format is described in the mcxio(5)  section.
      Clusterings  are  also  stored  as matrices - this is again
      discussed in the mcxio(5) section.  You presumably want  to
      convert  the  output  to something that is easier to inter-
      pret. The mcl matrix format is perhaps unpleasant to  parse
      in the quick and dirty way. You can use

      clmformat -icl <mcl-out-file> -dump -

      to  convert  mcl  output  to a line/tab based format, where
       each line contains a cluster in the form  of  tab-separated
      indices.  If  you  throw in the -tab <tab-file> option, you
      can get tab-separated labels. Refer to the clmformat manual
      page for more information.
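       Reading that line/tab format back is trivial; a Python
       sketch (the sample data below is invented):

```python
def read_clusters(text):
    # each line is one cluster: tab-separated node indices (or labels)
    return [line.split('\t') for line in text.strip().splitlines()]

dump = "0\t3\t5\n1\t2\n4"
clusters = read_clusters(dump)
# clusters == [['0', '3', '5'], ['1', '2'], ['4']]
```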

      In  finding  good  mcl  parameter settings for a particular
      domain, or in finding cluster structure at different levels
      of  granularity,  one typically runs mcl multiple times for
      varying values of f (refer to the  -I  option  for  further
      information).

      mcl  expects  a  nonnegative  matrix  in the input file, or
      equivalently, a weighted (possibly directed) graph. NOTE  -
      mcl  interprets the matrix entries or graph edge weights as
      similarities, and it likes undirected input graphs best. It
      can  handle  directed  graphs,  but any node pair (i,j) for
      which w(i,j) is much smaller than w(j,i) or vice versa will
      presumably  have a slightly negative effect on the cluster-
      ings output by mcl. Many such node pairs will have  a  dis-
      tinctly  negative  effect, so try to make your input graphs
      undirected. How your edge weights are computed  may  affect
      mcl's  performance. In protein clustering, one way to go is
      to choose the negated logarithm of the BLAST  probabilities
      (see REFERENCES).
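       For instance, the negated-logarithm transform mentioned
       above can be sketched as follows; flooring the E-value to
       avoid taking the log of zero is an assumption of this
       sketch, not something this page prescribes:

```python
import math

def blast_weight(evalue, floor=1e-200):
    # similarity = -log10(E-value); smaller E-values give larger weights;
    # the floor (an assumption) guards against log10(0) for exact hits
    return -math.log10(max(evalue, floor))

# blast_weight(1e-50) -> about 50
```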

      mcl's  default  parameters  should make it quite fast under
      almost all circumstances. Taking  default  parameters,  mcl
      has  been  used  to  generate good protein clusters on 133k
      proteins, taking 10 minutes running time on a  Compaq  ES40
      system  with  four  alpha  EV6.7  processors.  It  has been
      applied (with good results) to graphs with 800k nodes,  and
      if  you have the memory (and preferably CPUs as well) noth-
      ing should stop you from going further.

      For large graphs, there are several  groups  of  parameters
      available  for  tuning the mcl computing process, should it
      be necessary. The easiest thing to  do  is  just  vary  the
      -scheme  option.  This  triggers different settings for the
      group of pruning parameters {-p/-P, -R, -S, and -pct}.  The
      default  setting  corresponds  with -scheme 2.  There is an
      additional group of control parameters  {--adapt,  --rigid,
      -ae,  -af},  which may be helpful in speeding up mcl.  When
      doing multiple mcl runs for the same graphs with  different
      -I  settings (for obtaining clusterings at different levels
      of granularity), it can be useful to factor out  the  first
      bit of computation that is common to all runs, by using the
      --expand-only option one time  and  then  using  --inflate-
      first  for  each  run  in the set.  Whether mcl considers a
      graph large depends mainly on  the  graph  connectivity;  a
      highly  connected graph on 50,000 nodes is large to mcl (so
      that you might want to tune resources) whereas  a  sparsely
      connected  graph on 500,000 nodes may be business as usual.

      mcl is a memory munger. Its precise appetite depends on the
      resource  settings.  You  can get a rough (and usually much
      too pessimistic) upper bound for the amount of RAM that  is
      needed by using the -how-much-ram option. The corresponding
      entry in this manual page contains the simple  formula  via
      which the upper bound is computed.

      Two other groups of interest are the thread-related options
      (you can specify the number of threads to  use)  {-t,  -te,
      -ti,  --clone,  -cloneat} and the verbosity-related options
      {--verbose, --silent, -v, -V}.   The  actual  settings  are
      shown  with  -z, and for graphs with at most 12 nodes or so
      you can view the MCL matrix iterands on screen by supplying
          --show (this may give you a feel for how mcl operates).

      MCL  iterands allow a generic interpretation as clusterings
      as well. The clusterings associated with early iterands may
      contain  a  fair  amount  of  overlap.  Refer  to the -dump
      option, the  mclfaq(7)  manual,  and  the  clmimac  utility
      (Interpret  Matrices  As Clusterings).  Use clmimac only if
      you have a special reason; the normal usage of mcl is to do
      multiple  runs  for varying -I parameters and use the clus-
      terings output by mcl itself.

      The creator of this page feels  that  manual  pages  are  a
      valuable resource, that online html documentation is also a
      good thing to have, and that info pages are way  way  ahead
      of their time. The NOTES section explains how this page was
      created.

      In the OPTIONS section  options  are  listed  in  order  of
      importance, with related options grouped together.

  OPTIONS
      -I f (inflation)
         Sets  the  main  inflation value to f. This value is the
         main handle for affecting  cluster  granularity.  It  is
         usually  chosen somewhere in the range [1.2-5.0]. -I 5.0
         will tend to result  in  fine-grained  clusterings,  and
          -I 1.2 will tend to result in very coarse-grained clus-
          terings. Your mileage will vary depending on the charac-
          teristics of your data. That is why it is a good idea to
         test the quality and coherency of your clusterings using
         clmdist  and  clminfo. This will most likely reveal that
         certain values of -I are simply not right for your data.
         The  clmdist section contains a discussion of how to use
         the cluster validation tools shipped with mcl  (see  the
         SEE ALSO section).

         A second option for affecting cluster granularity is the
         -c option.  It may possibly increase granularity.

         With low values for -I, like -I 1.2, you should be  pre-
         pared to use more resources in order to maintain quality
         of  clusterings,  i.e.  increase  the  argument  to  the
         -scheme option.

      -o str (fname)
         Output the clustering to file named fname.  It is possi-
         ble to send the clustering to stdout by supplying  -o -.
         The  clustering  is output in the mcl matrix format; see
         the mcxio(5) section for more information on this.

         Look at the -ap option and its siblings  for  the  auto-
         matic  naming  constructions  employed  by mcl if the -o
         option is not used.

      -scheme k (use a preset resource scheme)
         There are currently seven  different  resource  schemes,
         indexed  1..7.   High  schemes  result in more expensive
         computations that may possibly  be  more  accurate.  The
         default  scheme  is  4. When mcl is done, it will give a
          grade (the so-called jury synopsis) to the appropriate-
          ness of the scheme used. A low grade does not necessar-
          ily imply that the resulting clustering is bad, but it
          is reason to try a higher
         scheme. The grades are listed  in  the  PRUNING  OPTIONS
         section under the -nj option.

         The   PRUNING  OPTIONS  section  contains  an  elaborate
         description of the way mcl manages resources, should you
         be  interested.  In case you are worried about the vali-
         dation of the resulting clusterings, the mclfaq(7)  sec-
         tion has several entries discussing this issue. The bot-
         tom line is that you have  to  compare  the  clusterings
         resulting  from different schemes (and otherwise identi-
         cal parameters) using utilities such as clmdist, clminfo
         on  the  one  hand,  and  your own sound judgment on the
         other hand.

         If your input graph is extremely dense, with an  average
         node  degree  (i.e.  the  number of neighbours per node)
         that is somewhere above 500, you may need to filter  the
         input graph by removing the nodes of highest degree (and
         projecting  them  back  onto  the  resulting  clustering
         afterwards) or by using the -pp option.

      --show-schemes (show preset resource schemes)
         Shows  the explicit settings to which the different pre-
         set schemes correspond.

         The characteristics are written in the same format (more
         or less) as the output triggered by -v pruning.

      -c f (centering)
         The larger the value of f the more nodes are attached to
         themselves rather than their neighbours, the more expan-
         sion  (the  spreading  of  flow  through  the  graph) is
         opposed, and the more fine-grained clusterings  tend  to
         be. f should be chosen greater than or equal to 1.0. The
         default is f=1.0. This option has a much  weaker  effect
         than  the  -I  option, but it can be useful depending on
         your data.

      -do str (carry out action)
      -dont str (omit action)
         These are the actions that can be controlled:

         clm
         log
         show-log
         keep-overlap

         The clm action appends a short synopsis  of  performance
         criteria to the output cluster file, as well as a synop-
         sis of the granularity characteristics.

         The log action appends several  characteristics  of  the
         mcl run to the output cluster file, such as the resource
         management log, parameter  settings,  timings,  and  the
         command  line.   Useful bookkeeping, and it is advisable
         to always use it.

         The show-log action sends the resource management log to
         STDOUT.

         The  keep-overlap  action  causes  mcl to retain overlap
         should this improbable event occur. In theory,  mcl  may
         generate  a  clustering  that contains overlap, although
         this almost never happens in practice,  as  it  requires
         some  particular  type  of symmetry to be present in the
         input graph (not just any symmetry will do).  Mathemati-
         cally  speaking, this is a conjecture and not a theorem,
         but I am willing to eat my shoe if it does not hold (for
         marzipan values of shoe). It is easy though to construct
         an input graph for which certain mcl settings result  in
         overlap  -  for example a line graph on an odd number of
         nodes. The default is to remove overlap should it occur.

         This option has more than theoretical use because mcl is
         able to generate clusterings associated with  intermedi-
         ate  iterands.   For  these clusterings, overlap is more
         than a theoretical possibility, and will often occur. If
         you  specify  the -L k option, mcl will output the clus-
         tering associated with the last iterand computed, and it
         may well contain overlap.

         This  option  has  no effect on the clusterings that are
         output when using -dump cls - the default for  those  is
         that  overlap  is  not touched, and this default can not
         yet be overridden.

      -v str (verbosity type on)
         See the --verbose option below.

      -V str (verbosity type off)
         See the --verbose option below.

      --silent (very)
         See the --verbose option below.

      --verbose (very)
         These are the different verbosity modes:

         progress
         pruning
         explain
         all

         where all means all three previous modes.  --verbose and
         -v all  turn  them all on, --silent and -V all turn them
         all off. -v str and -V str turn on/off the  single  mode
         str  (for  str  equal  to  one  of progress, pruning, or
         explain).  Each verbosity mode is given  its  own  entry
         below.

      -v progress
         This  mode  causes  mcl  to emit an ascii gauge for each
         single  matrix  multiplication.  It  uses  some  default
         length  for  the  gauge,  which  can  be  altered by the
         -progress k option. Simply using the  latter  will  also
          turn on this verbosity mode. This mode can quickly give
          you an idea of how long an mcl run might take.  If  you
         use  threading (see the -t option and its friends), this
         option may slow down the program a little  (relative  to
         -V progress, not relative to a single-CPU mcl run).

      -v explain
         This mode causes the output of explanatory headers illu-
         minating the output generated with the pruning verbosity
         mode.

      -v pruning
         This  mode causes output of resource-related quantities.
         It has a separate entry in the PRUNING OPTIONS  section.

      -progress k (gauge)
         If  k>0  then  for  each  matrix multiplication mcl will
         print an ascii gauge telling how far it  is.  The  gauge
         will be (in some cases approximately) k characters long.
         If k<0 then mcl will emit a gauge that  is  extended  by
         one  character  after  every  |k|  vectors computed. For
         large graphs, this option has been  known  to  ease  the
         pain of impatience. If k=0 then mcl will print a message
         only after every matrix multiplication, and  not  during
         matrix  multiplication. This can be useful when you want
         mcl to be as speedy as possible, for example when  using
          parallelized mode (as monitoring  progress  requires
          thread communication).  For parallelization (by thread-
          ing) see the -t option.

      -aa str (append str to suffix)
         See the -ap option below.

      -ap str (use str as file name prefix)
         If  the  -o  option  is not used, mcl will create a file
         name (for writing output to) that should uniquely  char-
         acterize  the  important  parameters used in the current
         invocation of mcl. The default format is  out.fname.suf,
         where out is simply the literal string out, fname is the
         first argument containing the name of the file (with the
         graph)  to  be  clustered,  and  where suf is the suffix
         encoding a set of parameters (described further  below).

         The -ap str option specifies a prefix to use rather than
         out.fname as sketched above.  However, mcl  will  inter-
         pret  the  character '=', if present in str, as a place-
         holder for the input file name.

         If the -aa str option is used, mcl will  append  str  to
         the  suffix  suf created by itself.  You can use this if
         you need to encode some extra information  in  the  file
         name suffix.

         The  suffix  is  constructed  as  follows.  The -I f and
         -scheme parameter are always encoded.  The -pi f,  -l k,
         -i f,  and  -c f  options  are  only encoded if they are
         used. Any real argument f is encoded using  exactly  one
         trailing  digit  behind  the  decimal  separator  (which
         itself is not written).  The  setting  -I 3.14  is  thus
         encoded  as I31. The -scheme option is encoded using the
         letter 's', all other options mentioned here are encoded
         as themselves (stripped of the hyphen). For example

         mcl small.mci -I 3 -c 2.5 -pi 0.8 -scheme 5

         results in the file name out.small.mci.I30s5c25pi08.  If
         you want to know beforehand what file name will be  pro-
         duced, use the -az option.
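          The rule just described is easy to mimic; the sketch
          below reproduces both example names from this page
          (rounding to one decimal matches the examples shown,
          but is an assumption of the sketch):

```python
def mcl_outname(fname, I, scheme, c=None, pi=None):
    # mimic mcl's automatic output name: out.<fname>.<suffix>
    def enc(f):
        # exactly one digit behind the decimal separator, separator dropped
        return ('%.1f' % f).replace('.', '')
    suf = 'I' + enc(I) + 's' + str(scheme)   # -I and -scheme always encoded
    if c is not None:                        # -c only if used
        suf += 'c' + enc(c)
    if pi is not None:                       # -pi only if used
        suf += 'pi' + enc(pi)
    return 'out.%s.%s' % (fname, suf)

# mcl_outname('small.mci', 3, 5, c=2.5, pi=0.8) == 'out.small.mci.I30s5c25pi08'
```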

      -az (show output file name and exit)
         If  mcl  automatically constructs a file name, it can be
          helpful to know beforehand what that file name will be.
         Use  -az  and mcl will write the file name to STDOUT and
         exit. This can be used if mcl is integrated  into  other
         software for which the automatic creation of unique file
         names is convenient.

      -te k (#expansion threads)
         See the -t k option below.

      -ti k (#inflation threads)
         See the -t k option below.

      --clone (when threading)
         See the -t k option below.

      -cloneat n (trigger)
         See the -t k option below.

      -t k (#threads)
         The -t options are self-explanatory. Note that threading
         inflation  is  hardly  useful, as inflation is orders of
         magnitude faster than expansion. Also note that  thread-
         ing is only useful if you have a multi-processor system.

         The --clone option says to give each thread its own copy
         of  the matrix being expanded/squared. The latter option
          can be further controlled using the  -cloneat n  option.
         Copies are only made if the source matrix (the one to be
         squared) has on average at least k positive entries  per
         vector. This option is probably not very useful, because
         without it mcl is a memory munger already.

         When threading, it is best not to turn on  pruning  ver-
         bosity  mode  if  you  are  letting  mcl run unattended,
         unless you want to scrutinize its output later. This  is
         because  it  makes mcl run somewhat slower, although the
         difference is not dramatic.

      -l n (initial iteration number) (small letter ell)
         The number of times mcl will use a  different  inflation
         value  before  it switches to the (main) inflation given
         by the -I (capital eye) option. The different  value  is
         called  initial  inflation and is tunable using the -i f
         option (default value f=2.0). The default value (to  -l)
          is zero. This option offers another  way  of  affecting
          cluster granularity; e.g. with

         mcl proteins -i 1.4 -l 2 -I 4.0

         one lets expansion prevail during the first  two  itera-
         tions,  followed  by inflation catching up (in a figura-
         tive way of writing).  This may  be  useful  in  certain
         cases,  but  this  type  of experiment is certainly sec-
         ondary to simply varying -I (capital eye).

      -L n (main iteration number)
         Normally, mcl computes the MCL process until it has con-
         verged  fully  to a doubly idempotent matrix. The number
         of iterations required is  typically  somewhere  in  the
         range  10-100.   The first few iterations generally take
         the longest time.  The -L option can be used to  specify
         the  number  of iterations mcl may do at most. When this
         number is reached, mcl will output the clustering  asso-
         ciated with the iterand last computed.

      -i f (initial inflation)
         The  inflation value used during the first n iterations,
         where n  is  specified  by  the  -l  (ell)  option.   By
         default, n=0 and f=2.0.

      -pi f (pre-inflation)
         If  used, mcl will apply inflation one time to the input
         graph before entering the main process. This can be use-
         ful  for  making the edge weights in a graph either more
         homogeneous (which may result in less granular  cluster-
         ings)  or  more  heterogeneous (which may result in more
         granular clusterings).  Homogeneity is achieved for val-
         ues  f  less  than  one, heterogeneity for values larger
         than one.  Values to  try  are  normally  in  the  range
         [2.0,10.0].

      -di i:j (dump interval)
      -dump-interval i:j
         Dump  during iterations i..j-1. See the -dump str option
         below.

      -dm k (dump i+0..i+k..)
      -dump-modulo k
         Sampling rate: select only these iterations in the  dump
         interval.  See the -dump str option below.

      -ds stem (file stem)
      -dump-stem stem
          Set the stem  for  file  names  of  dumped  objects
         (default mcl). See the -dump str option below.

      -dump str (type)
         str can be of the following types.

         ite
         dag
         cls
         chr

         Repeated use is allowed.   The  ite  option  writes  mcl
         iterands  to  file.  The  cls  option writes clusterings
         associated with mcl iterands to  file.   These  clusters
         are  obtained  from  a particular directed acyclic graph
         (abbreviated as DAG) associated with each  iterand.  The
         dag  option writes that DAG to file. The DAG can option-
         ally be further pruned and then again be interpreted  as
         a  clustering  using  clmimac, and clmimac can also work
         with the matrices written  using  the  ite  option.   It
         should  be noted that clusterings associated with inter-
         mediate iterands may contain overlap, which is interest-
         ing  in many applications. For more information refer to
         mclfaq(7) and the REFERENCES section below.

         The chr option says, for each iterand  I,  to  output  a
         matrix  C with characteristics of I. C has the same num-
         ber of columns as I. For each column k in C, row entry 0
         is  the  diagonal or 'loop' value of column k in I after
         expansion and pruning, and before inflation and  rescal-
         ing.  Entry  1  is  the  loop  value after inflation and
         rescaling.  Entry 2 is the center of column k  (the  sum
         of  its  entries  squared)  computed after expansion and
         before pruning, entry 3 is the maximum  value  found  in
         that  column  at the same time. Entry 4 is the amount of
         mass kept for that column after pruning.

         The -ds option sets the stem for file  names  of  dumped
         objects  (default  mcl). The -di and -dm options allow a
         selection of iterands to be made.
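          Read together, -di i:j and -dm k select iterations i,
          i+k, i+2k, ... that lie within i..j-1; in Python terms
          (a sketch of one plausible reading of the two options):

```python
def dump_iterations(i, j, k):
    # iterations i, i+k, i+2k, ... strictly below j
    return list(range(i, j, k))

# dump_iterations(2, 10, 3) -> [2, 5, 8]
```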

      -digits n (printing precision)
         This has two completely different uses. It sets the num-
         ber  of  decimals  used for pretty-printing mcl iterands
         when using the --show option (see below),  and  it  sets
         the  number  of  decimals  used for writing the expanded
         matrix when using the --expand-only option.

      --show (print matrices to screen)
         Print matrices to screen. The number of significant dig-
         its  to  be  printed  can  be  tuned  with -digits n. An
         80-column screen allows graphs (matrices) of size up  to
         12(x12)  to  be  printed  with  three  digits  precision
         (behind the comma), and of size up to 14(x14)  with  two
         digits.  This  can give you an idea of how mcl operates,
         and what the effect of pruning is.  Use  e.g.  -S 6  for
         such a small graph and view the MCL matrix iterands with
         --show.

      -sort str (sort mode)
         str can be one of  lex,  size,  revsize,  or  none.  The
         default is 'revsize', in which the largest clusters come
         first. If the mode is  'size',  smallest  clusters  come
         first,  if the mode is 'lex', clusters are ordered lexi-
         cographically, and if the mode is 'none', the  order  is
         the same as produced by the procedure used by mcl to map
         matrices onto clusterings.
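          The four modes correspond to straightforward sorts; a
          sketch with clusters represented as lists of indices
          (taking 'lex' to compare the member lists themselves is
          an assumption of the sketch):

```python
def sort_clusters(clusters, mode='revsize'):
    if mode == 'revsize':   # largest clusters first (the default)
        return sorted(clusters, key=len, reverse=True)
    if mode == 'size':      # smallest clusters first
        return sorted(clusters, key=len)
    if mode == 'lex':       # lexicographic on the member lists
        return sorted(clusters)
    return list(clusters)   # 'none': keep mcl's own order

cls = [[4], [0, 2, 5], [1, 3]]
# sort_clusters(cls) == [[0, 2, 5], [1, 3], [4]]
```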

       --inflate-first (rather than expand)
         Normally, mcl will take the input graph/matrix, make  it
         stochastic,  and  start  computing an mcl process, where
         expansion and  inflation  are  alternated.  This  option
         changes  that to alternation of inflation and expansion,
         i.e. inflation is the first operator to be applied. This
         is intended for use with an input matrix that was gener-
         ated with the --expand-only option (see below).  If  you
         do  multiple mcl runs for the same graph, then the first
         step will be the same for all runs, namely computing the
         square  of the input matrix. With the pair of --inflate-
         first and --expand-only this bit  of  computing  can  be
         factored  out.   NOTE this option assumes that the input
          matrix is stochastic (as it will be when generated with
          --expand-only).  The --inflate-first  option  renders
          useless all options that would otherwise affect the
          input matrix; precisely those options do affect the matrix
         resulting from using --expand-only. See the entry  below
         for more information.

      --expand-only (factor out computation)
         This  option  makes  mcl  compute just the square of the
         input graph/matrix, and write it to the file  name  sup-
         plied  with  the  -o  flag, or to the default file named
         out.mce. NOTE in this case the output matrix  is  not  a
         clustering.  The  intended use is that the output matrix
         is used as input for mcl with the --inflate-first switch
         turned  on,  so that multiple mcl runs need not redo the
         same computation (the first expansion step).

         Note that the -scheme parameters affect the matrix  com-
         puted  with --expand-only. Other options that affect the
         matrix resulting from this option: -pp, -c, and -digits.
         The  latter  option  sets  the  precision  for output in
         native ascii format.

      -in-gq f (filter threshold)
         mcl will remove any edges in the  input  graph  (equiva-
         lently,  entries  in  the  input  matrix)  for which the
         weight is below f.
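
          In other words, -in-gq is a plain threshold filter on
          the input. A Python paraphrase (illustrative only; the
          graph is stored as a dictionary of weighted edges):

```python
def filter_edges(graph, f):
    """Drop input edges whose weight lies below f (cf. -in-gq)."""
    return {edge: w for edge, w in graph.items() if w >= f}
```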

      -pp n (preprune count)
         For each  column  vector  (node)  in  the  input  matrix
         (graph)  mcl will keep the n entries (outgoing edges) of
         that vector (node) that have largest weight  and  remove
         the rest.
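
          Keeping the n heaviest entries of a column is a top-n
          selection, which is best done with a heap (see
          HISTORY/CREDITS). A Python sketch, not mcl's internals:

```python
import heapq

def preprune(column, n):
    """Keep the n largest-weight entries of a column (cf. -pp).

    column: dict mapping node index -> edge weight.
    """
    if len(column) <= n:
        return dict(column)
    # heapq.nlargest keeps a heap of size n while scanning
    return dict(heapq.nlargest(n, column.items(), key=lambda kv: kv[1]))
```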

      --jury-charter (explains jury)
         Explains how the jury synopsis is computed from the jury
         marks.

      --version (show version)
         Show version.

      -how-much-ram n (RAM upper bound)
         n is interpreted as the number  of  nodes  of  an  input
         graph.   mcl  will  print  the  maximum amount of RAM it
         needs for its computations.  The formula for this number
         in bytes is:

            2 * c * k * n

            2  :  two matrices are concurrently held in memory.
            c  :  mcl cell size (as shown by -z).
            n  :  graph cardinality (number of nodes).
            k  :  MAX(s, r).
            s  :  select number (-S, -scheme options).
            r  :  recover number (-R, -scheme options).

         This  estimate  will usually be too pessimistic. It does
         assume though that the average node degree of the  input
         graph  does not exceed k. The -how-much-ram option takes
         other command-line arguments into account  (such  as  -S
         and  -R), and it expresses the amount of RAM in megabyte
         units.
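
          The formula is easy to evaluate by hand. A Python
          sketch, assuming a cell size of 8 bytes (an assumption;
          your build's actual cell size is shown by -z):

```python
def mcl_ram_mb(n, s=500, r=600, cell_size=8):
    """Upper bound on mcl's RAM usage in megabytes.

    Implements 2 * c * k * n bytes with k = max(s, r); the cell
    size c is assumed to be 8 bytes here -- check -z on your build.
    """
    k = max(s, r)
    return 2 * cell_size * k * n / (1024.0 * 1024)

# a graph on one million nodes with -S 500 -R 600
print(round(mcl_ram_mb(1_000_000)))  # roughly 9155 megabytes
```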

      -h (show help)
         Shows a selection of the most important mcl options.

      --show-settings (show settings)
         A synonym for the -z option.

      -z (show settings)
         Show current settings for tunable  parameters.   --show-
         settings is a synonym.

  PRUNING OPTIONS
      -p f (cutoff)
      -P n (1/cutoff)
      -S s (selection number)
      -R r (recover number)
      -pct pct (recover percentage)
          After computing a new (column stochastic) matrix vector
          during expansion (which is matrix multiplication, i.e.
          squaring), the vector is successively exposed to
          different pruning strategies. The intent of pruning is
          that many small entries are removed while retaining
          much of the stochastic mass of the original vector.
          After pruning, vectors are rescaled to be stochastic
          again. MCL iterands are theoretically known to be
          sparse in a weighted sense, and this manoeuvre
          effectively perturbs the MCL process a little in order
          to obtain matrices that are genuinely sparse, thus
          keeping the computation tractable. An example of
          monitoring pruning can be found in the discussion of
          -v pruning at the end of this section.

         mcl proceeds as follows. First, entries that are smaller
         than  cutoff  are removed, resulting in a vector with at
         most 1/cutoff entries. The cutoff can be supplied either
         by -p, or as the inverse value by -P. The latter is more
         intuitive, if your intuition is like  mine  (and  the  P
         stands for precision or pruning by the way).  The cutoff
         just described is rigid; it is the same for all vectors.
         The  --adapt  option  causes the computation of a cutoff
         that depends on a vector's homogeneity  properties,  and
         this option may or may not speed up mcl.

         Second,  if  the remaining stochastic mass (i.e. the sum
         of all remaining entries) is less than pct/100  and  the
         number of remaining entries is less than r (as specified
         by the -R flag), mcl will try to regain ground by recov-
         ering the largest discarded entries. The total number of
         entries is not allowed to grow larger than r.  If recov-
         ery  was  not  necessary,  mcl tries to prune the vector
         further down to at most s entries  (if  applicable),  as
         specified  by  the  -S flag. If this results in a vector
         that satisfies the recovery condition then  recovery  is
         attempted,  exactly  as described above. The latter will
         not occur of course if r <= s.
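
          The stages just described can be paraphrased as follows
          (a much simplified Python sketch, not mcl's
          implementation; vectors are dictionaries mapping node
          indices to probabilities):

```python
import heapq

def prune_vector(vec, cutoff, s, r, pct):
    """Sketch of per-vector pruning: cutoff, then recovery or selection."""
    kept = {i: w for i, w in vec.items() if w >= cutoff}
    dropped = {i: w for i, w in vec.items() if w < cutoff}

    def needs_recovery(d):
        # too much mass lost, and still room below the recovery number
        return sum(d.values()) < pct / 100.0 and len(d) < r

    def recover(d):
        # regain the largest discarded entries, up to r in total
        extra = heapq.nlargest(r - len(d), dropped.items(),
                               key=lambda kv: kv[1])
        d.update(extra)
        return d

    if needs_recovery(kept):
        kept = recover(kept)
    elif len(kept) > s:
        top = dict(heapq.nlargest(s, kept.items(), key=lambda kv: kv[1]))
        dropped.update({i: w for i, w in kept.items() if i not in top})
        kept = top
        if needs_recovery(kept):
            kept = recover(kept)

    total = sum(kept.values())          # rescale to stochastic
    return {i: w / total for i, w in kept.items()}
```

          Recovery kicks in either right after the cutoff stage
          or after selection, but not both.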

         The default setting is  something  like  -P 4000  -S 500
         -R 600.  Check the -z flag to be sure. There is a set of
         precomposed settings, which can be  triggered  with  the
         -scheme k option. k=2 is the default scheme; higher val-
         ues for k result in costlier and more accurate  computa-
         tions  (vice  versa  for  lower, cheaper, and less accu-
         rate).  The schemes are listed using the  --show-schemes
         option.  It  is advisable to use the -scheme option only
         in interactive mode, and to use the explicit expressions
         when doing batch processing. The reason is that there is
         no guarantee whatsoever that the schemes will not change
         between  different  releases. This is because the scheme
         options should reflect good  general  purpose  settings,
          and it may become apparent that other schemes are
          better.

         Note that 'less accurate' or  'more  accurate'  computa-
         tions  may  still  generate the same output clusterings.
         Use clmdist to compare output clusterings for  different
         resource  parameters.  Refer to clmdist for a discussion
         of this issue.

      -warn-pct k (prune warn percentage)
      -warn-factor k (prune warn factor)
          The two options -warn-pct and -warn-factor relate to
          warnings that may be triggered once the initial pruning
          of a vector is completed. The idea is to issue warnings
          if initial pruning almost completely destroys a
          computed vector, as this may be a sign that the pruning
          parameters should be changed. A warning is issued if
          the mass remaining after initial pruning is less than
          warn-pct, or if the number of remaining entries is
          smaller by a factor warn-factor than both the number of
          entries originally computed and the recovery number.

         -warn-pct  takes an integer between 0 and 100 as parame-
         ter, -warn-factor takes a  real  positive  number.  They
          default to something like 30 and 50.0. If you want to
          see fewer warnings, decrease warn-pct and increase
          warn-factor. Set warn-factor to zero if you want no
          warnings.

      --dense (allow matrices to fill)
         This renders all pruning options useless except for one.
         After  each  expansion step, mcl will remove all entries
         that are smaller than the threshold specified by  -p  or
         -P,  which  acts  like  a  precision in this case. After
         removal, the matrix columns are rescaled to be  stochas-
         tic again.

         If  the  -p threshold (precision) is zero or very small,
         the --dense option results in a rather accurate and very
         costly  computation  of the MCL process. Do not use this
         option for graphs with more than  several  thousands  of
         entries, or you will have trouble digging your processor
         out of swap.

      --rigid (pruning)
         See the --adapt option below.

      -ae f (adaptive pruning exponent)
         See the --adapt option below.

      -af f (adaptive pruning factor)
         See the --adapt option below.

      --adapt (pruning)
         The default mcl pruning behaviour as described under the
         -P  option is called rigid pruning (it being the default
          renders the switch --rigid currently useless),
          referring to the fact that the first stage of pruning
          removes
         entries smaller than a  fixed  threshold.   The  options
         discussed  here  enable  the  computation of a threshold
         that depends on the  homogeneity  characteristics  of  a
         vector.   This   behaviour  is  triggered  by  supplying
         --adapt.

          The --adapt behaviour only affects the first pruning
          stage, i.e. the computation of the first threshold (see
          the discussion under the -P option). It does not
          interfere with either selection or recovery. It is
          affected however by the threshold as specified by the
          -P option.  When using --adapt, you typically use the
          -P option as well, and you can and should use a higher
          value than you would without using --adapt.

          All that said, --adapt triggers this behaviour: Given a
          stochastic vector v, its mass center of order two is
          computed, which is the sum of the squares of its
          entries. The mass center of v, call it c, is strongly
          related to its homogeneity properties (see REFERENCES).
          The threshold T is computed as 1/f * pow(c, e), where e
          and f are the arguments to the -ae e and -af f options
          respectively (check -z for the respective defaults).
          Decreasing either e or f makes T larger.  Finally, T is
          maxed with the rigid threshold value, which can be
          altered using either -p f or -P n.  The latter is why
          you should increase the -P parameter n (so that the
          rigid threshold is decreased) once you switch to
          adaptive pruning. The adaptive threshold should be the
          main factor controlling pruning, with the rigid
          threshold acting as a safeguard that does not take over
          too often.

         This  may  seem  complicated, but the rules are actually
         quite simple, and you may just disregard the  definition
         of  T. The usefulness of these options will vary. If you
         want to speed up mcl, try it out and add --adapt to your
         settings.
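
          In code the computation is small. A Python sketch
          (illustrative only; check -z for the actual -ae and
          -af defaults):

```python
def adaptive_threshold(vec, e, f, rigid):
    """Compute the --adapt pruning threshold for one vector.

    vec: stochastic vector as a dict index -> probability.
    c (the mass center of order two) is the sum of the squared
    entries; T = (1/f) * c**e, maxed with the rigid -p threshold.
    """
    c = sum(w * w for w in vec.values())
    T = (1.0 / f) * c ** e
    return max(T, rigid)

# a perfectly homogeneous vector on four nodes: c = 0.25
print(adaptive_threshold({i: 0.25 for i in range(4)}, 1.0, 2.0, 0.01))
```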

      --thick (expect dense input graph)
         This option is somewhat esoteric. It does not affect the
         matrices as computed by mcl, but it affects the  way  in
         which  they  are  computed.  If  the input graph is very
         dense, this may speed up mcl a little.

      -v pruning
         Pruning  verbosity  mode  causes  mcl  to  emit  several
         statistics related to the pruning process, each of which
         is described below. Use -v explain  to  get  explanatory
         headers in the output as well (or simply use -v all).

         Selection and recovery
         The  number of selections and recoveries mcl had to per-
         form during each iteration is shown. It also  shows  the
         number of vectors for which the mass after final pruning
         was below the fraction defined by the -pct option  as  a
         percentage (default probably 90 or 95).

         Initial and pruned vector footprint distributions
         The distribution of the vector footprints (i.e. the num-
         ber of nonzero entries)  before  and  after  pruning  is
         shown. This is assembled in a terse (horrid if you will)
         format, looking as follows (with some context  stripped,
         noting  that  the  data  for  three  expansion  steps is
         shown):

         ----------------------------------------------------
          mass percentages  | distr of vec footprints       |
                  |         |____ expand ___.____ prune ____|
           prune  | final   |e4   e3   e2   |e4  e3   e2    |
         all ny nx|all ny nx|8532c8532c8532c|8532c8532c8532c|
         ---------.---------.---------------.---------.-----.
          98 88 86  98 91 86 _________022456 ___________0234
          98 89 86  98 94 91 _______00245678 ___________0234
          98 90 89  99 95 94 _______00235568 ___________0234
          ...

         This particular  output  was  generated  (and  truncated
         after  three  rounds  of  expansion  and inflation) from
         clustering a protein graph on 9058 nodes  with  settings
         -I 1.4, -P 2000, -S 500, -R 600, and -pct 95.

          The header entries 8532c85.. indicate thresholds going
          from 80000, 50000, 30000, 20000, 12500, 8000, all the
          way down to 300, 200, and 125. The character 'c'
          signifies the base 1.25 (for no apparent reason). The
          second entry '2'
         (after  '0') on the first line signifies that roughly 20
         percent of  all  the  vectors  had  footprint  (#nonzero
         entries) between 800 and 1250.  Likewise, 40 percent had
         footprint between 300 and 500. The '0' entries signify a
         fraction  somewhere below 5 percent, and the '@' entries
         signify a fraction somewhere above 95 percent.

         Two columns are listed, one  for  the  expansion  vector
         footprints  (i.e. after squaring), and the other for the
         vector footprints right after initial pruning took place
         (i.e.  before selection and recovery, after either adap-
         tive or rigid pruning).  This may give an  idea  of  the
         soundness of the initial pruning process (overly severe,
         or overly mild), and the extent to  which  you  want  to
         apply selection and/or recovery.

         Initial and final mass windows
         The  mass averages of the pruned vectors after the first
         selection stage are shown, and the mass averages of  the
         vectors  as  finally  computed, i.e. after selection and
         recovery. Note that the latter corresponds to a  differ-
         ent  stage than what is shown for the vector footprints,
         if either selection or recovery is turned on.  For  both
         cases,  three  averages  are shown: the average over all
         vectors, the average over the worst  x  cases,  and  the
         average  over  the  worst y cases. The mass averages are
         shown as percentages: '98' on the first line  under  the
         'prune/all'  column means that overall 98 percent of the
         stochastic mass of the matrix was kept after pruning.

         This example demonstrates that  many  entries  could  be
         removed while retaining much of the stochastic mass. The
         effect of the recovery (-R) parameter is also clear: the
         final  averages are higher than the initial averages, as
          a result of mcl undoing some overenthusiastic pruning.

         An  average over the worst k cases is called a window of
         width k; internally, mcl tracks many more such  windows.
         The  result  of  this can be seen when using the -do log
         option (which appends a log to the  cluster  output)  or
         the -do show-log option (which sends the log to STDOUT).
         From a fixed set of windows those  that  are  applicable
         are  tracked,  that  is, all those windows for which the
         width does not exceed the graph cardinality. The windows
         in  the fixed set have respective sizes 1, 2, 5, 10, 20,
         50, and so on up until 5000000 (which makes  15  windows
         in all).

      -nx i (x window index)
      -ny j (y window index)
         The  options -nx and -ny both take an index in the range
         1..15. The default values for -nx and  -ny  are  respec-
         tively  4  and 7, denoting the fourth and seventh window
         of respective widths 10 and 100. They are  used  in  the
         verbosity output as described above.

      -nj i (jury window index)
          The -nj option denotes a window index in the same way
          as -nx and -ny do. This particular window is used for
          computing the jury marks, which are the three numbers
          reported by mcl when it is done. They are a reminder of
          the existence of pruning and its importance for both
          speed and accuracy, and they are indicative rather than
          authoritative.

         These jury marks are simply the respective mass averages
         in the jury window for the first three  iterations.  The
         marks are even further simplified and mapped to the jury
         synopsis, which is a single grade expressed as an adjec-
         tive.  The  grades  are, in decreasing order of achieve-
         ment,  perfect  exceptional  superior   excellent   good
         acceptable  mediocre  poor  bad  lousy  miserable  awful
         wretched atrocious. Doing 'mcl --jury-charter' will tell
         you how the jury marks map onto the jury synopsis.

         The  jury  marks should preferably be higher than 70. If
         they are in the vicinity of 80 or 90, mcl is doing  fine
         as  far as pruning is concerned.  Choose a higher scheme
         if you think them too low. For very dense graphs that do
         have  strong  cluster structure, the jury marks can sink
         as low as to the 30's and 40's, but the clusterings gen-
         erated  by mcl may still be good. The marks and the syn-
         opsis grade the severity of pruning, not  cluster  qual-
          ity.  Note that the jury becomes friendlier when the
          -nj option is increased, and harsher when it is
          decreased.

      -nw w (nr of windows)
         Normally, mcl will  use  all  windows  that  have  width
         smaller  than  the  cardinality of the input graph. This
         option limits the set of windows to those w  windows  of
         smallest width.  This affects the -do log output.

      -nl l (number of iterations)
         By  default,  mcl  will log the window mass averages for
          the first ten iterations. This option sets that number
          to l.  It affects the -do log output.

  PIPELINES
       In general, clustering requires several stages: creating
       the matrix, running mcl, and displaying the result. The
      display  stage  is  supported by clmformat. The matrix cre-
      ation stage often needs only be done once for a given  data
      collection,  followed  by  repeated  runs  of the other two
      stages for varying inflation values and scheme settings.

       The matrix creation stage can often be split into two more
       stages, namely parsing a data file in some given format,
      and assembling a matrix from the data bits and pieces, such
      as  node  indices and edge weights or even edge weight con-
      tributions.  The assembly step can be done by  mcxassemble,
      which  allows  a very general input format and customizable
      behaviour in how the bits and pieces should be  transformed
      to  the  input  graph.   This  leaves the parse stage to be
      filled in.

      The mclpipeline script implements a generic  and  customiz-
      able  pipeline  encapsulating the four stages distinguished
      here (parsing, assembly, clustering, display). It is possi-
      ble  to  let  only part of the pipeline be active, and many
      other features are supported. The IO mechanism is  entirely
      file  based, and files are associated with parametrizations
      via file name extensions (by all means a simple mechanism).

      mclpipeline requires a single parse script to be specified.
      It will be plugged into the pipeline and you should be  set
      to  run.   The  parse  script  must  satisfy  the interface
      requirements described in the mclpipeline manual page.

      For BLAST input, the mclblastline script provides  a  dedi-
      cated  interface.  It uses the mcxdeblast script that comes
      prepackaged with mcl.

  APPLICABILITY
      mcl will work very well for graphs in which the diameter of
      the natural clusters is not too large. The presence of many
      edges between different clusters  is  not  problematic;  as
      long as there is cluster structure, mcl will find it. It is
      less likely to work well for graphs with clusters (inducing
      subgraphs) of large diameter, e.g. grid-like graphs derived
      from Euclidean data. So mcl in its canonical form  is  cer-
      tainly  not  fit  for boundary detection or image segmenta-
      tion. I experimented  with  a  modified  mcl  and  boundary
      detection  in the thesis pointed to below (see REFERENCES).
       This was fun and not entirely unsuccessful, but not
       something to be pursued further.

      mcl  likes undirected input graphs best, and it really dis-
      likes graphs with node pairs (i,j) for which an  arc  going
      from  i  to j is present and the counter-arc from j to i is
      absent. Try to make your input graph undirected.   Further-
      more,  mcl  interprets  edge weights in graphs as similari-
      ties. If you are used to working with dissimilarities,  you
      will  have to convert those to similarities using some con-
      version formula. The most important thing is that you  feel
      confident  that  the similarities are reasonable, i.e. if X
      is similar to Y with weight 2, and X is similar to  Z  with
      weight  200, then this should mean that the similarity of Y
       (to X) is negligible compared with the similarity of Z (to
       X).
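
       Symmetrizing is a job for mcxassemble, but the idea can be
       paraphrased in Python (a sketch; keeping the maximum of
       two conflicting arc weights is one reasonable choice among
       several, and mapping a dissimilarity d to 1/(1+d) is a
       common recipe, not an mcl prescription):

```python
def to_undirected_similarity(arcs, dissimilarity=False):
    """Turn a weighted arc list into an undirected similarity graph.

    arcs: dict (i, j) -> weight. Every arc gets a counter-arc;
    conflicting weights keep the maximum. If the weights are
    dissimilarities, 1/(1+d) converts them first (one common
    recipe, not mcl's own).
    """
    sym = {}
    for (i, j), w in arcs.items():
        if dissimilarity:
            w = 1.0 / (1.0 + w)
        for key in ((i, j), (j, i)):
            sym[key] = max(w, sym.get(key, w))
    return sym
```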

      mcl is probably not suited for clustering tree graphs. This
      is because mcl works  best  if  there  are  multiple  paths
      between  different  nodes  in  the natural clusters, but in
      tree graphs there is only one  path  between  any  pair  of
      nodes. Trees are too sparse a structure for mcl to work on.

      mcl may well be suited for  clustering  lattices.  It  will
      depend  on  the density characteristics of the lattice, and
      the conditions for success are the same as those for  clus-
      tering graphs in general: The diameter of the natural clus-
      ters should not be too large.  NOTE when clustering a  lat-
      tice,  you have to cluster the underlying undirected graph,
      and not the directed  graph  that  represents  the  lattice
      itself.  The  reason  is  that one has to allow mcl (or any
      other cluster algorithm) to 'look  back  in  time',  so  to
      speak.  Clustering and directionality bite each other (long
      discussion omitted).

      mcl has a worst-case time complexity O(N*k^2), where  N  is
      the number of nodes in the graph, and k is the maximum num-
      ber of neighbours tracked during computations. k depends on
      the  -P  and -S options. If the -S option is used (which is
      the default setting) then k equals the value  corresponding
      with  this  option.  Typical  values for k are in the range
      500..1000. The average case is much better than  the  worst
      case  though, as cluster structure itself has the effect of
      helping mcl's pruning schemes, certainly if the diameter of
      natural clusters is not large.

  FILES
       There are currently no resource or configuration files.
      The mcl matrix format is described in the mcxio(5) section.

  ENVIRONMENT
      MCLXASCIIDIGITS
         When  writing matrices in ascii format, mcl will use the
         environment variable MCLXASCIIDIGITS (if present) as the
         precision (number of digits) for printing the fractional
         part of values.

      MCLXIOVERBOSITY
         MCL and its sibling  applications  will  usually  report
         about  matrix  input/output  from/to disk. The verbosity
         level can be regulated via  MCLXIOVERBOSITY.  These  are
         the levels it can currently be set to.

          1  Silent but applications may alter this.
           2  Silent and applications cannot alter this.
          4  Verbose but applications may alter this.
           8  Verbose and applications cannot alter this
              (default).

      MCLXIOFORMAT
         MCL and its sibling applications will by default  output
         matrices  in ASCII format rather than binary format (cf.
         mcxio(5)).  The desired format can be controlled via the
         variable  MCLXIOFORMAT. These are the levels it can cur-
         rently be set to.

          1  Ascii format but applications may alter this.
           2  Ascii format and applications cannot alter this
              (default).
          4  Binary format but applications may alter this.
           8  Binary format and applications cannot alter this.

      MCLXASCIIFLAGS
         If matrices are output in ascii format, by default empty
         vectors will not be listed. Equivalently  (during  input
         time),  vectors  for  which  no  listing  is present are
         understood to be empty - note that  the  presence  of  a
         vector is established using the domain information found
         in the header part.  It is possible to  enforce  listing
         of  empty  vectors  by  setting  bit '1' in the variable
         MCLXASCIIFLAGS.

  DIAGNOSTICS
      If mcl issues a diagnostic error, it will  most  likely  be
       because the input matrix could not be parsed successfully.
      mcl tries to be helpful in describing  the  kind  of  parse
      error.   The mcl matrix format is described in the mcxio(5)
      section.

  BUGS
      No known bugs at this time. Please send bug reports to mcl-
      devel@micans.org.

  AUTHOR
      Stijn van Dongen.

  HISTORY/CREDITS
      The  MCL algorithm was conceived in spring 1996 by the pre-
      sent author.  The first implementation of the MCL algorithm
      followed that spring and summer. It was written in Perl and
      proved the viability of the algorithm.  The  implementation
      described  here  began  its  life in autumn 1997. The first
      versions of the vital matrix library were designed  jointly
      by  Stijn  van  Dongen  and  Annius  Groenink in the period
       October 1997 - May 1999. The efficient matrix-vector multi-
      plication  routine  was  written by Annius. This routine is
      without significant changes still one of  the  cornerstones
      of this MCL implementation.

      Since May 1999 all MCL libraries have seen much development
      and redesign by the present author. Matrix-matrix multipli-
      cation has been rewritten several times to take full advan-
      tage of the sparseness properties of the stochastic  matri-
      ces  brought  forth  by the MCL algorithm. This mostly con-
      cerns the issue of pruning - removal of small elements in a
      stochastic column in order to keep matrices sparse.

      Very  instructive  was  that  around April 2001 Rob Koopman
      pointed out that selecting the k largest elements out of  a
      collection of n is best done using a min-heap. This was the
      key to the second major rewrite (now counting three) of the
      MCL  pruning schemes, resulting in much faster code, gener-
      ally producing a more accurate computation of the MCL  pro-
      cess.

       In May 2001 Anton Enright initiated the parallelization of
       the mcl code and threaded inflation. From this example,
       Stijn threaded expansion. This was great, as the MCL data
       structures and operands (normal matrix multiplication and
       Hadamard multiplication) just beg for parallelization.

      In  Jan  2003  the  03-010  release  introduced support for
      sparsely enumerated (i.e. indices need not  be  sequential)
      graphs  and matrices, the result of a major overhaul of the
      matrix library and most higher layers.   Conceptually,  the
      library  now  sees  matrices as infinite quadrants of which
      only finite subsections happen to have nonzero entries.

      Joost van Baal set up the mcl CVS tree and packaged mcl for
       Debian GNU/Linux. He completely autotooled the sources, so
       much so that at first I found it hard to find them again
       amidst bootstrap, aclocal.m4, depcomp, and other beauties.

      Jan van der Steen shared his elegant mempool  code.  Philip
      Lijnzaad  gave  useful  comments.  Philip, Shawn Hoon, Abel
      Ureta-Vidal, and Martin Mokrejs sent helpful bug reports.

      Abel Ureta-Vidal and Dinakarpandian Deendayal commented  on
      and contributed to mcxdeblast and mcxassemble.

  SEE ALSO
      mclfaq(7) - Frequently Asked Questions.

      mcxio(5) - a description of the mcl matrix format.

      There  are many more utilities. Consult mclfamily(7) for an
      overview of and links to  all  the  documentation  and  the
      utilities in the mcl family.

       mcl development is discussed on mcl-devel@lists.micans.org;
       (subscription) information is at
       https://lists.micans.org:446/listinfo/mcl-devel , and the
       list is archived at
       https://lists.micans.org:446/pipermail/mcl-devel/.

      mcl's home at http://micans.org/mcl/.

  REFERENCES
      Stijn van Dongen, Graph Clustering by Flow Simulation.  PhD
      thesis, University of Utrecht, May 2000.
       http://www.library.uu.nl/digiarchief/dip/diss/1895620/inhoud.htm

      Stijn  van Dongen. A cluster algorithm for graphs.  Techni-
      cal Report INS-R0010, National Research Institute for Math-
      ematics and Computer Science in the Netherlands, Amsterdam,
      May 2000.
      http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z

      Stijn van  Dongen.  A  stochastic  uncoupling  process  for
      graphs.   Technical  Report  INS-R0011,  National  Research
      Institute for  Mathematics  and  Computer  Science  in  the
      Netherlands, Amsterdam, May 2000.
      http://www.cwi.nl/ftp/CWIreports/INS/INS-R0011.ps.Z

      Stijn van Dongen. Performance criteria for graph clustering
      and Markov cluster experiments. Technical Report INS-R0012,
      National  Research  Institute  for Mathematics and Computer
      Science in the Netherlands, Amsterdam, May 2000.
      http://www.cwi.nl/ftp/CWIreports/INS/INS-R0012.ps.Z

      Enright A.J., Van Dongen S., Ouzounis  C.A.   An  efficient
      algorithm  for  large-scale  detection of protein families,
      Nucleic Acids Research 30(7):1575-1584 (2002).

  NOTES
      This  page  was  generated   from   ZOEM   manual   macros,
      http://micans.org/zoem.  Both  html  and  roff pages can be
      created from the same source without having to bother  with
      all the usual conversion problems, while keeping some level
      of sophistication in the typesetting.



  mcl 1.004, 04-103          12 Apr 2004                     mcl(1)
