TODO for bcbio with Debian
==========================

The following is a list of packages that are used by at least one of the
workflows bcbio aggregates. Complete coverage of all these tools by Debian
seems increasingly unlikely - their packaging would take longer than the
tools' scientific lifespan, i.e. by the time they are packaged a new generation
of approaches will have emerged and/or the workflow as a whole will have mutated.

We have yet to find an answer to this challenge.  A series of repositories
of packages that are still waiting in the new queue for acceptance
into the distribution is available as

 deb http://med.functional.domains/r unstable main contrib non-free
 deb http://med.functional.domains/med unstable main contrib non-free
 deb http://med.functional.domains/python unstable main contrib non-free
 deb http://med.functional.domains/science unstable main contrib non-free

Anyway, for quite some people the answer is to just go and use Conda.  We should
also consider mixing Conda with Debian, though. But let us first see where
we are with bcbio:

Packages already accepted
-------------------------

These are no longer shown in this list; they have been added to the
dependencies of bcbio.

Packages already in the New Queue
---------------------------------

package pizzly (in new queue) (in crash space)
   https://github.com/pmelsted/pizzly
   https://salsa.debian.org/med-team/pizzly

package r-wasabi (in new queue) (in crash space)
   https://salsa.debian.org/r-pkg-team/r-other-wasabi

package libjs-scribl (in new queue) (in crash space)
  https://salsa.debian.org/med-team/libjs-scribl
  http://chmille4.github.io/Scribl/
    A surprisingly neat JavaScript library to show genomic features
       Needed for DFSG-compliance of seqcluster

package cthreadpool (in new queue) (in crash space)
   https://github.com/Pithikos/C-Thread-Pool/blob/master/thpool.h
      Needed by ViennaRNA

Packages just waiting for someone to address the last mile for the upload
-------------------------------------------------------------------------

package pyomo (in crash space)
  https://github.com/Pyomo/pyomo
  https://salsa.debian.org/science-team/pyomo
    build-deps on pyutilib
    needed by optitype
    Status: Builds if errors in the build-time tests are ignored

package pyutilib (in crash space)
  https://github.com/PyUtilib/pyutilib
  https://salsa.debian.org/python-team/modules/pyutilib
    needed by pyomo for testing
    Status: Builds if errors in the build-time tests are ignored

package python-seqcluster (in crash space)
   https://github.com/lpantano/seqcluster/
   https://salsa.debian.org/med-team/python-seqcluster
      The Python3 transition seems fine now.
      The package needs RNAfold from the ViennaRNA package for testing. Otherwise it stalls without an error message.
      Dependencies are incomplete / an update broke the one clean test run we had.
      When grepping for "conda install":
       ./doc/source/installation.rst:    ~/install/seqcluster/anaconda/conda install seqcluster seqbuster bedtools samtools pip nose numpy scipy pandas pyvcf -c bioconda
       ./Dockerfile:     conda install --yes -c conda-forge -c bioconda scipy seqcluster bedtools samtools pip nose setuptools -q && \
       ./.travis.yml:- conda install --yes -c conda-forge -c bioconda memory_profiler openjdk pysam pybedtools pandas numpy biopython progressbar2 pyyaml bedtools samtools mirtop viennarna -q
      So, for successful testing it seems we still need viennarna, which is not yet in the distribution. If we truly depend on it, then bcbio ends up in contrib.
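
To turn grep hits like the three above into a plain list of candidate dependencies, a small Python sketch can strip the options and channel names. The names conda_packages and GREP_HITS are made up for this illustration, and the parsing is deliberately naive (whitespace splitting, only -c/--channel treated as options that take a value), so this is a helper for a first overview and not a replacement for reading the upstream files.

```python
# Hits copied from the grep output above (seqcluster source tree).
GREP_HITS = [
    "./doc/source/installation.rst:    ~/install/seqcluster/anaconda/conda install "
    "seqcluster seqbuster bedtools samtools pip nose numpy scipy pandas pyvcf -c bioconda",
    "./Dockerfile:     conda install --yes -c conda-forge -c bioconda scipy seqcluster "
    "bedtools samtools pip nose setuptools -q && \\",
    "./.travis.yml:- conda install --yes -c conda-forge -c bioconda memory_profiler "
    "openjdk pysam pybedtools pandas numpy biopython progressbar2 pyyaml bedtools "
    "samtools mirtop viennarna -q",
]

# Tokens that end the conda command on a shell line.
TERMINATORS = {"&&", "||", ";", "|", "\\"}
# Options that take a value (assumption: only the channel option matters here).
OPTIONS_WITH_VALUE = {"-c", "--channel"}


def conda_packages(line):
    """Return the package names following 'conda install' on one line."""
    marker = "conda install"
    start = line.find(marker)
    if start == -1:
        return []
    packages = []
    skip_value = False
    for token in line[start + len(marker):].split():
        if token in TERMINATORS:
            break
        if skip_value:
            # This token is the value of the preceding -c/--channel.
            skip_value = False
        elif token in OPTIONS_WITH_VALUE:
            skip_value = True
        elif not token.startswith("-"):
            packages.append(token)
    return packages


# Union over all hits, sorted for a stable listing.
all_packages = sorted({pkg for hit in GREP_HITS for pkg in conda_packages(hit)})
print("\n".join(all_packages))
```

The resulting names are Conda package names; the corresponding Debian package names still have to be looked up by hand.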

     seqcluster accepted a patch to eliminate a "Free as in beer" graph library from its source tree at ./misc/js/amcharts.js. The source now references the library's online location on a CDN instead.


     Additional requirements:
      python3-dateutils (in new queue)
      python3-pytz ----> packaged but under the name python3-tz

     And there are R packages that should be recommended at least (from https://github.com/lpantano/seqcluster/blob/master/scripts/install_libraries.R)
       # We cover all CRAN packages
       # Most BioC packages are missing - even edgeR ? :
       apparently optional for bcbio: r-bioc-edgeR - missing
       apparently optional for bcbio: r-bioc-HTSFilter - Depends on DESeq and edgeR, which both in turn depend on locfit, which makes them all non-free
       apparently optional for bcbio: r-bioc-DEGreport - all pre-depends except edgeR are in new
       # novel packages
       apparently optional for bcbio: install.github("hbc/CHBUtils") - Uploaded to new as r-other-chbutils_0.1+git20171026.a226cee-1
       apparently optional for bcbio: r-bioc-isomiRs - (needs r-bioc-DEGreport) all pre-depends except edgeR are in new
       apparently optional for bcbio: install_github('rstudio/rmarkdown')  -- this may be the same we have from CRAN - not checked
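
How the locfit problem spreads through the R packages listed above can be sketched in Python as a walk over reverse dependencies. The edges in DEPENDS are only the ones stated or implied in these notes (not read from package metadata), and the function name affected_by_non_free is made up for this illustration. Note that under Debian policy a free package with a hard dependency on a non-free one goes to contrib, so "affected" here means "cannot be in main".

```python
from collections import deque

# Dependencies as stated in the notes above (assumption: incomplete on purpose).
DEPENDS = {
    "HTSFilter": ["DESeq", "edgeR"],
    "DESeq": ["locfit"],
    "edgeR": ["locfit"],
    "DEGreport": ["edgeR"],
    "isomiRs": ["DEGreport"],
}
NON_FREE = {"locfit"}


def affected_by_non_free(depends, non_free):
    """Return all packages that directly or indirectly depend on a non-free one."""
    # Invert the graph: for every package, who depends on it?
    reverse = {}
    for pkg, deps in depends.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(pkg)
    # Breadth-first walk from the non-free packages towards their dependants.
    affected = set()
    queue = deque(non_free)
    while queue:
        current = queue.popleft()
        for dependant in reverse.get(current, ()):
            if dependant not in affected:
                affected.add(dependant)
                queue.append(dependant)
    return affected


print(sorted(affected_by_non_free(DEPENDS, NON_FREE)))
```

With the edges above, every BioC package in the list is affected, which matches the impression that a single dependency decides the fate of the whole group.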

python3-dateutils (in new queue) (in crash space)
  https://salsa.debian.org/python-team/modules/python-dateutils
    Needed by python-seqcluster - actually I am not 100% sure about where in the latest version of seqcluster that is

package viennarna (in crash space)
    https://github.com/ViennaRNA/ViennaRNA
    https://salsa.debian.org/med-team/vienna-rna/blob/master/debian/changelog
       No idea about where exactly we stand with this.
       The package now has a custom license file - redistributable but non-free.
    Needed for testing seqcluster.
        Steffen revisits ViennaRNA, will update to new upstream version this weekend
           Using current master not release because of erroneous commits of autogenerated files in 2.4.13 release source tree.
           Manual removal of folders along instructions in d/copyright
    Python and Perl packages are not functional, yet, not needed for seqcluster
    Waiting for acceptance of cthreadpool.

package MultiQC (in crash space)
   https://salsa.debian.org/med-team/multiqc
   Not redistributable until we have removed the highcharts library from that source tree
     https://github.com/ewels/MultiQC/issues/800

Packages presumed easy to package
---------------------------------

package gjh_asl_json
  https://github.com/ghackebeil/gjh_asl_json
    needed by pyomo for testing

package hts_nim_tools
   https://github.com/brentp/hts-nim-tools
   Likely addressed with the other efforts on nim behind mosdepth

package "asl solver"
  http://www.ampl.com/netlib/ampl/solvers.tgz
    needed by gjh_asl_json
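
The "needed by" notes around pyomo imply an order in which the packages have to reach the archive. A minimal Python sketch using graphlib.TopologicalSorter from the standard library (Python 3.9 or later) makes that order explicit. The NEEDS mapping only restates the relations written down in this file and treats build-time, test-time and run-time needs alike, which is a simplification.

```python
from graphlib import TopologicalSorter

# package -> packages it needs first, as noted in this file.
NEEDS = {
    "optitype": {"pyomo"},
    "pyomo": {"pyutilib", "gjh_asl_json"},
    "gjh_asl_json": {"asl solver"},
}

# static_order() yields every package after all the packages it needs.
upload_order = list(TopologicalSorter(NEEDS).static_order())

for position, package in enumerate(upload_order, start=1):
    print(position, package)
```

The order among independent packages (here pyutilib and "asl solver") is not fixed; only the relative order along each chain is guaranteed.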

r-bioc-edgeR
   https://tracker.debian.org/pkg/r-bioc-edger
        This was removed from buster in February - a bit of a bummer, I tend to think
   https://salsa.debian.org/r-pkg-team/r-bioc-edger
   https://bioconductor.org/packages/release/bioc/html/edgeR.html
     Needed by seqcluster

r-bioc-HTSFilter
   https://bioconductor.org/packages/release/bioc/html/HTSFilter.html
     Needed by seqcluster

r-bioc-DEGreport
   https://bioconductor.org/packages/release/bioc/html/DEGreport.html
     Needed by seqcluster


Packages of importance for basic NGS workflows with motivation beyond bcbio
---------------------------------------------------------------------------

package mosdepth
  https://salsa.debian.org/med-team/mosdepth
  Several nim libraries are needed. All of them have been addressed on https://salsa.debian.org/nim-team,
  and the first of them is already in the new queue.

package oncofuse
  https://salsa.debian.org/med-team/oncofuse
    Stuck over typical Java issue

package optitype
  https://salsa.debian.org/med-team/optitype (rudimentary)
    Stuck over too many dependencies

package vcfanno
  https://salsa.debian.org/med-team/vcfanno
  Not fun because of many Go packages that are still missing,
  "biogo" being one of them.
    Stuck over too many dependencies



Difficult to package to the degree that one is tempted to use conda in the meantime
-----------------------------------------------------------------------------------

package snpEff
  http://snpeff.sourceforge.net/
  https://salsa.debian.org/med-team/snpeff/blob/master/debian/changelog
  The packaging needs help with its dependencies:
     apfloat 1.6.3:
       https://github.com/mtommila/apfloat
       http://www.apfloat.org/apfloat_java/maven.html
     charts4j 1.3:
       https://mvnrepository.com/artifact/com.googlecode.charts4j/charts4j/1.3
     com.typesafe.akka:akka-actor 2.0.1
     distlib 0.9.1:
       https://sourceforge.net/projects/statdistlib/?source=navbar

package qsignature
   https://sourceforge.net/p/adamajava/wiki/Home/
   https://sourceforge.net/p/adamajava/wiki/qSignature/
   http://sourceforge.net/projects/adamajava/files/qsignature.tar.bz2/download
   https://sourceforge.net/p/adamajava/code/HEAD/tree/trunk/adamajava/qsignature/
   Now at https://github.com/AdamaJava/adamajava

add hisat2_extract_splice_sites.py to the hisat2 package
  -> Done in Git in preparation for 2.1.0-3
  -> Added arch-indep python3-hisat2 package

package sailfish
   https://salsa.debian.org/med-team/sailfish
    FTBFS


package vt
   https://github.com/atks/vt

package Rmath
   https://github.com/atks/Rmath
     needed by vt

package fgbio
  https://github.com/fulcrumgenomics/fgbio
  Anyone much into Scala?

package manta
  https://github.com/Illumina/manta/
  https://salsa.debian.org/med-team/manta

package strelka
  https://github.com/Illumina/strelka/releases
  - Steffen works on this one
  # Strelka self-compiles htslib with these options - we should cross-check with what we are doing
  ## Addresss sanitizer build options for htslib/samtools
  #set(HTSLIB_CFLAGS '-O0 -g -fsanitize=address -fno-omit-frame-pointer -fno-optimize-sibling-calls')
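
For the cross-check mentioned above, a rough Python sketch can list which compiler flags differ between two builds. STRELKA_ASAN_CFLAGS is the (commented-out) AddressSanitizer line quoted above; EXAMPLE_DEBIAN_CFLAGS is a pure placeholder and not the flags our htslib package is actually built with, so it has to be replaced by the real values before drawing any conclusion. The function name flag_diff is made up for this illustration.

```python
# Flags from the commented-out AddressSanitizer line quoted above.
STRELKA_ASAN_CFLAGS = (
    "-O0 -g -fsanitize=address -fno-omit-frame-pointer -fno-optimize-sibling-calls"
)
# Placeholder only: substitute the flags of the Debian htslib build here.
EXAMPLE_DEBIAN_CFLAGS = "-g -O2"


def flag_diff(upstream_flags, our_flags):
    """Return (flags only upstream uses, flags only we use), each sorted."""
    upstream = set(upstream_flags.split())
    ours = set(our_flags.split())
    return sorted(upstream - ours), sorted(ours - upstream)


only_upstream, only_ours = flag_diff(STRELKA_ASAN_CFLAGS, EXAMPLE_DEBIAN_CFLAGS)
print("only upstream:", " ".join(only_upstream))
print("only ours:    ", " ".join(only_ours))
```

The comparison is purely textual; flags with the same effect but a different spelling show up as differences.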


# An easy one:
package qualimap
  https://salsa.debian.org/med-team/qualimap
  Unfortunately, some of the Java code that is needed as a pre-dependency is simply lost; see
     https://alioth-lists.debian.net/pipermail/debian-med-packaging/2016-January/038322.html


Packages no longer used by bcbio
--------------------------------

package lumpy
  https://github.com/arq5x/lumpy-sv

