Copyright and installation instructions are contained within the package. We strongly recommend that you consult the tutorial before attempting to use IDP.
IDP 0.1.2: Please see the release notes for more information.
IDP 0.1.2 - Release Notes
IDP integrates short reads (e.g. Illumina data) and long reads (e.g. PacBio data) to identify gene isoforms (transcripts) from a transcriptome (see Figure above).
- One input of IDP is the short-read RNA-seq results: junctions (BED file) AND alignments of short reads (SAM file).
Most RNA-seq alignment tools, such as SpliceMap and TopHat, can output these two files.
- The other input is the long reads: raw sequences (FASTA file) OR alignments of long reads (PSL file from BLAT, or GPD file).
Error-corrected long reads from PacBio data are preferred. LSC is our default error-correction tool.
- The IDP outputs are the gene isoform identifications and the quantification of genes and gene isoforms. The hESC transcriptome (H1 cell line) is the first one identified by this method. For more details on this transcriptome, please see its homepage http://www.healthcare.uiowa.edu/labs/au/IDP/hESC.asp and our paper "Characterization of the human ESC transcriptome by hybrid sequencing" [preprint].
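To make the short-read junction input more concrete, here is a minimal Python sketch of reading one junction line. It assumes the 12-column, two-block BED layout that TopHat writes to its junctions file, where each block is the read overhang on one side of the junction. The function name is hypothetical, this is not IDP's own parser, and other tools may lay out their junction files differently.

```python
def junction_from_bed_line(line):
    """Return (chrom, intron_start, intron_end, strand, score) for one
    two-block BED12 junction line. Coordinates are 0-based, half-open,
    following the BED convention."""
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    chrom_start = int(fields[1])
    score = int(fields[4])        # in TopHat output: reads spanning the junction
    strand = fields[5]
    # blockSizes and blockStarts are comma-separated, often with a trailing comma
    block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    # The intron lies between the end of block 1 and the start of block 2
    intron_start = chrom_start + block_sizes[0]
    intron_end = chrom_start + block_starts[1]
    return chrom, intron_start, intron_end, strand, score


example = "chr1\t100\t300\tJUNC1\t12\t+\t100\t300\t255,0,0\t2\t50,40\t0,160"
print(junction_from_bed_line(example))
```

In this example the first block covers positions 100 to 150 and the second block starts at 260, so the intron spans 150 to 260.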
Changes in version 0.1.2
- The parameter read_length is now set in the run.cfg file
- Removed dependencies on the Python argparse and Cython modules
- Added a python_path option in the cfg file in case the default "/usr/bin/python" is not suitable
- Improved handling of I/D in CIGAR strings as insertions/deletions in exon regions
- Deleted intermediate files generated in MLT_MT.py and parseSAM_MT.py scripts
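
The CIGAR item above can be illustrated with a short Python sketch. It is not IDP's actual code. It only shows one way to derive exon blocks from a SAM alignment under the SAM specification's operator semantics: N (skipped region) marks an intron, D (deletion) consumes reference bases and so stays inside the exon, and I (insertion) consumes no reference bases. The function name exon_blocks is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operator from the SAM specification
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def exon_blocks(pos, cigar):
    """Return exon blocks as (start, end) pairs, 1-based inclusive,
    for an alignment whose leftmost mapped position is `pos` (SAM POS)."""
    blocks = []
    start = pos   # first reference base of the current exon block
    ref = pos     # next reference base to be consumed
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "MD=X":
            # Matches and deletions advance along the reference within the exon
            ref += length
        elif op == "N":
            # A skipped region closes the current exon and opens the next one
            if ref > start:
                blocks.append((start, ref - 1))
            ref += length
            start = ref
        # I, S, H and P consume no reference bases, so coordinates are unchanged
    if ref > start:
        blocks.append((start, ref - 1))
    return blocks


print(exon_blocks(100, "10M2I5M3D5M100N20M"))
```

Here the 2-base insertion does not lengthen the first exon, while the 3-base deletion does, so the first block covers 23 reference bases (10 + 5 + 3 + 5) before the 100-base intron.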