Copyright and installation instructions are contained within the package. We strongly recommend that you consult the tutorial before attempting to use IDP.
IDP 0.1.9: Please see the release notes for more information.
IDP 0.1.9 - Release Notes
IDP integrates short reads (e.g. Illumina data) and long reads (e.g. PacBio data) to identify gene isoforms (transcripts) from the transcriptome (see Figure above).
- One input of IDP is the short-read RNA-seq results: junctions (bed file) AND alignments of short reads (sam file).
Most RNA-seq tools, such as SpliceMap and TopHat, can output these two files.
- The other input is the long reads: raw sequences (FASTA file) OR alignments of long reads (PSL file by BLAT or GPD file).
Error-corrected long reads from PacBio data are preferred. LSC is our default error-correction tool.
- The IDP outputs are the gene isoform identifications and the quantification of genes and gene isoforms. The hESC transcriptome (H1 cell line) is the first one identified by this method. For more details on this transcriptome, please see its homepage http://www.healthcare.uiowa.edu/labs/au/IDP/hESC.asp and our paper "Characterization of the human ESC transcriptome by hybrid sequencing" [preprint].
Changes in version 0.1.9
- We added support for setting a hard cutoff on the fraction of total gene expression an isoform must reach to be considered a candidate (min_isoform_fraction). This is mutually exclusive with FPR, so please use one or the other. We also added support for a hard RPKM cutoff on isoform expression for an isoform to be considered a candidate (min_isoform_rpkm).
- We added the utility IDP_merge_genepred.py to facilitate the comparison of multiple IDP runs. Since each prediction can create uniquely named loci, this will merge them into a comparable format.
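The two hard cutoffs described above can be illustrated with a minimal sketch. This is not IDP's own code: the function `filter_isoforms`, its data layout (a dict of isoform name to RPKM for one gene), and the choice to treat a value exactly at the cutoff as passing are all assumptions made for illustration. Only the parameter names min_isoform_fraction and min_isoform_rpkm come from the release notes.

```python
def filter_isoforms(isoform_rpkm, min_isoform_fraction=None, min_isoform_rpkm=None):
    """Return the isoforms of one gene that pass the given hard cutoffs.

    Hypothetical helper, not part of IDP.
    isoform_rpkm: dict mapping isoform name -> RPKM, all from the same gene.
    """
    # Total gene expression is taken here as the sum over its isoforms (assumption).
    total = sum(isoform_rpkm.values())
    kept = {}
    for name, rpkm in isoform_rpkm.items():
        fraction = rpkm / total if total > 0 else 0.0
        # Drop isoforms contributing too small a share of the gene's expression.
        if min_isoform_fraction is not None and fraction < min_isoform_fraction:
            continue
        # Drop isoforms whose absolute expression is below the RPKM cutoff.
        if min_isoform_rpkm is not None and rpkm < min_isoform_rpkm:
            continue
        kept[name] = rpkm
    return kept


# Made-up example values for a single gene with three candidate isoforms.
gene = {"iso_A": 8.0, "iso_B": 1.5, "iso_C": 0.5}
print(filter_isoforms(gene, min_isoform_fraction=0.1))  # iso_C (5% of the gene) is dropped
print(filter_isoforms(gene, min_isoform_rpkm=2.0))      # only iso_A remains
```

How IDP itself combines these cutoffs internally is not stated here; please consult the tutorial for the actual configuration syntax.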