IMSEQ is a fast, PCR- and sequencing-error-aware tool for analyzing high-throughput data from recombined T-cell receptor or immunoglobulin gene sequencing experiments. It derives immune repertoires from sequencing data in FASTA / FASTQ format.
Please cite the following publication when you use IMSEQ:
Kuchenbecker L, Nienen M, Hecht J, Neumann AU, Babel N, Reinert K, Robinson PN. IMSEQ - a fast and error aware approach to immunogenetic sequence analysis. Bioinformatics. 2015;31(18):2963–71.
Getting IMSEQ
IMSEQ is freely available, and its source code is released under the GPLv2 license. Binaries and sources for the latest release, IMSEQ 1.1.0, are available from the following links:
- IMSEQ 1.1.0 for Linux (64 bit)
Statically linked binary supporting most 64 bit Linux systems
- IMSEQ 1.1.0 for OS X (64 bit)
Requires OS X 10.9 or newer
- IMSEQ 1.1.0 Sourcecode
A source tarball of IMSEQ 1.1.0.
Changes can be reviewed in the Changelog.
For the sources and build instructions of the current development version, see the GitHub repository.
Using IMSEQ
IMSEQ requires at least two input files and the specification of at least one output file. The segment reference is passed with the -ref switch, the output file with the -o switch, and the read file(s) as trailing arguments:
$ imseq -ref segment-reference.fa -o output-file.tsv input-file.fastq.gz
- segment-reference.fa
A file containing the V and J segment sequences for the gene and species being analyzed. The file must be in FASTA format and the sequence IDs must follow the IMSEQ FASTA ID Specification. Gene segment reference files for the human T-cell receptor alpha and beta chains as well as the immunoglobulin heavy, light-kappa and light-lambda genes are provided with IMSEQ.
- input-file.fastq.gz
One (single-end sequencing) or two (split paired-end sequencing) FASTA or FASTQ files with the input reads. If two files are specified, the first file is treated as the V-read file and the second as the V(D)J-read file. If only one file is specified, it has to contain V(D)J-reads. V-reads are parsed as forward sequence, V(D)J-reads are parsed as reverse complementary sequence. This behavior can be inverted using the -r switch.
- output-file.tsv
An output file where the detailed per-read analysis results are written in TSV format. See the output file specifications for more details.
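To make the read-orientation rule above concrete, here is a minimal shell sketch of what "reverse complementary" means for a single read. The helper name revcomp is hypothetical and is not part of IMSEQ. It only illustrates the transformation and assumes plain A/C/G/T input and that the standard rev and tr utilities are available.

```shell
# Hypothetical helper (not part of IMSEQ): print the reverse complement
# of a DNA string. Only A, C, G and T (either case) are complemented;
# other characters, such as N, are reversed but left unchanged.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

revcomp ACGTTG   # prints CAACGT
```

For a paired-end run, the two read files would be listed with the V-read file first and the V(D)J-read file second (the read file names here are placeholders):
$ imseq -ref segment-reference.fa -o output-file.tsv v-reads.fastq.gz vdj-reads.fastq.gz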
Further Reading
IMSEQ supports many more options for adapting its behaviour to different experimental conditions. All options are documented in the Manual, and the Tutorial guides you through a basic analysis of the example files that come with IMSEQ.