Parametric Alignment of Drosophila Genomes

Colin Dewey, Peter Huggins, Kevin Woods, Bernd Sturmfels, Lior Pachter

0

[Introduction] [Polytopes] [Cis-regulatory Elements] [Software]

Introduction

Here we present the alignment polytopes and software for the paper "Parametric Alignment of Drosophila Genomes."

Polytopes

Segments of the Drosophila melanogaster genome that have been parametrically aligned with orthologous segments of the Drosophila pseudoobscura genome may be visualized through this
UCSC Genome Browser custom track.

Cis-regulatory Elements

As part of our whole genome parametric alignment of Drosophila melanogaster and Drosophila pseudoobscura, we have produced alignment polytopes for the set of transcription factor binding sites available at the
Drosophila DNase I Footprint Database. To view the polytopes for segments overlapping these elements, load the UCSC Genome Browser custom track with the "FlyReg" track turned on.

Software

We have implemented two algorithms for the parametric alignment of two DNA sequences: the polytope-propagation algorithm, and the beneath-beyond algorithm. The polytope-propagation algorithm can compute the alignment polytope for two sequences for scoring schemes with any number of parameters. However, it is quite slow due to the fact that it repeatedly peforms expensive convex hull computations. The beneath-beyond algorithm is much faster, and its current implementation can handle scoring schemes of dimension up to five. On current hardware, the beneath-beyond implementation can produce the 3d alignment polytope of two sequences of length 150bp in less than one second. Implementations of both algorithms are made available here and have identical interfaces.

Binaries

Available here are Linux binaries for the two programs. Binaries for other platforms may be available on request.
To obtain help on running these programs, execute them on the command line with the "--help" option.

Source Code

The source code for the implementation of the polytope-propagation algorithm is available as part of Colin Dewey's source code package. Source code for the beneath-beyond implementation will be made available soon.

Scoring Matrices

The parametric alignment programs can take as input a symbolic scoring matrix. The default scoring scheme is the Jukes-Cantor scoring matrix with affine gap scores (a 3d model). Here are matrices for the phylogenetic models JC69, K80, K81, SYM.