Software and Supplementary Materials
This page contains an archive of supplementary materials from previous AWC publications, and software produced by AWC members that is available to download.
Index-free De Novo Assembly of Mixed Mitochondrial Genomes
The scripts are described in the paper "Index-free de novo assembly and deconvolution of mixed mitochondrial genomes" by McComish BJ, Hills SFK, Biggs P and Penny D (2010) (submitted to Genome Biology and Evolution).
Bennet McComish, Massey University, Palmerston North, Phone: +64 6 356 9099 extn 2569
index_free_assembly.rar (21 KB)
NTRFinder 1.0 finds Nested Tandem Repeats (NTRs). The program takes a FASTA file as input and displays the output in the text area of the main page. It was developed with Java JDK 1.5 and requires Java Runtime Environment 1.5 or later to run.
NTRFinder_v1_jar.zip (20,844 KB)
Please uninstall any existing version of Spectronet before installing this version. This new version of Spectronet includes Closest Tree and cluster-powered Fast Hadamard Transform algorithms, the Treeness Triangle for visualising sequence information, and several bug fixes. Consult the in-program help for more information on the new features.
Spectronet127.exe (815 KB)
Holland, B. R., K.T. Huber, D. Penny, and V. Moulton. 2005. The MinMax Squeeze: Guaranteeing a minimal tree for population data. Mol. Biol. Evol. 22:235-242.
and Pierson, M.J., R. Martinez-Arias, B.R. Holland, N.J. Gemmell, M.E. Hurles, and D. Penny. 2006. Deciphering Past Human Population Movements in Oceania: Provably Optimal Trees of 127 mtDNA Genomes. Mol. Biol. Evol. 23(10).
for finding lower bounds on the parsimony score of an alignment.
mms 1_5.zip (364 KB)
Two-States Triplet Markov — 2STM
2STM calculates Markov matrices from 2-state character data sets with 3 sequences simultaneously. The program reads 4-state character nucleotide data sets and outputs estimates of the three Markov matrices from the root to each taxon. 2STM also calculates the variability of estimates (bootstrap) and some simple statistics, such as composition of nucleotide characters, either in 4 states or 2 states.
The Windows executable, C code and example data can be obtained from this WinZip file:
2STM.zip (224 KB)
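As a rough illustration of the 4-state to 2-state step and the composition statistics mentioned above, here is a minimal Python sketch. It is not the 2STM code. The purine/pyrimidine (R/Y) grouping is an assumption, since the description does not say which 2-state recoding 2STM uses, and the function names `recode_ry` and `composition` are hypothetical.

```python
from collections import Counter

# Assumed 2-state grouping (not confirmed for 2STM):
# purines (A, G) -> R, pyrimidines (C, T) -> Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode_ry(seq):
    """Recode a nucleotide string to R/Y; any other symbol becomes '?'."""
    return "".join(RY_MAP.get(base, "?") for base in seq.upper())


def composition(seq, states):
    """Return the proportion of each state among the recognised characters."""
    counts = Counter(ch for ch in seq if ch in states)
    total = sum(counts.values())
    return {s: (counts[s] / total if total else 0.0) for s in states}
```

For example, `composition(recode_ry(seq), "RY")` gives the 2-state composition of a sequence, and `composition(seq.upper(), "ACGT")` gives the 4-state one.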
Site Strip Search — For site-stripping analyses of nucleotide alignments
This script selects subsets of taxa from a given alignment. The subsets are chosen according to the homoplasy of the sites. The resulting data set may be automatically sent to PAUP* or MrBayes for further analysis.
This beta version has been tested on Linux and Windows operating systems.
The Perl script can be downloaded as a zip or gzip archive.
site_strip_search.zip (79 KB)
site_strip_search.gz (79 KB)
This program implements the methods described in
Holland, B. R., G. Conner, K. Huber, V. Moulton. 2006. Imputing supertrees and supernetworks from quartets. Systematic Biology (to appear).
and Holland, B. R., G. Conner, K. T. Huber, V. Moulton. 2006. Imputing supertrees and supernetworks from quartets, (1 page abstract). In: 6th Workshop on Algorithms in Bioinformatics (WABI 2006) Eds B. Moret and P. Buchner, Lecture Notes in Bioinformatics. 4175:162.
Contact Details Barbara Holland
Genotyping Utilities Package
GenoTyper Rearranger (GTR) is a utility that converts the output AFLP data from genotyping programs (currently ABI's GeneMapper and SoftGenetics' GeneMarker) to various formats, allowing easier display and manipulation.
AFLP Replicate Difference Calculator is a utility that calculates the difference in some parameter (peak height, for example) between replicates in a table of AFLP data. The script takes two inputs: the AFLP data and a table that declares which samples are replicates of which other samples.
See the AFLP page for more details.
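The replicate comparison described above can be sketched in Python as follows. This is an illustrative sketch only, not the distributed script. The input layout (a dictionary of per-sample peak heights plus a list of replicate pairs), the use of the absolute difference, and the name `replicate_differences` are all assumptions.

```python
def replicate_differences(peak_heights, replicate_pairs):
    """Compute per-locus absolute differences between replicate samples.

    peak_heights: {sample_name: {locus: value}} (hypothetical layout)
    replicate_pairs: list of (sample_name, replicate_name) tuples
    Only loci scored in both samples of a pair are compared.
    """
    diffs = {}
    for sample, replicate in replicate_pairs:
        a = peak_heights[sample]
        b = peak_heights[replicate]
        shared = sorted(set(a) & set(b))
        diffs[(sample, replicate)] = {
            locus: abs(a[locus] - b[locus]) for locus in shared
        }
    return diffs
```

Calling it with two samples that share the loci `L100` and `L150` returns one dictionary of differences for that pair, keyed by locus.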
LineageSpecificSeqgen is an extension to the seq-gen program that allows generation of sequences with both changes in the proportion of variable sites and changes in the rate at which sites switch between being variable and invariable.
Ref: Shavit L, Penny D, Hendy MD, Holland BR: LineageSpecificSeqgen: generating sequence data with lineage-specific variation in the proportion of variable sites. BMC Evolutionary Biology 2008 (in press).
lineage_specific_seq_gen.zip (2,367 KB)
Supplementary Information for AWC projects
Supplementary Material for Treeness Triangle Paper
White W.T.1, Hills S.F.1, Gaddam R.1, Holland B.R.1, Martin W.2 and Penny D.1 (2007) Treeness Triangles: Visualizing the loss of phylogenetic signal (submitted to Molecular Biology and Evolution)
1. Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand
2. Institute for Botanik III, University of Düsseldorf, Düsseldorf, Germany. Email: [email protected]
|TT_Suppl_Information_6-3-2007_DP.doc (260 KB)||Supplementary tables and figures||MS Word|
|treeness_triangle_real_data.zip (3,915 KB)||The chloroplast data analyzed in the paper||WinZIP|
|Readme||How to set up and use the Treeness Triangle software||HTML|
|treeness_triangle_win32.zip (320 KB)||Treeness Triangle software for Windows users||WinZIP|
|treeness_triangle_mac_osx_10_3_9_tar.gz (561 KB)||Treeness Triangle software for Mac OS X users||tar gzip|
|treeness_triangle_source_tar.gz (51 KB)||Source code for the Treeness Triangle software (also needed by users of Linux and other Unix operating systems)||tar gzip|
Supplementary Material for Oceanic Paper
Pierson MJ, Martinez-Arias R, Holland BR, Gemmell NJ, Hurles ME, Penny D. (2006) Deciphering Past Human Population Movements in Oceania: Provably Optimal Trees of 127 mtDNA Genomes. Mol Biol Evol. 2006 Jul 19.
Melanie Pierson. Email: [email protected]
DOWNLOADS_Oceanic_supplementary.pdf (1,081 KB)
Professor David Penny Research Director, Professor of Theoretical Biology, Massey University - Palmerston North Phone: +64 6 350 5033 Fax: +64 6 350 5626 Email: [email protected]
|arthro35-taxonomy.doc (25 KB)||Taxonomy of 35-taxon dataset||MS Word|
|arthro25-taxonomy.doc (24 KB)||Taxonomy of 25-taxon dataset||MS Word|
Atheer Matroud Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Private Bag 11 222 Email: [email protected]
|AtheerNTR__Apr.pdf (172 KB)||AN ALGORITHM TO FIND NESTED TANDEM REPEATS|
Using Ancestral Sequences to Uncover Potential Gene Homologs
L. J. Collins1+, A. M. Poole1,2 and D. Penny1
Gene homologs between distantly related species can be difficult to identify. We test the idea that inferred ancestral sequences could aid in finding gene homologs. Ancestral sequences are inferred by aligning gene homologs on a known tree and estimating the most likely amino acid for each position at each node in that tree. BLAST and HMMER are used, separately and together with ancestral sequences, to search the genome sequence databases of Encephalitozoon cuniculi, Entamoeba histolytica and Giardia lamblia for RNase P protein homologs. The RNase P proteins Pop4, Pop1, Pop5 and Rpp21 have been reported in humans and at least two other eukaryotic species but have yet to be identified in the above genomes. Using ancestral sequence reconstruction (ASR) for these proteins, we successfully identified putative homologs from E. histolytica, G. lamblia and E. cuniculi. In some cases the use of ASR outperformed BLAST and HMMER. Overall, including ancestral sequences in searches with BLAST and/or HMMER was the most successful approach, recovering four potential RNase P protein gene homologs from G. lamblia and making this a useful technique in early homolog identification.
1 Allan Wilson Centre for Molecular Ecology and Evolution, Institute of Molecular BioSciences, Massey University, Private Bag 11222, Palmerston North, New Zealand.
2 Department of Molecular Biology and Functional Genomics, Stockholm University, SE-106 91, Stockholm, Sweden. + Corresponding Author: [email protected]
|Supp_FiguresB.doc (601 KB)||MS Word|
|Supp_Tables.doc (123 KB)||MS Word|
Avian Datasets and Supplementary Information
All Queries to Gillian: [email protected]
Kerryn E. Slack, Frédéric Delsuc, P.A. (Trish) McLenachan, Ulfur Arnason and David Penny (2006) Resolving the root of the avian mitogenomic tree by breaking up long branches. Molecular Phylogenetics and Evolution (in press)
|30b6r.12SLnt3ry||30 birds + all 6 reptile outgroups||DNA (1+2 NT, 3 RY, S+L NT)||Nexus||29/05/06|
|30b.12nt3rySLnt||30 birds||DNA (1+2 NT, 3 RY, S+L NT)||Nexus||29/05/06|
K. E. Slack, C. M Jones, T. Ando, G. L.(Abby) Harrison, E. Fordyce, U. Arnason and D. Penny (2006) Early penguin fossils, plus mitochondrial genomes, calibrate avian evolution. Molecular Biology and Evolution 23: 1144-1155.
|25b+6rept||25 birds + all 6 reptile outgroups||DNA (1+2 NT, 3 RY, S+L NT)|
Harrison G.L., McLenachan P.A., Phillips M.J., Slack K.E., Cooper A., and Penny, D. (2004) Four new avian mitochondrial genomes help get to basic evolutionary questions in the Late Cretaceous. Mol. Biol. Evol. 21(6):974-983.
|24b6r12n3rSLn||24 birds + all 6 reptile outgroups||DNA (1+2 NT, 3 RY, S+L NT)|
|Bird_Taxa_List.doc (33 KB)||MS Word||20/2/04|
|Revised_Bird_Annotations.doc (43 KB)||MS Word||20/2/04|
|Bird_Tables_Headings.doc (23 KB)||MS Word||20/2/04|
|Bird_Tables.xls (96 KB)||MS Excel||20/2/04|
|Reptile_Taxa_List.doc (23 KB)||MS Word||24/7/02|
|Revised_Reptile_Annotatns.doc (101 KB)||MS Word||9/9/02|
|Reptile_Tables_Headings.doc (23 KB)||MS Word||24/7/02|
|Reptile_Tables.xls (37 KB)||MS Excel||24/7/02|
|sup_inf_penguin_goose.zip (73 KB)||WinZip||20/2/04|
|sup_inf_penguin_goose_tar.gz (70 KB)||Tar gzip||20/2/04|
Slack K. E., Janke A., Penny D. & Arnason U. (2003). Two new avian mitochondrial genomes (penguin and goose) and a summary of bird and reptile mitogenomic features. Gene 302: 43-52.
|19bird||19 birds||Protein (Amino acid)||PHYLIP||26/8/02|
|19 birds + 2 crocodilians||Protein (Amino acid)||PHYLIP||26/8/02|
|19 birds + 2 lizards||Protein (Amino acid)||PHYLIP||26/8/02|
|19 birds + 2 turtles||Protein (Amino acid)||PHYLIP||26/8/02|
|19 birds + all 6 reptile outgroups||Protein (Amino acid)||PHYLIP||29/05/06|
|datasets.zip (400 KB)||ALL datasets||WinZip||20/05/06|
|datasets_tar.gz (336 KB)||Tar gzip||20/2/04|
All queries to Matt Phillips Email:
2006 Asian Institute in Statistical Genetics and Genomics at Jeju Islands, Korea
(lectures by Matt Phillips and Barbara Holland)
|SK_OverviewPhylogenetics.ppt (3,906 KB)||Overview of Phylogenetic methods and applications||MS Powerpoint|
|SK_DistanceBasedMethods.ppt (379 KB)||Distance Based Methods for estimating phylogenetic trees||MS Powerpoint|
|SK_Parsimony_and_search.ppt (606 KB)||Parsimony and searching tree-space||MS Powerpoint|
|SK_MaximumLikelihood.ppt (2,107 KB)||Maximum Likelihood and model selection||MS Powerpoint|
|SK_BayesianMethods.ppt (2,980 KB)||Bayesian Inference and Molecular Dating||MS Powerpoint|
|SK_BtspConsensusSupertrees.ppt (193 KB)||The bootstrap, consensus trees, and supertrees||MS Powerpoint|
|SK_SplitsGraphs.ppt (489 KB)||Exploring Phylogenetic Data with Splits-Graphs||MS Powerpoint|
|SK_DifficultProblems.ppt (3,559 KB)||Difficult problems ... and solutions||MS Powerpoint|