PAUP* Test Version Downloads
Please join the (low-traffic) Google group paup-announce
to receive announcements of updates by email.
Update on status of PAUP*:
My original plan was to release PAUP* GUI versions requiring payment, in order to generate some revenue for
supporting the program (I am paying out of my own pocket to host the web site and to buy the hardware
and software needed for development). More recently, I have had trouble finding the motivation to work on
the update to the Mac GUI, which requires a complete rewrite of the interface code. (It's not just a 32-bit
vs. 64-bit issue--the problem is that Apple dropped support for the Carbon framework which is what I had used
for the current version.)
Rather than continuing this cycle of issuing expiring releases and then creating problems for people when
they expire, I have now decided to remove the expiration from all versions--both GUI and command line.
I am still on course to release an open source version later this year, but in the meantime you can
continue downloading the precompiled binaries here.
Versions can be updated either through the auto-updating system (GUI version) or by returning to the
download site (both versions).
Please report any problems you encounter with this version, no matter how minor
(dave@phylosolutions.com).
Macintosh OS X (GUI version; universal binaries):
Note: After extracting the downloaded file (if necessary), you may need to right-click the icon
and choose Open the first time (to get around the warning that the application is from an unknown developer).
- Mountain Lion and later (OS X 10.8+; compiled with icc): PAUP_dev_icc.zip
IMPORTANT NOTICE: The Mac GUI version will NOT run on MacOS 10.15 (Catalina) and later.
If you want to run the GUI version on MacOS, you must have MacOS 10.14 or earlier (either as your primary
operating system or running as a virtual guest OS). I hope to develop a GUI version that will run on Catalina,
but it is a huge amount of work, and will take some time. The command-line version (see below) is compiled
for both 32- and 64-bit, and runs fine on Catalina.
Note regarding font issues on OS X: Some users have reported that PAUP* refuses to launch
(particularly after upgrading to El Capitan), complaining that the "Required font "MesloLGS-Regular" was not
found in the application package." This probably indicates corruption in the system font tables rather than
a problem with the PAUP* application bundle. A fix that has consistently worked is to kill the "fontd" process
using the Activity Monitor program. This process will be automatically restarted and will rebuild the font
tables, and PAUP* will now launch successfully.
Windows GUI version:
This version requires Windows 7 or later (either 32- or 64-bit).
Command-line binaries:
Note: After gunzip'ing the downloaded file (if necessary), you may need to make it
executable using a command like
chmod a+x paup4a168_osx.
- Mac OS X:
- Windows:
- No separate command-line version is needed. Just download and run the installer for the GUI version.
A console (command-line) version will also be installed, and can be run by entering "paup" at a command prompt.
- Intel Linux:
Choose a flavor that works for you. These binaries should work on other Linux distributions as well,
but let me know (info@phylosolutions.com) if you can't get them to run.
Also, note that these command-line versions need the Fortran runtime library to be available. If the program does
not start due to a missing library, try installing the GFortran compiler (or something that will install
the needed library, such as R, LAPACK, or Octave).
Release notes
PAUP* test-version release notes
The feature set for the official release version is now nearly complete (although some new features need to be
documented). Please report any problems you encounter with this version, no matter how
minor (dave@phylosolutions.com).
Update: PAUP* seems stable enough now that ALL versions no longer expire. Official open-source release of the
program is getting very close.
Version 4.0a169 (current)
New feature:
- The 'qAge' command for estimating speciation (divergence) times under the multispecies coalescent model is now
fully operational (see preprint and
tutorial).
Other changes/fixes:
- Fixed memory corruption causing crashes and misbehavior of the "Delete/Restore Taxa" and "Define outgroup"
dialog boxes.
- Fixed inconsequential (but potentially confusing) output glitch in command-line-equivalent output string
shown after closing SVDQuartets dialog box.
Version 4.0a168
Two potentially serious bugs fixed:
- Optimization of tree likelihoods under the Mkv model was compromised due to a problem in the calculation of the
likelihood function and its derivatives when the number of states varied over characters and the
"mkStateSpace=variable" option (the default) was in effect. I suspect that this bug was introduced
at version 4.0a162 but it could have been earlier or later. If you are using likelihood under the Mkv model,
your data set contains only variable characters, and the number of states for each character is not constant,
you should rerun your analyses.
- Likelihood calculations for non-DNA models were sometimes incorrect in version 4.0a166/167. Please rerun any
maximum-likelihood analyses involving amino-acid or "generic" models.
Other changes/fixes:
- qAge method for quartet-based estimation of node ages under multispecies coalescent is now working.
- More info on CPU processor and configuration is output at startup.
- Fixed possible crash during "taxcolor" command (and related commands).
- Windows version correctly determines the number of processors on machines with more than 64 cores.
- The requirement for Windows 7 or later is now strictly enforced (it's too hard to fix bugs affecting more
recent Windows versions and also maintain backward compatibility to versions that are >13 years old).
- Consensus trees can now be saved to treefiles in formats other than Nexus (use 'format' option).
- Multispecies coalescent simulator now supports more than one sampled individual per species.
- Fixed hang outputting branch-length table for simulation when branch lengths for input trees contain more than
11 decimal digits.
- Fixed crash during calculation of single-site likelihoods under Mkv model.
- Fixed extraneous zero-byte being included at end of file for files imported into Mac editor.
- Fixed possible cosmetic glitch in tree output when length of branch connecting ingroup to outgroup
node exceeded 1
Version 4.0a167
This build just extends the expiration date of the previous version (affects GUI versions only).
I will be posting a newer version very soon.
Version 4.0a166
Changes/fixes:
- Fixed SVDQuartets crash if number of threads requested was larger than number of taxa.
- Fixed failure to respect 'scoreDigits' option if given on LSet rather than LScores command.
- Added "saveAsRooted" option to ConTree command. When the "treefile" option is requested, the "saveAsRooted=y"
option can be added to request that the consensus of the input trees be saved as rooted rather than unrooted trees
(e.g., to simplify plotting with other programs like FigTree).
- Allow maximum likelihood analyses with completely missing states for DNA analyses (where necessary, an arbitrary
tiny value is used for the frequency of these states to avoid numerical problems, which was already being done for
other data types).
- Fixed possible crash when maximum-likelihood distances were requested (especially after doing a previous
tree-based maximum-likelihood analysis).
- Fixed failure to apply changes to font size in title and tree number made in "Font/size" windows.
- Allow rooted two-tip trees.
- Fixed crash if the "distance" setting was changed from a model-based distance to a standard distance and
certain operations were requested (e.g., treefile saving) that used the new distance without proper reinitialization.
- Incorrect results were output if a model partition was active when the automated model selection (AutoModel)
command was issued. Now, an error message is issued, and the user can either skip the command or turn off
model partitioning (LSet mpartition=none).
Version 4.0a165
Changes/fixes:
- Fixed problem with two successive likelihood score evaluations (e.g., lscores followed by describetrees)
with this option combination: molecular clock, G+I models, and gamma shape and pinvar parameters estimated
rather than fixed.
- Fixed incorrect labeling of trees in output files from tree-to-tree distances when these distances were
calculated for a smaller subset of all trees in memory.
- Fixed possible memory corruption when reading in data (may explain some nonreproducible crashes).
Version 4.0a164
Changes/fixes:
- Fixed failure to properly import simple or tab-delimited text files when the "interleaved" option was requested
(regardless of whether the file was actually in interleaved format to begin with).
- Fixed several potential memory corruption issues that may have been responsible for startup crashes on some
hardware/OS combinations (I have not been able to reproduce these crashes myself).
- Fixed crash when beginsim/endsim loop was executed from within a Python block in non-GUI versions.
- Fixed crash if trees are obtained by neighbor-joining but current global optimality criterion is parsimony,
and resulting tree is saved with "root=yes" or "brlens" options.
- Fixed crash during multithreaded likelihood calculation if the ratio of number of sites to number of threads is
small enough to cause some threads to be assigned no patterns.
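The last item above is easy to reproduce in any scheme that splits site patterns into fixed-size chunks. The sketch below is an illustration only, not PAUP*'s actual scheduler: `chunk_patterns` and `busy_threads` are hypothetical helpers. Each thread is given ceil(n/threads) patterns, so trailing threads can receive an empty range that the caller has to skip.

```python
def chunk_patterns(n_patterns, n_threads):
    """Split pattern indices 0..n_patterns-1 into one half-open range per thread.

    Hypothetical helper for illustration; PAUP*'s real work division may differ.
    """
    if n_patterns < 0 or n_threads < 1:
        raise ValueError("need n_patterns >= 0 and n_threads >= 1")
    size = -(-n_patterns // n_threads)  # ceiling division
    ranges = []
    for t in range(n_threads):
        start = min(t * size, n_patterns)
        end = min(start + size, n_patterns)
        ranges.append((start, end))  # start == end means this thread has no work
    return ranges


def busy_threads(ranges):
    """Keep only the ranges that contain at least one pattern."""
    return [(start, end) for start, end in ranges if end > start]


if __name__ == "__main__":
    ranges = chunk_patterns(5, 4)
    print(ranges)                # the fourth thread gets an empty range
    print(busy_threads(ranges))  # only three threads have work
```

With 5 patterns and 4 threads the chunk size is 2, so the fourth range is empty. Code that assumes every thread owns at least one pattern would fail on that range.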
Version 4.0a163
This version fixes an annoying bug in the embedded text editor of the macOS version (files were sometimes not being
saved after making changes despite nothing seeming to go wrong). I first noticed the problem after upgrading to
macOS 10.13 (High Sierra), but the bug may have affected other OS versions as well.
The only other major fix involves bootstrapping with SVDQuartets on very large data sets.
Changes:
- Fixed memory corruption leading to crashes with SVDQuartets bootstrapping on very large data sets.
- R/Y recoded DNA sequences are now handled correctly in SVDQuartets. E.g., states can be A and C, or R and Y
(recoding into states 0 and 1 is no longer necessary).
- Fixed issues with saving files from the editor in the macOS version.
- Fixed broken "ratematrix" setting in garli.conf file.
- "Save trees to file" and "Matrix representation" items in the Trees menu are no longer disabled if an
editor window is frontmost (i.e., now consistent with other menu items).
- Fixed incorrect handling of ambiguities when calculating distances between a pair of sequences for which
some states were not observed at at least one site containing an unambiguous state in both sequences
(unlikely, but can happen with extremely short sequences or when sequences containing a high proportion of
missing data have little overlap in non-missing regions).
Version 4.0a162
New features:
- Conditioning of likelihoods on character variability is now supported for all models (rather than just Mkv models
for "standard" data). This setting is needed to support calculations for SNP data where constant
sites have been excluded from the matrix, leading to an acquisition bias that causes overestimation of the
substitution rates. The option is "lset condvar=no|yes|auto": "no" = no conditioning,
"yes" = always condition, and "auto" = condition if no constant sites are present in the matrix.
- SSE vectorization is now available for double-precision likelihood calculations (in addition to
single-precision previously available). The vectorization and
precision options are now completely independent. Use "lset precision=single|double" to request
single- vs. double-precision calculation, and "lset vectorize=no|yes" to turn vectorization off
or on (but ordinarily there would be no reason to turn vectorization off).
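The `condvar=auto` rule in the list above amounts to a check for constant sites. A minimal sketch of that decision follows, assuming the character matrix is given as equal-length strings (one per taxon) and that a site is constant only when every taxon shows exactly the same symbol. `has_constant_site` and `use_conditioning` are illustrative names, not PAUP* functions, and PAUP*'s treatment of missing or ambiguous states may differ.

```python
def has_constant_site(rows):
    """True if some column shows the same symbol in every row.

    Simplification: missing and ambiguous states are compared as plain symbols.
    """
    return any(len(set(column)) == 1 for column in zip(*rows))


def use_conditioning(condvar, rows):
    """Mirror the three documented settings: 'no', 'yes', and 'auto'."""
    if condvar == "no":
        return False
    if condvar == "yes":
        return True
    if condvar == "auto":
        # Condition only when the matrix contains no constant sites.
        return not has_constant_site(rows)
    raise ValueError(f"unknown condvar setting: {condvar!r}")


if __name__ == "__main__":
    snps_only = ["AG", "GT", "AT"]         # every column varies
    with_constant = ["AGC", "GTC", "ATC"]  # third column is constant
    print(use_conditioning("auto", snps_only))
    print(use_conditioning("auto", with_constant))
```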
Bugs fixed:
- Fixed freezing and/or obviously invalid likelihood score calculation for data sets containing relatively large
numbers of taxa, especially using the Windows version.
- Fixed failure to show mean, standard deviation, g1, and g2 statistics for tree-score histograms and bar charts
when tree scores were integer values (i.e., unweighted parsimony or weighted parsimony with all-integer weights).
- Restored non-empty default settings for some filename options.
- Fixed possible crash when saving N best trees during a search and when some saved trees were not binary.
- Fixed glitch in output of single-site likelihoods with Mkv model.
- Fixed failure to properly import files in "simple text" and "tab-delimited text" formats.
- Restored "LG" to menu for amino-acid models in the Likelihood Settings dialog.
- Fixed a Python-block parsing problem if the Python code included a multiline string containing a line starting
with "end;" (e.g., if the string contained a Nexus block). Now, the "end;" terminating a Python block must
begin in column one of the input text (no leading whitespace), and any line beginning with "end;" in the Python
code must be indented by at least one column.
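The column-one rule for terminating a Python block can be sketched as a simple line scan. This is an illustration of the rule as stated above, not PAUP*'s parser: `split_python_block` is a hypothetical helper, and the case-insensitive comparison is an assumption.

```python
def split_python_block(lines):
    """Return (python_code_lines, remaining_lines).

    Assumption for illustration: the terminator is the first line whose first
    four characters are 'end;' (compared case-insensitively). Indented
    occurrences of 'end;' are treated as ordinary Python code.
    """
    for i, line in enumerate(lines):
        if line[:4].lower() == "end;":
            return lines[:i], lines[i + 1:]
    raise ValueError("no 'end;' found in column one")


if __name__ == "__main__":
    block = [
        'nexus = """',
        'begin paup;',
        '    end;',       # indented, so it stays inside the Python string
        '"""',
        'print(nexus)',
        'end;',           # column one: terminates the Python block
        'lscores;',
    ]
    code, rest = split_python_block(block)
    print(len(code), rest)
```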
Other changes:
- The "NST=Mkv" option is no longer allowed. The 'condVar' option (see above) should now be used to specify
whether likelihoods are conditioned on character variability.
- The 'remainder' keyword is now allowed when defining character and taxon partitions. For example, the two
definitions below are equivalent:
- charpartition oddeven = odd:1-.\2, even:2-.\2;
- charpartition oddeven = odd:1-.\2, even:remainder;
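The equivalence of the two definitions can be checked with a small sketch, reading `1-.\2` as "every second character from 1 through the last". `expand_partition` is a hypothetical helper written for this illustration and is not part of PAUP*.

```python
def expand_partition(n_chars, subsets):
    """Expand a partition definition into explicit 1-based character lists.

    `subsets` maps a subset name to an iterable of character numbers or to
    the string "remainder". Hypothetical helper, not PAUP*'s parser.
    """
    assigned = set()
    result = {}
    remainder_name = None
    for name, members in subsets.items():
        if isinstance(members, str) and members == "remainder":
            if remainder_name is not None:
                raise ValueError("only one subset may use 'remainder'")
            remainder_name = name
            result[name] = []  # placeholder keeps the declared order
        else:
            chars = sorted(set(members))
            result[name] = chars
            assigned.update(chars)
    if remainder_name is not None:
        # Everything not claimed by an explicit subset goes to the remainder.
        result[remainder_name] = [c for c in range(1, n_chars + 1) if c not in assigned]
    return result


if __name__ == "__main__":
    n = 7
    explicit = expand_partition(n, {"odd": range(1, n + 1, 2), "even": range(2, n + 1, 2)})
    shorthand = expand_partition(n, {"odd": range(1, n + 1, 2), "even": "remainder"})
    print(explicit == shorthand)
```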
Version 4.0a161
This version fixes a significant bug in the previous version (build 160), which only existed
for one day. Do not continue using 4a160!
Please read the release notes for version 4a160 as well.
Bug fixed:
- Fixed incorrect handling of hash-table collisions. The most obvious consequence was failure to
read treefiles containing translation tables correctly. It is possible that other
functionality was also affected.
Version 4.0a160
This version fixes several bugs and adds a couple of enhancements; most changes involve under-the-hood work in
preparation for the release of version 5.0.
The feature set for the official release version is now nearly complete (although some new features need to be
documented). Unless major problems are found with this version, the official launch of the next major release (5.0)
will occur sometime in early 2018. Please report any problems you encounter with this version, no matter how
minor (dave@phylosolutions.com).
Bugs fixed:
- Fixed an issue that could cause crashing or incorrect parsimony calculations when several stepmatrices
were simultaneously active.
- Fixed a problem with accessing an out-of-bounds memory location while reading trees after one or more
taxa had been deleted. Although the data read from this address were never actually used, the problem could
nonetheless cause a memory violation (and crash) on some operating system versions (Windows 8 and later, and
possibly some Linux systems as well).
- Fixed a freeze when the "lset estall;" command was issued with partitioned models when state frequencies were being
estimated by ML (rather than using empirical frequencies).
- Fixed problems with saving files from PAUP's editor in macOS 10.13 (and possibly other macOS versions newer
than 10.8). These problems included bogus complaints about resource forks not being saved, and spurious
notifications and queries about files being modified outside of PAUP's editor.
- Fixed problems with CStatus output formatting with non-integer character weights.
- Fixed a crash (rather than issuing an error message) if the "multilocus" option was used for simulation
without the "mscoalescent" option being on, when true trees had not been assigned to each locus.
- Fixed incorrect button titles in the "Reweight Characters" dialog box for the Windows version.
- Fixed a problem with "automodel" output when input sequences were extremely long (leading to likelihood
scores requiring many digits).
- Fixed failure to calculate likelihoods on highly unresolved trees due to underflow problems.
- Fixed some glitches in the Windows-version Print/Preview Trees window.
- Fixed cmd-backspace shortcut for newer macOS versions (delete current command).
- Fixed incorrect explanation about why SSE vectorization was suppressed when double-precision calculation
was requested.
- Fixed problems with likelihood calculation when switching between likelihood models with site-specific rates
versus equal or gamma rates, or when each observed site pattern occurs in only one partition subset.
Other changes:
- Quartet weights are stored in a more memory efficient (but slower) way for SVDQuartets when the number of taxa is
large enough that full storage and direct-lookup of quartet weights requires a huge memory allocation (the default
threshold for switching to the slower method is 1GB). This change allows SVDQuartets to be run for some data sets
for which memory allocation was previously failing.
- Several slow components of the QFM quartet-search algorithm are now multithreaded, providing 2-3 fold speedups on
large benchmarks using 4 threads on a machine with 4 physical CPUs.
- Progress-bar-like output is now shown in non-GUI versions, crudely emulating the progress window in GUI versions.
- One or more transition rates can be forced to zero when estimating exchangeability parameters in generic k-state
GTR models (technically, this is also allowed for DNA models but it wouldn't make much sense). For example, you
could use this to emulate "ordered" parsimony characters by setting the rate between non-adjacent states to zero
and estimating the rates for adjacent states. To do this, you must use a genRClass statement, and use "0" for
state pairs between which transitions are not allowed, and an alphabetic code (a, b, ...) otherwise. E.g., for a
data set containing characters with three states (0, 1, and 2), you would do "lset genrmatrix=(a 0 b)" to indicate
that transitions between states 0 and 1 occur at one rate (a), between 1 and 2 occur at a different rate (b) and
transitions between 0 and 2 are disallowed.
- The command-line interface for using a taxon partition to assign tips to species has been slightly changed.
The 'speciesTree' option is no longer used, and the 'partition' option has been renamed to 'taxpartition'. Now,
just use "taxpartition=" to do the assignment, or "taxpartition=none" to revert to 1 tip per
species.
- A few new commands were added. These features have not been fully tested, however, so I will wait until the next
test-version release to document them.
- Font handling for macOS version has been completely overhauled, replacing use of the deprecated ATSUI with more
modern CoreText APIs. Win32 font handling was also refactored to increase code-sharing between Mac and Windows
versions.
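To see how the `genrmatrix=(a 0 b)` example in the list above maps onto state pairs, the codes can be laid out in a symmetric table. The sketch below assumes the upper-triangle, row-wise ordering described in the 4.0a159 notes for `genRmatrix`. `rate_classes_to_matrix` is a hypothetical helper for illustration, not a PAUP* function.

```python
from itertools import combinations


def rate_classes_to_matrix(n_states, codes):
    """Place one code per state pair into a symmetric n_states x n_states table.

    Codes are consumed in upper-triangle, row-wise order: (0,1), (0,2), ..., (1,2), ...
    A code of "0" marks a disallowed transition. Diagonal entries are shown as "-".
    """
    pairs = list(combinations(range(n_states), 2))
    if len(codes) != len(pairs):
        raise ValueError(f"expected {len(pairs)} codes, got {len(codes)}")
    matrix = [["-"] * n_states for _ in range(n_states)]
    for (i, j), code in zip(pairs, codes):
        matrix[i][j] = matrix[j][i] = code
    return matrix


if __name__ == "__main__":
    for row in rate_classes_to_matrix(3, ["a", "0", "b"]):
        print(" ".join(row))
```

For three states, the pair order is (0,1), (0,2), (1,2), so `a` lands on the 0-1 pair, `0` on the 0-2 pair, and `b` on the 1-2 pair. This matches the "ordered" character reading given in the example.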
Version 4.0a159
This version corrects a few minor problems with the previous version and opens up a couple of new features.
The feature set for the official release version is now nearly complete (although some new features need to be
documented). Unless major problems are found with this version, the official launch of the next major release (5.0)
will occur sometime in early 2018. Please report any problems you encounter with this version, no matter how
minor (dave@phylosolutions.com).
New features:
- MergeTaxa command:
The MergeTaxa command ("Merge Taxa" menu command) can be used to reduce the number of undeleted taxa
by merging taxa that are nearly or completely identical.
Only one member of each set of merged taxa is retained; the remaining ones are deleted (as if they had been
explicitly deleted using a Delete command).
Use of this command is strongly recommended if your data set contains redundant taxa (i.e., taxa that have no
differences), and may be useful when some subsets of taxa differ by only a small amount.
The choose option determines how the single taxon to be kept from each group of redundant taxa
will be chosen. The default is choose=mostcomplete, which specifies that the taxon with the
least missing data will be kept (ties are broken randomly). If choose=first, the first taxon
(in the order taxa appear in the data matrix) will be kept. If choose=random, one member of the
redundant group will be chosen at random.
If simthreshold is reduced from its default value of 1, then pairs of taxa having this level of
similarity or greater are iteratively merged until all remaining undeleted pairs have a similarity below this
level ("similarity" is defined as the number of sites that have the same character state divided by the
number of characters that do not exhibit a missing state in either member of the taxon pair).
Other options are seed=value (the default of 0 gets an initial seed from the system) and
minKeep=n (n specifies a lower limit on the number of taxa to retain as undeleted).
Note that if the value of simthreshold is 0, exactly n taxa will be retained regardless
of their level of similarity.
- Generic k-state likelihood models:
Likelihood analyses can now be performed on arbitrary data using Poisson/proportional models
(comparable to JC/F81 models) as well as GTR models containing any number of states, when data are input using
the "standard" datatype. The usefulness of this feature is unclear, but some users have expressed a desire to
evaluate models involving non-standard codings of sequence data, or to perform model-based analyses with other
data types. Remember that for models other than k-state Poisson (which assumes equal state frequencies
and transition rates between states), these models assume equivalence of states across characters, which
is not typically appropriate for morphological data. (The only likelihood model typically applicable for
morphological data is the "Mkv" model which has been available for some time.)
New options supporting this feature are:
genFreq = empirical|equal|estimate|previous|(<list>)
Specifies state frequencies, analogous to "basefreq" options for DNA.
genRmatrix = (<r01><r02>...)|estimate|previous|oneST
Specifies whether the exchangeabilities are to be estimated or set to user-defined values. For user-defined
values, the order is the upper triangle of a symmetric matrix, entered row-wise, or equivalently, the lower
triangle of a symmetric matrix, entered column-wise. If oneST is specified, a one-substitution-type
model is used ("Poisson" for equal state frequencies, "proportional" for unequal state frequencies,
analogous to the JC and F81 DNA models, respectively).
Note that if the data matrix contains only binary characters, the state frequencies and transition rates
are confounded for reversible models. In this case, the rate matrix entries are ignored, and only the base
frequencies are used to parameterize the model.
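For the MergeTaxa `simthreshold` option described above, the similarity measure can be sketched as follows. This is a minimal reading of the stated definition, not PAUP*'s implementation: `similarity` is an illustrative name, and treating only "?" as a missing state is an assumption of the sketch.

```python
def similarity(seq_a, seq_b, missing="?"):
    """Identical sites divided by sites where neither sequence is missing.

    Returns None when no site is comparable. The `missing` symbols are an
    assumption of this sketch; PAUP* may treat gaps and ambiguities differently.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    comparable = [(a, b) for a, b in zip(seq_a, seq_b)
                  if a not in missing and b not in missing]
    if not comparable:
        return None
    same = sum(1 for a, b in comparable if a == b)
    return same / len(comparable)


if __name__ == "__main__":
    print(similarity("ACGT?", "ACGA?"))  # 3 of 4 comparable sites match
    print(similarity("AC??", "ACGT"))    # both comparable sites match
```

Under this reading, a pair such as "AC??" and "ACGT" has similarity 1, because the sites with missing data are left out of both the count of matches and the denominator.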
Bugs fixed:
- Because of a problem with the build system, SSSE3 instructions were disabled in non-Macintosh versions
(my intention was to disable SSSE3 instructions only when they were not supported by the active processor).
- Taxon and node names were being truncated inappropriately in "apomorphy list" output when all taxon names
were shorter than 8 characters.
- Likelihood scores for amino-acid models were inaccurate when amino-acid frequencies were estimated by ML
(rather than using the default setting of "empirical").
Other changes:
- Estimation of amino-acid rate matrices is now allowed. Currently, this can only be requested from the
command line; the dialog interface will be updated in a subsequent release. See output from "lset ?" for
relevant option names and values. You should only use this option if you know what you're doing. Parameter
optimization can be slow, and should ordinarily be used only when the number of sites is large.
- "OneST" (one substitution type) models are now available for amino acid models. This option allows
use of the Poisson and proportional models for equal and unequal state frequencies, respectively.
- Parsimony calculations for (symmetric) stepmatrix characters are now much faster for data sets containing
large numbers of taxa and/or characters. (I had accidentally disabled some code that allows these characters
to be processed more efficiently.)
Version 4.0a158
This release restores compatibility with processors that lack SSSE3 support and fixes a couple of other bugs.
Numerous under-the-hood changes to support as-yet unannounced features have also been made.
Bugs fixed:
- Topological constraints were not being enforced in SVDQuartets (this option has been disabled; see below).
- Search for optimal tree could hang in QFM phase of SVDQuartets.
- Predefined "Constant" character set was not being updated after deleting/undeleting taxa.
- Crashes could occur if 'N' ambiguity codes were used with stepmatrix characters.
- Command-line equivalent generated from "Set Character Types" dialog did not show character ranges properly.
Other changes:
- Different code paths are now used to support processors (e.g., AMD) that lack support for SSSE3
instructions (rather than refusing to run if SSSE3 support is lacking).
- Code for resizing tree buffers completely refactored; efficiency improved when reading in treefiles
containing huge numbers of trees.
- Enforcement of topological constraints was not working with SVDQuartets. This option will therefore
be disabled until I find time to reimplement this feature correctly.
Version 4.0a157
This version fixes one potentially serious bug as well as some minor bugs, and makes a few small changes.
Bugs fixed:
- Memory corruption occurred when deletion of taxa caused a previously multistate character to become binary,
leading to crashes and/or invalid parsimony calculations.
- Fixed off-by-one error that could cause exact parsimony searches to fail in a very unlikely situation.
- Lines from Python block were being truncated by one character when echoing to output window.
- ToNexus command was unintentionally converting destination file to "Classic Mac" line termination.
Other changes:
- The "ClearTrees" command is now allowed at any time, even if no taxa have been defined.
- A "shuffle" option is now provided for the FastTree wrapper (with an associated "seed" option).
Version 4.0a156
This version corrects a couple of additional issues. (MrModelTest users: Also read the item under
Changes for version 4.0a155.)
Bugs fixed:
- Fixed failure to show histogram of tree scores (and possible crash) with exhaustive search under
parsimony criterion.
- Fixed refusal to run parsimony on non-DNA data sets when the likelihood option 'pinvar' was set to a
nonzero value.
- Fixed loss of tree names and user-defined branch lengths when trees not retained by a filter were deleted.
Version 4.0a155
This version corrects a minor issue with the previous release and adds one simple feature.
The feature set for the official release version is now nearly complete (although some new features need to be
documented). Unless major problems are found with this version, the official launch of the next major release (5.0)
will occur sometime in Fall 2017. Please report any problems you encounter with this version, no matter how
minor (dave@phylosolutions.com).
Changes:
Version 4.0a154
This version fixes bugs that caused the previous version to crash with some data files and commands. See the notes
for 4.0a153 for other recent changes.
The feature set for the official release version is now nearly complete (although some new features need to be
documented). Unless major problems are found with this version, the official launch of the next major release (5.0)
will occur sometime in Fall 2017. Please report any problems you encounter with this version, no matter how
minor (info@phylosolutions.com).
Bugs fixed:
- Fixed crash in heuristic search under parsimony criterion with some data sets.
- Fixed crash after deleting characters and then deleting a different set of characters.
- Fixed crash when character types are changed from the dialog box (if no user-defined character types have been defined).
- Values defined for 'MISSING' are now incorporated into the Format command for files imported from FASTA format.
- Removed a line of debugging output that was inadvertently left enabled.
Other changes:
- Restored ability to read treefiles containing taxa that are currently deleted by pruning deleted tips from the input trees.
- Added a new predefined character-set "binary". Now, you can do, e.g., "include binary/only;" to exclude all characters
having more than 2 states.
- Improved output formatting with very long character names (Character status, parsimony change and apomorphy lists).
Version 4.0a153
This update adds few new features, but a great deal has changed under the hood to improve reliability, memory usage,
and correctness. See the notes below for more information.
The feature set for the official release version is now nearly complete (although some new features need to be
documented). Unless major problems are found with this version, the official launch of the next major release (5.0)
will occur sometime in Fall 2017. Please report any problems you encounter with this version, no matter how minor (info@phylosolutions.com).
Bugs fixed:
- Fixed a major, and apparently longstanding, problem with parsimony jackknifing (which I don't use and for which
I unfortunately hadn't written an automated test). The bug resulted in completely bogus support values for some or all groups, so
any parsimony jackknife runs should be repeated.
- Fixed problem with dialog-box setting of "Jac" resampling option.
- Fixed crashes with parsimony analyses using stepmatrix characters.
- Prevented crash when reading in treefiles containing currently deleted taxa. Unfortunately the "fix" (for now) is to
disallow doing this.
- Fixed refusal to use restriction-site distances with 0/1 data.
- Fixed complaint about "expecting whitespace" when a data line in a Data or Characters block ended with a comment.
- Fixed crash when using "previous" for ML siterates option.
- Fixed problem preventing use of ML for non-sequence data when constant characters were present (i.e., non-Mkv).
- Fixed other issues (e.g., crashing, incorrect likelihoods) with the Mkv model.
- Fixed an output formatting issue with character-change lists from the DescribeTrees command.
- Ancestral states reconstructed under Mkv model were not being shown correctly.
Other changes:
- In version 4.0a151, I introduced a new system to reduce memory requirements for data sets containing
very large numbers of characters. It turns out that while this system is very effective for data sets
containing relatively few taxa (e.g., less than 10), it often dramatically increased the memory usage
for data sets with many more taxa. I have now replaced that system with a new one that has much better performance.
Note that you can use the command "set datastorage=full;" (prior to executing a Data or Characters
block) to restore the old system for data storage, although this is not generally recommended.
- Gamma-distributed rates may now be used with the Mkv model (although it's not clear whether they should be used).
- Added the "keepOne" option to the ClearTrees command. This option allows you to clear all but one of the trees
currently in memory. The command "cleartrees keepone=random;" chooses one of the original trees to keep.
If you need consistency across runs you can specify a seed on the first ClearTrees command
(e.g., "cleartrees keepone=random seed=999;"). Alternatively, you can do
"cleartrees keepone=first;" to clear all but the first tree from memory.
Version 4.0a152
This version fixes several bugs that were introduced at 4.0a151, plus a couple of old ones.
Bugs fixed:
- A crash occurred after issuing the command "delete all;". Note that "undelete B D E/only;" is equivalent
to "delete all; undelete B D E;" and is a somewhat cleaner way to delete all taxa other than B, D, and E.
- Exact parsimony searches (branch-and-bound or exhaustive) crashed if all binary and ordered multistate
characters were uninformative.
- Attempts to do searches under likelihood criterion with non-sequence data resulted in a crash.
- The rooted vs. unrooted setting for trees being input from a Trees block was not respected in certain
scenarios.
- A crash occurred when trees were read using mode=7 when trees from file contained branch lengths but trees
in memory were not user-defined trees.
- Branch-and-bound likelihood searches sometimes returned no optimal trees (probably only for small data sets).
- Eliminated bogus error messages about codon models not being allowed with the current data.
- The denominator used in calculating "standard" distances for amino-acid and DNA data was not correct when
"X" (amino-acid) or "N" (DNA) codes were used in the data matrix. Sites containing these codes for one or both
members of a pair of sequences were not being treated as uninformative, causing the denominator to be too large.
- Attempted to fix problem reported by a few users that Windows GUI version does not quit properly,
preventing it from re-launching without killing it from the Task Manager "Processes" tab (or restarting).
I have not been able to reproduce this problem myself, so I don't know whether this change worked or not.
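The corrected denominator for the "standard" distances described above can be illustrated with a small Python sketch. This is not PAUP*'s code: the function name `p_distance` is hypothetical, and treating only "N" (DNA) or "X" (amino acid) as the fully ambiguous code is an assumption made for illustration. Sites where either sequence of the pair carries that code are left out of both the numerator and the denominator.

```python
def p_distance(seq_a, seq_b, ambiguous="N"):
    """Proportion of differing sites, skipping sites with an ambiguity code.

    Hypothetical helper for illustration only; `ambiguous` is "N" for DNA
    or "X" for amino-acid data.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == ambiguous or b == ambiguous:
            continue  # uninformative for this pair: not counted in the denominator
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return different / compared
```

With the sequences "ACGTN" and "ACGAA", four sites are compared and one differs, so the distance is 1/4 rather than 1/5.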
Other changes:
- Needed Fortran libraries are once again linked statically in OSX and Linux command-line versions (fixing
missing-library issue on systems with missing or incomplete developer tools).
Version 4.0a151
This version fixes a few bugs but most of the changes are "under the hood." I will be rolling out several
new features early in 2017, but they are still too incomplete and/or buggy to open up for general use at
this time.
Bugs fixed:
- A crash occurred when the "fast stepwise" bootstrapping option was requested.
- Memory corruption could occur when files containing DOS/Windows line endings were opened in
the Mac-version editor.
- Saving the "N best" trees did not work correctly if N was equal to the current value of 'maxtrees'.
- Out-of-order taxon labels were not handled correctly in second or subsequent Characters blocks.
Other changes:
- Memory requirements for data sets containing very large numbers of characters were reduced, dramatically
in some cases.
- Many additional changes were made in preparation for forthcoming new features.
Version 4.0a150
This release is mainly to fix an SVDQuartets bug that was inadvertently introduced in the previous version.
Bugs fixed:
- Writing of treefiles containing trees from each SVDQuartets bootstrap-replicate was broken in 4.0a149
(all trees in the treefile were identical).
- Fixed incorrect setting of partitioned-likelihood model after automated model partitioning
("autopartition") completed.
Other changes:
- The system for generating random numbers has been completely overhauled. The old method used
a multiplicative congruential method with a prime modulus of 2^31 - 1 and a multiplier of 397204094.
This generator had very good statistical properties (e.g., it was also used in the SAS system for many
years); however, it has a relatively short period (the number of values generated before the sequence
begins to repeat) of 2^31 - 2 ≈ 2.1 x 10^9.
Although the previous generator was perfectly fine for the bootstrapping and permutation tasks
for which it was previously used, I am working on a new internal simulator for which a longer-period
generator is needed. The standard generator now uses the KISS method developed by George
Marsaglia, which has a much longer period, about 2^124 ≈ 2.1 x 10^37.
You can still request the old generator using the command "set rng=legacy;"
before beginning an operation that uses random number generation (e.g., bootstrapping).
You can restore the default behavior using the command "set rng=KISS;".
I have also changed the method for specifying starting seeds. By default, the initial seed
value is set to 0, which results in the initial seed being obtained from a nondeterministic
random number generator provided by the operating system. The probability that two runs could
start with the same seed when using a system-provided seed is now extremely small (relevant, e.g.,
for starting multiple jobs simultaneously on a cluster). Also, if you provide your own (nonzero)
seed, it will no longer be updated automatically after a command that uses random number generation.
Repeating a command without changing the starting seed should now produce exactly the same results.
If you use a nonzero seed and want to use a different random number sequence for the second
invocation of a command, you will now need to specify a new seed explicitly.
- A PAUP block with an "outgroup" command is now included in bootstrap treefiles written by
SVDQuartets.
- SVDQuartets now allows use of the "Erik+2" normalization of Fernández-Sánchez and Casanellas (2016),
as well as the ability to specify an arbitrary expected rank for the flattening matrix (e.g., to allow
for site mixtures due to heterotachy).
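The legacy generator and the seed convention described in the 4.0a150 notes can be sketched in Python as follows. The modulus and multiplier are the ones quoted in the notes; everything else (the names `resolve_seed` and `lehmer_stream`, and the way a seed is mapped to a starting state) is an assumption for illustration and may differ from what PAUP* actually does. The KISS generator is not shown.

```python
import secrets

MODULUS = 2**31 - 1        # prime modulus quoted in the notes
MULTIPLIER = 397204094     # multiplier quoted in the notes


def resolve_seed(seed):
    """Return a usable starting state; 0 means 'ask the operating system'."""
    if seed == 0:
        # secrets draws from the OS entropy source; result is in 1..MODULUS-1
        return secrets.randbelow(MODULUS - 1) + 1
    return seed % MODULUS or 1  # avoid the absorbing state 0


def lehmer_stream(seed, count):
    """Yield `count` values from the recurrence x -> MULTIPLIER * x mod MODULUS."""
    state = resolve_seed(seed)
    for _ in range(count):
        state = (MULTIPLIER * state) % MODULUS
        yield state
```

Calling `list(lehmer_stream(999, 5))` twice gives the same five values, which mirrors the rule that a user-supplied nonzero seed is no longer updated between commands; a seed of 0 gives a different stream on each call.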
Version 4.0a149
Important notes regarding SVDQuartets:
The SVDQuartets method is very new and we are still learning things about how to implement it well.
Please read the following list of changes carefully and re-run any analyses that might have been
affected.
- Improved accuracy:
The primary method used for singular value decomposition in PAUP* is the DGESDD routine from
the LAPACK library (which is also used by R, MATLAB, and much other numerical software). We have
discovered that the singular values returned by this function can be inaccurate for "sparse"
flattening matrices containing many zero entries (i.e., when many of the 256 possible site
patterns for 4 taxa are not observed). Very small singular values can occur in this case,
and sometimes values that should be exactly zero are instead returned as small positive values.
The numerical inaccuracy can be great enough to cause the wrong quartet topology to be preferred,
or one of the topologies to be chosen as best when in fact all three topologies have equal scores.
Several modifications to the code have been made to improve accuracy under these conditions. First, rows
and columns containing all-zero entries are removed prior to SVD (which does not affect the
estimated rank of the matrix). In addition to improving accuracy of DGESDD, in some cases, this
matrix reduction makes SVD calculation entirely unnecessary, as the rank of the matrix cannot be
greater than min(number of rows, number of columns). Second, when all three of the quartet
topologies have very similar scores, an additional SVD is performed using a slower, but more accurate,
LAPACK routine (DGEJSV); the singular values returned by this routine are then used to calculate
the quartet scores. Third, if the scores for all three topologies for a quartet are equal (within
a tolerance for floating-point roundoff error), the quartet is simply discarded. (Previously, one
of the three resolutions was kept, and the decision as to which one was chosen was impacted
by the numerical inaccuracy described above).
These changes are most likely to affect data sets with relatively small numbers of sites, or
data sets containing two or more very similar sequences. If at all possible, please repeat
any previous analyses to make sure that the results were not affected.
- Other SVDQuartets issues:
- Bug fix: The number of consistent/inconsistent quartets was reported incorrectly in
species-tree SVDQuartets analyses (i.e., when using a taxon partition to assign lineages or
samples to species).
- Bug fix: The progress bar did not update properly during SVDQuartets bootstrap.
- Bug fix: The program did not respond to user-abort request in SVDQuartets bootstrap.
- Bug fix: A crash could occur when memory allocation failed in SVDQuartets due to very
large numbers of taxa.
- Enhancement: Partial ambiguities (R, Y, etc.) are now incorporated into the SVDQuartets
flattening matrix. Use the command "defaults svdquartets ambigs=missing;" prior to
running SVDQuartets to restore the old behavior (although it's hard to imagine a reason for
doing so).
- Enhancement: The SVDQuartets solution found by QFM quartet assembly may now be input to
SPR local search for (possible) further improvement. Additional options for the quartet-assembly
step will be opened up in a future version. For now, you can use one of the commands:
defaults qsearch localsearch=nni;
defaults qsearch localsearch=spr;
defaults qsearch localsearch=tbr;
to request local search via NNI, SPR, or TBR branch swapping, or:
defaults qsearch localsearch=none;
to restore the default behavior (no local search).
These commands may significantly increase the search time for data sets containing larger
numbers of taxa. Note that currently, the defaults qsearch command must be issued prior to
initiating an SVDQuartets analysis; this awkward interface will be improved soon.)
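Two of the accuracy measures described under "Improved accuracy" (removing all-zero rows and columns before the SVD, and discarding quartets whose three scores are tied) can be sketched in plain Python. The function names and the tolerance values below are assumptions for illustration, not PAUP*'s actual code or thresholds, and the SVD itself is left out.

```python
import math


def drop_zero_rows_and_columns(matrix):
    """Remove rows and columns that contain only zeros."""
    rows = [row for row in matrix if any(value != 0 for value in row)]
    if not rows:
        return []
    keep = [j for j in range(len(rows[0])) if any(row[j] != 0 for row in rows)]
    return [[row[j] for j in keep] for row in rows]


def rank_upper_bound(matrix):
    """The rank cannot exceed min(number of rows, number of columns)."""
    if not matrix:
        return 0
    return min(len(matrix), len(matrix[0]))


def scores_are_tied(scores, rel_tol=1e-9, abs_tol=1e-12):
    """True if all quartet scores agree within a roundoff tolerance."""
    first = scores[0]
    return all(math.isclose(first, s, rel_tol=rel_tol, abs_tol=abs_tol)
               for s in scores[1:])
```

If the reduced matrix has no more rows or columns than the expected rank, the bound from `rank_upper_bound` already settles the question and no SVD is needed; a quartet for which `scores_are_tied` returns True would simply be skipped.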
Other bugs fixed:
- After deleting taxa, crashes sometimes occurred when showing constraint trees (and even if no crash occurred, tip labels on the tree
were usually incorrect).
- Non-unit character weights were not always used correctly in likelihood analysis.
- Tree length calculations were incorrect with asymmetric stepmatrix characters and all-missing ancestral states (the "standard"
ancestor). Trees found in tree searches were obviously incorrect, and tree scores reported in the search output did not match
the scores output by the "Pscores" (Tree Scores->Parsimony) command.
- Branches were being collapsed inappropriately with asymmetric stepmatrix characters and all-missing ancestral states, possibly
producing star topologies. The same bug caused minimum and maximum branch lengths to be reported incorrectly in the
table of branch lengths output by the DescribeTrees command, and ancestral state reconstructions to be obviously incorrect.
- The partition homogeneity test produced completely bogus results when partition subsets contained excluded characters, or when
some characters were not assigned to any partition subset.
- The partition homogeneity test inadvertently included one extra character in each randomized subset, affecting the distribution of
tree lengths for randomized partitions. This bug was probably inconsequential for all but extremely small data sets.
- The "Reweight Characters" command was not dealing properly with more than one tree in memory.
- Crash occurred when balanced minimum evolution (BME) distance search was combined with random addition sequence.
- Output of tree names for trees read from treefiles was off by one (e.g., "PAUP_2" rather than "PAUP_1").
- Autopartitioning could crash when using AICc if the number of sites in a subset was not at least two larger than the number
of model parameters.
- Crash occurred if an exact parsimony search was attempted immediately after a heuristic likelihood search.
- Windows installer failed to install a DLL needed to run the console (command-line) version.
- Fixed cosmetic glitch in Likelihood Settings dialog box.
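The AICc restriction mentioned in the autopartitioning fix above follows from the standard formula AICc = AIC + 2k(k+1)/(n - k - 1): the correction term is only defined when the number of sites n exceeds the number of parameters k by at least two. A minimal Python sketch (the function name `aicc` and the choice to raise an error are assumptions; how PAUP* handles this case internally is not stated in the notes):

```python
def aicc(log_likelihood, n_params, n_sites):
    """Small-sample corrected AIC; requires n_sites >= n_params + 2."""
    if n_sites - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_sites >= n_params + 2")
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + (2.0 * n_params * (n_params + 1)) / (n_sites - n_params - 1)
```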
Other changes:
- The format for specification of 6ST (GTR) rate matrices in the LSet command has been relaxed. In addition to the
old format in which five values (rAC, rAG, rAT, rCG, and rCT) are supplied and the last rate (rGT) is assumed
to be "1", you can now specify all 6 rates.
Furthermore, there is no longer a requirement that this last rate be "1". You can use any values you like, and the matrix will
automatically be rescaled such that rGT=1. For example, the following are now all equivalent:
- rmatrix=(1 2 1 1 4)
- rmatrix=(1 2 1 1 4 1)
- rmatrix=(2 4 2 2 8 2)
- rmatrix=(0.1 0.2 0.1 0.1 0.4 0.1)
- Options for negative branch-length handling now work for Balanced Minimum Evolution (probably only relevant for constrained searches).
- Compatibility with the Geneious plugin has been restored.
- Eliminated external dependency on system LAPACK/BLAS libraries. All LAPACK code is now included in the source distribution and
compiled and linked entirely within PAUP's normal build system.
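The rescaling rule for rmatrix values described above amounts to dividing every rate by rGT. A short Python sketch (the function name `normalize_rmatrix` is hypothetical, and this is an illustration of the arithmetic rather than PAUP*'s parser):

```python
def normalize_rmatrix(rates):
    """Return six rates (rAC, rAG, rAT, rCG, rCT, rGT) rescaled so rGT == 1."""
    rates = [float(r) for r in rates]
    if len(rates) == 5:
        rates.append(1.0)  # old format: rGT is implied
    if len(rates) != 6:
        raise ValueError("expected 5 or 6 rates")
    r_gt = rates[5]
    if r_gt <= 0:
        raise ValueError("rGT must be positive to rescale")
    return [r / r_gt for r in rates]
```

Applied to the four rmatrix specifications listed above, this gives the same six rates (1, 2, 1, 1, 4, 1) in each case, up to floating-point roundoff.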
Version 4.0a148
This version was not released, and changes since 4.0a147 are described above.
Version 4.0a147
This is primarily a bug-fix release, but significant progress has been made on new features that will be
opened up in a future version.
SVDQuartets users please read!
An important change has been made to the quartet assembly phase of the SVDQuartets method. Even though the Reaz et al. "QFM"
algorithm was performing well with most data sets, it obtained very poor solutions with some larger data sets.
The problem was traced to the initial partitioning for each "divide" step of the divide-and-conquer
algorithm. A new strategy was implemented for this partitioning, replacing the corresponding component of QFM.
With this new method, QFM performs much better, typically running more quickly than previously and often finding better
solutions than other competing algorithms.
Because of these changes, it is strongly recommended that any previous analyses be re-run using this new version to confirm
that they were not adversely affected by the deficiencies of the previous algorithm.
Bugs fixed:
- SVDQuartets finds extremely suboptimal trees under certain circumstances (see above).
- Crashes sometimes occur with combination of random-addition-sequence and steepest-descent search.
- Overall probabilities of ancestral-state reconstructions are not correct with site-specific-rates models.
- Likelihood calculations are inaccurate under Mk and Mkv models when the number of states is variable across characters and
the (default) mkstatespace=variable option is in effect.
- The weighted SH test sometimes reports an incorrect P value of 0 (or some other obviously incorrect value).
- Trees found using quartet puzzling are not kept in memory.
- Reading of treefiles in Newick format (i.e., not Nexus) fails when all-digit taxon labels are present.
- Patristic distance and homoplasy matrices output using the DescribeTrees command are incorrect (this bug goes back a long way).
- Last line of single-site likelihoods is not output when last character in data matrix is excluded.
- Crashes occur during attempts to change default settings from the Automodel and Export-Data dialogs.
- Likelihood calculated incorrectly with combination of rooted trees, non-clock model, and user-input branch lengths.
- Crash occurs if 'X' is used as an amino-acid character state.
Other changes:
- Control of single- versus double-precision calculation of likelihood scores is now handled using a
new LSet option (precision=single or precision=double).
- The ability to combine hold>1 with random addition sequences in stepwise addition has been restored.
- Code for parsimony using asymmetric cost matrices was completely refactored to eliminate multiple bug-fix
hacks.
- Code for handling conditional-likelihoods was refactored in order to support GPU computation using the
Beagle library (forthcoming).
- Accuracy of conditional-likelihood rescaling was improved with no noticeable performance penalty (and
possibly even a speedup for very large data sets).
- More under-the-hood changes were made in preparation for partitioned likelihood models.
- Many code cleanups in anticipation of releasing an open-source version.
Version 4.0a146
Changes:
- SVDQuartets is now much faster when the data contain a high proportion of missing and/or ambiguous states.
- Likelihood evaluation of rooted trees under non-clock models is now supported.
Bugs fixed:
- SVDQuartets crashes or obtains invalid results after deletion of taxa.
- SVDQuartets with treeInf=curtrees crashes after a previous run in which bootstrapping was performed.
- SVDQuartets bootstrap tree is inappropriately rooted if analysis follows a search in which rooted trees were obtained.
- SVDQuartets crashes with some data sets if the 'showScores' or 'showSV' options are used.
- P value for SH test is erroneously reported as "0.0" when two trees being compared are identical or have equal likelihood scores.
- Crashes or other misbehavior occurs with amino-acid input files containing stop codes (these are now automatically
converted to "missing").
- Attempting to plot a NJ tree as an unrooted tree causes a crash.
- Out-of-bounds memory write occurs when reading constraint trees containing currently deleted taxa, usually leading to a crash.
- If a partitioned ClockChecker analysis is done and then a new data set is executed (or the previous one is re-executed),
a crash ensues if a subsequent ClockChecker command is invoked without explicitly assigning a new character partition.
- Results of reading in treefiles are no longer being reported.
- Altered width of output display window is not preserved between sessions in Windows version.
- Output of single-site likelihoods for Mkv model may contain obviously invalid values.
- ML branch length optimization is borked after an "OK to optimize rather than using user branch lengths?" query.
- Gray lines on trees (for indicating negative branch lengths) are sometimes too faint (I now use dotted lines instead).
- Parsimony analyses complain erroneously about "tree length overflow" with polymorphic asymmetric stepmatrix characters
containing more than 2 states.
- PScores formatting with non-integer scores uses scientific notation inappropriately, e.g., if a stepmatrix contains fractional
costs.
- Asymmetric stepmatrix tree length calculations are sometimes incorrect in SPR/TBR searches.
- A crash sometimes occurs after estimating site-specific rates via ML and then using "lset siterates=previous;"
Version 4.0a145
This version addresses major problems with the previous version released about 3 weeks ago (I was rushing to get
an update finished before a workshop, and my testing was inadequate after making a complicated under-the-hood change
intended to reduce memory usage). The problems were mostly, but not entirely, associated with the deletion of taxa
(and there may well have been problems in addition to those reported below). There may still be a few related issues,
but I wanted to get an update out as soon as possible.
In addition, there was a problem with the ML settings dialog interface that caused use of the "previous" buttons
associated with model parameter estimates to have no effect (model parameters were still subsequently estimated by ML
when the interface claimed to have set them to fixed values). Searches were then excruciatingly slow, as model
parameters were being estimated for each tree evaluated in a search, which I never recommend doing.
Finally, the bug with the partition homogeneity (=ILD) test was not completely fixed as reported in 4.0a144.
Results were valid only if (1) characters in each partition subset were contiguous and (2) there were exactly two
subsets in the partition. (Unfortunately, these conditions were both true for the data set that caused me to recheck
the test, and I didn't test other cases.) I am satisfied that the test is now working correctly when characters
in each subset are not contiguous (e.g., codon positions) as well as when more than two subsets are present.
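The randomization step that a partition homogeneity test relies on can be sketched in Python as follows. This is an assumption-laden illustration, not PAUP*'s implementation: the function name `randomized_partition` is hypothetical, and only the reassignment of characters is shown (the tree searches on each randomized subset are omitted). The point is that each randomized subset keeps exactly the size of the original subset, whether or not the original characters were contiguous and however many subsets there are.

```python
import random


def randomized_partition(subsets, rng):
    """Reassign characters at random to subsets of the original sizes.

    `subsets` is a list of lists of character indices; the indices need not
    be contiguous, and there may be more than two subsets.
    """
    pooled = [index for subset in subsets for index in subset]
    rng.shuffle(pooled)
    result, start = [], 0
    for subset in subsets:
        size = len(subset)
        result.append(sorted(pooled[start:start + size]))
        start += size
    return result
```

For example, `randomized_partition([[1, 4, 7], [2, 5, 8], [3, 6, 9]], random.Random(1))` returns three subsets of three characters each that together still cover characters 1 through 9.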
Changes:
- A new SVDQuartets option allows saving of trees from each bootstrap replicate to a treefile.
- You can now plot both the inferred tree (point estimate) as well as the bootstrap consensus tree after performing
an SVDQuartets analysis (GUI versions).
Bugs fixed:
- Crashes occur in multiple contexts after deleting taxa (including pruning of deleted taxa from existing trees
at the time of deletion, computing consensus trees, etc.).
- Clicking on the "previous" buttons associated with model parameters in the "Likelihood Settings" dialog box
appears to set the parameters to fixed values, but this change is not actually registered when the dialog is
dismissed.
- Results from the partition homogeneity test are invalid if characters in each partition subset are not contiguous
or if there are more than two subsets in the partition.
- The "append" option on the "Log" command does not work correctly.
- Crashes sometimes occur after loading a Nexus file containing only taxon data (i.e., no data matrix).
- Very small files (<256 bytes) are converted from Unix/OSX line endings to "Classic Mac" when saved from the
editor (Mac version).
- Messages like "Object 0x467c2d0 of class NSSound autoreleased with no pool in place - just leaking"
are written to console log.
- Message is incorrectly issued that a file was externally modified after doing "Stack Editor Windows" (Mac version).
- SaveTrees command crashes if no prior tree search was performed (e.g., only the "generateTrees" command is issued).
Version 4.0a144
New features:
- The "ClockChecker" command is now available. ("Automated Clock Tests" in the Analysis menu of Mac/Windows versions).
For one or more rooted trees initially in memory, a likelihood-ratio test as well as AIC/BIC comparisons are made for
models that enforce a clock vs. models with unconstrained branch lengths. The analysis may be performed for the entire
data set, or for subsets of the data defined by a character partition ("charpartition"). When a character partition is
used, subsets of characters (sites) can be excluded if they are determined to be non-clocklike according to any desired
LRT or AIC/BIC threshold.
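The kind of comparison ClockChecker reports can be sketched in Python. This is a hedged illustration, not the program's code: the function name `clock_comparison` is hypothetical, the parameter counts are the usual ones for a strictly bifurcating tree (n - 1 node heights under a clock versus 2n - 3 branch lengths without one), and using the number of sites as the BIC sample size is an assumption. The p-value for the likelihood-ratio statistic is not computed here.

```python
import math


def clock_comparison(lnl_clock, lnl_free, n_taxa, n_sites, k_shared=0):
    """Compare a clock model with an unconstrained-branch-length model.

    k_shared counts substitution-model parameters common to both models.
    """
    k_clock = (n_taxa - 1) + k_shared        # one height per internal node
    k_free = (2 * n_taxa - 3) + k_shared     # one length per unrooted branch
    stats = {
        "lrt": 2.0 * (lnl_free - lnl_clock),
        "df": k_free - k_clock,              # equals n_taxa - 2
    }
    for name, lnl, k in (("clock", lnl_clock, k_clock), ("free", lnl_free, k_free)):
        stats["aic_" + name] = 2.0 * k - 2.0 * lnl
        stats["bic_" + name] = k * math.log(n_sites) - 2.0 * lnl
    return stats
```

The "lrt" value would be compared against a chi-square distribution with "df" degrees of freedom, and the model with the lower AIC or BIC is preferred under that criterion.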
Serious bug fixed:
- The partition homogeneity (=ILD) test has been reporting invalid results for an unknown amount of time. Apparently,
I broke it as a side effect of an unrelated change, but I didn't notice this because I no longer use this method
myself and I hadn't gotten around to writing an automated build test.
I apologize for this error. Unfortunately, you will need to rerun any analyses with this test that you have performed
with any of the "4.0a" builds of PAUP*.
Other changes:
- Options are now available for writing the design and/or expected-distance matrices to files in the "DScores" command.
- The number of permitted ambiguity combinations has been increased. The number of character states plus the number
of ambiguity combinations can now be up to 254.
- AIC, AICc, and BIC options are now available in TreeScores:Likelihood dialog box.
- "Parsimony approximations" are no longer available in TreeScores:Likelihood dialog box. This old option is no longer
relevant for modern analyses.
Bugs fixed:
- Bootstrap analysis with SVDQuartets crashes with "non-species-tree" analyses (i.e., when a taxon partition is not
used to assign tip taxa to species).
- Minimum-possible tree length is sometimes calculated incorrectly when "N" is used as a character state, leading to incorrect
values for the consistency index and related fit measures.
- Informativeness under parsimony is sometimes incorrectly determined when the data contain ambiguity codes. Apart from output
that indicated the number of parsimony-informative characters, the only real consequence of this bug is that some uninformative
characters might have been retained when the intent was to exclude uninformative characters (since these characters were actually
uninformative, their presence should not affect the trees found in searches, but might cause discrepancies in tree lengths
compared to earlier versions).
- g%SD was reported incorrectly in output resulting from least-squares distance analysis.
- "Showtrees" crashes in some situations when no data are present.
- No error message is issued when the internal limit on the number of ambiguity combinations used for polymorphism/uncertainty
is reached.
- Treefile from "Save Trees" after SVDQuartets bootstrap uses wrong translation table.
Version 4.0a143:
Not released.
Version 4.0a142:
This update fixes several bugs affecting likelihood, distance, and SVDQuartets analyses. In addition, a console (command-line) version
is now installed along with the Windows GUI version (after installation, you should be able to invoke this console version by typing
"paup" in a command-prompt window).
Changes:
- Exhaustive quartet sampling is now permitted with "species-tree" analyses using SVDQuartets.
- Two-state (presence/absence) data may now be analyzed using SVDQuartets.
- Criterion for preferring evaluation of all-possible quartets changed: now have a simple switch for using exhaustive
evaluation if the number of quartets would be less than that specified by "nquartets".
- SVDQuartets is faster in Windows version due to a change in the way the LAPACK routines are implemented.
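One reading of the exhaustive-evaluation rule above is a simple comparison between the number of possible quartets, C(n, 4), and the requested sample size. The Python sketch below shows that arithmetic only; the function name `use_exhaustive_quartets` is hypothetical, and how PAUP* counts quartets in species-tree analyses is not covered.

```python
import math


def use_exhaustive_quartets(n_taxa, nquartets):
    """True if every possible quartet should be evaluated instead of a sample."""
    total = math.comb(n_taxa, 4)  # number of distinct 4-taxon subsets
    return total < nquartets
```

With 10 taxa there are only 210 possible quartets, so a request for 100000 sampled quartets would switch to exhaustive evaluation; with 100 taxa there are 3921225, so sampling would be kept.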
Bugs fixed:
- The model chosen by automated model selection ("automodel") is not always set as the default model for subsequent analyses,
despite output that indicates otherwise.
- Crash occurs if a "species-tree" analysis (i.e., with a taxon partition) is attempted with fewer than four species-subsets
with SVDQuartets.
- SVDQuartets does not work correctly after deleting taxa.
- Crashing and/or invalid results when using Mk model after taxon deletion.
- Branch-and-bound sometimes fails to find any trees for distance analyses when no character-data matrix is present (this bug possibly
had ramifications that extended to other kinds of searches and sorts).
- "Append" option on Log command does not work correctly in some versions.
- SVDQuartets leaks memory if bootstrap analysis is canceled before the first replicate has completed.
- Branch-and-bound searches crash if the number of non-deleted taxa is a multiple of 64.
- "Nexus Options" dialog in Windows version contains duplicated/missing items.
- Log-likelihood scores for ML analyses with amino-acid data are sometimes invalid for trees that are large enough to require
conditional-likelihood rescaling (the calculation was correct for non-vectorized double-precision evaluation; the bug only affected
rescaling with vectorized calculations using SSE).
- Distance searches did not always respect the "precision" setting when outputting tree scores.
Version 4.0a141:
This release addresses numerical problems in the Windows version that sometimes cause premature termination of maximum-likelihood
optimizations. Windows users should update to this version; the most recent Mac/Linux release remains 4.0a140.
Bugs fixed:
- Automated partitioning (and probably other likelihood optimizations) stops without reporting any results.
Version 4.0a140:
This release addresses a memory corruption issue that primarily affects the Windows version. All Windows users should
update if they are doing maximum likelihood analyses. Mac users should update as well unless it is inconvenient for
some reason.
Bugs fixed:
- Crashes occur unpredictably in the MS-Windows version when maximum-likelihood analyses are performed using
single-precision arithmetic and SSE vectorization (which is the default setting).
- A crash occurs if the "LSet" command is issued before data have been loaded (all versions).
Version 4.0a139:
This is yet another quick bug-fix release to fix a stupid mistake I made while rushing to get an update ready for
a workshop. The bug was serious; please apply the update. Also check the release notes for previous versions if you
have not already done so.
Bugs fixed:
- Likelihood of models containing both gamma-distributed rates and invariable sites ("+I+G") is calculated incorrectly.
Version 4.0a138:
This is a quick bug-fix release to address several issues with maximum likelihood in version 4.0a137.
Please see the release notes for that version as well.
Bugs fixed:
- Crashes, freezes, or incorrect calculations after changing likelihood models.
- Files output from LScores are no longer compatible with JModelTest and MrModelTest.
- Crashes on the second or later ML analysis using the GTR model.
Version 4.0a137:
New features:
- The SVDQuartets species-tree method of Chifman and Kubatko is now available (see Chifman, J., and Kubatko, L. 2014.
Quartet inference from SNP data under the coalescent model. Bioinformatics 30:3317–3324.)
- Partitioned likelihood models are now available (from the command line only). Unfortunately, I have not had a chance
to document how to use them, but this will be a high priority, as will dialog-box support for the Mac and Windows versions.
- Likelihood of the "unconstrained" (multinomial) model can now be computed when missing/ambiguous data are present,
using the method of Waddell (2005). This calculation is usually fast, but it might be slow for very large data sets
with a lot of missing data. Consequently, you need to request unconstrained likelihoods as a new option on the
LScores command (or by checking the relevant box in the Tree Scores->Likelihood dialog box).
- Calculation and output of AIC(c) and BIC scores is now supported in the LScores command (Tree Scores->Likelihood
menu command).
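For data without missing or ambiguous states, the likelihood of the "unconstrained" (multinomial) model mentioned above is the sum over observed site patterns of n_i ln(n_i / N), where n_i is the count of pattern i and N is the number of sites. The Python sketch below covers only that complete-data case; it does not implement the Waddell (2005) treatment of missing data, and the function name is hypothetical.

```python
import math
from collections import Counter


def unconstrained_log_likelihood(alignment):
    """Multinomial lnL = sum over patterns of n_i * ln(n_i / N).

    `alignment` is a list of equal-length sequences with no missing or
    ambiguous states (the complete-data case only).
    """
    patterns = Counter(zip(*alignment))   # one column = one site pattern
    n_sites = sum(patterns.values())
    return sum(n * math.log(n / n_sites) for n in patterns.values())
```

This value is an upper bound on the log-likelihood that any tree and substitution model can reach for the same data, which is why it is useful as a reference point.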
Other changes:
- Taxon labels in plotted trees are now auto-sized based on number of taxa and page size. Once the font size has
been explicitly set by the user, however, it remains in effect for the remainder of the run (or until it is reset again).
Important note:
Maximum-likelihood bootstrapping has apparently been broken for over a year without me (or
apparently anyone else) noticing it. The problem was so severe that it is extremely unlikely that you would have
accepted the results as valid (in fact you probably would not have even been able to get a run to finish).
However, if you somehow managed to perform an ML bootstrap analysis with any version between 4.0a130 and
4.0a136, you need to repeat the analysis.
Bugs fixed:
- ML bootstrapping fails to run correctly or produces obviously invalid results (see above).
- ML analyses do not always respect character weights intended to be treated as pattern-frequency counts.
- Memory is corrupted, leading to crashes or allocation failures, after changing parsimony character types.
- Tree lists do not handle multiple tokens (e.g., "lscores 2;" works but "lscores 2 5;" does not); bug introduced
at version 4.0a134.
- Statefreqs/Basefreqs command crashes if issued before any other model-based analysis has been performed.
- Option to set number of decimal places for branch-lengths in output treefiles is being ignored.
- Probabilities of state assignments for ML reconstructions (e.g., "lset allprobs; desc/xout=both;") are
not correct for some models (at least G+I).
- Ancestral-state reconstructions under Mk(v) models are not correct.
- Crashes after showing error message for unrecognized option on HSearch command.
- Windows-version: mouse-wheel activity while the recalled command-list is disclosed also scrolls the text
in the main display window.
- Slider controls are not updating the text values in the Windows-version Startup Preferences dialog.
- Output window width is not always in sync with output line length under Windows 7 (but OK under XP).
- Crash when attempting to plot unrooted NJ, bootstrap, etc., trees.
- LSet command output by Automodel includes all 6 components of the 6ST rate matrix. The last "1" causes
the command not to work if pasted back into the command or a file.
Version 4.0a136:
New features:
- Added "autopart" command and "Automated partitioning" menu command to choose partitioned models a la PartitionFinder.
(Partitioned models are not yet supported in PAUP*, but the output from this command can be used to set up partitioned models
in Garli or RAxML.)
- Added "subtreecolor" command to specify colors on plotted trees for subtrees induced by a list of taxa (OSX version only).
Do "subtreecolor ?;" for syntax (e.g., "subtreecolor red 1-3 5-8;"). Similar commands are available for coloring only the tip
labels ("tipcolor") or both the subtree and the tip labels ("taxcolor").
Other changes:
- The default setting for reading treefiles has been changed to automatic storing of branch lengths and tree weights
unless storeBrLens=no or storeTreeWts=no is explicitly requested. (These options are still useful if treefiles
contain many trees and the branch lengths and/or tree weights are not needed, in which case they
can consume a lot of memory unnecessarily.)
- Added an option to suppress showing of taxon (tip) labels on graphical tree plots.
- Made a few tweaks to graphical tree drawing that arguably improve appearance.
- Output command-line equivalent when optimality criterion is changed from the Analysis menu.
- Restored support for "nowarn" option on ClearTrees command. The option does nothing, but this restores backward
compatibility with old scripts.
Bugs fixed:
- L-BFGS optimizer fails on some data sets with clock models under default (Thorne) parameterization.
- Bootstrap support values written to treefiles were either garbage or associated with incorrect nodes.
- Parsimony analysis crashes with asymmetric stepmatrix characters.
- Optimization of G+I models is incorrect under clock model with some parameterizations.
- Glitches with formatting/truncation of tree scores for output (e.g., pscores); introduced in 4.0a134.
- Crash or other misbehavior after deleting taxa when no trees were in memory.
- Crash with topology constrained distance searches.
- User-input branch lengths are not retained when trees are pruned after deletion of taxa.
- Fixed truncation of user-input branch lengths to integer values when describing trees in the absence of a data matrix.
- SaveDist command ignores "append" option.
- K2P distances were incorrectly calculated as "zero" when the only differences between two taxa were transitions.
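For reference, the last fix above can be checked against the standard Kimura 2-parameter formula, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion. The Python sketch below is only an illustration of that textbook formula, not PAUP*'s implementation; the function name and the decision to skip gaps and ambiguity codes are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence has
    a gap or ambiguity code are skipped (an assumption, not PAUP*'s rule).
    """
    compared = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log / math.sqrt raise ValueError when the sequences are too
    # divergent for the formula to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A<->G) in ten sites: P = 0.1, Q = 0.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

With Q = 0 the formula reduces to -(1/2) ln(1 - 2P), which is strictly positive whenever P > 0, so a pair of taxa differing only by transitions should never receive a distance of zero.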
Version 4.0a135:
This version was not distributed. See release notes for 4.0a136.
Version 4.0a134:
New feature:
- The "automodel" command ("Automated Model Selection" on the Trees menu) is now available. This command emulates
the function of the ModelTest/JModelTest programs without having to run a separate standalone program. Models
may be chosen from one of four model sets (corresponding to those in JModelTest) using the AIC, AICc, BIC,
and DT criteria.
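As a rough illustration of three of the criteria named above, the Python sketch below computes AIC, AICc, and BIC from a log-likelihood using their standard textbook definitions. It is not taken from PAUP*: the function name is hypothetical, and treating the number of sites as the sample size and leaving the choice of which parameters to count to the caller are assumptions. The DT criterion is not shown.

```python
import math


def information_criteria(ln_likelihood, n_params, sample_size):
    """Return (AIC, AICc, BIC) for a fitted model.

    Hypothetical helper; sample_size is assumed here to be the number
    of sites in the alignment.
    """
    k, n = n_params, sample_size
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless sample_size > n_params + 1")
    aic = 2 * k - 2 * ln_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = k * math.log(n) - 2 * ln_likelihood
    return aic, aicc, bic


# Made-up numbers: lnL = -1000 with 5 free parameters and 500 sites.
aic, aicc, bic = information_criteria(-1000.0, 5, 500)
```

Under each criterion the model with the smallest value is preferred, so a more complex model must improve the likelihood enough to offset its extra parameters.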
Other changes:
- Additional care is now undertaken when optimizing ML models containing both gamma-distributed rates and
a proportion of invariable sites ("G+I"). Most importantly, an occasional problem is now avoided where ending
at a local optimum causes the likelihood of a more complex model to be worse than that of a simpler model
nested within it.
- Minimum output width is now 100 characters, which requires a display with at least an 800x600 screen
resolution. (I just needed a little more breathing room for some output tables.)
- The long-deprecated "nowarn" option on ClearTrees command has been removed; I assume you wouldn't issue
the command unless you were already sure you didn't want to clear the trees.
Bugs fixed:
- Files containing multibyte Unicode characters are being truncated when saved from Mac editor.
- NJ/UPGMA segfaults in command-line version when saving to treefile is requested.
- Interface freezes if a list item is double-clicked in "Show Reconstructions" dialog box on Mac.
Version 4.0a133:
Changes:
- Improve font appearance in Windows version.
- Improve installer and auto-update interface in Windows version.
- "Restore Open Documents" events are now ignored in the Mac version. Previously, documents that were open the last
time PAUP* was quit were automatically re-opened by the operating system. However, this causes problems for PAUP* when
the default starting mode is "Execute" but the document being re-opened is not an executable NEXUS file.
Bugs fixed:
- Crash occurs while processing character-partitions entered in "vector" format.
- Crash occurs with likelihood if the number of unique site patterns is less than the number of multithreaded tasks.
- "Find" fails in Mac editor if document contains multibyte Unicode characters.
- Setting of gapMode to "newState" from GUI is ignored.
Version 4.0a132:
This version was not distributed.
Version 4.0a131:
- Fixed crashing when rate matrix parameters for GTR model were estimated by maximum likelihood.
Version 4.0a130:
Important note:
After a bit of soul-searching, I have decided to change the default multithreading setting to be a
single thread. Previously, I queried the system for the number of cores available and defaulted to one thread per
core. The problem with this method is that many machines now have a large number of cores, and using this many
threads degrades rather than enhances performance. Also, on a machine running multiple long jobs, better system
performance is obtained by not having all of these jobs compete for the same set of cores. This is especially
relevant for jobs running on HPC clusters using the Sun Grid Engine for scheduling--the operating system will
happily report that (say) 16 cores are available even though there are already 15 other jobs running on a node.
So now, if you want to use multithreading for maximum likelihood calculations, you must explicitly specify
lset nthreads=n;
(where n > 1) before beginning a likelihood analysis (or use the
equivalent setting from the "Optimization" pane of the "Likelihood Settings" dialog box). You will probably want
to do some experimentation to determine the best value for nthreads--two to four may be better than one, but
depending on the size of the data, requesting too many threads will be detrimental. You can also use
lset nthreads=auto;
to obtain the old behavior of using one thread per core.
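If you generate PAUP* command files from a script, a small helper along the following lines can make the thread count an explicit, per-job choice while you experiment. This is only a Python sketch: the function name is hypothetical, and it emits nothing beyond the two command forms shown above.

```python
def lset_nthreads_command(nthreads):
    """Build the lset command that requests multithreaded likelihood.

    Hypothetical helper. Accepts an integer greater than 1, or the
    string "auto" for the old one-thread-per-core behavior.
    """
    if nthreads == "auto":
        return "lset nthreads=auto;"
    if isinstance(nthreads, bool) or not isinstance(nthreads, int) or nthreads <= 1:
        raise ValueError("nthreads must be an integer > 1 or 'auto'")
    return f"lset nthreads={nthreads};"


# Candidate settings to time against the single-thread default.
commands = [lset_nthreads_command(n) for n in (2, 3, 4)]
```

Each generated line would go before the likelihood analysis in the command file, one setting per run, so that timings can be compared against the single-thread default.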
New features:
- Balanced minimum evolution is now available for distance analysis, either from the Distance Settings dialog box or by using
"objective=bme" from the DSet command.
Changes:
- Likelihood scores are now written to score-file in sorted order (unless single-site likelihoods are requested).
- By default, character/taxon partitions and ratesets are now output as lists of character numbers rather than in the previous
"dot-plot" format. The old format is still available by specifying "/dotplot" on the ShowCharPtns, ShowTaxPtns, and ShowRatesets
commands, or by choosing the relevant menu items from the "Show Other" submenu.
- Many internal changes were made in anticipation of support for partitioned likelihood models. I had hoped
support for partitioning would materialize for this release, but there are still too many issues remaining to be
resolved.
Bug fixes:
- Fixed problem with output of single-character tree lengths to score-file using PScores (Tree Scores->Parsimony) command
(the lengths for the last tree were output for every tree in the file).
- Fixed crashing or other misbehavior of BaseFreq command ("Base/AA Frequencies" menu command).
- Fixed context-dependent crashing of the Likelihood Settings dialog box.
- Fixed failure to compute a valid kernel agreement subtree (KAST) or Adams consensus tree, due to the same underlying cause.
- Fixed crash after closing "Permutation tests..." dialog box.
- Fixed crash during constrained search if taxa were deleted after a previous constrained search was completed.
- Fixed possible crash while showing histogram from exhaustive search when the number of histogram classes exceeds the range of tree
scores.
- Fixed failure to respect the "taxlabmatch=relaxed" setting.
- Fixed possible crash after using DiscardData command (due to failure to reallocate a default ML model).
- Fixed failure of branch-and-bound search to save optimal tree(s) with large-ish data sets.
- Fixed crash with branch-and-bound search under distance criterion.
- Fixed a problem with filtering trees when convexity constraints were in effect.
- Fixed cosmetic glitches in "Evaluate random trees" dialog box.
- Fixed help window popping up if the user clicked in the "Base weight" field of the Reweight Characters dialog box.
- Fixed several glitches in Windows-version dialog boxes.
- Fixed glitch in Windows-version Edit menu causing some items to not be displayed.
- "Execute " item in File menu is now updated after a "Save As" to a different name.
- Fixed possible failure of "Find" window in editor not appearing due to being located offscreen.
- Fixed failure to remember previous main window location if it was to the left of the monitor containing the menu bar.
- Fixed possible failure to redefine the "MissAmbig" character set correctly after taxa were deleted.
- Fixed problem with export of data containing characters with multistate taxa with datatype=standard.
- Fixed problems with output of estimated model parameters to tree-score file from LScores.
- Restored compatibility of LScores tree-score file with Shimodaira's CONSEL program.
- Fixed crashing with 2-state (Cavender-Farris) ML models when vectorization was in effect.
- Fixed glitches in graphical tree plots when the default "show as phylogram" setting was changed to "cladogram" (e.g.,
the wrong tree was sometimes shown when multiple trees were stored in memory).
- Fixed crashing with exhaustive search under distance-based optimality criteria.
- Fixed incorrect calculation of likelihoods under Mk(v) model when number of observed states was variable across characters and
molecular clock was enforced.
- Fixed output of patristic distance and homoplasy matrices associated with "DescribeTrees" command.
- Fixed broken Windows-version auto-updater.
Version 4.0a129:
New features:
- Multiple characters blocks now auto-create a charpartition 'Chars' with subsets defined by block names specified in TITLE commands.
- Charsets can now also be specified as charPtnName.subsetName, where charPtnName is a previously defined
charpartition, and subsetName is the name assigned to one of its subsets.
Bug fixes:
- Fixed problems with combination of conditional-likelihood rescaling, invariable-sites models, and multithreading. Rescaling was
not being done correctly, with the result that data sets and trees containing larger numbers of taxa were more strongly affected.
- Fixed problems with ancestral-state reconstruction and single-site likelihood calculation when vectorization was enabled.
- Fixed problem in code for checking character partitions that caused ML site-specific rates and partition
homogeneity test to fail.
- Improved robustness of parameter estimation when the model of among-site rate variation includes both gamma-distributed
rates and a proportion of invariable sites.
- Fixed more problems with multiple characters blocks (including cleanup after memory allocation failure).
- Fixed spurious quoting of taxon names when importing FASTA-format files.
- Fixed hang or crash if state-frequencies were estimated under GTR model but rate matrix was fixed.
- Fixed crash if single-site likelihoods were requested with Mkv model.
- Fixed problems saving files with file or directory names containing non-ASCII characters.
- Eliminated memory leak if character or taxa partitions are redefined.
- Editor caret is placed at the beginning of a document created by importing a foreign format, rather than leaving it
at the end of the document.
Version 4.0a128:
Yes, I know. The previous version (4.0a126) was a disaster. I decided to add support for multiple Characters blocks in
Nexus input files in order to support (e.g.) Mesquite-generated files that were causing PAUP* to crash. This turned
into a big mess and I had to rush a version out for the European Molecular Evolution Workshop. Several new bugs were
introduced during this period. Unfortunately, I then got hopelessly busy on
non-PAUP-related projects and couldn't work on the program for a while.
Anyway, the problems fixed in this release include:
- Failure to retain charsets, taxsets, etc. following a Data or Characters block.
- Failure to respect the "respectCase" option in the Format command of a Data or Characters block.
- Completely broken likelihood analysis for amino-acid data.
- Crashing when reading in user-supplied distances using a Distances block.
- Crashing of 64-bit versions when reading trees containing output comments (e.g., files output by Garli).
- Minor issues associated with file-selection dialogs.
- %SD values with least-squares distances using weight powers other than 2 were being calculated incorrectly in the
DScores command (but not DescribeTrees).
- Failure to output a consensus tree if it was completely unresolved, and fixed possible problems with showing
a consensus tree when input trees were rooted.
- Crashing in consensus-tree calculation when midpoint rooting was enabled.
- "CountSwaps" command crashing when topological constraints were being imposed.
With regard to multiple Characters blocks, note that TITLE and LINK blocks created by Mesquite are currently ignored.
If you use multiple Characters blocks, they must conform to a preceding Taxa block (i.e., the same taxa, in the same
order). Eventually I may try to improve support for Mesquite's extensions to the Nexus format, but this is a start.
This version also provides a few additional fixes and improvements:
- Linux and MacOSX/10.5+ builds now contain support for embedded Python. Documentation will come soon.
- In GUI versions, the "ToNexus" command can now be used to import files in formats other than Nexus directly into a
new editor window (by not specifying a "tofile"). Previously, it was only possible to send the converted data to a
file.
- Include-exclude characters dialog box is now much more responsive in Mac version when the number of characters
shown in the lists is extremely large.
- Provided an option for retaining underscore characters in taxon names, charsets, etc., rather than converting to
whitespace (use "Nexus Options" dialog box or "set underscore=[keep|convert];" from the command line).
- Support for import of distance matrices in Phylip and Relaxed Phylip format has been improved. A bug causing
import of Phylip-formatted matrices to fail was fixed.
- Distance matrices may now be exported in either traditional or relaxed Phylip formats.
- Added ability to read user-supplied least-squares weights for distance analysis (arcane for most users).
- Added some new capabilities for least-squares distances, including exponential weights and "geometric" versions of
the percent standard deviation statistic (see Waddell et al. 2010,
http://arxiv.org/abs/1012.5882v1).
Version 4.0a127:
This build was not generally released.
Version 4.0a126:
- Fixed bug in AU test that could cause incorrect P values, crashes, or freezes in cases where a candidate tree was never the best
tree in any bootstrap replicate. It appears that in most cases, the effects of the bug were inconsequential, causing a small P value
to be reported as a different small P value (often within the range of the bootstrap error variance). However, it is possible that
it could have caused more serious errors, and rerunning any analyses that may have been affected is recommended.
- Improved speed of storage and retrieval of large numbers of trees by using better hashing strategies.
- Fixed some problems with inability to open files using relative pathnames in file specifications.
- Added a new mode to the "CD" command: "CD *;" from an active input file explicitly sets the current
working directory to the directory containing this file.
- The default setting for handling character weights in bootstrap analysis has been changed to the more natural "treat as repeat
counts" option. Note that this option cannot be used if character weights take noninteger values; in that case an error message will
now be issued unless one of the other options for weight-handling is specified.
- Added "taxcolor" command to specify colors for taxon labels on plotted trees.
- Import of relaxed-Phylip format data files now works correctly with taxon names longer than 10 characters.
- GammaPlot command no longer requires a data matrix (which was silly).
- Tree storage is now more efficient when a substantial number of the original taxa have been deleted.
- Fixed bogus error message that agreement-metric tree distances could not be computed because one or more taxa were missing
from the tree, when taxa had indeed been deleted but the tree contained only non-deleted taxa.
- Fixed crash with ML distances and 6ST substitution models.
- Fixed crash with bootstrap under distance criterion when using ML distances.
- Fixed possible hang if a calculation thread completed while an error message alert was active (unlikely, but possible).
- Eliminated spurious message about uninformative characters not being included in Goloboff
scores in output of parsimony scores for individual characters, when Goloboff weighting was not in
effect.
- Eliminated spurious message about constant characters being ignored under Mkv model when criterion was not likelihood.
- Fixed output glitches with Goloboff-parsimony tree scores (showing of too many decimal places, failure to show tree lengths when
Goloboff parsimony was active).
- Fixed possible crash if more than 255 characters were typed into the Mac-version command line.
- Fixed possible crash with ABC tree-to-tree distances if an input tree was a star tree or if more than 128 taxa were present
and at least one input tree was nonbinary.
- Fixed incorrect output of command-line equivalent when "UPGMA" was chosen as the search method in the "Bootstrap/Jackknife"
dialog interface.
- Fixed incorrect jackknife sampling when character weights were present and the "simple" option for weight handling was requested.
- Catch user-entered tree-number of "0" as an error, rather than crashing.
Version 4.0a125:
- Fixed problem with scoring of distance trees containing negative branch lengths (introduced in 4.0a124).
- Removed some debugging output accidentally left enabled.
Version 4.0a124:
- The "usertree" command is now available. It can be used to quickly input a single tree without having to deal
with the complexity of a full Trees block.
- Importing of distance matrices in Phylip and relaxed Phylip format is now supported.
- Alphanumeric names are now recognized in tree lists when trees have non-integer names (the integer number
can still be used).
- Exporting to Phylip (character-data and distances) now replaces blanks with underscores appropriately, depending
on the context (tree descriptions vs. data/distance matrices).
- Switched to "Meslo LG" as the main display and default editor font--Apple's Menlo Regular font had
problems with the width of the '%' glyph as well as some other undesirable features, and the previously used
Deja Vu Sans Mono had issues with the hyphen (too short) and the tilde character (not curvy enough to be
easily distinguishable from a hyphen).
- Tightened up Print/Preview Tree(s) code, trying to fix an elusive crashing bug reported by users that I
haven't been able to reproduce.
- More "under the hood" work in preparation for partitioned/mixture models (hopefully this didn't break anything)
- Fixed some minor interface issues:
- incorrect radio-button titles in Gaps panel of Parsimony Settings dialog (Windows version only)
- cursor remaining as watch cursor after cmd-S in Mac editor
- better error diagnostics when end-of-file is reached unexpectedly while executing a file
Version 4.0a123:
- The Macintosh OS X editor has been completely rewritten for OS X Leopard and later (now based on the Cocoa
NSTextView class rather than the buggy and poorly supported Multilingual Text Engine [MLTE]). Improvements
include:
- support for Unicode text
- fast search and replace
- ability to handle arbitrarily long lines
- elimination of problems involving tabs
- user-selectable fonts
- zippier performance in general
- Added calculation and output of the "Kernel Maximum Agreement Subtree" (KAST) along with previous agreement
subtree calculations. The KAST is the subtree containing only those taxa found in every optimal agreement subtree.
- Added "Defaults" buttons to "Import Data", "Export Data", "Get Trees", and "Save Trees" dialogs so that users
can choose their own preferences for the default settings.
- Changed the command-line flags for Unix and Mac command-line versions, going to a more standard POSIX/Gnu-like
system. This may break some people's scripts that use the old flags for starting PAUP*, but it just had to be done.
You can start paup with the -h or --help flags to get a summary of the options, and modifying the script to use
the new syntax should be trivial.
- Added command-line argument support for invoking the Mac GUI version of PAUP* from the prompt in a terminal window
or from a script. (Ordinarily, you would just use the non-GUI version instead, but there are circumstances
where the ability to invoke the GUI version without relying on the Finder is handy.)
- Multithreading is now working for ML calculations in the Windo