RepeatMasker is a program that screens DNA sequences for interspersed
repeats and low complexity DNA sequences. The output of the program is a
detailed annotation of the repeats that are present in the query
sequence as well as a modified version of the query sequence in which
all the annotated repeats have been masked (default: replaced by Ns).
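
To illustrate what "masked" means here, the following minimal sketch
replaces one annotated interval of a sequence with Ns. It is not
RepeatMasker code: mask_region is a hypothetical helper, and the sequence
and coordinates (1-based, inclusive) are made up for the example.

```sh
# Hypothetical helper, NOT part of RepeatMasker: replace positions
# START..END (1-based, inclusive) of a sequence with the letter N.
mask_region() {
    seq=$1; start=$2; end=$3
    printf '%s\n' "$seq" | awk -v s="$start" -v e="$end" '{
        masked = ""
        for (i = s; i <= e; i++) masked = masked "N"
        # keep the flanking sequence, substitute Ns for the interval
        print substr($0, 1, s - 1) masked substr($0, e + 1)
    }'
}

mask_region ACGTACGTACGT 5 8   # prints ACGTNNNNACGT
```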

Currently over 56% of human genomic sequence is identified and masked by
the program. Sequence comparisons in RepeatMasker are performed by one
of several popular search engines including:

- nhmmer (part of 'HMMER', available on SBo)
- Cross_Match. Due to licensing, you should obtain this yourself:
  http://www.phrap.org
- ABBlast/WUBlast. Due to licensing, you should obtain this yourself:
  https://blast.advbiocomp.com/licensing/
- RMBlast (found as 'ncbi-rmblastn' on SBo)

RepeatMasker makes use of curated libraries of repeats and currently
supports Dfam (profile HMM library derived from Repbase sequences) and
Repbase, a service of the Genetic Information Research Institute.

WARNING!
Due to the bundled databases, the installed size of this package is 2.1 GB!

NOTE!
The package is installed in /opt. After installation, go to
/opt/RepeatMasker and run the RepeatMasker configuration program:

# perl ./configure

See README.SLACKWARE for details.