RepeatMasker is a program that screens DNA sequences for interspersed
repeats and low complexity DNA sequences. The output of the program is a
detailed annotation of the repeats that are present in the query
sequence as well as a modified version of the query sequence in which
all the annotated repeats have been masked (default: replaced by Ns).

Currently over 56% of human genomic sequence is identified and masked by
the program. Sequence comparisons in RepeatMasker are performed by one
of several popular search engines including:

- nhmmer (part of 'HMMER', available on SBo)
- Cross_Match. Due to licensing, you should obtain this yourself:
  http://www.phrap.org
- ABBlast/WUBlast. Due to licensing, you should obtain this yourself:
  https://blast.advbiocomp.com/licensing/
- RMBlast (found as 'ncbi-rmblastn' on SBo)

RepeatMasker makes use of curated libraries of repeats and currently
supports Dfam (profile HMM library derived from Repbase sequences) and
Repbase, a service of the Genetic Information Research Institute.

WARNING!
Due to the bundled databases, the installed size of this package is
1.8 GiB!

NOTE!
The package is installed in /opt. After installation, go to
/opt/RepeatMasker and run the RepeatMasker Configuration Program:

# perl ./configure

See README.SLACKWARE for details.
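
Before running the configuration program, it can help to know which of
the search engines listed above are already installed. The following is
a minimal sketch, not part of the package: check_engines is a
hypothetical helper, and the executable names passed to it (nhmmer,
rmblastn, cross_match) are assumptions that may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: report where each named executable lives on PATH.
# The names passed in are assumptions; adjust them to your installation.
check_engines() {
    for exe in "$@"; do
        if loc=$(command -v "$exe" 2>/dev/null); then
            printf '%-12s %s\n' "$exe" "$loc"
        else
            printf '%-12s %s\n' "$exe" "not found"
        fi
    done
}

# Assumed executable names for HMMER, RMBlast and Cross_Match.
check_engines nhmmer rmblastn cross_match
```

Any engine reported as "not found" should be installed (or its location
noted) before it is selected during configuration.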