Motif Sequence Searching - Symbols and Codes

Symbols

Symbols introduce controlled variability into the search query.

| Use this symbol: | When you want to: | Example | Possible Answers |
|---|---|---|---|
| . | Translate to X for protein, or enumerate to A, G, C, and T for nucleotide (generates four search terms). | S.GKD | SXGKD/SRGKD |
| ... | Match any nucleotide or amino acid at each position marked by a dot. | S....GKD | SXXXXGKD/SFTSYGKD |
| [XYZ] | Designate a set of possible nucleotide or amino acid matches within the square brackets. | [SG]XXXXGKD | SXXXXGKD/GXXXXGKD |
| ^XYZ$ | Search for the exact sequence XYZ. | ^SXXXXGKD$ | SXXXXGKD |
| {m,n} | Designate a stretch (range) of X residues at least m and at most n long. | SX{3,4}GKD | SXXXGKD/SXXXXGKD |
| {n} | Designate a stretch of X residues exactly n long. | SX{4}GKD | SXXXXGKD |
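
The symbols above behave much like regular-expression syntax. As a rough illustration only, the sketch below translates a protein motif into a Python `re` pattern so the example rows can be checked locally. The function names `motif_to_regex` and `matches` are hypothetical, and treating X as "any single residue" is an assumption drawn from the examples; the search tool's own matching may differ.

```python
import re

def motif_to_regex(motif):
    """Translate a protein motif into a Python regex (illustrative only).

    Assumption: X outside square brackets stands for any single residue,
    so it maps to '.'. The symbols '.', [..], {m,n}, {n}, ^ and $ are
    passed through unchanged because the regex meaning is the same.
    """
    out = []
    in_set = False  # True while we are inside [...]
    for ch in motif:
        if ch == "[":
            in_set = True
        elif ch == "]":
            in_set = False
        out.append("." if ch == "X" and not in_set else ch)
    return "".join(out)

def matches(motif, sequence):
    """Return True if the motif occurs anywhere in the sequence."""
    return re.search(motif_to_regex(motif), sequence) is not None

print(matches("S.GKD", "SRGKD"))            # one wildcard position
print(matches("SX{3,4}GKD", "SFTSYGKD"))    # three to four wildcard residues
print(matches("^SX{4}GKD$", "ASFTSYGKD"))   # anchored: the whole sequence must match
```

Only the protein case is sketched here; for nucleotide queries the table describes the dot as being enumerated into separate search terms rather than matched as a wildcard.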

Amino Acid Codes

Amino acid codes create a positive-scoring mismatch for amino acids. For example, a sequence query containing a B retains the B, and the B scores positively against D or N in the hit (subject) sequence.

| Use this symbol: | When you want to search: | Example | Possible Answers |
|---|---|---|---|
| X | any amino acid | LXRK | Allows matching of any amino acid at the position of X. |
| B | D or N | LBRK | LDRK or LNRK |
| Z | E or Q | LZRK | LERK or LQRK |
| J | I or L | LJRK | LIRK or LLRK |
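
To make the code-to-residue relationships concrete, here is a minimal sketch that checks whether a query containing ambiguity codes is compatible with a subject sequence of the same length. It does not reproduce the tool's scoring scheme; it only records which residues each code stands for. The names `AMBIGUOUS_AA`, `residue_compatible`, and `query_compatible` are hypothetical.

```python
# Ambiguity codes from the table above, mapped to the residues they cover.
AMBIGUOUS_AA = {"B": "DN", "Z": "EQ", "J": "IL"}

def residue_compatible(query, subject):
    """Return True if a single query residue can pair with a subject residue."""
    if query == "X":  # X stands for any amino acid
        return True
    return subject == query or subject in AMBIGUOUS_AA.get(query, "")

def query_compatible(query, subject):
    """Compare two equal-length sequences position by position."""
    return len(query) == len(subject) and all(
        residue_compatible(q, s) for q, s in zip(query, subject)
    )

for subject in ("LDRK", "LNRK", "LERK"):
    print(subject, query_compatible("LBRK", subject))
```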

 

Nucleotide Codes

Nucleotide codes generate multiple queries: each degenerate code is replaced by the nucleotides it represents.

| Use this symbol: | When you want to search: | Example | Possible Answers |
|---|---|---|---|
| N | A or C or G or T | ACNT | ACAT or ACCT or ACGT or ACTT |
| R | A or G | ACRT | ACAT or ACGT |
| Y | C or T | ACYT | ACCT or ACTT |
| M | A or C | ACMT | ACAT or ACCT |
| K | G or T | ACKT | ACGT or ACTT |
| S | C or G | ACST | ACCT or ACGT |
| W | A or T | ACWT | ACAT or ACTT |
| H | A or C or T | ACHT | ACAT or ACCT or ACTT |
| B | C or G or T | ACBT | ACCT or ACGT or ACTT |
| V | A or C or G | ACVT | ACAT or ACCT or ACGT |
| D | A or G or T | ACDT | ACAT or ACGT or ACTT |
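
The expansion of a degenerate nucleotide query into concrete queries can be sketched as follows. This is an illustration of the idea, not the tool's implementation; the names `IUPAC_NT` and `expand_query` are hypothetical, and the mapping simply restates the table.

```python
from itertools import product

# Each code mapped to the nucleotides it represents (from the table above).
IUPAC_NT = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
}

def expand_query(query):
    """Return every concrete sequence a degenerate query stands for."""
    choices = (IUPAC_NT[base] for base in query)
    return ["".join(combo) for combo in product(*choices)]

print(expand_query("ACNT"))  # ['ACAT', 'ACCT', 'ACGT', 'ACTT']
print(expand_query("ACRT"))  # ['ACAT', 'ACGT']
```

The number of generated queries is the product of the option counts at each position, so a query with several degenerate codes expands quickly (for example, two N positions give 16 sequences).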

 
