Books
Adams, M. D., Fields, C. and Venter, J. C. (1994). Automated DNA Sequencing and Analysis.
New York: Academic Press, 368 pages.
Bishop, M. J. (1994). Guide to Human Genome Computing. London: Academic Press, 350 pages.
Brutlag, D. L. and Sternberg, M. J. E. (1996). Sequences and Topology. London: Current
Biology Ltd., 427 pages.
Creighton, T. E. (1993). Proteins: Structures and Molecular Properties (2nd ed.).
New York: Freeman.
Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory (1st ed.). New York,
NY: John Wiley and Sons Inc.
Doolittle, R. F. (1986). Of Urfs and Orfs: A Primer on How to Analyze Derived Amino Acid
Sequences. University Science Books, Mill Valley, California.
Doolittle, R. F. (1990). Molecular Evolution: Computer Analysis of Protein and Nucleic
Acid Sequences (1st ed.). Methods in Enzymology Volume 183, New York: Academic Press.
Doolittle, R. F. (1996). Computer Methods for Macromolecular Sequence Analysis (Vol.
266). New York: Academic Press, 711 pages.
Fasman, G. D. (1989). Prediction of Protein Structure and the Principles of Protein
Conformation. New York, NY: Plenum Press.
Feller, W. (1968). An Introduction to Probability Theory and Its Applications (3rd ed.).
New York: John Wiley and Sons.
James, M. (1985). Classification Algorithms (1st ed.). New York, NY: John Wiley and Sons.
Gribskov, M. and Devereux, J. (1991). Sequence Analysis Primer. New York: Stockton Press,
279 pages.
Gusfield, D. (1997). Algorithms on Strings, Trees and Sequences (1st ed.). Cambridge,
UK: Cambridge University Press, 534 pages.
Hunter, L. (1993). Artificial Intelligence and Molecular Biology. Menlo Park, CA: AAAI
Press, 470 pages.
Hunter, L., Searls, D. and Shavlik, J. (1993). First International Conference on
Intelligent Systems for Molecular Biology. Menlo Park, CA: AAAI Press.
Knuth, D. E. (1973). Sorting and Searching. Reading, Mass.: Addison-Wesley.
Lander, E. S. and Waterman, M. S. (1995). Calculating the Secrets of Life: Applications of
the Mathematical Sciences in Molecular Biology. Washington, D.C.: National Academy Press,
285 pages.
Lesk, A. (1991). Protein Architecture: A Practical Approach. Oxford: IRL Press at Oxford
University Press, 287 pages.
Neapolitan, R. E. (1990). Probabilistic Reasoning in Expert Systems: Theory and Algorithms.
New York, NY: John Wiley and Sons.
Sankoff, D. and Kruskal, J. B. (1983). Time Warps, String Edits, and Macromolecules: The
Theory and Practice of Sequence Comparison. Reading, Massachusetts: Addison-Wesley, 382
pages.
Schulze-Kremer, S. (1994). Advances in Molecular Bioinformatics. Washington, D.C.: IOS
Press, 259 pages.
Smith, D. W. (1994). Biocomputing: Informatics and Genome Projects. New York: Academic
Press Inc., 336 pages.
Trifonov, E. N. and Brendel, V. (1986). Gnomic: A Dictionary of Genetic Codes. Balaban
Publishers, Philadelphia, Pennsylvania. 272 pages.
von Heijne, G. (1987). Sequence Analysis in Molecular Biology: Treasure Trove or
Trivial Pursuit. Academic Press, New York. 188 pages.
Waterman, M. (1988). Mathematical Methods for DNA Sequences. CRC Press, Cleveland, Ohio.
283 pages.
Waterman, M. S. (1995). Introduction to Computational Biology. Chapman & Hall Press,
London. 430 pages.
Reviews
Altschul,
S. F., Boguski, M. S., Gish, W. and Wootton, J. C. (1994). Issues in searching molecular
sequence databases. Nat Genet 6 (2), 119-29.
Boguski,
M. S. (1992). Computational sequence analysis revisited: new databases, software tools,
and the research opportunities they engender. J Lipid Res, 33(7), 957-74.
Chao,
K.-M., Hardison, R. C. and Miller, W. (1994). Recent developments in linear-space
alignment methods: A survey. J. Computational Biology 1 (4), 271-291.
Doolittle,
R. F. (1994). Protein sequence comparisons: searching databases and aligning sequences.
Curr Opin Biotechnol 5 (1), 24-8.
Felsenstein,
J. (1988). Phylogenies from molecular sequences: inference and reliability. Annual Review
of Genetics 22, 521-565.
Fischer,
D., Rice, D., Bowie, J. U. and Eisenberg, D. (1996). Assigning amino acid sequences to
3-dimensional protein folds. Faseb J 10 (1), 126-36.
Fischer,
C., Schweigert, S., Spreckelsen, C., & Vogel, F. (1996). Programs, databases, and
expert systems for human geneticists--a survey. Hum Genet, 97(2), 129-37.
Garnier,
J. and Levin, J. M. (1991). The protein structure code: what is its present status? Comput
Appl Biosci 7 (2), 133-42.
Gelfand,
M. S. (1995). Prediction of function in DNA sequence analysis. J Comput Biol 2 (1),
87-115.
Gschwend,
D. A., Good, A. C., & Kuntz, I. D. (1996). Molecular docking towards drug discovery. J
Mol Recognit, 9(2), 175-86.
Hogue,
C. W. (1997). Cn3D: a new generation of three-dimensional molecular structure viewer.
Trends Biochem Sci, 22(8), 314-6.
Holm,
L. and Sander, C. (1994). Searching protein structure databases has come of age. Proteins
19 (3), 165-73.
Holm,
L., & Sander, C. (1996). Mapping the protein universe. Science, 273(5275), 595-603.
Mural,
R. J., Einstein, J. R., Guan, X., Mann, R. C. and Uberbacher, E. C. (1992). An artificial
intelligence approach to DNA sequence feature recognition. Trends Biotechnol 10 (1-2),
66-9.
Rost,
B. and Sander, C. (1994). Structure prediction of proteins--where are we now? Curr Opin
Biotechnol 5 (4), 372-80.
Russell,
R. B., & Sternberg, M. J. (1995). Structure prediction. How good are we? Curr Biol,
5(5), 488-90.
Stormo,
G. D. (1988). Computer methods for analyzing sequence recognition of nucleic acids. Annu.
Rev. Biophys. Biophys. Chem. 17, 241-263.
Tyler,
E. C., Horton, M. R. and Krause, P. R. (1991). A review of algorithms for molecular
sequence comparison. Comput Biomed Res, 24(1), 72-96.
Vingron,
M. and Waterman, M. S. (1994). Sequence alignment and penalty choice. Review of concepts,
case studies and implications. J Mol Biol 235 (1), 1-12.
Waterman,
M. S. (1994). Parametric and ensemble sequence alignment algorithms. Bull Math Biol,
56(4), 743-67.
White,
S. H. (1994). Global statistics of protein sequences: implications for the origin,
evolution, and prediction of structure. Annu Rev Biophys Biomol Struct 23, 407-39.
3.2 Databases: what is on the internet
General References on the Internet
Cerf, V. (1991). Networks. Scientific American, 265(3), 72-84.
Dertouzos, M. L. (1991). Communications, Computers and Networks. Scientific American,
265(September), 62.
Engst, A. C. (1996). Internet Starter Kit (4th ed.). Indianapolis, IN: Hayden Books, 858
pages.
Estrada, S. (1993). Connecting to the Internet. O'Reilly & Associates Inc.,
Sebastopol, CA. 170 pages.
Jennings, D. M., Landweber, L. H., Fuchs, I. H., Farber, D. J., and Adrion, W. R. (1986).
Computer Networking for Scientists. Science 231, 943-950.
Kehoe, B. P. (1993). Zen and the Art of the Internet (2nd ed.). Englewood Cliffs,
NJ 07632: P.T.R. Prentice Hall.
Krol, E. (1992). The Whole Internet User's Guide and Catalog (2nd ed.). Sebastopol,
California: O'Reilly and Associates, Inc., 376 pages.
Quarterman, J. S. and Carl-Mitchell, S. (1994). The Internet Connection: System
Connectivity and Configuration. Addison Wesley Publishing Company, Menlo Park, CA. 270
pages.
Tesler, L. G. (1991). Networked Computing in the 1990's. Scientific American, 265(Sept.),
86.
Shimomura, T. (1996). Takedown. Hyperion Press, New York. 324 pages.
Walsh, J. (1988). Designs on a National Research Network. Science, 239, 861.
Appel, R. D., Sanchez, J.-C., Bairoch, A., Golaz, O., Ravier, F., Pasquali, C., Hughes, G.
J. and Hochstrasser, D. F. (1996).
The Swiss-2DPAGE database of two-dimensional polyacrylamide gel electrophoresis, its
status in 1995. Nucleic Acids Res., 24(1), 180-181.
Bairoch, A., Bucher, P. and Hofmann, K. (1996). The
Prosite Database, its status in 1995. Nucleic Acids Res., 24(1), 189-196.
Bairoch, A. and Apweiler, R. (1996). The
SWISS-PROT protein sequence data bank and its new supplement TREMBL. Nucleic Acids
Res., 24(1), 21-25.
Bairoch, A. (1996). The
ENZYME Data Bank in 1995. Nucleic Acids Res., 24(1), 221-222.
Bairoch, A. (1991). SEQANALREF:
a sequence analysis bibliographic reference databank. Comput Appl Biosci, 7(2), 268.
Barker, W. C., George, D. G. and Hunt, L. T. (1990). Protein
sequence database. Methods Enzymol, 183, 31-49.
Benson, D. A., Boguski, M., Lipman, D. J. and Ostell, J. (1996). GenBank.
Nucleic Acids Res., 24(1), 1-5.
Bleasby, A., Griffiths, P., Harper, R., Hines, D., Hoover, K., Kristofferson, D.,
Marshall, S., O'Reilly, N. and Sundvall, M. (1992). Electronic
communications and the new biology. Nucleic Acids Res, 20 (16), 4127-4128.
Brandt, K. A. (1993). The
GDB Human Genome Data Base: a source of integrated genetic mapping and disease data.
Bull Med Libr Assoc, 81(3), 285-92.
Chiang, D. (1994). Reaching NLM through the Internet. Med Ref Serv Q, 13(1), 83-92.
Cinkosky, M., Fickett, J. W., Gilna, P. and Burks, C. (1991). Electronic
Data Publishing and GenBank. Science, 252 (31 May), 1273-1277.
Cuticchia, A. J., Fasman, K. H., Kingsbury, D. T., Robbins, R. J. and Pearson, P. L.
(1993). The
GDB human genome data base anno 1993. Nucleic Acids Res, 21(13), 3003-6.
Engels, W. R. (1993). Contributing
software to the internet: the Amplify program. Trends Biochem Sci, 18(11), 448-50.
Fasman, K. H., Letovsky, S. I., Cottingham, R. W. and Kingsbury, D. T. (1996). Improvements
to the GDB™ Human Genome Data Base. Nucleic Acids Res., 24(1), 57-63.
Frey, A. H. (1994). The
internet biologist [news]. Faseb J, 8(14), 1110.
Frisse, M. E., Kelly, E. A. and Metcalfe, E. S. (1994). An
Internet primer: resources and responsibilities. Acad Med, 69(1), 20-4.
Fuchs, R. (1994). Sequence
analysis by electronic mail: a tool for accessing Internet e-mail servers. Comput Appl
Biosci, 10(4), 413-7.
George, D. G., Barker, W. C., Mewes, H. W., Pfeiffer, F. and Tsugita, A. (1996). The
PIR-International Protein Sequence Database. Nucleic Acids Res., 24(1), 17-20.
Heumann, K., George, D. and Mewes, H. W. (1994). A
new concept of sequence data distribution on wide area networks. Comput Appl Biosci,
10(5), 519-26.
Holm, L. and Sander, C. (1996). The
FSSP database: fold classification based on structure-structure alignment of proteins.
Nucleic Acids Res., 24(1), 206-209.
Hutchinson, F. and Donnellan, J. E., Jr. (1994). Yale
database for DNA sequence changes in mutagenesis. Nucleic Acids Res, 22(17), 3566-8.
Huysmans, M., Richelle, J. and Wodak, S. J. (1991). SESAM:
a relational database for structure and sequence of macromolecules. Proteins, 11(1),
59-76.
Jacobson, D. (1994). The
World Wide Web for biologists. Protein Sci, 3(11), 2159-61.
Jones, R. (1992). Alerting
users to relevant new entries in the GenBank DNA sequence database. Comput Appl
Biosci, 8(2), 199.
Keen, G. et al. (1996). The
Genome Sequence Database (GSDB): Meeting the challenge of genomic sequencing. Nucleic
Acids Res., 24(1), 13-16.
Krawetz, S. A. (1989). Sequence
errors described in GenBank: a means to determine the accuracy of DNA sequence
interpretation. Nucleic Acids Res, 17 (10), 3951-7.
O'Donnell, C. (1994). Obtaining
software via INTERNET. Methods Mol Biol, 24, 345-54.
Peitsch, M. C., Wells, T. N., Stampf, D. R. and Sussman, J. L. (1995). The
Swiss-3DImage collection and PDB-Browser on the World-Wide Web. Trends Biochem Sci,
20(2), 82-4.
Pietrokovski, S., Henikoff, J. G. and Henikoff, S. (1996). The
Blocks Database: A System for Protein Classification. Nucleic Acids Res., 24(1),
197-200.
Roberts, R. J. and Macelis, D. (1996). REBASE
- restriction enzymes and methylases. Nucleic Acids Res., 24(1), 223-235.
Rodriguez-Tomé, P., Stoehr, P., Cameron, G. N. and Flores, T. P. (1996). The
European Bioinformatics Institute (EBI). Nucleic Acids Res., 24(1), 6-12.
Schneider, R. and Sander, C. (1996). The
HSSP database of protein structure-sequence alignments. Nucleic Acids Res., 24(1),
201-205.
Smith, R. H., Gottesman, S., Hobbs, B., Lear, E., Kristofferson, D., Benton, D. and Smith,
P. R. (1991). A
mechanism for maintaining an up-to-date GenBank database via Usenet. Comput Appl
Biosci, 7 (1), 111-2.
Smith, T. F. (1990). The
history of the genetic sequence databases. Genomics, 6 (4), 701-7.
Stoehr, P. J. and Omond, R. A. (1989). The
EMBL Network File Server. Nucleic Acids Res, 17 (16), 6763.
Williams, G. W. and Gibbs, G. P. (1990). Automatic
updating of the EMBL database via EMBNet. Comput Appl Biosci, 6 (2), 122-3.
Williams, R. W. (1994). The
Portable Dictionary of the Mouse Genome: a personal database for gene mapping and
molecular biology. Mamm Genome, 5(6), 372-5.
Woodsmall, R. M. and Benson, D. A. (1993). Information
resources at the National Center for Biotechnology Information. Bull Med Libr Assoc,
81(3), 282-4.
Zehetner, G. and Lehrach, H. (1994). The
Reference Library System--sharing biological material and experimental data. Nature,
367(6462), 489-91.
Abarbanel, R. M., Wieneke, P. R., Mansfield, E., Jaffe, D. A. and Brutlag, D. L. (1984). Rapid searches for complex patterns in biological molecules. Nucleic Acids Res. 12, 263-280.
Allison,
L., Wallace, C. S. and Yee, C. N. (1992). Finite-state models in the alignment of
macromolecules. J Mol Evol, 35 (1), 77-89.
Cantalloube,
H., Labesse, G., Chomilier, J., Nahum, C., Cho, Y. Y., Chams, V., Achour, A., Lachgar, A.,
Mbika, J. P., Issing, W. et al. (1995). Automat and BLAST: comparison of two protein
sequence similarity search programs. Comput Appl Biosci 11 (3), 261-72.
Chao,
K. M., Zhang, J., Ostell, J. and Miller, W. (1995). A local alignment tool for very long
DNA sequences. Comput Appl Biosci 11 (2), 147-53.
Dayhoff, M. O., Schwartz, R. M. and Orcutt, B. C. (1978). A model of evolutionary change in
proteins. Atlas of Protein Sequence and Structure 5 (Suppl. 3), 345-352.
Dayhoff,
M. O., Barker, W. C. and Hunt, L. T. (1983). Establishing Homologies in Protein Sequences,
in Methods in Enzymology, 91, 524-545.
DeLisi, C. and Kanehisa, M. (1984). Assessing the Significance of Local Sequence
Homologies. Mathematical Biosciences 69, 77-85.
Doolittle,
R. F. (1981). Similar amino acid sequences: chance or common ancestry? Science
214, 149-158.
Doolittle, R. F. (1986). Of Urfs and Orfs: A Primer on How to Analyze Derived Amino Acid
Sequences. Mill Valley, California: University Science Books.
Feng,
D.F., Johnson, M.S. and Doolittle, R.F. (1985). Aligning amino acid sequences: comparison
of commonly used methods. J. Mol. Evol. 21, 112-125.
Gribskov,
M. (1994). Profile analysis. Methods Mol Biol 25, 247-66.
Grice,
J. A., Hughey, R. and Speck, D. (1995). Parallel sequence alignment in limited space. Ismb
3, 145-53.
Krogh,
A., Brown, M., Mian, I. S., Sjolander, K. and Haussler, D. (1994). Hidden Markov models in
computational biology. Applications to protein modeling. J Mol Biol 235 (5), 1501-31.
Huang,
X. Q., Hardison, R. C. and Miller, W. (1990). A space-efficient algorithm for local
similarities. Comput Appl Biosci 6 (4), 373-81.
Landes,
C. and Risler, J. L. (1994). Fast databank searching with a reduced amino-acid alphabet.
Comput Appl Biosci 10 (4), 453-4.
Lawrence,
C. E. and Reilly, A. A. (1990). An expectation maximization (EM) algorithm for the
identification and characterization of common sites in unaligned biopolymer sequences.
Proteins, 7 (1), 41-51.
Needleman,
S. B. and Wunsch, C. D. (1970). A general method applicable to the search for similarities
in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443-453.
Pearson,
W. R. and Miller, W. (1992). Dynamic programming algorithms for biological sequence
comparison. Methods Enzymol, 210, 575-601.
Pearson,
W. R. (1991). Searching protein sequence libraries: comparison of the sensitivity and
selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11 (3), 635-50.
Pearson,
W. R. (1995). Comparison of methods for searching protein sequence databases. Protein Sci
4 (6), 1145-60.
Rechid,
R., Vingron, M. and Argos, P. (1989). A new interactive protein sequence alignment program
and comparison of its results with widely used algorithms. Comput Appl Biosci, 5 (2),
107-13.
Resenchuk,
S. M. and Blinov, V. M. (1995). ALIGNMENT SERVICE: creation and processing of alignments
of sequences of unlimited length. Comput Appl Biosci 11 (1), 7-11.
Reeck,
G. R., de Haen, C., Teller, D. C., Doolittle, R. F., Fitch, W. M., Dickerson, R. E. (1987).
"Homology" in Proteins and Nucleic Acids: A Terminology Muddle and a Way out of
It. Cell 50, 667.
Searls,
D. B. and Murphy, K. P. (1995). Automata-theoretic models of mutation and alignment. Ismb
3, 341-9.
Smith,
T. F. and Waterman, M. (1981). Identification of common molecular subsequences. J. Mol.
Biol. 147, 195-197.
Smith,
T., Waterman, M. and Fitch, W. (1981). Comparative biosequence metrics. J. Mol. Evol. 18,
38-46.
Strelets,
V. B., Shindyalov, I. N., Kolchanov, N. A. and Milanesi, L. (1992). Fast, statistically
based alignment of amino acid sequences on the base of diagonal fragments of DOT-matrices.
Comput Appl Biosci, 8 (6), 529-34.
Waterman,
M. S., Eggert, M. and Lander, E. (1992). Parametric sequence comparisons. Proc Natl Acad
Sci U S A, 89 (13), 6090-3.
Scoring Systems
Allison,
L. (1993). Normalization of affine gap costs used in optimal sequence alignment. J Theor
Biol 161 (2), 263-9.
Altschul,
S. F. (1993). A protein alignment scoring system sensitive at all evolutionary distances.
J Mol Evol 36 (3), 290-300.
Benner,
S. A., Cohen, M. A. and Gonnet, G. H. (1993). Empirical and structural models for
insertions and deletions in the divergent evolution of proteins. J Mol Biol 229 (4),
1065-82.
Brutlag,
D. L., Dautricourt, J. P., Maulik, S. and Relph, J. (1990). Improved sensitivity of
biological sequence database searches. Comput Appl Biosci 6 (3), 237-45.
Gonnet,
G. H., Cohen, M. A. and Benner, S. A. (1992). Exhaustive Matching of the Entire Protein
Sequence Database. Science 256 (5062), 1443-5.
Henikoff,
S. (1996). Scores for Sequence Searches. Current Opinion in Structural Biology 6 (3),
353-360.
Johnson,
M. S., Overington, J. P. and Blundell, T. L. (1993). Alignment and searching for common
protein folds using a data bank of structural templates. J Mol Biol 231 (3), 735-52.
Jones,
D. T., Taylor, W. R. and Thornton, J. M. (1992). The rapid generation of mutation data
matrices from protein sequences. Comput Appl Biosci, 8 (3), 275-82.
Luthy,
R., McLachlan, A. D. and Eisenberg, D. (1991). Secondary structure-based profiles: use of
structure-conserving scoring tables in searching protein sequence databases for structural
similarities. Proteins 10 (3), 229-239.
Overington,
J., Donnelly, D., Johnson, M. S., Sali, A. and Blundell, T. L. (1992).
Environment-specific amino acid substitution tables: tertiary templates and prediction of
protein folds. Protein Sci 1 (2), 216-26.
Schwartz, R. M. and Dayhoff, M. O. (1979). Matrices for Detecting Distant Relationships.
Atlas of Protein Sequence and Structure 5 (Suppl. 3), 353-358.
Wilbur,
W. J. (1985). On the PAM matrix model of protein evolution. Mol Biol Evol 2 (5), 434-47.
Zhu,
Z. Y., Sali, A. and Blundell, T. L. (1992). A variable gap penalty function and feature
weights for protein 3-D structure comparisons. Protein Eng 5 (1), 43-51.
Aligning Sequences to Structures
Bryant,
S. H. and Altschul, S. F. (1995). Statistics of sequence-structure threading. Curr Opin
Struct Biol 5 (2), 236-44.
Casari,
G., Sander, C. and Valencia, A. (1995). A method to predict functional residues in
proteins. Nat Struct Biol 2 (2), 171-8.
Diederichs,
K. (1995). Structural superposition of proteins with unknown alignment and detection of
topological similarity using a six-dimensional search algorithm. Proteins 23 (2), 187-95.
Fischer,
D., Rice, D., Bowie, J. U. and Eisenberg, D. (1996). Assigning amino acid sequences to
3-dimensional protein folds. Faseb J 10 (1), 126-36.
Godzik,
A. and Skolnick, J. (1994). Flexible algorithm for direct multiple alignment of protein
structures and sequences. Comput Appl Biosci 10 (6), 587-96.
Holm,
L. and Sander, C. (1993). Protein structure comparison by alignment of distance matrices.
J Mol Biol 233 (1), 123-38.
Holm,
L. and Sander, C. (1996). The FSSP database: fold classification based on
structure-structure alignment of proteins. Nucleic Acids Res. 24 (1), 206-209.
Lathrop,
R. H. and Smith, T. F. (1996). Global optimum protein threading with gapped alignment and
empirical pair score functions. J Mol Biol 255 (4), 641-65.
Miller,
R. T., Jones, D. T. and Thornton, J. M. (1996). Protein fold recognition by sequence
threading: tools and assessment techniques. Faseb J 10 (1), 171-8.
Rost,
B. and Sander, C. (1994). Structure prediction of proteins--where are we now? Curr Opin
Biotechnol 5 (4), 372-80.
Rost,
B. (1995). TOPITS: threading one-dimensional predictions into three-dimensional
structures. Ismb 3, 314-21.
Sayle,
R., Saqi, M., Weir, M. and Lyall, A. (1995). PdbAlign, PdbDist and DistAlign: tools to aid
in relating sequence variability to structure. Comput Appl Biosci 11 (5), 571-3.
Schneider,
R. and Sander, C. (1996). The HSSP database of protein structure-sequence alignments.
Nucleic Acids Res. 24 (1), 201-205.
Wilmanns,
M. and Eisenberg, D. (1995). Inverse protein folding by the residue pair preference
profile method: estimating the correctness of alignments of structurally compatible
sequences. Protein Eng 8 (7), 627-39.
Altschul,
S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. (1990). A Basic Local
Alignment Search Tool. J. Mol. Biol., 215, 403-410.
Altschul,
S. F., Boguski, M. S., Gish, W. and Wootton, J. C. (1994). Issues in searching molecular
sequence databases. Nat Genet 6 (2), 119-29.
Barsalou,
T. and Brutlag, D. L. (1991). Searching Gene and Protein Sequence Databases. MD Computing,
8(3), 144-149.
Brutlag,
D. L., Dautricourt, J. P., Maulik, S. and Relph, J. (1990). Improved sensitivity of
biological sequence database searches. Comput Appl Biosci, 6(3), 237-45.
Brutlag, D. L.,
Dautricourt, J. P., Diaz, R., Fier, J., Moxon, B. and Stamm, R. (1993). BLAZE: An
implementation of the Smith-Waterman Comparison Algorithm on a Massively Parallel
Computer. Computers and Chemistry 17, 203-207.
Collins,
J. F., & Coulson, A. F. (1984). Applications of parallel processing algorithms for DNA
sequence analysis. Nucleic Acids Res, 12, 181-192.
Collins,
J. F., Coulson, A. F. W. and Lyall, A. (1988). The significance of protein sequence
similarities. CABIOS 4, 67-71.
Galper, A. R. and
Brutlag, D. L. (1990). Parallel Similarity Search and Alignment with the Dynamic
Programming Method (KSL Report 90-74). Stanford University.
Gish,
W. and States, D. J. (1993). Identification of protein coding regions by database
similarity search. Nat Genet 3 (3), 266-72.
Gonnet,
G. H., Cohen, M. A. and Benner, S. A. (1992). Exhaustive matching of the entire protein
sequence database. Science, 256, 1443-5.
Gribskov,
M., McLachlan, A. D. and Eisenberg, D. (1987). Profile analysis: Detection of distantly
related proteins. Proc. Natl. Acad. Sci. USA 84, 4355-4358.
Lipman,
D. J. and Pearson, W. R. (1985). Rapid and Sensitive Protein Similarity Searches. Science
227, 1435-1441.
Liuni,
S., Prunella, N., Pesole, G., D'Orazio, T., Stella, E. and Distante, A. (1993). SIMD
parallelization of the WORDUP algorithm for detecting statistically significant patterns
in DNA sequences. Comput Appl Biosci 9 (6), 701-7.
Myers,
E. W. and Miller, W. (1988). Optimal alignments in linear space. CABIOS 4, 11-17.
Pearson,
W. R. and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc.
Natl. Acad. Sci. USA 85, 2444-2448.
Pearson, W. J. (1986). Sensitivity and Selectivity in Protein Sequence Comparison. In
Methods in Protein Sequence Analysis, Clifton, New Jersey: Humana Press.
Pearson,
W. R. (1994). Using the FASTA program to search protein and DNA sequence databases.
Methods Mol Biol 25, 365-89.
Pesole,
G., Prunella, N., Liuni, S., Attimonelli, M. and Saccone, C. (1992). WORDUP: an efficient
algorithm for discovering statistically significant patterns in DNA sequences. Nucleic
Acids Res 20 (11), 2871-5.
Staden,
R. (1994). Staden: comparing sequences. Methods Mol Biol 25, 155-70.
Strelets,
V. B., Ptitsyn, A. A., Milanesi, L. and Lim, H. A. (1994). Data bank homology search
algorithm with linear computation complexity. Comput Appl Biosci 10 (3), 319-22.
Wilbur,
W.J. and Lipman, D.J. (1983). Rapid similarity searches of nucleic acid and protein data
banks. Proc. Natl. Acad. Sci. USA 80, 726-30.
Carrillo, H. and Lipman, D. (1988). The multiple sequence alignment problem in biology. SIAM J. Appl. Math., 48, 1073-1082.
Corpet,
F. (1988). Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res.,
16, 10881-10890.
Dolz,
R. (1994). GCG: production of multiple sequence alignment. Methods Mol Biol 24, 83-99.
Eddy,
S. R. (1995). Multiple alignment using hidden Markov models. Ismb, 3, 114-20.
Eisen,
J. A. (1997). The Genetic Data Environment. A user modifiable and expandable multiple
sequence analysis package. Methods Mol Biol, 70, 13-38.
Feng,
D. F., and Doolittle, R. F. (1987). Progressive sequence alignment as a prerequisite to
correct phylogenetic trees. J. Mol. Evol. 25, 351-360.
Feng,
D. F. and Doolittle, R. F. (1996). Progressive Alignment of Amino Acid Sequences and
Construction of Phylogenetic Trees from Them. Methods in Enzymology, 266, 368-382.
Galas,
D.J., Eggert, M. and Waterman, M.S. (1985). Rigorous pattern-recognition methods for DNA
sequences. Analysis of promoter sequences from Escherichia coli. J. Mol. Biol. 186,
117-128.
Gotoh,
O. (1993). Optimal alignment between groups of sequences and its application to multiple
sequence alignment. Comput Appl Biosci 9 (3), 361-70.
Gotoh,
O. (1996). Significant improvement in accuracy of multiple protein sequence alignments by
iterative refinement as assessed by reference to structural alignments. J Mol Biol,
264(4), 823-38.
Henikoff,
S., & Henikoff, J. G. (1997). Embedding strategies for effective use of information
from multiple sequence alignments. Protein Sci, 6(3), 698-705.
Higgins,
D. G. and Sharp, P. M. (1988). CLUSTAL: a package for performing multiple sequence
alignment on a microcomputer. Gene, 73(1), 237-44.
Higgins,
D. G. and Sharp, P. M. (1989). Fast and sensitive multiple sequence alignments on a
microcomputer. Comput Appl Biosci, 5(2), 151-3.
Higgins,
D. G., Bleasby, A. J. and Fuchs, R. (1992). Clustal V: improved software for multiple
sequence alignment. CABIOS, 8(2), 189-191.
Higgins,
D. G. (1994). CLUSTAL V: multiple alignment of DNA and protein sequences. Methods Mol Biol
25, 307-18.
Higgins,
D. G., Thompson, J. D. and Gibson, T. J. (1996). Using CLUSTAL for Multiple Sequence
Alignments. Methods in Enzymology, 266, 383-401.
Hughey,
R., & Krogh, A. (1996). Hidden Markov models for sequence analysis: extension and
analysis of the basic method. Comput Appl Biosci, 12(2), 95-107.
Johnson,
M. S., and Doolittle, R. F. (1986). A method for the simultaneous alignment of three or
more amino acid sequences. J. Mol. Evol. 23, 267-278.
Karlin,
S. and Ghandour, G. (1985). Comparative statistics for DNA and protein sequences: Multiple
sequence analysis. Proc. Natl. Acad. Sci. USA 82, 6186-6190.
Karlin,
S., Morris, D., Ghandour, G., and Leung, M. Y. (1988). Efficient algorithms for molecular
sequence analysis. Proc. Natl. Acad. Sci. U. S. A. 85, 841-845.
Lawrence,
C. E., Altschul, S. F., Boguski, M. S., Liu, J. S., Neuwald, A. F. and Wootton, J. C.
(1993). Detecting subtle sequence signals: a Gibbs sampling strategy for multiple
alignment. Science 262 (5131), 208-14.
Lipman,
D. J., Altschul, S. F. and Kececioglu, J. D. (1989). A tool for multiple sequence
alignment. Proc Natl Acad Sci U S A, 86(12), 4412-5.
Martinez,
H. M. (1983). An efficient method for finding repeats in molecular sequences. Nucleic Acids
Res. 11, 4629-4634.
Martinez,
H. M. (1988). A flexible multiple sequence alignment program. Nucleic Acids Res. 16,
1683-1691.
Murata,
M., Richardson, J. S., and Sussman, J. L. (1985). Simultaneous comparison of three protein
sequences. Proc. Natl. Acad. Sci. U. S. A. 82, 3073-3077.
Myers,
G., Selznick, S., Zhang, Z., & Miller, W. (1996). Progressive multiple alignment with
constraints. J Comput Biol, 3(4), 563-72.
Russell,
R. B. and Barton, G. J. (1992). Multiple protein sequence alignment from tertiary
structure comparison: assignment of global and residue confidence levels. Proteins, 14(2),
309-23.
Sobel,
E., and Martinez, H. M. (1986). A multiple sequence alignment program. Nucleic Acids
Res. 14, 363-374.
Subbiah,
S. and Harrison, S. C. (1989). A method for multiple sequence alignment with gaps. J Mol
Biol, 209(4), 539-48.
Taylor,
W. R. (1986). Identification of protein sequence homology by consensus template alignment.
J. Mol. Biol. 188, 233-258.
Taylor,
W. R. (1987). Multiple sequence alignment by a pairwise algorithm. Comput. Appl. Biosci.
3, 81-87.
Taylor,
W. R. (1996). Multiple Protein Sequence Alignment: Algorithms and Gap Insertion. Methods
in Enzymology, 266, 343-367.
Thompson,
J. D., Higgins, D. G. and Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Res 22 (22), 4673-80.
Vingron,
M. and Argos, P. (1989). A fast and sensitive multiple sequence alignment algorithm.
Comput Appl Biosci, 5(2), 115-21.
Vingron,
M., & Sibbald, P. R. (1993). Weighting in sequence space: a comparison of methods in
terms of generalized sequences. Proc Natl Acad Sci U S A, 90(19), 8777-81.
Waterman,
M., Arratia, R. and Galas, D.J. (1984). Pattern Recognition in Several Sequences:
Consensus and Alignment. Bull. Math. Biol. 46, 515-527.
Waterman,
M. S. (1986). Multiple sequence alignment by consensus. Nucleic Acids Res. 14,
9095-9102.
Hegyi, H. and Pongor, S. (1993). Predicting potential domain homologies from FASTA search results. Comput Appl Biosci 9 (3), 371-372.