List of software to detect low complexity regions in proteins



Computational methods can analyze protein sequences to identify low complexity regions (LCRs), which can have distinctive functional and structural properties.

| Name | Last update | Usage | Description | Open source? | Reference |
|---|---|---|---|---|---|
| SAPS | 1992 | downloadable / web | Describes several protein sequence statistics for evaluating distinctive characteristics of residue content and arrangement in primary structures. | yes | [1] |
| SEG | 1993 | downloadable | A two-pass algorithm: it first identifies the LCRs, then performs local optimization, masking the LCRs with Xs. | yes | [2] |
| fLPS | 2017 | downloadable / web | Readily handles very large protein data sets, such as those from metagenomics projects. Useful for searching for proteins with similar compositionally biased regions (CBRs) and for making functional inferences about the CBRs of a protein of interest. | yes | [3] |
| CAST | 2000 | web | Identifies LCRs using dynamic programming. | no | [4] |
| SIMPLE | 2002 | downloadable / web | Quantifies the amount of simple sequence in proteins and determines the type of short motifs that cluster above a certain threshold. | yes | [5] |
| 0j.py | 2001 | on request | A tool for demarcating low complexity protein domains. | no | [6] |
| DSR | 2003 | on request | Calculates complexity using reciprocal complexity. | no | [7] |
| ScanCom | 2003 | on request | Calculates compositional complexity using the linguistic complexity measure. | no | [8] |
| CARD | 2005 | on request | Based on the complexity analysis of subsequences delimited by pairs of identical, repeating subsequences. | no | [9] |
| BIAS | 2006 | downloadable / web | Uses discrete scan statistics, which provide a highly accurate multiple test correction, to compute analytical estimates of the significance of each compositionally biased segment. | yes | [10] |
| GBA | 2006 | on request | A graph-based algorithm that constructs a graph of the sequence. | no | [11] |
| SubSeqer | 2008 | web | A graph-based approach for the detection and identification of repetitive elements in low-complexity sequences. | no | [12] |
| ANNIE | 2009 | web | Automates the protein sequence analysis process. | no | [13] |
| LPS-annotate | 2011 | on request | Defines compositional bias through a thorough search for lowest-probability subsequences (LPSs) and serves as a workbench of tools with which molecular biologists can generate hypotheses and inferences about the proteins they are investigating. | no | [14] |
| LCR-eXXXplorer | 2015 | web | A web platform to search, visualize and share data for low complexity regions in protein sequences. It offers tools for displaying LCRs from the UniProt/SwissProt knowledgebase in combination with other relevant protein features, predicted or experimentally verified. Users may also query a custom-designed sequence/LCR-centric database. | no | [15] |
| XNU | 1993 | downloadable | Uses the PAM120 scoring matrix to calculate complexity. | yes | [16] |
| AlcoR | 2022 | downloadable | A compression-based, alignment-free tool for detecting low-complexity regions in biological data. | yes | [17] |
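Several of the tools listed above score a sequence by some measure of compositional complexity and then mask the regions that score low. The general idea can be illustrated with a minimal Python sketch that slides a fixed window along the sequence, computes the Shannon entropy of each window, and replaces low-entropy windows with Xs. This is not the algorithm of SEG or of any other listed tool; the function names, the window length of 12 and the threshold of 2.2 bits are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2) -> str:
    """Mask with 'X' every window whose entropy is at or below the threshold.

    Illustrative only: window and threshold are hypothetical defaults,
    not parameters taken from any published tool.
    """
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) <= threshold:
            masked[start:start + window] = "X" * window
    return "".join(masked)


if __name__ == "__main__":
    # A diverse stretch, a poly-Q run, then another diverse stretch.
    example = "ACDEFGHIKLMN" + "Q" * 12 + "ACDEFGHIKLMN"
    print(mask_low_complexity(example))
```

Because whole windows are masked, residues flanking a low-complexity run may also be masked when they share a low-entropy window with it. Sequences shorter than the window are returned unchanged.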

For a comprehensive review of the various methods and tools, see Mier et al.[18]

In addition, a web meta-server named PLAtform of TOols for LOw COmplexity (PlaToLoCo) has been developed for the visualization and annotation of low complexity regions in proteins.[19] PlaToLoCo collects and integrates the output of five state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. For a query sequence, the union or intersection of the regions reported by the individual tools can also be obtained.
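Combining the calls of several detectors by union or intersection can be sketched as follows. This is a minimal illustration, not PlaToLoCo's implementation: the tool names `tool_a` and `tool_b`, the function names, and the choice of 1-based inclusive coordinates are all assumptions.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive (assumed convention)


def to_positions(intervals: List[Interval]) -> Set[int]:
    """Expand intervals into the set of residue positions they cover."""
    return {p for start, end in intervals for p in range(start, end + 1)}


def to_intervals(positions: Set[int]) -> List[Interval]:
    """Collapse a set of positions back into sorted, merged intervals."""
    result: List[Interval] = []
    for p in sorted(positions):
        if result and p == result[-1][1] + 1:
            result[-1] = (result[-1][0], p)  # extend the current interval
        else:
            result.append((p, p))  # start a new interval
    return result


def combine(calls: Dict[str, List[Interval]], mode: str = "union") -> List[Interval]:
    """Combine per-tool LCR calls by 'union' or 'intersection'."""
    if mode not in ("union", "intersection"):
        raise ValueError("mode must be 'union' or 'intersection'")
    sets = [to_positions(intervals) for intervals in calls.values()]
    if not sets:
        return []
    merged = set.union(*sets) if mode == "union" else set.intersection(*sets)
    return to_intervals(merged)


if __name__ == "__main__":
    # Hypothetical output of two detectors on the same query sequence.
    calls = {"tool_a": [(5, 20)], "tool_b": [(15, 30), (40, 45)]}
    print(combine(calls, "union"))         # [(5, 30), (40, 45)]
    print(combine(calls, "intersection"))  # [(15, 20)]
```

Working on per-residue position sets keeps the sketch short; for very long sequences or many tools, a sweep over sorted interval endpoints would use less memory.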

A neural network web server, LCR-hound, has been developed to predict the function of prokaryotic and eukaryotic LCRs based on their amino acid or di-amino acid content.[20]
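Amino acid and di-amino acid content can be turned into a numeric feature vector along the following lines. This sketch only shows one plausible way to compute such frequencies; it does not reproduce LCR-hound's actual features or model, and the function name and the restriction to the 20 standard residues are assumptions.

```python
from collections import Counter
from itertools import product
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_features(region: str) -> Dict[str, float]:
    """Relative frequencies of single residues and of overlapping residue pairs.

    Returns 20 + 400 = 420 features; non-standard characters are ignored
    in the output keys but still count towards the region length.
    """
    region = region.upper()
    n = len(region)
    pairs = max(n - 1, 0)
    mono = Counter(region)
    di = Counter(region[i:i + 2] for i in range(pairs))

    features = {aa: (mono[aa] / n if n else 0.0) for aa in AMINO_ACIDS}
    for a, b in product(AMINO_ACIDS, repeat=2):
        features[a + b] = di[a + b] / pairs if pairs else 0.0
    return features


if __name__ == "__main__":
    feats = composition_features("SGSG")
    print(feats["S"], feats["SG"], feats["GS"])
```

A vector of this kind could then be passed to any classifier; which classifier and which training labels to use are outside the scope of this sketch.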

  1. Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (15 Mar 1992). "Methods and algorithms for statistical analysis of protein sequences". Proc Natl Acad Sci U S A. 89 (6): 2002–2006. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
  2. Wootton JC, Federhen S (June 1993). "Statistics of local complexity in amino acid sequences and sequence databases". Computers and Chemistry. 17 (2): 149–163. doi:10.1016/0097-8485(93)85006-X.
  3. Harrison PM (13 Nov 2017). "fLPS: Fast discovery of compositional biases for the protein universe". BMC Bioinformatics. 18 (1): 476. doi:10.1186/s12859-017-1906-3. PMC 5684748. PMID 29132292.
  4. Promponas VJ, Enright AJ, Tsoka S, Kreil DP, Leroy C, Hamodrakas S, Sander C, Ouzounis CA (Oct 2000). "CAST: an iterative algorithm for the complexity analysis of sequence tracts". Bioinformatics. 16 (10): 915–922. doi:10.1093/bioinformatics/16.10.915. PMID 11120681.
  5. Albà MM, Laskowski RA, Hancock JM (May 2002). "Detecting cryptically simple protein sequences using the SIMPLE algorithm". Bioinformatics. 18 (5): 672–678. doi:10.1093/bioinformatics/18.5.672. PMID 12050063.
  6. Wise MJ (2001). "0j.py: a software tool for low complexity proteins and protein domains". Bioinformatics. 17 (Suppl 1): S288–S295. doi:10.1093/bioinformatics/17.suppl_1.s288. PMID 11473020.
  7. Wan H, Li L, Federhen S, Wootton JC (2003). "Discovering simple regions in biological sequences associated with scoring schemes". J Comput Biol. 10 (2): 171–185. doi:10.1089/106652703321825955. PMID 12804090.
  8. Nandi T, Dash D, Ghai R, B-Rao C, Kannan K, Brahmachari SK, Ramakrishnan C, Ramachandran S (2003). "A new algorithm for detecting low-complexity regions in protein sequences". J Biomol Struct Dyn. 20 (5): 657–668. doi:10.1080/07391102.2003.10506882. PMID 12643768. S2CID 45635217.
  9. Shin SW, Kim SM (15 Jan 2005). "A novel complexity measure for comparative analysis of protein sequences from complete genomes". Bioinformatics. 21 (2): 160–170. doi:10.1093/bioinformatics/bth497. PMID 15333459.
  10. Kuznetsov IB, Hwang S (1 May 2006). "A novel sensitive method for the detection of user-defined compositional bias in biological sequences". Bioinformatics. 22 (9): 1055–1063. doi:10.1093/bioinformatics/btl049. PMID 16500936.
  11. Li X, Kahveci T (15 Dec 2006). "A Novel algorithm for identifying low-complexity regions in a protein sequence". Bioinformatics. 22 (24): 2980–2987. doi:10.1093/bioinformatics/btl495. PMID 17018537.
  12. He D, Parkinson J (1 Apr 2008). "SubSeqer: a graph-based approach for the detection and identification of repetitive elements in low-complexity sequences". Bioinformatics. 24 (7): 1016–1017. doi:10.1093/bioinformatics/btn073. PMID 18304932.
  13. Ooi HS, Kwo CY, Wildpaner M, Sirota FL, Eisenhaber B, Maurer-Stroh S, Wong WC, Schleiffer A, Eisenhaber F, Schneider G (Jul 2009). "ANNIE: integrated de novo protein sequence annotation". Nucleic Acids Res. 37 (Web server issue): W435–W440. doi:10.1093/nar/gkp254. PMC 2703921. PMID 19389726.
  14. Harbi D, Kumar M, Harrison PM (6 Jan 2011). "LPS-annotate: complete annotation of compositionally biased regions in the protein knowledgebase". Database (Oxford). 2011: baq031. doi:10.1093/database/baq031. PMC 3017391. PMID 21216786.
  15. Kirmitzoglou I, Promponas VJ (1 Jul 2015). "LCR-eXXXplorer: a web platform to search, visualize and share data for low complexity regions in protein sequences". Bioinformatics. 31 (13): 2208–2210. doi:10.1093/bioinformatics/btv115. PMC 4481844. PMID 25712690.
  16. Claverie JM, States D (June 1993). "Information enhancement methods for large scale sequence analysis". Computers and Chemistry. 17 (2): 191–201. doi:10.1016/0097-8485(93)85010-A.
  17. Silva JM, Qi W, Pinho AJ, Pratas D (2022-12-28). "AlcoR: alignment-free simulation, mapping, and visualization of low-complexity regions in biological data". GigaScience. 12. doi:10.1093/gigascience/giad101. ISSN 2047-217X. PMC 10716826. PMID 38091509.
  18. Mier P, Paladin L, Tamana S, Petrosian S, Hajdu-Soltész B, Urbanek A, Gruca A, Plewczynski D, Grynberg M, Bernadó P, Gáspári Z (2020-03-23). "Disentangling the complexity of low complexity proteins". Briefings in Bioinformatics. 21 (2): 458–472. doi:10.1093/bib/bbz007. ISSN 1467-5463. PMC 7299295. PMID 30698641.
  19. Jarnot P, Ziemska-Legiecka J, Dobson L, Merski M, Mier P, Andrade-Navarro MA, Hancock JM, Dosztányi Z, Paladin L, Necci M, Piovesan D (2020-07-02). "PlaToLoCo: the first web meta-server for visualization and annotation of low complexity regions in proteins". Nucleic Acids Research. 48 (W1): W77–W84. doi:10.1093/nar/gkaa339. ISSN 0305-1048. PMC 7319588. PMID 32421769.
  20. Ntountoumi C, Vlastaridis P, Mossialos D, Stathopoulos C, Iliopoulos I, Promponas V, Oliver SG, Amoutzias GD (2019-11-04). "Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved". Nucleic Acids Research. 47 (19): 9998–10009. doi:10.1093/nar/gkz730. ISSN 0305-1048. PMC 6821194. PMID 31504783.
