Bio2Byte SARS-CoV-2

Sequence-based predictions for the characteristics of the proteins that compose the SARS-CoV-2 virus, which causes COVID-19
Last Update: March 23, 2021, 5:12 p.m.   Entries: 27

What do we provide?

This website contains sequence-based predictions to help understand the behavior of the proteins that compose the SARS-CoV-2 virus. Not all of these proteins (or regions thereof) have a well-defined three-dimensional structure (as available from the PDB); some might instead exhibit ambiguous, dynamic behavior. Our predictions provide clues on how such regions might behave, with the aim of supporting researchers who are trying to understand how this virus works.

How do I proceed?

Click 'Entries' here (or in the top bar) to see the list of proteins for which we provide predictions. This list is compiled from entries provided by UniProt and the NCBI.

In this list, which you can sort by column, you will find:

  • The Open Reading Frame (ORF) name from the NCBI. Click on this name to see the sequence-based predictions for this protein
  • The NCBI protein RefSeq ID (if available)
  • The UniProt ID (if available)
  • The amino acid sequence length of the protein
  • The protein category

The pages with the per-protein sequence-based predictions provide:

  • A short description of the (putative) function of this protein, if available.
  • A plot of all incorporated sequence-based predictions (additional ones are welcome; please get in touch if you want to provide them).
  • For predictions that work on single sequences, the variation of each predicted feature (such as backbone dynamics) across a multiple sequence alignment (MSA).
  • On the left-hand side, you will also find links to:
    • The structure in the PDB (if available)
    • The UniProt information (if available)
    • The NCBI information (if available)
    • The PSPer predictions about the possible phase-separation behavior of this protein (if available)
    • A JSON file containing the predictions.
    • The MSA file used to calculate the biophysical variation.
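
To make the idea of MSA-based variation more concrete, the sketch below shows one possible way to summarize single-sequence predictions per alignment column. It is an illustration only and not necessarily how this website computes the variation: the function names (`per_column_values`, `column_summary`), the input layout (aligned sequences as strings with '-' for gaps, plus one list of per-residue values per sequence), and the choice of median, minimum and maximum as summary statistics are all assumptions made for this example.

```python
from statistics import median


def per_column_values(aligned_seqs, predictions):
    """Map per-residue predictions onto alignment columns.

    aligned_seqs: equal-length aligned strings, '-' marks a gap (assumed format).
    predictions:  predictions[i][k] is the predicted value for the k-th
                  ungapped residue of aligned_seqs[i] (assumed layout).
    Returns one list of values per alignment column.
    """
    n_cols = len(aligned_seqs[0])
    columns = [[] for _ in range(n_cols)]
    for seq, preds in zip(aligned_seqs, predictions):
        if len(seq) != n_cols:
            raise ValueError("aligned sequences must have equal length")
        residue_index = 0
        for col, char in enumerate(seq):
            if char == "-":
                continue  # gaps carry no prediction
            columns[col].append(preds[residue_index])
            residue_index += 1
        if residue_index != len(preds):
            raise ValueError("prediction count does not match residue count")
    return columns


def column_summary(columns):
    """Summarize each column; columns with no residues yield None."""
    return [
        {"median": median(v), "min": min(v), "max": max(v)} if v else None
        for v in columns
    ]


if __name__ == "__main__":
    # Toy alignment and made-up values, purely for illustration.
    msa = ["MK-V", "MKTV", "-KTV"]
    preds = [[0.8, 0.7, 0.9], [0.6, 0.7, 0.5, 0.9], [0.9, 0.4, 0.7]]
    for col, stats in enumerate(column_summary(per_column_values(msa, preds))):
        print(col, stats)
```

A wide spread between the minimum and maximum in a column would suggest that the predicted property differs between the aligned sequences at that position, while a narrow spread would suggest that it is conserved.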