Start2Fold

The database of hydrogen/deuterium exchange data on protein folding and stability

Documentation

Implementation

Start2Fold is a relational database implemented in MySQL using Django. Data in Start2Fold is structured hierarchically in the following manner:

  1. Level 1:

    At the root level is the entry, which might be associated with one or multiple molecular systems (i.e. a protein of multiple chains, or even complexes of multiple proteins). Each entry has a distinct identifier, which consists of a tag (STF) followed by a four-digit number (e.g. STF0001).

  2. Level 2:

    An entry might have several associated protein chains. This level (or class) stores information on the UniProt and Protein Data Bank (PDB) IDs of the protein chain, and serves to provide direct cross-links to these online repositories.

  3. Level 3:

    Each protein chain might have several corresponding experimental sets. These are either folding or stability experiments, and each is associated with a protection level:

    • early, intermediate or late for folding
    • strong, medium or weak for stability measurements

    Additional information recorded on the experimental sets includes the resolution (residue-level or segment-level), the number of probes, the protection threshold, the experimental conditions (pH, temperature), the actual protein sequence used in the experiment (with any modifications/mutations), the PubMed ID of the original publication and a textual description of the experimental procedure.

  4. Level 4:

    Finally, each experimental set has an array of amino acid residues that correspond to it. These residues provide the actual information on which positions/segments of the protein chain have which type of folding/stability information (i.e. which residues are early folding, etc.).
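
The four levels above can be sketched as nested Python data classes. This is only an illustration of the hierarchy, not the actual Django models or MySQL schema of Start2Fold: all class and field names are hypothetical, and the identifier pattern (STF followed by four digits) is inferred from the example IDs in this documentation.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional

# Assumption: entry IDs look like the examples in the text (STF0001, STF0004, ...).
ENTRY_ID_PATTERN = re.compile(r"^STF\d{4}$")


class ExperimentType(Enum):
    FOLDING = "folding"
    STABILITY = "stability"


# Protection levels allowed for each experiment type, as listed under Level 3.
PROTECTION_LEVELS = {
    ExperimentType.FOLDING: ("early", "intermediate", "late"),
    ExperimentType.STABILITY: ("strong", "medium", "weak"),
}


@dataclass
class Residue:
    """Level 4: one residue belonging to an experimental set."""
    index: int
    code: str  # one-letter amino acid code


@dataclass
class ExperimentalSet:
    """Level 3: a folding or stability experiment with its protection level."""
    experiment_type: ExperimentType
    protection_level: str
    ph: Optional[float] = None
    temperature: Optional[float] = None
    pubmed_id: Optional[str] = None
    sequence: str = ""
    residues: List[Residue] = field(default_factory=list)

    def __post_init__(self) -> None:
        allowed = PROTECTION_LEVELS[self.experiment_type]
        if self.protection_level not in allowed:
            raise ValueError(
                f"{self.protection_level!r} is not valid for "
                f"{self.experiment_type.value} experiments; expected one of {allowed}"
            )


@dataclass
class ProteinChain:
    """Level 2: a protein chain with its cross-references."""
    uniprot_id: str
    pdb_id: str
    experimental_sets: List[ExperimentalSet] = field(default_factory=list)


@dataclass
class Entry:
    """Level 1: the root entry, holding one or more protein chains."""
    entry_id: str
    chains: List[ProteinChain] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not ENTRY_ID_PATTERN.match(self.entry_id):
            raise ValueError(f"unexpected entry ID format: {self.entry_id!r}")
```

In this sketch, an entry is built from the bottom up: residues are collected into an experimental set, sets are attached to a protein chain, and chains are attached to the entry. Pairing a folding experiment with a stability level (or the reverse) raises a ValueError.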


User interface

The user interface of Start2Fold is divided into the following main sections:

  • Home:

    The 'home' section briefly introduces the database and the types of data contained within.

  • Help:

    The 'help' section contains the detailed documentation of the database, with an in-depth user guide describing all the functionalities of Start2Fold.

  • Contact:

    You are invited to send your questions and feedback using the contact form.

  • Browse:

    The database can be browsed by proteins, by residue sets or by entries. Each option provides a list of entries together with the information most relevant to that option. When browsing by entries, the entry ID, the molecular system name and whether the entry was reviewed or not are displayed. When browsing by proteins, the list shows the name of the molecular system, the recommended name of the protein, the entry ID, the UniProt and PDB IDs, the length of the protein chain and the secondary structure type of the chain. Lastly, when browsing by residue sets, the protection level, experiment type and method, entry ID, molecular system name and PubMed reference are displayed in the list.

  • Search:

    The database can be searched using the 'search' field located in the top left corner of the menu section. To search Start2Fold, type protein names, UniProt/PDB IDs, experiment types, experiment methods or protection levels into the search field and press 'search'.

  • Entry:

    Clicking on an entry ID link in any of the browsing lists or in the search results list forwards you to the entry page. This page provides all the relevant information associated with the entry, along with a link to an integrated JSmol applet (see next section). The 'home', 'help', 'contact' and browsing sections can be accessed from all the pages using the menu at the top of the screen.


Entry pages

The actual information stored within Start2Fold is displayed on the entry pages.

  • Entry header:

    The entry ID and the title of the entry are at the top of the page.

  • Download:

    Below is the 'download in xml' link, which provides the complete entry in XML format for downloading. This XML follows the structure of the XML template found on the welcome page. Alternatively, this XML can be directly accessed by adding '.xml' to the entry URL (e.g. start2fold.eu/STF0004.xml).

  • Protein:

    The protein information tab provides the name of the protein, the species of origin, the number of residues in the protein chain, and cross-links to UniProt and PDB.

  • JSmol app:

    The link to an integrated JSmol applet can be used to visualize the different residue sets or segments by clicking on one of the buttons. The 'reset view' button can be clicked to reset the JSmol applet.

  • Exp.set:

    The experimental sets can be closed and opened by clicking on the 'show' and 'hide' buttons. These sections display the experimental type and method, the experimental conditions (pH, temperature, number of probes), a brief description of the experiment and the actual sequence that was used for the measurement. This sequence can be downloaded in FASTA format by clicking on the 'click to download' link under the sequence. Alternatively, the sequences can be directly accessed by adding '.fasta' to the URL (e.g. start2fold.eu/STF0008.fasta).

  • Residues:

    Finally, the residues are listed by their indices and their one-letter amino acid codes. The residue lists can be downloaded either by clicking the 'click to download' link below the residues, or by accessing them directly by adding '.residues' to the URL (e.g. start2fold.eu/STF0008.residues).
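
The three direct-access URLs described above ('.xml', '.fasta' and '.residues') follow the same pattern, so they can be built with a small helper. The sketch below is a minimal illustration: the function name download_url is hypothetical, and the https scheme in BASE_URL is an assumption, since this documentation gives the URLs without a scheme.

```python
# Assumption: the site is reachable under this scheme; the documentation
# only gives "start2fold.eu/<entry ID>.<format>".
BASE_URL = "https://start2fold.eu"

# Download formats mentioned in the entry page description.
FORMATS = ("xml", "fasta", "residues")


def download_url(entry_id: str, fmt: str, base_url: str = BASE_URL) -> str:
    """Build the direct-access URL for an entry in one of the documented formats."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format {fmt!r}; expected one of {FORMATS}")
    return f"{base_url.rstrip('/')}/{entry_id}.{fmt}"


if __name__ == "__main__":
    for fmt in FORMATS:
        print(download_url("STF0008", fmt))
```

The resulting string can then be passed to any HTTP client, for example urllib.request.urlopen from the Python standard library.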