The SWISS-2DPAGE database assembles data on proteins identified on various 2-D PAGE and SDS-PAGE maps. Each SWISS-2DPAGE
entry contains textual data on one protein, including mapping procedures, physiological and pathological
information, experimental data (isoelectric point, molecular weight, amino acid composition, peptide masses)
and bibliographical references. In addition to this textual data, SWISS-2DPAGE provides several 2-D PAGE and SDS-PAGE
images showing the experimentally determined location of the protein, as well as a theoretical region computed from
the protein sequence, indicating where the protein might be found in the gel.
Cross-references are provided to Medline and other federated 2-DE databases (COMPLUYEAST-2DPAGE, Cornea-2DPAGE, DOSAC-COBS 2D-PAGE, ECO2DBASE, HSC-2DPAGE, LENS-2DPAGE, OGP-WWW, PHCI-2DPAGE, PMMA-2DPAGE, Siena-2DPAGE, YEPD) and to UniProtKB/Swiss-Prot, which provides many links to other molecular databases (EMBL, GenBank, PROSITE, OMIM, etc.).
For detailed information specific to the current SWISS-2DPAGE release, see the release notes.
If you want to cite SWISS-2DPAGE in a publication please use the following reference:
SWISS-2DPAGE is copyright the Swiss Institute of Bioinformatics. There are no restrictions on its use by non-profit institutions as long as its content is in no way modified. Usage by and for commercial entities requires a license agreement. For information about the licensing scheme see: http://www.expasy.org/ch2d/license.html or send an email to firstname.lastname@example.org.
The above copyright notice also applies to this user manual as well as to any other SWISS-2DPAGE documents.
The entries in SWISS-2DPAGE are text files structured to be readable by humans as well as by computer programs. The explanations, descriptions, classifications and comments are in everyday English. However, symbols familiar to biochemists, chemists and molecular biologists are also used. Each entry corresponds to one protein and is composed of lines. Different types of lines, each with their own format, are used to record the various data which make up the entry. A sample protein entry is shown below:
ID   ALF_ECOLI; STANDARD; 2DG.
AC   P0AB71; P11604;
DT   01-SEP-1997, integrated into SWISS-2DPAGE (release 6).
DT   15-MAY-2003, 2D annotation version 4.
DT   26-SEP-2006, general annotation version 10.
DE   Fructose-bisphosphate aldolase class 2 (EC 22.214.171.124) (Fructose-
DE   bisphosphate aldolase class II) (FBP aldolase).
GN   Name=fbaA; Synonyms=fba, fda; OrderedLocusNames=b2925;
OS   Escherichia coli.
OC   Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;
OC   Enterobacteriaceae; Escherichia.
OX   NCBI_TaxID=562;
MT   ECOLI, ECOLI-DIGE4.5-6.5, ECOLI5-6.
IM   ECOLI, ECOLI-DIGE4.5-6.5, ECOLI5-6.
RN   [1]
RP   MAPPING ON GEL.
RX   MEDLINE=98410772; PubMed=9740056;
RA   Tonella L., Walsh B.J., Sanchez J.-C., Ou K., Wilkins M.R., Tyler M.,
RA   Frutiger S., Gooley A.A., Pescaru I., Appel R.D., Yan J.X., Bairoch A.,
RA   Hoogland C., Morch F.S., Hughes G.J., Williams K.L., Hochstrasser D.F.;
RT   "'98 Escherichia coli SWISS-2DPAGE database update";
RL   Electrophoresis 19:1960-1971(1998).
RN   [2]
RP   MAPPING ON GEL.
RX   PubMed=11680886;
RA   Tonella L., Hoogland C., Binz P.-A., Appel R.D., Hochstrasser D.F.,
RA   Sanchez J.-C.;
RT   "New perspectives in the Escherichia coli proteome investigation";
RL   Proteomics 1:409-423(2001).
RN   [3]
RP   MAPPING ON GEL.
RX   PubMed=12469338;
RA   Yan J.X., Devenish A.T., Wait R., Stone T., Lewis S., Fowler S.;
RT   "Fluorescence 2-D difference gel electrophoresis and mass spectrometry
RT   based proteomic analysis of E. coli";
RL   Proteomics 2:1682-1698(2002).
2D   -!- MASTER: ECOLI;
2D   -!- PI/MW: SPOT 2D-000L0H=5.55/40732;
2D   -!- PI/MW: SPOT 2D-000L1R=5.43/39855;
2D   -!- AMINO ACID COMPOSITION: SPOT 2D-000L1R: B=10.9, Z=10.5, S=7.2, H=3,
2D       G=10.5, T=5.3, A=9.4, P=4.3, Y=3.6, R=3.2, V=7.4, M=1.7, I=5.3, L=8.6,
2D       F=4.3, K=4.8;
2D   -!- MAPPING: AMINO ACID COMPOSITION AND SEQUENCE TAG (SKIF) .
2D   -!- MASTER: ECOLI-DIGE4.5-6.5;
2D   -!- PI/MW: SPOT 2D-001WMY=5.49/39104;
2D   -!- PEPTIDE MASSES: SPOT 2D-001WMY: 955.51; 1502.78; 1762.98; 1878.01; TRYPSIN.
2D   -!- MAPPING: Peptide mass fingerprinting .
2D   -!- MASTER: ECOLI5-6;
2D   -!- PI/MW: SPOT 2D-001L5L=5.56/50220;
2D   -!- PI/MW: SPOT 2D-001L6U=5.56/49421;
2D   -!- PEPTIDE MASSES: SPOT 2D-001L5L: 1320.801; 1502.988; 1878.257;
2D       2591.649; 2719.736; 2871.916; TRYPSIN.
2D   -!- PEPTIDE MASSES: SPOT 2D-001L6U: 934.591; 950.583; 953.573;
2D       1320.797; 1502.947; 1878.221; 2591.534; 2719.66; TRYPSIN.
2D   -!- MAPPING: Peptide mass fingerprinting .
CC   ---------------------------------------------------------------------------
CC   This SWISS-2DPAGE entry is copyright the Swiss Institute of Bioinformatics.
CC   There are no restrictions on its use by non-profit institutions as long as
CC   its content is in no way modified and this statement is not removed.
CC   Usage by and for commercial entities requires a license agreement (See
CC   http://www.expasy.org/ch2d/license.html or send email to email@example.com).
CC   ---------------------------------------------------------------------------
DR   UniProtKB/Swiss-Prot; P0AB71; ALF_ECOLI.
//
Each line begins with a two-character line code, which indicates the type of data contained in the line. The current line types and line codes, and the order in which they appear in a SWISS-2DPAGE entry are shown below and are described extensively in the following sections.
---------  ----------------------------  ----------------------
Line code  Content                       Occurrence in an entry
---------  ----------------------------  ----------------------
ID         Identification                Once; starts the entry
AC         Accession number(s)           One or more
DT         Date                          Three times
DE         Description                   One or more
GN         Gene name(s)                  Optional
OS         Organism species              One or more
OC         Organism classification       One or more
OX         Taxonomy cross-reference(s)   Once
MT         Masters                       One or more
IM         Images                        One or more
RN         Reference number              One or more
RP         Reference position            One or more
RX         Reference cross-reference(s)  Optional
RA         Reference authors             One or more
RT         Reference title               Optional
RL         Reference location            One or more
CC         Comments or notes             Optional
2D         2-D PAGE specific data        Several
1D         SDS-PAGE specific data        Several
DR         Database cross-references     Optional
//         Termination line              Once; ends the entry
---------  ----------------------------  ----------------------
As shown in the above table, some entries do not contain all of the line types, and some line types occur many times in a single entry. Each entry must begin with an identification line (ID) and end with a terminator line (//).
Note that, for standardization purposes, most of the SWISS-2DPAGE line types and formats are kept from the Swiss-Prot knowledgebase. One can thus refer to the UniProtKB/Swiss-Prot user manual for an extended description of these lines. Only the MT, IM, 2D and 1D lines (2-D PAGE and SDS-PAGE data) are specific to the SWISS-2DPAGE database.
The two-character line type code which begins each line is always followed by three blanks, so that the actual information begins with the sixth character. Information is not extended beyond character position 75.
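The layout rule above can be sketched in Python as follows. This is only an illustration, not part of the SWISS-2DPAGE distribution; the function names split_line and check_line are hypothetical.

```python
def split_line(line):
    """Split one entry line into (line_code, data).

    Assumes the layout described above: a two-character code, three
    blanks, then the data starting at the sixth character.
    """
    text = line.rstrip("\n")
    return text[:2], text[5:]


def check_line(line, max_width=75):
    """Return a list of layout problems found in one line (empty if none)."""
    problems = []
    text = line.rstrip("\n")
    # The terminator line "//" carries no data, so it has no blanks to check.
    if text != "//" and text[2:5] != "   ":
        problems.append("code not followed by three blanks")
    if len(text) > max_width:
        problems.append("line longer than %d characters" % max_width)
    return problems


code, data = split_line("ID   ALF_ECOLI; STANDARD; 2DG.")
```

Slicing by position rather than splitting on whitespace keeps any leading blanks inside the data, which matters for continuation lines.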
The ID (IDentification) line is always the first line of an entry. The general form of the ID line is:
ID   ENTRY_NAME; ENTRY_CLASS; 2DG.
The first item on the ID line is the entry name. This name is a useful means of identifying a protein. The entry name consists of up to 12 uppercase alphanumeric characters.
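SWISS-2DPAGE uses the general purpose naming convention used by UniProtKB/Swiss-Prot, which can be symbolized as X_Y, where X is a mnemonic code for the protein name and Y is a mnemonic code for the species from which the protein originates.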
An example of a complete protein entry name is: A1AT_HUMAN for the human alpha-1-antitrypsin.
The entry class defines the type of data, which may be STANDARD (for data which are complete) or PRELIMINARY (for entries in which certain information is missing or has not yet been verified).
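As an illustration of the ID line format, the following Python sketch extracts the entry name and entry class. The name parse_id_line and the returned dictionary keys are hypothetical, and the pattern assumes that the underscore counts towards the 12-character limit.

```python
import re

# Two-character code, three blanks, entry name, entry class, fixed "2DG." tag.
ID_PATTERN = re.compile(r"^ID   ([A-Z0-9_]{1,12}); (STANDARD|PRELIMINARY); 2DG\.$")


def parse_id_line(line):
    """Parse an ID line into its entry name, class and X_Y name parts."""
    match = ID_PATTERN.match(line.rstrip())
    if match is None:
        raise ValueError("not a valid ID line: %r" % line)
    entry_name, entry_class = match.groups()
    # Split the X_Y naming convention at the first underscore.
    protein, _, species = entry_name.partition("_")
    return {
        "entry_name": entry_name,
        "entry_class": entry_class,
        "protein": protein,
        "species": species,
    }


info = parse_id_line("ID   A1AT_HUMAN; STANDARD; 2DG.")
```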
The AC (ACcession number) line lists the accession number(s) associated with an entry. The format of the AC line is:
AC   AC_number_1;[ AC_number_2;]...[ AC_number_N;]
An example of an accession number line (taken from the sample entry above) is shown below:
AC   P0AB71; P11604;
The accession numbers are separated by semicolons and the list is terminated by a semicolon. If necessary, more than one AC line will be used. Most SWISS-2DPAGE entries currently have only one accession number.
The purpose of accession numbers is to provide a stable way of identifying entries from release to release. It is sometimes necessary for reasons of consistency to change the names of the entries, for example, to ensure that related entries have similar names. However, an accession number is always conserved, and therefore allows unambiguous citation of SWISS-2DPAGE entries.
Researchers who wish to cite entries in their publications should always cite the first accession number. This is commonly referred to as the 'primary accession number'.
Entries will have more than one accession number if they have been merged or split. For example, when two entries are merged into one, a new accession number goes at the start of the AC line, and those from the merged entries are listed after this one. Similarly, if an existing entry is split into two or more entries (a rare occurrence), the original accession number list is retained in all the derived entries.
An accession number is dropped only when the data to which it was assigned have been completely removed from the database.
Accession numbers consist of 6 alphanumerical characters in the following format:
1        2      3          4          5          6
[O,P,Q]  [0-9]  [A-Z,0-9]  [A-Z,0-9]  [A-Z,0-9]  [0-9]
Here are some examples of valid accession numbers: O08709, Q9ZPF5, Q9Z1T6 and P54638.
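The six-position format above translates directly into a regular expression. The Python sketch below collects the accession numbers from one or more AC lines and checks each against that pattern; the names AC_NUMBER and parse_ac_lines are hypothetical.

```python
import re

# Position 1: O, P or Q; position 2: digit; positions 3-5: letter or digit;
# position 6: digit.
AC_NUMBER = re.compile(r"^[OPQ][0-9][A-Z0-9]{3}[0-9]$")


def parse_ac_lines(lines):
    """Return all accession numbers from the AC lines, primary number first."""
    numbers = []
    for line in lines:
        for item in line[5:].split(";"):
            item = item.strip()
            if item:
                numbers.append(item)
    invalid = [number for number in numbers if not AC_NUMBER.match(number)]
    if invalid:
        raise ValueError("malformed accession number(s): %s" % ", ".join(invalid))
    return numbers


accessions = parse_ac_lines(["AC   P0AB71; P11604;"])
primary = accessions[0]
```

The first element of the returned list is the primary accession number, which is the one to cite.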
The DT (DaTe) lines show the date of creation and last modification of the database entry. The format of the DT lines is:
DT   DD-MMM-YYYY, integrated into SWISS-2DPAGE (release n).
DT   DD-MMM-YYYY, 2D annotation version x.
DT   DD-MMM-YYYY, general annotation version x.
where 'DD' is the day, 'MMM' the month and 'YYYY' the year. The dates shown in DT lines correspond to the date of the biweekly release at which an entry was integrated or updated. There are always three DT lines in each entry, each associated with a specific comment: the first gives the date and release at which the entry was integrated into SWISS-2DPAGE, the second the date and version number of the 2D annotation, and the third the date and version number of the general annotation.
Example of a block of DT lines:
DT   01-AUG-1995, integrated into SWISS-2DPAGE (release 2).
DT   01-OCT-2001, 2D annotation version 3.
DT   07-AUG-2006, general annotation version 7.
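The three DT line variants can be told apart by their comment. The following Python sketch parses one DT line into a kind, a date and a release or version number. The name parse_dt_line and the kind labels are hypothetical, and the month table assumes the three-letter English abbreviations seen in the examples.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

DT_PATTERN = re.compile(
    r"^DT   (\d{2})-([A-Z]{3})-(\d{4}), "
    r"(?:integrated into SWISS-2DPAGE \(release (\d+)\)"
    r"|2D annotation version (\d+)"
    r"|general annotation version (\d+))\.$")


def parse_dt_line(line):
    """Return (kind, date, number) for one DT line."""
    match = DT_PATTERN.match(line.rstrip())
    if match is None:
        raise ValueError("not a valid DT line: %r" % line)
    day, month, year, release, version_2d, version_general = match.groups()
    when = date(int(year), MONTHS[month], int(day))
    if release is not None:
        return ("integrated", when, int(release))
    if version_2d is not None:
        return ("2d_annotation", when, int(version_2d))
    return ("general_annotation", when, int(version_general))


created = parse_dt_line("DT   01-AUG-1995, integrated into SWISS-2DPAGE (release 2).")
```

A hard-coded month table is used instead of locale-dependent date parsing so that the result does not depend on the system language settings.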
The DE (DEscription) lines contain general descriptive information about the protein stored. This information is generally sufficient to identify the protein precisely. The format of the DE lines is:
The description is given in ordinary English and is free-format. In some cases, more than one DE line is required; in this case, the text is divided only between words and only the last DE line is terminated by a period.
When the complete sequence was not determined, the last information given on the DE lines will be '(Fragment)' or '(Fragments)'.
Two examples of description lines are given here:
DE   Apolipoprotein E (Apo-E).
DE   Aldehyde dehydrogenase A (EC 126.96.36.199) (Lactaldehyde dehydrogenase).
For a detailed description of the current rules applied to the DE line, one should refer to the UniProtKB/Swiss-Prot user manual.
The GN (Gene Name) line contains the name(s) of the gene(s) that code for the stored protein sequence. The format of the GN line is:
GN   NAME1[ AND|OR NAME2...].
Examples:
GN   APOE.
GN   ATPA.
It often occurs that more than one gene name has been assigned to an individual locus. In that case all the synonyms will be listed. The word 'OR' separates the different designations. The first name in the list is assumed to be the most correct (or most current) designation. Example:
GN   ATPA OR UNCA OR PAPA OR B3734 OR Z5232 OR ECS4676.
In a few cases, multiple genes code for an identical protein sequence. In that case all the different gene names will be listed. The word 'AND' separates the designations. Example:
GN   HBA1 AND HBA2.
In very rare cases 'AND' and 'OR' can both be present. In that case parentheses are used as shown in the following example:
GN   (TUFA OR B3339) AND (TUFB OR B3980).
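The 'AND'/'OR' grammar described above can be read with two nested splits. The Python sketch below returns one list of synonyms per gene. It assumes a single GN line in the format documented here; it does not handle the 'Name=...; Synonyms=...;' form that appears in the sample entry earlier in this manual. The name parse_gn_line is hypothetical.

```python
def parse_gn_line(line):
    """Return a list of genes, each given as a list of synonymous names."""
    data = line[5:].strip()
    if data.endswith("."):
        data = data[:-1]
    genes = []
    # 'AND' separates distinct genes; 'OR' separates synonyms of one gene.
    for gene in data.split(" AND "):
        gene = gene.strip().strip("()")
        genes.append([name.strip() for name in gene.split(" OR ")])
    return genes


genes = parse_gn_line("GN   (TUFA OR B3339) AND (TUFB OR B3980).")
```

Within each inner list the first name is the preferred designation, as stated above.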
The OS (Organism Species) line specifies the organism(s) which was the source of the stored protein. In the rare case where all the species information will not fit on a single line more than one OS line is used. The last OS line is terminated by a period.
The species designation consists, in most cases, of the Latin genus and species designation followed by the English name (in parentheses).
Examples of OS lines are shown here:
OS   Escherichia coli.
OS   Homo sapiens (Human).
OS   Saccharomyces cerevisiae (Baker's yeast).
The OC (Organism Classification) lines contain the taxonomic classification of the source organism. The taxonomic classification used is that maintained at the NCBI (see http://www.ncbi.nlm.nih.gov/Taxonomy/) and used by the nucleotide sequence databases (EMBL/GenBank/DDBJ).
The classification is listed top-down as nodes in a taxonomic tree in which the most general grouping is given first. The classification may be distributed over several OC lines, but nodes are not split or hyphenated between lines. The individual items are separated by semicolons and the list is terminated by a period.
The format of the OC lines is:
OC   Node[; Node...].
For example the classification lines for a human sequence would be:
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
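Because nodes are never split between lines, the OC lines of an entry can simply be joined and then split on semicolons. A minimal Python sketch (the name parse_oc_lines is hypothetical):

```python
def parse_oc_lines(lines):
    """Return the taxonomic nodes, most general grouping first."""
    text = " ".join(line[5:].strip() for line in lines)
    # The whole list is terminated by a single period.
    if text.endswith("."):
        text = text[:-1]
    return [node.strip() for node in text.split(";") if node.strip()]


lineage = parse_oc_lines([
    "OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;",
    "OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.",
])
```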
The OX (Organism taXonomy cross-reference) line is used to indicate the identifier assigned to a specific source organism in a taxonomic database. The format of the OX line is:
OX   Taxonomy-database_Qualifier=Taxonomic code;
Currently the cross-references are made to the NCBI's taxonomic classification (see http://www.ncbi.nlm.nih.gov/Taxonomy/), which is associated with the qualifier 'TaxID' and a one- to six-digit taxonomic code.
The MT (MasTer) lines are specific to SWISS-2DPAGE. These lines indicate on what types of maps the protein has been identified (such as PLASMA, LIVER, etc.).
A master line example is shown here.
MT   NUCLEI_LIVER_HUMAN, NUCLEOLI_HELA_2D_HUMAN, NUCLEOLI_HELA_1D_HUMAN.
The IM (IMages) lines list the 2-D PAGE and SDS-PAGE images which are associated with the entry. These may be, for example, TUMORAL LIVER, NORMAL LIVER or just LIVER.
The reference lines (RN, RP, RX, RA, RT and RL) comprise the literature citations within SWISS-2DPAGE. The citations indicate the sources from which the data have been abstracted. The reference lines for a given citation occur in a block, and are always in the order RN, RP, RX, RA, RT, RL. Within each such reference block the RN and RP lines occur once, the RX and RT lines occur zero or more times, and the RA and RL lines each occur one or more times. If several references are given, there will be a reference block for each.
An example of a complete reference is:
RN   [1]
RP   MAPPING ON GEL.
RX   MEDLINE=98410772; PubMed=9740056;
RA   Tonella L., Walsh B.J., Sanchez J.-C., Ou K., Wilkins M.R., Tyler M.,
RA   Frutiger S., Gooley A.A., Pescaru I., Appel R.D., Yan J.X., Bairoch A.,
RA   Hoogland C., Morch F.S., Hughes G.J., Williams K.L., Hochstrasser D.F.;
RT   "'98 Escherichia coli SWISS-2DPAGE database update.";
RL   Electrophoresis 19:1960-1971(1998).
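Since every reference block starts with an RN line, the blocks can be recovered by starting a new group at each RN. The Python sketch below does this and checks the occurrence rules stated above. The name split_reference_blocks is hypothetical, and the short sample lines are made up for the illustration.

```python
REFERENCE_CODES = ("RN", "RP", "RX", "RA", "RT", "RL")


def split_reference_blocks(lines):
    """Group reference lines into one dictionary per citation."""
    blocks = []
    for line in lines:
        code, data = line[:2], line[5:].rstrip()
        if code not in REFERENCE_CODES:
            continue
        if code == "RN":
            blocks.append({name: [] for name in REFERENCE_CODES})
        if not blocks:
            raise ValueError("reference line found before the first RN line")
        blocks[-1][code].append(data)
    for block in blocks:
        # RN and RP occur once; RA and RL occur at least once.
        if len(block["RN"]) != 1 or len(block["RP"]) != 1:
            raise ValueError("each block needs exactly one RN and one RP line")
        if not block["RA"] or not block["RL"]:
            raise ValueError("each block needs at least one RA and one RL line")
    return blocks


blocks = split_reference_blocks([
    "RN   [1]",
    "RP   MAPPING ON GEL.",
    "RX   PubMed=9740056;",
    "RA   Tonella L., Walsh B.J.;",
    "RL   Electrophoresis 19:1960-1971(1998).",
    "RN   [2]",
    "RP   MAPPING ON GEL.",
    "RA   Tonella L., Hoogland C.;",
    "RL   Proteomics 1:409-423(2001).",
])
```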
The formats of the individual lines are explained below.
The RN (Reference Number) line gives a sequential number to each reference citation in an entry. This number is used to indicate the reference in comments and 2D lines. The format of the RN line is:
RN   [N]
where N denotes the nth reference for this entry. The reference number is always enclosed in square brackets.
The RP (Reference Position) line describes the extent of the work carried out by the authors of the reference cited. The format of the RP line is:
Typical examples of RP lines are shown below:
RP   PROTEIN SEQUENCE.
RP   AMINO ACID COMPOSITION.
RP   MAPPING ON GEL.
RP   CHARACTERIZATION.
RP   REVIEW.
The RX (Reference cross-reference) line is an optional line which is used to indicate the identifier assigned to a specific reference in a bibliographic database. The format of the RX line is:
RX   BIBLIOGRAPHIC_DATABASE=IDENTIFIER[; BIBLIOGRAPHIC_DATABASE=IDENTIFIER...];
where the valid bibliographic database names and their associated identifiers are:
-------  ------------------------------------------
Name     Identifier
-------  ------------------------------------------
MEDLINE  Eight-digit MEDLINE Unique Identifier (UI)
PubMed   PubMed Unique Identifier (PMID)
-------  ------------------------------------------
Example of RX lines:
RX   PubMed=11503206;
RX   MEDLINE=98410772; PubMed=9740056;
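An RX line is a semicolon-terminated list of NAME=IDENTIFIER pairs, so it maps naturally onto a dictionary. A minimal Python sketch (the name parse_rx_line is hypothetical):

```python
def parse_rx_line(line):
    """Return a {database_name: identifier} dictionary for one RX line."""
    references = {}
    for item in line[5:].split(";"):
        item = item.strip()
        if not item:
            continue
        database, separator, identifier = item.partition("=")
        if not separator:
            raise ValueError("missing '=' in RX item: %r" % item)
        references[database] = identifier
    return references


xrefs = parse_rx_line("RX   MEDLINE=98410772; PubMed=9740056;")
```

Identifiers are kept as strings so that any leading zeros are preserved.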
The RA (Reference Author) lines list the authors of the paper (or other work) cited. All of the authors are included, and are listed in the order given in the paper. The names are listed surname first followed by a blank followed by initial(s) with periods. The authors' names are separated by commas and terminated by a semicolon. Author names are not split between lines. An example of the use of RA lines is shown below:
RA   Anderson N.L., Anderson N.G.;
As many RA lines as necessary are included for each reference.
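The author list can be rebuilt by joining the RA lines and splitting on commas. The Python sketch below separates each name into surname and initials at the last blank. This is an assumption that holds for the examples in this manual but would misread names carrying a suffix after the initials. The name parse_ra_lines is hypothetical.

```python
def parse_ra_lines(lines):
    """Return the authors as (surname, initials) tuples, in citation order."""
    text = " ".join(line[5:].strip() for line in lines)
    # The complete list is terminated by a semicolon.
    if text.endswith(";"):
        text = text[:-1]
    authors = []
    for name in text.split(","):
        name = name.strip()
        if name:
            surname, _, initials = name.rpartition(" ")
            authors.append((surname, initials))
    return authors


authors = parse_ra_lines([
    "RA   Tonella L., Hoogland C., Binz P.-A., Appel R.D., Hochstrasser D.F.,",
    "RA   Sanchez J.-C.;",
])
```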
The RT (Reference Title) lines give the title of the paper (or other work) as exactly as possible given the limitations of the computer character set. The title is enclosed in double quotes, and may be continued over several lines as necessary. The title lines are terminated by a semicolon. An example of the use of RT lines is shown below:
RT   "High resolution two-dimensional electrophoresis of human plasma
RT   proteins.";
The RT line is optional; when a title is given, as many RT lines as necessary are included for each reference.
The RL (Reference Location) lines contain the conventional citation information for the reference. In general, the RL lines alone are sufficient to find the paper in question.
The RL line for a journal citation includes the journal abbreviation, the volume number, the page range, and the year. The format for such an RL line is:
RL   Journal_abbrev Volume:First_page-Last_page(YEAR).
Journal names are abbreviated according to the conventions used by the National Library of Medicine (NLM) and are based on the existing ISO and ANSI standards.
An example of an RL line is:
RL   Proc. Natl. Acad. Sci. U.S.A. 74:5421-5425(1977).
When a reference is made to a paper which is 'in press' at the time when the data bank is released, the page range and possibly the volume number are indicated as '0' (zero) if unknown. Examples of such RL lines are shown here:
RL   Electrophoresis 0:0-0(1997).
RL   Electrophoresis 14:in press(1993).
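The journal form of the RL line is regular enough to be matched with one pattern. The Python sketch below returns None for lines that do not fit it, such as the 'in press' variant with a textual page range or the book and thesis forms. The names RL_JOURNAL and parse_journal_rl are hypothetical.

```python
import re

# Journal abbreviation, volume, first page, last page, four-digit year.
RL_JOURNAL = re.compile(r"^RL   (.+) (\d+):(\d+)-(\d+)\((\d{4})\)\.$")


def parse_journal_rl(line):
    """Parse a journal citation RL line, or return None if it is another form."""
    match = RL_JOURNAL.match(line.rstrip())
    if match is None:
        return None
    journal, volume, first_page, last_page, year = match.groups()
    return {
        "journal": journal,
        "volume": int(volume),
        "first_page": int(first_page),
        "last_page": int(last_page),
        "year": int(year),
    }


citation = parse_journal_rl("RL   Proc. Natl. Acad. Sci. U.S.A. 74:5421-5425(1977).")
```

A volume or page value of 0 in the result means 'unknown', following the convention described above.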
A variation of the RL line format is used for papers found in books or other similar publications, which are cited as shown below:
RL   (In) Editor1 I.[, Editor2 I., EditorX I.] (eds.);
RL   Book Title, pp.[Vol:]First-Last, Publisher, City (Year).
The first RL line contains the designation '(In)', which indicates that this is a book reference. These citations generally include the following information: the title of the book, the name of the editor(s), the page range, the publisher name, the city where it is published, and the year of publication (which is always shown between parentheses).
An example of a book citation is given here:
RL   (In) Neidhardt et al. (eds.);
RL   Escherichia coli and Salmonella: Cellular and Molecular Biology (2nd
RL   ed.), pp.2067-2117, ASM Press, Washington DC (1996).
RL lines for unpublished results follow the format shown in the following example:
RL   Unpublished results, cited by:
RL   ULRICH E.L., KROGMANN D.W., MARKLEY J.L.;
RL   J. Biol. Chem. 257:9356-9364(1982).
For unpublished observations the format of the RL line is:
RL   Unpublished observations (MMM-YEAR).
Where 'MMM' is the month and 'YEAR' is the year.
We use the 'unpublished observations' RL line to cite communications by scientists to SWISS-2DPAGE of unpublished information concerning various aspects of an entry.
For Ph.D. theses the format of the RL line is:
RL   Thesis (YEAR), Institution Name, Country.
An example of such a line is given here:
RL   Thesis (1994), Geneva University, Switzerland.
For patent applications the format of the RL line is:
RL   Patent number PAT_NUMB, DD-MMM-YYYY.
Where 'PAT_NUMB' is the international publication number of the patent, 'DD' is the day, 'MMM' is the month and 'YYYY' is the year.
The final form that an RL line can take is that used for submissions. The format of such an RL line is:
RL   Submitted (MMM-YEAR) to the DATABASE_NAME database.
Where 'MMM' is the month, 'YEAR' is the year and 'DATABASE_NAME' is the database name (for example SWISS-2DPAGE or Swiss-Prot).
An example of a submission RL line is given here:
RL   Submitted (JUN-1994) to the SWISS-2DPAGE database.
The CC lines are free text comments on the entry, and may be used to convey any useful information. The comments always appear below the last reference line and are grouped together in comment blocks, a block being made of 1 or more comment lines. The first line of a block is marked with the characters '-!-'.
The format of a comment block is:
CC   -!- FIRST LINE OF A COMMENT BLOCK
CC       SECOND AND SUBSEQUENT LINES.
A major proportion of the comment blocks are arranged according to what we designate as 'topics'. The format of a comment block which belongs to a 'topic' is:
CC   -!- TOPIC: FREE TEXT DESCRIPTION.
The current topics used in SWISS-2DPAGE are:
SUBUNIT: Description of the quaternary structure of a protein.
For a detailed description of current topics used in Swiss-Prot, one should refer to the UniProtKB/Swiss-Prot user manual.
We show here an example of its usage:
CC   -!- SUBUNIT: HOMOTETRAMER.
CC   -!- MISCELLANEOUS: POSSIBLE SIGNAL/TRANSIT 1-10.
The 2D lines contain data specific to 2-D PAGE reference maps. The 2D lines may start with free text comments concerning all the reference maps available for the entry. Then appear 2D lines grouped by master.
The 2D comment lines form a block of one or more lines; the first line is marked with the characters '-!-' and the last one is terminated by a period. The format for these 2D comment lines is similar to that of the CC lines:
2D   -!- FIRST LINE OF 2D COMMENT BLOCK
2D       SECOND AND SUBSEQUENT LINES.
As for CC lines, the 2D comment lines are arranged in 'topics'. The current topic is:
MAPPING COMMENT: General comments about the mapping procedure concerning all the reference maps available for the entry. The format is as follows:
2D   -!- MAPPING COMMENT: FREE TEXT DESCRIPTION.
Here is an example of the 2D mapping comment usage:
2D   -!- MAPPING COMMENT: CROSS-SPECIES IDENTIFICATION
2D       (UniProtKB/Swiss-Prot; PGMU_RAT; P38652).
The 2D line blocks for each master then follow. Each block is made of two or more 2D lines. The first line of a block specifies the master and has the following format:
2D   -!- MASTER: 'MASTER';
where 'MASTER' is one of the SWISS-2DPAGE masters available in the current release.
Examples of the first line of a 2D master block are:
2D   -!- MASTER: LIVER_HUMAN;
2D   -!- MASTER: HEPG2SP_HUMAN;
A major proportion of the 2D blocks are arranged according to what we designate as 'topics'. There are fixed format and free text topics. The first line of a topic is marked with the characters '-!-'.
1. PI/MW: Description of the isoelectric point and molecular weight of the entry on the SWISS-2DPAGE master gel
The format of the PI/MW topic is:
2D   -!- PI/MW: SPOT 'SERIAL NUMBER'='PI'/'MW';
Where 'SERIAL NUMBER' is the spot serial number (a unique spot identifier across all maps in SWISS-2DPAGE), 'PI' is the experimental isoelectric point and 'MW' the experimental molecular weight of the spot as determined on the master map.
Here is an example for the PI/MW topic:
2D   -!- PI/MW: SPOT 2D-0000GG=5.80/65958;
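The fixed format of the PI/MW topic lends itself to a single regular expression. The Python sketch below extracts the spot serial number, the isoelectric point and the molecular weight. The names PI_MW and parse_pi_mw are hypothetical, and the pattern assumes that the molecular weight is always written as a whole number, as in the examples shown in this manual.

```python
import re

# Spot serial number, then pI (decimal) and MW (integer) separated by '/'.
PI_MW = re.compile(r"^2D   -!-\s+PI/MW: SPOT (\S+)=(\d+(?:\.\d+)?)/(\d+);$")


def parse_pi_mw(line):
    """Return (spot, pI, MW) for a PI/MW topic line, or None if it is another topic."""
    match = PI_MW.match(line.rstrip())
    if match is None:
        return None
    spot, isoelectric_point, molecular_weight = match.groups()
    return spot, float(isoelectric_point), int(molecular_weight)


spot = parse_pi_mw("2D   -!- PI/MW: SPOT 2D-0000GG=5.80/65958;")
```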
2. AMINO ACID COMPOSITION: Description of the amino acid composition of the entry (in %) determined after 2-D PAGE transfer on a PVDF membrane, hydrolysis, Fmoc derivatisation and HPLC analysis (see protocols)
The format of the AMINO ACID COMPOSITION topic is:
2D   -!- AMINO ACID COMPOSITION: SPOT 'SERIAL NUMBER': AAC_LIST
2D       SUBSEQUENT AAC_LIST;
where AAC_LIST contains the beginning of the amino acid composition and SUBSEQUENT AAC_LIST contains the remaining parts of the amino acid composition. This topic may take one or more lines. The amino acid composition is a list of comma-separated items of the form 'X=AAC', where X is the one-letter code for an amino acid, and AAC is its value in percent of the total amount of amino acids.
We give here an example of an AMINO ACID COMPOSITION topic:
2D   -!- AMINO ACID COMPOSITION: SPOT 2D-000SEW: B=11.30, S=10.80,
2D       Z=12.10, G=11.20, T=3.90, H=0.90, Y=4.60, A=9.40, P=2.30, R=3.40,
2D       M=1.40, V=9.20, I=3.60, L=6.30, F=2.30, K=7.30 ;
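An AMINO ACID COMPOSITION topic may span several 2D lines, so the lines are joined before the 'X=AAC' items are split. The Python sketch below returns the spot serial number and a dictionary of percentages. The name parse_amino_acid_composition is hypothetical.

```python
def parse_amino_acid_composition(lines):
    """Return (spot, {one_letter_code: percent}) for one topic block."""
    text = " ".join(line[5:].strip() for line in lines)
    # Drop the '-!-' marker that opens the topic.
    text = text.replace("-!-", "", 1).strip()
    header, _, body = text.partition(": SPOT ")
    if header != "AMINO ACID COMPOSITION":
        raise ValueError("not an AMINO ACID COMPOSITION topic")
    spot, _, items = body.partition(":")
    composition = {}
    # The list ends with a semicolon, sometimes preceded by a blank.
    for item in items.rstrip("; ").split(","):
        letter, _, value = item.strip().partition("=")
        composition[letter] = float(value)
    return spot.strip(), composition


spot, composition = parse_amino_acid_composition([
    "2D   -!- AMINO ACID COMPOSITION: SPOT 2D-000SEW: B=11.30, S=10.80,",
    "2D       Z=12.10, G=11.20, T=3.90, H=0.90, Y=4.60, A=9.40, P=2.30, R=3.40,",
    "2D       M=1.40, V=9.20, I=3.60, L=6.30, F=2.30, K=7.30 ;",
])
total = sum(composition.values())
```

Summing the values is a quick plausibility check, since the percentages of a complete composition should add up to roughly 100.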
The one-letter and three-letter codes for amino acids used in SWISS-2DPAGE are those adopted by the commission on Biochemical Nomenclature of the IUPAC-IUB.
3. PEPTIDE MASSES: Description of the experimental peptide masses of the entry (in Dalton) obtained by mass spectrometry. Only the peptide masses allowing the identification using the ExPASy PeptIdent tool are given.
The parameters used for identification are:
The format of the PEPTIDE MASSES topic is:
2D   -!- PEPTIDE MASSES: SPOT 'SERIAL NUMBER': MASSES_LIST;
2D       SUBSEQUENT MASSES_LIST; 'ENZYME'.
where MASSES_LIST contains the beginning of the peptide masses list and SUBSEQUENT MASSES_LIST contains the remaining parts (if needed), and ENZYME is the enzyme used for the digestion. This topic may take one or more lines. The peptide masses are separated by semicolons.
An example is shown below:
2D   -!- PEPTIDE MASSES: SPOT 2D-0015H6: 1001.631; 1267.653; 1731.898;
2D       1821.909; TRYPSIN.
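In a PEPTIDE MASSES topic, the last semicolon-separated item is the enzyme and all preceding items are masses. The Python sketch below relies on that ordering; the name parse_peptide_masses is hypothetical.

```python
def parse_peptide_masses(lines):
    """Return (spot, [masses in Dalton], enzyme) for one topic block."""
    text = " ".join(line[5:].strip() for line in lines)
    text = text.replace("-!-", "", 1).strip()
    # The topic is terminated by a period after the enzyme name.
    if text.endswith("."):
        text = text[:-1]
    header, _, body = text.partition(": SPOT ")
    if header != "PEPTIDE MASSES":
        raise ValueError("not a PEPTIDE MASSES topic")
    spot, _, rest = body.partition(":")
    items = [item.strip() for item in rest.split(";") if item.strip()]
    if len(items) < 2:
        raise ValueError("expected at least one mass and an enzyme")
    *masses, enzyme = items
    return spot.strip(), [float(mass) for mass in masses], enzyme


spot, masses, enzyme = parse_peptide_masses([
    "2D   -!- PEPTIDE MASSES: SPOT 2D-0015H6: 1001.631; 1267.653; 1731.898;",
    "2D       1821.909; TRYPSIN.",
])
```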
4. PEPTIDE SEQUENCES: Description of the experimental peptide sequences of the entry obtained by tandem mass spectrometry. Only the peptide sequences identified are given.
The format of the PEPTIDE SEQUENCES topic is:
2D   -!- PEPTIDE SEQUENCES: SPOT 'SERIAL NUMBER': SEQUENCES_LIST;
2D       SUBSEQUENT SEQUENCES_LIST.
where 'SERIAL NUMBER' is the spot serial number, SEQUENCES_LIST contains the beginning of the peptide sequences list and SUBSEQUENT SEQUENCES_LIST contains the remaining parts (if needed). This topic may take one or more lines. The peptide sequences form a list of semicolon-separated items of the form '(S)EQUENC(E),X-Y', where (S)EQUENC(E) is the peptide sequence found, and X and Y are respectively the start and end positions of the peptide in the protein sequence.
An example is shown below:
2D   -!- PEPTIDE SEQUENCES: SPOT 2D-001WFZ: (R)VASWSTAR(H),318-325;
2D       (R)QPVSASDFALQFTPGKR(Y),391-407.
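The Python sketch below parses a PEPTIDE SEQUENCES topic. It treats the residues in parentheses as flanking residues that lie outside the X-Y range. This is an interpretation, not something the manual states; it is suggested by the example above, where the unparenthesized sequence has exactly Y-X+1 residues. The names PEPTIDE and parse_peptide_sequences are hypothetical.

```python
import re

# (flanking residue)SEQUENCE(flanking residue),start-end
PEPTIDE = re.compile(r"^\(([A-Z-]?)\)([A-Z]+)\(([A-Z-]?)\),(\d+)-(\d+)$")


def parse_peptide_sequences(lines):
    """Return (spot, [peptide dictionaries]) for one topic block."""
    text = " ".join(line[5:].strip() for line in lines)
    text = text.replace("-!-", "", 1).strip()
    if text.endswith("."):
        text = text[:-1]
    header, _, body = text.partition(": SPOT ")
    if header != "PEPTIDE SEQUENCES":
        raise ValueError("not a PEPTIDE SEQUENCES topic")
    spot, _, rest = body.partition(":")
    peptides = []
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue
        match = PEPTIDE.match(item)
        if match is None:
            raise ValueError("malformed peptide item: %r" % item)
        before, sequence, after, start, end = match.groups()
        peptides.append({"before": before, "sequence": sequence, "after": after,
                         "start": int(start), "end": int(end)})
    return spot.strip(), peptides


spot, peptides = parse_peptide_sequences([
    "2D   -!- PEPTIDE SEQUENCES: SPOT 2D-001WFZ: (R)VASWSTAR(H),318-325;",
    "2D       (R)QPVSASDFALQFTPGKR(Y),391-407.",
])
```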
2D   -!- TOPIC: FREE TEXT DESCRIPTION.
Current free text topics are:
1. MAPPING: Description of the biochemical technique which has allowed the identification of the entry on the SWISS-2DPAGE master gel.
2. NORMAL LEVEL: Description of the physiological protein expression.
3. PATHOLOGICAL LEVEL: Description of pathological protein expressions (an increase or decrease).
4. NORMAL POSITIONAL VARIANTS: Description of physiological polymorphisms.
5. PATHOLOGICAL POSITIONAL VARIANTS: Description of pathological polymorphisms.
6. EXPRESSION: Description of the protein expression modifications including level and/or post-translational modifications.
Examples of free text topics are:
2D   -!- MAPPING: MATCHING WITH A PLASMA GEL.
2D   -!- NORMAL LEVEL: 30-60 MG/L.
2D   -!- PATHOLOGICAL LEVEL: INCREASED DURING THE ACUTE-
2D       PHASE REACTION; DECREASED DURING EMPHYSEMA.
2D   -!- NORMAL POSITIONAL VARIANTS: 30 GENETICS
2D       VARIANTS KNOWN AS PI ALLELES.
2D   -!- PATHOLOGICAL POSITIONAL VARIANTS: ALPHA-1-
2D       ANTITRYPSIN PITTSBURGH.
2D   -!- EXPRESSION: decrease after benzoic acid treatment .
The 1D lines contain data specific to SDS-PAGE reference gels. These lines are arranged like the 2D ones. That is, the 1D lines for a given master occur in a block. A block is made of two or more 1D lines. The first line of a block specifies the master and has the following format:
1D   -!- MASTER: 'MASTER';
where 'MASTER' is one of the SWISS-2DPAGE masters available in the current release.
An example of the first line of a 1D master block is:
1D   -!- MASTER: NUCLEOLI_HELA_1D_HUMAN;
1. MW: Description of the experimental molecular weight of the entry on the SWISS-2DPAGE master gel
The format of the MW topic is:
1D   -!- MW: BAND 'SERIAL NUMBER'='MW';
Where 'SERIAL NUMBER' is the SWISS-2DPAGE serial number (a unique spot identifier across all gels in SWISS-2DPAGE), and 'MW' the experimental molecular weight of the band as determined on the master gel.
Here is an example for the MW topic:
1D   -!- MW: BAND 1D-001V8C=63488;
See 2D lines for other fixed format topics available.
1D   -!- TOPIC: FREE TEXT DESCRIPTION.
The current free text topic is:
1. MAPPING: Description of the biochemical technique which has allowed the identification of the entry on the SWISS-2DPAGE master gel.
See 2D lines for other free text topics available.
The DR (Database cross-Reference) lines are used as pointers to information related to SWISS-2DPAGE entries and found in other databases.
The format of the DR line is:
DR   DATABASE; PRIMARY_IDENTIFIER; SECONDARY_IDENTIFIER.
The first item on the DR line, the database identifier, is the abbreviated name of the data collection to which reference is made. The currently defined database identifiers are:
--------------------  ---------------------------------------------------------------------
Identifier            Database description
--------------------  ---------------------------------------------------------------------
Cornea-2DPAGE         2-DE database at Aarhus University, Denmark
COMPLUYEAST-2DPAGE    2-DE database at Madrid University, Spain
DOSAC-COBS 2D-PAGE    2-DE database at Palermo University, Italy
ECO2DBASE             Escherichia coli gene-protein database (2D gel spots)
HSC-2DPAGE            Harefield hospital 2D gel protein databases
LENS-2DPAGE           2-DE database of mammalian lens samples of the Oregon Health &
                      Science University, US
OGP-WWW               Oxford GlycoProteomics database (Human platelet) at Oxford
                      University, UK
PHCI-2DPAGE           Parasite host cell interaction 2D-PAGE, Aarhus University, Denmark
PMMA-2DPAGE           2D-PAGE database at Purkyne Military Medical Academy, Czech
Siena-2DPAGE          2-DE protein database, Siena University, Italy
UniProtKB/Swiss-Prot  Protein Knowledgebase from the Swiss Institute of Bioinformatics
                      and the EMBL Outstation - the European Bioinformatics Institute.
UniProtKB/TrEMBL      Computer-annotated supplement to Swiss-Prot from the Swiss
                      Institute of Bioinformatics and the EMBL Outstation - the
                      European Bioinformatics Institute.
YEPD                  Yeast electrophoresis protein database
--------------------  ---------------------------------------------------------------------
The second item on the DR line, the primary identifier, is an unambiguous pointer to the information entry in the database to which reference is being made. For a UniProtKB/Swiss-Prot reference, the primary identifier is the first accession number (also called the Unique Identifier in some databases) of the entry to which reference is being made.
The third and last item on the DR line, the secondary identifier, is used to complement the information given by the first identifier. For a UniProtKB/Swiss-Prot reference the secondary identifier is the entry name.
Examples of complete DR lines are shown here:
DR   UniProtKB/Swiss-Prot; P00352; DHAC_HUMAN.
DR   ECO2DBASE; G052.0; 6TH EDITION.
DR   HSC-2DPAGE; P47985; HUMAN.
DR   Siena-2DPAGE; P38646; GR75_HUMAN.
DR   PHCI-2DPAGE; P09211; GTP_HUMAN.
DR   YEPD; 4270; -.
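A DR line always carries three semicolon-separated items followed by a period. The Python sketch below splits it into database, primary identifier and secondary identifier. Only the final period is removed, so that identifiers containing a period (such as G052.0) stay intact. The name parse_dr_line is hypothetical.

```python
def parse_dr_line(line):
    """Return (database, primary_identifier, secondary_identifier)."""
    data = line[5:].strip()
    # Remove exactly one terminating period.
    if data.endswith("."):
        data = data[:-1]
    parts = [part.strip() for part in data.split(";")]
    if len(parts) != 3:
        raise ValueError("expected three items in DR line: %r" % line)
    database, primary, secondary = parts
    return database, primary, secondary


cross_reference = parse_dr_line("DR   UniProtKB/Swiss-Prot; P00352; DHAC_HUMAN.")
```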
The // (terminator) line contains no data or comments. It designates the end of an entry.
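Because every entry starts with an ID line and ends with a terminator line, a file holding several entries can be cut into entries with a small generator. The following Python sketch illustrates this; the name iter_entries is hypothetical, and blank lines between entries are assumed to be ignorable.

```python
def iter_entries(lines):
    """Yield each entry as a list of its lines, without the '//' terminator."""
    entry = []
    for line in lines:
        line = line.rstrip("\n")
        if line == "//":
            if entry:
                yield entry
            entry = []
        elif line:
            if not entry and not line.startswith("ID"):
                raise ValueError("entry does not start with an ID line")
            entry.append(line)
    if entry:
        raise ValueError("last entry is missing its '//' terminator")


entries = list(iter_entries([
    "ID   ALF_ECOLI; STANDARD; 2DG.",
    "AC   P0AB71; P11604;",
    "//",
    "ID   A1AT_HUMAN; STANDARD; 2DG.",
    "//",
]))
```

The same function accepts an open file object, since iterating over a file yields its lines one by one.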