To understand the basic principles of protein three-dimensional structure, and how structures can be used in academic and industrial research (for example in the pharmaceutical and biotech industries), we first need to look at the four levels of protein structure. The levels depend on each other, and together they create an extremely complex network of interactions between hundreds or thousands of atoms. The first level is the amino acid sequence (the primary structure); 20 different amino acids are commonly found in proteins. The sequence of these amino acids in a polypeptide chain essentially determines the types of secondary structure elements present (the second level of organization) and the way in which they are arranged in space, creating structural motifs and folds (the third level of organization).
An independent folding unit of the three-dimensional protein structure is called a domain. It is called independent because a domain can often be cloned, expressed and purified separately from the rest of the protein, and may even retain its activity, if any known activity is associated with it. Some proteins contain a single domain while others contain several. Each protein domain is assigned a certain type of fold. Domains with the same fold may or may not be related to each other functionally, simply because Nature has re-used the same fold many times in different contexts. The protein three-dimensional structures currently available in the Protein Data Bank have been classified into more than 1000 unique folds. Here I am going to discuss just some examples of these folds, to illustrate the basic principles by which they are defined.
An example of a quaternary protein structure is shown in the image below. The image shows the complex of two of the subunits of the enzyme magnesium chelatase. The structure was obtained by single-particle reconstruction from cryo-electron microscopy (cryo-EM) images of the complex. Where appropriate, the available X-ray 3D structure of subunit BchI of the enzyme was docked into the EM density (shown in ribbon representation). Other domains were homology-modeled based on known 3D structures of other proteins. Published in Lundqvist et al., Structure 2010.
The fourth level is the quaternary structure. A quaternary structure consists of several polypeptide chains (subunits), which may be identical (homo-oligomer) or different (hetero-oligomer). The subunits within such structures interact with each other, may contribute to one or more active sites, contribute to the dynamics of the complex, and may interact with target proteins.
Since large variations in sequence may result in the same structure, we say that structure is more highly conserved than sequence. This is reflected in the fact that determining a protein's 3D structure may often help to reveal its function. An interesting example was provided by the anaerobic cobaltochelatase, an enzyme active in vitamin B12 synthesis. Although the function of the protein was known before structure determination (Schubert et al., 1999), its structural similarity to ferrochelatase (Al-Karadaghi et al., 1997), an enzyme active in heme biosynthesis, could only be revealed once the structure of cobaltochelatase had been determined. The reason is that there is only 11% sequence identity between the two proteins, a number much smaller than the so-called "homology threshold" (around 20-25%), which is normally taken in sequence alignment as an indication of an evolutionary relationship.
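To make the notion of sequence identity concrete, here is a minimal Python sketch that computes percent identity for two sequences that have already been aligned. The function names, the toy sequences and the choice of denominator (columns where neither sequence has a gap) are assumptions made for illustration; alignment programs may define identity differently, and the 20% default is simply the lower end of the 20-25% range mentioned above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the two strings are rows of a pairwise alignment, so they
    have equal length and use `gap` to mark insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def below_homology_threshold(identity_percent: float, threshold: float = 20.0) -> bool:
    """True if the identity is too low to suggest homology from sequence alone.

    Assumption: `threshold` defaults to 20%, the lower end of the range
    quoted in the text; it is a rule of thumb, not a hard limit.
    """
    return identity_percent < threshold


if __name__ == "__main__":
    # Made-up toy sequences, not real protein data.
    identity = percent_identity("MKT-AYIAKQR", "MRTLAY-GKHR")
    print(f"identity: {identity:.1f}%")
    print("below threshold:", below_homology_threshold(identity))
    # The 11% identity quoted for cobaltochelatase vs. ferrochelatase:
    print("11% below threshold:", below_homology_threshold(11.0))
```

A pair at 11% identity falls below this cutoff, which is why the relationship between the two chelatases could not be inferred from sequence comparison and only became apparent from the structures.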
Jöns Jacob Berzelius (b. 1779), probably the most famous Swedish scientist, coined the word "protein".