Protein Structure
Contributors
Created by the Center for BioMolecular Modeling.
Last revision 2/2021

This Jmol Exploration was created using the Jmol Exploration Webpage Creator from the MSOE Center for BioMolecular Modeling.

version 2.0
Exploration Content

Proteins are Important to All Life

Take a deep breath. When you take air into your lungs, the oxygen binds to a protein called hemoglobin, which carries the normally reactive oxygen safely through the blood and on to the trillions of cells in your body.

Hemoglobin is an example of a protein.

Take a bite of an apple. When you eat sugars, your body senses the heightened glucose levels in your blood and releases a protein called insulin, which helps cells absorb the sugar from the blood.

Visit Friday Harbor in the Pacific Northwest on a warm summer evening, and experience the green shimmer in the water as jellyfish rise to the surface. The green color is generated by the Green Fluorescent Protein (GFP) produced by the jellyfish.

Proteins Have Complex 3-dimensional Shapes

Click the buttons to display the 3-dimensional shape of the three common proteins that were introduced above. Note that the display is fully interactive and can be rotated by clicking and dragging with your mouse!

Hemoglobin Proteins safely carry oxygen in the blood.

Hemoglobin PDB ID: 1a3n

Insulin Proteins help regulate sugar in the bloodstream.

Insulin PDB ID: 2hiu

Green Fluorescent Proteins create bioluminescence in animals like jellyfish.

Green Fluorescent Protein (GFP) PDB ID: 1emb

Proteins are incredibly complex molecules. They come in an almost endless array of shapes and types, acting like molecular machines that perform countless microscopic tasks. Because of their complex and variable structures, scientists break down their overall shapes using four layers of 'protein structure':


Primary Structure: Amino Acids are the Building Blocks of Proteins

All proteins are composed of small subunits called amino acids that are joined together like links in a chain to make large complex protein structures. There are twenty different types of amino acids that can be linked together in various orders and frequencies. Every type of protein is made of a different and unique sequence of amino acids.


The Molecular Shape of Amino Acids

There are twenty different types of amino acids used to make proteins. The chemical structures of the twenty types of amino acids are identical in some ways, but unique in others.


  • Identical Backbone - the backbone region of each of the twenty different amino acids is identical. The backbone is composed of an amino group, a central alpha carbon and a carboxylic acid group.

  • Unique R-Group - attached to this backbone is a group called the R-group, or the sidechain. The chemical composition of this R-group is different for each of the twenty amino acids and gives each different type of amino acid its unique attributes. The twenty different R-groups are discussed in more detail in the next section.

Shown below are the chemical structure of a single amino acid (left), a physical model of a single amino acid (right) and an interactive 3-dimensional structure of a single amino acid (far right; click a button to view). Use the buttons below to highlight the different parts of a single amino acid.

Entire Amino Acid
Backbone Atoms (identical in all amino acids)
Amino Group
Alpha Carbon
Carboxylic Acid Group
R-Group or Sidechain (unique to each of the twenty types of amino acids)

A Closer Look at the R-Group or Sidechain

Let's take a closer look at the part of the 20 different amino acids that makes them each unique - the R-group, also called the Sidechain. The R-group is considered the functional group of each amino acid, meaning that it gives each amino acid its unique attributes.

The 20 amino acids are often sorted into five categories based on the unique attributes the R-groups create:





  • Hydrophobic - water-hating amino acids with a lot of hydrophobic carbon atoms in their R-group.

  • Hydrophilic - water-loving amino acids with a lot of hydrophilic oxygen and nitrogen atoms in their R-group.

  • Positively Charged - water-loving amino acids that have a net positive charge (+) due to an abundance of nitrogen atoms in their R-group.

  • Negatively Charged - water-loving amino acids that have a net negative charge (-) due to an abundance of oxygen atoms in their R-group.

  • Cysteine - a unique amino acid that can form strong bonds called disulfide bonds with other cysteine amino acids.
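
As a rough illustration, the five categories can be written as a small lookup table. The assignment of each amino acid to a category below is an assumption based on one common textbook grouping; the text above does not list the members, and borderline residues such as glycine, histidine and tyrosine are placed differently by different sources. The names CATEGORY_MEMBERS and category_of are hypothetical.

```python
# Assumed grouping of the 20 amino acids (one-letter codes) into the five
# categories described above. Borderline residues vary between sources.
CATEGORY_MEMBERS = {
    "hydrophobic": "AGILMFPWV",
    "hydrophilic": "NQSTY",
    "positively charged": "RHK",
    "negatively charged": "DE",
    "cysteine": "C",
}


def category_of(one_letter_code):
    """Return the category name for a single one-letter amino acid code."""
    code = one_letter_code.upper()
    if len(code) != 1:
        raise ValueError("expected exactly one letter")
    for category, members in CATEGORY_MEMBERS.items():
        if code in members:
            return category
    raise ValueError(f"not one of the twenty amino acids: {one_letter_code!r}")


if __name__ == "__main__":
    for letter in "LSKDC":
        print(letter, "->", category_of(letter))
```

Keeping the grouping in one table makes it easy to swap in a different classification if your course or textbook sorts the borderline residues another way.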

Below are the structures of the twenty amino acids that are routinely found in proteins. Each button includes the name of the amino acid, along with the three-letter and one-letter codes used by scientists. Click on each button to view an interactive structure of that amino acid. The background color in the Jmol display represents the chemical property of that amino acid. You may also access a printable version of the Amino Acid Chart for future reference. See if you can find the amino acid backbone in each of these structures.

Alanine Ala A
Arginine Arg R
Asparagine Asn N
Aspartic Acid Asp D
Cysteine Cys C
Glutamic Acid Glu E
Glutamine Gln Q
Glycine Gly G
Histidine His H
Isoleucine Ile I
Leucine Leu L
Lysine Lys K
Methionine Met M
Phenylalanine Phe F
Proline Pro P
Serine Ser S
Threonine Thr T
Tryptophan Trp W
Tyrosine Tyr Y
Valine Val V
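
The three-letter and one-letter codes are often easier to work with as a lookup table. The sketch below is one possible way to store them and to convert a sequence written in three-letter codes into the compact one-letter form; the names AMINO_ACIDS, THREE_TO_ONE and to_one_letter are hypothetical.

```python
# (name, three-letter code, one-letter code) for the twenty amino acids.
AMINO_ACIDS = [
    ("Alanine", "Ala", "A"), ("Arginine", "Arg", "R"),
    ("Asparagine", "Asn", "N"), ("Aspartic Acid", "Asp", "D"),
    ("Cysteine", "Cys", "C"), ("Glutamic Acid", "Glu", "E"),
    ("Glutamine", "Gln", "Q"), ("Glycine", "Gly", "G"),
    ("Histidine", "His", "H"), ("Isoleucine", "Ile", "I"),
    ("Leucine", "Leu", "L"), ("Lysine", "Lys", "K"),
    ("Methionine", "Met", "M"), ("Phenylalanine", "Phe", "F"),
    ("Proline", "Pro", "P"), ("Serine", "Ser", "S"),
    ("Threonine", "Thr", "T"), ("Tryptophan", "Trp", "W"),
    ("Tyrosine", "Tyr", "Y"), ("Valine", "Val", "V"),
]

THREE_TO_ONE = {three: one for _name, three, one in AMINO_ACIDS}


def to_one_letter(three_letter_codes):
    """Convert a list of three-letter codes into a one-letter string."""
    return "".join(THREE_TO_ONE[code.capitalize()] for code in three_letter_codes)


if __name__ == "__main__":
    print(to_one_letter(["Gly", "Ile", "Val", "Glu", "Gln"]))
```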

Creating a Protein Chain

Amino acids are chemically linked together to make long amino acid chains through a condensation reaction (sometimes called dehydration synthesis). This chemical reaction will form a bond linking two amino acids together. This type of bond between two amino acids is called a peptide bond.

Condensation reaction in the formation of peptide bond

This reaction also creates a molecule of water as a byproduct. One oxygen atom and one hydrogen atom come from the carboxylic acid group of the first amino acid and join with another hydrogen atom from the amino group of the second amino acid.
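
The atom bookkeeping of this reaction can be checked with a short sketch. It adds up the atoms of two amino acids and removes the two hydrogens and one oxygen that leave as water. Glycine (C2H5NO2) and alanine (C3H7NO2) are used only as convenient examples, and the function name condense is hypothetical.

```python
from collections import Counter

# Atom counts (molecular formulas) of water and two small amino acids.
WATER = Counter({"H": 2, "O": 1})
GLYCINE = Counter({"C": 2, "H": 5, "N": 1, "O": 2})
ALANINE = Counter({"C": 3, "H": 7, "N": 1, "O": 2})


def condense(first, second):
    """Atom counts of the product when two molecules join and release one water."""
    product = first + second   # pool all atoms from both molecules
    product.subtract(WATER)    # one H2O leaves as the byproduct
    return product


if __name__ == "__main__":
    dipeptide = condense(GLYCINE, ALANINE)
    print(dict(dipeptide))
```

Each additional amino acid added to the chain releases one more water molecule, so a chain of n amino acids has lost n - 1 waters in total.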

Click on the button below to display a dipeptide - two amino acids joined by a peptide bond (shown in yellow).

Dipeptide (peptide bond in yellow)

Polypeptides

A chain of many amino acids linked together by peptide bonds is called a polypeptide. The specific sequence of amino acids in a polypeptide is known as the protein's primary structure. The twenty types of amino acids can be joined together in any order or frequency, allowing for an astronomical variety of potential primary structures.
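
To get a feel for how large this variety is, the number of possible sequences can be computed directly: with twenty choices at each position, a chain of n amino acids has 20 to the power n possible orderings. The function name possible_sequences is hypothetical, and the sketch counts sequences only; it says nothing about which of them would fold into a working protein.

```python
def possible_sequences(length, alphabet_size=20):
    """Number of distinct chains of the given length built from the alphabet."""
    if length < 0:
        raise ValueError("length must not be negative")
    return alphabet_size ** length


if __name__ == "__main__":
    for n in (2, 3, 10, 100):
        count = possible_sequences(n)
        print(f"{n} amino acids: {len(str(count))}-digit number of sequences")
```

Even a modest chain of 100 amino acids has more than 10^130 possible primary structures.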

Polypeptide

Each type of protein in our body has a unique primary structure. This specific order of amino acids determines the protein's final 3-dimensional shape and function.

Secondary Structure: Alpha Helices and Beta Pleated Sheets

A protein's primary structure is the specific order of amino acids that have been linked together to form a polypeptide chain. But polypeptides do not simply stay straight as linear sequences of amino acids. They fold back on themselves to create complex 3-dimensional shapes.

When examining different proteins, you will notice that there are two specifically recognizable shapes that are often repeated throughout a protein's 3-dimensional structure. One of these shapes looks like a curl and the other looks like rows or zig-zags.

Alpha Helix

The curls are called alpha helices and almost look like a spiral staircase or a spring. They exist when a protein's backbone curls up into a helical shape.

Alpha Helix
Beta Pleated Sheet

The rows are called beta pleated sheets and almost look like a long line for a ride at an amusement park. They exist when a protein's backbone forms an extended zig-zag structure that passes back and forth.

Beta Pleated Sheet

The organization and frequency of these two structures in a protein's overall 3-dimensional shape is called the protein's secondary structure.

Secondary Structure within a Protein

One type of protein that clearly shows both an alpha helix and a beta pleated sheet is a zinc finger protein, which helps regulate DNA expression in a cell's nucleus. This relatively small protein is only 28 amino acids long but includes a four-turn alpha helix and a two-strand beta pleated sheet.

Zinc Finger Protein PDB ID: 1zaa

Visualizing Complex Protein Structures

Protein structures can become very visually overwhelming. This is especially true for large proteins, which can be thousands of amino acids long and include tens of thousands of atoms!

Because of this, proteins are often visually represented using different display formats and color schemes. Click on the buttons below to see the zinc finger protein, in the interactive display to the far right, shown in some of the most common display formats and color schemes.

Ball and Stick Format displays every single atom in a protein with a small sphere and represents the bonds between these atoms with thin sticks. This format has a LOT of detail - and often too much detail to understand easily!

Ball and Stick

Spacefill Format displays every single atom in a protein with a sphere the size of each atom's electron cloud. You can see the overall shape of the protein and that there isn't really much space between atoms, but it is difficult to trace the shape of the protein or identify secondary structures.

Spacefill

Backbone Format displays only the alpha carbon atoms, hiding all other atoms. Each alpha carbon is connected to the next with a cylinder. It is easy to see the overall shape and fold of the protein in a backbone model, but it doesn't convey much information about the molecular interactions.

Backbone

Cartoon Format displays the overall path of the backbone with a smooth ribbon and highlights the secondary structures with arrows that point towards the C-terminal end of the protein chain.

Cartoon

Frequently proteins are displayed in a mixture of formats to convey information that tells a story about the protein's structure and function. Shown below are the three zinc ions that stabilize the zinc finger structure, along with the cysteine and histidine residues that coordinate the zinc (hold it in place). The DNA to which the zinc finger protein binds is in cartoon.

Combination Format

CPK Color Scheme colors every atom by its element type. Carbon atoms are gray, nitrogen atoms are blue, oxygen atoms are red and hydrogen atoms are white.

This next image shows how arginine residues 'reach in' and interact with the DNA.

CPK Color Scheme
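
The CPK scheme is essentially a lookup from element to color. The sketch below encodes only the four colors listed above; other elements (sulfur or zinc, for example) are not given a color in this text, so the sketch reports them as "unassigned" rather than guessing. The names CPK_COLORS and cpk_color are hypothetical.

```python
# Element symbol -> color, limited to the four elements named in the text.
CPK_COLORS = {"C": "gray", "N": "blue", "O": "red", "H": "white"}


def cpk_color(element, default="unassigned"):
    """Return the CPK color for an element symbol, or a default if not listed."""
    return CPK_COLORS.get(element.upper(), default)


if __name__ == "__main__":
    # The ten atoms of glycine (C2H5NO2), listed by element.
    glycine_atoms = ["N", "C", "C", "O", "O", "H", "H", "H", "H", "H"]
    print([cpk_color(atom) for atom in glycine_atoms])
```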

Secondary Structure Color Scheme colors all atoms that are part of alpha helices magenta, all atoms that are part of beta pleated sheets yellow, and all remaining atoms white.

Secondary Structure Color Scheme
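
This scheme can be sketched as a rule applied to one label per amino acid. The letters used here are an assumption borrowed from a common convention ("H" for alpha helix, "E" for an extended beta strand, anything else for the remaining residues); the example label string is made up for illustration and does not describe a real protein.

```python
# Secondary structure label -> display color, as described in the text.
SS_COLORS = {"H": "magenta", "E": "yellow"}


def color_by_secondary_structure(labels):
    """Return one color per residue label; unlisted labels become white."""
    return [SS_COLORS.get(label, "white") for label in labels]


if __name__ == "__main__":
    labels = "-EEE--EEE--HHHHHHHHHHHH---"  # hypothetical per-residue labels
    colors = color_by_secondary_structure(labels)
    print(colors.count("magenta"), "helix residues,",
          colors.count("yellow"), "sheet residues")
```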

Amino Acid Attribute Color Scheme colors all hydrophobic amino acids yellow, all hydrophilic amino acids white, all positive amino acids blue, all negative amino acids red, and all cysteine amino acids green.

Amino Acid Properties Color Scheme

Each of the different display formats and color schemes used in protein visualization has advantages and disadvantages.

For example, spacefill format is an excellent way to represent the overall globular shape of a protein, but may make it difficult to see details at the very center of the protein. Cartoon format is an excellent way to see the overall path of a protein's backbone, but may not show the details of each individual amino acid's R-group.

Hydrogen Bonds Help Support Secondary Structures

Alpha helices and beta sheets are supported and reinforced by hydrogen bonds. A hydrogen bond is a weak attraction that forms when a hydrogen atom covalently bonded to one electronegative atom (such as nitrogen or oxygen) interacts with a second electronegative atom nearby.

Hydrogen bonds often form between the backbone atoms of different amino acids in the two secondary structures of proteins. A hydrogen atom covalently bound to the nitrogen atom of one amino acid interacts with the oxygen atom of another amino acid.

Click on the links below to see hydrogen bonds represented as yellow cylinders in the two types of secondary structures.

Alpha Helix (ball & stick) with yellow hydrogen bonds
Alpha Helix (backbone) with yellow hydrogen bonds
Beta Sheet (ball & stick) with yellow hydrogen bonds
Beta Sheet (backbone) with yellow hydrogen bonds
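
Molecular viewers typically decide where to draw a hydrogen bond from the distance between atoms. The sketch below is a simplified version of that idea: it treats a backbone nitrogen and a backbone oxygen as hydrogen bonded when they are within a cutoff distance. The 3.5 angstrom cutoff is an assumed rule of thumb, real programs usually also check bond angles, and the coordinates are invented for illustration.

```python
import math


def is_hydrogen_bond(donor_n, acceptor_o, cutoff=3.5):
    """True if the N...O distance (in angstroms) is within the assumed cutoff."""
    return math.dist(donor_n, acceptor_o) <= cutoff


if __name__ == "__main__":
    nitrogen = (0.0, 0.0, 0.0)       # hypothetical backbone N position
    near_oxygen = (2.9, 0.0, 0.0)    # close enough to count as bonded
    far_oxygen = (5.0, 0.0, 0.0)     # too far away
    print(is_hydrogen_bond(nitrogen, near_oxygen))
    print(is_hydrogen_bond(nitrogen, far_oxygen))
```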

Secondary Structures in Other Common Proteins

Click on the images below to view each of the three proteins discussed earlier in the interactive display to the right. Each protein will be colored with the Secondary Structure Color Scheme (alpha helices colored magenta and beta sheets colored yellow) to emphasize each protein's unique secondary structures.

Hemoglobin Proteins safely carry oxygen in the blood.
Hemoglobin PDB ID: 1a3n
Insulin Proteins help regulate sugar in the bloodstream.
Insulin PDB ID: 2hiu
Green Fluorescent Proteins create bioluminescence in animals like jellyfish.
Green Fluorescent Protein (GFP) PDB ID: 1emb

Tertiary Structure

A protein needs to adopt a final and stable 3-dimensional shape in order to function properly. The Tertiary Structure of a protein is the arrangement of the secondary structures into this final 3-dimensional shape.

The sequence of amino acids in a protein (the primary structure) will determine where alpha helices and beta sheets (the secondary structures) will occur. These secondary structure motifs then fold into an overall arrangement that is the final 3-dimensional fold of the protein (the tertiary structure). Each unique sequence of amino acids gives rise to a unique protein type, with a unique shape and function.

A summary of primary, secondary and tertiary structure is shown below.


Forces That Drive Tertiary Structure

Most proteins fold into their tertiary structure in an aqueous environment - a cell is, after all, mostly water. The chemical properties of the various R-groups (sidechains) of the amino acids within the protein chain will influence the way that the protein folds in its environment.

When a protein is surrounded by water:


  • Hydrophobic amino acids will move away from the water and bury themselves in the center of the protein.

  • Hydrophilic amino acids will interact with the water molecules, and thus tend to be located on the outer surface of the protein.

  • Basic (positively charged) amino acids and acidic (negatively charged) amino acids create salt bridges, or electrostatic interactions, to further stabilize the tertiary structure.

  • Cysteines may form a disulfide bridge, further stabilizing the protein.

Click the buttons below to see each of these four groupings of amino acid types shown in the insulin protein in the display to the right.

Hydrophobic Interactions Animation
Insulin Hydrophobic Residues Colored Yellow PDB ID: 2hiu
Hydrophilic Interactions Animation
Insulin Hydrophilic Residues Colored White PDB ID: 2hiu
Interaction of Positive and Negative Charged Residues to Form Salt Bridge Animation
Insulin Positive (blue) and Negative (Red) Residues PDB ID: 2hiu
Formation of Disulfide Bond Between Cysteine Residues Animation
Insulin Disulfide Bonds PDB ID: 2hiu

Note that although cysteine residues can form covalent disulfide bonds with other cysteine residues, not all cysteine residues in a protein have to form disulfide bonds.
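
A first step in looking for possible disulfide bonds is simply to find where the cysteines sit in each chain. The sketch below does this for the two chains of insulin. The sequences are the commonly published human insulin A and B chains; they are not taken from this page, so treat them as an assumption. The sketch only lists candidate positions. Which cysteines actually pair up depends on the folded 3-dimensional structure.

```python
# Assumed one-letter sequences of the human insulin A and B chains.
INSULIN_A = "GIVEQCCTSICSLYQLENYCN"
INSULIN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"


def cysteine_positions(sequence):
    """Return the 1-based positions of every cysteine (C) in a sequence."""
    return [index for index, residue in enumerate(sequence, start=1)
            if residue == "C"]


if __name__ == "__main__":
    print("A chain cysteines:", cysteine_positions(INSULIN_A))
    print("B chain cysteines:", cysteine_positions(INSULIN_B))
```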

The Structure-Function Relationship

Proteins are amazing molecules because they come in a huge variety of sizes and shapes, each suited to perform a specific task. The primary sequence of amino acids in a protein determines its 3-dimensional shape which, in turn, determines how the protein will function. This structure-function relationship is key to appreciating proteins and protein structure.

The same sequence of amino acids will fold into the same 3-dimensional shape each time the chain is made, allowing the body to produce millions of identical copies of any particular type of protein. This consistency is due to the chemical properties of the amino acids in that unique sequence (the primary structure).

Exploring Protein Structures

If a protein does not fold correctly it will not function properly. Therefore, exploring a protein's structure is very important when trying to understand what it does and how it works.

When scientists study a protein they must first determine the sequence of amino acids in the protein chain (primary structure). They use this sequence to predict the presence of any alpha helices or beta sheets (secondary structure). They can then use X-ray crystallography and NMR to determine a protein's full 3-dimensional shape (tertiary structure). Knowing the tertiary structure of a protein is often crucial to understanding how it functions and how to target it for drug therapy or other medical uses.

Quaternary Structure

Secondary and tertiary structures are determined by a protein's sequence of amino acids, or primary structure. All proteins have primary, secondary and tertiary structure.

Some proteins are made up of more than one amino acid chain, giving them a quaternary structure. These multi-chain proteins are held together by the same forces as the tertiary structure of individual protein chains (hydrophobic, hydrophilic, positive/negative and cysteine interactions). Sometimes the various protein chains in a protein complex are identical and other times they are each unique.

Click on the proteins below to see their overall quaternary structure shown in the 3-dimensional display to the right. For each protein complex, the various chains have been colored differently.

Potassium Channel PDB ID: 1bl8
Antibody PDB ID: 1igt
G Protein PDB ID: 1gg2
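
Structure files in the PDB text format record which chain every atom belongs to, which is how a viewer can color each chain differently. In that format the chain identifier is the single character in column 22 of each ATOM record. The sketch below counts the chains in a tiny made-up example; the helper make_atom_line and the toy records are hypothetical and leave out the coordinates a real file would contain.

```python
def make_atom_line(serial, atom_name, res_name, chain_id, res_seq):
    """Build a minimal fixed-width ATOM record (coordinates omitted)."""
    return (f"ATOM  {serial:>5} {atom_name:<4} {res_name:>3} "
            f"{chain_id}{res_seq:>4}")


def chain_ids(pdb_text):
    """Return the chain identifiers found in ATOM records, in order of appearance."""
    chains = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]  # column 22 holds the chain identifier
            if chain not in chains:
                chains.append(chain)
    return chains


if __name__ == "__main__":
    toy_records = "\n".join(
        make_atom_line(serial, "CA", "VAL", chain, 1)
        for serial, chain in enumerate("ABCD", start=1)
    )
    print(chain_ids(toy_records))
```

A protein with quaternary structure will show more than one chain identifier, while a single-chain protein will show only one.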

A Review of Protein Structure


  • Proteins are long chains of amino acids that fold into complex 3-dimensional shapes.

  • Proteins come in an almost endless array of shapes and sizes, each type acting like a specialized molecular machine that performs a specific microscopic task.

  • Primary Structure is the specific order of amino acids in a protein polypeptide chain. There are 20 different types of amino acids that can be incorporated into a protein chain, each with unique attributes (hydrophobic, hydrophilic, positive, negative, and cysteine).

  • Secondary Structures are the alpha helices and beta pleated sheets present in a folded protein's structure.

  • Tertiary Structure is the final shape of an entire amino acid chain. This shape is directly related to the function of the protein.
  • Quaternary Structure exists when more than one amino acid chain comes together to form a protein complex.