1 00:00:10,538 --> 00:00:12,960 HAZEL SIVE: All right. 2 00:00:12,960 --> 00:00:20,830 Let's move on to the second topic of our discussion today, 3 00:00:20,830 --> 00:00:25,500 which we will start today and then continue on Friday. 4 00:00:25,500 --> 00:00:29,580 And this is a discussion of the proteins. 5 00:00:29,580 --> 00:00:32,970 The proteins, probably the most fascinating class of 6 00:00:32,970 --> 00:00:46,180 macromolecules, 55% of the dry mass of a cell, and proteins 7 00:00:46,180 --> 00:00:48,320 function everywhere. 8 00:00:48,320 --> 00:00:49,835 They do almost everything. 9 00:00:54,070 --> 00:00:57,470 Strictly speaking, proteins are not hereditary 10 00:00:57,470 --> 00:01:00,910 information, although Professor Jacks will talk with 11 00:01:00,910 --> 00:01:05,670 you about a class of proteins called prions, which kind of 12 00:01:05,670 --> 00:01:11,180 are hereditary information. 13 00:01:11,180 --> 00:01:17,810 So not hereditary info, but they do 14 00:01:17,810 --> 00:01:20,700 almost everything else. 15 00:01:20,700 --> 00:01:23,155 They form the structure of the cell. 16 00:01:27,590 --> 00:01:32,520 They form a major class of catalysts called enzymes that 17 00:01:32,520 --> 00:01:36,060 we will talk about on Friday. 18 00:01:36,060 --> 00:01:40,050 They function in defense as in the immune system. 19 00:01:40,050 --> 00:01:47,040 They allow cells to move, dot, dot, dot. 20 00:01:47,040 --> 00:01:47,580 Okay? 21 00:01:47,580 --> 00:01:51,650 We'll talk about proteins on and on and on as major players 22 00:01:51,650 --> 00:01:54,110 in the fabric of life. 23 00:01:54,110 --> 00:02:08,990 Their monomer is an amino acid that is abbreviated 24 00:02:08,990 --> 00:02:12,950 AA or little aa. 25 00:02:12,950 --> 00:02:19,360 And it has a particular structure that forms around a 26 00:02:19,360 --> 00:02:24,060 central carbon, which is called the alpha carbon. 
27 00:02:24,060 --> 00:02:28,190 And from this alpha carbon, there are four groups that 28 00:02:28,190 --> 00:02:33,220 emanate, something called R, which is the functional 29 00:02:33,220 --> 00:02:38,480 interesting group of each amino acid. 30 00:02:38,480 --> 00:02:49,400 So R is characteristic of each class of amino acid. 31 00:02:49,400 --> 00:02:52,150 We'll talk more about this in a moment. 32 00:02:52,150 --> 00:02:56,920 And then we'll put in a hydrogen and then there is a 33 00:02:56,920 --> 00:03:00,530 nitrogen or an amine group, which I'm going to draw here 34 00:03:00,530 --> 00:03:02,640 as NH3 positive. 35 00:03:02,640 --> 00:03:07,180 And the other group is a carboxyl group, which I'm 36 00:03:07,180 --> 00:03:11,050 going to draw here again as ionized. 37 00:03:11,050 --> 00:03:15,270 So R, actually, I'm going to give a special name to. 38 00:03:15,270 --> 00:03:18,970 We'll call it a side chain. 39 00:03:18,970 --> 00:03:24,620 And this side chain can be a charged group. 40 00:03:24,620 --> 00:03:27,915 It can be polar or non-polar. 41 00:03:35,260 --> 00:03:37,810 And let me see what I have as the first thing. 42 00:03:37,810 --> 00:03:38,200 Yes. 43 00:03:38,200 --> 00:03:42,130 Here are some of the side groups in amino acids, for 44 00:03:42,130 --> 00:03:43,750 example, lysine. 45 00:03:43,750 --> 00:03:47,190 Here is the alpha carbon, and here's its side group. 46 00:03:47,190 --> 00:03:49,180 You can see it ends in an amino group. 47 00:03:49,180 --> 00:03:50,870 It's positively charged. 48 00:03:50,870 --> 00:03:53,580 Glutamic acid, I keep forgetting glutamic acid. 49 00:03:53,580 --> 00:03:54,590 This is a better screen. 50 00:03:54,590 --> 00:03:56,440 Glutamic acid ends. 51 00:03:56,440 --> 00:03:58,290 It's got a carboxyl group at the end. 52 00:03:58,290 --> 00:03:59,155 It's negatively charged. 53 00:03:59,155 --> 00:04:00,680 It's an acid. 54 00:04:00,680 --> 00:04:03,940 Here's one, tyrosine with a polar uncharged group. 
55 00:04:03,940 --> 00:04:08,450 This benzene ring and the hydroxyl give some polarity 56 00:04:08,450 --> 00:04:10,660 to the molecule, and here's leucine -- the 57 00:04:10,660 --> 00:04:12,460 hydroxyl gives polarity. 58 00:04:12,460 --> 00:04:14,170 Leucine is non-polar. 59 00:04:14,170 --> 00:04:18,160 It's really a hydrocarbon, and cysteine has this interesting, 60 00:04:18,160 --> 00:04:21,760 very reactive sulfhydryl group at the end of its R chain, 61 00:04:21,760 --> 00:04:25,860 which is capable of reacting with another sulfhydryl group 62 00:04:25,860 --> 00:04:27,680 and forming a disulfide bond. 63 00:04:34,060 --> 00:04:38,000 The polymer, actually I think I could write it here, the 64 00:04:38,000 --> 00:04:53,190 polymer of amino acids looks like this, and it's called a 65 00:04:53,190 --> 00:05:03,230 peptide if it's less than 20 amino acids and a protein if 66 00:05:03,230 --> 00:05:05,640 it's more than about 20 amino acids, but 67 00:05:05,640 --> 00:05:07,700 that's kind of loose. 68 00:05:07,700 --> 00:05:11,400 You should know the terms peptide, polypeptide, meaning 69 00:05:11,400 --> 00:05:14,440 something a bit bigger than a peptide, and protein. They're all 70 00:05:14,440 --> 00:05:16,090 the same chemical structure -- 71 00:05:16,090 --> 00:05:19,880 the names just refer to differences in the amount of 72 00:05:19,880 --> 00:05:21,380 sequence in the polymer. 73 00:05:24,430 --> 00:05:28,490 How does the protein polymer form? 74 00:05:28,490 --> 00:05:30,340 I'll draw that for you now. 75 00:05:30,340 --> 00:05:34,050 And it forms by making a particular bond called a 76 00:05:34,050 --> 00:05:38,480 peptide bond that you should know and be able to recognize. 77 00:05:38,480 --> 00:05:43,690 So let's draw amino acid one with carbon. 78 00:05:43,690 --> 00:05:50,100 And we're going to put its amino group, its hydrogen, 79 00:05:50,100 --> 00:05:54,040 and then its carboxyl group. 80 00:05:54,040 --> 00:05:55,430 And we're going to add to it.
81 00:05:55,430 --> 00:05:59,210 So this is going to be amino acid one, and we're going to 82 00:05:59,210 --> 00:06:03,915 add to it another one with a second side chain. 83 00:06:03,915 --> 00:06:08,760 And here, we'll draw out the hydrogens on the amino group. 84 00:06:17,850 --> 00:06:21,780 This oxygen and the hydrogens on the amino group are going 85 00:06:21,780 --> 00:06:29,840 to interact, and they are going to undergo a 86 00:06:29,840 --> 00:06:31,570 condensation reaction. 87 00:06:31,570 --> 00:06:35,820 And actually I realize when I drew for you the nucleotide 88 00:06:35,820 --> 00:06:39,050 condensation reaction, that was when we drew the 89 00:06:39,050 --> 00:06:42,460 dinucleotide that was forming, I realize I didn't put that 90 00:06:42,460 --> 00:06:45,970 there was elimination of a molecule as the dinucleotide 91 00:06:45,970 --> 00:06:46,440 was forming. 92 00:06:46,440 --> 00:06:48,980 It's a bit complicated, and that was why I didn't do it. 93 00:06:48,980 --> 00:06:52,100 But in this case, this is a kind of classic condensation 94 00:06:52,100 --> 00:06:55,950 reaction, which will eliminate water, and you'll end up with 95 00:06:55,950 --> 00:06:57,550 something that looks like this. 96 00:07:01,030 --> 00:07:02,120 So here's your alpha. 97 00:07:02,120 --> 00:07:04,600 I'm going to circle the alpha carbons so we don't 98 00:07:04,600 --> 00:07:05,850 forget where they are. 99 00:07:10,050 --> 00:07:15,680 And then we've got a carbonyl group that's joined to an 100 00:07:15,680 --> 00:07:19,910 amine and then the other alpha carbon, R2. 101 00:07:25,650 --> 00:07:30,600 So here is amino acid one, amino acid two. 102 00:07:30,600 --> 00:07:36,320 And here is a dipeptide, or diamino acid. 103 00:07:36,320 --> 00:07:39,270 It doesn't really matter what you call it. 104 00:07:39,270 --> 00:07:46,510 The peptide bond is this guy, and it's a very, very 105 00:07:46,510 --> 00:07:49,725 important bond.
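The condensation reaction described here can be checked with simple atom bookkeeping. The sketch below is only an illustration: the choice of glycine and alanine, their molecular formulas, and the helper name condense are assumptions brought in for the example and are not part of the lecture. It adds up the atoms of two free amino acids and removes one water, which is what forming a peptide bond eliminates.

```python
from collections import Counter

# Atom counts for two free amino acids (standard molecular formulas).
# Glycine and alanine are chosen only for illustration.
GLYCINE = Counter({"C": 2, "H": 5, "N": 1, "O": 2})  # side chain R = H
ALANINE = Counter({"C": 3, "H": 7, "N": 1, "O": 2})  # side chain R = CH3
WATER = Counter({"H": 2, "O": 1})


def condense(first: Counter, second: Counter) -> Counter:
    """Join two molecules by condensation: pool their atoms, then remove one water."""
    product = first + second   # new Counter holding all atoms of both molecules
    product.subtract(WATER)    # the peptide bond forms with loss of H2O
    return product


dipeptide = condense(GLYCINE, ALANINE)
print(dict(dipeptide))
```

The same bookkeeping extends to longer chains: each additional amino acid added to the chain forms one more peptide bond and releases one more water.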
106 00:07:49,725 --> 00:07:53,410 It holds the protein chain together, and you need to be 107 00:07:53,410 --> 00:07:55,770 able to recognize it. 108 00:07:55,770 --> 00:07:58,690 Two other features of this as well, which will be 109 00:07:58,690 --> 00:08:02,890 reminiscent of the nucleic acid case, and on one end -- 110 00:08:02,890 --> 00:08:03,890 the ends are different. 111 00:08:03,890 --> 00:08:05,380 Okay? 112 00:08:05,380 --> 00:08:07,030 The ends are different. 113 00:08:07,030 --> 00:08:16,470 On one end, there is a free amino group, and this is 114 00:08:16,470 --> 00:08:24,930 called the amino end or the N-terminal of the dipeptide or 115 00:08:24,930 --> 00:08:26,580 of the protein. 116 00:08:26,580 --> 00:08:37,320 And on the other end, there is a free carboxyl group, and 117 00:08:37,320 --> 00:08:47,440 this is termed the carboxyl or the carboxy, or the C-terminal 118 00:08:47,440 --> 00:08:52,660 of this dipeptide, or indeed of any protein. 119 00:08:52,660 --> 00:08:57,160 So this is going to sound reminiscent of the case as in 120 00:08:57,160 --> 00:09:02,460 nucleic acids because like nucleic acids, proteins, 121 00:09:02,460 --> 00:09:07,990 because they have different ends, have a linear order. 122 00:09:07,990 --> 00:09:15,980 So we'll write it again, different ends 123 00:09:15,980 --> 00:09:18,350 and a linear order. 124 00:09:18,350 --> 00:09:23,350 And again, it's this linear order that can only be read in 125 00:09:23,350 --> 00:09:27,040 one direction or that leads to information that is not 126 00:09:27,040 --> 00:09:32,250 symmetric or not randomly oriented that gives the 127 00:09:32,250 --> 00:09:34,530 protein its particular properties. 128 00:09:34,530 --> 00:09:37,370 So let's write this out again, formally. 129 00:09:37,370 --> 00:09:41,770 On one end, there's the amino end. 130 00:09:41,770 --> 00:09:44,900 You can also write this as NH2. 131 00:09:44,900 --> 00:09:46,590 It's done kind of casually. 
132 00:09:46,590 --> 00:09:51,270 You can write it as N. Doesn't really matter to us. 133 00:09:51,270 --> 00:09:55,710 And then here's your polymer of amino acids. 134 00:09:55,710 --> 00:09:58,060 And here is your carboxy end. 135 00:09:58,060 --> 00:09:59,470 You can write COO. 136 00:09:59,470 --> 00:10:01,570 I always write COOH. 137 00:10:01,570 --> 00:10:05,460 You can also write C. All of those are kind of given as 138 00:10:05,460 --> 00:10:11,250 being equivalent and this is your amino end and your 139 00:10:11,250 --> 00:10:14,580 carboxy end. 140 00:10:14,580 --> 00:10:18,680 That's terminology, but you can tell from where these free 141 00:10:18,680 --> 00:10:22,720 chemical groups are which amino acid was added first and 142 00:10:22,720 --> 00:10:24,130 which was added last. 143 00:10:24,130 --> 00:10:29,060 And the amino acid nearest the N-terminal is added first. 144 00:10:32,380 --> 00:10:37,020 The one nearest the free carboxy is the most recently 145 00:10:37,020 --> 00:10:38,470 added, always added last. 146 00:10:41,810 --> 00:10:46,200 Now, we went through a bit of a calculation for nucleic 147 00:10:46,200 --> 00:10:50,180 acids where there are four bases, and I pointed out to 148 00:10:50,180 --> 00:10:53,200 you that you could get a lot of combinatorics out of four 149 00:10:53,200 --> 00:10:56,620 bases if you had a polymer that was long enough. 150 00:10:56,620 --> 00:11:00,080 For proteins, the situation is actually 151 00:11:00,080 --> 00:11:02,810 dauntingly more complex. 152 00:11:02,810 --> 00:11:06,565 There are 20 different amino acids. 153 00:11:13,750 --> 00:11:24,470 So for a peptide, that's three amino acids, you would get 20 154 00:11:24,470 --> 00:11:34,760 to the third combinations or 8,000. 155 00:11:34,760 --> 00:11:45,060 Proteins can be greater than 1,000 amino acids in length.
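The counting argument above (20 amino acids, so 20 to the third, or 8,000, possible three-residue peptides) can be reproduced directly. This is a minimal sketch: the one-letter amino acid codes and the function name count_sequences are assumptions added for illustration, not something used in the lecture. It counts sequences by formula and also enumerates every tripeptide to confirm the total.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (illustrative alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_sequences(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length


# Enumerate every possible three-residue peptide, written N-terminal to C-terminal.
tripeptides = ["".join(combo) for combo in product(AMINO_ACIDS, repeat=3)]

print(len(tripeptides))        # enumerated count of tripeptides
print(count_sequences(3))      # same count from the formula 20**3
print(count_sequences(3, 4))   # three-base nucleic acid sequences, for comparison
```

Because the two ends of the chain differ, order matters in this count: "KEL" and "LEK" are listed as two separate peptides. For a chain of 1,000 amino acids, count_sequences(1000) returns 20 to the 1,000th power, far too many sequences to enumerate.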
156 00:11:45,060 --> 00:11:51,810 Proteins or peptides for little ones can range from two 157 00:11:51,810 --> 00:11:56,960 to greater than 1,000 amino acids. 158 00:11:56,960 --> 00:11:59,770 And so the number of combinations of information 159 00:11:59,770 --> 00:12:03,300 that you can get in proteins is enormous, and that means 160 00:12:03,300 --> 00:12:08,050 that most of protein space has not been explored by life and 161 00:12:08,050 --> 00:12:10,470 can be explored in the laboratory. 162 00:12:10,470 --> 00:12:13,700 And that's very exciting in thinking about the various 163 00:12:13,700 --> 00:12:18,840 functions that one can find that have not yet been found. 164 00:12:18,840 --> 00:12:22,870 Now, the thing that is different about nucleic acids 165 00:12:22,870 --> 00:12:25,670 and proteins apart from their fundamental chemical 166 00:12:25,670 --> 00:12:30,590 structure, is that nucleic acids, as we will discuss, are 167 00:12:30,590 --> 00:12:34,000 used as a linear more or less, they are 168 00:12:34,000 --> 00:12:35,930 read as a linear string. 169 00:12:35,930 --> 00:12:39,740 You start somewhere, and you read along the nucleic acid, 170 00:12:39,740 --> 00:12:42,700 and you get your information out, and that gives you some 171 00:12:42,700 --> 00:12:44,830 kind of a next step. 172 00:12:44,830 --> 00:12:49,820 Proteins, although they have a linear order, are not used in 173 00:12:49,820 --> 00:12:52,990 this kind of straight linear way. 174 00:12:52,990 --> 00:12:56,490 They are read once they have folded up into 175 00:12:56,490 --> 00:13:00,650 three-dimensional structures, and the folding of proteins 176 00:13:00,650 --> 00:13:05,860 into these 3D structures is intrinsic and essential for 177 00:13:05,860 --> 00:13:06,171 their function. 
178 00:13:06,171 --> 00:13:07,421 So protein folding -- 179 00:13:14,150 --> 00:13:16,910 Actually, let me before we do protein folding, I realize 180 00:13:16,910 --> 00:13:19,580 I've been remiss in giving you these slides. 181 00:13:19,580 --> 00:13:22,470 Here is something I drew for you on amino acid 182 00:13:22,470 --> 00:13:25,643 polymerization from your book, and here we are 183 00:13:25,643 --> 00:13:27,300 where we should be. 184 00:13:27,300 --> 00:13:32,320 Protein folding is required for function. 185 00:13:41,660 --> 00:13:46,580 The linear order of amino acids in the first peptide 186 00:13:46,580 --> 00:13:50,170 chain that is synthesized is called the primary structure. 187 00:13:57,060 --> 00:14:01,380 So primary structure refers to the linear 188 00:14:01,380 --> 00:14:05,650 order of amino acids. 189 00:14:11,250 --> 00:14:14,160 And these amino acids, of course, are held together by 190 00:14:14,160 --> 00:14:20,690 covalent bonds as in peptide bonds. 191 00:14:20,690 --> 00:14:24,220 But this primary structure, while absolutely essential for 192 00:14:24,220 --> 00:14:29,000 protein function, is not the functional protein. 193 00:14:29,000 --> 00:14:32,880 It folds up, this primary structure, this linear chain, 194 00:14:32,880 --> 00:14:36,780 folds up and folds again and folds again. 195 00:14:36,780 --> 00:14:40,850 The way I think of it is, if you took a piece of string or 196 00:14:40,850 --> 00:14:44,580 an old-fashioned telephone cord, and you folded it up, 197 00:14:44,580 --> 00:14:50,110 you start rolling it up into a roll, it will roll up, and 198 00:14:50,110 --> 00:14:53,810 it'll fold back on itself, and fold back on itself, and fold 199 00:14:53,810 --> 00:14:55,170 back on itself. 200 00:14:55,170 --> 00:14:57,770 And that's kind of the deal with proteins. 
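Primary structure, as defined above, is just the ordered list of amino acids from the amino end to the carboxy end. The sketch below models that under stated assumptions: the class name PrimaryStructure, its methods, and the three-letter residue names are hypothetical and chosen only for illustration. Each new residue is appended at the carboxy end, so the first residue added stays at the N-terminal and the most recent one sits at the C-terminal.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PrimaryStructure:
    """Linear order of amino acids, stored N-terminal first (illustrative model)."""

    residues: Tuple[str, ...] = ()

    def extend(self, residue: str) -> "PrimaryStructure":
        """Return a new chain with the residue added at the carboxy end."""
        return PrimaryStructure(self.residues + (residue,))

    @property
    def n_terminal(self) -> str:
        """The residue with the free amino group: the first one added."""
        return self.residues[0]

    @property
    def c_terminal(self) -> str:
        """The residue with the free carboxyl group: the most recently added."""
        return self.residues[-1]


chain = PrimaryStructure().extend("Lys").extend("Glu").extend("Leu")
reversed_chain = PrimaryStructure(tuple(reversed(chain.residues)))

print(chain.n_terminal, chain.c_terminal)
print(chain == reversed_chain)
```

The comparison at the end shows the point about directionality: the same residues read in the opposite order are a different primary structure.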
201 00:14:57,770 --> 00:15:00,540 So we have to consider something called secondary 202 00:15:00,540 --> 00:15:10,750 structure where the linear chain of amino acids folds 203 00:15:10,750 --> 00:15:12,310 back on itself. 204 00:15:12,310 --> 00:15:23,150 So the linear amino acid chain folds up, and it 205 00:15:23,150 --> 00:15:26,060 does so in two ways. 206 00:15:26,060 --> 00:15:32,810 It forms something called an alpha helix or it forms 207 00:15:32,810 --> 00:15:37,420 something called a beta sheet or a beta pleated sheet, and 208 00:15:37,420 --> 00:15:40,480 these are characteristic folding structures or folding 209 00:15:40,480 --> 00:15:44,360 patterns that are governed by hydrogen bonds. 210 00:15:46,930 --> 00:15:50,860 But that secondary structure of the protein is not the 211 00:15:50,860 --> 00:15:54,050 functional protein. 212 00:15:54,050 --> 00:15:59,180 Those folded hydrogen-bonded alpha helices or beta sheets 213 00:15:59,180 --> 00:16:06,910 fold on themselves again to form the tertiary structure of 214 00:16:06,910 --> 00:16:08,270 the protein. 215 00:16:08,270 --> 00:16:16,900 So this is more folding of the alpha 216 00:16:16,900 --> 00:16:24,270 helices or the beta sheets. 217 00:16:24,270 --> 00:16:28,060 And this tertiary structure can involve any number of 218 00:16:28,060 --> 00:16:29,840 different kinds of bonds. 219 00:16:29,840 --> 00:16:34,800 It can involve covalent bonds like sulfhydryl disulfide 220 00:16:34,800 --> 00:16:42,330 bonds, hydrogen bonds, hydrophobic bonds, and all of 221 00:16:42,330 --> 00:16:44,960 these things give a tertiary structure. 222 00:16:44,960 --> 00:16:49,720 All of them use one linear polypeptide or protein chain. 223 00:16:49,720 --> 00:16:52,610 But then there's a fourth folding that is required for 224 00:16:52,610 --> 00:16:54,720 the function of many proteins.
225 00:16:54,720 --> 00:17:02,670 This is the quaternary structure, and this refers to, 226 00:17:02,670 --> 00:17:11,504 so again, this is of the primary and secondary chain. 227 00:17:18,150 --> 00:17:22,270 The quaternary structure is an association between two 228 00:17:22,270 --> 00:17:27,020 different proteins in a non-covalent way, usually. 229 00:17:27,020 --> 00:17:46,370 So this is association between two different protein chains 230 00:17:46,370 --> 00:17:51,300 and two different protein chains that are in some kind 231 00:17:51,300 --> 00:17:54,230 of tertiary structure. 232 00:17:54,230 --> 00:17:57,520 And usually this association is non-covalent. 233 00:18:02,300 --> 00:18:08,450 The way protein structure is governed and proteins fold is 234 00:18:08,450 --> 00:18:11,480 mysterious, and very complicated, and not well 235 00:18:11,480 --> 00:18:12,600 understood. 236 00:18:12,600 --> 00:18:13,350 And there we go. 237 00:18:13,350 --> 00:18:16,000 Look at this interesting stuff you can see on my screen. 238 00:18:16,000 --> 00:18:17,140 Here we go. 239 00:18:17,140 --> 00:18:19,370 Here, not from your book because I think it's better 240 00:18:19,370 --> 00:18:23,490 from a different book, is a picture of an alpha helix, a 241 00:18:23,490 --> 00:18:31,330 beta sheet, tertiary structures of folded 242 00:18:31,330 --> 00:18:33,290 polypeptide chains. 243 00:18:33,290 --> 00:18:35,650 And here is a quaternary structure. 244 00:18:35,650 --> 00:18:38,930 And as we'll talk about on Friday, it's the quaternary 245 00:18:38,930 --> 00:18:41,890 structures that are these functional units called 246 00:18:41,890 --> 00:18:43,370 enzymes, and we'll stop there.