1 00:00:04,290 --> 00:00:05,942 PROFESSOR: Today, we continue-- we're 2 00:00:05,942 --> 00:00:07,150 going to do two things today. 3 00:00:07,150 --> 00:00:11,910 One is protein folding and the other is interlocked chains. 4 00:00:11,910 --> 00:00:13,610 So protein folding, we're continuing 5 00:00:13,610 --> 00:00:15,930 on where we left off last time, which was essentially 6 00:00:15,930 --> 00:00:18,700 looking at the mechanics, the mechanical models of protein 7 00:00:18,700 --> 00:00:20,880 folding, which was fixed angle chains. 8 00:00:23,700 --> 00:00:27,190 But basically ignoring all the forces that 9 00:00:27,190 --> 00:00:28,870 go into actual protein folding. 10 00:00:28,870 --> 00:00:30,830 Today, we're going to look at some 11 00:00:30,830 --> 00:00:35,680 of the very theoretical work on what 12 00:00:35,680 --> 00:00:38,450 forces cause proteins to fold and what 13 00:00:38,450 --> 00:00:43,250 you can say about algorithms and complexities of those problems. 14 00:00:43,250 --> 00:00:47,500 So obviously, a lot of people work on protein folding. 15 00:00:50,460 --> 00:00:53,480 On the practical side, people look a lot 16 00:00:53,480 --> 00:00:58,290 at defining energy functions that you minimize in order 17 00:00:58,290 --> 00:01:01,600 to find what we believe would be the right protein folding. 18 00:01:01,600 --> 00:01:03,320 There's a lot of energy functions around, 19 00:01:03,320 --> 00:01:06,790 and the key energies that-- forces people look at 20 00:01:06,790 --> 00:01:09,520 are torsion angle potentials, van der Waals interactions, 21 00:01:09,520 --> 00:01:13,670 hydrogen bonds, hydrophobicity, secondary structure propensity, 22 00:01:13,670 --> 00:01:15,544 and pairwise-specific interactions. 23 00:01:15,544 --> 00:01:18,210 And then they take some weighted combination of all those things 24 00:01:18,210 --> 00:01:19,780 and try to find one that matches up 25 00:01:19,780 --> 00:01:21,274 with what we see in real life. 
26 00:01:21,274 --> 00:01:22,690 The things we see in real life, we 27 00:01:22,690 --> 00:01:24,190 get by crystallizing proteins, which 28 00:01:24,190 --> 00:01:27,420 can take a really long time, like years of research. 29 00:01:27,420 --> 00:01:28,704 Then you get one 3D image. 30 00:01:28,704 --> 00:01:31,120 And that's what the Protein Data Bank collects, all those 3D 31 00:01:31,120 --> 00:01:32,250 images. 32 00:01:32,250 --> 00:01:34,386 People try to train against that data, but I'd say 33 00:01:34,386 --> 00:01:36,260 we really don't know what the right objective 34 00:01:36,260 --> 00:01:38,400 is for real protein folding. 35 00:01:38,400 --> 00:01:40,980 Given that, because I'm a theoretician, 36 00:01:40,980 --> 00:01:44,140 we're going to look at a super simple model 37 00:01:44,140 --> 00:01:46,830 that just models one of those features that I mentioned, 38 00:01:46,830 --> 00:01:48,740 which is hydrophobicity. 39 00:01:48,740 --> 00:01:54,160 Hydrophobia is fear of water. 40 00:01:54,160 --> 00:01:57,200 And remember we have this backbone of the protein, 41 00:01:57,200 --> 00:02:00,190 the various amino acids hanging off, some of those amino acids 42 00:02:00,190 --> 00:02:04,930 are hydrophobic, meaning they like to hide from water. 43 00:02:04,930 --> 00:02:09,229 Now usually, your proteins live in a big bath of fluid, mostly 44 00:02:09,229 --> 00:02:12,660 water, and so they're trying to hide from the surroundings, 45 00:02:12,660 --> 00:02:15,950 try to cluster on the inside of the folded shape, 46 00:02:15,950 --> 00:02:20,030 whereas other hydrophilic amino acids that like water, they 47 00:02:20,030 --> 00:02:21,806 tend to go around the outside. 48 00:02:21,806 --> 00:02:23,180 You can't always do this, but you 49 00:02:23,180 --> 00:02:26,220 try to minimize the amount to which the hydrophobic guys are 50 00:02:26,220 --> 00:02:28,850 on the outside and maximize how much the hydrophilic guys are 51 00:02:28,850 --> 00:02:29,970 on the outside. 
52 00:02:29,970 --> 00:02:33,640 In the HP model, there are hydrophobic and hydrophilic 53 00:02:33,640 --> 00:02:34,140 guys. 54 00:02:34,140 --> 00:02:40,740 And actually, we'll call H the hydrophobic 55 00:02:40,740 --> 00:02:41,983 and P the hydrophilic. 56 00:02:45,224 --> 00:02:46,890 It's very confusing because both of them 57 00:02:46,890 --> 00:02:50,340 have h and p in the name, but P also apparently stands 58 00:02:50,340 --> 00:02:55,170 for polar, polar opposite of the hydrophobic guys. 59 00:02:55,170 --> 00:02:57,420 Anyway, we just have these two kinds of nodes, 60 00:02:57,420 --> 00:02:59,350 and we just try to model that one 61 00:02:59,350 --> 00:03:01,690 feature, plus the bonds that link together 62 00:03:01,690 --> 00:03:04,020 the backbone of the chain. 63 00:03:04,020 --> 00:03:05,670 And so we're going to model a chain 64 00:03:05,670 --> 00:03:11,510 as just a sequence of nodes. 65 00:03:11,510 --> 00:03:13,440 We're going to fold that chain on a lattice. 66 00:03:16,090 --> 00:03:17,110 Something like this. 67 00:03:17,110 --> 00:03:17,710 Is it five? 68 00:03:17,710 --> 00:03:18,610 Yeah. 69 00:03:18,610 --> 00:03:23,670 And different ones of these nodes could be marked H or P. 70 00:03:23,670 --> 00:03:27,280 And we usually could use two colors to denote that. 71 00:03:27,280 --> 00:03:33,080 In this case, I made this guy H, this guy H, this guy H. 72 00:03:33,080 --> 00:03:37,500 The score of a fold-- so this is a folding. 73 00:03:37,500 --> 00:03:40,040 We're always going to fold on the square lattice 74 00:03:40,040 --> 00:03:42,480 today although people have looked at other lattices, 75 00:03:42,480 --> 00:03:45,430 in particular the 3D cubic lattice and a little bit 76 00:03:45,430 --> 00:03:47,490 on the triangular lattice. 
77 00:03:47,490 --> 00:03:50,310 I'll mention some reasons why the square lattice is 78 00:03:50,310 --> 00:03:53,760 a little bit artificial, but the general conjecture is 79 00:03:53,760 --> 00:03:55,887 it doesn't really matter which lattice you fold on. 80 00:03:55,887 --> 00:03:57,470 But you really want to fold on lattices 81 00:03:57,470 --> 00:04:00,120 because we're going to define the score in terms 82 00:04:00,120 --> 00:04:05,360 of how many adjacent H nodes there are on the lattice. 83 00:04:05,360 --> 00:04:12,710 Score is the number of-- call them H-H bonds. 84 00:04:12,710 --> 00:04:16,376 The intuition is if you have two hydrophobic nodes that 85 00:04:16,376 --> 00:04:18,250 are next to each other that means they're not 86 00:04:18,250 --> 00:04:22,566 next to the boundary, and so they're happier. 87 00:04:22,566 --> 00:04:24,190 There's a more formal argument for that, 88 00:04:24,190 --> 00:04:27,160 but this is an easy definition of score. 89 00:04:27,160 --> 00:04:30,510 You want to maximize the number of H-H nodes 90 00:04:30,510 --> 00:04:32,790 that are adjacent on the lattice. 91 00:04:32,790 --> 00:04:35,747 Usually, there are some that are adjacent along the chain. 92 00:04:35,747 --> 00:04:37,830 Those we don't care about because they will always 93 00:04:37,830 --> 00:04:38,600 be adjacent. 94 00:04:38,600 --> 00:04:41,800 You have to fold this thing so that consecutive nodes 95 00:04:41,800 --> 00:04:46,180 in the chain appear adjacent in the lattice. 96 00:04:46,180 --> 00:04:56,226 So we usually ignore what I would call H-H edges which 97 00:04:56,226 --> 00:04:57,850 are forced, so you can count them or not. 98 00:04:57,850 --> 00:05:00,666 It doesn't really matter. 99 00:05:00,666 --> 00:05:02,540 We want to maximize the number of these guys. 100 00:05:02,540 --> 00:05:05,860 For this chain, that's probably the maximum you can get. 
101 00:05:05,860 --> 00:05:10,870 In general, a typical node has two edge neighbors, 102 00:05:10,870 --> 00:05:13,630 and it has two other neighbors. 103 00:05:13,630 --> 00:05:17,740 And so at most, every guy has-- every H node 104 00:05:17,740 --> 00:05:20,230 has two bonds next to it, which means 105 00:05:20,230 --> 00:05:24,160 on average at most one bond per H node 106 00:05:24,160 --> 00:05:26,877 because we're double counting. 107 00:05:26,877 --> 00:05:27,710 So that's the model. 108 00:05:30,530 --> 00:05:36,590 The optimal folding has the maximum score. 109 00:05:41,640 --> 00:05:44,150 And I'm going to talk about the theory 110 00:05:44,150 --> 00:05:46,580 of this model in a moment. 111 00:05:46,580 --> 00:05:51,020 Some practice, you might say, is this picture. 112 00:05:51,020 --> 00:05:56,070 This is a heuristic or local optimization of the score. 113 00:05:56,070 --> 00:06:00,300 Here, red nodes are H nodes; that will hold throughout the slides. 114 00:06:00,300 --> 00:06:01,250 Red nodes are H nodes. 115 00:06:01,250 --> 00:06:02,560 Blue nodes are P nodes. 116 00:06:02,560 --> 00:06:04,340 That's a universal standard. 117 00:06:04,340 --> 00:06:06,170 You can see, in this case conveniently-- 118 00:06:06,170 --> 00:06:08,841 it may look like some random pattern of reds and blues, 119 00:06:08,841 --> 00:06:11,090 but you end up with all the blues pretty much covering 120 00:06:11,090 --> 00:06:13,340 the outside, and so the reds cluster together, 121 00:06:13,340 --> 00:06:15,300 and there's lots of H-H bonds. 122 00:06:15,300 --> 00:06:19,340 They are not drawn here, but-- So that's nice. 123 00:06:19,340 --> 00:06:22,510 And this looks like a typical protein 124 00:06:22,510 --> 00:06:25,192 in that it globs together. 125 00:06:25,192 --> 00:06:27,150 But of course, it's not capturing every feature 126 00:06:27,150 --> 00:06:31,040 of protein folding, just trying to capture one of them. 
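The score just defined can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `hp_score`, the string encoding of the chain ("H"/"P" characters), and the list of (x, y) lattice positions are all assumptions made for the example, and the chain is assumed to be open.

```python
def hp_score(sequence, positions):
    """Count H-H bonds of an open HP chain folded on the square lattice.

    sequence  -- string of 'H' and 'P' (assumed encoding)
    positions -- list of (x, y) integer lattice points, one per node
    A bond is a pair of H nodes adjacent on the lattice but NOT
    consecutive along the chain.
    """
    if len(sequence) != len(positions):
        raise ValueError("need one position per node")
    if len(set(positions)) != len(positions):
        raise ValueError("folding is not self-avoiding")
    for (ax, ay), (bx, by) in zip(positions, positions[1:]):
        if abs(ax - bx) + abs(ay - by) != 1:
            raise ValueError("consecutive nodes must be lattice neighbors")

    index_at = {p: i for i, p in enumerate(positions)}
    bonds = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Look only right and up so each adjacent pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                bonds += 1
    return bonds


# A four-node chain folded around a unit square: the two end H's touch.
print(hp_score("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))
```

Chain edges between consecutive H nodes are skipped by the `abs(i - j) > 1` test, matching the convention of ignoring H-H edges along the chain.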
127 00:06:31,040 --> 00:06:35,690 But pretending this is the whole story, 128 00:06:35,690 --> 00:06:40,880 what can you say about optimal folding in the HP model? 129 00:06:40,880 --> 00:06:42,980 Well sadly, it's NP-hard. 130 00:06:47,060 --> 00:06:50,000 So to find the actual optimal folding of a given 131 00:06:50,000 --> 00:06:52,990 string of red and blue nodes is NP-hard. 132 00:06:52,990 --> 00:06:55,010 It's NP-hard in the 3D cubic lattice. 133 00:06:55,010 --> 00:06:59,220 That was by two MIT professors, Bonnie Berger and Tom Leighton. 134 00:06:59,220 --> 00:07:02,180 And then it was proved NP-hard in two dimensions 135 00:07:02,180 --> 00:07:05,550 by a whole bunch of people, I guess, from Berkeley. 136 00:07:08,530 --> 00:07:10,370 So that's bad news. 137 00:07:10,370 --> 00:07:14,310 And to me, that's a sign that nature probably 138 00:07:14,310 --> 00:07:16,297 does not find the optimal folding. 139 00:07:16,297 --> 00:07:17,880 There are a couple possibilities here. 140 00:07:17,880 --> 00:07:21,240 Maybe there are extra forces beyond hydrophobicity 141 00:07:21,240 --> 00:07:24,160 that make it easier to find the optimal solution 142 00:07:24,160 --> 00:07:26,950 or maybe nature's just finding a local optimum, not 143 00:07:26,950 --> 00:07:29,590 a global optimum. 144 00:07:29,590 --> 00:07:32,360 Or maybe the proteins that exist in real life that 145 00:07:32,360 --> 00:07:35,610 have been evolved are designed so that local optima equal 146 00:07:35,610 --> 00:07:36,665 global optima. 147 00:07:36,665 --> 00:07:38,250 That's somewhat related. 148 00:07:38,250 --> 00:07:41,480 Somehow nature gets around this NP-hardness, I would think, 149 00:07:41,480 --> 00:07:43,110 because I believe nature's a computer. 150 00:07:43,110 --> 00:07:45,110 Maybe not everyone does. 151 00:07:45,110 --> 00:07:49,580 But it should be bounded by polynomial time computation. 152 00:07:49,580 --> 00:07:53,320 So unless P equals NP-- that's another possibility I guess. 
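To make the hardness concrete, here is a hedged brute-force sketch that finds an optimal folding of a short open chain by trying every self-avoiding walk on the square lattice. It is not an algorithm from the lecture; the name `best_fold` and the "H"/"P" string encoding are assumptions for illustration. The search tries up to three directions per node, so it is only usable for very short chains.

```python
def best_fold(sequence):
    """Exhaustive search over square-lattice foldings of an open HP chain.

    Returns (best_score, positions). Exponential time in len(sequence).
    """
    n = len(sequence)
    if n == 0:
        return 0, []
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    positions = [(0, 0)]          # node 0 is pinned at the origin
    occupied = {(0, 0): 0}        # lattice point -> node index
    best = [-1, None]

    def new_bonds(i, p):
        # H-H bonds created by placing node i at point p; the chain
        # neighbor i-1 is excluded because chain edges are not bonds.
        if sequence[i] != "H":
            return 0
        count = 0
        for dx, dy in steps:
            j = occupied.get((p[0] + dx, p[1] + dy))
            if j is not None and j < i - 1 and sequence[j] == "H":
                count += 1
        return count

    def extend(i, score):
        if i == n:
            if score > best[0]:
                best[0], best[1] = score, list(positions)
            return
        x, y = positions[-1]
        for dx, dy in steps:
            p = (x + dx, y + dy)
            if p in occupied:
                continue          # keep the walk self-avoiding
            gained = new_bonds(i, p)
            occupied[p] = i
            positions.append(p)
            extend(i + 1, score + gained)
            positions.pop()
            del occupied[p]

    extend(1, 0)
    return best[0], best[1]


score, fold = best_fold("HPPH")
print(score, fold)
```

Even with symmetry pruning, this kind of enumeration grows exponentially with chain length, which is why the rest of the discussion turns to approximation.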
153 00:07:53,320 --> 00:07:56,899 It's unlikely, to me, that nature is doing this. 154 00:07:56,899 --> 00:07:58,440 But maybe it's finding a local optimum. 155 00:07:58,440 --> 00:08:00,773 That seems to do pretty well in this particular example. 156 00:08:03,150 --> 00:08:04,090 AUDIENCE: Hey Erik? 157 00:08:04,090 --> 00:08:04,320 PROFESSOR: Yeah? 158 00:08:04,320 --> 00:08:05,580 AUDIENCE: Are there approximation algorithms? 159 00:08:05,580 --> 00:08:07,130 PROFESSOR: Good question. 160 00:08:07,130 --> 00:08:08,588 Are there approximation algorithms, 161 00:08:08,588 --> 00:08:10,100 that's exactly where I wanted to go. 162 00:08:10,100 --> 00:08:14,070 There are lots of constant factor approximations, 163 00:08:14,070 --> 00:08:16,850 and I will show you one of them. 164 00:08:16,850 --> 00:08:19,160 So I'm not going to show you these hardness proofs 165 00:08:19,160 --> 00:08:20,910 because they're a bit complicated. 166 00:08:20,910 --> 00:08:24,220 I don't have free diagrams of them. 167 00:08:24,220 --> 00:08:28,839 But the constant factor approximations, I do. 168 00:08:28,839 --> 00:08:29,630 Where are we going? 169 00:08:29,630 --> 00:08:32,280 Here. 170 00:08:32,280 --> 00:08:36,220 And so, in particular, the best approximation that's known 171 00:08:36,220 --> 00:08:38,019 is a 1/3 approximation. 172 00:08:42,760 --> 00:08:47,940 This is by MIT Ph.D. student [INAUDIBLE], 173 00:08:47,940 --> 00:08:52,490 and from several years ago-- eight years ago. 174 00:08:52,490 --> 00:08:54,410 I believe it's still the best. 175 00:08:54,410 --> 00:08:56,770 It's pretty simple, but quite elegant, 176 00:08:56,770 --> 00:08:58,822 looks like protein folding. 177 00:08:58,822 --> 00:09:00,780 So this is going to be within a factor of three 178 00:09:00,780 --> 00:09:02,410 of the best you could hope for. 
179 00:09:02,410 --> 00:09:05,330 We write 1/3 because we're always getting fewer bonds 180 00:09:05,330 --> 00:09:08,610 than you might want in a maximization problem. 181 00:09:08,610 --> 00:09:10,690 Some people call this a 3-approximation. 182 00:09:10,690 --> 00:09:14,720 I call it one or the other depending on the day of the week. 183 00:09:14,720 --> 00:09:19,935 So how could we achieve a 1/3 approximation? 184 00:09:25,240 --> 00:09:31,775 Well, before I get to this image, let me tell you. 185 00:09:31,775 --> 00:09:33,650 In general, when-- we haven't talked too much 186 00:09:33,650 --> 00:09:35,400 about approximation algorithms in this class, 187 00:09:35,400 --> 00:09:37,860 but a typical technique in approximation algorithms is you 188 00:09:37,860 --> 00:09:41,010 get some bound on what the optimal might be. 189 00:09:41,010 --> 00:09:45,210 I already told you there's at most one bond for each H node. 190 00:09:45,210 --> 00:09:49,340 And that is indeed-- that's roughly what we'll say. 191 00:09:49,340 --> 00:09:51,950 Just because of the degree argument, 192 00:09:51,950 --> 00:09:56,580 every guy has bond degree at most two. 193 00:09:56,580 --> 00:10:00,420 So that means, at most, one edge per H node. 194 00:10:00,420 --> 00:10:03,380 But we can be a little more precise on the square grid 195 00:10:03,380 --> 00:10:05,950 because there are two types of nodes. 196 00:10:05,950 --> 00:10:08,420 If you checkerboard color the square grid, 197 00:10:08,420 --> 00:10:11,440 there's even nodes and odd nodes. 198 00:10:11,440 --> 00:10:14,930 So in fact, we claim that the optimal solution, 199 00:10:14,930 --> 00:10:21,550 which we call OPT, is always at most twice the min 200 00:10:21,550 --> 00:10:27,096 of the number of even H's and the number of odd H's. 201 00:10:31,660 --> 00:10:35,100 This is slightly smaller than the number of H's. 
202 00:10:35,100 --> 00:10:38,060 If the number of even H's and odd H's are equal, 203 00:10:38,060 --> 00:10:39,900 then this will just be the number of H's. 204 00:10:39,900 --> 00:10:41,441 But if one is smaller than the other, 205 00:10:41,441 --> 00:10:43,322 this will be a little bit smaller 206 00:10:43,322 --> 00:10:44,530 or it could be a lot smaller. 207 00:10:44,530 --> 00:10:46,640 There might be, say, no even H's. 208 00:10:46,640 --> 00:10:49,910 The point is if you have an even parity H, 209 00:10:49,910 --> 00:10:53,300 and it somehow has a bond with some other H, 210 00:10:53,300 --> 00:10:57,410 well obviously, that H has different parity from this one. 211 00:10:57,410 --> 00:11:03,360 So only even and odd H's could possibly bond together. 212 00:11:03,360 --> 00:11:06,860 And then this is pretty obvious. 213 00:11:06,860 --> 00:11:08,760 Let's see if I can formalize it. 214 00:11:08,760 --> 00:11:13,940 So you have every bond defines an even and an odd. 215 00:11:13,940 --> 00:11:17,720 Every even and odd can only get hit up to twice. 216 00:11:17,720 --> 00:11:20,310 If there's more evens than odds, it's not going to help you. 217 00:11:20,310 --> 00:11:23,120 In the best case, you pair them up or alternate even odd, 218 00:11:23,120 --> 00:11:24,450 even odd, even odd. 219 00:11:24,450 --> 00:11:29,160 So you'll never be able to use more evens than odds, 220 00:11:29,160 --> 00:11:33,150 maybe one more, not really. 221 00:11:33,150 --> 00:11:35,719 And I think the best case is you do a cycle of even odd, 222 00:11:35,719 --> 00:11:36,510 even odd, even odd. 223 00:11:36,510 --> 00:11:39,380 Then, there's an even number of them-- each, 224 00:11:39,380 --> 00:11:40,920 an equal number of each. 225 00:11:40,920 --> 00:11:43,170 And so you throw away the excess. 226 00:11:43,170 --> 00:11:45,450 You end up with a min of the two. 
227 00:11:45,450 --> 00:11:51,450 But you get a factor of two because every even guy has 228 00:11:51,450 --> 00:11:56,052 two incident bonds in the best case, 229 00:11:56,052 --> 00:11:57,510 and then you're not double counting 230 00:11:57,510 --> 00:12:01,140 because you count this even guy, maybe you count this even guy, 231 00:12:01,140 --> 00:12:03,359 and they have disjoint bonds. 232 00:12:03,359 --> 00:12:05,150 So it's twice the number of evens or number 233 00:12:05,150 --> 00:12:07,800 of odds, whichever is smaller. 234 00:12:07,800 --> 00:12:10,050 So this is useful. 235 00:12:10,050 --> 00:12:12,900 It's just what is the best case you could hope for OPT? 236 00:12:12,900 --> 00:12:15,750 Now, what this 1/3 approximation will get 237 00:12:15,750 --> 00:12:18,120 is at least 1/3 this amount. 238 00:12:18,120 --> 00:12:20,040 If you know you get at least 1/3 this amount, 239 00:12:20,040 --> 00:12:22,170 you know you get at least 1/3 the optimal 240 00:12:22,170 --> 00:12:25,500 because this is an upper bound on the optimal. 241 00:12:25,500 --> 00:12:27,780 So that's what the algorithm will do. 242 00:12:31,920 --> 00:12:34,140 And so the general idea of the algorithm 243 00:12:34,140 --> 00:12:37,020 is to proceed in stages. 244 00:12:37,020 --> 00:12:42,945 At a typical stage-- So you pick some edge. 245 00:12:42,945 --> 00:12:45,690 I don't think it'll matter too much. 246 00:12:45,690 --> 00:12:47,830 Pick some edge of the chain, and then you 247 00:12:47,830 --> 00:12:50,220 start working your way down. 248 00:12:50,220 --> 00:12:54,920 And the general idea is you can make big loops off to the sides 249 00:12:54,920 --> 00:13:00,780 so that when you get back to the center, these are H nodes. 250 00:13:00,780 --> 00:13:02,030 So you get a bond there. 251 00:13:02,030 --> 00:13:03,380 That's the general idea. 252 00:13:03,380 --> 00:13:07,320 With that basic idea, you probably get about 1/4 253 00:13:07,320 --> 00:13:08,330 of this amount. 
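The parity upper bound on OPT stated above, twice the smaller of the even-H and odd-H counts, is easy to compute from the sequence alone. The sketch below is illustrative only: the function name `parity_upper_bound` and the "H"/"P" string encoding are assumptions, and parity is taken from the node's index along the chain, which matches the checkerboard coloring because consecutive nodes always sit on opposite colors. For an open chain the two end nodes have three free lattice neighbors rather than two, so a fully rigorous bound carries a small additive slack that this sketch, like the lecture's statement, ignores.

```python
def parity_upper_bound(sequence):
    """Return 2 * min(#even-index H's, #odd-index H's) for an HP string."""
    even_h = sum(1 for i, c in enumerate(sequence) if c == "H" and i % 2 == 0)
    odd_h = sum(1 for i, c in enumerate(sequence) if c == "H" and i % 2 == 1)
    return 2 * min(even_h, odd_h)


# All three H's sit at even indices, so no H-H bond is possible at all.
print(parity_upper_bound("HPHPH"))
```

A folding that achieves at least one third of this number is therefore within a factor of three of optimal, which is the target of the algorithm described next.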
254 00:13:08,330 --> 00:13:12,310 But we want to get 1/3, which is a little better. 255 00:13:12,310 --> 00:13:15,250 I won't try to analyze that formally. 256 00:13:15,250 --> 00:13:17,000 But there's a lot of flexibility here 257 00:13:17,000 --> 00:13:19,640 when we-- how long we make this branch. 258 00:13:19,640 --> 00:13:21,450 In order to make this work out, what 259 00:13:21,450 --> 00:13:29,110 I'd like is that the left branch has a lot of-- pick one-- a lot 260 00:13:29,110 --> 00:13:33,900 of even H's, and the right branch has a lot of odd H's, 261 00:13:33,900 --> 00:13:36,240 or vice versa because then we'll be 262 00:13:36,240 --> 00:13:37,790 able to make lots of the even guys 263 00:13:37,790 --> 00:13:39,270 match up with the odd guys. 264 00:13:39,270 --> 00:13:41,060 Here, we're fighting the parity. 265 00:13:41,060 --> 00:13:42,700 And parity is one of the reasons why 266 00:13:42,700 --> 00:13:44,992 I think the square grid is a little bit artificial. 267 00:13:44,992 --> 00:13:47,450 Let's say with the triangular grid, you wouldn't have that. 268 00:13:47,450 --> 00:13:50,490 There aren't just two classes. 269 00:13:50,490 --> 00:13:51,970 But here, we can get around parity 270 00:13:51,970 --> 00:13:55,310 by finding a break point, this edge, 271 00:13:55,310 --> 00:13:58,500 so that the number of even guys on the left 272 00:13:58,500 --> 00:14:01,350 is about the number of odd guys on the right. 273 00:14:01,350 --> 00:14:03,990 I won't prove that, but it's just an intermediate value 274 00:14:03,990 --> 00:14:05,230 argument. 275 00:14:05,230 --> 00:14:08,020 See, you get that as you walk counterclockwise 276 00:14:08,020 --> 00:14:11,040 along the chain, you have more even guys, 277 00:14:11,040 --> 00:14:15,360 and walk clockwise around the chain, you get more odd guys. 278 00:14:15,360 --> 00:14:18,080 Once you have that set up, you fall into four cases. 
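The intermediate value argument for the break point can be sketched as a single scan. This is an illustration under stated assumptions, not the lecture's procedure: it treats the chain as open (the lecture's picture walks around a closed chain), the name `balanced_breakpoint` and the "H"/"P" string encoding are hypothetical, and it only shows that a balanced cut exists, without claiming how many H nodes end up on each side. As the cut moves right, the quantity (even H's on the left) minus (odd H's on the right) starts at or below zero, ends at or above zero, and rises by at most one per step, so it must hit zero.

```python
def balanced_breakpoint(sequence):
    """Find a cut k with equal counts of even-index H's in sequence[:k]
    and odd-index H's in sequence[k:]. Returns (k, that common count)."""
    even_left = 0
    odd_right = sum(1 for i, c in enumerate(sequence) if c == "H" and i % 2 == 1)
    for k in range(len(sequence) + 1):
        if even_left == odd_right:
            return k, even_left
        # Move node k from the right part to the left part.
        if sequence[k] == "H":
            if k % 2 == 0:
                even_left += 1
            else:
                odd_right -= 1
    raise AssertionError("unreachable: the difference must cross zero")


print(balanced_breakpoint("HHHH"))
```

Swapping the roles of even and odd gives a second candidate cut; an implementation could try both and keep whichever leaves more usable H nodes.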
279 00:14:22,450 --> 00:14:25,802 So we're supposing at the top-- let me get to a case 280 00:14:25,802 --> 00:14:26,510 that I can reach. 281 00:14:26,510 --> 00:14:29,170 Here's a typical case. 282 00:14:29,170 --> 00:14:31,630 Suppose you just succeeded in making 283 00:14:31,630 --> 00:14:32,860 two H nodes join together. 284 00:14:32,860 --> 00:14:34,850 Maybe that was your first edge, whatever. 285 00:14:34,850 --> 00:14:38,030 So it might not be H-H. But now, we're proceeding down, 286 00:14:38,030 --> 00:14:40,795 and we want to align a bunch of H nodes. 287 00:14:40,795 --> 00:14:43,170 And we're going to do better than the picture I drew over 288 00:14:43,170 --> 00:14:45,590 here, which was just getting two H nodes to come together. 289 00:14:45,590 --> 00:14:48,020 I actually want to get four H nodes 290 00:14:48,020 --> 00:14:49,850 to come together with three bonds. 291 00:14:49,850 --> 00:14:52,900 That's what I will always achieve, at least that. 292 00:14:52,900 --> 00:14:53,590 Is that right? 293 00:14:53,590 --> 00:14:55,090 Actually, no, this is the good case. 294 00:14:55,090 --> 00:14:57,970 I won't always achieve quite that well, quite that good. 295 00:14:57,970 --> 00:14:59,060 But here's the idea. 296 00:14:59,060 --> 00:15:04,720 Suppose there's some reasonably large distance between this H 297 00:15:04,720 --> 00:15:06,560 node and the next one. 298 00:15:06,560 --> 00:15:10,960 I'm going to go down two steps and then go around however 299 00:15:10,960 --> 00:15:15,120 long it takes so that when I come back, I get to an H node, 300 00:15:15,120 --> 00:15:16,220 and same on the left side. 301 00:15:16,220 --> 00:15:18,136 So these two chains may have different lengths 302 00:15:18,136 --> 00:15:20,109 depending on when the next H node is. 
303 00:15:20,109 --> 00:15:21,900 Now, when I say the next H node, I actually 304 00:15:21,900 --> 00:15:23,610 mean the next odd H node on this side 305 00:15:23,610 --> 00:15:27,210 and the next even H node on this side so that they will line up. 306 00:15:27,210 --> 00:15:29,870 This is assuming that the distance from this H 307 00:15:29,870 --> 00:15:32,190 node to the next one is greater than 1, 308 00:15:32,190 --> 00:15:34,650 by parity that means it's at least three. 309 00:15:34,650 --> 00:15:37,220 So if it's three, it's going to be one, two, three. 310 00:15:37,220 --> 00:15:40,000 And that's it. 311 00:15:40,000 --> 00:15:41,276 And so go straight down. 312 00:15:41,276 --> 00:15:42,650 If it's more than three, you just 313 00:15:42,650 --> 00:15:44,983 divide that distance in half, and you go back and forth. 314 00:15:44,983 --> 00:15:47,190 And this sets the parity right. 315 00:15:47,190 --> 00:15:49,270 So that's the set up. 316 00:15:49,270 --> 00:15:51,250 Now, we have these two H nodes. 317 00:15:51,250 --> 00:15:52,980 Now, when are the next H nodes? 318 00:15:52,980 --> 00:15:55,440 Well, there's at least one blue node 319 00:15:55,440 --> 00:15:58,960 in between because of parity because we're looking again 320 00:15:58,960 --> 00:16:00,360 at odd H nodes on the right side, 321 00:16:00,360 --> 00:16:02,950 even H nodes on the left side. 322 00:16:02,950 --> 00:16:04,680 So maybe it's just one blue node, 323 00:16:04,680 --> 00:16:08,092 and we would go like this, and here we would go like that. 324 00:16:08,092 --> 00:16:09,550 Or if it's more than one blue node, 325 00:16:09,550 --> 00:16:11,215 we just lay them out like this. 326 00:16:11,215 --> 00:16:12,590 Goes to the appropriate distance, 327 00:16:12,590 --> 00:16:14,940 so that when we come back, we get a red. 328 00:16:14,940 --> 00:16:16,670 So we have two red H nodes here. 329 00:16:16,670 --> 00:16:19,500 I want to make two red H nodes here no matter what. 
330 00:16:19,500 --> 00:16:21,940 If it's just a gap of one, we would go straight, make 331 00:16:21,940 --> 00:16:25,650 a corner, otherwise, we do these big loops. 332 00:16:25,650 --> 00:16:27,570 And the result is I get these four H nodes 333 00:16:27,570 --> 00:16:32,810 in this pattern, just three bonds for four H nodes. 334 00:16:32,810 --> 00:16:35,320 And before we analyze what exactly that means, 335 00:16:35,320 --> 00:16:37,610 let me just tell you about the other cases. 336 00:16:37,610 --> 00:16:41,880 So this is the case when there is more than one blue node 337 00:16:41,880 --> 00:16:44,770 in between two H nodes. 338 00:16:44,770 --> 00:16:47,040 The other rows and columns are when 339 00:16:47,040 --> 00:16:48,600 the left side or the right side has 340 00:16:48,600 --> 00:16:51,780 just one blue node from that red guy. 341 00:16:51,780 --> 00:16:53,540 So let's go to this one for example. 342 00:16:53,540 --> 00:16:56,570 Here the left side has only one blue node to the next red guy. 343 00:16:56,570 --> 00:16:58,600 The right side has at least three. 344 00:16:58,600 --> 00:17:02,460 So then we'd go one, two, three, and we get there. 345 00:17:02,460 --> 00:17:04,196 If it's more than three, then we go 346 00:17:04,196 --> 00:17:05,846 and we make that big loop. 347 00:17:05,846 --> 00:17:07,220 Now we have these two red nodes. 348 00:17:07,220 --> 00:17:09,069 My goal now is to put a red node here. 349 00:17:09,069 --> 00:17:13,849 If this chain only has one blue node, I would make a corner. 350 00:17:13,849 --> 00:17:19,260 Otherwise, I'd loop around, and then I go onto the next step. 351 00:17:19,260 --> 00:17:21,890 Here, I only got two bonds for three nodes, 352 00:17:21,890 --> 00:17:24,860 so it's actually a little worse in ratio. 
353 00:17:24,860 --> 00:17:26,609 In the case where they're both length one, 354 00:17:26,609 --> 00:17:29,170 I just go straight down, make those two H nodes, 355 00:17:29,170 --> 00:17:31,310 and then I use the same trick as over here 356 00:17:31,310 --> 00:17:34,320 to get that nice zig-zag pattern. 357 00:17:34,320 --> 00:17:37,620 And this case is symmetric to this one, just flipped. 358 00:17:40,510 --> 00:17:50,460 So in all cases, I didn't skip anything. 359 00:17:50,460 --> 00:17:53,750 I did all of the even H nodes on the right side. 360 00:17:53,750 --> 00:17:56,080 I did all of the odd nodes, H nodes, on the left side. 361 00:17:59,570 --> 00:18:00,574 So I did skip something. 362 00:18:00,574 --> 00:18:02,490 I skipped all of the odd guys on the left side 363 00:18:02,490 --> 00:18:04,470 and all the even guys on the right side. 364 00:18:04,470 --> 00:18:07,400 So I skipped half of the H nodes. 365 00:18:07,400 --> 00:18:12,520 So I get at least two bonds-- I'm calling them 366 00:18:12,520 --> 00:18:25,830 H-H bonds-- for every three interesting H nodes. 367 00:18:28,600 --> 00:18:30,660 By interesting, I just mean even on the left, 368 00:18:30,660 --> 00:18:36,500 odd on the right, which is really for every six H nodes. 369 00:18:36,500 --> 00:18:41,440 And so we get a factor of 2/6, also known as 1/3. 370 00:18:46,200 --> 00:18:50,340 Really here I mean 6 even H nodes. 371 00:18:50,340 --> 00:18:51,610 Let's say. 372 00:18:51,610 --> 00:18:54,340 We used the same number of even and odds. 373 00:18:54,340 --> 00:18:57,340 And we had-- Is that right? 374 00:18:57,340 --> 00:18:57,840 Sorry. 375 00:18:57,840 --> 00:19:00,655 No, I guess I actually made H nodes here. 376 00:19:00,655 --> 00:19:02,780 I always get a little confused here because there's 377 00:19:02,780 --> 00:19:04,363 a factor of two, a factor of two here. 378 00:19:04,363 --> 00:19:07,600 But in the end we get 1/3 bound. 
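One way to tidy up the accounting above, offered as a hedged sketch rather than the lecture's exact derivation: write $E$ and $O$ for the numbers of even and odd H nodes, assume (hypothetically, ignoring rounding and the ends of the chain) that the break point leaves about $\min(E,O)/2$ interesting H nodes on each side, and use the lecture's count of at least two bonds for every three interesting H nodes. Combining this with the upper bound $\mathrm{OPT} \le 2\min(E,O)$ stated earlier gives the 1/3 ratio.

```latex
\[
\mathrm{ALG}
\;\ge\; \frac{2}{3}\cdot\underbrace{2\cdot\frac{\min(E,O)}{2}}_{\text{interesting H nodes}}
\;=\; \frac{2}{3}\,\min(E,O)
\;=\; \frac{1}{3}\cdot 2\min(E,O)
\;\ge\; \frac{1}{3}\,\mathrm{OPT}.
\]
```

The two factors of two that cause the confusion are visible here: one comes from having two sides of the fold, the other from the upper bound on OPT.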
379 00:19:11,030 --> 00:19:13,539 So that's the approximation algorithm, 380 00:19:13,539 --> 00:19:14,580 and that's the best known. 381 00:19:14,580 --> 00:19:15,455 It looks pretty nice. 382 00:19:15,455 --> 00:19:19,860 It's like beta sheets, if you've seen proteins folding. 383 00:19:19,860 --> 00:19:22,434 It depends, of course, on the spacing of the H nodes, 384 00:19:22,434 --> 00:19:24,600 but you get-- in particular, you get all the H nodes 385 00:19:24,600 --> 00:19:26,090 on the inside pretty much. 386 00:19:26,090 --> 00:19:29,560 I guess here is one exposed end. 387 00:19:29,560 --> 00:19:31,450 And here there's one exposed end. 388 00:19:31,450 --> 00:19:33,790 These two cases, we're doing especially well. 389 00:19:33,790 --> 00:19:37,424 Maybe in real proteins, the gaps are always more than one. 390 00:19:37,424 --> 00:19:38,340 I don't actually know. 391 00:19:40,659 --> 00:19:42,450 I think in reality there's different levels 392 00:19:42,450 --> 00:19:46,080 of hydrophobicity, so that's maybe not so well defined. 393 00:19:46,080 --> 00:19:48,154 But any questions? 394 00:19:48,154 --> 00:19:52,627 AUDIENCE: Is the exposed end considered the [INAUDIBLE] 395 00:19:52,627 --> 00:19:54,194 above the dashed line or is it-- 396 00:19:54,194 --> 00:19:54,860 PROFESSOR: Yeah. 397 00:19:54,860 --> 00:19:56,420 When I say exposed end, I mean here. 398 00:19:56,420 --> 00:20:00,267 There's an adjacency to the outside water. 399 00:20:00,267 --> 00:20:02,350 Now of course, there might actually be nodes here. 400 00:20:02,350 --> 00:20:03,974 Then, that would not be an exposed end. 401 00:20:03,974 --> 00:20:05,720 But if this just went straight up, 402 00:20:05,720 --> 00:20:08,610 which it might in this picture, then that 403 00:20:08,610 --> 00:20:09,660 would be an exposed end. 404 00:20:09,660 --> 00:20:11,970 I guess that might also be. 405 00:20:11,970 --> 00:20:13,300 No, probably not. 406 00:20:13,300 --> 00:20:15,300 Actually, interesting. 
407 00:20:15,300 --> 00:20:16,731 If these two guys are red, there's 408 00:20:16,731 --> 00:20:18,730 always a blue node to the left and right of them 409 00:20:18,730 --> 00:20:20,710 in this construction, so actually, I 410 00:20:20,710 --> 00:20:23,580 think that would not be exposed. 411 00:20:23,580 --> 00:20:25,520 That's cool. 412 00:20:25,520 --> 00:20:27,660 So actually, I think there are no exposed 413 00:20:27,660 --> 00:20:29,760 ends in this folding, except maybe at the very top 414 00:20:29,760 --> 00:20:31,710 and bottom. 415 00:20:31,710 --> 00:20:34,160 Except, we're only considering half of the H nodes, 416 00:20:34,160 --> 00:20:37,620 so there's the other half we have no idea about. 417 00:20:37,620 --> 00:20:41,080 So hopefully, nature chose the parity right. 418 00:20:41,080 --> 00:20:44,409 Or, if you think about this in the triangular case, which-- 419 00:20:44,409 --> 00:20:45,950 I don't think anyone's tried to adapt 420 00:20:45,950 --> 00:20:47,790 this algorithm to the triangular grid-- 421 00:20:47,790 --> 00:20:49,220 maybe you can get no exposed nodes. 422 00:20:49,220 --> 00:20:51,530 So that would be pretty neat. 423 00:20:51,530 --> 00:20:54,127 Of course, in reality, it probably does not fold on a lattice, 424 00:20:54,127 --> 00:20:55,210 but it's an approximation. 425 00:21:03,930 --> 00:21:10,470 Now, one of the observed features in protein folding 426 00:21:10,470 --> 00:21:13,440 is they tend to fold to the same shape. 427 00:21:13,440 --> 00:21:15,040 It's actually a hard thing to measure, 428 00:21:15,040 --> 00:21:17,910 but that's the general consensus, 429 00:21:17,910 --> 00:21:21,840 that at least most proteins in a non-diseased organism 430 00:21:21,840 --> 00:21:25,000 fold always to the same shape, at least 431 00:21:25,000 --> 00:21:26,650 in the same environment. 
432 00:21:26,650 --> 00:21:29,570 So if you want to model that in the HP model, 433 00:21:29,570 --> 00:21:33,421 and you pretend that, really, we are finding optimal solutions-- 434 00:21:33,421 --> 00:21:34,920 because it's a little hard to define 435 00:21:34,920 --> 00:21:36,859 what locally optimal means. 436 00:21:36,859 --> 00:21:38,650 But that would be an interesting direction. 437 00:21:38,650 --> 00:21:40,390 It would be nice to find proteins 438 00:21:40,390 --> 00:21:43,380 where locally optimal equals globally optimal, 439 00:21:43,380 --> 00:21:46,060 but lacking definitions of those terms, 440 00:21:46,060 --> 00:21:47,550 no one has tried to prove that. 441 00:21:47,550 --> 00:21:49,250 It could be a good direction. 442 00:21:49,250 --> 00:21:52,530 But if you believe in optimality, then at least 443 00:21:52,530 --> 00:21:56,135 you hope that the optima are unique. 444 00:21:56,135 --> 00:21:58,010 So you take a typical example of the optimal, 445 00:21:58,010 --> 00:22:01,830 like if they're all blue, then every folding is optimal. 446 00:22:01,830 --> 00:22:04,210 So there's exponentially many of them. 447 00:22:04,210 --> 00:22:06,970 But if you have any hope of getting unique folding, 448 00:22:06,970 --> 00:22:11,400 you'd really like unique optimal foldings. 449 00:22:11,400 --> 00:22:12,605 The theorem is these exist. 450 00:22:15,330 --> 00:22:19,190 In the 2D square grid, they exist 451 00:22:19,190 --> 00:22:29,409 for all even n, for closed chains-- mostly, 452 00:22:29,409 --> 00:22:31,450 we care about open chains, but for closed chains, 453 00:22:31,450 --> 00:22:34,210 you could prove these exist for all even n-- 454 00:22:34,210 --> 00:22:42,617 and for open chains, they exist for all doubly even n. 455 00:22:50,250 --> 00:22:53,680 This is a theorem by a bunch of people.
456 00:22:53,680 --> 00:22:56,490 Oswin Aichholzer, David Bremner, me, Hank Meijer, 457 00:22:56,490 --> 00:22:59,190 Vera Sacristan, and Michael Soss, 458 00:22:59,190 --> 00:23:02,640 who we talked about last time. 459 00:23:02,640 --> 00:23:04,580 So here they are for closed chains. 460 00:23:04,580 --> 00:23:08,950 It's a pretty simple example, basically alternating red blue, 461 00:23:08,950 --> 00:23:15,900 but at the end, you have two blue nodes to make the corner. 462 00:23:15,900 --> 00:23:17,410 That is not the optimal folding. 463 00:23:17,410 --> 00:23:18,750 The optimal folding is this. 464 00:23:22,610 --> 00:23:25,710 So there's only two exposed ends. 465 00:23:25,710 --> 00:23:28,370 There's one down here, and one on the top. 466 00:23:28,370 --> 00:23:30,760 So there's no surprises that it's optimal, a little less 467 00:23:30,760 --> 00:23:34,270 obvious that it's uniquely optimal. 468 00:23:34,270 --> 00:23:35,310 But there it is. 469 00:23:35,310 --> 00:23:37,605 And this works for any even n. 470 00:23:37,605 --> 00:23:43,270 In general, it depends a little bit on n mod four, so 471 00:23:43,270 --> 00:23:45,125 that's why we are drawing two pictures here, 472 00:23:45,125 --> 00:23:49,330 to check that it works for both congruences mod four. 473 00:23:49,330 --> 00:23:50,990 n has to be even for a closed chain 474 00:23:50,990 --> 00:23:53,510 on the grid, so this is really for all n 475 00:23:53,510 --> 00:23:55,240 for which there are closed chains: there 476 00:23:55,240 --> 00:23:59,650 are unique-- there are examples of HP chains 477 00:23:59,650 --> 00:24:02,320 that have unique optimal foldings. 478 00:24:02,320 --> 00:24:05,280 Here, it's just, do you end going horizontal or end going 479 00:24:05,280 --> 00:24:07,360 vertical? 480 00:24:07,360 --> 00:24:09,640 Not a big difference. 481 00:24:09,640 --> 00:24:16,970 I have a proof by picture that this is the optimal folding.
482 00:24:16,970 --> 00:24:19,980 So remember there are two-- we talk 483 00:24:19,980 --> 00:24:23,190 about H-H edges and H-H bonds. 484 00:24:23,190 --> 00:24:27,470 These are what we call P-P edges. 485 00:24:27,470 --> 00:24:30,640 And I remember when writing this paper, one of the co-authors, 486 00:24:30,640 --> 00:24:33,030 I won't say which one, saying call me juvenile, 487 00:24:33,030 --> 00:24:35,840 but I don't want to have a paper that talks about P-P edges. 488 00:24:35,840 --> 00:24:37,820 And I convinced him to keep it. 489 00:24:37,820 --> 00:24:40,820 So we have-- it's really only obvious 490 00:24:40,820 --> 00:24:44,090 when you say it out loud, which sadly I have to do here. 491 00:24:44,090 --> 00:24:46,720 But you have two of those edges, all blue edges, 492 00:24:46,720 --> 00:24:48,930 we might also call them. 493 00:24:48,930 --> 00:24:53,120 And they really should be on the boundary 494 00:24:53,120 --> 00:24:56,400 because you have to have exposed ends on the boundary, 495 00:24:56,400 --> 00:24:59,040 on the bounding box is what I mean. 496 00:24:59,040 --> 00:25:01,872 So if you put too many red guys on the bounding box, 497 00:25:01,872 --> 00:25:03,330 you're going to lose lots of bonds. 498 00:25:03,330 --> 00:25:06,150 So those two blue guys should be on the boundary. 499 00:25:06,150 --> 00:25:07,900 Some two edges have to be on the boundary, 500 00:25:07,900 --> 00:25:10,330 so it's best to have the two blue ones. 501 00:25:10,330 --> 00:25:13,710 So we could show you in optimal solutions, that is the case. 502 00:25:13,710 --> 00:25:15,890 Now, you decompose into two chains 503 00:25:15,890 --> 00:25:17,060 that connect the two ends. 504 00:25:17,060 --> 00:25:19,480 There's connecting this end to the left 505 00:25:19,480 --> 00:25:23,450 and connecting this end up and around to the other blue node 506 00:25:23,450 --> 00:25:24,350 at the end. 
507 00:25:24,350 --> 00:25:28,490 And if you check all of the red nodes on this side 508 00:25:28,490 --> 00:25:31,226 have even parity, let's say, and all the red nodes on this 509 00:25:31,226 --> 00:25:33,660 side have odd parity because it alternates. 510 00:25:33,660 --> 00:25:37,320 So there's no way that you could have 511 00:25:37,320 --> 00:25:38,990 bonds within one of those chains. 512 00:25:38,990 --> 00:25:40,795 That's not drawn here. 513 00:25:40,795 --> 00:25:42,920 The other thing is because these two blue edges are 514 00:25:42,920 --> 00:25:47,986 on the boundary, you cannot have a bond on the outside 515 00:25:47,986 --> 00:25:49,360 because there's no hope for that. 516 00:25:49,360 --> 00:25:53,560 You can only have bonds on the inside of that closed loop. 517 00:25:53,560 --> 00:25:57,480 Here, we're using that it's a closed loop. 518 00:25:57,480 --> 00:26:00,000 So you have bonds on the inside, and then you argue, well, 519 00:26:00,000 --> 00:26:02,980 because it's a bipartite graph and because it's planar, 520 00:26:02,980 --> 00:26:05,160 you can't have any crossings in these bonds 521 00:26:05,160 --> 00:26:08,170 because you're embedding this thing in the plane, 522 00:26:08,170 --> 00:26:11,570 really you basically have to alternate. 523 00:26:11,570 --> 00:26:14,220 There's actually two ways you could try to alternate. 524 00:26:14,220 --> 00:26:16,770 You start on this side or you start on this side. 525 00:26:16,770 --> 00:26:18,730 You try to realize them geometrically. 526 00:26:18,730 --> 00:26:20,980 When you do this alternation, you decompose your thing 527 00:26:20,980 --> 00:26:22,870 into squares. 528 00:26:22,870 --> 00:26:25,840 Only in one case does it work, and you get this picture. 529 00:26:25,840 --> 00:26:27,480 And then once you have the squares, 530 00:26:27,480 --> 00:26:29,692 there's a unique way to glue them together. 531 00:26:29,692 --> 00:26:30,900 So that's how the proof goes.
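A uniqueness claim like this can be sanity-checked on very short chains by brute force. The sketch below is not from the lecture or the paper: it is a minimal illustration, assuming open chains on the 2D square grid, written as strings of "H" and "P", with the first step fixed to the east so that rotations are not counted separately (mirror images still are). The function names are made up for this example.

```python
# Brute-force search over foldings of a short open HP chain on the square grid.
# Illustrative only: the running time grows exponentially with the chain length.

STEPS = ((1, 0), (0, 1), (-1, 0), (0, -1))


def foldings(n):
    """Yield every self-avoiding placement of n >= 2 nodes whose first step goes east."""
    def extend(path):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in path:
                path.append(nxt)
                yield from extend(path)
                path.pop()

    yield from extend([(0, 0), (1, 0)])


def count_bonds(chain, path):
    """Count H-H bonds: pairs of H nodes adjacent on the grid but not along the chain."""
    where = {p: i for i, p in enumerate(path)}
    bonds = 0
    for i, (x, y) in enumerate(path):
        if chain[i] != "H":
            continue
        # Look east and north only, so each adjacent pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = where.get((x + dx, y + dy))
            if j is not None and chain[j] == "H" and abs(i - j) > 1:
                bonds += 1
    return bonds


def optimal_foldings(chain):
    """Return (max number of bonds, list of foldings achieving it)."""
    best, winners = -1, []
    for path in foldings(len(chain)):
        b = count_bonds(chain, path)
        if b > best:
            best, winners = b, [path]
        elif b == best:
            winners.append(path)
    return best, winners


if __name__ == "__main__":
    best, winners = optimal_foldings("HPPH")
    print(best, len(winners))
```

For an all-P chain every folding ties at zero bonds, which is the "exponentially many optima" situation mentioned earlier; a chain has a unique optimal folding in this sketch when the winners list contains only one shape and its mirror image.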
532 00:26:30,900 --> 00:26:35,390 It's pretty easy once you get the right lemmas in place 533 00:26:35,390 --> 00:26:36,730 for closed chains. 534 00:26:36,730 --> 00:26:40,110 Let me show you what happens for open chains. 535 00:26:40,110 --> 00:26:44,930 Take the same example for open chains, and it's almost unique. 536 00:26:44,930 --> 00:26:49,730 In the doubly even case, like 16, it is unique. 537 00:26:49,730 --> 00:26:54,310 In the only singly even, so two mod four case, 538 00:26:54,310 --> 00:26:59,000 like 18, you have this issue. 539 00:26:59,000 --> 00:27:00,740 So the way we set up is we do not 540 00:27:00,740 --> 00:27:05,560 have the blue blue at the end to turn around. 541 00:27:05,560 --> 00:27:08,830 That is the best we know how to do. 542 00:27:08,830 --> 00:27:13,160 And there are these two foldings, 543 00:27:13,160 --> 00:27:15,190 how you might achieve it. 544 00:27:15,190 --> 00:27:20,920 So here, we wrap around and get a little triad of blues here. 545 00:27:20,920 --> 00:27:22,910 Because we have a red endpoint, we 546 00:27:22,910 --> 00:27:25,260 can actually get three bonds into it. 547 00:27:25,260 --> 00:27:27,595 Or, we can do the intended folding. 548 00:27:27,595 --> 00:27:31,920 Now, it's a pretty small change, so it's approximately unique, 549 00:27:31,920 --> 00:27:36,600 but that's the best we know for open chains, only singly even 550 00:27:36,600 --> 00:27:37,170 length. 551 00:27:37,170 --> 00:27:39,650 Or for odd length, we also don't know anything that's optimal. 552 00:27:39,650 --> 00:27:41,200 Of course there are things that are almost optimal. 553 00:27:41,200 --> 00:27:42,490 You just add a little end. 554 00:27:42,490 --> 00:27:48,790 Won't make a big difference, but interesting open problem. 555 00:27:48,790 --> 00:27:51,360 Also open, whether any of these unique results-- 556 00:27:51,360 --> 00:27:53,420 uniqueness results extend to other grids, 557 00:27:53,420 --> 00:27:54,410 like triangular grid.
558 00:27:54,410 --> 00:27:57,034 That would be nice because here we're really exploiting parity, 559 00:27:57,034 --> 00:27:58,330 and that's kind of cheating. 560 00:27:58,330 --> 00:28:00,580 Certainly, parity is not such a big deal in real life, 561 00:28:00,580 --> 00:28:01,800 but I don't know. 562 00:28:01,800 --> 00:28:04,030 Could be. 563 00:28:04,030 --> 00:28:07,070 The approximation algorithms generalize to other grids, 564 00:28:07,070 --> 00:28:10,390 not necessarily this one, but some constant factor 565 00:28:10,390 --> 00:28:13,180 approximations have been obtained. 566 00:28:13,180 --> 00:28:14,470 Uniqueness is open. 567 00:28:14,470 --> 00:28:16,570 The really cool open question here, I think, 568 00:28:16,570 --> 00:28:19,020 is protein design. 569 00:28:19,020 --> 00:28:21,390 HP model is simple enough we can think about it. 570 00:28:21,390 --> 00:28:25,430 And while folding-- finding the optimal folding of a given 571 00:28:25,430 --> 00:28:28,760 protein is NP-hard, if I get to design the protein, 572 00:28:28,760 --> 00:28:32,170 and say, well, I'd really like to fold into-- I don't know. 573 00:28:32,170 --> 00:28:36,040 The letter m-- how do I make a protein fold into the letter m, 574 00:28:36,040 --> 00:28:37,800 at least approximately? 575 00:28:37,800 --> 00:28:41,080 So ideally, it should still have a unique optimal folding, 576 00:28:41,080 --> 00:28:43,090 so we generalize this result, but it 577 00:28:43,090 --> 00:28:45,260 can match any target shape. 578 00:28:45,260 --> 00:28:47,490 Here, we're matching a diagonal line 579 00:28:47,490 --> 00:28:49,390 up to some constant factor resolution. 580 00:28:49,390 --> 00:28:51,570 Can we match any shape up to some constant factor 581 00:28:51,570 --> 00:28:52,120 resolution? 582 00:28:52,120 --> 00:28:53,734 I think the answer should be yes. 583 00:28:53,734 --> 00:28:56,150 And then you should be able to solve that in polynomial time.
584 00:28:56,150 --> 00:28:58,350 In the same way that origami design is a lot easier 585 00:28:58,350 --> 00:29:00,269 than origami foldability, here, I 586 00:29:00,269 --> 00:29:02,060 think protein design should be a lot easier 587 00:29:02,060 --> 00:29:04,960 than protein folding. 588 00:29:04,960 --> 00:29:06,460 But no one has worked on that. 589 00:29:06,460 --> 00:29:08,619 I think it's a really cool problem. 590 00:29:08,619 --> 00:29:10,660 Too bad I'm mentioning it so late in the semester, 591 00:29:10,660 --> 00:29:12,535 but we can work on it in the problem session. 592 00:29:15,800 --> 00:29:18,750 So ends my coverage of protein folding. 593 00:29:18,750 --> 00:29:20,862 Obviously, there's tons of work I'm not covering, 594 00:29:20,862 --> 00:29:22,820 but I'm focusing on the very algorithmic stuff. 595 00:29:22,820 --> 00:29:24,990 Any questions before we go? 596 00:29:24,990 --> 00:29:25,778 Yeah? 597 00:29:25,778 --> 00:29:27,770 AUDIENCE: In these models you have, are these 598 00:29:27,770 --> 00:29:31,320 supposed to equal number of P and H, 599 00:29:31,320 --> 00:29:34,160 can you get in situations where you're way 600 00:29:34,160 --> 00:29:37,342 overbeared with P or H, and then it's 601 00:29:37,342 --> 00:29:39,700 going to be harder to find an optimal folding. 602 00:29:39,700 --> 00:29:42,945 Or, is it not realistic because these 603 00:29:42,945 --> 00:29:47,094 are such-- these are small sets of the entire protein? 604 00:29:47,094 --> 00:29:47,760 PROFESSOR: Yeah. 605 00:29:47,760 --> 00:29:50,190 So the question is whether-- Does 606 00:29:50,190 --> 00:29:53,180 it matter the ratio between the number of H nodes and P nodes. 607 00:29:53,180 --> 00:29:57,724 I think the easy-- The short answer is I don't know. 608 00:29:57,724 --> 00:29:59,390 My guess is, in reality, the number of H's 609 00:29:59,390 --> 00:30:02,310 and P's are within a constant factor of each other, 610 00:30:02,310 --> 00:30:03,170 for some constant.
611 00:30:03,170 --> 00:30:05,020 I don't think the constant matters too much. 612 00:30:05,020 --> 00:30:06,853 Although if there's an overwhelming majority 613 00:30:06,853 --> 00:30:09,682 of H's or P's, the problem might become easier. 614 00:30:09,682 --> 00:30:11,390 My guess also is in the NP-hardness proof, 615 00:30:11,390 --> 00:30:13,800 although it's been a while since I read it, 616 00:30:13,800 --> 00:30:16,967 there's probably a constant factor again. 617 00:30:16,967 --> 00:30:19,050 But maybe you could show if there's-- for example, 618 00:30:19,050 --> 00:30:21,356 if there's a constant number of H nodes, 619 00:30:21,356 --> 00:30:23,230 I'll bet you can solve it in polynomial time. 620 00:30:23,230 --> 00:30:25,430 That's maybe-- That's a new open problem. 621 00:30:25,430 --> 00:30:26,570 I should write it down. 622 00:30:26,570 --> 00:30:27,710 Good question. 623 00:30:27,710 --> 00:30:31,686 AUDIENCE: If there is a wrong factor between them, 624 00:30:31,686 --> 00:30:34,171 wouldn't we see that in the model 625 00:30:34,171 --> 00:30:37,020 because it just wouldn't work? 626 00:30:37,020 --> 00:30:40,420 PROFESSOR: No, this model makes sense for any number of reds 627 00:30:40,420 --> 00:30:44,520 and H's and P's, but uniqueness is probably hard to obtain 628 00:30:44,520 --> 00:30:47,650 if you have a small number of H's or if you have a huge 629 00:30:47,650 --> 00:30:48,590 number of H's. 630 00:30:48,590 --> 00:30:50,596 If you don't have enough-- 631 00:30:50,596 --> 00:30:54,670 AUDIENCE: What if you have H's on the order of the square 632 00:30:54,670 --> 00:30:57,892 of the P's, then there are at least two H's? 633 00:30:57,892 --> 00:30:59,760 PROFESSOR: All right. 634 00:30:59,760 --> 00:31:01,940 AUDIENCE: Kind of makes sense, right?
635 00:31:01,940 --> 00:31:04,690 PROFESSOR: Yeah, actually we have 636 00:31:04,690 --> 00:31:07,960 a theorem along those lines, which is not published 637 00:31:07,960 --> 00:31:10,700 so that's why I'm not talking about it. 638 00:31:10,700 --> 00:31:13,700 You can achieve the number of H's as square the number 639 00:31:13,700 --> 00:31:15,010 of P's. 640 00:31:15,010 --> 00:31:17,260 I imagine that's the limit-- and still get uniqueness. 641 00:31:17,260 --> 00:31:18,450 I imagine that's the limit. 642 00:31:18,450 --> 00:31:19,949 We really ought to write up that result, 643 00:31:19,949 --> 00:31:24,399 but it's from seven years ago, so unlikely at this point. 644 00:31:24,399 --> 00:31:25,065 Other questions? 645 00:31:31,120 --> 00:31:43,780 So we go on to interlocked 3D chains, 646 00:31:43,780 --> 00:31:47,060 which is related in that it's in 3D and it's chains. 647 00:31:47,060 --> 00:31:50,280 So related to the mechanical protein folding 648 00:31:50,280 --> 00:31:52,810 we did last class, except here, mostly, 649 00:31:52,810 --> 00:31:54,810 I'm going to think about universal joints again, 650 00:31:54,810 --> 00:31:56,167 so I can do whatever I want. 651 00:31:56,167 --> 00:31:57,750 I don't have to hold the angles fixed. 652 00:31:57,750 --> 00:31:59,208 At the end, I'll mention how things 653 00:31:59,208 --> 00:32:02,515 change a little bit for when you have fixed angle chains. 654 00:32:05,340 --> 00:32:07,180 But universal's more fun. 655 00:32:07,180 --> 00:32:09,540 So the motivation here is we have 656 00:32:09,540 --> 00:32:12,870 our good friend the knitting needles example, 657 00:32:12,870 --> 00:32:14,560 or maybe our mortal enemy, depending 658 00:32:14,560 --> 00:32:16,130 on how you like to think of that. 659 00:32:16,130 --> 00:32:20,040 One, two, three, four, five. 660 00:32:20,040 --> 00:32:22,310 And remember-- never mind the edge lengths. 661 00:32:22,310 --> 00:32:24,330 I'm not going to worry about edge lengths here.
662 00:32:24,330 --> 00:32:26,865 With five bars, you can lock the chain. 663 00:32:30,760 --> 00:32:33,630 So an interesting question is what 664 00:32:33,630 --> 00:32:39,330 if I had fewer than five bars, say length 4 chains, 665 00:32:39,330 --> 00:32:42,310 but I had more than one of them. 666 00:32:42,310 --> 00:32:45,050 Now certainly, each chain will be unlocked. 667 00:32:45,050 --> 00:32:49,530 That's a known result by Cantarella and Johnston '96. 668 00:32:49,530 --> 00:32:50,610 '96, '98? 669 00:32:50,610 --> 00:32:52,350 '96. 670 00:32:52,350 --> 00:32:57,230 But what if I have two 4 chains, can they interlock? 671 00:32:57,230 --> 00:32:59,790 The answer turns out to be yes. 672 00:32:59,790 --> 00:33:02,280 The original motivation for this problem-- 673 00:33:02,280 --> 00:33:07,560 it's a neat question by itself, but the original motivation is 674 00:33:07,560 --> 00:33:10,750 something called Lubiw's problem. 675 00:33:10,750 --> 00:33:15,800 Many of you met Anna Lubiw, she's one of my Ph.D. advisors. 676 00:33:15,800 --> 00:33:19,360 Last year, she was here on sabbatical. 677 00:33:19,360 --> 00:33:21,420 And the question is the following-- 678 00:33:21,420 --> 00:33:25,100 so she posed this back in 2000, a long time ago. 679 00:33:25,100 --> 00:33:29,000 What is the minimum number of cuts 680 00:33:29,000 --> 00:33:34,000 you have to make in a 3D chain in order to unlock it? 681 00:33:49,690 --> 00:33:55,720 I say unlocked meaning we know there are locked chains. 682 00:33:55,720 --> 00:33:58,420 Suppose I want to make the configuration space connected, 683 00:33:58,420 --> 00:34:00,610 but I allow cheating and I allow cutting vertices. 684 00:34:00,610 --> 00:34:02,240 How many do I have to cut? 685 00:34:02,240 --> 00:34:06,070 And say the chain has n bars. 686 00:34:06,070 --> 00:34:08,909 Then, as a function of n, how many cuts do I need?
687 00:34:12,730 --> 00:34:15,630 And the answer at this point, in the worst case, 688 00:34:15,630 --> 00:34:19,870 we know is between about n over 4 and n over 2. 689 00:34:19,870 --> 00:34:30,190 So it's at least the floor of n minus 1 over 4. 690 00:34:30,190 --> 00:34:37,210 And it's at most the floor of n minus 3 over 2. 691 00:34:37,210 --> 00:34:39,350 And I'll prove to you both of those results. 692 00:34:39,350 --> 00:34:41,877 So we know roughly it's theta of n. 693 00:34:41,877 --> 00:34:43,460 It's within a constant factor of n, but we 694 00:34:43,460 --> 00:34:45,480 don't know exactly what the constant is, 695 00:34:45,480 --> 00:34:47,300 still a neat problem. 696 00:34:47,300 --> 00:34:50,530 The original motivation here is, well, maybe-- there's 697 00:34:50,530 --> 00:34:52,969 some theories that in real proteins, 698 00:34:52,969 --> 00:34:56,050 maybe they disconnect their bonds for a little while 699 00:34:56,050 --> 00:34:59,085 so they can pass through and make things cross. 700 00:34:59,085 --> 00:35:00,460 How many disconnections would you 701 00:35:00,460 --> 00:35:03,370 need because that might require a lot of energy? 702 00:35:03,370 --> 00:35:07,600 The answer is a lot if you have a long protein. 703 00:35:07,600 --> 00:35:09,580 In the worst case of course. 704 00:35:09,580 --> 00:35:12,440 Real proteins have unit edge lengths and almost equal angles, 705 00:35:12,440 --> 00:35:14,784 so maybe the answers change in that case, 706 00:35:14,784 --> 00:35:16,700 but given that we don't even know whether there's 707 00:35:16,700 --> 00:35:19,070 a locked chain in those situations, 708 00:35:19,070 --> 00:35:20,940 it's pretty hard for us to answer 709 00:35:20,940 --> 00:35:24,090 something more complicated like this. 710 00:35:24,090 --> 00:35:26,240 So let me start with the lower bound. 711 00:35:26,240 --> 00:35:29,245 You need to make at least n over 4 cuts almost. 712 00:35:31,770 --> 00:35:34,310 Any ideas how to do that before I show you the answer?
713 00:35:38,680 --> 00:35:41,701 Given the one tool we have, which is the knitting needles. 714 00:35:41,701 --> 00:35:44,674 AUDIENCE: Yeah, I mean if you [INAUDIBLE]. 715 00:35:44,674 --> 00:35:46,215 PROFESSOR: Just connect a whole bunch 716 00:35:46,215 --> 00:35:47,100 of knitting needles together. 717 00:35:47,100 --> 00:35:47,770 Yeah. 718 00:35:47,770 --> 00:35:49,497 Now, knitting needles has five edges, 719 00:35:49,497 --> 00:35:51,080 so that would naturally give n over 5. 720 00:35:51,080 --> 00:35:54,224 But if you're a little bit clever, you can share edges. 721 00:35:54,224 --> 00:35:56,390 If you share the long edges from one knitting needle 722 00:35:56,390 --> 00:36:02,780 to the next, then you get about 4 bars per knitting needle, 723 00:36:02,780 --> 00:36:05,740 except the very first bar which you have to count extra, and so 724 00:36:05,740 --> 00:36:08,480 that's the n minus 1 divided by 4. 725 00:36:08,480 --> 00:36:13,540 Now if you don't cut one of the vertices-- 726 00:36:13,540 --> 00:36:16,424 and here, we're considering vertex cuts-- 727 00:36:16,424 --> 00:36:17,840 if you don't cut one of those four 728 00:36:17,840 --> 00:36:19,450 vertices of that knitting needle, 729 00:36:19,450 --> 00:36:20,820 the whole thing will be locked because that's 730 00:36:20,820 --> 00:36:21,720 a knitting needle. 731 00:36:21,720 --> 00:36:24,220 So you have to cut one of those four, and one of those four, 732 00:36:24,220 --> 00:36:28,120 and one of those four just for the pieces to not be locked. 733 00:36:28,120 --> 00:36:31,070 The trouble is-- so the natural algorithm, if I give you 734 00:36:31,070 --> 00:36:35,940 some big complicated chain, is cut every fourth vertex. 735 00:36:35,940 --> 00:36:39,040 That we know does not work. 736 00:36:39,040 --> 00:36:44,350 I think I have to wait to see that one.
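The counting behind the lower bound can be written out explicitly. This is a sketch under the assumption stated in the lecture, that consecutive knitting needles share one long bar; the symbol k for the number of needles is introduced here for illustration. A chain of k needles then uses 5 bars for the first needle and 4 new bars for each later one, and the four interior vertices of different needles are disjoint, so each needle forces its own cut:

```latex
\[
  n \;=\; 5 + 4(k-1) \;=\; 4k + 1
  \qquad\Longrightarrow\qquad
  \#\text{cuts} \;\ge\; k \;=\; \frac{n-1}{4}.
\]
```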
737 00:36:44,350 --> 00:36:46,410 Let me tell you what's known about interlocking, 738 00:36:46,410 --> 00:36:48,570 and then we'll get back to Lubiw's problem. 739 00:36:48,570 --> 00:36:51,100 But at least we've proved this part. 740 00:36:51,100 --> 00:36:53,670 And next thing I want to do is prove this part. 741 00:36:53,670 --> 00:36:57,690 And to show you how we might do that, I 742 00:36:57,690 --> 00:36:58,830 think I need a whole board. 743 00:37:14,140 --> 00:37:21,742 We're going to think about an open chain of length 2-- 744 00:37:21,742 --> 00:37:24,210 let me make a little more room-- 2, 745 00:37:24,210 --> 00:37:41,220 3, 4 interlocking with-- start with an open chain of length 2, 746 00:37:41,220 --> 00:37:42,620 3, or 4. 747 00:37:42,620 --> 00:37:45,520 There's going to be other results, but it's a good start. 748 00:37:48,310 --> 00:37:55,127 So for 2, 2 chains, I claim are always separable. 749 00:37:55,127 --> 00:37:56,710 And I'm going to put a little asterisk 750 00:37:56,710 --> 00:37:58,650 to mean another thing is true. 751 00:37:58,650 --> 00:38:01,030 A 2 chain and a 3 chain is always separable. 752 00:38:01,030 --> 00:38:03,890 A 2 chain and a 4 chain is always separable. 753 00:38:03,890 --> 00:38:05,520 This matrix is symmetric. 754 00:38:11,136 --> 00:38:15,080 A 3 chain versus a 3 chain also is separable. 755 00:38:18,180 --> 00:38:19,805 But everything else, you can interlock. 756 00:38:24,610 --> 00:38:26,920 I guess I haven't defined interlock. 757 00:38:26,920 --> 00:38:28,660 Be a good idea. 758 00:38:28,660 --> 00:38:31,220 I call a bunch of-- a collection of chains 759 00:38:31,220 --> 00:38:34,276 separable if they can all fly to infinity away from each other. 760 00:38:34,276 --> 00:38:35,650 So the distances between them can 761 00:38:35,650 --> 00:38:37,858 get arbitrarily large, otherwise they're interlocked. 
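The table being drawn on the board can be summarized as a small lookup. This is only a restatement of the entries read out above for two open chains with 2, 3, or 4 bars each; the names CAN_INTERLOCK and can_interlock are made up for this sketch, and it says nothing about longer chains or about the starred extra-hairpin variants.

```python
# Pairs of open chains (number of bars, smaller first) that can interlock
# according to the table stated in the lecture. Every other pair drawn
# from {2, 3, 4} is always separable.
CAN_INTERLOCK = {(3, 4), (4, 4)}


def can_interlock(a, b):
    """Return True if an open a-chain and an open b-chain can interlock (2 <= a, b <= 4)."""
    if not (2 <= a <= 4 and 2 <= b <= 4):
        raise ValueError("the table only covers chains with 2, 3, or 4 bars")
    # The table is symmetric, so normalize the order of the pair.
    return (min(a, b), max(a, b)) in CAN_INTERLOCK


if __name__ == "__main__":
    for a in (2, 3, 4):
        print(a, [can_interlock(a, b) for b in (2, 3, 4)])
```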
762 00:38:41,000 --> 00:38:43,000 And usually, that will mean that no pair of them 763 00:38:43,000 --> 00:38:44,330 can fly away from each other. 764 00:38:44,330 --> 00:38:45,350 Here, we're just thinking about two 765 00:38:45,350 --> 00:38:47,391 at once, so it's either separable or interlocked. 766 00:38:50,120 --> 00:38:53,230 So this is a worry if you're thinking 767 00:38:53,230 --> 00:38:54,700 about cutting this apart. 768 00:38:54,700 --> 00:38:57,575 You cut every fourth bar-- fourth vertex, 769 00:38:57,575 --> 00:38:59,380 if I get the counting right, I think 770 00:38:59,380 --> 00:39:02,720 that means you have length 4 chains connecting them. 771 00:39:02,720 --> 00:39:05,170 Now, we know length 4 chains can't be locked, 772 00:39:05,170 --> 00:39:06,880 but they could be interlocked. 773 00:39:06,880 --> 00:39:09,170 And so you could actually build an example 774 00:39:09,170 --> 00:39:12,420 where if you do the simple pattern of every fourth cut, 775 00:39:12,420 --> 00:39:13,895 you will get interlocking. 776 00:39:13,895 --> 00:39:17,260 It's still open whether you could be clever and still 777 00:39:17,260 --> 00:39:19,800 only make about n over 4 cuts and get everything 778 00:39:19,800 --> 00:39:22,740 to separate, but to solve that, you'd have to get around 779 00:39:22,740 --> 00:39:26,330 this boundary, this annoying situation, which we don't fully 780 00:39:26,330 --> 00:39:27,150 have characterized. 781 00:39:27,150 --> 00:39:29,420 We just have examples. 782 00:39:29,420 --> 00:39:31,300 And we also thought about-- and we'll 783 00:39:31,300 --> 00:39:34,680 get to this later-- if you try to interlock 784 00:39:34,680 --> 00:39:36,650 an open chain with a closed chain. 785 00:39:36,650 --> 00:39:38,350 With two closed chains, it's trivial. 786 00:39:38,350 --> 00:39:42,800 You can just make a knot, or make a link, 787 00:39:42,800 --> 00:39:44,990 I guess is the proper term.
788 00:39:44,990 --> 00:39:48,915 But an open versus a closed is more interesting, especially 789 00:39:48,915 --> 00:39:52,710 if that closed guy is not itself a knot. 790 00:39:52,710 --> 00:39:56,640 I think up to length 5. 791 00:39:56,640 --> 00:39:58,930 Yeah, I think the smallest, locked, 792 00:39:58,930 --> 00:40:01,350 closed chain is of length 6. 793 00:40:01,350 --> 00:40:06,050 Never showed it to you, but that is a known fact. 794 00:40:06,050 --> 00:40:09,290 So up to 5, we know this thing is not by itself locked, 795 00:40:09,290 --> 00:40:12,150 so it's not really cheating compared to an open chain. 796 00:40:12,150 --> 00:40:16,750 Then, the answer is separable, separable, separable. 797 00:40:16,750 --> 00:40:19,920 It's really hard to lock things with a 2 chain. 798 00:40:19,920 --> 00:40:23,162 2 chains, we call hairpins. 799 00:40:23,162 --> 00:40:24,170 That's a 2 chain. 800 00:40:32,649 --> 00:40:33,940 Everything else is interlocked. 801 00:40:38,870 --> 00:40:41,595 Basically, the same results hold whether one of your chains 802 00:40:41,595 --> 00:40:47,272 is open or closed because notice there's a shift here. 803 00:40:47,272 --> 00:40:48,980 The smallest closed chains have length 3, 804 00:40:48,980 --> 00:40:50,850 and the smallest open chain is really of length 2. 805 00:40:50,850 --> 00:40:52,510 I guess you could consider single bars, 806 00:40:52,510 --> 00:40:54,170 but they don't usually do much. 807 00:40:54,170 --> 00:40:57,250 You can just slide them out parallel to themselves. 808 00:40:57,250 --> 00:40:58,770 So there's a shift in the indices 809 00:40:58,770 --> 00:41:01,785 here, and that's why there's a shift in the matrix. 810 00:41:01,785 --> 00:41:02,410 Sounds ominous. 811 00:41:02,410 --> 00:41:06,560 A shift in the matrix, what will we do? 812 00:41:06,560 --> 00:41:08,410 But they're basically the same. 813 00:41:08,410 --> 00:41:11,430 So I'm going to show most of these results to you.
814 00:41:11,430 --> 00:41:15,850 The stars mean that even if I give you-- get this right-- 815 00:41:15,850 --> 00:41:18,930 even if I give you a whole bunch more hairpins-- so 816 00:41:18,930 --> 00:41:20,860 for this result, it's the most interesting. 817 00:41:20,860 --> 00:41:25,640 I take a 3 chain, a 3 chain, and then a million hairpins, 818 00:41:25,640 --> 00:41:27,240 a million 2 chains. 819 00:41:27,240 --> 00:41:29,462 Still, they can separate. 820 00:41:29,462 --> 00:41:30,920 And all the results with stars, you 821 00:41:30,920 --> 00:41:32,336 can add arbitrarily many hairpins. 822 00:41:35,950 --> 00:41:38,000 These two are the most interesting, a 4 chain 823 00:41:38,000 --> 00:41:39,860 with any number of hairpins, two 824 00:41:39,860 --> 00:41:42,030 3 chains with any number of hairpins. 825 00:41:42,030 --> 00:41:44,494 These results do not hold or we don't 826 00:41:44,494 --> 00:41:45,660 know how to prove it anyway. 827 00:41:48,814 --> 00:41:49,605 That's the summary. 828 00:41:52,560 --> 00:41:56,950 There's one question that's not answered by this table 829 00:41:56,950 --> 00:41:59,420 because it's how low can you go and get interlocking. 830 00:41:59,420 --> 00:42:02,300 This is saying, well, once you get to a 3 chain and a 4 chain, 831 00:42:02,300 --> 00:42:03,050 you can interlock. 832 00:42:03,050 --> 00:42:04,580 It's the minimum. 833 00:42:04,580 --> 00:42:09,290 But what if you have a bunch of 2 chains or even just one? 834 00:42:09,290 --> 00:42:12,610 Can I interlock it with some chain of some length? 835 00:42:12,610 --> 00:42:14,890 Let's say, I mean up to 4 is the most interesting 836 00:42:14,890 --> 00:42:17,264 because once you get to 5, you can have knitting needles. 837 00:42:17,264 --> 00:42:21,930 But suppose I take longer open chains down here, 838 00:42:21,930 --> 00:42:24,180 but I require that they're not themselves locked. 839 00:42:24,180 --> 00:42:28,070 Can I interlock it with a hairpin?
840 00:42:28,070 --> 00:42:31,555 It has practical applications to hair I guess. 841 00:42:34,090 --> 00:42:36,410 And the answer is yes. 842 00:42:36,410 --> 00:42:41,290 So this first example is a 16 chain. 843 00:42:41,290 --> 00:42:44,581 That's the gray part. 844 00:42:44,581 --> 00:42:45,080 Wow! 845 00:42:45,080 --> 00:42:46,740 It's hard to even follow the ends. 846 00:42:46,740 --> 00:42:47,860 Here's one end. 847 00:42:47,860 --> 00:42:51,690 And here's another end and 16 bars in the middle. 848 00:42:51,690 --> 00:42:54,740 And then the red thing is a 2 chain, 849 00:42:54,740 --> 00:42:59,210 and this is a model of that thing, physical model. 850 00:42:59,210 --> 00:43:00,830 And it's locked. 851 00:43:00,830 --> 00:43:02,900 Sorry, it's interlocked, I should say. 852 00:43:02,900 --> 00:43:05,640 I'm pretty sure you take either of the chains, 853 00:43:05,640 --> 00:43:07,410 obviously with the hairpin, but also 854 00:43:07,410 --> 00:43:09,980 for the really complicated 16 chain, 855 00:43:09,980 --> 00:43:11,130 it will unfold by itself. 856 00:43:11,130 --> 00:43:14,310 But with a hairpin, it is held open at that angle 857 00:43:14,310 --> 00:43:16,850 and can't unfold. 858 00:43:19,710 --> 00:43:25,820 The best bound so far is an 11 chain versus a 2 chain. 859 00:43:25,820 --> 00:43:27,775 It's conjectured to be optimal. 860 00:43:27,775 --> 00:43:29,430 There's various reasons to believe 861 00:43:29,430 --> 00:43:32,540 at least this interlocking trick where you-- basically, 862 00:43:32,540 --> 00:43:37,350 all the hairpin can do is hold an angle open or try to. 863 00:43:37,350 --> 00:43:41,510 And this seems to be the smallest way to interlock it 864 00:43:41,510 --> 00:43:44,990 up there and the smallest way to interlock it on either side. 865 00:43:44,990 --> 00:43:46,110 And these are zooms. 866 00:43:46,110 --> 00:43:47,490 This is of one of these corners. 867 00:43:47,490 --> 00:43:48,814 This is of the top part.
868 00:43:48,814 --> 00:43:50,230 So those, by themselves, if you're 869 00:43:50,230 --> 00:43:52,980 trying to do these three interlock tricks, 870 00:43:52,980 --> 00:43:55,610 these are minimal, we think. 871 00:43:55,610 --> 00:43:58,580 But as an assembly, it's conjectured to be optimal, 872 00:43:58,580 --> 00:44:00,080 but we don't know how to prove that. 873 00:44:00,080 --> 00:44:01,746 But at least these guys are interlocked, 874 00:44:01,746 --> 00:44:03,670 so 11 chain with a 2 chain. 875 00:44:08,700 --> 00:44:12,030 So those don't fit on the table because the table's not 876 00:44:12,030 --> 00:44:13,650 large enough. 877 00:44:13,650 --> 00:44:18,450 But let me start with this result. 878 00:44:18,450 --> 00:44:22,790 So the easiest thing to do is show 879 00:44:22,790 --> 00:44:33,150 that if I have any finite set of hairpins, 2 chains, 880 00:44:33,150 --> 00:44:34,195 they never interlock. 881 00:44:38,494 --> 00:44:40,660 This is actually an old result although it was never 882 00:44:40,660 --> 00:44:45,430 phrased this way until earlier this decade, 883 00:44:45,430 --> 00:44:55,030 I guess, by de Bruijn, a famous Dutch mathematician from 1954. 884 00:44:55,030 --> 00:44:58,360 He proved essentially this. 885 00:44:58,360 --> 00:45:00,500 He proved a simpler version which was then 886 00:45:00,500 --> 00:45:04,580 generalized in 1984 by Robert Dawson in Canada. 887 00:45:04,580 --> 00:45:08,720 And they proved something more general. 888 00:45:08,720 --> 00:45:13,250 They said, well-- I will say a hairpin is star-shaped, 889 00:45:13,250 --> 00:45:17,550 meaning if you were right here, here's your eye at this point, 890 00:45:17,550 --> 00:45:19,690 you can see the entire interior of the shape 891 00:45:19,690 --> 00:45:21,270 without leaving the shape. 892 00:45:21,270 --> 00:45:24,120 So that's a star-shaped figure.
893 00:45:24,120 --> 00:45:26,980 And these guys proved that for star-shaped figures, 894 00:45:26,980 --> 00:45:28,980 there's a way to always separate them 895 00:45:28,980 --> 00:45:31,400 without even changing any of the angles. 896 00:45:31,400 --> 00:45:33,490 There's no folding involved. 897 00:45:33,490 --> 00:45:35,490 You could think of rigid 2 chains. 898 00:45:35,490 --> 00:45:37,760 They can all fly apart from each other. 899 00:45:37,760 --> 00:45:39,400 How do you do it? 900 00:45:39,400 --> 00:45:40,610 Some fun geometry. 901 00:45:40,610 --> 00:45:47,982 You say, I have my favorite 0, 0, 0 in three space. 902 00:45:47,982 --> 00:45:49,940 I just want to scale everything from that point, 903 00:45:49,940 --> 00:45:52,540 make space bigger. 904 00:45:52,540 --> 00:45:55,690 What will happen to these poor little chains? 905 00:45:55,690 --> 00:45:56,944 They get bigger. 906 00:45:56,944 --> 00:45:57,735 They'll get longer. 907 00:46:00,620 --> 00:46:04,510 As they get longer, just cut them off again. 908 00:46:04,510 --> 00:46:06,340 But the result is that these points 909 00:46:06,340 --> 00:46:08,690 will fly away from each other. 910 00:46:08,690 --> 00:46:10,974 And so all of these 2 chains are going to separate. 911 00:46:10,974 --> 00:46:13,390 The lengths get longer, but I can always cut them shorter. 912 00:46:13,390 --> 00:46:15,215 And if I scale space, obviously if I 913 00:46:15,215 --> 00:46:16,756 didn't have self-intersection before, 914 00:46:16,756 --> 00:46:18,630 I still won't have self-intersection. 915 00:46:18,630 --> 00:46:19,930 So it's a fun motion. 916 00:46:19,930 --> 00:46:22,310 We tried to draw it in the book. 917 00:46:22,310 --> 00:46:24,440 It's a little hard to draw. 918 00:46:24,440 --> 00:46:26,930 So we have the red, green, and blue-- hopefully, 919 00:46:26,930 --> 00:46:31,330 you're not colorblind-- hairpins hanging out. 920 00:46:31,330 --> 00:46:33,230 They're not hitting each other.
921 00:46:33,230 --> 00:46:34,360 Here's the origin. 922 00:46:34,360 --> 00:46:37,370 We scale everything, so the blue guy goes to the dashed blue. 923 00:46:37,370 --> 00:46:39,390 Green goes to the dashed green. 924 00:46:39,390 --> 00:46:40,790 It looks weird. 925 00:46:40,790 --> 00:46:42,410 It's all over the place. 926 00:46:42,410 --> 00:46:44,160 And you're stretching from this point, 927 00:46:44,160 --> 00:46:47,230 so as you go farther away, it's bigger. 928 00:46:47,230 --> 00:46:49,690 Red guy expands to the dashed red guy. 929 00:46:49,690 --> 00:46:52,372 What's not shown here is then you clip the red things. 930 00:46:52,372 --> 00:46:53,830 Because everything gets longer, you 931 00:46:53,830 --> 00:46:55,580 can always clip to the original lengths. 932 00:46:55,580 --> 00:46:57,510 And that's a motion of these chains. 933 00:46:57,510 --> 00:46:58,760 You're not changing the angle. 934 00:46:58,760 --> 00:47:00,930 Scaling preserves angles. 935 00:47:00,930 --> 00:47:03,830 And everything flies away from each other. 936 00:47:03,830 --> 00:47:07,060 So a very nice trick from a long time ago. 937 00:47:10,760 --> 00:47:14,640 Once you get to 3 chains, this doesn't work. 938 00:47:14,640 --> 00:47:19,760 And so if we want to prove this result, which 939 00:47:19,760 --> 00:47:23,850 would subsume these guys, we need to do a little more work. 940 00:47:23,850 --> 00:47:26,520 Although a very similar trick works. 941 00:47:26,520 --> 00:47:34,625 So let me show you why two open 3 chains separate. 942 00:47:39,330 --> 00:47:44,160 Even if you add arbitrarily many extra hairpins-- that's 943 00:47:44,160 --> 00:47:47,640 the star-- Two 3 chains always separate. 944 00:47:57,310 --> 00:47:59,270 I guess suppose we're in a generic situation. 945 00:47:59,270 --> 00:48:01,210 Nothing aligns that shouldn't. 946 00:48:01,210 --> 00:48:05,230 I have one 3 chain, or something like this. 947 00:48:05,230 --> 00:48:06,592 And I have another one.
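[Editorial sketch, not part of the lecture.] The hairpin scaling motion described above can be written out in a few lines. Scaling space by a factor t about the origin sends each apex p to t*p and keeps every bar direction, so clipping the stretched bars back to their original lengths amounts to translating each rigid hairpin by (t - 1)*p. The names `bar_lengths` and `expand_and_clip`, and the coordinates of `red` and `blue`, are assumptions made up for this illustration.

```python
import numpy as np

# Each hairpin (rigid 2 chain) is stored as three vertices: end, apex, end.
# The coordinates below are made-up illustration data, not from the lecture.
red = np.array([[2.0, 1.0, 0.0], [1.0, 1.0, 0.0], [1.0, 2.0, 1.0]])
blue = np.array([[-1.0, 0.5, 2.0], [-1.0, -0.5, 1.0], [0.0, -0.5, 3.0]])


def bar_lengths(chain):
    """Lengths of the consecutive bars of a chain."""
    return np.linalg.norm(np.diff(chain, axis=0), axis=1)


def expand_and_clip(hairpin, t):
    """Scale space by t >= 1 about the origin, then clip both bars
    back to their original lengths, measured from the apex.

    Scaling keeps bar directions, so the clipped hairpin is the original
    hairpin translated so that its apex sits at t * apex.
    """
    apex = hairpin[1]
    return hairpin + (t - 1.0) * apex


for t in (1.0, 2.0, 4.0):
    moved_red = expand_and_clip(red, t)
    moved_blue = expand_and_clip(blue, t)
    gap = np.linalg.norm(moved_red[1] - moved_blue[1])
    print(f"t={t}: apex distance {gap:.3f}, red bar lengths {bar_lengths(moved_red)}")
```

The clipped hairpin is a subset of the scaled hairpin, which is why no new intersections can appear, while the distance between any two distinct apexes grows linearly in t.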
948 00:48:06,592 --> 00:48:07,675 Let's make it interesting. 949 00:48:12,670 --> 00:48:14,050 Is that possible? 950 00:48:14,050 --> 00:48:15,020 Maybe. 951 00:48:15,020 --> 00:48:16,130 Something like that. 952 00:48:16,130 --> 00:48:18,886 Probably you want these guys to be pretty long. 953 00:48:18,886 --> 00:48:21,200 Do some crazy whatever. 954 00:48:21,200 --> 00:48:25,680 That is two 3 chains if I drew it right. 955 00:48:25,680 --> 00:48:27,780 And the claim is you can always separate them. 956 00:48:37,900 --> 00:48:40,070 So here's the deal. 957 00:48:40,070 --> 00:48:42,350 I have this middle bar. 958 00:48:42,350 --> 00:48:46,290 There's a whole family of planes that are parallel to that bar, 959 00:48:46,290 --> 00:48:47,690 in fact, a lot of planes. 960 00:48:47,690 --> 00:48:48,280 There's what? 961 00:48:48,280 --> 00:48:49,550 Three dimensions of planes. 962 00:48:49,550 --> 00:48:51,300 There's a two dimensional family of planes 963 00:48:51,300 --> 00:48:52,780 that are parallel to that bar. 964 00:48:52,780 --> 00:48:55,280 There's a two dimensional family of planes that are parallel 965 00:48:55,280 --> 00:48:57,080 to that bar, meaning if you extend it, 966 00:48:57,080 --> 00:49:01,830 it doesn't hit the plane or lies in the plane. 967 00:49:01,830 --> 00:49:05,422 So there's actually a plane that is parallel to both bars. 968 00:49:05,422 --> 00:49:07,380 In fact, there's a whole one dimensional family 969 00:49:07,380 --> 00:49:09,110 of that, if I figured it out right. 970 00:49:09,110 --> 00:49:12,870 So you can actually find a plane that bisects the two bars. 971 00:49:12,870 --> 00:49:16,140 So there's one on one side, one on the other side. 972 00:49:16,140 --> 00:49:20,000 And even if the bars extended, it would not hit the plane. 973 00:49:20,000 --> 00:49:23,700 So there's some plane in the middle here. 974 00:49:23,700 --> 00:49:27,272 Really hard to draw in 3D. 975 00:49:27,272 --> 00:49:28,730 Actually, I think I have a picture. 
976 00:49:28,730 --> 00:49:31,990 It might be a little easier to see. 977 00:49:31,990 --> 00:49:33,370 So it looks something like this. 978 00:49:33,370 --> 00:49:33,930 You have to believe me. 979 00:49:33,930 --> 00:49:34,880 This is general case. 980 00:49:34,880 --> 00:49:37,210 You have the red chain and the blue chain. 981 00:49:37,210 --> 00:49:39,590 The middle bars are on opposite sides of the plane, 982 00:49:39,590 --> 00:49:41,660 and they're parallel to the plane. 983 00:49:41,660 --> 00:49:44,890 Now what we do is scale, but we only 984 00:49:44,890 --> 00:49:47,991 scale in z, so non-uniform scaling. 985 00:49:47,991 --> 00:49:49,240 This will not preserve angles. 986 00:49:49,240 --> 00:49:51,060 Here, we're going to fold. 987 00:49:51,060 --> 00:49:53,440 So the result, we scale from z equals zero, 988 00:49:53,440 --> 00:49:55,110 so these guys get pushed down. 989 00:49:55,110 --> 00:49:56,730 Those guys get pushed up. 990 00:49:56,730 --> 00:50:00,340 Scaling in z, again, does not cause any self-intersection. 991 00:50:00,340 --> 00:50:02,180 And because we made these guys parallel, 992 00:50:02,180 --> 00:50:04,390 these edge lengths will be preserved. 993 00:50:04,390 --> 00:50:05,850 That's the key. 994 00:50:05,850 --> 00:50:07,410 The end bars will get longer, but I 995 00:50:07,410 --> 00:50:08,660 don't care if they get longer. 996 00:50:08,660 --> 00:50:10,817 I can always cut them shorter. 997 00:50:10,817 --> 00:50:11,650 So that's the trick. 998 00:50:13,869 --> 00:50:15,410 Yeah, they'll always get longer, even 999 00:50:15,410 --> 00:50:17,810 if they don't go to the other side. 1000 00:50:17,810 --> 00:50:19,840 So that's how you can just pull them apart. 1001 00:50:19,840 --> 00:50:21,760 And also, if there are 2 chains in here, 1002 00:50:21,760 --> 00:50:23,820 they will also just fly away from each other. 
1003 00:50:23,820 --> 00:50:25,020 So eventually, everything will get 1004 00:50:25,020 --> 00:50:27,436 far away unless they happen to have the same z-coordinate, 1005 00:50:27,436 --> 00:50:29,330 but then it's degenerate. 1006 00:50:29,330 --> 00:50:30,590 Just perturb it a little bit. 1007 00:50:30,590 --> 00:50:31,090 Jason? 1008 00:50:31,090 --> 00:50:34,278 AUDIENCE: Is this the case where the two 1009 00:50:34,278 --> 00:50:38,260 bars are coplanar and parallel? 1010 00:50:38,260 --> 00:50:39,710 PROFESSOR: Yeah. 1011 00:50:39,710 --> 00:50:42,150 I said at the beginning generic, so in particular, 1012 00:50:42,150 --> 00:50:44,977 I want to assume that these guys are not coplanar, 1013 00:50:44,977 --> 00:50:47,060 otherwise, you could just perturb it a little bit. 1014 00:50:47,060 --> 00:50:50,405 As long as they're not touching, you can wiggle everybody. 1015 00:50:50,405 --> 00:50:52,920 Good point. 1016 00:50:52,920 --> 00:50:54,500 That's two 3 chains. 1017 00:50:54,500 --> 00:50:56,360 There's one more result in the separability. 1018 00:50:56,360 --> 00:50:59,930 We've done all of these separability results here. 1019 00:50:59,930 --> 00:51:03,450 The next one would be this, which is the same as this. 1020 00:51:03,450 --> 00:51:07,480 So here we have a 4 chain versus a 2 chain. 1021 00:51:07,480 --> 00:51:09,470 Basically, the same trick works. 1022 00:51:09,470 --> 00:51:14,180 I just have to reorient to put the middle bars in the xy plane. 1023 00:51:14,180 --> 00:51:17,280 So in other words, there's two middle bars. 1024 00:51:17,280 --> 00:51:20,140 They live in a plane because they share a vertex. 1025 00:51:20,140 --> 00:51:22,570 So think of that as the plane and then scale perpendicular 1026 00:51:22,570 --> 00:51:23,530 to that. 1027 00:51:23,530 --> 00:51:26,150 Same trick works. 1028 00:51:26,150 --> 00:51:29,162 So 4 chains versus various 2 chains.
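[Editorial sketch, not part of the lecture.] The z-scaling trick for two 3 chains can be sketched the same way. It assumes the coordinates have already been rotated so that the separating plane is z = 0 and both middle bars are parallel to it, one above and one below. The function name `stretch_z_and_clip` and the coordinates of `red` and `blue` are assumptions for illustration only.

```python
import numpy as np

# Each 3 chain is four vertices; vertices 1 and 2 span the middle bar.
# Made-up example: red's middle bar lies at z = 1, blue's at z = -1,
# so the plane z = 0 separates the two middle bars and is parallel to both.
red = np.array([[0.0, -1.0, -2.0], [0.0, 0.0, 1.0], [2.0, 0.0, 1.0], [2.0, 1.0, -1.0]])
blue = np.array([[1.0, -2.0, 2.0], [1.0, -1.0, -1.0], [1.0, 1.0, -1.0], [1.0, 2.0, 3.0]])


def stretch_z_and_clip(chain, t):
    """Scale only the z-coordinate by t >= 1, then clip the two end bars
    back to their original lengths.

    The middle bar is parallel to z = 0, so its length is unchanged by the
    scaling. The end bars can only get longer, so clipping is always possible;
    their angles to the middle bar change, which is the folding.
    """
    chain = np.asarray(chain, dtype=float)
    scaled = chain * np.array([1.0, 1.0, t])
    out = scaled.copy()
    for end, joint in ((0, 1), (3, 2)):
        original_len = np.linalg.norm(chain[end] - chain[joint])
        direction = scaled[end] - scaled[joint]
        out[end] = scaled[joint] + original_len * direction / np.linalg.norm(direction)
    return out


for t in (1.0, 2.0, 5.0):
    r = stretch_z_and_clip(red, t)
    b = stretch_z_and_clip(blue, t)
    print(f"t={t}: red middle bar z={r[1, 2]}, blue middle bar z={b[1, 2]}")
```

In this sketch the middle bars move to z = t and z = -t while every bar keeps its length, so the two chains drift apart as t grows. Any extra hairpins would be handled as in the uniform scaling case, clipped from their scaled apexes.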
1029 00:51:29,162 --> 00:51:31,370 Again, we preserve the lengths of the two middle bars 1030 00:51:31,370 --> 00:51:32,740 because they lie on the plane. 1031 00:51:32,740 --> 00:51:34,480 So there's no scaling that happens in the plane, 1032 00:51:34,480 --> 00:51:35,910 and in the generic case, everybody 1033 00:51:35,910 --> 00:51:37,409 will be off the plane, and they fly. 1034 00:51:39,680 --> 00:51:41,009 Cool. 1035 00:51:41,009 --> 00:51:43,050 So that proves all of these separability results. 1036 00:51:43,050 --> 00:51:46,167 I'm not going to do these closed chain ones here. 1037 00:51:46,167 --> 00:51:48,000 You can read the paper if you're interested. 1038 00:51:48,000 --> 00:51:50,280 At this point, we can go back to Lubiw's problem 1039 00:51:50,280 --> 00:51:54,520 and prove this result. 1040 00:51:54,520 --> 00:51:58,569 So the basic algorithm is cut every other vertex. 1041 00:51:58,569 --> 00:52:00,110 You'll get a whole bunch of hairpins. 1042 00:52:00,110 --> 00:52:02,130 Hairpins, we know, separate. 1043 00:52:02,130 --> 00:52:04,380 But we can be slightly more efficient 1044 00:52:04,380 --> 00:52:09,405 by leaving an initial segment of 4 chain or two 3 chains. 1045 00:52:09,405 --> 00:52:12,900 I think it comes out to the same number. 1046 00:52:12,900 --> 00:52:18,855 And you end up saving three if we computed this right. 1047 00:52:18,855 --> 00:52:21,830 So you get n minus 3 over 2. 1048 00:52:21,830 --> 00:52:24,490 I think our first bound on this problem was n minus 1 over 2, 1049 00:52:24,490 --> 00:52:27,250 and then we improved by 1 to n minus 3 1050 00:52:27,250 --> 00:52:28,200 over 2. 1051 00:52:28,200 --> 00:52:32,090 That has stood as the best bound since. 1052 00:52:32,090 --> 00:52:35,110 Basically, because with interlocked stuff-- after that, 1053 00:52:35,110 --> 00:52:36,310 you get interlocked.
1054 00:52:36,310 --> 00:52:38,534 And if you're going to avoid interlocking, 1055 00:52:38,534 --> 00:52:39,450 you have to be clever. 1056 00:52:39,450 --> 00:52:42,500 You can't just use the numbers. 1057 00:52:42,500 --> 00:52:44,720 So that is all we know about Lubiw's problem. 1058 00:52:44,720 --> 00:52:49,470 Now, I get to show you all the fun interlocked examples. 1059 00:52:49,470 --> 00:52:54,380 We're going to do pretty much all of these. 1060 00:52:54,380 --> 00:52:59,580 So we start with one of the prettiest examples, 1061 00:52:59,580 --> 00:53:00,610 the threefold symmetric. 1062 00:53:00,610 --> 00:53:04,380 This is three 3 chains. 1063 00:53:04,380 --> 00:53:08,190 So that's not even drawn here. 1064 00:53:08,190 --> 00:53:11,380 Because this table is about pairwise interactions. 1065 00:53:11,380 --> 00:53:14,480 We know that two 3 chains-- we just proved two 3 chains cannot 1066 00:53:14,480 --> 00:53:18,400 interlock, but three of them can. 1067 00:53:18,400 --> 00:53:20,765 That's the only case where that question is interesting. 1068 00:53:20,765 --> 00:53:23,490 So everybody else, you can interlock. 1069 00:53:23,490 --> 00:53:26,120 Unless you have 2 chains, and we know those don't help. 1070 00:53:26,120 --> 00:53:28,534 So three 3 chains interlock. 1071 00:53:28,534 --> 00:53:30,200 I'm not going to try to prove that here. 1072 00:53:30,200 --> 00:53:32,170 It's a pretty complicated argument 1073 00:53:32,170 --> 00:53:35,149 where you look at the center edges. 1074 00:53:35,149 --> 00:53:36,940 You look at the convex hull of those edges 1075 00:53:36,940 --> 00:53:38,981 and argue these guys can't stick into the center. 1076 00:53:38,981 --> 00:53:40,715 It's a geometric argument. 1077 00:53:40,715 --> 00:53:43,385 If you're interested, check out this paper. 1078 00:53:43,385 --> 00:53:44,820 It's a pretty cool example.
1079 00:53:44,820 --> 00:53:47,160 This was a great-- finally, I had a good exercise 1080 00:53:47,160 --> 00:53:49,717 for using Rhino yesterday and drawing this figure. 1081 00:53:49,717 --> 00:53:50,550 It was a lot of fun. 1082 00:53:54,370 --> 00:53:57,510 This is a 3 chain and 4 chain. 1083 00:53:57,510 --> 00:53:59,840 One, two, three, four. 1084 00:53:59,840 --> 00:54:01,920 Just two different views of the same thing. 1085 00:54:01,920 --> 00:54:07,000 And that proves this result, which is this result. 1086 00:54:11,780 --> 00:54:15,350 It's pretty close, pretty tight. 1087 00:54:15,350 --> 00:54:17,140 Again, it's a geometric argument. 1088 00:54:17,140 --> 00:54:18,050 It's tricky. 1089 00:54:18,050 --> 00:54:19,110 I won't go into it here. 1090 00:54:22,770 --> 00:54:24,830 We found these by building lots of straw models 1091 00:54:24,830 --> 00:54:26,840 and jiggling them around with string 1092 00:54:26,840 --> 00:54:28,860 connecting the straws together. 1093 00:54:28,860 --> 00:54:30,436 But then we had to prove it. 1094 00:54:30,436 --> 00:54:32,435 For a while, we weren't sure whether a 3 and a 4 1095 00:54:32,435 --> 00:54:33,340 could interlock. 1096 00:54:33,340 --> 00:54:35,910 I think first we had a 4 and a 4 or maybe 4 and a 5, 1097 00:54:35,910 --> 00:54:37,490 and we kept working our way down. 1098 00:54:37,490 --> 00:54:39,531 And now we know this is the smallest you can get. 1099 00:54:42,460 --> 00:54:47,650 And now we go into the closed results. 1100 00:54:47,650 --> 00:54:50,690 Because 4 and 4, that's of course easier than 3 and 4. 1101 00:54:50,690 --> 00:54:53,040 So that's done. 1102 00:54:53,040 --> 00:54:56,940 This is contained in those results. 1103 00:54:56,940 --> 00:54:59,020 Now, we look at the closed examples. 1104 00:54:59,020 --> 00:55:02,980 So this is going to be a closed triangle interlocking 1105 00:55:02,980 --> 00:55:04,627 with an open 4 chain.
1106 00:55:04,627 --> 00:55:06,710 And here I'm going to give you some of the details 1107 00:55:06,710 --> 00:55:09,822 because this proof-- it's relatively easy to prove 1108 00:55:09,822 --> 00:55:10,780 that this is interlocked. 1109 00:55:10,780 --> 00:55:12,490 You can use a more topological argument, 1110 00:55:12,490 --> 00:55:14,740 which is what we're used to from the knitting needles. 1111 00:55:14,740 --> 00:55:18,200 Remember we took a ball that separated the interior vertices 1112 00:55:18,200 --> 00:55:20,230 from the endpoints. 1113 00:55:20,230 --> 00:55:23,080 We said the endpoints really can't get in 1114 00:55:23,080 --> 00:55:24,420 if we set the ball right. 1115 00:55:24,420 --> 00:55:29,550 And therefore, you could connect the endpoints by a big rope, 1116 00:55:29,550 --> 00:55:31,816 and then you have a trefoil knot, 1117 00:55:31,816 --> 00:55:33,940 and there's no way you can untie the knot without 1118 00:55:33,940 --> 00:55:35,240 having intersection somewhere. 1119 00:55:35,240 --> 00:55:37,150 You can argue the rope never self intersects, 1120 00:55:37,150 --> 00:55:40,390 so it must be the chain. 1121 00:55:40,390 --> 00:55:42,510 We can do the same thing with this example, 1122 00:55:42,510 --> 00:55:46,870 with a 4 chain interlocked with a triangle. 1123 00:55:46,870 --> 00:55:53,160 And we'll sketch that proof. 1124 00:56:17,300 --> 00:56:19,830 So I want to prove that those two guys are interlocked, 1125 00:56:19,830 --> 00:56:23,210 so suppose that somehow they separate from each other. 1126 00:56:23,210 --> 00:56:26,175 If there's a separating motion, there's 1127 00:56:26,175 --> 00:56:28,970 a motion where the triangle actually doesn't move at all. 1128 00:56:28,970 --> 00:56:30,980 So triangle's rigid, except that it 1129 00:56:30,980 --> 00:56:34,670 can move by rigid motion, translation and rotation.
1130 00:56:34,670 --> 00:56:39,420 But any motion I have, I can recenter things by relativity 1131 00:56:39,420 --> 00:56:41,570 so that the triangle stays fixed 1132 00:56:41,570 --> 00:56:45,114 and just the quadrilateral moves. 1133 00:56:45,114 --> 00:56:47,280 So that's nice because a triangle is a natural thing 1134 00:56:47,280 --> 00:56:49,230 to draw a sphere around, and I want 1135 00:56:49,230 --> 00:56:52,910 to argue if these bars are really, really long, 1136 00:56:52,910 --> 00:56:56,920 the endpoints will be outside this big sphere. 1137 00:56:56,920 --> 00:56:58,200 So let me tell you how long. 1138 00:57:02,320 --> 00:57:04,775 I have my triangle. I'm going to draw 1139 00:57:04,775 --> 00:57:07,350 a sphere that contains the triangle. 1140 00:57:07,350 --> 00:57:09,830 Say this sphere has radius little r. 1141 00:57:09,830 --> 00:57:12,680 Big R is going to be little r plus the sum 1142 00:57:12,680 --> 00:57:16,210 of the middle bars, which should sound familiar from what 1143 00:57:16,210 --> 00:57:18,180 we did with knitting needles. 1144 00:57:18,180 --> 00:57:20,614 This is of the 4 chain. 1145 00:57:20,614 --> 00:57:22,340 There's two middle bars. 1146 00:57:22,340 --> 00:57:24,090 Add up their lengths. 1147 00:57:24,090 --> 00:57:24,830 Add on r. 1148 00:57:24,830 --> 00:57:27,620 This is just a big number, capital R. 1149 00:57:27,620 --> 00:57:29,770 I'm going to make it bigger. 1150 00:57:29,770 --> 00:57:32,030 So I have my little triangle. 1151 00:57:32,030 --> 00:57:34,290 It has a tiny ball around it. 1152 00:57:34,290 --> 00:57:38,130 I'm going to draw a pretty big sphere here 1153 00:57:38,130 --> 00:57:46,789 of radius 15 r, an even bigger sphere of radius 20 r. 1154 00:57:46,789 --> 00:57:49,330 And I don't need to be precise about exactly where the center 1155 00:57:49,330 --> 00:57:51,496 is, but I guess it would be the center of this ball. 1156 00:57:55,630 --> 00:57:57,930 Here's what I want to say.
1157 00:57:57,930 --> 00:58:04,320 I'm supposing the lengths of these edges is more than 20 r. 1158 00:58:04,320 --> 00:58:08,430 So initially, those bars go outside the box. 1159 00:58:08,430 --> 00:58:10,500 The endpoints are outside of the big ball. 1160 00:58:13,060 --> 00:58:19,030 Now, here's the claim. 1161 00:58:19,030 --> 00:58:22,570 As long as the endpoints of the really long-- the endpoints 1162 00:58:22,570 --> 00:58:35,280 of the 4 chain stay outside this ball, then the interior bars-- 1163 00:58:35,280 --> 00:58:42,760 these two middle bars-- have to stay inside this ball. 1164 00:58:42,760 --> 00:58:44,760 This is just like the knitting needles. 1165 00:58:44,760 --> 00:58:47,170 As long as the super long bars, the endpoints, 1166 00:58:47,170 --> 00:58:49,730 stay outside this ball, the middle bars 1167 00:58:49,730 --> 00:58:50,900 have to stay inside. 1168 00:58:50,900 --> 00:58:54,480 That's just because those bars are so darn long. 1169 00:58:54,480 --> 00:58:57,000 And so for as long as that holds, 1170 00:58:57,000 --> 00:58:59,720 you can compute the topology-- I'm not 1171 00:58:59,720 --> 00:59:05,920 going to try to draw it-- the topology of-- in this picture, 1172 00:59:05,920 --> 00:59:08,200 if you connect to the ends of the purple guys, 1173 00:59:08,200 --> 00:59:12,419 because they're outside the big ball, the 15 r ball-- 1174 00:59:12,419 --> 00:59:13,960 not sure I really need the 20 r ball, 1175 00:59:13,960 --> 00:59:19,310 but the 15 r ball is the key-- topology you get is this. 1176 00:59:19,310 --> 00:59:22,190 That is the link between the circle, 1177 00:59:22,190 --> 00:59:23,800 which is this nice loop. 1178 00:59:23,800 --> 00:59:27,220 And the purple guy is that red loop 1179 00:59:27,220 --> 00:59:29,490 if you just figure out what it looks like currently. 
1180 00:59:29,490 --> 00:59:33,710 And as long as the endpoints of these guys don't go inside 1181 00:59:33,710 --> 00:59:35,420 and those guys remain on the inside, 1182 00:59:35,420 --> 00:59:37,570 basically until something interesting happens, 1183 00:59:37,570 --> 00:59:38,840 the topology has to be that. 1184 00:59:41,810 --> 00:59:49,110 So at some point-- sorry, I said that slightly wrong. 1185 00:59:49,110 --> 00:59:52,250 As long as the middle bars stay inside the ball 1186 00:59:52,250 --> 00:59:55,920 and the endpoints stay outside the ball of radius 15 r, 1187 00:59:55,920 --> 00:59:56,949 this will be the case. 1188 00:59:56,949 --> 00:59:58,990 That is what I mean by nothing interesting happening. 1189 00:59:58,990 --> 01:00:01,490 So violation, the only way to open 1190 01:00:01,490 --> 01:00:06,340 this thing could be an endpoint goes inside or a middle bar 1191 01:00:06,340 --> 01:00:09,340 goes outside the ball. 1192 01:00:09,340 --> 01:00:12,630 Consider the moment when that happens. 1193 01:00:12,630 --> 01:00:16,810 If a middle bar, one of the middle vertices 1194 01:00:16,810 --> 01:00:21,330 is going to go outside this 15 r radius ball, 1195 01:00:21,330 --> 01:00:25,520 then you know that all of the middle bars 1196 01:00:25,520 --> 01:00:29,390 are far away from this triangle because those edges 1197 01:00:29,390 --> 01:00:32,670 are very small compared to capital R. 1198 01:00:32,670 --> 01:00:34,860 So we're 15 away. 1199 01:00:34,860 --> 01:00:37,760 If you have one middle vertex that's outside, all of them 1200 01:00:37,760 --> 01:00:39,000 are pretty close to outside. 1201 01:00:39,000 --> 01:00:41,070 They're all going to be in some smaller ball 1202 01:00:41,070 --> 01:00:45,720 here-- outside that smaller ball.
1203 01:00:45,720 --> 01:00:48,530 The alternative is that one of the end-- that's 1204 01:00:48,530 --> 01:00:50,727 not a contradiction yet, but we'll come back to it-- 1205 01:00:50,727 --> 01:00:53,060 the alternative is that one of the endpoints of the bars 1206 01:00:53,060 --> 01:00:55,440 goes inside this radius 15 r ball. 1207 01:00:58,280 --> 01:01:02,180 When that happens, because this bar is so long, 1208 01:01:02,180 --> 01:01:06,250 the other end of the bar will actually be outside the sphere. 1209 01:01:06,250 --> 01:01:08,575 And so then, again, all the middle guys 1210 01:01:08,575 --> 01:01:10,390 are outside the sphere. 1211 01:01:10,390 --> 01:01:12,100 So either way, you violate. 1212 01:01:12,100 --> 01:01:13,730 I assumed this before, but now, I'm 1213 01:01:13,730 --> 01:01:16,020 saying if you try to take these guys out 1214 01:01:16,020 --> 01:01:17,820 or if you try to push this guy in, 1215 01:01:17,820 --> 01:01:20,480 the middle guys are outside the ball already. 1216 01:01:20,480 --> 01:01:23,870 So consider a situation when the middle edges are all outside 1217 01:01:23,870 --> 01:01:26,490 the ball of radius 15 r. 1218 01:01:26,490 --> 01:01:29,720 The only thing that could penetrate some tiny sphere, 1219 01:01:29,720 --> 01:01:33,530 like this one of little r, only parts of the 4 chain 1220 01:01:33,530 --> 01:01:35,800 that could penetrate that are the long bars 1221 01:01:35,800 --> 01:01:38,450 because the middle stuff is too far away. 1222 01:01:38,450 --> 01:01:41,800 The long bars are super long, so they could conceivably 1223 01:01:41,800 --> 01:01:43,430 penetrate the triangle. 1224 01:01:43,430 --> 01:01:45,510 All we really care about is piercing the center 1225 01:01:45,510 --> 01:01:46,968 of the triangle because that's when 1226 01:01:46,968 --> 01:01:48,460 we get interesting topology. 
1227 01:01:48,460 --> 01:01:52,370 We want to get this topology, but sadly, you 1228 01:01:52,370 --> 01:01:54,050 could only get these topologies. 1229 01:01:54,050 --> 01:01:54,804 I guess luckily. 1230 01:01:54,804 --> 01:01:56,470 You can only get those three topologies, 1231 01:01:56,470 --> 01:01:58,060 which don't match this one. 1232 01:01:58,060 --> 01:01:59,780 And so there's no way to get continuously 1233 01:01:59,780 --> 01:02:01,890 from one of these to the other. 1234 01:02:01,890 --> 01:02:02,900 That's the argument. 1235 01:02:02,900 --> 01:02:04,930 Let me say a little more which is 1236 01:02:04,930 --> 01:02:08,360 if neither of the two long bars hit the triangle, 1237 01:02:08,360 --> 01:02:10,770 then you trivially get this. 1238 01:02:10,770 --> 01:02:12,860 You don't get any linking. 1239 01:02:12,860 --> 01:02:16,830 If one of them strikes a triangle, you always get this. 1240 01:02:19,430 --> 01:02:21,640 If two of them strike the triangle, 1241 01:02:21,640 --> 01:02:23,510 it's possible to get this or this 1242 01:02:23,510 --> 01:02:24,902 depending on how you link it up. 1243 01:02:24,902 --> 01:02:26,360 And I'm not going to prove it here, 1244 01:02:26,360 --> 01:02:27,910 but that's all you can get. 1245 01:02:27,910 --> 01:02:31,875 And so if you can't get this one, you're in trouble. 1246 01:02:31,875 --> 01:02:34,850 There's no continuous motion from this to those 1247 01:02:34,850 --> 01:02:36,604 without crossing. 1248 01:02:36,604 --> 01:02:37,520 So that's how it goes. 1249 01:02:37,520 --> 01:02:40,850 And this is an example of a topological argument. 1250 01:02:40,850 --> 01:02:42,710 You can use the same kind of argument 1251 01:02:42,710 --> 01:02:48,310 to prove this last result, which is this one. 1252 01:02:48,310 --> 01:02:51,220 But now this is not necessarily symmetric because one's closed, 1253 01:02:51,220 --> 01:02:52,370 one's open. 
1254 01:02:52,370 --> 01:02:57,340 So this is a closed 4 chain quadrilateral, in other words, 1255 01:02:57,340 --> 01:03:01,150 with an open 3 chain, which can also lock. 1256 01:03:01,150 --> 01:03:03,340 And again, you can use the topological arguments. 1257 01:03:03,340 --> 01:03:05,881 A little messy because now this is like a tetrahedron instead 1258 01:03:05,881 --> 01:03:10,510 of a triangle, but the same tricks work and topology, 1259 01:03:10,510 --> 01:03:13,590 luckily, doesn't match between the inside and outside cases. 1260 01:03:16,350 --> 01:03:17,650 Cool. 1261 01:03:17,650 --> 01:03:22,740 So that is interlocking and pretty much 1262 01:03:22,740 --> 01:03:24,390 the entirety of this table. 1263 01:03:24,390 --> 01:03:27,070 Now, there's a way to generalize this table, 1264 01:03:27,070 --> 01:03:29,800 which is instead of saying, oh, I have universal joints. 1265 01:03:29,800 --> 01:03:31,510 Everything's nice. 1266 01:03:31,510 --> 01:03:32,930 I start thinking back to proteins 1267 01:03:32,930 --> 01:03:35,441 and say, well, what if I had some fixed angles. 1268 01:03:35,441 --> 01:03:37,690 What if some of the chains are fixed angles and others 1269 01:03:37,690 --> 01:03:38,540 are universal? 1270 01:03:38,540 --> 01:03:40,170 What if some of the chains were rigid? 1271 01:03:40,170 --> 01:03:42,890 Because we know that rigid hairpins can separate 1272 01:03:42,890 --> 01:03:45,610 without any folding whatsoever, what if some of them are rigid 1273 01:03:45,610 --> 01:03:48,590 and some of them are fixed angle and some of them are universal? 1274 01:03:48,590 --> 01:03:53,320 Well, we haven't done all of those results, but most of them 1275 01:03:53,320 --> 01:03:58,070 for pairs, we have got universal and rigid. 1276 01:03:58,070 --> 01:04:01,130 I mean a fixed angle 2 chain is a rigid 2 chain, 1277 01:04:01,130 --> 01:04:02,970 so there's only two columns there. 1278 01:04:02,970 --> 01:04:04,160 Universal fixed angle rigid.
1279 01:04:04,160 --> 01:04:05,720 Universal fixed angle rigid. 1280 01:04:05,720 --> 01:04:07,650 And here we only get up to rigid. 1281 01:04:11,000 --> 01:04:14,400 Yeah, I'm not sure if it makes a huge difference out there. 1282 01:04:14,400 --> 01:04:17,120 Again, this is not a complete story. 1283 01:04:17,120 --> 01:04:20,860 There are other results that might fit in here. 1284 01:04:20,860 --> 01:04:25,120 How far down do those minuses go if you don't have rigid parts? 1285 01:04:25,120 --> 01:04:27,300 But let's see. 1286 01:04:27,300 --> 01:04:28,890 The plus means there's locked things. 1287 01:04:28,890 --> 01:04:32,070 The minus means they always separate. 1288 01:04:32,070 --> 01:04:33,720 It's a little more concise. 1289 01:04:33,720 --> 01:04:35,870 And we've sorted the rows and columns 1290 01:04:35,870 --> 01:04:38,820 so that you get nice-- conveniently you 1291 01:04:38,820 --> 01:04:42,620 get a nice diagonal pattern, not perfect diagonal, 1292 01:04:42,620 --> 01:04:45,430 but it's at least monotone. 1293 01:04:45,430 --> 01:04:48,240 It makes sense that a rigid-- obviously, rigid 1294 01:04:48,240 --> 01:04:52,690 is more fixed than a fixed angle. 1295 01:04:52,690 --> 01:04:54,940 But it's not obvious that a universal 4 chain 1296 01:04:54,940 --> 01:04:57,380 is going to lock more than a fixed angle 3 chain, 1297 01:04:57,380 --> 01:05:01,630 but turns out that is the case for these examples. 1298 01:05:01,630 --> 01:05:03,749 Now, you could consider triples of them, 1299 01:05:03,749 --> 01:05:05,040 each with different properties. 1300 01:05:05,040 --> 01:05:06,300 Then, you get a three dimensional table. 1301 01:05:06,300 --> 01:05:07,560 That gets a little messier. 1302 01:05:07,560 --> 01:05:08,930 Probably not a huge number of results 1303 01:05:08,930 --> 01:05:10,290 are necessary along those lines.
1304 01:05:10,290 --> 01:05:16,630 Here, we just needed the 2 chains plus the three 3 chains 1305 01:05:16,630 --> 01:05:18,350 were interlocked. 1306 01:05:18,350 --> 01:05:20,330 So there's more work to be done here, 1307 01:05:20,330 --> 01:05:22,790 but this was a lot of fun. 1308 01:05:22,790 --> 01:05:24,290 And if you want to see the examples, 1309 01:05:24,290 --> 01:05:26,020 you should check out this paper. 1310 01:05:30,530 --> 01:05:32,960 This was basically for fun and to see how 1311 01:05:32,960 --> 01:05:34,940 low could you go in terms of number of edges 1312 01:05:34,940 --> 01:05:37,190 and various combinations of different parts. 1313 01:05:37,190 --> 01:05:40,831 But it turns out this interlock stuff does have applications. 1314 01:05:40,831 --> 01:05:42,330 I'm not going to call them practical 1315 01:05:42,330 --> 01:05:44,680 because it's about another theory result. 1316 01:05:44,680 --> 01:05:47,060 And it's a result I mentioned. 1317 01:05:47,060 --> 01:05:50,100 If you have some 3 chain-- oh sorry. 1318 01:05:50,100 --> 01:05:53,990 Some three dimensional chain-- let's say universal joints-- 1319 01:05:53,990 --> 01:05:57,380 and you have two configurations, configuration a 1320 01:05:57,380 --> 01:06:01,270 and configuration b, and you want 1321 01:06:01,270 --> 01:06:03,415 to know is there a motion from configuration 1322 01:06:03,415 --> 01:06:06,620 a to configuration b. 1323 01:06:06,620 --> 01:06:16,800 This is PSPACE-complete, result by a bunch of Berliners 1324 01:06:16,800 --> 01:06:18,120 from a few years ago. 1325 01:06:18,120 --> 01:06:21,000 And basically, what they do is build a computer. 1326 01:06:21,000 --> 01:06:24,680 I don't have the gadgets here to show you. 1327 01:06:24,680 --> 01:06:27,500 But they build a computer with various moving parts 1328 01:06:27,500 --> 01:06:28,780 and what not. 
1329 01:06:28,780 --> 01:06:31,230 To build the infrastructure of that computer, 1330 01:06:31,230 --> 01:06:33,490 the rigid part that doesn't move, 1331 01:06:33,490 --> 01:06:35,610 they use interlocked chains. 1332 01:06:35,610 --> 01:06:39,540 Basically, if you want to build a nice, rigid framework, 1333 01:06:39,540 --> 01:06:44,497 like a truss, you can do it because you just take chains. 1334 01:06:44,497 --> 01:06:45,830 I'm only allowed to make chains. 1335 01:06:45,830 --> 01:06:47,121 I'm not allowed to make graphs. 1336 01:06:47,121 --> 01:06:50,110 But if I bring two parts of the chain 1337 01:06:50,110 --> 01:06:52,320 really close to each other, I can basically 1338 01:06:52,320 --> 01:06:55,300 tie a knot by making lots of little links 1339 01:06:55,300 --> 01:06:57,980 and using say an interlocked 3, 4 chain, 1340 01:06:57,980 --> 01:06:59,922 an interlocked 3 chain with an interlocked 4 chain. 1341 01:06:59,922 --> 01:07:01,380 In the same way when we're building 1342 01:07:01,380 --> 01:07:06,640 the locked 11 chain with the locked 2 chain, 1343 01:07:06,640 --> 01:07:09,362 you're tying little knots around various points. 1344 01:07:09,362 --> 01:07:11,070 Here, we don't even have to be efficient, 1345 01:07:11,070 --> 01:07:13,190 just constant numbers are fine. 1346 01:07:13,190 --> 01:07:19,020 So you can build some rigid structure like a tetrahedron 1347 01:07:19,020 --> 01:07:27,200 by taking one path here, making an interlocked structure, 1348 01:07:27,200 --> 01:07:31,290 making an interlocked structure, interlock, interlock. 1349 01:07:31,290 --> 01:07:34,930 Could double traverse, I guess. 1350 01:07:34,930 --> 01:07:38,100 And you can interlock everything using a chain. 1351 01:07:38,100 --> 01:07:39,632 Build a rigid structure. 1352 01:07:39,632 --> 01:07:41,340 Then, you go and build the flexible part. 1353 01:07:41,340 --> 01:07:43,250 And that's the fun computation. 
1354 01:07:43,250 --> 01:07:45,590 But just to get off the ground and to build 1355 01:07:45,590 --> 01:07:49,160 some rigid infrastructure, to build gadgets basically 1356 01:07:49,160 --> 01:07:51,575 in your hardness proof, they use interlocked chains, 1357 01:07:51,575 --> 01:07:52,450 which is pretty cool. 1358 01:07:52,450 --> 01:07:57,310 This is a really interesting result, a very negative one. 1359 01:07:57,310 --> 01:07:59,480 It's really hard to fold from one shape to another, 1360 01:07:59,480 --> 01:08:01,290 but it uses this. 1361 01:08:01,290 --> 01:08:03,770 They also prove the same thing for trees in two dimensions. 1362 01:08:03,770 --> 01:08:05,270 For that, they used the locked trees-- 1363 01:08:05,270 --> 01:08:07,270 which we covered way back, lecture three 1364 01:08:07,270 --> 01:08:11,720 or whatever-- these locked trees to build-- to basically fill 1365 01:08:11,720 --> 01:08:15,670 rigid angles, to force things to be at particular angles, 1366 01:08:15,670 --> 01:08:19,520 and then they have flexible parts to do the computer stuff. 1367 01:08:19,520 --> 01:08:20,140 So it's fun. 1368 01:08:20,140 --> 01:08:24,370 They use our little examples to build big computers. 1369 01:08:24,370 --> 01:08:27,660 Still open, of course, is can I start 1370 01:08:27,660 --> 01:08:32,120 from a straight configuration of the chain 1371 01:08:32,120 --> 01:08:34,790 and can I get it to this. 1372 01:08:34,790 --> 01:08:38,500 So in other words, given a chain, can I unfold it. 1373 01:08:38,500 --> 01:08:40,910 We'd like to know whether that's polynomially 1374 01:08:40,910 --> 01:08:42,200 solvable or NP-complete. 1375 01:08:42,200 --> 01:08:45,670 It is about characterizing locked chains in some sense. 1376 01:08:45,670 --> 01:08:47,122 This is about going from arbitrary 1377 01:08:47,122 --> 01:08:48,830 configuration to arbitrary configuration. 
1378 01:08:48,830 --> 01:08:50,649 And here, you're allowed to build 1379 01:08:50,649 --> 01:08:53,109 rigid infrastructure which never moves. 1380 01:08:53,109 --> 01:08:55,620 Here, everything would have to fall apart somehow, 1381 01:08:55,620 --> 01:08:57,779 so you can't use interlocked chains. 1382 01:08:57,779 --> 01:09:00,790 You can't use locked trees in 2D. 1383 01:09:00,790 --> 01:09:06,285 So this problem is still open. 1384 01:09:06,285 --> 01:09:07,660 Seems quite difficult because you 1385 01:09:07,660 --> 01:09:11,149 can't build any infrastructure that stays around forever. 1386 01:09:11,149 --> 01:09:13,020 Everything would have to fall apart 1387 01:09:13,020 --> 01:09:14,195 if your machine terminates. 1388 01:09:16,779 --> 01:09:19,181 Any questions? 1389 01:09:19,181 --> 01:09:20,930 Wanted to talk about that for a long time. 1390 01:09:20,930 --> 01:09:22,346 Now that we have interlocked chains, 1391 01:09:22,346 --> 01:09:24,979 you get to see a little bit behind the scenes of what's 1392 01:09:24,979 --> 01:09:25,689 happening. 1393 01:09:25,689 --> 01:09:27,200 If you want to see the full proof, 1394 01:09:27,200 --> 01:09:32,370 check out that paper by [INAUDIBLE], [INAUDIBLE], 1395 01:09:32,370 --> 01:09:33,859 and [INAUDIBLE]. 1396 01:09:38,290 --> 01:09:38,810 All right. 1397 01:09:38,810 --> 01:09:40,570 That's it.