1 00:00:01,280 --> 00:00:04,110 PROFESSOR: So now we start on another topic in graph theory, 2 00:00:04,110 --> 00:00:06,897 namely the topic of simple graphs. 3 00:00:06,897 --> 00:00:08,980 So last week we were talking about directed graphs 4 00:00:08,980 --> 00:00:10,880 where the arrows have a beginning and an end, 5 00:00:10,880 --> 00:00:12,530 as shown here. 6 00:00:12,530 --> 00:00:15,570 But simple graphs are simpler. 7 00:00:15,570 --> 00:00:17,890 The edges don't have direction. 8 00:00:17,890 --> 00:00:23,460 They just correspond to a mutual connection, which is symmetric. 9 00:00:23,460 --> 00:00:25,130 So there's a picture of a simple graph, 10 00:00:25,130 --> 00:00:29,240 and the edges are shown without an arrowhead. 11 00:00:29,240 --> 00:00:30,820 A special thing about directed graphs 12 00:00:30,820 --> 00:00:33,280 is that it's possible to have an arrow going 13 00:00:33,280 --> 00:00:35,560 in each direction between two vertices. 14 00:00:35,560 --> 00:00:37,930 But when we have undirected edges like this, 15 00:00:37,930 --> 00:00:38,820 that doesn't happen. 16 00:00:38,820 --> 00:00:41,920 So there's only one edge between a pair 17 00:00:41,920 --> 00:00:43,740 of vertices in a simple graph. 18 00:00:43,740 --> 00:00:45,450 In addition, a directed graph might 19 00:00:45,450 --> 00:00:49,290 have a self loop, an edge that starts and begins 20 00:00:49,290 --> 00:00:51,790 at the-- starts and ends at the same vertex. 21 00:00:51,790 --> 00:00:54,470 And those are also disallowed in simple graphs. 22 00:00:54,470 --> 00:00:56,112 Now, you could allow those things. 23 00:00:56,112 --> 00:00:57,570 There's a thing called multi-graphs 24 00:00:57,570 --> 00:01:00,870 where there are multiple edges between vertices. 25 00:01:00,870 --> 00:01:03,420 And there could also be self loops, but we don't need those. 26 00:01:03,420 --> 00:01:05,069 Let's not complicate matters. 27 00:01:05,069 --> 00:01:07,380 We're talking about simple graphs. 28 00:01:07,380 --> 00:01:10,060 OK, so the formal definition of a simple graph 29 00:01:10,060 --> 00:01:13,540 is that it's an object G that has a bunch of parts. 30 00:01:13,540 --> 00:01:19,750 Namely it has a nonempty set, v of vertices, 31 00:01:19,750 --> 00:01:21,410 just like directed graphs. 32 00:01:21,410 --> 00:01:25,410 And it has a set E of edges, but the edges now 33 00:01:25,410 --> 00:01:27,840 are somewhat different since they don't have beginnings 34 00:01:27,840 --> 00:01:28,360 and ends. 35 00:01:28,360 --> 00:01:32,080 An edge just has two endpoints that are in V, 36 00:01:32,080 --> 00:01:33,820 and we don't distinguish the endpoints. 37 00:01:33,820 --> 00:01:35,700 So let's just draw a picture. 38 00:01:35,700 --> 00:01:39,060 Here's a case where there are six vertices V shown in blue, 39 00:01:39,060 --> 00:01:41,790 and there are these undirected edges shown in green. 40 00:01:41,790 --> 00:01:47,540 In this case I see seven edges in E. 41 00:01:47,540 --> 00:01:50,500 There is an example of an edge that goes between two vertices 42 00:01:50,500 --> 00:01:52,430 that I've highlighted in yellow and red. 43 00:01:52,430 --> 00:01:55,520 And we've made that particular edge dark green. 44 00:01:55,520 --> 00:01:58,180 An edge like that can formally be represented 45 00:01:58,180 --> 00:02:01,570 as just the set of its end points, a set of two things, 46 00:02:01,570 --> 00:02:03,490 red and yellow. 47 00:02:03,490 --> 00:02:08,169 In text, we'll often indicate it as the two vertices connected 48 00:02:08,169 --> 00:02:09,669 by a horizontal bar. 49 00:02:09,669 --> 00:02:12,220 But you have to remember that the order in which 50 00:02:12,220 --> 00:02:13,940 the red and the yellow occur don't 51 00:02:13,940 --> 00:02:16,010 matter, because it's really a set consisting 52 00:02:16,010 --> 00:02:17,580 of red and yellow. 53 00:02:17,580 --> 00:02:20,170 When two vertices are connected by an edge, 54 00:02:20,170 --> 00:02:22,250 they are said to be adjacent. 55 00:02:22,250 --> 00:02:25,290 And the edge is said to be incident 56 00:02:25,290 --> 00:02:30,080 to its end points, just a little vocabulary that we use here. 57 00:02:30,080 --> 00:02:32,754 A basic concept in graph theory, which 58 00:02:32,754 --> 00:02:34,420 is what we're going to make a little bit 59 00:02:34,420 --> 00:02:38,680 of in this video segment, is the idea of the degree of a vertex. 60 00:02:38,680 --> 00:02:41,750 The degree of a vertex is simply the number of incident edges, 61 00:02:41,750 --> 00:02:43,630 the number of edges, that touch it, 62 00:02:43,630 --> 00:02:45,770 the number of edges for which its an end point. 63 00:02:45,770 --> 00:02:49,130 So let's look at the red vertex. 64 00:02:49,130 --> 00:02:51,970 There are two edges incident to the red vertex, 65 00:02:51,970 --> 00:02:53,480 so its degree is 2. 66 00:02:53,480 --> 00:02:56,230 OK, let's look at the yellow vertex. 67 00:02:56,230 --> 00:02:59,710 Here there are four edges incident to the yellow vertex, 68 00:02:59,710 --> 00:03:01,880 so its degree is 4. 69 00:03:01,880 --> 00:03:04,100 No surprises yet. 70 00:03:04,100 --> 00:03:09,350 So let's examine some properties of vertex degrees 71 00:03:09,350 --> 00:03:11,240 that are motivated by a simple example. 72 00:03:11,240 --> 00:03:13,030 Suppose I asked the question, is it 73 00:03:13,030 --> 00:03:17,920 possible to have a graph with vertex degrees of 2, 2, and 1. 74 00:03:17,920 --> 00:03:21,040 So implicitly it's a 3 vertex graph. 75 00:03:21,040 --> 00:03:24,110 And one vertex has degree 2, another has degree 2, 76 00:03:24,110 --> 00:03:25,090 and one has degree 1. 77 00:03:25,090 --> 00:03:27,250 Well let's see what it looks like. 78 00:03:27,250 --> 00:03:29,329 If I'm going to have a vertex of degree 1, 79 00:03:29,329 --> 00:03:30,620 then I know what it looks like. 80 00:03:30,620 --> 00:03:31,430 There's the vertex. 81 00:03:31,430 --> 00:03:32,690 It's got one edge out of it. 82 00:03:32,690 --> 00:03:34,570 It's going to some other vertex. 83 00:03:34,570 --> 00:03:37,340 Now this other vertex must have degree 2, 84 00:03:37,340 --> 00:03:40,770 so it's connected to something else. 85 00:03:40,770 --> 00:03:43,600 And the something else must be another vertex with degree 2, 86 00:03:43,600 --> 00:03:46,440 because those are the only possible spectrum of degrees, 87 00:03:46,440 --> 00:03:47,550 1, 2, and 2. 88 00:03:47,550 --> 00:03:51,120 And that means that this last guy 89 00:03:51,120 --> 00:03:53,910 has to have an edge out of it, because it's degree 2. 90 00:03:53,910 --> 00:03:56,910 And it can't go back to 2, because there's already 91 00:03:56,910 --> 00:03:58,060 an edge between these two. 92 00:03:58,060 --> 00:04:00,340 And it can't go back to 1, because that has degree 1. 93 00:04:00,340 --> 00:04:01,360 So we're stuck. 94 00:04:01,360 --> 00:04:04,020 And by this ad hoc reasoning we figured out 95 00:04:04,020 --> 00:04:07,780 that there can't be a degree 3 graph with this spectrum 96 00:04:07,780 --> 00:04:09,710 of degrees 2, 2, 1. 97 00:04:09,710 --> 00:04:11,270 It's impossible. 98 00:04:11,270 --> 00:04:13,770 Well, we could have reasoned more generally. 99 00:04:13,770 --> 00:04:17,040 And there's a very elementary property of degrees 100 00:04:17,040 --> 00:04:19,600 that we're going to actually make something of in a minute. 101 00:04:19,600 --> 00:04:21,440 And it's called the handshaking lemma. 102 00:04:21,440 --> 00:04:24,190 It says that the sum of the degrees 103 00:04:24,190 --> 00:04:27,250 summed over all the vertices is equal to twice 104 00:04:27,250 --> 00:04:28,750 the number of edges. 105 00:04:28,750 --> 00:04:30,660 There it is written as a formula, 106 00:04:30,660 --> 00:04:32,610 twice the number of edges. 107 00:04:32,610 --> 00:04:34,410 So that's the cardinality symbol. 108 00:04:34,410 --> 00:04:36,910 Absolute value of a set means the size of the set. 109 00:04:36,910 --> 00:04:38,350 E here is finite. 110 00:04:38,350 --> 00:04:40,250 Twice the number of edges is equal to the sum 111 00:04:40,250 --> 00:04:43,800 over all the vertices of the degree of the vertices. 112 00:04:43,800 --> 00:04:45,360 Why is that true? 113 00:04:45,360 --> 00:04:48,270 Well, if you think about it, in the sum on the right, 114 00:04:48,270 --> 00:04:52,830 every edge is counted twice, once for each vertex 115 00:04:52,830 --> 00:04:54,520 that it's the end of. 116 00:04:54,520 --> 00:04:56,590 So we're really just count-- and this 117 00:04:56,590 --> 00:05:00,830 is a way of summing up over all of the vertices 118 00:05:00,830 --> 00:05:02,950 in which each vertex gets numerated twice. 119 00:05:02,950 --> 00:05:05,260 So the sum is twice the number of vertices. 120 00:05:05,260 --> 00:05:08,220 And the proof is trivial, but let's make something of this. 121 00:05:08,220 --> 00:05:10,760 You might wonder why it's called the handshaking lemma. 122 00:05:10,760 --> 00:05:13,300 That will emerge in some problems 123 00:05:13,300 --> 00:05:15,110 that we're going to have you do. 124 00:05:15,110 --> 00:05:17,430 But let's go on and apply the handshaking lemma 125 00:05:17,430 --> 00:05:18,640 in an interesting way. 126 00:05:18,640 --> 00:05:24,020 And by the way, of course, since 2 plus 2 plus 1 is odd, 127 00:05:24,020 --> 00:05:27,180 we could have without that ad hoc analysis figured out 128 00:05:27,180 --> 00:05:30,530 that the sum of the degrees can't be odd, 129 00:05:30,530 --> 00:05:32,449 because it's twice something. 130 00:05:32,449 --> 00:05:33,990 All right, so here's the applications 131 00:05:33,990 --> 00:05:35,530 designed to get your attention. 132 00:05:35,530 --> 00:05:39,050 It is an application of graph theory to sex. 133 00:05:39,050 --> 00:05:43,140 And we ask the question, are men more promiscuous than women? 134 00:05:43,140 --> 00:05:45,100 And there have been repeated studies 135 00:05:45,100 --> 00:05:48,720 that are cited in the notes that show again and again that when 136 00:05:48,720 --> 00:05:50,700 they survey collections of men and women 137 00:05:50,700 --> 00:05:53,770 and ask them how many sexual partners they have, 138 00:05:53,770 --> 00:05:56,970 it's consistently the case that the men are assessed to have 139 00:05:56,970 --> 00:06:00,720 30% more, 75% more, sometimes 2 and 1/2 times, 140 00:06:00,720 --> 00:06:04,000 3 times as many partners as the women. 141 00:06:04,000 --> 00:06:06,600 And there's got to be something wacky about this. 142 00:06:06,600 --> 00:06:08,920 We're going to come up with a very elementary graph 143 00:06:08,920 --> 00:06:14,620 theoretic argument that says that this is complete nonsense. 144 00:06:14,620 --> 00:06:17,950 By the way, the most recent study that we could find 145 00:06:17,950 --> 00:06:21,370 was one that's mentioned in the notes in 2007 146 00:06:21,370 --> 00:06:23,970 by the US Department of Health. 147 00:06:23,970 --> 00:06:26,060 And the statistician who collected the data 148 00:06:26,060 --> 00:06:28,480 knew that the results were impossible. 149 00:06:28,480 --> 00:06:30,820 But her job was to report the data, not 150 00:06:30,820 --> 00:06:32,840 to explain it or interpret it. 151 00:06:32,840 --> 00:06:38,390 And the men reported 30% more partners than the women. 152 00:06:38,390 --> 00:06:41,217 And we're going to show that somebody is lying. 153 00:06:41,217 --> 00:06:42,550 Here's how we're going to do it. 154 00:06:42,550 --> 00:06:47,260 We're going to model the relationships between men 155 00:06:47,260 --> 00:06:49,910 and women by having a graph that comes in two parts. 156 00:06:49,910 --> 00:06:52,620 It's going to be called a so-called bipartite graph. 157 00:06:52,620 --> 00:06:55,080 So there's going to be one set of vertices called 158 00:06:55,080 --> 00:06:58,720 M and another set of vertices called F. M for men 159 00:06:58,720 --> 00:07:01,830 and f for women, or females. 160 00:07:01,830 --> 00:07:06,920 And we're going to have edges going between men and women, 161 00:07:06,920 --> 00:07:10,770 between M's and F's, precisely when they have been involved 162 00:07:10,770 --> 00:07:12,920 in a sexual liaison. 163 00:07:12,920 --> 00:07:21,430 So looking back at this graph, this edge from that blue M 164 00:07:21,430 --> 00:07:28,240 to that orange F indicates that they had a sexual liaison. 165 00:07:28,240 --> 00:07:30,360 They were partners. 166 00:07:30,360 --> 00:07:33,250 OK, so this is a simple graph structure 167 00:07:33,250 --> 00:07:36,370 that we can use to represent who got together 168 00:07:36,370 --> 00:07:41,700 with whom in any given population of men and women. 169 00:07:41,700 --> 00:07:44,060 Now, if you think about the same argument 170 00:07:44,060 --> 00:07:49,660 that we use for handshaking, if you sum the degrees of the men, 171 00:07:49,660 --> 00:07:52,890 you're counting each edge exactly once. 172 00:07:52,890 --> 00:07:55,440 And so the sum of the degrees of the men 173 00:07:55,440 --> 00:07:57,730 is equal to the number of edges in this graph. 174 00:07:57,730 --> 00:08:01,760 And likewise, if you sum over the females, 175 00:08:01,760 --> 00:08:04,300 you're counting each edge once. 176 00:08:04,300 --> 00:08:06,640 And so the sum of the female degrees 177 00:08:06,640 --> 00:08:08,410 is also equal to the number of edges. 178 00:08:08,410 --> 00:08:12,310 In particular, the sum over the degrees of the males 179 00:08:12,310 --> 00:08:16,540 is equal to the sum over the degrees of the females. 180 00:08:16,540 --> 00:08:18,440 Because every time there's a liaison, 181 00:08:18,440 --> 00:08:20,980 it involves one male, one female. 182 00:08:20,980 --> 00:08:23,160 All right, now let's just do a little bit 183 00:08:23,160 --> 00:08:24,780 of elementary arithmetic. 184 00:08:24,780 --> 00:08:28,140 I'm going to divide both sides of this equality 185 00:08:28,140 --> 00:08:30,740 by the size of the male population, 186 00:08:30,740 --> 00:08:32,370 by the number of men. 187 00:08:32,370 --> 00:08:35,110 And if I do that, I get this formula. 188 00:08:35,110 --> 00:08:37,549 The left hand side is the number of-- is 189 00:08:37,549 --> 00:08:41,010 the sum of the degrees of men divided by the size of the M 190 00:08:41,010 --> 00:08:41,834 population. 191 00:08:41,834 --> 00:08:43,250 And here I'm doing a little trick. 192 00:08:43,250 --> 00:08:45,570 Notice that the F's cancel out. 193 00:08:45,570 --> 00:08:49,730 But I've expressed some of the female degrees divided 194 00:08:49,730 --> 00:08:53,600 by M as the sum of the female degrees divided by F times 195 00:08:53,600 --> 00:08:58,010 this factor F over M, which is the ratio of the populations 196 00:08:58,010 --> 00:09:00,130 of women to men. 197 00:09:00,130 --> 00:09:02,420 Now the reason I'm doing this is that if you 198 00:09:02,420 --> 00:09:04,410 look at this thing on the left, this is 199 00:09:04,410 --> 00:09:07,050 the average degree of the men. 200 00:09:07,050 --> 00:09:10,290 This is the sum of all the degrees of men 201 00:09:10,290 --> 00:09:11,640 divided by the number of men. 202 00:09:11,640 --> 00:09:15,790 So it's the average number of partners that men have. 203 00:09:15,790 --> 00:09:18,700 And likewise, now you can recognize over here 204 00:09:18,700 --> 00:09:22,120 that I've got the average number of partners 205 00:09:22,120 --> 00:09:24,470 that each woman has. 206 00:09:24,470 --> 00:09:26,550 And what we've just figured out then 207 00:09:26,550 --> 00:09:29,140 is that there's a fixed relationship 208 00:09:29,140 --> 00:09:32,600 between the average number of partners of men, 209 00:09:32,600 --> 00:09:34,810 the average degree of the M vertices 210 00:09:34,810 --> 00:09:37,270 and the average degree of the F vertices. 211 00:09:37,270 --> 00:09:41,090 And these two average degree, these average numbers 212 00:09:41,090 --> 00:09:44,170 of partners, is simply related by the ratio 213 00:09:44,170 --> 00:09:45,830 of the populations. 214 00:09:45,830 --> 00:09:48,510 The men degree is the female population 215 00:09:48,510 --> 00:09:52,400 divided by the M population times the average degree 216 00:09:52,400 --> 00:09:53,930 of the females. 217 00:09:53,930 --> 00:10:00,680 Now, what this tells us is that these wild figures 218 00:10:00,680 --> 00:10:04,240 of twice as many, and 30% more, and so on 219 00:10:04,240 --> 00:10:05,160 are completely absurd. 220 00:10:05,160 --> 00:10:08,110 Because we know a lot about the ratio of females 221 00:10:08,110 --> 00:10:09,270 to males in the population. 222 00:10:09,270 --> 00:10:11,560 As a matter of fact, in the US overall 223 00:10:11,560 --> 00:10:14,920 there are slightly more women than men. 224 00:10:14,920 --> 00:10:20,740 There is 1.035 women for each man in the US population. 225 00:10:20,740 --> 00:10:23,950 And that tells us then that if you surveyed 226 00:10:23,950 --> 00:10:26,660 the population of all the men and women in the country, 227 00:10:26,660 --> 00:10:31,730 you would discover that the men looked 3 and 1/2 percent more-- 228 00:10:31,730 --> 00:10:36,990 had 3 and 1/2 percent more partners than women per man. 229 00:10:36,990 --> 00:10:40,500 But this has nothing to do with their behavior, or promiscuity, 230 00:10:40,500 --> 00:10:41,550 or lack of it. 231 00:10:41,550 --> 00:10:45,844 It's simply a reflection of the ratio of the populations. 232 00:10:48,460 --> 00:10:50,770 Which gets us to the question of, 233 00:10:50,770 --> 00:10:53,210 where do these crazy numbers come from? 234 00:10:53,210 --> 00:10:58,710 And the answer seems to be that people are lying. 235 00:10:58,710 --> 00:11:00,810 One explanation would be that men exaggerate 236 00:11:00,810 --> 00:11:02,779 their number of partners, and women understate 237 00:11:02,779 --> 00:11:03,820 their number of partners. 238 00:11:03,820 --> 00:11:05,930 But the truth is that nobody knows 239 00:11:05,930 --> 00:11:10,750 exactly why we get these consistently false numbers. 240 00:11:10,750 --> 00:11:14,710 But we do get them consistently in one survey after another. 241 00:11:14,710 --> 00:11:18,400 You will no longer be fooled by such nonsense.