1 00:00:00,800 --> 00:00:03,530 PROFESSOR: Hi, have you ever found it particularly difficult to find 2 00:00:03,530 --> 00:00:05,484 a specific item in your house? 3 00:00:05,484 --> 00:00:07,400 Let's say you're looking for a pair of gloves, 4 00:00:07,400 --> 00:00:11,160 but you just can't find it, and you spend your entire afternoon 5 00:00:11,160 --> 00:00:12,660 looking for it. 6 00:00:12,660 --> 00:00:15,060 Well, you've just encountered the same problem 7 00:00:15,060 --> 00:00:17,840 that companies like Google or Microsoft 8 00:00:17,840 --> 00:00:20,650 encounter every single day. 9 00:00:20,650 --> 00:00:22,910 And that's the problem of search. 10 00:00:22,910 --> 00:00:25,750 Just like a house which stores thousands of different items, 11 00:00:25,750 --> 00:00:30,390 Google stores 45 billion indexed pages of information. 12 00:00:30,390 --> 00:00:32,640 If every page were a sheet of paper 13 00:00:32,640 --> 00:00:37,760 and we stacked them up real high, we'd create a tower 600 times 14 00:00:37,760 --> 00:00:39,560 taller than Mount Everest. 15 00:00:39,560 --> 00:00:41,820 Well, how can Google find my results 16 00:00:41,820 --> 00:00:44,090 so quickly when I find it so difficult 17 00:00:44,090 --> 00:00:46,880 to find a pair of gloves? 18 00:00:46,880 --> 00:00:48,460 Well, searching on Google is kind 19 00:00:48,460 --> 00:00:52,870 of like looking for a person in a big school. 20 00:00:52,870 --> 00:00:55,770 Let's say you're looking for James in a row of classrooms. 21 00:00:55,770 --> 00:00:57,380 One of the easiest methods would be 22 00:00:57,380 --> 00:01:02,760 to go to every classroom in turn, starting with the one nearest to you, until you find James. 23 00:01:02,760 --> 00:01:06,940 There's a better method called binary search. 24 00:01:06,940 --> 00:01:10,030 Now, let's say the students were arranged from A to Z, 25 00:01:10,030 --> 00:01:12,200 one in each classroom.
26 00:01:12,200 --> 00:01:13,720 We would then go to the middle room 27 00:01:13,720 --> 00:01:16,680 first and see if James is there. 28 00:01:16,680 --> 00:01:18,340 And if James isn't there, we will look 29 00:01:18,340 --> 00:01:21,340 at the first letter of the name we find there, and if that letter comes 30 00:01:21,340 --> 00:01:26,670 before J, we head to the right; if not, we head to the left. 31 00:01:26,670 --> 00:01:28,470 We would then approach the middle room 32 00:01:28,470 --> 00:01:30,790 in the newly sectioned area. 33 00:01:30,790 --> 00:01:34,460 We will repeat the process over and over again 34 00:01:34,460 --> 00:01:36,980 until we eventually find James. 35 00:01:36,980 --> 00:01:40,940 Now, this finds James just like the first method, but at a much, 36 00:01:40,940 --> 00:01:43,350 much faster rate. 37 00:01:43,350 --> 00:01:45,320 How much faster would that be? 38 00:01:45,320 --> 00:01:48,540 Well, that depends on the number of students in the school. 39 00:01:48,540 --> 00:01:51,670 Let's say there are 500 students and we're looking for one. 40 00:01:51,670 --> 00:01:54,400 It will take about 80 minutes with the first method, 41 00:01:54,400 --> 00:01:57,650 but one and a half minutes with binary search. 42 00:01:57,650 --> 00:02:00,350 But let's say there are 1,000 students in the school. 43 00:02:00,350 --> 00:02:04,080 It would take 160 minutes with the first method, 44 00:02:04,080 --> 00:02:07,965 but 1.6 minutes with the second method. 45 00:02:07,965 --> 00:02:10,310 Now that's a whole lot of difference. 46 00:02:10,310 --> 00:02:12,080 So a name is just a word. 47 00:02:12,080 --> 00:02:15,640 But Google searches a combination of words, 48 00:02:15,640 --> 00:02:18,690 making it a little bit more complicated.
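The two ways of finding James described above can be sketched in Python to see where the quoted times might come from. This is only an illustration under stated assumptions: the names `linear_search`, `binary_search`, `make_rooms`, and `worst_case_visits` are hypothetical, the student names are placeholders, and the figure of 0.16 minutes per room is inferred from "80 minutes for 500 students" rather than stated in the lecture.

```python
def linear_search(rooms, target):
    """Walk the rooms left to right; return (index, rooms_visited)."""
    for visited, name in enumerate(rooms, start=1):
        if name == target:
            return visited - 1, visited
    return -1, len(rooms)


def binary_search(rooms, target):
    """Search a sorted row of rooms by halving; return (index, rooms_visited)."""
    low, high = 0, len(rooms) - 1
    visited = 0
    while low <= high:
        mid = (low + high) // 2
        visited += 1
        if rooms[mid] == target:
            return mid, visited
        if rooms[mid] < target:
            low = mid + 1   # occupant sorts before the target: head right
        else:
            high = mid - 1  # occupant sorts after the target: head left
    return -1, visited


def make_rooms(n):
    """Hypothetical school: n uniquely named students, already in sorted order."""
    return [f"student{i:04d}" for i in range(n)]


def worst_case_visits(n):
    """Rooms visited by each method when the target sits in the last room."""
    rooms = make_rooms(n)
    target = rooms[-1]
    return linear_search(rooms, target)[1], binary_search(rooms, target)[1]


MINUTES_PER_ROOM = 0.16  # assumption: 80 minutes / 500 rooms

for n in (500, 1000):
    linear_visits, binary_visits = worst_case_visits(n)
    print(
        f"{n} students: linear {linear_visits} rooms "
        f"(~{linear_visits * MINUTES_PER_ROOM:.1f} min), "
        f"binary {binary_visits} rooms "
        f"(~{binary_visits * MINUTES_PER_ROOM:.1f} min)"
    )
```

Under these assumptions, doubling the school from 500 to 1,000 students doubles the rooms visited by the first method but adds only one room for binary search (9 versus 10), which is roughly consistent with the one and a half and 1.6 minutes quoted above.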
49 00:02:18,690 --> 00:02:21,510 So just like how we identified the first letter 50 00:02:21,510 --> 00:02:23,660 of each name, Google 51 00:02:23,660 --> 00:02:27,900 identifies 200 unique factors, making your search 52 00:02:27,900 --> 00:02:29,610 faster. 53 00:02:29,610 --> 00:02:32,390 Well, if you recall, the effectiveness of binary search 54 00:02:32,390 --> 00:02:35,810 depends on the prearrangement of data. 55 00:02:35,810 --> 00:02:38,110 And that's why computer scientists are actively 56 00:02:38,110 --> 00:02:42,210 looking for ways to sort, manage, and eventually retrieve 57 00:02:42,210 --> 00:02:45,220 data faster and better. 58 00:02:45,220 --> 00:02:48,990 In the same way, the TV remote goes near the TV, 59 00:02:48,990 --> 00:02:53,200 the shoes go to the shoe rack, the coats go into the cupboard, 60 00:02:53,200 --> 00:02:56,230 and the winter gloves go into the winter jacket. 61 00:02:56,230 --> 00:03:00,000 Aha, so that's where my gloves are.
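The point that binary search depends on prearranged data can be illustrated with a small sketch. It uses `bisect_left` from Python's standard `bisect` module; the helper `find_sorted` and the list of names are hypothetical and chosen only for the example.

```python
from bisect import bisect_left


def find_sorted(names, target):
    """Binary search via bisect_left; only reliable when names is sorted."""
    i = bisect_left(names, target)
    if i < len(names) and names[i] == target:
        return i
    return -1


# Hypothetical students, one per classroom, in no particular order.
unsorted_names = ["Priya", "James", "Zoe", "Aaron", "Mei"]

# James is in the list, but without sorting the search looks in the wrong half.
print(find_sorted(unsorted_names, "James"))

# After arranging the students from A to Z, the same search finds him.
sorted_names = sorted(unsorted_names)
print(find_sorted(sorted_names, "James"))
```

With this particular list, the first call reports that James is missing even though he is present, while the second call finds him once the names are sorted.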