1 00:00:04,660 --> 00:00:06,370 The problem that we have studied so far 2 00:00:06,370 --> 00:00:09,380 captures the essential features of the AdWords problem, 3 00:00:09,380 --> 00:00:11,710 but it can be extended in several ways. 4 00:00:11,710 --> 00:00:13,500 We will shortly talk in some more detail 5 00:00:13,500 --> 00:00:16,050 about two of these, which relate to the idea of slates 6 00:00:16,050 --> 00:00:21,090 or positions, and which relate to the idea of personalization. 7 00:00:21,090 --> 00:00:22,680 Aside from these two extensions, there 8 00:00:22,680 --> 00:00:24,340 are also many other issues that Google 9 00:00:24,340 --> 00:00:26,440 deals with on a daily basis. 10 00:00:26,440 --> 00:00:29,620 One of these is related to click-through-rates. 11 00:00:29,620 --> 00:00:32,150 In particular, how does Google know the chance 12 00:00:32,150 --> 00:00:34,550 that a user clicks on a given ad? 13 00:00:34,550 --> 00:00:37,800 Google does this by analyzing large amounts of user data 14 00:00:37,800 --> 00:00:40,190 and building predictive models, similar to those 15 00:00:40,190 --> 00:00:43,630 we studied in this class, to predict how often users click 16 00:00:43,630 --> 00:00:47,390 on different ads when they're shown with different queries. 17 00:00:47,390 --> 00:00:50,060 Another issue is related to the advertisers. 18 00:00:50,060 --> 00:00:52,670 We saw earlier that the price-per-click depends 19 00:00:52,670 --> 00:00:56,810 on how the advertisers place their bids. 20 00:00:56,810 --> 00:00:58,930 So understanding the behavior of advertisers 21 00:00:58,930 --> 00:01:01,650 and incorporating this behavior in the optimization model 22 00:01:01,650 --> 00:01:04,790 is also another important consideration. 23 00:01:04,790 --> 00:01:08,650 Let's move on to discuss the idea of slates. 24 00:01:08,650 --> 00:01:11,950 In our example with AT&T, T-Mobile, and Verizon, 25 00:01:11,950 --> 00:01:14,380 we assume that the search page for each query 26 00:01:14,380 --> 00:01:16,420 has space for only one ad. 27 00:01:16,420 --> 00:01:18,170 Now typically, as we saw in Video 1 28 00:01:18,170 --> 00:01:20,410 when we searched for Nine Inch Nails tickets, 29 00:01:20,410 --> 00:01:24,660 there's usually space for many ads. 30 00:01:24,660 --> 00:01:26,520 In this case, Google has to decide 31 00:01:26,520 --> 00:01:29,310 which combination of ads, or slate, 32 00:01:29,310 --> 00:01:32,110 to display with each query. 33 00:01:32,110 --> 00:01:34,570 Although this would seem to be a more complicated problem, 34 00:01:34,570 --> 00:01:37,539 it can still be solved using linear optimization. 35 00:01:37,539 --> 00:01:40,140 Before, our variables were defined 36 00:01:40,140 --> 00:01:47,500 as x of a given advertiser and a given query. 37 00:01:47,500 --> 00:01:49,380 But now, we would instead define them 38 00:01:49,380 --> 00:01:55,150 as x for a given slate and a given query. 39 00:01:55,150 --> 00:01:58,550 So for example, for our wireless service provider example, 40 00:01:58,550 --> 00:02:02,500 if we had two spaces on our results page, then for query 1, 41 00:02:02,500 --> 00:02:11,130 we'd still have x_A1, x_T1, and x_V1, where here, for example, 42 00:02:11,130 --> 00:02:13,910 x_V1 is the number of times that we 43 00:02:13,910 --> 00:02:17,150 display Verizon with query 1. 44 00:02:17,150 --> 00:02:26,370 But we would also have x_AT1, x_AV1, and x_TV1. 45 00:02:26,370 --> 00:02:30,040 Here, for example, x_AV1, represents the number of times 46 00:02:30,040 --> 00:02:33,790 that we display the slate containing AT&T and Verizon 47 00:02:33,790 --> 00:02:35,750 with query 1. 48 00:02:35,750 --> 00:02:37,760 Now, this can become even more complicated 49 00:02:37,760 --> 00:02:40,950 as the position of the ad within the slate is important. 50 00:02:40,950 --> 00:02:43,690 For example, ads to the right of the search results 51 00:02:43,690 --> 00:02:46,390 might not attract as many clicks as those above the search 52 00:02:46,390 --> 00:02:47,340 results. 53 00:02:47,340 --> 00:02:54,620 In this case, we would also consider x_TA1, x_VA1, 54 00:02:54,620 --> 00:02:55,329 and x_VT1. 55 00:02:57,910 --> 00:03:01,580 And here, the first ad in the combination 56 00:03:01,580 --> 00:03:03,540 is the ad that is placed in the first position. 57 00:03:03,540 --> 00:03:05,560 So, for example, here T-Mobile is 58 00:03:05,560 --> 00:03:08,080 placed in the first position for x_TA1 59 00:03:08,080 --> 00:03:11,040 and AT&T is placed in the second position. 60 00:03:13,580 --> 00:03:15,650 We would formulate our objective and our budget 61 00:03:15,650 --> 00:03:18,110 and query constraints in the same way as before, 62 00:03:18,110 --> 00:03:21,140 but making sure that slates that contain a certain advertiser 63 00:03:21,140 --> 00:03:23,570 use up that advertiser's budget. 64 00:03:23,570 --> 00:03:25,370 And slates in a given query counts 65 00:03:25,370 --> 00:03:29,170 towards that query's estimated number of requests. 66 00:03:29,170 --> 00:03:33,140 Let's now discuss the idea of personalization. 67 00:03:33,140 --> 00:03:36,329 In addition to the query, Google can use other information 68 00:03:36,329 --> 00:03:38,300 to decide which ad to display. 69 00:03:38,300 --> 00:03:41,220 For example, Google might know the geographic location 70 00:03:41,220 --> 00:03:45,200 of the user as determined from their IP address. 71 00:03:45,200 --> 00:03:46,910 Google might also know other information, 72 00:03:46,910 --> 00:03:50,060 such as different Google searches that the user has 73 00:03:50,060 --> 00:03:54,530 conducted, and browser activity on Google's website. 74 00:03:54,530 --> 00:03:57,600 The question then is, how can Google take this into account 75 00:03:57,600 --> 00:04:00,840 when deciding which ads display for which queries? 76 00:04:00,840 --> 00:04:03,820 Well, just like slates, we could also incorporate this 77 00:04:03,820 --> 00:04:06,220 into our linear optimization model. 78 00:04:06,220 --> 00:04:07,940 Rather than working with queries, 79 00:04:07,940 --> 00:04:11,390 we would work with combinations of queries and user profiles. 80 00:04:11,390 --> 00:04:14,160 So rather than having x defined for a given 81 00:04:14,160 --> 00:04:18,980 advertiser and a given query, we would 82 00:04:18,980 --> 00:04:25,920 define x for a given advertiser, a given query, and a given user 83 00:04:25,920 --> 00:04:26,420 profile. 84 00:04:29,530 --> 00:04:32,670 So here, a user profile just describes the type of user. 85 00:04:32,670 --> 00:04:36,390 For instance, a profile might be men aged 20 to 25 86 00:04:36,390 --> 00:04:38,350 who live in Boston, Massachusetts in the United 87 00:04:38,350 --> 00:04:39,540 States. 88 00:04:39,540 --> 00:04:47,100 If we had three user profiles, we could name them P1, P2, P3. 89 00:04:47,100 --> 00:04:50,550 And then for AT&T for query 1, we 90 00:04:50,550 --> 00:04:56,310 would use x_A1P1 to denote the number of times 91 00:04:56,310 --> 00:04:59,830 that we display AT&T's ad with query 1 92 00:04:59,830 --> 00:05:03,950 for a user of profile P1. 93 00:05:03,950 --> 00:05:07,640 The rest of the model could then be easily accommodated 94 00:05:07,640 --> 00:05:09,470 for this new type of modeling consideration. 95 00:05:14,100 --> 00:05:15,960 We'll just now summarize the salient points 96 00:05:15,960 --> 00:05:17,650 of this recitation. 97 00:05:17,650 --> 00:05:19,760 So, so far, we've studied a small instance 98 00:05:19,760 --> 00:05:21,740 of Google's ad allocation problem, where 99 00:05:21,740 --> 00:05:26,050 we had just three advertisers or bidders and three queries. 100 00:05:26,050 --> 00:05:29,450 We saw how an optimization solution increases revenue 101 00:05:29,450 --> 00:05:34,690 by 16% over a simple common sense solution. 102 00:05:34,690 --> 00:05:37,850 What I'd like you to remember is that in reality, this problem 103 00:05:37,850 --> 00:05:40,930 is much larger. 104 00:05:40,930 --> 00:05:43,970 For each query that Google receives on its search engine, 105 00:05:43,970 --> 00:05:45,600 there may be hundreds to thousands 106 00:05:45,600 --> 00:05:47,580 of advertisers bidding on it. 107 00:05:47,580 --> 00:05:50,270 In terms of dollar amounts, in 2012, 108 00:05:50,270 --> 00:05:52,409 Google's total revenue from advertising 109 00:05:52,409 --> 00:05:55,190 was over $40 billion. 110 00:05:55,190 --> 00:05:58,270 At this scale, the gains that are possible from optimization 111 00:05:58,270 --> 00:06:01,040 become enormous, and I hope this convinces you 112 00:06:01,040 --> 00:06:03,480 of the value of linear optimization 113 00:06:03,480 --> 00:06:05,330 in online advertising.