1 00:00:00,070 --> 00:00:01,780 The following content is provided 2 00:00:01,780 --> 00:00:04,019 under a Creative Commons License. 3 00:00:04,019 --> 00:00:06,870 Your support will help MIT OpenCourseWare continue 4 00:00:06,870 --> 00:00:10,730 to offer high quality educational resources for free. 5 00:00:10,730 --> 00:00:13,330 To make a donation or view additional materials 6 00:00:13,330 --> 00:00:17,240 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,240 --> 00:00:17,865 at ocw.mit.edu. 8 00:00:28,370 --> 00:00:30,420 SPEAKER 1: Welcome to this Vensim tutorial. 9 00:00:30,420 --> 00:00:33,770 As a group of students engaged in the system dynamics seminar 10 00:00:33,770 --> 00:00:36,350 at MIT Sloan, we will present how 11 00:00:36,350 --> 00:00:38,900 to estimate model parameters and the confidence interval 12 00:00:38,900 --> 00:00:45,220 in a dynamic model using Maximum Likelihood Estimation, MLE, 13 00:00:45,220 --> 00:00:48,900 and the Likelihood Ratio, LR, method. 14 00:00:48,900 --> 00:00:52,455 First, the basics of MLE are described, as well as 15 00:00:52,455 --> 00:00:55,570 the advantages and underlying assumptions. 16 00:00:55,570 --> 00:00:58,420 Next, the different methods available for finding 17 00:00:58,420 --> 00:01:00,940 confidence intervals of the estimated parameters 18 00:01:00,940 --> 00:01:02,620 are discussed. 19 00:01:02,620 --> 00:01:06,820 Then, a step-by-step guide to parameter estimation 20 00:01:06,820 --> 00:01:09,900 using MLE-- and assessment of the uncertainty 21 00:01:09,900 --> 00:01:14,080 around parameter estimates using the Univariate Likelihood Ratio 22 00:01:14,080 --> 00:01:17,960 Method in Vensim-- is provided. 23 00:01:17,960 --> 00:01:21,690 This video is presented to you by William, George, Sergey, 24 00:01:21,690 --> 00:01:25,560 and Jim under the guidance of Professor Hazhir Rahmandad. 
25 00:01:30,340 --> 00:01:33,370 The literature, Struben, Sterman, and Keith, 26 00:01:33,370 --> 00:01:36,335 highlight that estimating model parameters and the uncertainty 27 00:01:36,335 --> 00:01:39,230 of these parameters are central to good dynamic modeling 28 00:01:39,230 --> 00:01:40,530 practice. 29 00:01:40,530 --> 00:01:42,710 Models must be grounded in data if modelers 30 00:01:42,710 --> 00:01:45,900 are to provide reliable advice to policymakers. 31 00:01:45,900 --> 00:01:48,305 Ideally, one should estimate model parameters 32 00:01:48,305 --> 00:01:52,300 using data that are independent of the model behavior. 33 00:01:52,300 --> 00:01:54,940 Often, however, such direct estimation 34 00:01:54,940 --> 00:01:57,690 from independent data is not possible. 35 00:01:57,690 --> 00:02:00,030 In practice, modelers must frequently 36 00:02:00,030 --> 00:02:01,860 estimate at least some model parameters 37 00:02:01,860 --> 00:02:05,219 using historical data itself, finding the set of parameters 38 00:02:05,219 --> 00:02:07,510 that minimize the difference between the historical and 39 00:02:07,510 --> 00:02:09,889 simulated time series. 40 00:02:09,889 --> 00:02:12,360 Only then can estimations of model parameters 41 00:02:12,360 --> 00:02:14,150 and uncertainties of these estimations 42 00:02:14,150 --> 00:02:17,230 serve to test model hypotheses and quantify 43 00:02:17,230 --> 00:02:21,120 important uncertainties, which are crucial for decision-making 44 00:02:21,120 --> 00:02:23,150 based on modeling outcome. 45 00:02:23,150 --> 00:02:26,035 Therefore, when modeling a system in Vensim, 46 00:02:26,035 --> 00:02:28,740 if the purpose of this model involves numerical testing 47 00:02:28,740 --> 00:02:31,640 or projection, robust statistical parameter 48 00:02:31,640 --> 00:02:33,930 estimation is necessary. 
49 00:02:33,930 --> 00:02:36,550 Confidence intervals also serve as an important tool 50 00:02:36,550 --> 00:02:38,575 for decision-making based on modeling outcomes. 51 00:02:43,130 --> 00:02:46,070 Various approaches are available to estimate parameters 52 00:02:46,070 --> 00:02:49,760 and confidence intervals in dynamic models, including, 53 00:02:49,760 --> 00:02:52,680 for estimation, the generalized method of moments 54 00:02:52,680 --> 00:02:56,740 and maximum likelihood, and for confidence intervals, 55 00:02:56,740 --> 00:03:00,340 likelihood-based methods and bootstrapping. 56 00:03:00,340 --> 00:03:03,240 Maximum Likelihood Estimation is becoming increasingly 57 00:03:03,240 --> 00:03:05,310 important for nonlinear models when 58 00:03:05,310 --> 00:03:07,310 estimating nonlinear parameters that 59 00:03:07,310 --> 00:03:10,690 involve non-normal, autocorrelated errors, 60 00:03:10,690 --> 00:03:12,466 and heteroscedasticity. 61 00:03:12,466 --> 00:03:15,550 It is simple to understand and construct, yet 62 00:03:15,550 --> 00:03:17,830 at the same time, requires relatively little 63 00:03:17,830 --> 00:03:20,460 computational power. 64 00:03:20,460 --> 00:03:23,360 MLE is best suited for using historical data 65 00:03:23,360 --> 00:03:25,830 to generate parameter estimates and confidence 66 00:03:25,830 --> 00:03:28,710 intervals as long as errors of estimation 67 00:03:28,710 --> 00:03:31,720 are independent and identically distributed. 68 00:03:31,720 --> 00:03:33,860 When using MLE in complex systems 69 00:03:33,860 --> 00:03:36,150 where these assumptions are violated, 70 00:03:36,150 --> 00:03:38,370 and/or the analytical likelihood function 71 00:03:38,370 --> 00:03:40,890 might be difficult to find, one should 72 00:03:40,890 --> 00:03:43,740 use more advanced methods. 73 00:03:43,740 --> 00:03:47,490 This tutorial will not address these cases. 
74 00:03:47,490 --> 00:03:50,380 As the demonstration will show, the average laptop 75 00:03:50,380 --> 00:03:52,970 in use in 2014 is capable of running the analysis 76 00:03:52,970 --> 00:03:54,310 in a few minutes or less. 77 00:03:59,580 --> 00:04:03,024 This tutorial is based on the following four references-- 78 00:04:03,024 --> 00:04:05,065 "Bootstrapping for confidence interval estimation 79 00:04:05,065 --> 00:04:08,430 and hypothesis testing for parameters of system dynamics 80 00:04:08,430 --> 00:04:12,830 models" by Dogan, "A behavioral approach to feedback loop 81 00:04:12,830 --> 00:04:17,720 dominance analysis" by Ford, "Modeling Managerial 82 00:04:17,720 --> 00:04:22,440 Behavior," also known as "The Beer Game" by Sterman, 83 00:04:22,440 --> 00:04:26,820 and a soon-to-be published text by Struben, Sterman, and Keith, 84 00:04:26,820 --> 00:04:29,580 "Parameter and Confidence Interval Estimation in Dynamic 85 00:04:29,580 --> 00:04:34,470 Models-- Maximum Likelihood and Bootstrapping Methods." 86 00:04:34,470 --> 00:04:37,880 More background and explanation on the theory of MLE 87 00:04:37,880 --> 00:04:39,830 can be found in these works. 88 00:04:39,830 --> 00:04:43,325 This tutorial will focus on the application of MLE in Vensim. 89 00:04:47,420 --> 00:04:50,750 The literature, Struben, Sterman, and Keith, 90 00:04:50,750 --> 00:04:52,560 stresses that modelers must not only 91 00:04:52,560 --> 00:04:55,770 estimate parameter values, but also the uncertainty 92 00:04:55,770 --> 00:04:58,100 in the estimates so they and others can determine 93 00:04:58,100 --> 00:05:00,690 how much confidence to place in the estimates 94 00:05:00,690 --> 00:05:04,240 and select appropriate ranges for sensitivity analysis 95 00:05:04,240 --> 00:05:08,420 in order to assess the robustness of the conclusions. 
96 00:05:08,420 --> 00:05:09,960 Estimating confidence intervals can 97 00:05:09,960 --> 00:05:11,376 be thought of as finding the shape 98 00:05:11,376 --> 00:05:14,470 of the inverted bowl, Figure 2. 99 00:05:14,470 --> 00:05:17,450 If for a given data set, the likelihood function 100 00:05:17,450 --> 00:05:19,645 for a set of parameters falls off very steeply 101 00:05:19,645 --> 00:05:23,382 for even small departures from the best estimate, 102 00:05:23,382 --> 00:05:25,590 then one can have confidence that the true parameters 103 00:05:25,590 --> 00:05:27,685 are close to the estimated value. 104 00:05:27,685 --> 00:05:31,690 As always, assuming the model is correctly specified, 105 00:05:31,690 --> 00:05:35,510 and other maintained hypotheses are satisfied. 106 00:05:35,510 --> 00:05:38,130 If the likelihood falls off only slowly, 107 00:05:38,130 --> 00:05:40,540 other values of the parameters are nearly as likely 108 00:05:40,540 --> 00:05:43,530 as the best estimates and one cannot have much confidence 109 00:05:43,530 --> 00:05:46,080 in the estimated values. 110 00:05:46,080 --> 00:05:48,185 MLE methods provide two major approaches 111 00:05:48,185 --> 00:05:50,495 to constructing confidence intervals or confidence 112 00:05:50,495 --> 00:05:52,210 regions. 113 00:05:52,210 --> 00:05:55,634 The first is the asymptotic method, AM, 114 00:05:55,634 --> 00:05:57,550 which assumes that the likelihood function can 115 00:05:57,550 --> 00:06:01,400 be approximated by a parabola around the estimated parameter, 116 00:06:01,400 --> 00:06:05,140 an assumption that is valid for a very large sample. 117 00:06:05,140 --> 00:06:09,690 The second is the likelihood ratio, LR, method. 118 00:06:09,690 --> 00:06:12,390 The LR is the ratio of the likelihood for a given 119 00:06:12,390 --> 00:06:16,556 set of parameter values to the likelihood for the MLE values. 
120 00:06:16,556 --> 00:06:19,760 The LR method involves searching the actual likelihood surface 121 00:06:19,760 --> 00:06:21,880 to find values of the likelihood function 122 00:06:21,880 --> 00:06:25,080 that yield a particular value for the LR. 123 00:06:25,080 --> 00:06:27,960 That value is derived from the probability distribution 124 00:06:27,960 --> 00:06:29,980 of the LR and the confidence level 125 00:06:29,980 --> 00:06:34,220 desired, such as 95% chance that the true parameter value lies 126 00:06:34,220 --> 00:06:35,550 within the confidence interval. 127 00:06:40,580 --> 00:06:43,720 This tutorial will use the Univariate Likelihood Ratio 128 00:06:43,720 --> 00:06:47,262 for determining the MLE confidence interval in Vensim. 129 00:06:47,262 --> 00:06:49,220 The estimated parameter and confidence interval 130 00:06:49,220 --> 00:06:52,850 mean that, at a specified confidence level-- 131 00:06:52,850 --> 00:06:56,410 usually 95 or 99%-- the real parameter falls 132 00:06:56,410 --> 00:06:59,540 within the confidence interval with the designated percent 133 00:06:59,540 --> 00:07:01,050 probability. 134 00:07:01,050 --> 00:07:03,140 This is consistent with general applications 135 00:07:03,140 --> 00:07:06,680 of statistics and probability. 136 00:07:06,680 --> 00:07:08,660 The LR method of confidence interval 137 00:07:08,660 --> 00:07:11,100 estimation compares the likelihood for the estimated 138 00:07:11,100 --> 00:07:15,010 parameter, theta hat, with that of an alternative set, 139 00:07:15,010 --> 00:07:16,320 theta star. 140 00:07:16,320 --> 00:07:19,130 The likelihood ratio is defined in equation one 141 00:07:19,130 --> 00:07:26,490 as R equals L theta hat divided by L theta star. 142 00:07:26,490 --> 00:07:28,630 Asymptotically, the likelihood ratio 143 00:07:28,630 --> 00:07:32,770 follows a chi-square distribution, equation two. 
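The slide equations are not reproduced in the transcript. As a hedged reconstruction from the spoken description, equation one and the standard asymptotic result usually stated for it (an assumption about what equation two shows) can be written as follows, where LL denotes the log-likelihood and k is the number of parameters being varied (k = 1 in the univariate case):

```latex
R = \frac{L(\hat{\theta})}{L(\theta^{*})},
\qquad
2 \ln R = 2\left[ LL(\hat{\theta}) - LL(\theta^{*}) \right] \sim \chi^{2}_{k}
```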
144 00:07:32,770 --> 00:07:35,530 This is valuable, because the univariate method requires 145 00:07:35,530 --> 00:07:39,320 no new optimizations once the MLE has been found. 146 00:07:39,320 --> 00:07:41,570 The critical parameter value for all parameters 147 00:07:41,570 --> 00:07:45,470 is then simply found using equation three. 148 00:07:45,470 --> 00:07:48,930 A disadvantage of univariate confidence interval estimates, 149 00:07:48,930 --> 00:07:52,720 however, is that the parameter space is not fully explored. 150 00:07:52,720 --> 00:07:56,225 Hence, the effect of interaction between the parameters on LL 151 00:07:56,225 --> 00:08:01,640 is ignored. 152 00:08:01,640 --> 00:08:04,150 The tutorial now switches to a real simulation 153 00:08:04,150 --> 00:08:06,640 using Vensim 6.1. 154 00:08:06,640 --> 00:08:08,750 It will show how the theory just described 155 00:08:08,750 --> 00:08:12,310 can be applied to estimating parameters of decision-making 156 00:08:12,310 --> 00:08:15,262 in the Beer Distribution Game. 157 00:08:15,262 --> 00:08:15,970 SPEAKER 2: Hello. 158 00:08:15,970 --> 00:08:18,600 This tutorial is now going to explore analytics, 159 00:08:18,600 --> 00:08:20,850 by estimating the parameters for a well-analyzed model 160 00:08:20,850 --> 00:08:23,600 of decision-making used in the Beer Distribution Game. 161 00:08:23,600 --> 00:08:26,100 The model is described in the paper, "Modeling Managerial 162 00:08:26,100 --> 00:08:28,530 Behavior-- Misperception of Feedback in a Dynamic 163 00:08:28,530 --> 00:08:32,390 Decision-Making Experiment" by Professor Sterman from 1989. 164 00:08:32,390 --> 00:08:34,610 Participants in the beer game choose how much beer 165 00:08:34,610 --> 00:08:37,385 to order each period in a simulated supply chain. 166 00:08:37,385 --> 00:08:39,760 The challenge is to estimate the parameters of a proposed 167 00:08:39,760 --> 00:08:41,415 decision rule for participant orders. 
168 00:08:44,096 --> 00:08:45,470 The screen shows the simple model 169 00:08:45,470 --> 00:08:47,407 of beer game decision-making. 170 00:08:47,407 --> 00:08:49,740 The ordering decision rule proposed by Professor Sterman 171 00:08:49,740 --> 00:08:51,370 is the following. 172 00:08:51,370 --> 00:08:54,650 The orders placed every week are given by the maximum of zero 173 00:08:54,650 --> 00:08:57,180 and the sum of expected customer orders-- the orders 174 00:08:57,180 --> 00:08:58,710 that participants expect to receive 175 00:08:58,710 --> 00:09:01,590 next period from their immediate customer-- and inventory 176 00:09:01,590 --> 00:09:03,142 discrepancy. 177 00:09:03,142 --> 00:09:06,140 The inventory discrepancy is the difference 178 00:09:06,140 --> 00:09:08,660 between total desired inventory and the sum 179 00:09:08,660 --> 00:09:11,040 of actual on-hand inventory and supply 180 00:09:11,040 --> 00:09:13,420 line of on-order inventory. 181 00:09:13,420 --> 00:09:15,400 There are four parameters that are 182 00:09:15,400 --> 00:09:18,690 used in the model-- [INAUDIBLE], the weight on incoming orders 183 00:09:18,690 --> 00:09:22,000 in demand forecasting, S prime, the desired 184 00:09:22,000 --> 00:09:25,550 on-hand and on-order inventory, the fraction 185 00:09:25,550 --> 00:09:28,750 of the gap between desired and actual on-hand and on-order 186 00:09:28,750 --> 00:09:31,195 inventory ordered each week, and the fraction 187 00:09:31,195 --> 00:09:34,600 of the supply line the subject accounts for. 188 00:09:34,600 --> 00:09:36,680 As modelers, we don't know what the parameters 189 00:09:36,680 --> 00:09:37,960 of the actual game are. 190 00:09:37,960 --> 00:09:41,200 But we have the data for actual orders placed by participants. 191 00:09:41,200 --> 00:09:43,450 And this data is read from the Excel spreadsheet 192 00:09:43,450 --> 00:09:47,194 into the variable, actual orders. 
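The decision rule just described can be sketched in a few lines of Python. This is a minimal illustration rather than the Vensim model itself: the function and argument names (`theta`, `alpha_s`, `beta`, `s_prime`) are labels chosen here for the four parameters, the adaptive forecast form is the one used in Sterman's 1989 paper, and the input numbers are made up.

```python
def expected_orders(prev_expected, incoming_orders, theta):
    """Adaptive forecast: weight theta on incoming orders, the rest on the old forecast."""
    return theta * incoming_orders + (1.0 - theta) * prev_expected


def orders_placed(expected, inventory, supply_line, s_prime, alpha_s, beta):
    """Orders = max(0, expected customer orders + adjustment for inventory discrepancy)."""
    # Discrepancy between total desired inventory and what is on hand plus
    # the fraction (beta) of the supply line the subject accounts for.
    discrepancy = s_prime - inventory - beta * supply_line
    return max(0.0, expected + alpha_s * discrepancy)


# Hypothetical inputs; the parameter values match the initial guesses in the tutorial.
forecast = expected_orders(prev_expected=4.0, incoming_orders=8.0, theta=0.5)
order = orders_placed(forecast, inventory=12.0, supply_line=4.0,
                      s_prime=20.0, alpha_s=0.5, beta=0.5)
print(forecast, order)
```

The `max(0.0, ...)` clamp reflects the fact that orders cannot be negative, which is one source of the nonlinearity that makes estimation by simple regression awkward.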
193 00:09:47,194 --> 00:09:48,610 Let's start from some guesstimates 194 00:09:48,610 --> 00:09:50,790 and assign the following values. 195 00:09:50,790 --> 00:09:55,440 [INAUDIBLE], fraction of discrepancy 196 00:09:55,440 --> 00:09:58,840 between desired and actual inventory, 197 00:09:58,840 --> 00:10:04,190 and supply line fraction are all equal to 0.5. 198 00:10:04,190 --> 00:10:09,390 S prime, the total desired inventory, is 20 cases of beer. 199 00:10:09,390 --> 00:10:12,870 With these parameters, the model runs and generates 200 00:10:12,870 --> 00:10:14,580 some values of the orders placed, which 201 00:10:14,580 --> 00:10:16,287 can be seen on this graph. 202 00:10:16,287 --> 00:10:18,120 Let's compare them against the actual orders 203 00:10:18,120 --> 00:10:19,860 observed in the beer game. 204 00:10:19,860 --> 00:10:21,830 We can see that although the trend is generally 205 00:10:21,830 --> 00:10:24,330 correct at the high level, the shape is totally different. 206 00:10:27,067 --> 00:10:29,650 This raises the question, how can we effectively measure 207 00:10:29,650 --> 00:10:30,520 the fit of the data? 208 00:10:30,520 --> 00:10:32,880 The typical way is to calculate the sum of squared errors, 209 00:10:32,880 --> 00:10:35,040 which are differences between simulated and actual data 210 00:10:35,040 --> 00:10:35,550 points. 211 00:10:35,550 --> 00:10:37,050 Squared, to make sure negative values 212 00:10:37,050 --> 00:10:38,990 don't reduce the total sum. 213 00:10:38,990 --> 00:10:40,730 The basic statistics of any variable 214 00:10:40,730 --> 00:10:42,740 can be found by using the Statistics tool 215 00:10:42,740 --> 00:10:45,194 from either object output or bench tool. 216 00:10:54,086 --> 00:10:56,220 For this tutorial, I have already defined it. 217 00:10:56,220 --> 00:10:59,010 And in this case, we can see [INAUDIBLE] of sum of squares 218 00:10:59,010 --> 00:11:02,130 of residuals is 1700, with a mean of about 35. 
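The fit measure described above is straightforward to compute outside Vensim as a cross-check. The sketch below uses made-up series, not the beer game data, and the function name is chosen here for illustration.

```python
def sum_squared_errors(simulated, actual):
    """Sum of squared differences between simulated and actual data points."""
    if len(simulated) != len(actual):
        raise ValueError("series must have the same length")
    return sum((s - a) ** 2 for s, a in zip(simulated, actual))


# Made-up weekly orders for illustration only.
simulated = [4.0, 6.0, 9.0, 7.0]
actual = [4.0, 8.0, 6.0, 7.0]

sse = sum_squared_errors(simulated, actual)
mean_squared_error = sse / len(actual)
print(sse, mean_squared_error)
```

Squaring means that an overshoot of 3 and an undershoot of 2 both add to the total instead of partly cancelling.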
219 00:11:02,130 --> 00:11:04,070 This is pretty far from a good fit, 220 00:11:04,070 --> 00:11:05,932 and we can confirm it visually. 221 00:11:05,932 --> 00:11:07,640 Now let's see how to run the optimization 222 00:11:07,640 --> 00:11:10,098 to find the parameters that will bring the values of orders 223 00:11:10,098 --> 00:11:12,360 placed as close to the actual orders as possible. 224 00:11:12,360 --> 00:11:13,830 The optimization control panel is 225 00:11:13,830 --> 00:11:17,200 invoked by using the Optimize tool on the toolbar. 226 00:11:17,200 --> 00:11:21,600 When you first open, you have to specify the file name here. 227 00:11:21,600 --> 00:11:24,140 Also, it is necessary to specify what type of optimization 228 00:11:24,140 --> 00:11:25,549 you're going to do. 229 00:11:25,549 --> 00:11:27,090 There are two types that can be used. 230 00:11:27,090 --> 00:11:29,070 They differ in the way they interpret and calculate 231 00:11:29,070 --> 00:11:29,820 the payoff value. 232 00:11:29,820 --> 00:11:33,096 A payoff is a single number that summarizes the simulation. 233 00:11:33,096 --> 00:11:34,470 In the case of the simulation, it 234 00:11:34,470 --> 00:11:35,900 will be a measure of fit, which can 235 00:11:35,900 --> 00:11:37,775 be a sum of squared errors between the actual 236 00:11:37,775 --> 00:11:39,390 and the simulated values, or it can 237 00:11:39,390 --> 00:11:41,470 be a true likelihood function. 238 00:11:41,470 --> 00:11:43,500 If you are only interested in finding the best 239 00:11:43,500 --> 00:11:45,790 fit without worrying about confidence intervals, 240 00:11:45,790 --> 00:11:47,580 you can use the calibration mode. 241 00:11:47,580 --> 00:11:49,690 For calibration, it is not necessary to define 242 00:11:49,690 --> 00:11:50,960 the payoff function in the model. 243 00:11:50,960 --> 00:11:53,562 Instead, choose the model variable and the data variable 244 00:11:53,562 --> 00:11:55,520 with which the model variable will be compared. 
245 00:11:58,490 --> 00:12:00,130 In Vensim, at each time step, 246 00:12:00,130 --> 00:12:02,380 the difference between the data and the model variable 247 00:12:02,380 --> 00:12:04,490 is multiplied by the weight specified. 248 00:12:04,490 --> 00:12:06,482 And this product is then squared. 249 00:12:06,482 --> 00:12:08,065 This number, which is always positive, 250 00:12:08,065 --> 00:12:09,960 is then subtracted from the payoff, 251 00:12:09,960 --> 00:12:11,920 so that the payoff is always negative. 252 00:12:11,920 --> 00:12:13,420 Maximizing the payoff means getting 253 00:12:13,420 --> 00:12:15,394 it to be as close to zero as possible. 254 00:12:15,394 --> 00:12:17,560 However, this is not a true log-likelihood function, 255 00:12:17,560 --> 00:12:20,184 so the results cannot be used to find the confidence intervals, 256 00:12:20,184 --> 00:12:20,940 which we need. 257 00:12:20,940 --> 00:12:23,120 Therefore, we're going to use the policy mode. 258 00:12:23,120 --> 00:12:25,340 For policy mode, the payoff function 259 00:12:25,340 --> 00:12:26,980 must be specified explicitly. 260 00:12:26,980 --> 00:12:28,890 At each time step, the value of the variable 261 00:12:28,890 --> 00:12:33,410 representing a payoff function is multiplied by the weight. 262 00:12:33,410 --> 00:12:36,615 Then it is multiplied by time step and added to the payoff. 263 00:12:36,615 --> 00:12:39,460 The optimizer is designed to maximize the payoff. 264 00:12:39,460 --> 00:12:41,430 So variables, for which more is better, 265 00:12:41,430 --> 00:12:42,860 should be given positive weights, 266 00:12:42,860 --> 00:12:44,440 and those, for which less is better, 267 00:12:44,440 --> 00:12:45,815 should be given negative weights. 268 00:12:48,990 --> 00:12:52,816 Here, [INAUDIBLE] are specified in the sum of squared errors. 
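The two payoff calculations described above can be mirrored in a short Python sketch. This follows only the spoken description (weight times difference, squared and subtracted, for calibration; weight times value times time step, added, for policy); it is not Vensim code, and the function names and numbers are illustrative assumptions.

```python
def calibration_payoff(model, data, weight=1.0):
    """Calibration mode as described: subtract (weight * difference)^2 at each step."""
    return -sum((weight * (m - d)) ** 2 for m, d in zip(model, data))


def policy_payoff(values, weight, time_step):
    """Policy mode as described: add weight * value * time step at each step."""
    return sum(weight * v * time_step for v in values)


# Illustrative numbers only.
cal = calibration_payoff(model=[1.0, 2.0, 3.0], data=[1.0, 4.0, 3.0], weight=1.0)
pol = policy_payoff(values=[2.0, 2.0, 2.0], weight=-1.0, time_step=0.5)
print(cal, pol)
```

In both cases the optimizer maximizes the result, so a calibration payoff closer to zero is better, and a policy-mode variable where less is better needs a negative weight.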
269 00:12:52,816 --> 00:12:55,190 Residuals are calculated as the difference between orders 270 00:12:55,190 --> 00:12:57,750 placed and actual orders at each time step. 271 00:12:57,750 --> 00:12:59,390 This squared value is then accumulated 272 00:12:59,390 --> 00:13:01,720 in the stock sum of residual square. 273 00:13:01,720 --> 00:13:04,924 The tricky part is the variable, total sum of residual square. 274 00:13:04,924 --> 00:13:07,090 Please note that because Vensim, in the policy mode, 275 00:13:07,090 --> 00:13:08,923 multiplies the values of the payoff function 276 00:13:08,923 --> 00:13:10,826 by the time step, the payoff value 277 00:13:10,826 --> 00:13:12,950 is the weighted combination of the different payoff 278 00:13:12,950 --> 00:13:15,290 elements integrated over the simulation. 279 00:13:15,290 --> 00:13:16,770 This is not what we're looking for. 280 00:13:16,770 --> 00:13:19,350 We are interested in the final value of a variable instead 281 00:13:19,350 --> 00:13:22,180 of the integrated value, because the final value is exactly 282 00:13:22,180 --> 00:13:25,170 the total sum of squared errors the way we defined it. 283 00:13:25,170 --> 00:13:28,175 Here's an example to illustrate the concept. 284 00:13:28,175 --> 00:13:31,600 In this line is a stock variable level. 285 00:13:31,600 --> 00:13:33,783 It can be seen that, if we just use the stock of sum 286 00:13:33,783 --> 00:13:35,574 of residual squared in the payoff function, 287 00:13:35,574 --> 00:13:41,458 we get the value, which is equal to the area under the curve, 288 00:13:41,458 --> 00:13:44,090 instead of the final value that we need. 289 00:13:51,260 --> 00:13:53,600 [INAUDIBLE] look at only the final value. 
290 00:13:53,600 --> 00:13:55,870 In a model, it is necessary to introduce a new variable 291 00:13:55,870 --> 00:13:58,328 for the payoff function and use the following equation that 292 00:13:58,328 --> 00:14:00,220 makes its value zero at each time step, 293 00:14:00,220 --> 00:14:01,440 except for the final step. 294 00:14:07,094 --> 00:14:09,510 Add the current value of the residual squared to the stock 295 00:14:09,510 --> 00:14:12,440 level to account for the value of the current time step. 296 00:14:12,440 --> 00:14:14,810 This effectively ignores all the intermediate values 297 00:14:14,810 --> 00:14:17,460 and looks only at the final value. 298 00:14:17,460 --> 00:14:19,020 So far, this is nothing different 299 00:14:19,020 --> 00:14:21,180 from what Vensim was doing in the calibration mode. 300 00:14:21,180 --> 00:14:23,597 We simply replicated what it would be doing automatically. 301 00:14:23,597 --> 00:14:25,888 In order to get the true likelihood function-- assuming 302 00:14:25,888 --> 00:14:27,500 normal distribution of errors-- we 303 00:14:27,500 --> 00:14:29,166 need to divide the sum of squared errors 304 00:14:29,166 --> 00:14:30,250 by two times the variance. 305 00:14:30,250 --> 00:14:32,750 This is what you can see in the true log-likelihood function 306 00:14:32,750 --> 00:14:34,940 variable. 307 00:14:34,940 --> 00:14:36,360 So far, in this demonstration, we 308 00:14:36,360 --> 00:14:38,360 don't know what the standard deviation of errors 309 00:14:38,360 --> 00:14:40,830 is going to be, because we haven't optimized anything yet, 310 00:14:40,830 --> 00:14:43,925 so we're going to leave this variable as 1. 311 00:14:43,925 --> 00:14:45,800 We are going to change it to the actual value 312 00:14:45,800 --> 00:14:48,340 of the standard deviation of errors later for the confidence 313 00:14:48,340 --> 00:14:49,640 interval. 
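To make the final-value trick and the variance scaling concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the model's actual equation: the time step is assumed to be 1, the residuals are made up, and the variable names are chosen here.

```python
def integrate(series, time_step):
    """Mimic policy-mode accumulation: sum of value * time step over the run."""
    return sum(value * time_step for value in series)


time_step = 1.0                          # assumption: one step per week
residual_sq = [4.0, 1.0, 9.0, 0.0, 2.0]  # made-up squared residuals

# Stock of squared residuals as seen at each step, before the current inflow is added.
stock = []
level = 0.0
for r2 in residual_sq:
    stock.append(level)
    level += r2 * time_step

# Using the stock directly as the payoff element integrates it again (area under the curve).
area_under_stock = integrate(stock, time_step)

# Payoff element that is zero except at the final step, where it equals the
# stock plus the current squared residual.
last = len(residual_sq) - 1
final_only = [(stock[t] + residual_sq[t]) if t == last else 0.0
              for t in range(len(residual_sq))]
total_sse = integrate(final_only, time_step)

# Gaussian scaling: SSE / (2 * sigma^2); its negative is the log-likelihood
# up to a constant that does not depend on the parameters when sigma is fixed.
sigma = 1.0  # placeholder until the residual standard deviation is known
scaled_sse = total_sse / (2.0 * sigma ** 2)
log_likelihood_kernel = -scaled_sse
print(area_under_stock, total_sse, log_likelihood_kernel)
```

The gap between `area_under_stock` and `total_sse` is the reason the extra payoff variable is needed.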
314 00:14:49,640 --> 00:14:54,674 Now let's change the run name and go back 315 00:14:54,674 --> 00:14:55,590 to Optimization Setup. 316 00:14:58,360 --> 00:15:03,569 And let's use our log-likelihood function as the payoff element. 317 00:15:03,569 --> 00:15:05,110 Because this is policy mode, we don't 318 00:15:05,110 --> 00:15:07,856 need to specify anything in Compare To field. 319 00:15:07,856 --> 00:15:10,210 For the weight, because this is policy mode, 320 00:15:10,210 --> 00:15:12,670 it is necessary to assign a negative value. 321 00:15:12,670 --> 00:15:14,170 This tells Vensim that we're looking 322 00:15:14,170 --> 00:15:15,510 to minimize the payoff function. 323 00:15:15,510 --> 00:15:16,880 If there are multiple payoff functions, 324 00:15:16,880 --> 00:15:18,750 the weight value would have been important. 325 00:15:18,750 --> 00:15:20,170 For one function, it doesn't matter. 326 00:15:20,170 --> 00:15:21,240 But we don't want to change the values 327 00:15:21,240 --> 00:15:23,698 of the log-likelihood function to calculate true confidence 328 00:15:23,698 --> 00:15:24,660 interval values. 329 00:15:24,660 --> 00:15:27,360 Therefore, the log-likelihood function will not be scaled, 330 00:15:27,360 --> 00:15:29,162 and the weight is going to be minus 1. 331 00:15:32,778 --> 00:15:35,420 Next screen, the Optimization Control Panel, 332 00:15:35,420 --> 00:15:37,760 we have to specify the file name again. 333 00:15:37,760 --> 00:15:41,447 Then, make sure to use multiple starts. 334 00:15:41,447 --> 00:15:43,030 This means the analysis is less likely 335 00:15:43,030 --> 00:15:44,680 to get stuck at a local optimum 336 00:15:44,680 --> 00:15:46,480 if the surface is not convex. 337 00:15:46,480 --> 00:15:48,899 If multiple starts is random, starting points 338 00:15:48,899 --> 00:15:51,190 for any optimizations are picked randomly and uniformly 339 00:15:51,190 --> 00:15:53,040 over the range of each parameter. 
340 00:15:53,040 --> 00:15:56,085 Next field, optimizer, tells Vensim to use the Powell method 341 00:15:56,085 --> 00:15:59,277 to find a local minimum of the function at each start. 342 00:15:59,277 --> 00:16:01,110 Technically, you could ignore this parameter 343 00:16:01,110 --> 00:16:02,693 and just rely on random starts to find 344 00:16:02,693 --> 00:16:04,250 the values of a payoff function. 345 00:16:04,250 --> 00:16:06,080 However, using multiple starts together 346 00:16:06,080 --> 00:16:07,750 with optimization at each start 347 00:16:07,750 --> 00:16:09,875 produces better and faster convergence. 348 00:16:09,875 --> 00:16:12,599 The other values are left at the default levels. 349 00:16:12,599 --> 00:16:14,890 You can read more about these parameters in Vensim help, 350 00:16:14,890 --> 00:16:16,880 but generally, they control the optimization 351 00:16:16,880 --> 00:16:18,760 and are important for complicated models 352 00:16:18,760 --> 00:16:20,830 where simulation time is significant. 353 00:16:20,830 --> 00:16:23,607 Let's add now the parameters that Vensim will vary in order 354 00:16:23,607 --> 00:16:24,940 to minimize the payoff function. 355 00:16:50,350 --> 00:16:56,180 Next, make sure Payoff Report is selected and hit Finish. 356 00:16:56,180 --> 00:16:59,160 For random or other random values of multiple start, 357 00:16:59,160 --> 00:17:01,030 the optimizer will never stop unless you 358 00:17:01,030 --> 00:17:02,739 click on the Stop button to interrupt it. 359 00:17:02,739 --> 00:17:04,238 If you don't choose multiple starts, 360 00:17:04,238 --> 00:17:05,780 the next few steps are not necessary 361 00:17:05,780 --> 00:17:07,500 as Vensim would be able to complete optimization 362 00:17:07,500 --> 00:17:09,333 and sensitivity analysis, which is performed 363 00:17:09,333 --> 00:17:10,930 in the final step of optimization. 
364 00:17:10,930 --> 00:17:12,849 However, as I mentioned, the optimization 365 00:17:12,849 --> 00:17:14,520 can be stuck at a local optimum, thus 366 00:17:14,520 --> 00:17:16,349 producing suboptimal results. 367 00:17:16,349 --> 00:17:18,260 So it is recommended that you use multiple starts unless you 368 00:17:18,260 --> 00:17:19,801 know the shape of the payoff function 369 00:17:19,801 --> 00:17:22,920 and are sure that local optimum is equal to the global optimum. 370 00:17:22,920 --> 00:17:24,829 So having chosen random multiple starts, 371 00:17:24,829 --> 00:17:27,060 the question is, when to stop the optimization. 372 00:17:27,060 --> 00:17:28,810 This requires some experiments and depends 373 00:17:28,810 --> 00:17:30,550 on the shape of your payoff function. 374 00:17:30,550 --> 00:17:32,410 In this case, the values of the payoff 375 00:17:32,410 --> 00:17:34,360 have been changing quite fast in the beginning 376 00:17:34,360 --> 00:17:36,600 and are now at the same level for quite some time. 377 00:17:36,600 --> 00:17:39,100 So it's a good time to attempt to interrupt the optimization 378 00:17:39,100 --> 00:17:40,443 and see if you like the results. 379 00:17:43,890 --> 00:17:46,350 As you can see, the shape of the orders placed 380 00:17:46,350 --> 00:17:48,820 is much closer to the actual order values. 381 00:17:48,820 --> 00:17:51,440 The statistics show a much better fit, 382 00:17:51,440 --> 00:17:54,050 with the total sum of squared errors significantly lower 383 00:17:54,050 --> 00:18:00,719 at the level of just 279, with a mean of 5.813. 384 00:18:00,719 --> 00:18:02,760 The values of the rest of the optimized parameters 385 00:18:02,760 --> 00:18:04,718 can be looked up in the file with the same name 386 00:18:04,718 --> 00:18:06,742 as the run name and the extension out. 387 00:18:06,742 --> 00:18:08,950 We can open this file from Vensim using the Edit File 388 00:18:08,950 --> 00:18:10,050 command from File menu. 
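The idea of random multiple starts combined with a local optimizer at each start can be sketched in plain Python. This is a stand-in for illustration, not Vensim's implementation: the local search below is a simple coordinate search, the objective is a made-up function with two local minima, and the run uses a fixed number of starts instead of running until interrupted.

```python
import random


def local_search(objective, start, bounds, step=0.5, tol=1e-6):
    """Simple coordinate search: try +/- step on each parameter, halve step when stuck."""
    point = list(start)
    best = objective(point)
    while step > tol:
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for delta in (step, -step):
                trial = list(point)
                trial[i] = min(hi, max(lo, point[i] + delta))  # stay inside the range
                value = objective(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return point, best


def multistart(objective, bounds, n_starts=20, seed=0):
    """Draw starting points uniformly over each parameter range and keep the best result."""
    rng = random.Random(seed)
    best_point, best_value = None, float("inf")
    for _ in range(n_starts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        point, value = local_search(objective, start, bounds)
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value


def toy_objective(p):
    """Made-up surface: local minimum near x = 2, better minimum near x = -2."""
    x, y = p
    return (x * x - 4.0) ** 2 + x + (y - 1.0) ** 2


best_point, best_value = multistart(toy_objective, bounds=[(-3.0, 3.0), (-3.0, 3.0)])
print(best_point, best_value)
```

A single start on the right-hand side of this surface settles near x = 2; spreading the starts over the whole range is what lets the search reach the lower minimum near x = -2.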
389 00:18:15,810 --> 00:18:17,950 So far, we only know the optimal values, but not 390 00:18:17,950 --> 00:18:20,264 the confidence intervals. 391 00:18:20,264 --> 00:18:22,680 We need now to tell Vensim to use the optimal values found 392 00:18:22,680 --> 00:18:25,709 and calculate the confidence intervals around them. 393 00:18:25,709 --> 00:18:28,250 We are going to do it by using the output of the optimization 394 00:18:28,250 --> 00:18:30,365 and changing optimization parameters. 395 00:18:30,365 --> 00:18:33,040 To do this, first, we have to modify the optimization control 396 00:18:33,040 --> 00:18:39,770 parameters and turn off multiple starts, 397 00:18:39,770 --> 00:18:47,560 and optimizer, and save it as the optimization control 398 00:18:47,560 --> 00:18:48,840 type file, VOC. 399 00:18:59,969 --> 00:19:02,260 Also, we need to change the value of standard deviation 400 00:19:02,260 --> 00:19:05,409 to the actual value to make sure we're 401 00:19:05,409 --> 00:19:07,825 looking at the true values of the log-likelihood function. 402 00:19:07,825 --> 00:19:10,324 This value can be found in the statistics [INAUDIBLE] again. 403 00:19:10,324 --> 00:19:12,470 We use it to change the variable in the model. 404 00:19:27,820 --> 00:19:29,570 Open the Optimization Control Panel again, 405 00:19:29,570 --> 00:19:33,750 and leave the payoff function as it is. 406 00:19:33,750 --> 00:19:35,510 But on the Optimization Control Screen, 407 00:19:35,510 --> 00:19:37,140 open the file that was just created 408 00:19:37,140 --> 00:19:40,015 and check that multiple starts and optimizer are turned off. 409 00:19:44,620 --> 00:19:49,510 In addition, specify [INAUDIBLE] by choosing payoff value 410 00:19:49,510 --> 00:19:51,840 and entering the value by which Vensim will change 411 00:19:51,840 --> 00:19:54,480 the optimal likelihood function in order to find the confidence 412 00:19:54,480 --> 00:19:55,630 bounds. 
413 00:19:55,630 --> 00:19:57,340 Because for the likelihood ratio method, 414 00:19:57,340 --> 00:19:58,840 the likelihood ratio is approximated 415 00:19:58,840 --> 00:20:00,330 by the chi-square distribution, it 416 00:20:00,330 --> 00:20:02,705 is necessary to find the value of chi-square distribution 417 00:20:02,705 --> 00:20:05,520 [INAUDIBLE] for 95th percentile for one degree of freedom, 418 00:20:05,520 --> 00:20:07,480 because we are doing the univariate analysis. 419 00:20:07,480 --> 00:20:10,470 A disadvantage of univariate confidence interval estimation 420 00:20:10,470 --> 00:20:13,085 is that the parameter space is not fully explored. 421 00:20:13,085 --> 00:20:15,210 Hence, the effect of interaction between parameters 422 00:20:15,210 --> 00:20:17,600 on the log-likelihood function is ignored. 423 00:20:17,600 --> 00:20:19,650 The value of chi-squared, in this case, 424 00:20:19,650 --> 00:20:25,930 is approximately 3.84146, which we're 425 00:20:25,930 --> 00:20:28,565 going to use in the sensitivity field. 426 00:20:28,565 --> 00:20:30,320 Now, the [INAUDIBLE] button. 427 00:20:30,320 --> 00:20:33,245 And as you see, this time, we don't 428 00:20:33,245 --> 00:20:35,370 have to wait as Vensim doesn't do any optimization. 429 00:20:35,370 --> 00:20:36,865 The results are located in the file 430 00:20:36,865 --> 00:20:46,350 that has the name of the run-- and the word 431 00:20:46,350 --> 00:20:49,010 sensitive with the extension tab. 432 00:20:49,010 --> 00:20:51,602 Let's open this file and see our estimated parameters 433 00:20:51,602 --> 00:20:52,810 and the confidence intervals. 434 00:20:56,262 --> 00:20:58,470 Since we are using the true log-likelihood function, 435 00:20:58,470 --> 00:21:00,636 the confidence bounds are not necessarily symmetric. 436 00:21:04,959 --> 00:21:06,250 This is a very simple tutorial. 
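The chi-square value quoted above can be checked with the Python standard library, and the univariate search can be illustrated on a toy problem. This is a hedged sketch, not a description of Vensim's internals: it assumes the usual statistic of twice the drop in log-likelihood, uses a made-up quadratic log-likelihood for a single parameter, and the function names are chosen here. How the threshold maps onto the number typed into Vensim's sensitivity field depends on how the payoff is scaled, which the transcript does not spell out.

```python
from statistics import NormalDist


def chi2_critical_1dof(confidence):
    """Chi-square quantile for one degree of freedom via the squared normal quantile."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * z


def find_bound(log_lik, theta_hat, far_point, threshold, iterations=60):
    """Bisect between the estimate and a point known to lie outside the interval."""
    ll_max = log_lik(theta_hat)
    inside, outside = theta_hat, far_point
    for _ in range(iterations):
        mid = 0.5 * (inside + outside)
        if 2.0 * (ll_max - log_lik(mid)) < threshold:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


critical = chi2_critical_1dof(0.95)

# Toy log-likelihood for one parameter with the others held at their estimates.
theta_hat, spread = 0.3, 0.05


def toy_log_lik(theta):
    return -((theta - theta_hat) ** 2) / (2.0 * spread ** 2)


lower = find_bound(toy_log_lik, theta_hat, far_point=0.0, threshold=critical)
upper = find_bound(toy_log_lik, theta_hat, far_point=1.0, threshold=critical)
print(round(critical, 5), lower, upper)
```

For this symmetric toy surface the bounds sit equally far from the estimate; with a real, non-quadratic likelihood surface the same search can return asymmetric bounds.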
437 00:21:06,250 --> 00:21:09,280 I hope you are now more familiar with [INAUDIBLE] for estimating 438 00:21:09,280 --> 00:21:11,760 parameters and confidence intervals for your models. 439 00:21:11,760 --> 00:21:12,420 Thank you. 440 00:21:15,650 --> 00:21:18,170 SPEAKER 3: A summary of the seven key steps involved 441 00:21:18,170 --> 00:21:22,630 in determining the MLE in Vensim is contained on the slide. 442 00:21:22,630 --> 00:21:25,200 It can be printed and serve as a useful reference 443 00:21:25,200 --> 00:21:27,330 and check for those wishing to use the method. 444 00:21:31,650 --> 00:21:34,342 MLE has its advantages and disadvantages 445 00:21:34,342 --> 00:21:37,560 in estimating parameters and their confidence intervals 446 00:21:37,560 --> 00:21:40,020 when performing dynamic modeling. 447 00:21:40,020 --> 00:21:41,870 In conclusion, when the conditions 448 00:21:41,870 --> 00:21:45,030 of errors being independent and identically distributed 449 00:21:45,030 --> 00:21:48,580 are met, then MLE is a simple and straightforward method 450 00:21:48,580 --> 00:21:50,770 for estimating parameters and determining 451 00:21:50,770 --> 00:21:53,700 their statistical fit in system dynamics models developed 452 00:21:53,700 --> 00:21:55,550 in Vensim.