1 00:00:09,500 --> 00:00:12,920 In this class, we'll be using R to work with data 2 00:00:12,920 --> 00:00:14,630 and build models. 3 00:00:14,630 --> 00:00:16,640 But what is R? 4 00:00:16,640 --> 00:00:20,420 R is a software environment for data analysis, 5 00:00:20,420 --> 00:00:23,690 statistical computing, and graphics. 6 00:00:23,690 --> 00:00:25,850 It is also a programming language 7 00:00:25,850 --> 00:00:28,000 which is natural to use and allows 8 00:00:28,000 --> 00:00:32,119 you to complete data analyses in just a few lines. 9 00:00:32,119 --> 00:00:35,210 We won't be doing much programming in class, 10 00:00:35,210 --> 00:00:37,640 and almost everything we ask you to do 11 00:00:37,640 --> 00:00:40,370 can be completed in a few lines. 12 00:00:40,370 --> 00:00:42,780 There's a lot more that you can do with R 13 00:00:42,780 --> 00:00:44,910 than just what we teach in this class, 14 00:00:44,910 --> 00:00:47,260 but you will have a good understanding of R 15 00:00:47,260 --> 00:00:51,440 and how to use it in just a few weeks. 16 00:00:51,440 --> 00:00:54,470 R originated from the statistical programming 17 00:00:54,470 --> 00:00:58,470 language S, which was developed by John Chambers 18 00:00:58,470 --> 00:01:01,980 while at Bell Labs in the 1970s. 19 00:01:01,980 --> 00:01:04,390 The first version of R was developed 20 00:01:04,390 --> 00:01:07,530 by Robert Gentleman and Ross Ihaka 21 00:01:07,530 --> 00:01:11,480 at the University of Auckland in the mid-1990s. 22 00:01:11,480 --> 00:01:13,810 They wanted better statistical software 23 00:01:13,810 --> 00:01:17,400 to use in their Macintosh teaching laboratory 24 00:01:17,400 --> 00:01:19,880 and decided to create their own. 25 00:01:19,880 --> 00:01:23,860 They also released it as an open-source alternative to S 26 00:01:23,860 --> 00:01:26,390 and encouraged others to download and help 27 00:01:26,390 --> 00:01:29,080 develop the software.
28 00:01:29,080 --> 00:01:31,289 There are many choices for data analysis 29 00:01:31,289 --> 00:01:33,640 software available today. 30 00:01:33,640 --> 00:01:36,450 In addition to R, some popular examples 31 00:01:36,450 --> 00:01:45,479 are SAS, Stata, SPSS, Excel and Excel add-ons, MATLAB, Minitab, 32 00:01:45,479 --> 00:01:47,860 and pandas in Python. 33 00:01:47,860 --> 00:01:50,460 So, why are we using R? 34 00:01:50,460 --> 00:01:53,190 R is free and open-source and is 35 00:01:53,190 --> 00:01:57,670 available on all major platforms: Mac, Windows, and Linux. 36 00:01:57,670 --> 00:01:59,780 R is also widely used. 37 00:01:59,780 --> 00:02:03,510 There are more than two million users around the world. 38 00:02:03,510 --> 00:02:07,590 This means that new features are being developed all the time, 39 00:02:07,590 --> 00:02:11,430 and there are a lot of community resources for R. 40 00:02:11,430 --> 00:02:15,140 Additionally, R makes it easy to re-run previous work 41 00:02:15,140 --> 00:02:17,000 and to make adjustments. 42 00:02:17,000 --> 00:02:22,050 R also has nice graphics and visualizations. 43 00:02:22,050 --> 00:02:25,800 We will just be using the R command-line interface. 44 00:02:25,800 --> 00:02:28,770 However, if you would like to try a graphical user 45 00:02:28,770 --> 00:02:32,110 interface, or GUI, there are many choices. 46 00:02:32,110 --> 00:02:35,620 Two popular choices are RStudio and Rattle. 47 00:02:38,630 --> 00:02:41,760 There are many R resources available online. 48 00:02:41,760 --> 00:02:45,810 In addition to the official R page and the download page, 49 00:02:45,810 --> 00:02:47,850 there are many helpful websites. 50 00:02:47,850 --> 00:02:50,750 Here are a few popular ones. 51 00:02:50,750 --> 00:02:53,650 In general, though, if you're looking for a command 52 00:02:53,650 --> 00:02:57,280 or how to do something in R, try googling it.
53 00:02:57,280 --> 00:03:00,790 Often, if you just try typing "R" and then what you're 54 00:03:00,790 --> 00:03:02,840 looking for in a search engine, you 55 00:03:02,840 --> 00:03:06,730 can find some very helpful posts and websites. 56 00:03:06,730 --> 00:03:10,390 We want to emphasize, though, that the best way to learn R 57 00:03:10,390 --> 00:03:12,260 is through trial and error. 58 00:03:12,260 --> 00:03:15,070 So, let's get started working in R.