1
00:00:04,500 --> 00:00:06,670
In this recitation, we will see how

2
00:00:06,670 --> 00:00:10,150
to apply clustering techniques to segment images,

3
00:00:10,150 --> 00:00:13,190
with the main application being geared towards medical image

4
00:00:13,190 --> 00:00:14,770
segmentation.

5
00:00:14,770 --> 00:00:16,900
At the end of this recitation, you

6
00:00:16,900 --> 00:00:20,370
will get a head start on how to cluster an MRI brain

7
00:00:20,370 --> 00:00:25,440
image by tissue substances and locate pathological anatomies.

8
00:00:25,440 --> 00:00:27,500
Image segmentation is the process

9
00:00:27,500 --> 00:00:31,200
of partitioning digital images into regions, or segments,

10
00:00:31,200 --> 00:00:34,750
that share the same visual characteristics, such as color,

11
00:00:34,750 --> 00:00:36,890
intensity, or texture.

12
00:00:36,890 --> 00:00:39,250
The segments should also be meaningful,

13
00:00:39,250 --> 00:00:43,040
as in they should correspond to particular surfaces, objects,

14
00:00:43,040 --> 00:00:45,430
or even parts of an object.

15
00:00:45,430 --> 00:00:47,490
Think of having an image of a water pond,

16
00:00:47,490 --> 00:00:50,290
a mountain chain in the backdrop, and the sky.

17
00:00:50,290 --> 00:00:52,430
Segmenting this image should ideally

18
00:00:52,430 --> 00:00:54,690
detect the three different objects

19
00:00:54,690 --> 00:00:56,600
and assign their corresponding pixels

20
00:00:56,600 --> 00:00:59,000
to three different regions.

21
00:00:59,000 --> 00:01:01,830
In a few words, the goal of image segmentation

22
00:01:01,830 --> 00:01:05,810
is to modify the representation of an image from pixel data

23
00:01:05,810 --> 00:01:09,980
into something meaningful to us and easier to analyze.

24
00:01:09,980 --> 00:01:13,110
Image segmentation has wide applicability.

25
00:01:13,110 --> 00:01:15,590
A major practical application is in the field

26
00:01:15,590 --> 00:01:18,670
of medical imaging, where image segments often

27
00:01:18,670 --> 00:01:23,140
correspond to different tissues, organs, pathologies, or tumors.

28
00:01:23,140 --> 00:01:26,530
Image segmentation helps locate these geometrically complex

29
00:01:26,530 --> 00:01:29,530
objects and measure their volume.

30
00:01:29,530 --> 00:01:31,950
Another application is detecting instances

31
00:01:31,950 --> 00:01:36,110
of semantic objects such as humans, buildings, and others.

32
00:01:36,110 --> 00:01:38,850
Two major domains that have seen much attention

33
00:01:38,850 --> 00:01:42,560
recently are face and pedestrian detection.

34
00:01:42,560 --> 00:01:44,870
The main uses of facial detection, for instance,

35
00:01:44,870 --> 00:01:48,640
include the development of autofocus in digital cameras

36
00:01:48,640 --> 00:01:53,150
and face recognition commonly used in video surveillance.

37
00:01:53,150 --> 00:01:56,289
Other important applications are fingerprint and iris

38
00:01:56,289 --> 00:01:57,470
recognition.

39
00:01:57,470 --> 00:01:59,800
For instance, fingerprint recognition

40
00:01:59,800 --> 00:02:02,450
tries to identify print patterns, including

41
00:02:02,450 --> 00:02:06,520
aggregate characteristics of ridges and minutiae points.

42
00:02:06,520 --> 00:02:09,270
In this recitation, we will look in particular

43
00:02:09,270 --> 00:02:12,960
at the medical imaging application.

44
00:02:12,960 --> 00:02:15,860
Various methods have been proposed to segment images.

45
00:02:15,860 --> 00:02:18,160
Clustering methods are used to group the points

46
00:02:18,160 --> 00:02:21,520
into clusters according to their characteristic features,

47
00:02:21,520 --> 00:02:23,910
for instance, intensity values.

48
00:02:23,910 --> 00:02:25,930
These clusters are then mapped back

49
00:02:25,930 --> 00:02:28,740
to the original spatial domain to produce

50
00:02:28,740 --> 00:02:31,380
a segmentation of the image.

51
00:02:31,380 --> 00:02:33,960
Another technique is edge detection,

52
00:02:33,960 --> 00:02:38,020
which is based on detecting discontinuities or boundaries.

53
00:02:38,020 --> 00:02:40,230
For instance, in a gray-scale image,

54
00:02:40,230 --> 00:02:42,600
a boundary would correspond to an abrupt change

55
00:02:42,600 --> 00:02:45,180
in the gray level.

56
00:02:45,180 --> 00:02:47,570
Instead of finding boundaries of regions in the image,

57
00:02:47,570 --> 00:02:49,560
there are other techniques called region

58
00:02:49,560 --> 00:02:51,760
growing methods, which start by dividing

59
00:02:51,760 --> 00:02:54,180
the image into small regions.

60
00:02:54,180 --> 00:02:57,540
Then, they sequentially merge these regions together

61
00:02:57,540 --> 00:03:00,340
if they are sufficiently similar.

62
00:03:00,340 --> 00:03:04,350
In this recitation, our focus is on clustering methods.

63
00:03:04,350 --> 00:03:07,560
In particular, we will review hierarchical and k-means

64
00:03:07,560 --> 00:03:11,160
clustering techniques and how to use them in R.

65
00:03:11,160 --> 00:03:14,530
We will restrict ourselves to gray-scale images.

66
00:03:14,530 --> 00:03:17,890
Our first example is a low-resolution flower image

67
00:03:17,890 --> 00:03:19,920
whose pixel intensity information

68
00:03:19,920 --> 00:03:23,360
is given in the data set flower.csv.

69
00:03:23,360 --> 00:03:27,240
Our second and major example involves T2-weighted MRI

70
00:03:27,240 --> 00:03:28,810
images of the brain.

71
00:03:28,810 --> 00:03:31,590
One image corresponds to a healthy patient,

72
00:03:31,590 --> 00:03:34,610
and the other one corresponds to a patient with a tumor

73
00:03:34,610 --> 00:03:36,950
called oligodendroglioma.

74
00:03:36,950 --> 00:03:39,990
The pixel intensity information of these two images

75
00:03:39,990 --> 00:03:44,410
is given in the data sets healthy.csv and tumor.csv.

76
00:03:44,410 --> 00:03:47,370
The last video will compare the use, pros,

77
00:03:47,370 --> 00:03:51,430
and cons of all the analytics tools that we have seen so far.

78
00:03:51,430 --> 00:03:54,160
I hope that this will help you synthesize all

79
00:03:54,160 --> 00:03:56,460
that you've learned to give you an edge in the class

80
00:03:56,460 --> 00:03:59,590
competition that is coming up soon.
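
The clustering approach described in this recitation (group pixels by intensity, then map the cluster labels back to the original spatial domain) can be sketched roughly as follows. The recitation itself works in R; this is only a minimal Python illustration using the standard library. The function names `kmeans_1d` and `segment`, the evenly spaced initial centers, and the 4x4 toy image are all assumptions made for the example, not material from the recitation or its data sets.

```python
def kmeans_1d(values, k, iters=20):
    """k-means (Lloyd's algorithm) on scalar intensities; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumed initialisation: k centers spread evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel joins the cluster with the nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean intensity of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:  # no center moved, so the clustering is stable
            break
        centers = new_centers
    return centers, labels


def segment(image, k):
    """Cluster pixel intensities, then map the labels back onto the pixel grid."""
    n_cols = len(image[0])
    flat = [v for row in image for v in row]  # image -> intensity vector
    centers, labels = kmeans_1d(flat, k)
    # Reshape the label vector to the image's rows and columns.
    seg = [labels[r * n_cols:(r + 1) * n_cols] for r in range(len(image))]
    return centers, seg


# Hypothetical 4x4 gray-scale image with dark, mid-gray and bright pixels.
toy_image = [
    [0.05, 0.10, 0.90, 0.95],
    [0.08, 0.12, 0.88, 0.92],
    [0.50, 0.52, 0.49, 0.51],
    [0.07, 0.55, 0.93, 0.48],
]
centers, seg = segment(toy_image, k=3)
for row in seg:
    print(row)
```

Each entry of `seg` is the cluster index of the pixel at the same position, so printing it row by row gives a crude picture of the segments. Clustering on intensity alone ignores where a pixel sits in the image, which is why the labels have to be reshaped back to the grid before they can be read as regions.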