#############################
Video Source: www.youtube.com/watch?v=W6-8L3eRzeM
Want to learn more? Take the full course at https://learn.datacamp.com/courses/su... at your own pace. More than a video, you'll learn hands-on coding and quickly apply your skills to your daily work.

---

In the previous lesson, we used a simple one-dimensional example to illustrate the notion of an optimal decision boundary, that is, one that maximizes the margin.

In this lesson we'll create a two-predictor dataset that we will subsequently use to illustrate some of the key principles of support vector machines, including margin maximization. The dataset we will generate is essentially a generalization of the previous example in that it has two variables instead of one, and the decision boundary is a line rather than a point.

We generate a dataset with 200 points consisting of two predictor variables, x1 and x2, that are uniformly distributed between 0 and 1. To do this we first set the number of data points n to 200 and the seed integer for random number generation. We then create two sets of random numbers lying between 0 and 1 using the runif() function, which generates uniform random numbers. The resulting values of x1 and x2 are stored in the dataframe df.

Next we create two classes separated by the straight line x1 = x2. This line passes through the origin and makes an angle of 45 degrees with the horizontal axis. We label points below the line as class -1 and those above as class +1. Here's the code (sketched at the end of this transcript). Now let's see what our two-class dataset looks like.

Let's visualize the dataset and the decision boundary using ggplot(). We'll create a two-dimensional scatter plot with x1 on the x-axis and x2 on the y-axis, distinguishing the two classes by color. Points below the decision boundary will be colored red and those above blue. The decision boundary itself is the straight line x1 = x2, which passes through the origin and has a slope of 1; that is, it makes an angle of 45 degrees with the x1 axis. Here's the code.

And here is the resulting plot. Notice that although the decision boundary separates the two classes cleanly, it has no margin. So let's introduce a small margin into the dataset.

To create a margin we need to remove points that lie close to the decision boundary. One way to do this is to filter out points whose x1 and x2 values differ by less than a specified value. Let's set this value to 0.05 and do the filtering. The dataset should now have a margin. Let's replot it using exactly the same ggplot code as before.

Here is the resulting plot. Notice the empty space on either side of the decision boundary. This is the margin. We can make the margin clearer by delineating its boundaries.

The margin boundaries are parallel to the decision boundary and lie 0.05 units on either side of it. We'll draw the margin boundaries as dashed lines to distinguish them from the decision boundary.

Here is the plot. Notice that our decision boundary is the maximal margin separator because it lies halfway between the margin boundaries.

That's it for this chapter. In the exercises we'll create a dataset similar to the one discussed in this lesson. We will use that dataset extensively in the exercises in the next chapter.

#R #RTutorial #DataCamp #Vector #Machines #linearly #separable #dataset
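The snippets below are minimal sketches of the code the narration refers to, not the course's exact code. This first one covers the data-generation and labeling steps: n = 200 points, two runif() predictors stored in the dataframe df, and classes split by the line x1 = x2. The seed value and the class column name y are illustrative choices, since the lesson does not state them.

# number of data points and seed for reproducibility (seed value is arbitrary)
n <- 200
set.seed(42)

# two predictors uniformly distributed on [0, 1]
df <- data.frame(x1 = runif(n), x2 = runif(n))

# classes separated by the line x1 = x2:
# points below the line (x2 < x1) get class -1, points above get class +1
df$y <- factor(ifelse(df$x1 - df$x2 > 0, -1, 1), levels = c(-1, 1))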
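One possible version of the plotting step: a scatter plot of x2 against x1 colored by class, with the decision boundary drawn as a line through the origin with slope 1. The red/blue mapping follows the narration; the particular ggplot2 calls are one reasonable way to do it.

library(ggplot2)

ggplot(df, aes(x = x1, y = x2, color = y)) +
  geom_point() +
  scale_color_manual(values = c("-1" = "red", "1" = "blue")) +
  geom_abline(slope = 1, intercept = 0)   # decision boundary x2 = x1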
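One way to carve out the margin described in the lesson: drop every point whose x1 and x2 values differ by less than 0.05, then reuse the same plotting code on the filtered data. The names delta and df_margin are assumptions for illustration.

# remove points within 0.05 of the decision boundary to open up a margin
delta <- 0.05
df_margin <- df[abs(df$x1 - df$x2) >= delta, ]

# replot with the same code as before, now using the filtered data
ggplot(df_margin, aes(x = x1, y = x2, color = y)) +
  geom_point() +
  scale_color_manual(values = c("-1" = "red", "1" = "blue")) +
  geom_abline(slope = 1, intercept = 0)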
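Finally, a sketch of delineating the margin: two dashed lines parallel to the decision boundary, offset by 0.05 in intercept on either side, as the narration describes. It continues from the previous snippet (delta and df_margin are defined there).

# dashed margin boundaries at x2 = x1 + 0.05 and x2 = x1 - 0.05
ggplot(df_margin, aes(x = x1, y = x2, color = y)) +
  geom_point() +
  scale_color_manual(values = c("-1" = "red", "1" = "blue")) +
  geom_abline(slope = 1, intercept = 0) +
  geom_abline(slope = 1, intercept =  delta, linetype = "dashed") +
  geom_abline(slope = 1, intercept = -delta, linetype = "dashed")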
#############################