R Tutorial: The gapminder dataset

http://youtube.com/watch?v=GmEKpKEakmQ

Want to learn more? Take the full course at https://learn.datacamp.com/courses/in... at your own pace. More than a video, you'll learn hands-on coding and quickly apply skills to your daily work.

---

Hi, I'm Dave Robinson and I'll be your instructor. I'm a data scientist and I love using R to dive into a dataset and discover interesting things. This course will get you started on the path to exploring and visualizing your own data with the R programming language.

This course introduces you to the tidyverse, a collection of data science tools within R for transforming and visualizing data. This is not the only set of tools in R, but it's a powerful and popular approach for exploring data.

At every step, you'll be analyzing a real dataset called gapminder. Gapminder tracks economic and social indicators like life expectancy and the GDP per capita of countries over time. The experience you gain in this example will help you in analyzing your own data.

You'll learn to draw specific insights and communicate them through informative visualizations with the ggplot2 package. This course is interactive: between the short videos, you'll complete exercises by typing in code, with help from us along the way.

The first code you'll write loads two R packages, which is done by writing library followed by the name of the package in parentheses, for example library(gapminder). R packages are tools that aren't built into the language but were created later by other programmers. Each of them provides tools that you don't have to write yourself.

The first package is gapminder, created by Jenny Bryan, which contains the dataset that you'll be analyzing. The second package is dplyr, created by Hadley Wickham, which provides step-by-step tools for transforming this data, such as filtering, sorting, and summarizing it.

You type gapminder to display the contents of the gapminder object, which is structured as a data frame.
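As a minimal sketch of the step just described, the two packages can be loaded and the dataset printed as follows. It assumes both packages are already installed; the install.packages() call mentioned in the comment is not covered in the video.

```r
# Load the two packages named in the video.
# Assumption: both are installed already, e.g. via
# install.packages(c("gapminder", "dplyr")).
library(gapminder)  # provides the gapminder dataset
library(dplyr)      # provides tools for filtering, sorting, and summarizing

# Typing the name of an object prints it to the console.
gapminder
```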
A data frame keeps rectangular data in rows and columns, similar to a spreadsheet, or a table in a SQL database. Most data analyses in R, and everything you'll do in this course, are centered around data frames. As described in the first line of the output, this is a special type of data frame called a tibble, though for now, you don't have to worry about the difference.

R displays the first ten rows so that you can get a glimpse of it, and you can see a short description in the first line. This tells you the tibble has 1,704 rows, each of which we call an observation. It has six columns, each of which we call a variable.

It's important in analysis to understand what each observation, or row, represents. Here, each represents a unique pair of a country and a year. For example, the first observation represents country statistics for Afghanistan in 1952, the second for Afghanistan in 1957, and so on.

For each combination of a country and year, the dataset contains several variables, or columns, describing the country's demographics. We see the continent - in this case, Asia - the life expectancy in years, the population, and the GDP per capita. The GDP per capita is the country's total economic output (Gross Domestic Product) divided by its population, and it's a common measure of how wealthy a country is. Each variable is of one consistent data type: some are numbers, like life expectancy and population, and some are categorical, like country and continent.

Even with this small glimpse of the data, you can extract a few insights. For example, you can see that Afghanistan's life expectancy and population have both gone up from 1952 to 1997, but that its GDP per capita has wavered. In the rest of this course, you'll learn to use R to draw many conclusions about the social and economic history of countries around the world.
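The description above can be checked directly in R. The following is a sketch rather than part of the course material: it uses base R's dim() and sapply() together with dplyr's filter(), and the variable names afghanistan and total_gdp are chosen here purely for illustration.

```r
library(gapminder)
library(dplyr)

# Number of observations (rows) and variables (columns);
# the video states 1,704 rows and six columns.
dim(gapminder)

# The class of each variable: factors for the categorical columns
# (country, continent), integer or numeric for the others.
sapply(gapminder, class)

# Keep only the observations for Afghanistan, one per recorded year.
# "afghanistan" is an illustrative name, not one used in the video.
afghanistan <- gapminder %>%
  filter(country == "Afghanistan")
afghanistan

# GDP per capita is total GDP divided by population, so multiplying
# the two columns recovers an estimate of total GDP for each year.
total_gdp <- afghanistan$gdpPercap * afghanistan$pop
head(total_gdp)
```

Filtering to a single country makes it easier to compare the printed rows against the trends described in the transcript.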
