- R Data Visualization Recipes
- Vitor Bianchi Lanzetta
- 2021-07-02 23:33:30
How it works...
Step 1 draws regression lines using ggplot2. After the ggplot2 package is loaded, the first step is to draw a scatterplot, which is stored in an object called scatter.
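The regression lines would be colored by the Species variable if the colour aesthetic were declared in ggplot() instead of geom_point(), because every layer inherits the aesthetics declared there. Declaring colour inside geom_smooth() would color the lines as well. The shape aesthetic is the exception: geom_smooth() has no shape, and fill only affects the confidence band, not the line itself.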
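A minimal sketch of this inheritance rule. It assumes the built-in iris data with Petal.Length and Petal.Width on the axes; the column choice and the object names inherited and smooth_data are illustrative assumptions, not taken from the recipe.

```r
library(ggplot2)

# colour is declared in ggplot(), so every layer inherits it:
# both the points and the regression lines are coloured by Species.
inherited <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width,
                              colour = Species)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE)

# Inspect the data computed for the smooth layer (layer 2)
# and count how many distinct line colours it uses.
smooth_data <- layer_data(inherited, 2)
length(unique(smooth_data$colour))
```

Moving colour = Species back into geom_point() leaves the smooth layer with a single default colour.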
After the scatterplot is drawn, the geom_smooth() layer is added to it with + to draw the regression lines. The group aesthetic declared in this layer requests a separate regression for each species. This layer has many other useful arguments; to see them, type ?geom_smooth.
This recipe sets three arguments directly: method, se, and show.legend. The first is set to 'lm', so the lines come from a linear model (other methods are available). The second is set to F (FALSE), so the confidence interval isn't drawn. The last keeps the regression lines out of the legend.
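Put together, Step 1 might look like the sketch below. The iris data set, the Petal.Length and Petal.Width columns, the alpha value, and the name with_lines are assumptions made for illustration; only scatter, the group aesthetic, and the three arguments come from the description above.

```r
library(ggplot2)

# Scatterplot: colour and shape are mapped inside geom_point() only,
# so later layers do not inherit them.
scatter <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point(aes(colour = Species, shape = Species), alpha = 0.5)

# One linear fit per species, no confidence band, no legend entry.
with_lines <- scatter +
  geom_smooth(aes(group = Species), method = "lm",
              se = FALSE, show.legend = FALSE)

with_lines  # printing the object renders the plot
```

Because group is the only aesthetic mapped in geom_smooth(), the three lines are fitted separately but share the default line colour.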
Step 2 draws a similar plot using ggvis. After layer_points() is called, pipes (%>%) are used to group the data by the Species variable and then draw the regression lines with layer_model_predictions(). The model and stroke arguments pick the prediction model and the line colors, respectively.
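A rough ggvis equivalent is sketched below, again assuming the iris data and the petal columns. The fill mapping on the points and the object name p_ggvis are illustrative assumptions; dplyr is loaded explicitly so that group_by() is available.

```r
library(ggvis)
library(dplyr)  # provides group_by()

p_ggvis <- iris %>%
  ggvis(~Petal.Length, ~Petal.Width) %>%
  layer_points(fill = ~Species) %>%
  # Grouping makes the next layer fit one model per species.
  group_by(Species) %>%
  # model picks the fitting function, stroke sets the line colours.
  layer_model_predictions(model = "lm", stroke = ~Species)

# Print p_ggvis in an interactive session to render the plot.
```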
Step 3 draws regression lines using the plotly package. The plot_ly() function initializes the plotly object; it does not draw anything yet. The next function, add_markers(), draws the points. Its showlegend argument asks for a legend that explains the points.
The add_markers() function is followed by a series of add_lines() calls, one for each species present in the data. All of them work in a very similar way. First, the data is filtered to a single species with the filter() function; in this way the recipe groups the data manually. Next, the fitted values of the linear model are passed as y. The linear models are fitted with the lm() function, and their fitted values are retrieved with the fitted() function. Drawing a single ungrouped regression line is much simpler: do not filter the data, and call add_lines() only once.
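The manual grouping can be sketched as follows, assuming the iris data and the petal columns. The helper fit_species(), the fit column, the trace names, and the object name p_plotly are hypothetical and serve only to keep the three add_lines() calls short; the recipe itself may compute the fitted values inline.

```r
library(plotly)
library(dplyr)

# Hypothetical helper: keep one species and attach its lm() fitted values.
fit_species <- function(sp) {
  d <- filter(iris, Species == sp)
  d$fit <- fitted(lm(Petal.Width ~ Petal.Length, data = d))
  d
}

p_plotly <- plot_ly(iris, x = ~Petal.Length, y = ~Petal.Width) %>%
  # Points, coloured by species and explained in the legend.
  add_markers(color = ~Species, showlegend = TRUE) %>%
  # One regression line per species, each from its own filtered data.
  add_lines(data = fit_species("setosa"), y = ~fit,
            name = "setosa fit", showlegend = FALSE) %>%
  add_lines(data = fit_species("versicolor"), y = ~fit,
            name = "versicolor fit", showlegend = FALSE) %>%
  add_lines(data = fit_species("virginica"), y = ~fit,
            name = "virginica fit", showlegend = FALSE)

p_plotly  # printing the object renders the plot
```

For the ungrouped case, the same idea reduces to one add_lines() call whose y values come from a model fitted to all rows of the data.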