Book: R Data Visualization Recipes
Author: Vitor Bianchi Lanzetta

How it works...

The recipe begins with data manipulation. The first step sets a seed with set.seed() because pseudo-random processes are about to be called; this makes the example reproducible. After loading the car package, the Salaries data set is copied into an object called dt.
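To see why the seed matters, the following minimal sketch draws the same random numbers twice. The seed value (50) and the object names are assumptions made for illustration; they are not taken from the recipe.

```r
# The seed value 50 is an arbitrary assumption; any fixed integer works.
set.seed(50)
first_draw <- runif(3, min = 0.8, max = 1.2)

# Resetting the same seed restarts the pseudo-random sequence.
set.seed(50)
second_draw <- runif(3, min = 0.8, max = 1.2)

# Both draws are exactly the same numbers.
identical(first_draw, second_draw)
```

Without the second set.seed() call the two draws would almost surely differ, which is why the recipe sets the seed before generating any values.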

The next three lines translate the categorical variable rank into numerical values. A specific range of values is "randomly" generated for each category and stored in a new column called rk. The relation between numerical ranges and categories is as follows:

Assistant Professors: from 0.8 to 1.2
Associate Professors: from 1.8 to 2.2
Professors: from 2.8 to 3.2
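The mapping listed above could be sketched as follows. To keep the example runnable without extra packages, dt here is a small hypothetical stand-in (made-up salaries, same rank levels as Salaries) rather than the real data set, and the seed value is an assumption.

```r
set.seed(50)  # assumed seed value

# Hypothetical stand-in for the recipe's dt; the salaries are made up.
dt <- data.frame(
  rank = factor(c("AsstProf", "AssocProf", "Prof", "Prof", "AsstProf"),
                levels = c("AsstProf", "AssocProf", "Prof")),
  salary = c(80000, 95000, 140000, 170000, 77000)
)

# One assignment per category: draw as many uniform values as there are
# rows in that category, inside the range reserved for it.
dt$rk <- NA_real_
dt$rk[dt$rank == "AsstProf"]  <- runif(sum(dt$rank == "AsstProf"),  0.8, 1.2)
dt$rk[dt$rank == "AssocProf"] <- runif(sum(dt$rank == "AssocProf"), 1.8, 2.2)
dt$rk[dt$rank == "Prof"]      <- runif(sum(dt$rank == "Prof"),      2.8, 3.2)

# Each rk value stays within 0.2 of its category centre (1, 2 or 3).
dt
```

Because the ranges do not overlap, the points of different categories stay visually separated on the x-axis while still being spread out within each category.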

Step 2 finally draws our "dot plot" using ggvis. Starting from the adjusted data (dt), layer_points() is called to draw the points geometry, with the rk variable supplied in place of rank. Notice the opacity argument, which sets up alpha blending. Afterwards, add_axis() is called to label the ticks properly.

There is one add_axis() call for each label (3 labels, 3 calls). We're actually plotting several x-axes, each displaying custom text at the tick specified by the values argument. As of ggvis version 0.4.3, passing a single value to this argument would sometimes prevent the package from rendering the figure properly; duplicating the tick reference in a vector solved this problem.

Each add_axis() call carries a single tick reference, while its text is set through the labels properties. Also, notice how the axis title was renamed. To do this, the title argument was given an empty string ('') in every add_axis() call except the last one, which was given the proper title string ('Rank').

Specifying several ticks in the values argument while setting several texts through properties won't work: all the texts would be displayed at once at every single tick set in values. This is why we need one add_axis() call for each label we want to customize.
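A sketch of what Step 2 might look like is given below. It is a reconstruction from the description above, not the recipe's exact code: the seed, the opacity value, the label wording, and the compact way dt is rebuilt are all assumptions.

```r
library(car)    # assumed to make the Salaries data set available
library(ggvis)

set.seed(50)    # assumed seed value
dt <- Salaries
# Compact equivalent of the three-range mapping: centre 1, 2 or 3 per rank,
# plus uniform noise between -0.2 and 0.2.
centers <- c(AsstProf = 1, AssocProf = 2, Prof = 3)
dt$rk <- unname(centers[as.character(dt$rank)]) + runif(nrow(dt), -0.2, 0.2)

p <- dt %>%
  ggvis(x = ~rk, y = ~salary) %>%
  layer_points(opacity := 0.4) %>%   # opacity value is an assumption
  # One add_axis() per label; each tick value is duplicated in a vector.
  add_axis("x", values = c(1, 1), title = "",
           properties = axis_props(labels = list(text = "Assistant Professor"))) %>%
  add_axis("x", values = c(2, 2), title = "",
           properties = axis_props(labels = list(text = "Associate Professor"))) %>%
  add_axis("x", values = c(3, 3), title = "Rank",
           properties = axis_props(labels = list(text = "Professor")))

p  # printing the object renders the plot
```

Only the last add_axis() call carries a non-empty title, so a single "Rank" title appears under the three overlaid axes.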

Step 3 draws a similar plot using plotly. The plot_ly() function is called with the manipulated data, with type set to 'scatter' and mode = 'markers', creating the "dot" visual. The pipe operator leads to layout(), which takes care of relabeling the x-axis: it receives a list under the xaxis argument, in which tickvals picks the tick values and ticktext sets the names to be displayed.
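The plotly step could be sketched along these lines. As before, the seed, the label wording, the axis title, and the compact construction of dt are assumptions rather than the recipe's exact code.

```r
library(car)     # assumed to make the Salaries data set available
library(plotly)

set.seed(50)     # assumed seed value
dt <- Salaries
# Centre 1, 2 or 3 per rank, plus uniform noise between -0.2 and 0.2.
centers <- c(AsstProf = 1, AssocProf = 2, Prof = 3)
dt$rk <- unname(centers[as.character(dt$rank)]) + runif(nrow(dt), -0.2, 0.2)

p <- plot_ly(dt, x = ~rk, y = ~salary,
             type = "scatter", mode = "markers") %>%
  layout(xaxis = list(
    title    = "Rank",                      # assumed axis title
    tickvals = c(1, 2, 3),                  # where the ticks are placed
    ticktext = c("Assistant Professor",     # what each tick displays
                 "Associate Professor",
                 "Professor")
  ))

p  # printing the object renders the plot
```

Unlike the ggvis version, a single xaxis list is enough here, since tickvals and ticktext are matched position by position.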

The last step uses ggplot2. The plot is initialized with ggplot() using the original data set. geom_jitter() is then added to display the jittered points. It sets height = 0 so that the points won't move vertically, while width = .2 controls the amount of movement (noise) added horizontally. It also sets alpha blending. Colors could be added simply by declaring aes(colour = sex) inside ggplot() or geom_jitter().
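A minimal sketch of the ggplot2 step follows. The alpha value and the seed are assumptions; height = 0 and width = .2 come from the description above.

```r
library(car)      # assumed to make the Salaries data set available
library(ggplot2)

set.seed(50)      # assumed seed; geom_jitter() also draws random noise

# The original data set is used directly: rank stays categorical and
# geom_jitter() adds the horizontal noise itself.
p <- ggplot(Salaries, aes(x = rank, y = salary)) +
  geom_jitter(height = 0, width = 0.2, alpha = 0.4)  # alpha value assumed

# Optional variant: colour the points by sex.
p_coloured <- ggplot(Salaries, aes(x = rank, y = salary, colour = sex)) +
  geom_jitter(height = 0, width = 0.2, alpha = 0.4)

p  # printing the object renders the plot
```

Since the jitter is handled by the geometry, no rk column and no manual axis relabeling are needed in this version.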