--- title: "Tutorial: Toy Data Set" output: rmarkdown::html_vignette bibliography: rankrate.bib vignette: > %\VignetteIndexEntry{tutorial} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) options(rmarkdown.html_vignette.check_title = FALSE) ``` In this tutorial, we demonstrate the key functionality of the `rankrate` package on a toy data set first analyzed in @gallo2022new. The tutorial includes code for visualization of rankings and ratings, as well as demonstration of functions for model estimation and inference. We begin by loading necessary packages. ```{r warning=FALSE,message=FALSE} library(rankrate) library(reshape2) # data reformatting library(pander) # creating nice tables library(ggplot2) # creating nice figures ``` ## Exploratory Data Analysis We now load the `ToyData3` data set, which is included in the `rankrate` package. The data set includes `r nrow(ToyData3$rankings)` judges who assess `r ncol(ToyData3$rankings)` objects with rankings and ratings. Specifically, each judge provides a complete ranking of all objects, and rates each object using the integers between 0 (best) and `r ToyData3$M` (worst). The ratings provided by each judge need not align with his/her ranking. ```{r} data("ToyData3") ``` ### Visualizing rankings Let's start by displaying rankings from the toy data set in tabular and graphical form. ```{r warning=FALSE, fig.asp = 0.7, out.width="90%",dpi=300} rankings_table <- as.data.frame(ToyData3$rankings) rownames(rankings_table) <- paste0("Judge ",1:16) names(rankings_table) <- paste0("Rank ",1:3) pander(rankings_table) rankings_long <- melt(ToyData3$rankings) names(rankings_long) <- c("Judge","Rank","Proposal") ggplot(rankings_long,aes(x=Proposal,fill=factor(Rank)))+theme_bw(base_size=10)+ geom_bar()+scale_fill_manual(values=c("#31A354","#A1D99B","#E5F5E0"))+ labs(fill="Rank",y="Count")+ggtitle("Ranks by Proposal")+ theme(panel.grid = element_blank()) ``` ### Visualizing ratings Let's start by displaying rankings from the toy data set in tabular and graphical form. ```{r warning=FALSE, fig.asp = 0.7, out.width="90%",dpi=300} ratings_table <- as.data.frame(ToyData3$ratings) rownames(ratings_table) <- paste0("Judge ",1:16) names(ratings_table) <- c("Proposal: 1",2:3) set.alignment("right") pander(ratings_table) ratings_long <- melt(ToyData3$ratings) names(ratings_long) <- c("Judge","Proposal","Rating") ggplot(ratings_long,aes(x=factor(Proposal),y=Rating))+theme_bw(base_size=10)+ geom_boxplot(color="gray")+geom_jitter(height=0,width=0.4,alpha=0.75)+ labs(x="Proposal",y="Rating")+ggtitle("Ratings by Proposal")+ theme(panel.grid = element_blank()) ``` ## Estimation Now, let's fit a Mallows-Binomial model to our toy dataset. Given the relatively small size of the data, we will use the exact MLE search method, "ASTAR". ```{r} MLE_mb <- fit_mb(rankings=ToyData3$rankings,ratings=ToyData3$ratings,M=ToyData3$M,method="ASTAR") pander(data.frame(Parameter=c("Consensus Ranking, pi_0", "Object Quality Parameter, p", "Consensus Scale Parameter, theta"), MLE=c(paste0(MLE_mb$pi0,collapse="<"), paste0("(",paste0(round(MLE_mb$p,2),collapse=","),")"), round(MLE_mb$theta,2)))) ``` ## Inference Furthermore, we can bootstrap confidence intervals using the `ci_mb` function. Below, we calculate and display 95% confidence intervals for the object quality parameters, `p`, as well as for the estimated rank of each object. We observe that even though objects 1 and 2 are estimated to be of similar quality, object 1 is still preferred to object 2 with high certainty. Object 3 is in third place. ```{r, fig.asp = 0.7, out.width="90%",dpi=300} CI_mb <- ci_mb(rankings=ToyData3$rankings,ratings=ToyData3$ratings,M=ToyData3$M, interval=0.95,nsamples=200,method="ASTAR") plot_p <- as.data.frame(cbind(1:3,MLE_mb$p,t(CI_mb$ci[,1:3]))) names(plot_p) <- c("Proposal","PointEstimate","Lower","Upper") ggplot(plot_p,aes(x=Proposal,y=PointEstimate,ymin=Lower,ymax=Upper))+ geom_point()+geom_errorbar()+ylim(c(0,1))+theme_bw(base_size=10)+ labs(x="Proposal",y="Estimated Quality (95% CI)")+ ggtitle("Estimated Quality by Object")+ theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank()) plot_pi0 <- as.data.frame(cbind(1:3,order(MLE_mb$pi0),t(CI_mb$ci_ranks))) names(plot_pi0) <- c("Proposal","PointEstimate","Lower","Upper") ggplot(plot_pi0,aes(x=Proposal,y=PointEstimate,ymin=Lower,ymax=Upper))+ geom_point()+geom_errorbar()+theme_bw(base_size=10)+ scale_y_continuous(breaks=1:3)+ labs(x="Proposal",y="Estimated Rank (95% CI)")+ ggtitle("Estimated Rank by Object")+ theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank()) ``` ## References