Monday, September 29, 2014

R Statistical Software Basics – Descriptive Statistics - Percentiles

The practice sheet, titled StatisticMarks Data.csv, can be downloaded from the link: Download Sheet

About the Data Sheet - The data in this sheet are the marks scored by 100 students in a statistics test.

Based on this data, we will use R's statistical functions to compute descriptive statistics.

In the data sheet, the data are in cells A2:A101, with A1 holding the column header. I have stored the StatisticMarks.csv file in my working directory, a folder on my Desktop, which is set in R with

setwd("C:/Users/Rajesh Prabhakar/Desktop/R")

To read the data from the "StatisticMarks.csv" file, the R command is

StatMarks=read.csv("StatisticMarks.csv")
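After reading a file it is worth checking that the import worked as expected. The lines below are a minimal sketch of such a check. Because the real sheet is not reproduced here, they write a small hypothetical stand-in file (five made-up marks) to a temporary location; with the real file you would run only the last three lines on StatMarks.

```r
# Hypothetical stand-in for StatisticMarks.csv: a header row and five made-up marks
tmp <- tempfile(fileext = ".csv")
writeLines(c("StatisticsMarks", "56", "71", "75", "80", "97"), tmp)

StatMarks <- read.csv(tmp)

names(StatMarks)   # column headers taken from the first row of the file
nrow(StatMarks)    # number of data rows, header excluded
str(StatMarks)     # column types; the marks should be numeric (integer here)
```

For the real sheet, nrow(StatMarks) should report 100, one row per student.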

Percentiles
Assume that the elements in a data set are rank ordered from the smallest to the largest. The values that divide a rank-ordered set of elements into 100 equal parts are called percentiles.
An element with a percentile rank of Pi has a greater value than i percent of all the elements in the set. Thus, the observation at the 50th percentile is denoted P50, and it is greater than 50 percent of the observations in the set.
An observation at the 50th percentile would correspond to the median value in the set.
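The link between the 50th percentile and the median can be checked directly. This is only an illustrative sketch: the marks vector holds ten made-up values, not the data from the sheet, and it uses R's quantile function, which is described just below.

```r
# Ten hypothetical marks (not the values from the data sheet)
marks <- c(40, 55, 60, 62, 70, 75, 81, 88, 90, 95)

p50 <- quantile(marks, 0.50)   # 50th percentile
p50
median(marks)                  # should agree with the 50th percentile

# Share of observations strictly below the 50th percentile
mean(marks < p50)
```

With these values both the median and P50 come out as 72.5, and half of the observations lie below it.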
In R, quartiles and percentiles are computed with the function "quantile". It takes the data (a numeric vector, written x below) followed by the percentiles we want, given as proportions between 0 and 1

quantile(x, c(.10,.20,.30,.40,.50,.60,.70,.80,.90,.95))
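Two small variations on this call may be useful. The sketch below again uses a made-up marks vector rather than the sheet's data: calling quantile with no second argument returns the quartiles, and the list of proportions can be built with seq instead of being typed out.

```r
# Ten hypothetical marks (not the values from the data sheet)
marks <- c(40, 55, 60, 62, 70, 75, 81, 88, 90, 95)

# With no probabilities given, quantile() returns the minimum,
# the three quartiles and the maximum (0%, 25%, 50%, 75%, 100%)
quartiles <- quantile(marks)
quartiles

# Build .10, .20, ..., .90 with seq() and append .95
probs <- c(seq(0.10, 0.90, by = 0.10), 0.95)
deciles <- quantile(marks, probs)
deciles
```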

In our data sheet, the data are in A2:A101 and the header in A1 is titled StatisticsMarks, so the command becomes

quantile(StatMarks$StatisticsMarks, c(.10,.20,.30,.40,.50,.60,.70,.80,.90,.95)) 

StatMarks is the name of the variable in which we stored the data. It is followed by the $ sign and the column header of the data, i.e. StatisticsMarks.

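Remember that R is case-sensitive: the title of the column must match exactly, including upper and lower case, or else the column will not be found and the command will not give the expected result.

In general, the names you type in R for files, column headers and row headers should match the originals exactly, or else the function will give errors or unexpected results.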
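A quick way to avoid this problem is to look at the column names before using them. The sketch below builds a small hypothetical data frame in place of the real StatMarks, so the values are made up; only the column name matches the sheet.

```r
# Hypothetical stand-in for the data read from the sheet
StatMarks <- data.frame(StatisticsMarks = c(56, 71, 75, 80, 97))

names(StatMarks)                           # lists the exact column headers

"StatisticsMarks" %in% names(StatMarks)    # TRUE: exact match
"statisticsmarks" %in% names(StatMarks)    # FALSE: the case differs

# For a data frame, $ with a name that does not match returns NULL
is.null(StatMarks$statisticsmarks)         # TRUE
```

Because a mistyped column comes back as NULL rather than stopping with an obvious message, checking names(StatMarks) first is a cheap safeguard.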

The result of this function in R Console is

> quantile(StatMarks$StatisticsMarks,c(.10,.20,.30,.40,.50,.60,.70,.80,.90,.95))
 10%  20%  30%  40%  50%  60%  70%  80%  90%  95% 
56.8 66.0 71.0 73.0 75.0 78.0 80.0 81.2 91.1 97.0 


10% of students scored up to 56.8 marks, 20% scored up to 66 marks, 30% scored up to 71 marks, 50% scored up to 75 marks, and so on.
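This reading of the output can be checked by counting the share of marks at or below each percentile. The sketch below uses ten made-up marks, not the sheet's data, and relies on quantile's default method, which interpolates between neighbouring values. That interpolation is also why a percentile such as 56.8 need not be a mark that any student actually scored.

```r
# Ten hypothetical marks (not the values from the data sheet)
marks <- c(40, 55, 60, 62, 70, 75, 81, 88, 90, 95)

cutoffs <- quantile(marks, c(.10, .50, .90))
cutoffs

# For each percentile, the proportion of marks at or below it
shares <- sapply(cutoffs, function(cutoff) mean(marks <= cutoff))
shares
```

With these values the shares are 0.1, 0.5 and 0.9, matching the percentiles that were requested.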
