<Review!>
Does anyone have an age smaller than 0?: table(age < 0)
How many missing values does the variable age have?: table(is.na(age))
Are the people with BMI 0 the same people with insulin 0?: table((BMI == 0) & (insulin == 0))
Change variable type: dat$lunchtime <- as.Date(dat$lunchtime)
Indexing a data frame: dat[row, column]
To refer to a variable, use dat$___ or attach(dat)
To undo attach(dat) once you are done: detach(dat)
dat[dat$Age != 0, ] => extract the rows where age is not equal to 0
dat_female <- dat[dat$Gender == "F", ] => create a new object (dataset) with just the female data
dat_final <- data.frame(ID = dat_female$PatientId, Age = dat_female$Age, NoShow = dat_female$`No-show`) => backticks are needed because the column name contains a hyphen (without them R reads it as No minus show)
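The review commands above can be tried out on a small toy data frame. This is only a sketch: the values are made up, and the column names (PatientId, Age, Gender, No-show) are assumed to match the exercise dataset.

```r
# Toy data standing in for the course dataset (all values are made up).
dat <- data.frame(
  PatientId = 1:6,
  Age       = c(34, 0, 51, -1, NA, 28),
  Gender    = c("F", "M", "F", "F", "M", "F"),
  `No-show` = c("No", "Yes", "No", "No", "Yes", "Yes"),
  check.names = FALSE   # keep the hyphen in the column name
)

# Data checks: negative ages and missing ages
neg_age  <- table(dat$Age < 0)     # NA is left out of the table by default
miss_age <- table(is.na(dat$Age))

# Keep rows whose age is known and not 0; which() drops the NA comparison
dat_nonzero <- dat[which(dat$Age != 0), ]

# Female patients only, then a slimmer data frame with renamed columns
dat_female <- dat[dat$Gender == "F", ]
dat_final <- data.frame(
  ID     = dat_female$PatientId,
  Age    = dat_female$Age,
  NoShow = dat_female$`No-show`    # backticks because of the hyphen
)
```

Note that dat[dat$Age != 0, ] without which() would keep an all-NA row for the patient whose age is missing, which is why the sketch wraps the condition in which().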
<Descriptive analysis>
Goal:
Describe the main characteristics of your study sample,
i.e. the main personal/sociodemographic variables, covariates, and outcome/exposure variables
Goal 1: for yourself
- Create plots (and tables) to understand the characteristics of the variables, their distributions, and the associations between variables
Goal 2: for presenting to others
- Describe the main characteristics of your study sample
- Present these descriptive statistics in an easily accessible minimal table (or graphic)
Table 1: main characteristics of the sample
- Informative? Relevant? Easy to understand?
- ex) study of change in wellbeing following work exit in 8037 persons
- absolute numbers, frequencies
- all important & relevant variables that will be analyzed later should be included
- (!) %: are these row-wise or column-wise?
- how is each variable measured?: nominal, ordinal, metric ==> different kinds of summary statistics
For nominal variables >> frequencies
- Absolute frequencies : table(var1)
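- Relative frequencies: table(var1)/length(var1) (length() also counts missing values, whereas prop.table(table(var1)) uses only the non-missing ones)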
- Further functions to create frequency tables: prop.table(), janitor::tabyl(), summarytools::freq()
- Frequencies of 2 variables: table(var1, var2)
- Alternative: expss::cro(var1, var2)
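A small sketch of the frequency tables above, using base R only. The vectors sms and no_show are made-up examples, not the exercise data. The margin argument of prop.table() is what decides whether percentages are row-wise or column-wise.

```r
# Made-up example vectors; sms and no_show are hypothetical names.
sms     <- c("yes", "no", "yes", "yes", "no", "no", "yes", "no")
no_show <- c("no", "no", "yes", "no", "yes", "no", "no", "yes")

abs_freq <- table(sms)              # absolute frequencies
rel_freq <- prop.table(abs_freq)    # relative frequencies, summing to 1

cross   <- table(sms, no_show)                    # 2 x 2 cross-tabulation
row_pct <- prop.table(cross, margin = 1) * 100    # percentages within each row
col_pct <- prop.table(cross, margin = 2) * 100    # percentages within each column

round(row_pct, 1)
round(col_pct, 1)
```

Row percentages answer "of those who got an SMS, how many did not show up?", while column percentages answer "of those who did not show up, how many got an SMS?".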
Ways to save the statistics/tables:
- Draft the table in e.g. Word and copy/paste the values from R manually.
- Create your table directly in R and export it to e.g. a csv or Excel file (or knit directly to a Word/PDF file)
- e.g. use the openxlsx::write.xlsx(), writexl::write_xlsx(), write.table(), or write.csv() functions
- Create your report through R Markdown and generate the tables / figures there directly.
- e.g. use the summary_table() function in the qwraps2 package to generate formatted tables.
- (!) if you knit to a Word file, you get a formatted and editable table in Word!
- See https://cran.r-project.org/web/packages/qwraps2/vignettes/summary-statistics.html.
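As a rough sketch of the "create the table in R and export it" route, the example below builds a small frequency table and writes it with write.csv(). The data and the file name table1.csv are made up, and the file goes to a temporary folder so that nothing in the working directory is overwritten.

```r
# Made-up data; object and file names are illustrative only.
gender <- c("female", "male", "female", "female", "male", "female")

# Frequency table as a data frame with a count and a percentage column
tab <- as.data.frame(table(gender), responseName = "n")
tab$percent <- round(100 * tab$n / sum(tab$n), 1)

out_file <- file.path(tempdir(), "table1.csv")   # temporary location
write.csv(tab, out_file, row.names = FALSE)

# Read the file back to confirm what was written
check <- read.csv(out_file)

# For an Excel file, and assuming the writexl package is installed:
# writexl::write_xlsx(tab, file.path(tempdir(), "table1.xlsx"))
```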
Exercise with>>
Main question: Does sending a reminder SMS have an effect on whether people come to their doctor's appointment?
Learn how to save as Excel! (rewatch the video)
Frequency plots:
- Bar plot: barplot(table(var1))
- Pie chart: pie(table(var1))
- Stratified barplots with barplot(table(var1, var2)) or mosaicplot(table(var1,var2))
- Histogram: hist() for metric/continuous variables
Save plots by opening a file device with pdf() or jpeg(), drawing the plot, and then closing the device with dev.off().
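A minimal sketch of the plot-and-save workflow, with made-up data and an illustrative file name. The file is written to a temporary folder; swap in your own path as needed.

```r
# Made-up data for illustration.
sms <- c("yes", "no", "yes", "yes", "no", "no", "yes", "no")
age <- c(23, 35, 41, 29, 62, 55, 38, 47)

plot_file <- file.path(tempdir(), "descriptive_plots.pdf")

pdf(plot_file, width = 7, height = 4)   # open the pdf file device
par(mfrow = c(1, 2))                    # two plots side by side
barplot(table(sms), main = "SMS reminder", ylab = "Count")
hist(age, main = "Age", xlab = "Age (years)")
invisible(dev.off())                    # close the device so the file is written
```

Everything drawn between pdf() and dev.off() ends up in the file instead of the plot window; forgetting dev.off() is a common reason for an empty or unreadable file.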
Descriptive statistics of ordinal variables >> bar plot, mosaic plot, histogram, boxplot, scatterplots
- Frequencies (if not many categories)
- Minimum, maximum, median
- Range, quantiles, IQR, median absolute deviation (MAD)
- Used but not fully appropriate: mean, standard deviation (SD)
For continuous/metric variables >> Histogram, boxplot, scatterplots, Quantile-quantile plots
- Mean, median, min., max.
- Range, quantiles, IQR, MAD, SD, variance
- (Skewness, kurtosis)
Things to remember!
Use the na.rm = TRUE option to remove missing values! ex) mean(dat$age, na.rm = TRUE)
sd() and var() use the denominator n - 1 (sample SD and variance)
quantile() has 9 different ways (type = 1 to 9) to compute quantiles; the default is type = 7
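The three reminders above can be checked on a small made-up vector. This is a sketch with base R functions only; the numbers carry no meaning beyond the illustration.

```r
# Made-up ages with one missing value.
age <- c(23, 35, 41, 29, 62, NA, 38, 47)

mean(age)                                 # NA, because one value is missing
age_mean   <- mean(age, na.rm = TRUE)
age_median <- median(age, na.rm = TRUE)
age_sd     <- sd(age, na.rm = TRUE)
age_iqr    <- IQR(age, na.rm = TRUE)
age_mad    <- mad(age, na.rm = TRUE)

# var() divides by n - 1, not n: compare with a manual calculation
x <- age[!is.na(age)]
manual_var <- sum((x - mean(x))^2) / (length(x) - 1)
all.equal(manual_var, var(x))

# Different quantile definitions can give different answers
q7 <- quantile(age, probs = 0.25, na.rm = TRUE, type = 7)   # default type
q1 <- quantile(age, probs = 0.25, na.rm = TRUE, type = 1)
```

With only seven non-missing values, the two quantile types already disagree on the first quartile, so it is worth stating which type was used when reporting quantiles from small samples.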