[Statistical Analysis with R] Overview
Overview and help for R
R introduction: http://r-tutorial.nl
R download and overview: https://cran.r-project.org
R packages: https://cran.r-project.org/web/packages
R journal: https://journal.r-project.org
RStudio cheatsheets: www.rstudio.com/resources/cheatsheets
Overview: www.rstudio.com/online-learning
Publicly available datasets for research: https://datasetsearch.research.google.com/
Download and installation of R
R: Download and install the most recent, archived, or development version from https://cran.r-project.org/.
RStudio: Download and install from www.rstudio.com/products/rstudio/download
Help with the installation can be found, e.g., at http://r-tutorial.nl/.
Update R, e.g., with the updateR() function (among other functions) in the installr package (https://cran.r-project.org/web/packages/installr).
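A minimal sketch of the update step described above. It assumes a Windows machine (installr targets Windows) and an internet connection; the helper name update_r_if_possible is hypothetical and only used for illustration.

```r
# Hypothetical helper: update R through installr where that is possible.
update_r_if_possible <- function() {
  # installr is designed for Windows; elsewhere, reinstall R from CRAN.
  if (.Platform$OS.type != "windows") {
    message("installr targets Windows; download R from https://cran.r-project.org instead.")
    return(invisible(FALSE))
  }
  # Install the package once if it is missing.
  if (!requireNamespace("installr", quietly = TRUE)) {
    install.packages("installr")
  }
  # Checks for a newer R release and walks through the installation.
  installr::updateR()
  invisible(TRUE)
}

# Show which R version is currently running.
R.version.string
```

Calling update_r_if_possible() on Windows starts the update; on other systems it only prints a hint and returns FALSE.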
Background
1. Empirical research process; analysis & interpretation
Theoretical Q <--------R------> Empirical observations
-Operationalization of Constructs
-Data collection
-Data organization & digitalization
-Data check & manipulation
-Data analysis --> Describe the sample (descriptive statistics) / use the sample to draw inferences about the general population (inferential statistics)
-Interpretation of results --> Is the sample representative? / Are the results convincing?
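The difference between describing a sample and drawing an inference about the population can be sketched in a few lines of R. The built-in mtcars dataset and the one-sample confidence interval are illustrative choices, not part of the notes above.

```r
# mtcars ships with R (datasets package); it is used here purely for illustration.
mpg <- mtcars$mpg

# Descriptive statistics: summarise the sample itself.
sample_mean <- mean(mpg)
sample_sd   <- sd(mpg)
summary(mpg)

# Inferential statistics: use the sample to estimate the population mean.
# The 95% confidence interval assumes a random sample and approximate normality.
ci <- t.test(mpg)$conf.int
ci
```

Whether such an interval is meaningful depends on the interpretation questions above, in particular on whether the sample is representative.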
2. R & RStudio history
-1980s: S, S-PLUS developed by Becker & Chambers.
-Ross Ihaka & Robert Gentleman (University of Auckland): development of a reduced version of S: R.
-A new version is released approx. every 2-3 months; current version (at the time of these notes): 4.0.3
R: ~ Engine
-Statistical computer program
-Complete programming language
-Environment to perform statistical analyses, produce graphics
>It's free and widely used in life sciences, public health, etc.
>Active developers keep extending it and adding new features (e.g. Hadley Wickham), e.g. working in the cloud (https://rstudio.cloud)
RStudio: ~ Program
-Nice graphical interface, with additional options and functionalities
Other programs:
SPSS: all analyses can be run by "clicking" (without programming) or with programming; not open source
SAS: some analyses can be run by clicking, but it is generally made for programming; high costs
Python: Similar to R: increasing user base and applications, especially in machine learning
>used more in computer science, machine learning
Introduction, overview:
R for Data Science: http://r4ds.had.co.nz/index.html
Dalgaard (2008). Introductory Statistics with R.
Crawley (2012). The R book.
Advanced R, Graphics in R, and further references:
Advanced R: https://adv-r.hadley.nz/
Matloff (2011). The art of R programming.
ggplot2: Elegant Graphics for Data Analysis: http://had.co.nz/ggplot2/
Dynamic Documents with R and knitr: https://github.com/yihui/knitr-book
Practical Regression and Anova using R: https://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
Packages contain different categories of functions: search for packages by what you are looking for
>> e.g. https://cran.r-project.org/web/packages/ggplot2/index.html
ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics
A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
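The description above (provide the data, map variables to aesthetics, choose a graphical primitive) can be sketched as follows. The mtcars dataset and the choice of a scatter plot are illustrative assumptions; the code only runs the plot if ggplot2 is already installed.

```r
# Install once if needed: install.packages("ggplot2")
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Data: mtcars; aesthetics: weight on x, miles per gallon on y;
  # graphical primitive: points (a scatter plot).
  p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point()
  print(p)
} else {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\") first.")
}
```

Swapping geom_point() for another geom changes the primitive while the data and aesthetic mapping stay the same.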
To remove an assigned object or dataset, use the rm() command, e.g. rm(object) or rm(dataset).
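A short sketch of removing objects from the workspace. The object names my_vector and my_data are hypothetical and chosen only for illustration.

```r
# Assign two illustrative objects.
my_vector <- c(1, 2, 3)
my_data   <- data.frame(id = 1:3, score = c(10, 20, 30))

exists("my_vector")   # TRUE: the object is in the workspace

# Remove a single object, then check that it is gone.
rm(my_vector)
exists("my_vector")   # FALSE

# Objects can also be removed by name via a character vector.
rm(list = "my_data")

# rm(list = ls()) would clear the whole workspace; use it with care.
```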