Statistical Analysis with R For Dummies

by Joseph Schmuller, PhD

Contents at a Glance

Part 1: Getting Started with Statistical Analysis with R
Data, Statistics, and Decisions
R: What It Does and How It Does It
Part 2: Describing Data
Getting Graphic
Finding Your Center
Deviating from the Average
Meeting Standards and Standings
Summarizing It All
What’s Normal?
Part 3: Drawing Conclusions from Data
The Confidence Game: Estimation
One-Sample Hypothesis Testing
Two-Sample Hypothesis Testing
Testing More than Two Samples
More Complicated Testing
Regression: Linear, Multiple, and the General Linear Model
Correlation: The Rise and Fall of Relationships
Curvilinear Regression: When Relationships Get Complicated
Part 4: Working with Probability
Introducing Probability
Introducing Modeling
Part 5: The Part of Tens
Ten Tips for Excel Emigrés
Ten Valuable Online R Resources

Book Details
Price: 3.00
Pages: 456
File Size: 11,098 KB
File Type: PDF
ISBN: 978-1-119-33706-5; 978-1-119-33726-3 (ebk); 978-1-119-33709-6 (ebk)
Copyright: © 2017 by John Wiley & Sons, Inc.

Introduction
So you’re holding a statistics book. In my humble (and absolutely biased)
opinion, it’s not just another statistics book. It’s also not just another
R book. I say this for two reasons.

First, many statistics books teach you the concepts but don’t give you an easy way
to apply them. That often leads to a lack of understanding. Because R is ready-made
for statistics, it’s a tool for applying (and learning) statistics concepts.

Second, let’s look at it from the opposite direction: Before I tell you about one of
R’s features, I give you the statistical foundation it’s based on. That way, you
understand that feature when you use it — and you use it more effectively.

I didn’t want to write a book that only covers the details of R and introduces some
clever coding techniques. Some of that is necessary, of course, in any book that
shows you how to use a software tool like R. My goal was to go way beyond that.

Neither did I want to write a statistics “cookbook”:
when-faced-with-problem-category-#152-use-statistical-procedure-#346.
My goal was to go way beyond that, too.

Bottom line: This book isn’t just about statistics or just about R — it’s firmly at
the intersection of the two. In the proper context, R can be a great tool for teaching
and learning statistics, and I’ve tried to supply the proper context.

Table of Contents
INTRODUCTION
About This Book
Similarity with This Other For Dummies Book
What You Can Safely Skip
Foolish Assumptions
How This Book Is Organized
Part 1: Getting Started with Statistical Analysis with R
Part 2: Describing Data
Part 3: Drawing Conclusions from Data
Part 4: Working with Probability
Part 5: The Part of Tens
Online Appendix A: More on Probability
Online Appendix B: Non-Parametric Statistics
Online Appendix C: Ten Topics That Just Didn’t Fit in Any Other Chapter
Icons Used in This Book
Where to Go from Here
PART 1: GETTING STARTED WITH STATISTICAL ANALYSIS WITH R
CHAPTER 1: Data, Statistics, and Decisions
The Statistical (and Related) Notions You Just Have to Know
Samples and populations
Variables: Dependent and independent
Types of data
A little probability
Inferential Statistics: Testing Hypotheses
Null and alternative hypotheses
Two types of error
CHAPTER 2: R: What It Does and How It Does It
Downloading R and RStudio
A Session with R
The working directory
So let’s get started, already
Missing data
R Functions
User-Defined Functions
Comments
R Structures
Vectors
Numerical vectors
Matrices
Factors
Lists
Lists and statistics
Data frames
Packages
More Packages
R Formulas
Reading and Writing
Spreadsheets
CSV files
Text files
PART 2: DESCRIBING DATA
CHAPTER 3: Getting Graphic
Finding Patterns
Graphing a distribution
Bar-hopping
Slicing the pie
The plot of scatter
Of boxes and whiskers
Base R Graphics
Histograms
Adding graph features
Bar plots
Pie graphs
Dot charts
Bar plots revisited
Scatter plots
Box plots
Graduating to ggplot2
Histograms
Bar plots
Dot charts
Bar plots re-revisited
Scatter plots
Box plots
Wrapping Up
CHAPTER 4: Finding Your Center
Means: The Lure of Averages
The Average in R: mean()
What’s your condition?
Eliminate $-signs forth with()
Exploring the data
Outliers: The flaw of averages
Other means to an end
Medians: Caught in the Middle
The Median in R: median()
Statistics à la Mode
The Mode in R
CHAPTER 5: Deviating from the Average
Measuring Variation
Averaging squared deviations: Variance and how to calculate it
Sample variance
Variance in R
Back to the Roots: Standard Deviation
Population standard deviation
Sample standard deviation
Standard Deviation in R
Conditions, Conditions, Conditions . . .
CHAPTER 6: Meeting Standards and Standings
Catching Some Z’s
Characteristics of z-scores
Bonds versus the Bambino
Exam scores
Standard Scores in R
Where Do You Stand?
Ranking in R
Tied scores
Nth smallest, Nth largest
Percentiles
Percent ranks
Summarizing
CHAPTER 7: Summarizing It All
How Many?
The High and the Low
Living in the Moments
A teachable moment
Back to descriptives
Skewness
Kurtosis
Tuning in the Frequency
Nominal variables: table() et al.
Numerical variables: hist()
Numerical variables: stem()
Summarizing a Data Frame
CHAPTER 8: What’s Normal?
Hitting the Curve
Digging deeper
Parameters of a normal distribution
Working with Normal Distributions
Distributions in R
Normal density function
Cumulative density function
Quantiles of normal distributions
Random sampling
A Distinguished Member of the Family
PART 3: DRAWING CONCLUSIONS FROM DATA
CHAPTER 9: The Confidence Game: Estimation
Understanding Sampling Distributions
An EXTREMELY Important Idea: The Central Limit Theorem
(Approximately) Simulating the central limit theorem
Predictions of the central limit theorem
Confidence: It Has Its Limits!
Finding confidence limits for a mean
Fit to a t
CHAPTER 10: One-Sample Hypothesis Testing
Hypotheses, Tests, and Errors
Hypothesis Tests and Sampling Distributions
Catching Some Z’s Again
Z Testing in R
t for One
t Testing in R
Working with t-Distributions
Visualizing t-Distributions
Plotting t in base R graphics
Plotting t in ggplot2
One more thing about ggplot2
Testing a Variance
Testing in R
Working with Chi-Square Distributions
Visualizing Chi-Square Distributions
Plotting chi-square in base R graphics
Plotting chi-square in ggplot2
CHAPTER 11: Two-Sample Hypothesis Testing
Hypotheses Built for Two
Sampling Distributions Revisited
Applying the central limit theorem
Z’s once more
Z-testing for two samples in R
t for Two
Like Peas in a Pod: Equal Variances
t-Testing in R
Working with two vectors
Working with a data frame and a formula
Visualizing the results
Like p’s and q’s: Unequal variances
A Matched Set: Hypothesis Testing for Paired Samples
Paired Sample t-testing in R
Testing Two Variances
F-testing in R
F in conjunction with t
Working with F-Distributions
Visualizing F-Distributions
CHAPTER 12: Testing More than Two Samples
Testing More Than Two
A thorny problem
A solution
Meaningful relationships
ANOVA in R
Visualizing the results
After the ANOVA
Contrasts in R
Unplanned comparisons
Another Kind of Hypothesis, Another Kind of Test
Working with repeated measures ANOVA
Repeated measures ANOVA in R
Visualizing the results
Getting Trendy
Trend Analysis in R
CHAPTER 13: More Complicated Testing
Cracking the Combinations
Interactions
The analysis
Two-Way ANOVA in R
Visualizing the two-way results
Two Kinds of Variables . . . at Once
Mixed ANOVA in R
Visualizing the Mixed ANOVA results
After the Analysis
Multivariate Analysis of Variance
MANOVA in R
Visualizing the MANOVA results
After the analysis
CHAPTER 14: Regression: Linear, Multiple, and the General Linear Model
The Plot of Scatter
Graphing Lines
Regression: What a Line!
Using regression for forecasting
Variation around the regression line
Testing hypotheses about regression
Linear Regression in R
Features of the linear model
Making predictions
Visualizing the scatter plot and regression line
Plotting the residuals
Juggling Many Relationships at Once: Multiple Regression
Multiple regression in R
Making predictions
Visualizing the 3D scatter plot and regression plane
ANOVA: Another Look
Analysis of Covariance: The Final Component of the GLM
But wait, there’s more
CHAPTER 15: Correlation: The Rise and Fall of Relationships
Scatter plots Again
Understanding Correlation
Correlation and Regression
Testing Hypotheses About Correlation
Is a correlation coefficient greater than zero?
Do two correlation coefficients differ?
Correlation in R
Calculating a correlation coefficient
Testing a correlation coefficient
Testing the difference between two correlation coefficients
Calculating a correlation matrix
Visualizing correlation matrices
Multiple Correlation
Multiple correlation in R
Adjusting R-squared
Partial Correlation
Partial Correlation in R
Semipartial Correlation
Semipartial Correlation in R
CHAPTER 16: Curvilinear Regression: When Relationships Get Complicated
What Is a Logarithm?
What Is e?
Power Regression
Exponential Regression
Logarithmic Regression
Polynomial Regression: A Higher Power
Which Model Should You Use?
PART 4: WORKING WITH PROBABILITY
CHAPTER 17: Introducing Probability
What Is Probability?
Experiments, trials, events, and sample spaces
Sample spaces and probability
Compound Events
Union and intersection
Intersection again
Conditional Probability
Working with the probabilities
The foundation of hypothesis testing
Large Sample Spaces
Permutations
Combinations
R Functions for Counting Rules
Random Variables: Discrete and Continuous
Probability Distributions and Density Functions
The Binomial Distribution
The Binomial and Negative Binomial in R
Binomial distribution
Negative binomial distribution
Hypothesis Testing with the Binomial Distribution
More on Hypothesis Testing: R versus Tradition
CHAPTER 18: Introducing Modeling
Modeling a Distribution
Plunging into the Poisson distribution
Modeling with the Poisson distribution
Testing the model’s fit
A word about chisq.test()
Playing ball with a model
A Simulating Discussion
Taking a chance: The Monte Carlo method
Loading the dice
Simulating the central limit theorem
PART 5: THE PART OF TENS
CHAPTER 19: Ten Tips for Excel Emigrés
Defining a Vector in R Is Like Naming a Range in Excel
Operating on Vectors Is Like Operating on Named Ranges
Sometimes Statistical Functions Work the Same Way
. . . And Sometimes They Don’t
Contrast: Excel and R Work with Different Data Formats
Distribution Functions Are (Somewhat) Similar
A Data Frame Is (Something) Like a Multicolumn Named Range
The sapply() Function Is Like Dragging
Using edit() Is (Almost) Like Editing a Spreadsheet
Use the Clipboard to Import a Table from Excel into R
CHAPTER 20: Ten Valuable Online R Resources
Websites for R Users
R-bloggers
Microsoft R Application Network
Quick-R
RStudio Online Learning
Stack Overflow
Online Books and Documentation
R manuals
R documentation
RDocumentation
YOU CANanalytics
The R Journal
INDEX



About This Book
Although the field of statistics proceeds in a logical way, I’ve organized this book
so that you can open it up in any chapter and start reading. The idea is for you to
find the information you’re looking for in a hurry and use it immediately —
whether it’s a statistical concept or an R-related one.

On the other hand, reading from cover to cover is okay if you’re so inclined. If
you’re a statistics newbie and you have to use R to analyze data, I recommend that
you begin at the beginning.