Basic data analysis for time series with R

"This book emphasizes the collaborative analysis of data that is used to collect increments of time or space. Written at a readily accessible level, but with the necessary theory in mind, the author uses frequency- and time-domain and trigonometric regression as themes throughout the book. The...


Bibliographic Details
Main Author: Derryberry, DeWayne R. (Author)
Format: Electronic eBook
Language: English
Published: Hoboken, New Jersey : John Wiley & Sons, Inc., [2014]
Subjects:
Online Access: Full text (Wentworth users only)
Table of Contents:
  • 1. R Basics
  • 1.1. Getting Started,
  • 1.2. Special R Conventions,
  • 1.3. Common Structures,
  • 1.4. Common Functions,
  • 1.5. Time Series Functions,
  • 1.6. Importing Data,
  • Exercises,
  • 2. Review of Regression and More About R
  • 2.1. Goals of this Chapter,
  • 2.2. The Simple(st) Regression Model,
  • 2.2.1. Ordinary Least Squares,
  • 2.2.2. Properties of OLS Estimates,
  • 2.2.3. Matrix Representation of the Problem,
  • 2.3. Simulating the Data from a Model and Estimating the Model Parameters in R,
  • 2.3.1. Simulating Data,
  • 2.3.2. Estimating the Model Parameters in R,
  • 2.4. Basic Inference for the Model,
  • 2.5. Residuals Analysis - What Can Go Wrong,
  • 2.6. Matrix Manipulation in R,
  • 2.6.1. Introduction,
  • 2.6.2. OLS the Hard Way,
  • 2.6.3. Some Other Matrix Commands,
  • Exercises,
  • 3. The Modeling Approach Taken in this Book and Some Examples of Typical Serially Correlated Data
  • 3.1. Signal and Noise,
  • 3.2. Time Series Data,
  • 3.3. Simple Regression in the Framework,
  • 3.4. Real Data and Simulated Data,
  • 3.5. The Diversity of Time Series Data,
  • 3.6. Getting Data Into R,
  • 3.6.1. Overview,
  • 3.6.2. The Diskette and the scan() and ts() Functions - New York City Temperatures,
  • 3.6.3. The Diskette and the read.table() Function - The Semmelweis Data,
  • 3.6.4. Cut and Paste Data to a Text Editor,
  • Exercises,
  • 4. Some Comments on Assumptions
  • 4.1. Introduction,
  • 4.2. The Normality Assumption,
  • 4.2.1. Right Skew,
  • 4.2.2. Left Skew,
  • 4.2.3. Heavy Tails,
  • 4.3. Equal Variance,
  • 4.3.1. Two-Sample t-Test,
  • 4.3.2. Regression,
  • 4.4. Independence,
  • 4.5. Power of Logarithmic Transformations Illustrated,
  • 4.6. Summary,
  • Exercises,
  • 5. The Autocorrelation Function And AR(1), AR(2) Models
  • 5.1. Standard Models - What are the Alternatives to White Noise?,
  • 5.2. Autocovariance and Autocorrelation,
  • 5.2.1. Stationarity,
  • 5.2.2. A Note About Conditions,
  • 5.2.3. Properties of Autocovariance,
  • 5.2.4. White Noise,
  • 5.2.5. Estimation of the Autocovariance and Autocorrelation,
  • 5.3. The acf() Function in R,
  • 5.3.1. Background,
  • 5.3.2. The Basic Code for Estimating the Autocovariance,
  • 5.4. The First Alternative to White Noise: Autoregressive Errors - AR(1), AR(2),
  • 5.4.1. Definition of the AR(1) and AR(2) Models,
  • 5.4.2. Some Preliminary Facts,
  • 5.4.3. The AR(1) Model Autocorrelation and Autocovariance,
  • 5.4.4. Using Correlation and Scatterplots to Illustrate the AR(1) Model,
  • 5.4.5. The AR(2) Model Autocorrelation and Autocovariance,
  • 5.4.6. Simulating Data for AR(m) Models,
  • 5.4.7. Examples of Stable and Unstable AR(1) Models,
  • 5.4.8. Examples of Stable and Unstable AR(2) Models,
  • Exercises,
  • 6. The Moving Average Models MA(1) And MA(2)
  • 6.1. The Moving Average Model,
  • 6.2. The Autocorrelation for MA(1) Models,
  • 6.3. A Duality Between MA(l) and AR(m) Models,
  • 6.4. The Autocorrelation for MA(2) Models,
  • 6.5. Simulated Examples of the MA(1) Model,
  • 6.6. Simulated Examples of the MA(2) Model,
  • 6.7. AR(m) and MA(l) Model acf() Plots,
  • Exercises,
  • 7. Review of Transcendental Functions and Complex Numbers
  • 7.1. Background,
  • 7.2. Complex Arithmetic,
  • 7.2.1. The Number i,
  • 7.2.2. Complex Conjugates,
  • 7.2.3. The Magnitude of a Complex Number,
  • 7.3. Some Important Series,
  • 7.3.1. The Geometric and Some Transcendental Series,
  • 7.3.2. A Rationale for Euler's Formula,
  • 7.4. Useful Facts About Periodic Transcendental Functions,
  • Exercises,
  • 8. The Power Spectrum and the Periodogram
  • 8.1. Introduction,
  • 8.2. A Definition and a Simplified Form for p(f),
  • 8.3. Inverting p(f) to Recover the Ck Values,
  • 8.4. The Power Spectrum for Some Familiar Models,
  • 8.4.1. White Noise,
  • 8.4.2. The Spectrum for AR(1) Models,
  • 8.4.3. The Spectrum for AR(2) Models,
  • 8.5. The Periodogram, a Closer Look,
  • 8.5.1. Why is the Periodogram Useful?,
  • 8.5.2. Some Naive Code for a Periodogram,
  • 8.5.3. An Example - The Sunspot Data,
  • 8.6. The Function spec.pgram() in R,
  • Exercises,
  • 9. Smoothers, The Bias-Variance Tradeoff, and the Smoothed Periodogram
  • 9.1. Why is Smoothing Required?,
  • 9.2. Smoothing, Bias, and Variance,
  • 9.3. Smoothers Used in R,
  • 9.3.1. The R Function lowess(),
  • 9.3.2. The R Function smooth.spline(),
  • 9.3.3. Kernel Smoothers in spec.pgram(),
  • 9.4. Smoothing the Periodogram for a Series With a Known and Unknown Period,
  • 9.4.1. Period Known,
  • 9.4.2. Period Unknown,
  • 9.5. Summary,
  • Exercises,
  • 10. A Regression Model for Periodic Data
  • 10.1. The Model,
  • 10.2. An Example: The NYC Temperature Data,
  • 10.2.1. Fitting a Periodic Function,
  • 10.2.2. An Outlier,
  • 10.2.3. Refitting the Model with the Outlier Corrected,
  • 10.3. Complications 1: CO2 Data,
  • 10.4. Complications 2: Sunspot Numbers,
  • 10.5. Complications 3: Accidental Deaths,
  • 10.6. Summary,
  • Exercises,
  • 11. Model Selection and Cross-Validation
  • 11.1. Background,
  • 11.2. Hypothesis Tests in Simple Regression,
  • 11.3. A More General Setting for Likelihood Ratio Tests,
  • 11.4. A Subtly Different Situation,
  • 11.5. Information Criteria,
  • 11.6. Cross-validation (Data Splitting): NYC Temperatures,
  • 11.6.1. Explained Variation, R2,
  • 11.6.2. Data Splitting,
  • 11.6.3. Leave-One-Out Cross-Validation,
  • 11.6.4. AIC as Leave-One-Out Cross-Validation,
  • 11.7. Summary,
  • Exercises,
  • 12. Fitting Fourier Series
  • 12.1. Introduction: More Complex Periodic Models,
  • 12.2. More Complex Periodic Behavior: Accidental Deaths,
  • 12.2.1. Fourier Series Structure,
  • 12.2.2. R Code for Fitting Large Fourier Series,
  • 12.2.3. Model Selection with AIC,
  • 12.2.4. Model Selection with Likelihood Ratio Tests,
  • 12.2.5. Data Splitting,
  • 12.2.6. Accidental Deaths - Some Comments on Periodic Data,
  • 12.3. The Boise River Flow data,
  • 12.3.1. The Data,
  • 12.3.2. Model Selection with AIC,
  • 12.3.3. Data Splitting,
  • 12.3.4. The Residuals,
  • 12.4. Where Do We Go from Here?,
  • Exercises,
  • 13. Adjusting for AR(1) Correlation in Complex Models
  • 13.1. Introduction,
  • 13.2. The Two-Sample t-Test - Uncut and Patch-Cut Forest,
  • 13.2.1. The Sleuth Data and the Question of Interest,
  • 13.2.2. A Simple Adjustment for t-Tests When the Residuals Are AR(1),
  • 13.2.3. A Simulation Example,
  • 13.2.4. Analysis of the Sleuth Data,
  • 13.3. The Second Sleuth Case - Global Warming, A Simple Regression,
  • 13.3.1. The Data and the Question,
  • 13.3.2. Filtering to Produce (Quasi-)Independent Observations,
  • 13.3.3. Simulated Example - Regression,
  • 13.3.4. Analysis of the Regression Case,
  • 13.3.5. The Filtering Approach for the Logging Case,
  • 13.3.6. A Few Comments on Filtering,
  • 13.4. The Semmelweis Intervention,
  • 13.4.1. The Data,
  • 13.4.2. Why Serial Correlation?,
  • 13.4.3. How This Data Differs from the Patch/Uncut Case,
  • 13.4.4. Filtered Analysis,
  • 13.4.5. Transformations and Inference,
  • 13.5. The NYC Temperatures (Adjusted),
  • 13.5.1. The Data and Prediction Intervals,
  • 13.5.2. The AR(1) Prediction Model,
  • 13.5.3. A Simulation to Evaluate These Formulas,
  • 13.5.4. Application to NYC Data,
  • 13.6. The Boise River Flow Data: Model Selection With Filtering,
  • 13.6.1. The Revised Model Selection Problem,
  • 13.6.2. Comments on R2 and R2pred,
  • 13.6.3. Model Selection After Filtering with a Matrix,
  • 13.7. Implications of AR(1) Adjustments and the "Skip" Method,
  • 13.7.1. Adjustments for AR(1) Autocorrelation,
  • 13.7.2. Impact of Serial Correlation on p-Values,
  • 13.7.3. The "skip" Method,
  • 13.8. Summary,
  • Exercises,
  • 14. The Backshift Operator, the Impulse Response Function, and General ARMA Models
  • 14.1. The General ARMA Model,
  • 14.1.1. The Mathematical Formulation,
  • 14.1.2. The arima.sim() Function in R Revisited,
  • 14.1.3. Examples of ARMA(m, l) Models,
  • 14.2. The Backshift (Shift, Lag) Operator,
  • 14.2.1. Definition of B,
  • 14.2.2. The Stationary Conditions for a General AR(m) Model,
  • 14.2.3. ARMA(m, l) Models and the Backshift Operator,
  • 14.2.4. More Examples of ARMA(m, l) Models,
  • 14.3. The Impulse Response Operator - Intuition,
  • 14.4. Impulse Response Operator, g(B) - Computation,
  • 14.4.1. Definition of g(B),
  • 14.4.2. Computing the Coefficients,
  • 14.4.3. Plotting an Impulse Response Function,
  • 14.5. Interpretation and Utility of the Impulse Response Function,
  • Exercises,
  • 15. The Yule-Walker Equations and the Partial Autocorrelation Function
  • 15.1. Background,
  • 15.2. Autocovariance of an ARMA(m, l) Model,
  • 15.2.1. A Preliminary Result,
  • 15.2.2. The Autocovariance Function for ARMA(m, l) Models,
  • 15.3. AR(m) and the Yule-Walker Equations,
  • 15.3.1. The Equations,
  • 15.3.2. The R Function ar.yw() with an AR(3) Example,
  • 15.3.3. Information Criteria-Based Model Selection Using ar.yw(),
  • 15.4. The Partial Autocorrelation Plot,
  • 15.4.1. A Sequence of Hypothesis Tests,
  • 15.4.2. The pacf() Function - Hypothesis Tests Presented in a Plot,
  • 15.5. The Spectrum for ARMA Processes,
  • 15.6. Summary,
  • Exercises,
  • 16. Modeling Philosophy and Complete Examples
  • 16.1. Modeling Overview,
  • 16.1.1. The Algorithm,
  • 16.1.2. The Underlying Assumption,
  • 16.1.3. An Example Using an AR(m) Filter to Model MA(3),
  • 16.1.4. Generalizing the "Skip" Method,
  • 16.2. A Complex Periodic Model - Monthly River Flows, Furnas 1931-1978,
  • 16.2.1. The Data,
  • 16.2.2. A Saturated Model,
  • 16.2.3. Building an AR(m) Filtering Matrix,
  • 16.2.4. Model Selection,
  • 16.2.5. Predictions and Prediction Intervals for an AR(3) Model,
  • 16.2.6. Data Splitting,
  • 16.2.7. Model Selection Based on a Validation Set,
  • 16.3. A Modeling Example - Trend and Periodicity: CO2 Levels at Mauna Loa,
  • 16.3.1. The Saturated Model and Filter,
  • 16.3.2. Model Selection,
  • 16.3.3. How Well Does the Model Fit the Data?,
  • 16.4. Modeling Periodicity with a Possible Intervention - Two Examples,
  • 16.4.1. The General Structure,
  • 16.4.2. Directory Assistance,
  • 16.4.3. Ozone Levels in Los Angeles,
  • 16.5. Periodic Models: Monthly, Weekly, and Daily Averages,
  • 16.6. Summary,
  • Exercises,
  • 17. Wolf's Sunspot Number Data
  • 17.1. Background,
  • 17.2. Unknown Period -> Nonlinear Model,
  • 17.3. The Function nls() in R,
  • 17.4. Determining the Period,
  • 17.5. Instability in the Mean, Amplitude, and Period,
  • 17.6. Data Splitting for Prediction,
  • 17.6.1. The Approach,
  • 17.6.2. Step 1 - Fitting One Step Ahead,
  • 17.6.3. The AR Correction,
  • 17.6.4. Putting it All Together,
  • 17.6.5. Model Selection,
  • 17.6.6. Predictions Two Steps Ahead,
  • 17.7. Summary,
  • Exercises,
  • 18. An Analysis of Some Prostate and Breast Cancer Data
  • 18.1. Background,
  • 18.2. The First Data Set,
  • 18.3. The Second Data Set,
  • 18.3.1. Background and Questions,
  • 18.3.2. Outline of the Statistical Analysis,
  • 18.3.3. Looking at the Data,
  • 18.3.4. Examining the Residuals for AR(m) Structure,
  • 18.3.5. Regression Analysis with Filtered Data,
  • Exercises,
  • 19. Christopher Tennant/Ben Crosby Watershed Data
  • 19.1. Background and Question,
  • 19.2. Looking at the Data and Fitting Fourier Series,
  • 19.2.1. The Structure of the Data,
  • 19.2.2. Fourier Series Fits to the Data,
  • 19.2.3. Connecting Patterns in Data to Physical Processes,
  • 19.3. Averaging Data,
  • 19.4. Results,
  • Exercises,
  • 20. Vostok Ice Core Data
  • 20.1. Source of the Data,
  • 20.2. Background,
  • 20.3. Alignment,
  • 20.3.1. Need for Alignment, and Possible Issues Resulting from Alignment,
  • 20.3.2. Is the Pattern in the Temperature Data Maintained?,
  • 20.3.3. Are the Dates Closely Matched?,
  • 20.3.4. Are the Times Equally Spaced?,
  • 20.4. A Naïve Analysis,
  • 20.4.1. A Saturated Model,
  • 20.4.2. Model Selection,
  • 20.4.3. The Association Between CO2 and Temperature Change,
  • 20.5. A Related Simulation,
  • 20.5.1. The Model and the Question of Interest,
  • 20.5.2. Simulation Code in R,
  • 20.5.3. A Model Using all of the Simulated Data,
  • 20.5.4. A Model Using a Sample of 283 from the Simulated Data,
  • 20.6. An AR(1) Model for Irregular Spacing,
  • 20.6.1. Motivation,
  • 20.6.2. Method,
  • 20.6.3. Results,
  • 20.6.4. Sensitivity Analysis,
  • 20.6.5. A Final Analysis, Well Not Quite,
  • 20.7. Summary,
  • Exercises,
  • A.1. Overview,
  • A.2. Loading a Time Series in Datamarket,
  • A.3. Respecting Datamarket Licensing Agreements,
  • B.1. Introduction,
  • B.2. PRESS,
  • B.3. Connection to Akaike's Result,
  • B.4. Normalization and R2,
  • B.5. An Example,
  • B.6. Conclusion and Further Comments,
  • C.1. Introduction,
  • C.2. Newton's Method for One-Dimensional Nonlinear Optimization,
  • C.3. A Sequence of Directions, Step Sizes, and a Stopping Rule,
  • C.4. What Could Go Wrong?,
  • C.5. Generalizing the Optimization Problem,
  • C.6. What Could Go Wrong - Revisited,
  • C.7. What Can be Done?