Summary

Knowledge of statistical modeling is more important than ever: in almost every field of science, including medicine, epidemiology, engineering, economics, and finance, decisions are made on the basis of observed data. A well-specified statistical model is needed to uncover important relationships between variables or to obtain accurate predictions. Data are now available in large quantities, giving researchers a wealth of information, but large and complex datasets also pose challenges for statistical analysis. R is one of the leading tools for statistical analysis: it is free and offers thousands of packages, which gives researchers easy access to advanced statistical tools and techniques. This course guides students in the use of R for linear modeling with both cross-sectional and time series data. The focus of the course is "learning by doing": students are encouraged to apply the tools learned in the course to their own research projects and/or to datasets from their research area. Guidance on the correct application of these tools will be provided during group work sessions.

Description

The course aims to give students a working knowledge of the following topics:

1. Basic statistics with R: RStudio; R libraries and packages; importing datasets; identifying outliers and missing values; measuring the central tendency and variability of a distribution; examples with real data.

2. Cross-sectional data analysis in R: simple and multiple linear regression; prediction; hypothesis testing; heteroscedasticity-robust standard errors; regression with a binary response; introduction to spatial dependence; applications to real data.

3. Linear modeling with time series: autocorrelation; time series plotting; time series decomposition; regression-based detrending; regression-based seasonal adjustment; linear regression with time series data; heteroscedasticity- and autocorrelation-consistent (HAC) standard errors; predictive regressions; applications to real data.

4. Other topics: dimensionality reduction; clustering (finding structure in data): cross-sectional vs. time-series data; introduction to longitudinal data analysis.

Activity details

Sponsors:

International Doctoral School (EIDUAL)

Instructor:

Raffaele Mattera, PhD (Sapienza University of Rome). His main research interests are computational statistics, time series analysis, spatial statistics, and financial econometrics. He publishes in international journals on computational statistics and its applications and regularly presents at international conferences in the same area. He is an ordinary member of the Italian Statistical Society, the Italian Association of Econometrics and the Association of Spatial Econometrics.

Date:

February 1 and 2, 2023, from 3:00 pm to 6:00 pm each day.

Intended for:

Final-year students of FYCO, COFIC, PhD in CCEE and Mathematics

No. of hours:

6 hours

Location:

Computer room 13

No. of places:

40 places

Certificate:

Issued by the event organizers. Please contact jetrini@ual.es with any questions.