
Development, Poverty and Famine, ECON2900/4409/8901 Semester 2, 2022 Tutorial 0

Published: 2022-08-29


Tutorial 0 (Week 1)

This semester we will conduct a few exercises in a computer program called R. This course requires no previous knowledge of econometrics or programming. We will conduct relatively straightforward exercises to help us to learn to analyse data from developing countries.

This week I provide a short introduction to a data analysis program called R. If you are familiar with how to use R in an econometrics setting you may ignore this introduction.

You will find R on any ANU Information Commons computer. You may want to access R remotely from your own device. You can either install the program yourself or access it by logging onto the ANU Virtual IT Commons. All this means is that you log into an ANU-run computer from your device, wherever you are. To do that you need to install a program called VMware from https://services.anu.edu.au/information-technology/software-systems/anu-virtual-information-commons. Once you have installed this program, double-click on the icon and enter your ANU login details. That should take you to a virtual desktop. Open the programs window at the bottom left of your screen and select "R-Studio".

You could, of course, install R on your own device since it is free. To do so go to https://cran.csiro.au/ and download R for the type of operating system you have.

R is where the calculations get done, but it is not where you write the commands; for that we need to install RStudio. Go to https://rstudio.com/products/rstudio/ and install RStudio for desktop.

The appendix at https://rstudio-education.github.io/hopr/starting.html#using-r gives a brief explanation of how to install R.

If you would prefer to work in the cloud you could skip all these steps and for free. I won’t give instructions for using this; my instructions are for the virtual commons only.

1 Setting R up for this course

The basic R that ANU has installed is quite good. But it is a programming language for all sorts of different users; it is not specifically for economists. If we want to analyse some data we would have to write a whole lot of specific instructions. In fact, we would probably write those same instructions each time we wanted to analyse a dataset. So maybe we would write those instructions and then save them to reuse as needed.

The world is filled with helpful R users, some of whom have written the very instructions we would want to use. So instead of writing out basic routines like how to calculate a mean or variance, we simply download an already written package that does this. (Actually, basic R already has code for means and variances but I hope you understand what I mean.)
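As a small illustration of that last point, base R can compute a mean and a variance without any extra packages. The vector family_sizes below is made-up data for illustration only; it is not taken from the course datasets.

```r
# Hypothetical family sizes for six households (illustrative numbers only).
family_sizes <- c(3, 5, 4, 6, 2, 4)

# mean() and var() ship with base R; var() uses the sample (n - 1) formula.
avg_size <- mean(family_sizes)
var_size <- var(family_sizes)

print(avg_size)
print(var_size)
```

Packages such as tidyverse build on functions like these rather than replacing them.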

You don’t all have automatic individual permission to install packages on ANU computers. We can work around that and we only have to work around it once.

After you log in through the remote access application, or after you have started a computer in an ANU computer lab, open the Windows menu and scroll down to find R and RStudio. Follow these steps as well if you are using your own device.

1. Open RStudio. When you open it you have either three or four windows where things happen. In figure 1 I have four quadrants. If you only have three then just click on the two overlapping squares on the right hand side of the Console screen (top middle of your screen).

The top left quadrant is where we will write long pieces of code. The bottom left quadrant is where you could write simple steps of code. It also shows you your results once you’ve executed a command.

The top right quadrant is where you will see information about the datasets we use (there won’t be anything there yet). The bottom right quadrant is where we can keep track of our files and folders, see which packages we’ve installed, get help, and see our graphs if we generate any.

Figure 1: Your first look at R

2. We need to create a folder for our projects - it is through this folder that we will always start our R sessions. Click on the top right hand side above the environment quadrant where it currently says Project(none). (This is labelled in my figure 2). Select New Project. I assume you don’t already have an R directory so select New Directory. Then select New Project and give it a name. I am naming mine ECON2900 4409 8901. Accept the directory R suggests. You will now have the folder in your H drive in your My Documents folder.

3. Now within this new project we will create two folders, one for data and one for the code we write:

• Click on New Folder in the bottom right quadrant and name it data. Select OK.

• Repeat but now call the new folder syntax (or code or scripts, something that seems like it might be storing code)

4. Let’s install a few packages. The first time we do this it is a bit tricky because R is running off the ANU server and there is some inconsistency between what the server allows us to do and what we need to do in R. So the first thing we do is type .libPaths("H:/My Documents/ECON2900 4409 8901") in the bottom left-hand window and press Enter.

Now we can type the command in the Console window: install.packages("name of package").

5. We will install a package called tidyverse which is a fairly commonly installed package.

• Type install.packages("tidyverse")

• This may take fairly long and stop for a while before continuing (on the ANU commons it takes ages; don’t give up).

6. You will never need to install tidyverse again. Anytime you want to use it, you type library(tidyverse) into the Console and it will be loaded.

Figure 2: Naming a project for this course

Figure 3: reloading tidyverse

7. Figure 3 shows the output in the console once I reload tidyverse. It tells me there are some conflicts. That’s ok: it just means some tidyverse functions share a name with functions that were already loaded (for example, dplyr’s filter() masks the filter() in the stats package). (tidyverse contains a lot of other packages.)

Install these packages in the same way: logr, janitor, here, skimr, foreign, haven, tidytex, cli, stargazer, plm

8. Now go to the bottom right window and select Packages from the tabs. Click the little square next to tidyverse and all the other packages you’ve installed, as well as ggplot2, dplyr, knitr, tibble, tidyr.
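The install-once, load-every-session routine in the steps above can be sketched as a short script. This is only a sketch: missing_packages() is a hypothetical helper written for this example (it is not part of R or of the course material), the package list is just a few of the names mentioned above, and the install line is commented out so that nothing is downloaded unless you choose to run it.

```r
# A few of the packages named in this tutorial.
wanted <- c("tidyverse", "haven", "skimr", "janitor")

# Hypothetical helper: return the packages in `pkgs` that are not installed.
# requireNamespace() returns TRUE when a package can be found and FALSE otherwise.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(wanted)
print(to_install)

# On the ANU virtual commons, point R at your project folder first (see step 4):
# .libPaths("H:/My Documents/ECON2900 4409 8901")

# Uncomment to install whatever is missing (needs an internet connection):
# install.packages(to_install)

# Installing happens once; loading happens in every new session, for example:
# library(tidyverse)
```

Ticking a package in the Packages tab has the same effect as calling library() on it.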

2 Using a dataset from a developing country

1. Download the data files hh_91.dta, hh_98.dta, and hh_9198.dta from Wattle and save them in h:/My Documents/ECON2900 4409 8901/data.

2. Open RStudio through the project name in your H drive (the little R cube with the name you gave the project). Opening it this way will save you a lot of drama: basically, you want R to remember the packages you’ve installed.

3. Open the data file hh_98.dta in RStudio by clicking on Import Dataset in the Environment window. Select import Stata dataset.

4. Or you could type:

hh_98 <- read_dta("h:/My Documents/ECON2900 4409 8901/data/hh_98.dta")

in either the scripts window or the console (read_dta() comes from the haven package, so that package must be loaded). I am using the scripts window as I am constructing a script that will run all the commands at once. If you do use the scripts window then you need to highlight the command and click on the Run button at the top right of the window to execute that command.

5. If you want to see the data in spreadsheet form you can type View(hh_98). (I’m sure you can see another way to see the spreadsheet.)

6. Use the following commands to obtain information about the data set:

− dim(hh_98)

− str(hh_98)

− glimpse(hh_98)

− head(hh_98)

− tail(hh_98)

− summary(hh_98)

− skim(hh_98)

− summarise(hh_98, mean(famsize), mean(educhead))

− summarise(hh_98, mean(famsize), sd(famsize), min(famsize),
            mean(educhead), sd(educhead), n())

− arrange(hh_98, agehead)

− sum(hh_98$agehead > 50)

− young_head <- filter(hh_98, agehead < 80)

− fam_size <- filter(hh_98, famsize < 7)

− hh_98 %>%
    group_by(dfmfd) %>%
    summarise(famsize_mean = mean(famsize, na.rm = TRUE),
              famsize_sd = sd(famsize, na.rm = TRUE),
              rice_pricem = mean(rice, na.rm = TRUE),
              rice_pricesd = sd(rice, na.rm = TRUE),
              wheat_pricem = mean(wheat, na.rm = TRUE),
              wheat_pricesd = sd(wheat, na.rm = TRUE))

− hh_98 %>%
    group_by(dfmfd) %>%
    summarise(educhead_mean = mean(educhead, na.rm = TRUE),
              educhead_sd = sd(educhead, na.rm = TRUE))

7. Using the data you found in 6. above, answer the following questions:

(a) What is the average family size?

(b) What is the average educational achievement?

(c) Using the summarise command and the help function, what other statistics can you calculate for these two variables? What do you learn about them?

(d) Discuss the results of the second last bullet point.

(e) Discuss the results of the last bullet point.
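If you would like to see how the grouped summaries behave before running them on the real data, here is a minimal sketch on a made-up data frame. The column names (dfmfd, famsize, educhead) follow the handout, but toy_hh and every value in it are invented for illustration, and the sketch assumes the dplyr package (part of tidyverse) is installed.

```r
library(dplyr)

# Made-up toy data: column names follow the handout, values are invented.
toy_hh <- tibble(
  dfmfd    = c(0, 0, 0, 1, 1, 1),
  famsize  = c(4, 6, NA, 5, 7, 3),
  educhead = c(0, 5, 2, 8, NA, 4)
)

# One row per value of dfmfd; na.rm = TRUE drops missing values
# before each mean is calculated, and n() counts the rows in each group.
toy_summary <- toy_hh %>%
  group_by(dfmfd) %>%
  summarise(
    famsize_mean  = mean(famsize, na.rm = TRUE),
    educhead_mean = mean(educhead, na.rm = TRUE),
    households    = n()
  )

print(toy_summary)
```

Try deleting na.rm = TRUE and rerunning: any group containing a missing value then returns NA for its mean, which is why the commands above include it.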