
MY461/MY561 Social Network Analysis Summer 2023 Exam

Published: 2023-04-19



Suitable for all candidates

Instructions to candidates

This paper contains five questions. Answer all five questions. All questions will be given equal weight (20%). Responses for each question should be a maximum of 500 words excluding tables and figures. Please include a bibliography with any cited sources (this does not count towards the word count).

The exam questions will be released on March 29, 2023. The exam is due on May 3, 2023, at 16:00.

Submission will be done through Moodle. Please submit your answers in a PDF file. You will be evaluated based on your responses to the 5 prompts. To help us determine where any errors were made, you must additionally submit an annotated R or Rmd file of the code used to arrive at your responses.

Background information:

For this exam, you will be analysing networks representing international trade. We will be comparing trade in goods with trade in services. The former covers tangible items (foods, equipment, livestock, etc.), while the latter covers intangible products. In the context of international trade, services could include: a business firm consulting for a company in another country, a call centre working for a company based in another country, a bank providing services through a branch overseas, or even tourists hiring a tour guide while abroad.

We will be using data from the OECD, specifically their “balanced trade statistics.” We will be using the OECD Balanced International Merchandise Trade dataset for goods and the OECD-WTO Balanced Trade in Services dataset for services.

While these data represent trade flows for many years, we focus solely on 2009 data and consider only records of exports (as opposed to imports). We have also limited the datasets to include only those countries that appear in both.

We are providing you with three files:

The first file (country_metadata.csv) contains information on each of the countries included in the joined dataset. We note that not all countries are included in this dataset, and some “countries” that are included are not necessarily globally recognised as such – for simplicity we refer here to all included as “countries,” and suggest you simply take the dataset as given for this exercise. To complement the information provided by the OECD, we have additionally associated some variables from other sources. The variables are as follows:

• ISO.alpha3.code: a three-letter code for each country (see more here)

• Country: the name of the country

• M49.code: another standard coding for each country (again, see more here)

• Region.1: the primary region that the country is located in (see more here)

• Region.2: an alternative subdivision of regions (primarily joining some smaller regions into Latin America and Sub-Saharan Africa)

• Continent: the continent of the country (again, see more here)

• GDP.per.capita: the gross domestic product per capita in US dollars, as of 2017, as reported by UN Data

• Population2010.OECD.estimate: the population of each country, as reported on Wikipedia here

• Area.sqkm: the land area of each country, in square kilometres, per Wikipedia as above

• centroid.lon: the longitude in decimal degrees of the geographical centre of the country, as given by the CoordinateClearR package (from which the subsequent variables also derive)

• centroid.lat: The latitude in decimal degrees of the geographical centre of the country

• capital: The capital of the country

• capital.lon: The longitude in decimal degrees of the capital of the country

• capital.lat: The latitude of the capital of the country

The second file (exp_goods_2009.csv) is an edge data frame with three columns: “Reporter”, “Partner”, and “USD”. The first row implies that Afghanistan (“AFG”) exported $2,096,336-worth of goods to Angola (“AGO”) in 2009. The exact value here is derived from a complex set of calculations and adjustments made by the OECD to try to reconcile varying reports – for our purposes, we can simply take the numbers as given.

The third file (exp_services_2009.csv) follows the same structure as the second: it is an edge data frame that records the value of services exported from one country to another. So, the first row of that file implies that Afghanistan (“AFG”) “exported” $11,467,197-worth of services to Angola (“AGO”) in 2009.

Use these files to generate four networks, where nodes are the countries and edges are the trade flows from Country A to Country B in 2009. Specifically, make the following networks:

• “goods” network: A directed, weighted network based on the exp_goods_2009 file, representing trade in goods between the countries; the edge weight should be the value of goods exported from Country A to Country B.

• “services” network: A directed, weighted network based on the exp_services_2009 file, representing trade in services between the countries; the edge weight should be the value of services exported from Country A to Country B.

• “top goods” network: A directed, weighted network that is a subset of the “goods” network, retaining only those trading relationships that are in the top quartile (i.e., top 25%) in value. Note that this will result in the new network having a few isolates.

• “top services” network: A directed, weighted network that is a subset of the “services” network, retaining only those trading relationships that are in the top quartile (i.e., top 25%) in value. Again, there will be a few isolates.

As a hint, if you have created the goods network as an igraph object called “n_goods” with an edge attribute called “USD”, then you should be able to create the top goods network with this line of code: n_goods_top <- delete.edges(n_goods, which(E(n_goods)$USD < quantile(E(n_goods)$USD, probs = 0.75, na.rm = TRUE)))
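As a minimal sketch of the construction described above, the following builds a directed, weighted network from an edge data frame and then keeps only the top-quartile ties. The edge list is a made-up toy with the same three columns as exp_goods_2009.csv (the values are hypothetical, not taken from the data), and delete_edges() is used as the underscore-named equivalent of the delete.edges() call in the hint.

```r
library(igraph)

# Hypothetical toy edge list with the same columns as exp_goods_2009.csv
edges <- data.frame(
  Reporter = c("AFG", "AFG", "AGO", "AGO", "ALB", "ALB", "ARE", "ARE"),
  Partner  = c("AGO", "ALB", "AFG", "ALB", "AFG", "ARE", "AFG", "AGO"),
  USD      = c(1, 2, 3, 4, 5, 6, 7, 8) * 1e6
)

# Columns 1 and 2 define the directed edges; USD becomes an edge attribute
n_goods <- graph_from_data_frame(edges, directed = TRUE)

# Threshold at the 75th percentile of edge values
cutoff <- quantile(E(n_goods)$USD, probs = 0.75, na.rm = TRUE)

# Drop every edge below the threshold; all nodes are kept,
# so countries left without ties become isolates
n_goods_top <- delete_edges(n_goods, which(E(n_goods)$USD < cutoff))

ecount(n_goods_top)                  # number of retained trading relationships
names(which(degree(n_goods_top) == 0))  # isolates created by the filter
```

With the real files, the same steps would start from read.csv("exp_goods_2009.csv") instead of the toy data frame.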

With these networks in hand, answer the following questions.

Consider the overall metrics (average path length, transitivity, and reciprocity) for the top goods and top services networks. Compare these two networks to networks modelled on the originals, created with the Erdős–Rényi model and with the configuration model (i.e., you should create one E-R network and one configuration model network based on the top goods network, and one E-R network and one configuration model network based on the top services network). What do these comparisons tell you about the nature of trade flows between countries? How do the two empirical networks differ? In your answer, make sure to define each of the metrics and give an intuitive interpretation for them.
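One possible way to set up this comparison is sketched below. It is an illustration under stated assumptions rather than a model answer: the "observed" graph is a small random stand-in (not the trade data), describe() is a hypothetical helper name, and the configuration model uses igraph's default sample_degseq() method, which can produce multi-edges and self-loops.

```r
library(igraph)
set.seed(1)

# Hypothetical stand-in for an observed network such as the top goods network
g <- sample_gnm(n = 30, m = 120, directed = TRUE)

# Hypothetical helper collecting the three whole-network metrics
describe <- function(net) {
  c(avg_path_length = mean_distance(net, directed = TRUE),
    transitivity    = transitivity(net, type = "global"),
    reciprocity     = reciprocity(net))
}

# Erdos-Renyi benchmark: same number of nodes and edges, ties placed at random
g_er <- sample_gnm(n = vcount(g), m = ecount(g), directed = TRUE)

# Configuration-model benchmark: same out- and in-degree sequences,
# ties otherwise placed at random (default method; may create multi-edges/loops)
g_cm <- sample_degseq(degree(g, mode = "out"), degree(g, mode = "in"))

round(rbind(observed      = describe(g),
            erdos_renyi   = describe(g_er),
            configuration = describe(g_cm)), 3)
```

A single random draw is noisy, so one might repeat the two benchmark draws many times and compare the observed values with the resulting distributions.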

Which do you see as the most influential countries in the trade networks? Identify two potential meanings of "influence," as proxied by different centrality measures. Justify your choice of each centrality measure. In that justification, present clear interpretations of what each centrality measure is capturing about the position of the countries. Calculate both of your chosen centrality measures on the top goods and top services networks and identify the ten countries that have the highest values for each. There are likely to be many countries repeatedly identified across both, so it may be informative to consider the countries that appear uniquely as having high centrality in one network but not the other. Discuss the patterns you see and what they imply about the top countries and about trade flows generally. Make explicit reference in your response to the concepts of social capital and brokerage/structural holes covered in the course material. [Note that if you use edge weight for a centrality measure, you need to identify how the calculation interprets those weights (i.e., do higher values mean greater closeness or greater distance?); in those cases where the measure assumes that higher values mean greater distance, you should use 1/weight in the calculation.]
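The note on edge weights can be illustrated with a small sketch. The four-node network and its values are hypothetical, and strength and betweenness are used only as examples of the two weight conventions, not as the recommended measures: strength sums weights (higher USD means a stronger tie), while igraph's betweenness reads weights as path costs, so the trade values are inverted first.

```r
library(igraph)

# Hypothetical toy network: three large flows along a chain, one small direct flow
edges <- data.frame(
  Reporter = c("A", "B", "C", "A"),
  Partner  = c("B", "C", "D", "D"),
  USD      = c(10, 10, 10, 1)
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Strength sums edge weights, so raw USD values are appropriate here
out_strength <- strength(g, mode = "out", weights = E(g)$USD)

# Betweenness treats weights as distances, so larger trade flows
# must be converted into shorter distances with 1 / USD
btw <- betweenness(g, directed = TRUE, weights = 1 / E(g)$USD)

out_strength
btw

# Ranking helper: names of the top k countries on a measure
top_k <- function(x, k = 10) names(sort(x, decreasing = TRUE))[seq_len(min(k, length(x)))]
top_k(btw, k = 2)
```

In this toy case the cheap route from A to D runs through B and C once the weights are inverted, which is why those two nodes pick up betweenness even though a direct A to D tie exists.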

How does geography influence trade flows? Calculate assortativity by continent and by GDP on the top goods and top services networks. Use the spinglass community detection algorithm to divide both the goods and services networks into communities (make sure to consider edge weight and use set.seed(123456) to ensure that we get equivalent results). Plot the goods and services networks with nodes positioned by their latitude/longitude (so that your network should look roughly like a map of the world), and nodes coloured by their community membership. Make sure that your plots are legible and informative, with a legend. Discuss what each of these results implies about trade flows.
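The steps in this question could be wired together roughly as follows. This is a sketch on invented data: the six placeholder countries, their continents, GDP values, and coordinates are all hypothetical, and plot_map() is a hypothetical helper name. Only the column names follow country_metadata.csv.

```r
library(igraph)
set.seed(123456)

# Hypothetical node table using the metadata column names
nodes <- data.frame(
  ISO.alpha3.code = c("N1", "N2", "N3", "E1", "E2", "E3"),
  Continent       = c("Continent A", "Continent A", "Continent A",
                      "Continent B", "Continent B", "Continent B"),
  GDP.per.capita  = c(50000, 48000, 45000, 20000, 22000, 25000),
  centroid.lon    = c(-100, -95, -105, 5, 10, 15),
  centroid.lat    = c(55, 40, 25, 47, 51, 43)
)

# Hypothetical edge list: dense trade within each continent, one tie across
edges <- data.frame(
  Reporter = c("N1", "N2", "N3", "N1", "E1", "E2", "E3", "E1", "N1"),
  Partner  = c("N2", "N3", "N1", "N3", "E2", "E3", "E1", "E3", "E1"),
  USD      = c(9, 8, 7, 6, 9, 8, 7, 6, 1)
)
g <- graph_from_data_frame(edges, directed = TRUE, vertices = nodes)

# Assortativity on a categorical attribute (continent) and a numeric one (GDP)
r_continent <- assortativity_nominal(g, as.integer(factor(V(g)$Continent)),
                                     directed = TRUE)
r_gdp <- assortativity(g, V(g)$GDP.per.capita, directed = TRUE)

# Weighted spinglass communities (the graph must be connected)
comm <- cluster_spinglass(g, weights = E(g)$USD)
memb <- as.integer(membership(comm))

# Geographic layout: x = longitude, y = latitude
coords <- cbind(V(g)$centroid.lon, V(g)$centroid.lat)

# Hypothetical helper: map-like plot coloured by community, with a legend
plot_map <- function(net, groups, layout) {
  pal <- rainbow(max(groups))
  plot(net, layout = layout, vertex.color = pal[groups],
       vertex.label = V(net)$name, edge.arrow.size = 0.3)
  legend("bottomleft", legend = paste("Community", seq_along(pal)), fill = pal)
}
```

Calling plot_map(g, memb, coords) draws the network with nodes at their centroid coordinates. On the full networks, smaller node sizes and lighter edges would probably be needed to keep the plot legible.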

Which countries seem to have similar approaches to trading in goods and services? Calculate the structural equivalence (separately) on the top goods and top services networks (using the equiv.clust() function). Plot the dendrograms that result from each. Plot the top goods and top services networks again with nodes positioned as above by their latitude/longitude, and now with nodes coloured by the resulting structural equivalence classes (where you should divide each into four groups). Make sure that your plots are legible and informative, with a legend. Describe and compare the resulting blocks (for this, you will want to look at the dendrogram, at the memberships, and potentially at block models or mixing matrices based on the equivalence classes) and discuss what each of these results implies about global trade relations.
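To make the idea of structural equivalence concrete, here is a base-R sketch of the underlying logic: countries are compared on their full profile of outgoing and incoming ties and then clustered hierarchically. This is a simplified stand-in, not a reproduction of equiv.clust(); the distance measure, the treatment of the diagonal, and the four-node toy matrix are all assumptions made for illustration.

```r
# Hypothetical binary adjacency matrix: rows export to columns
countries <- c("A", "B", "C", "D")
adj <- matrix(0, nrow = 4, ncol = 4, dimnames = list(countries, countries))
adj["A", c("C", "D")] <- 1   # A exports to C and D
adj["B", c("C", "D")] <- 1   # B exports to C and D

# Tie profile per country: its outgoing ties followed by its incoming ties
profiles <- cbind(adj, t(adj))

# Distance between profiles: 0 means two countries have identical ties
profile_dist <- dist(profiles, method = "euclidean")

# Hierarchical clustering; plot(tree) would show the dendrogram
tree <- hclust(profile_dist, method = "complete")

# Cut the tree into a fixed number of equivalence classes (two for this toy case)
classes <- cutree(tree, k = 2)
classes

# Mixing matrix: number of ties sent from each class to each class
block_ties <- rowsum(t(rowsum(adj, classes)), classes)
t(block_ties)
```

Here A and B fall into one class (pure exporters to the same partners) and C and D into the other, and the class-by-class tie counts give a crude block model.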

What helps predict whether two countries trade with one another? Consider the results of the two exponential random graph models (ERGMs) below, one run on the top goods network and one on the top services network (on the following page):

Here, log_dist is a weighted adjacency matrix, where each cell represents the log of the distance between the two countries, calculated from the latitude and longitude of the centroid of each country (using the distm function from the geosphere R package). (So, e.g., Afghanistan and Aruba are 13074 km distant from one another, so the value in this cell would be log(13074), or 9.48). log_gdp is the log of each country’s GDP (derived from the GDP.per.capita variable, where countries without a reported GDP have been assigned the median of all other countries), and log_pop is the log of each country’s population (derived from the Population2010.OECD.estimate variable).
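The construction of a log-distance matrix can be sketched in base R as follows. This is an illustrative stand-in: the exam's matrix was built with distm from geosphere, whereas the haversine_km() function below is a hand-written great-circle formula on a spherical Earth (radius 6371 km assumed), and the three centroids are hypothetical points rather than real countries.

```r
# Great-circle distance in km between points given in decimal degrees
# (hand-written haversine formula; spherical Earth is an assumption)
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical centroids (not taken from country_metadata.csv)
centroids <- data.frame(code = c("X1", "X2", "X3"),
                        lon  = c(0, 90, 0),
                        lat  = c(0, 0, 45))

# Pairwise distance matrix: cell [i, j] is the distance from i to j
idx <- seq_len(nrow(centroids))
dist_km <- outer(idx, idx, function(i, j) {
  haversine_km(centroids$lon[i], centroids$lat[i],
               centroids$lon[j], centroids$lat[j])
})
dimnames(dist_km) <- list(centroids$code, centroids$code)

# Log-transform; the diagonal is log(0) = -Inf, so it is set to 0 here
# (an assumption; self-ties are not part of the network anyway)
log_dist <- log(dist_km)
diag(log_dist) <- 0

round(log_dist, 2)

# The worked example from the text: 13074 km
round(log(13074), 2)
```

The matrix is symmetric, so the distance covariate treats exports from A to B and from B to A identically.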

Interpret each term in the ERGM (except edges) – how does each term influence whether one country trades with another (in goods and in services)? Use odds ratios in your substantive interpretation of each term. Compare the two model results – what do the similarities and differences between them imply about trade in goods versus trade in services? Propose an additional model term to add to the model and provide a justification for it. What do you think it would capture that is currently missing from the ERGMs? [You do not need to run these new ERGMs].
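Because ERGM coefficients are on the log-odds scale, an odds ratio is obtained by exponentiating the estimate. The sketch below shows only that conversion; the coefficient values and term labels are hypothetical placeholders and are not the estimates from the exam's models.

```r
# Hypothetical coefficients on the log-odds scale (NOT the exam's estimates)
coefs <- c(log_dist = -0.5, log_gdp = 0.3, log_pop = 0.1)

# Odds ratio: multiplicative change in the odds of a tie
# for a one-unit increase in the covariate, holding the rest of the network fixed
odds_ratios <- exp(coefs)

# Same information expressed as a percentage change in the odds
pct_change <- 100 * (odds_ratios - 1)

round(data.frame(estimate = coefs,
                 odds_ratio = odds_ratios,
                 pct_change = pct_change), 3)
```

With these placeholder values, a one-unit increase in log distance would multiply the odds of a trade tie by about 0.61, i.e. reduce them by roughly 39%. Since the covariates are logged, a one-unit increase corresponds to multiplying the underlying quantity by e (about 2.72).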