Independent research paper

Due by 11:55 PM on Friday, March 22, 2024

For the final course project, you will produce a short research paper that uses regression analysis to investigate a causal relationship. This page provides guidance for those who decide to pursue an independent study that uses their own data. It also provides information on what to submit for Milestones 1 and 2, and how to submit your final documents.

In summary, for the project you will

You will work by yourself, not in groups. At the same time, do not hesitate to reach out to other classmates via Perusall. Also, absolutely do not hesitate to ask me questions. I’m here to help!

Your first milestone is due by 11:55 PM on Sunday, February 04, 2024. You should meet with me at least once, during my student hours, at any point before 05:00 PM on Wednesday, February 14, 2024. Your second milestone is due by 11:55 PM on Sunday, March 03, 2024. No late work will be accepted.

For Milestone 2 and for your final paper, you should submit a replication package I (or anyone else) can run on my computer, along with your paper (i.e., not just a copy of your paper). A key aim of our class is for you learn how to create reproducible work—this assignment provides an opportunity to show that you have mastered this skill.


Overall instructions

As we discuss in our first class, a lot of non-experimental social science research is—at least implicitly—causal (Grosz, Rohrer, and Thoemmes 2020). For your final project, you will produce a short research paper that uses regression analysis to investigate a causal relationship.

Your paper will discuss a theoretical model of the causal relationship you investigate using a directed acyclic graph (DAG). You should then investigate this relationship with a regression analysis that includes a bounding exercise (Oster 2019). Finally, your paper should present a proposal of how to investigate the same relationship with a (hypothetical) quasi-experimental approach (using either a regression discontinuity design, instrumental variables, or a difference-in-differences strategy).

If you decide to use primary data, you must adhere to UCI’s IRB regulations. Your project must use data that can be shared with other researchers.1

Your research project must relate to your field of study. If you are a PhD student at the School of Education, your study should focus on a research question that matches your area of study.


Milestone 1

By Milestone 1, you should have identified the study you would like to conduct, and you should have sourced the necessary data to do so. Also, you should have developed a good understanding of what covariates you will need to have access to.

Here’s a summary of what you’ll need to submit for your first milestone (on Canvas). You should submit a PDF with a two-pager (or shorter document) that includes

  • a brief paragraph explaining your research question,
  • a DAG that summarizes the causal relationship investigated in the study, including a discussion of which relevant “backdoors” need to be closed to make a causal statement, and
  • a confirmation that you have successfully sourced all the necessary data to conduct your study (including for the covariates you identified).

As we’ll see in our class on sample size calculations, your dataset should include at least 200 observations or more.


1:1 meeting

You should meet with me at least once to discuss your project. Please meet with me after submitting your first milestone and before Wednesday, February 14, 2024 (incl.). Feel free to meet with me more often; I look forward to hearing about your work!


Milestone 2

For Milestone 2, you should generate all of the tables and figures of your paper (including for the bounding exercise). You should prepare a replication package you can share with others.

Where appropriate, please include estimating equations for the analyses you are conducting.

Here’s a summary of what you’ll need to submit for your second milestone (on Canvas).

  • A PDF of your draft paper. Your paper should follow the APA standards (JARS-Quant).
  • In the week before Milestone 2, I will pair you with a classmate (your “replicator”). In the week before the due date, you should share a replication package of your analyses with this classmate. Your classmate should be able to replicate all of your work with a single push of a button. For Milestone 2, please submit a screenshot or PDF printout of an email from your replicator, confirming (a) that your code allows for a push-button reproduction of all tables and figures and, vice versa, (b) that you were able to reproduce your classmate’s findings.
  • The same replication package (as a .zip file) which allows me to replicate all of your tables and figures with a single push of a button.

Please also post an abstract of your paper via Jamboard. That way, everyone can see what you’re working on. Limit your abstract to 150 words (not 250).


Final submission

Your submission of the final paper should improve over Milestone 2 in at least two ways. First, you will have received feedback from me, and you may follow some of my suggestions. Second, you should present a proposal of how to investigate the same causal relationship with a (hypothetical) quasi-experimental approach (using either a regression discontinuity design, instrumental variables, or a difference-in-differences strategy).

Here’s a summary of what you’ll need to submit for your final assignment (on Canvas).

  • A PDF of your final paper, including a section that proposes a quasi-experimental study.
  • A replication package (as a .zip file) which allows me to replicate all of your tables and figures with a single push of a button.

Below, you find an outline of what you should write about for the proposed quasi-experimental study.

Source of variation

State the source of variation in your main explanatory variable of interest. Your proposal can be purely hypothetical, but explain at least one scenario that would allow you to investigate your research question. Make sure you identify an “external shifter” in this variable (e.g., a policy change, or the introduction of a new program in schools).

Sample

Sampling and sample

What would be the study sample? How large will this sample be and how would you select it? Feel free to continue with your hypothetical scenario from just above.

Power calculations

Identify a reasonable effect size for your main outcome of interest. Conduct power calculations and report on the required sample size to detect this effect (accounting for clustering if needed). State additional assumptions you used for your calculations.

Independent of the analytical strategy you describe just below, here, you can simplify your analysis and conduct power calculations for a randomized experiment.

Analytical strategy

Identification strategy

How will you measure the causal effect? Will you rely on differences-in-differences? Regression discontinuity? Instrumental variables? How does your approach account for selection bias and endogeneity? How does your approach isolate the causal effect of your main variable of interest on the outcome?

Describe how you would analyze the data using your identification strategy. For instance:

  • If you’re proposing a diff-in-diff, specify a regression model with an interaction term to show the diff-in-diff.
  • If you’re proposing a regression discontinuity design, specify a model to check for a jump in the outcome variable at the cutoff in the running variable.
  • If you’re proposing to use instrumental variables, specify how you will check the validity of your instrument and run a 2SLS model.

Remember: You should propose a quasi-experimental study. You must go beyond your initial regression-adjusted approach. You must not propose matching, a purely qualitative strategy, or a randomized controlled trial.

Threats to internal validity

Briefly describe what kinds of threats to internal validity you would face in this proposed quasi-experimental study. How would your study mitigate these threats?

References

Grosz, Michael P., Julia M. Rohrer, and Felix Thoemmes. 2020. “The Taboo Against Explicit Causal Inference in Nonexperimental Psychology.” Perspectives on Psychological Science 15 (5): 1243–55. https://doi.org/10.1177/1745691620921521.
Oster, Emily. 2019. “Unobservable Selection and Coefficient Stability: Theory and Evidence.” Journal of Business & Economic Statistics 37 (2): 187–204. https://doi.org/10.1080/07350015.2016.1227711.

Footnotes

  1. You may have to de-identify data. Also, you must have the necessary rights to share your data with me (or anyone else).↩︎