Stata
Learning Stata
Our class assumes that you have used Stata before, at least a little. Before our first class, you may want to review your course materials from EDUC 288A (Educational, Social, and Behavioral Statistics) and/or take a look at Chapters 3-4 of Regression Analysis for the Social Sciences (Gordon 2010). We will also dedicate our first lab session to reviewing some of this material.
You may use UCI’s Virtual Computing Lab to access Stata for free.
There are tons of online resources to help you with Stata. Three of the most important are UCLA’s Stata training site, the Statalist forum, and Stack Overflow (a Q&A site with hundreds of thousands of answers to all sorts of programming questions). Also, ChatGPT will sometimes fool you, but since it was likely trained on (and is quickly replacing) Stack Overflow, it is quite good at writing Stata code.
If you haven’t used Stata in a while, consider going over the three introductory modules on the UCLA page under “Class Notes”: Entering, Exploring, and Modifying. Together they take approximately 2 hours. This “cheat sheet” of common Stata commands is a nice resource, too.
Other tutorials
Here, I list a selection of other introductory tutorials I find helpful.
These include J-PAL/IPA’s Stata modules (101, 102, 103, 104; all direct downloads; also on GitHub here), Stata’s own list of resources, which includes a link to cheat sheets, and the Princeton DSS online tutorial.
J-PAL also provides a list of coding resources for randomized trials, including resources on how to work with teams in the social sciences, guidance for reproducible work, and how to write randomization code.
Select tips
Here is a brief selection of tips. Others have written awesome books and guides, and built a “coders’ corner” with tips, so this list won’t be exhaustive. Still, I hope it helps.
- Generating LaTeX output. Of those, my go-to approach to writing fully flexible tables uses -file write-.
- Making nicer graphs. Of those, I prefer the -plotplainblind- scheme.
- Methods to account for differential attrition in randomized trials. Here, my co-authors and I use more advanced methods.
- Multiple hypothesis testing.
- Power calculations and ex-post calculation of MDEs.
- Randomization. Of those, I especially like the -randtreat- command.
- Randomization inference.
- Reproducible coding.
- Selection on observables, including how to replicate Emily Oster’s bounding method and how to use her -psacalc- command.
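To give a flavor of the -file write- approach mentioned in the first tip, here is a minimal sketch. It is not necessarily how the linked examples do it. It assumes Stata's built-in auto dataset, and the handle name tex, the output file table1.tex, and the three variables are placeholders chosen only for illustration.

```stata
* Minimal sketch: build a small LaTeX table of means and SDs with -file write-
* Assumptions: built-in auto data; "tex" and "table1.tex" are illustrative names
sysuse auto, clear

* Open a text file for writing, overwriting any existing copy
file open tex using "table1.tex", write replace text

* Table header
file write tex "\begin{tabular}{lcc}" _n
file write tex "\hline" _n
file write tex "Variable & Mean & SD \\" _n
file write tex "\hline" _n

* One row per variable: name, mean, standard deviation
foreach v of varlist price mpg weight {
    quietly summarize `v'
    file write tex "`v' & " %9.2f (r(mean)) " & " %9.2f (r(sd)) " \\" _n
}

* Table footer, then close the handle so the file is flushed to disk
file write tex "\hline" _n
file write tex "\end{tabular}" _n
file close tex
```

Because every cell is written by hand, the layout is fully under your control, which is the flexibility the tip refers to. The resulting file can then be pulled into a LaTeX document with \input{table1.tex}. Note that variable names containing underscores would need escaping before being written as LaTeX text.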