Viewing: Blog Posts Tagged with: Political Analysis, Most Recent at Top
1. Analyzing causal effects of multiple treatments in political methodology

Recent years have seen amazing growth in the development of new tools that can be used to make causal claims about complex social phenomena. Social scientists have been at the forefront of developing many of these new tools, in particular ones that give analysts the ability to make causal inferences in survey research.

2. World Statistics Day: a reading list

On 20 October 2015, the global mathematical community is celebrating World Statistics Day. In honour of this, we present here a reading list of OUP books and journal articles that have helped to advance the understanding of statistics.

3. Text analysis for comparative politics

Every two days, humans produce more textual information than the combined output of humanity from the dawn of recorded history up through the year 2003. Much of this text is directly relevant to questions in political science. Governments, politicians, and average citizens regularly communicate their thoughts and opinions in writing, providing new data from which to understand the political world and suggesting new avenues of study in areas that were previously thought intractable.

4. Tips from a journal editor: being a good reviewer

Peer review is one of the foundations of science. To have research scrutinized, criticized, and evaluated by other experts in the field helps to make sure that a study is well-designed, appropriately analyzed, and well-documented. It helps to make sure that other scholars can readily understand, appreciate, and build upon that work.

5. Using web search data to study elections: Q&A with Alex Street

Social scientists have made important contributions towards improving the conduct and administration of elections. A paper recently published in Political Analysis continues that tradition, and introduces the use of web search data to the study of public administration and public policy.

6. Jonathan Nagler: writing good code

Introduction, by R. Michael Alvarez

Today's data scientist must know how to write good code. Whether they are working with a commercial off-the-shelf statistical software package, R, Python, or Perl, good coding practices matter. Large and complex datasets need a lot of manipulation to wrangle them into shape for analysis, statistical estimation is often complex, and presenting complicated results sometimes requires writing a lot of code. To make sure that code is understandable to its author and to others, good coding practices are essential.

Many who teach methodology, statistics, and data science are increasingly teaching their students how to write good computer code. As a practical matter, if a professor requires that students turn in their code for a problem set, that code needs to be well crafted to be legible to the instructor. But as increasing numbers of our students write and distribute their code and software tools to the public, we need to do more, professionally, to train students to write good code. Finally, good code is critical for research replication and transparency: if you can't understand someone's code, it might be difficult or impossible to reproduce their analysis.

When I first started teaching methods to graduate students, there was little in the methodological literature that I found useful for teaching good coding practices. But in 1995, my colleague Jonathan Nagler wrote out some great guidance on good methodological practices, in particular guidelines for good coding style. His piece is available online ("Coding Style and Good Computing Practices"), and his advice from 1995 is as relevant today as it was then. I use Jonathan's guidelines in my graduate teaching.

Over the past few years, as Political Analysis has focused resources on research replication and transparency, it’s become clear that we need to develop better guidance for researchers and authors regarding how to write good code. One of the biggest issues that we run into when we review replication materials that are submitted to the journal is poor documentation and unclear code; and if we can’t figure out how the code works, I’m sure that our readers will have the same problem.

We’ve been thinking of developing some guidelines for documentation of replication materials, and standards for coding practices. As part of that research, I asked Jonathan if he would write an update of his 1995 essay, and for him to reflect some on how things might have evolved in terms of good computing practices since 1995. His thoughts are below, and I encourage readers to also read Jonathan’s original 1995 essay.

*     *     *     *     *

Coding style and good computing practices: it is easy to get the style right, harder to get good practice, by Jonathan Nagler, NYU

Many years ago I was prompted to write Coding Style and Good Computing Practices, an article laying out guidelines for coding style for political scientists. The article was reprinted in a symposium on replication in PS (September 1995, Vol. 28, No. 3, 488-492). According to Google Scholar it has rarely been cited, but I'm convinced it has been read quite often, because I've seen some of its idiosyncratic suggestions in the code of other political scientists. Re-reading the article, though, I am reminded how many people have either not read it or simply ignored it.

Ladies coding event by Jon Lim. CC BY 2.0 via Wikimedia Commons.

Here is a list of basic points reproduced from that article:

  • Labbooks: essential.
  • Command files: they should be kept.
  • Data-manipulation vs. data-analysis: these should be in distinct files.
  • Keep tasks compartmentalized (‘modularity’).
  • Know what the code is supposed to do before you start.
  • Don’t be too clever.
  • Variable names should mean something.
  • Use parentheses and white-space to make code readable.
  • Documentation: all code should include comments meaningful to others.

And I concluded with a list of rules:

  • Maintain a labbook from the beginning of a project to the end.
  • Code each variable so that it corresponds as closely as possible to a verbal description of the substantive hypothesis the variable will be used to test.
  • Errors in code should be corrected where they occur and the code re-run.
  • Separate tasks related to data-manipulation vs data-analysis into separate files.
  • Each program should perform only one task.
  • Do not try to be as clever as possible when coding. Try to write code that is as simple as possible.
  • Each section of a program should perform only one task.
  • Use a consistent style regarding lower and upper case letters.
  • Use variable names that have substantive meaning.
  • Use variable names that indicate direction where possible.
  • Use appropriate white-space in your programs, and do so in a consistent fashion to make them easy to read.
  • Include comments before each block of code describing the purpose of the code.
  • Include comments for any line of code if the meaning of the line will not be unambiguous to someone other than yourself.
  • Rewrite any code that is not clear.
  • Verify that missing data is handled correctly on any recode or creation of a new variable.
  • After creating each new variable or recoding any variable, produce frequencies or descriptive statistics of the new variable and examine them to be sure that you achieved what you intended.
  • When possible, automate things and avoid placing hard-wired values (those computed ‘by-hand’) in code.
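
As a minimal sketch of how several of these rules look in practice, consider a hypothetical recode in R (the variable names, values, and cut-point below are purely illustrative, not drawn from any real dataset):

    ## Data-manipulation step only; the analysis lives in a separate file.
    ## Purpose: recode a raw 1-7 party identification item into a directional
    ## "Republican leaning" variable matching the hypothesis it will test.
    raw <- data.frame(party_id_raw = c(1, 4, 7, NA, 3, 6))

    ## Keep missing data missing; higher values = more Republican.
    raw$repub_lean <- ifelse(is.na(raw$party_id_raw), NA, raw$party_id_raw - 4)

    ## Inspect the new variable to verify the recode did what was intended.
    table(raw$repub_lean, useNA = "ifany")
    summary(raw$repub_lean)

Even a fragment this small exercises the rules about meaningful, directional variable names, comments before each block, explicit handling of missing data, and checking descriptive statistics after every recode.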

Those are still very good rules; I would not change any of them. I would add one, and that is to put comments in any paper citing the piece of code that produced the figures or tables in the paper. In 20 years a lot of things have changed about how we do computing. It has gotten much easier to follow good computing practices. GitHub has made it easy to share code, maintain revision history, and publish code. And the set of people who seamlessly collaborate by sharing files over Dropbox or one of its competitors probably dwarfs the number of political scientists using GitHub. But to paraphrase a common computing aphorism (GIGO), sharing or publishing badly written code won't make it easy for people to replicate or build on your work.

I was motivated to write that article because, as I stated then, most political scientists aren't trained as computer programmers. Nor were most political scientists trained to work in a laboratory. So the article covered both coding style and the computing practices needed to make sure that an entire research project could be reproduced by someone else. That means keeping track of where you got your data, how it was processed, and so on.

Any computer code is a set of instructions that produces results when read by a machine, and we can evaluate the code based on the results it produces. But when we share code we expect it to be read by humans. Two pieces of code can be functionally equivalent, producing identical results when read by a machine, even though one is easy for a human to read and understand while the other is pretty much unintelligible. If you expect people to use your code, you need to make it easy to read. I ask every graduate student I am going to work with to read several chapters from Brian W. Kernighan and Rob Pike's The Practice of Programming (1999), especially the Preface, Chapters 1, 3, 5, and 6, and the Epilogue.
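
To make that point concrete, here is a small hypothetical pair of R fragments (the vote counts and the 25% threshold are invented for illustration); they are functionally equivalent, but only the second is easy for a human to check:

    votes <- c(120, 340, 95, 410)   # hypothetical district vote counts

    ## Version 1: correct, but hard to verify at a glance
    x <- sum(votes / sum(votes) * (votes / sum(votes) > 0.25))

    ## Version 2: the same computation, with each step named and commented
    vote_share        <- votes / sum(votes)   # each district's share of the total vote
    is_large_district <- vote_share > 0.25    # districts casting over 25% of the vote
    share_in_large    <- sum(vote_share[is_large_district])

    x == share_in_large   # TRUE: identical results, very different legibility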

It has turned out to be easier to write clean code than to maintain the good computing practices that would make an entire research project easy to reproduce. It is fairly easy to post a 'replication' dataset and the code used to produce the figures and tables in a paper. But that doesn't really tell someone everything they need to know to reproduce your work, or to extend it to other data. They need to know how your data was generated. And those steps occur in the production of the replication dataset, not in the use of it.

Most research projects in political science pull in data from many sources. And many, many coding decisions are made along the way to a finished product. All of those decisions may be visible in the code; but keeping coherent lab books is essential for sifting through all the lines of code of any large project. And 'projects' rarely stand alone anymore. Work on one dataset is linked to many projects, often with overlapping sets of co-authors.

At the beginning of a research project it's important for everyone to agree where the code is, where the data is, and what the overall structure of the documentation is. That means decisions about whether documentation is grouped by project (which could mean by individual paper) or by dataset. And it means reaching some agreement on whether there is a master document that points to many smaller documents describing individual tasks, or whether the whole project description sits in a single document. None of this is exciting to work out, certainly not as exciting as doing the research. But it is essential. A good goal is to make it as easy as possible to make the whole bundle of documentation and code public as soon as it is time to do so. That both saves time when the documentation is released and imposes some good habits and structure along the way.

Heading image: Typing computer screen reflection by Almonroth. CC BY-SA 3.0 via Wikimedia Commons.

7. Replication redux and Facebook data

Introduction, from Michael Alvarez, co-editor of Political Analysis

Recently I asked Nathaniel Beck to write about his experiences with research replication. His essay, published on 24 August 2014 on the OUPblog, concluded with a brief discussion of a recent experience of his when he tried to obtain replication data from the authors of a recent study published in PNAS, on an experiment run on Facebook regarding social contagion. Since then, the story of Neal's efforts to obtain this replication material has taken a few interesting twists and turns, so I asked Neal to provide an update, because the lessons from his efforts to get the replication data from this PNAS study are useful for the continued discussion of research transparency in the social sciences.

Replication redux, by Nathaniel Beck

When I last wrote about replication for the OUPblog in August (“Research Replication in Social Science”), there was one smallish open question (about my own work) and one biggish question (on whether I would ever see the Kramer et al., “Experimental evidence of massive-scale emotional contagion through social networks”, replication file, which was “in the mail”). The Facebook story is interesting, so I start with that.

After not hearing from Adam Kramer of Facebook, even after contacting PNAS, I persisted with both the editor of PNAS (Inder Verma, who was most kind) and the NAS, through "well connected" friends. (Getting replication data should not depend on knowing NAS members!) I was finally contacted by Adam Kramer, who offered that I could come out to Palo Alto to look at the replication data. Since Facebook did not offer to fly me out, I said no. I was then offered a chance to look at the replication files in the Facebook office four blocks from NYU, so I accepted. Let me stress that all dealings with Adam Kramer were highly cordial, and I assume that the delays were due to Facebook higher-ups who were dealing with the human subjects firestorm related to the Kramer piece.

When I got to the Facebook office I was asked to sign a standard non-disclosure agreement, which I declined to do. To my surprise this was not a problem, with the only consequence being that a security officer would have had to escort me to the bathroom. I was then put in a room with a secure Facebook notebook with the data and RStudio loaded; Adam Kramer was there to answer questions, and I was also joined by a security person and an external relations person. All were quite pleasant, and the security person and I could even discuss the disastrous season being suffered by Liverpool.

I was given a replication file, a data frame with approximately 700,000 rows (one for each respondent) and 7 columns containing the number of positive and negative words used by each respondent, the total word count of each respondent, percentages based on these numbers, the experimental condition, and a variable flagging respondents omitted when producing the tables. This is exactly the data frame that would have been put in an archive, since it contained all the data needed to replicate the article. I was also given the R code that produced every item in the article. I was allowed to do anything I wanted with that data, and I could copy the results into a file. That file was then checked by Facebook people, and about two weeks later I received the entire file I had created. All good, or at least as good as it is going to get.
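
To give a flavor of what such a compact replication looks like, here is a rough sketch in R; the column names, the simulated values, and the exact model specification are illustrative stand-ins for the kind of analysis described, not Facebook's actual code or data:

    ## Hypothetical stand-in for the aggregated data frame: one row per user,
    ## word counts, experimental condition, and an exclusion flag.
    set.seed(1)
    n <- 1000   # the real frame had roughly 700,000 rows
    repl <- data.frame(
      pos_words   = rpois(n, 5),
      neg_words   = rpois(n, 2),
      total_words = pmax(rpois(n, 80), 1),
      condition   = factor(sample(c("control", "treatment"), n, replace = TRUE)),
      excluded    = rbinom(n, 1, 0.02)
    )

    ## A few lines of analysis: a Poisson regression of negative-word counts on
    ## condition, with an exposure offset for how much each user wrote.
    fit <- glm(neg_words ~ condition + offset(log(total_words)),
               family = poisson, data = subset(repl, excluded == 0))
    summary(fit)

    ## A negative binomial alternative would swap in MASS::glm.nb() with the same formula.

Posting the handful of lines that play this role, together with summary statistics for the real data frame, is essentially what the alternative replication footnote discussed below would have contained.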

Intel team inside Facebook data center. Intel Free Press. CC BY 2.0 via Wikimedia Commons.

The data frame I played with was based on aggregating user posts so each user had one row of data, regardless of the number of posts (and the data frame did not contain anything more than the total number of words posted). I can understand why Facebook did not want to give me the data frame, innocuous as it seemed; those who specialize in re-identifying de-identified private data and reverse engineering code are quite good these days, and I can surely understand Facebook's reluctance to have this raw data out there. And I understand why they could not give me all the actual raw data, which included how feeds were changed and so forth; this is the secret sauce that they would not like reverse engineered.

I got what I wanted. I could see their code, play with density plots to get a sense of the words used, and change the number of extreme points dropped, and I could have moved to a negative binomial instead of a Poisson. Satisfied, I left after about an hour; there are only so many things one can do with one experiment on two outcomes. I felt bad that Adam Kramer had to fly to New York, but I guess this is not so horrible. Had the data been more complicated I might have felt that I could not do everything I wanted, and running a replication with three other people in a room is not ideal (especially given my typing!).

My belief is that PNAS and the authors could simply have had a different replication footnote. This would have said that the code used (about 5 lines of R, basically a call to a Poisson regression using glm()) is available at a dataverse. In addition, they could have noted that the glm() call used the data frame I described, with the summary statistics for that data frame. Readers could then see what was done, and I can see no reason for such a procedure to bother Facebook (though I do not speak for them). I also note that a clear statement on a dataverse would have obviated the need for some discussion. Since bytes are cheap, the dataverse could also contain whatever policy statement Facebook has on replication data. This (IMHO) is much better than the "contact the authors for replication data" footnote that was published. It is obviously up to individual editors whether this is enough to satisfy replication standards, but at least it is better than the status quo.

What if I didn't work four blocks from Astor Place? Fortunately I did not have to confront this horror. How many other offices does Facebook have? Would Adam Kramer have flown to Peoria? I batted this around, but I did most of the batting and the Facebook people mostly offered no comment. So someone else will have to test this issue. But for me, the procedure worked. Obviously lots more proprietary data is being analyzed these days, and (IMHO) this is a good thing. So Facebook et al., journal editors, and societies have many details to work out. But, based on this one experience, it can be done. So I close this with thanks to Adam Kramer (but do remind him that I have had auto-responders to email for quite a while now).

On the more trivial issue of my own dataverse, I am happy to report that almost everything that was once on a private FTP site is now on my Harvard dataverse. Some of this was already up because of various co-authors who always cared about replication. And for the stuff that was not up, I was lucky to have a co-author like Jonathan Katz, who has many skills I do not possess (and who is a stickler for RCS and the like, which beats my "I have a few TB and the stuff is probably hidden there somewhere"). So everything is now on the dataverse, except for one dataset that we were given for our 1995 APSR piece (and which Katz never had). Interestingly, I checked the original authors' websites (one no longer exists, one did not go back nearly that far) and failed to make contact with either author. Twenty years is a long time! So everyone should do both themselves and all of us a favor, and build the appropriate dataverse files contemporaneously with the work. Editors will demand this, but coercion aside, it is just good practice. I was shocked (shocked) at how bad my own practice was.

Heading image: Wikimedia Foundation Servers-8055 24 by Victorgrigas. CC BY-SA 3.0 via Wikimedia Commons.

8. Gary King: an update on Dataverse

At the American Political Science Association meetings earlier this year, Gary King, Albert J. Weatherhead III University Professor at Harvard University, gave a presentation on Dataverse. Dataverse is an important tool that many researchers use to archive and share their research materials. As many readers of this blog may already know, the journal that I co-edit, Political Analysis, uses Dataverse to archive and disseminate the replication materials for the articles we publish in our journal. I asked Gary to write some remarks about Dataverse, based on his APSA presentation. His remarks are below.

*   *   *   *   *

An update on Dataverse

By Gary King

 
If you're an academic researcher, odds are you're not a professional archivist, and so you probably have more interesting things to do when making data available than following the detailed protocols and procedures established over many years by the archiving community. That of course might be OK for any one of us, but it is a terrible loss for all of us. The Dataverse Network Project offers a solution to this problem by eliminating transaction costs and changing the incentives to make data available, giving you substantial web visibility and academic citation credit for your data and scholarship (King, 2007). Dataverse Networks are installed at universities and other institutions around the world (e.g., here is the Dataverse network at Harvard's IQSS), and represent the world's largest collection of social science research data. In recent years, Dataverse has also been adopted by an increasingly diverse array of other fields, and protocols and procedures are being built out to enable numerous fields of science, social science, and the humanities to work together.

With a few minutes of set-up time, you can add your own Dataverse to your homepage with a list of data sets or replication data sets you make available, with whatever levels of permission you want for the broader community, and a vast array of professional services (e.g., here’s my Dataverse on my homepage). People will be able to more easily find your data and homepage, explore your data and scholarship, find connections to other resources, download data in any format, and learn proper ways of citing your work. They will even be able to analyze your data while still on your web site with a vast array of statistical methods through the transparent and automated connection Dataverse has built to Zelig: Everyone’s Statistical Software, and through Zelig to R. The result is that your data will be professionally preserved and easier to access — effectively automating the tasks of professional archiving, including citing, sharing, analyzing, archiving, preserving, distributing, cataloging, translating, disseminating, naming, verifying, and replicating data.
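
As a rough sketch of what that Zelig-to-R connection looks like from the analyst's side, the fragment below uses Zelig's standard zelig()/setx()/sim() workflow on a small invented data frame; the variable names and data are hypothetical, and nothing here is specific to any particular Dataverse holding:

    library(Zelig)

    ## Hypothetical data standing in for a file retrieved from a dataverse
    set.seed(2)
    dat <- data.frame(
      turnout   = rbinom(500, 1, 0.55),
      income    = rnorm(500, mean = 50, sd = 15),
      education = sample(8:20, 500, replace = TRUE)
    )

    ## Estimate a logit, set a counterfactual value, and simulate quantities of interest
    z.out <- zelig(turnout ~ income + education, model = "logit", data = dat)
    x.hi  <- setx(z.out, education = 20)
    s.out <- sim(z.out, x = x.hi)
    summary(s.out)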

Dataverse Network Diagram, by Institute for Quantitative Social Science. CC-BY-2.0 via Wikimedia Commons.

Dataverse is an active project with new developments in software, protocols, and community connections coming rapidly. A brand new version of the code, written from scratch, will be available in a few months. Through generous grants from the Sloan Foundation, we have been working hard on eliminating other types of transaction costs for capturing data for the research community. These include deep integration with scholarly journals so that it can be trivially easy for an editor to encourage or require data associated with publications to be made available. We presently offer journals three options:

  • Do it yourself. Authors publish data to their own dataverse and put the citation to their data in their final submitted paper. Journals verify compliance by having the copyeditor check for the existence of the citation.
  • Journal verification. Authors submit a draft of the replication data to the journal's Dataverse. The journal reviews it and approves it for release. Finally, the dataset is published with a formal data citation and a link back to the article. (See, for example, the Political Analysis Dataverse, with replication data back to 1999.)
  • Full automation. Seamless integration between the journal's submission system and Dataverse, with an automatic link created between the article and the data. The result is an easier process for both the journal and the author, and many errors are eliminated.

Full automation, our third option, is where we are heading. Already today, in 400 scholarly journals using the Open Journal Systems, the author enters their data as part of submitting the final draft of the accepted paper for publication, and the citation, the permanent links between the data and the article, and formal preservation are all taken care of automatically. We are working on expanding this as an option for all of OJS's 5,000+ journals, and to a wide array of other scholarly journal publishers. The result will be that we capture data with the least effort on anyone's part, at exactly the point where it is easiest and most important to capture.

We are also working on extending Dataverse to cover new, higher levels of security that are more prevalent in big data collections and those in public health, medicine, and other areas with informative data on human subjects. Yes, you can preserve data and make it available under appropriate protections, even if you have highly confidential, proprietary, or otherwise sensitive data. We are working on other privacy tools as well. We already have an extensive versioning system in Dataverse, but we are planning to add support for continuously updated data such as data streamed from sensors, tools for fast online data access, queries, visualization, analysis methods for when data cannot be moved because of size or privacy concerns, and ways to use the huge volume of web analytics to improve Dataverse and Zelig.

This post comes from the talk I gave at the American Political Science Association meetings in August 2014, using these slides. Many thanks to Mike Alvarez for inviting this post.

Featured image: Matrix code computer by Comfreak. CC0 via Pixabay.

9. The importance of mentoring

Throughout my career, there have been many times when advice, support, and criticism were critical for my own professional development. Sometimes that assistance came from people who were formally tasked with providing advice; a good example is a Ph.D. advisor (in my case, John Aldrich of Duke University, who has been a fantastic advisor and mentor to a long list of very successful students). Sometimes that advice was less formal, coming from senior colleagues, other academics at conferences, and in many cases from peers. The lesson is professional advice and support — or to put it into a single term, mentoring — comes from many different sources and occurs in many different ways.

However, there is growing concern in political science that more mentoring is necessary, that there are scholars who are not getting the professional support and advice that they need to help them with career decisions, teaching, and the publication of their research. There are many good programs that have developed in recent years to help provide more mentoring in political methodology, for example the excellent “Visions in Methodology” program. And the Society for Political Methodology recently approved the foundation of a new professional award, to recognize excellent mentors. But more needs to be done to improve mentoring and mentoring opportunities in academia.

During the 2014 American Political Science Association conference, there was a very informative panel discussion, “How to Be a Good Mentee: Mentoring for Methodologists.” The discussion was chaired by Megan Shannon of the University of Colorado, and participants were Leslie A. Schwindt-Bayer (Rice University), Tiffany D. Barnes (University of Kentucky), and Brett Ashley Leeds (Rice University). I had an opportunity to listen to much of this panel discussion, and found it quite helpful.

After the conference I sent Leslie, Tiffany, and Ashley some questions about mentoring by email. Their responses are informative and helpful, and should be read by anyone who is interested in mentoring.

R. Michael Alvarez: How have you benefited from being involved in mentoring relationships?

Tiffany D. Barnes: I have benefited in a number of ways from being involved in a mentoring relationship. Mentors have provided me with feedback on research at multiple different stages of the research process. They have provided me with professional advice about a number of things including applying for fellowships and grants, marketing my book manuscript to university presses, and navigating the negotiation process at my university. My mentoring relationships have broadened my network of scholars with similar research interests and/or professional goals, which in turn have resulted in a number of different opportunities (e.g. coauthors, and invitations to participate in conference panels/round tables, mini-conferences, and edited volumes/special journal issues). Equally important, my mentoring relationships have resulted in a number of valuable friendships that make working in the profession more enjoyable.

Leslie A. Schwindt-Bayer: As a mentee, I really benefited from getting guidance, feedback, and research assistance from many different formal and informal mentors over the years. As a mentor, I get to give that back which is a great opportunity.

Brett Ashley Leeds: I believe fundamentally that no one figures everything out on his or her own. I know for sure that I did not, and I have had (and continue to have) a variety of mentors throughout my career. As a mentee, what I really value is knowing that I have people who respect me enough to tell me when I am wrong and to help me improve. As a mentor, I not only learn a lot from thinking intently about my mentees’ work and articulating my opinions for them, but I also get great personal satisfaction from the relationships that evolve and from helping others to succeed. It feels good to pay forward what has been done for me.

Woman looking away and smiling. © GlobalStock via iStock.

R. Michael Alvarez: Why has the issue of mentoring become an important topic of conversation in academia, and in particular, in political science?

Tiffany D. Barnes: Although it is well established that mentoring is an important aspect of professional development, it has recently become an important topic of conversation because academics have become aware that not all scholars have the same opportunities to develop mentorship relationships nor do they derive the same benefits from mentor relationships. In particular, women and minorities may face more challenges when it comes to identifying mentors in the field and they may not reap the same benefits (e.g. opportunities to collaborate, sponsorship) from mentorship relationships as men do. In the long run, this “mentor gap” may have negative repercussions for the retention and career advancement of some otherwise talented scholars.

R. Michael Alvarez: If a scholar feels they would benefit from mentoring, how can they seek out a mentor? What should they look for in an appropriate mentor?

Leslie A. Schwindt-Bayer: Mentoring relationships can be both informal and formal. Informal relationships often emerge when scholars ask for advice and support from colleagues in their department, subfields, or various disciplinary organizations. Formal relationships sometimes emerge organically or at the initiative of a mentee or mentor, but they also can be entered into through a number of mentoring programs in the discipline. For women, the Visions in Methodology program offers a mentoring program through which mentees can ask to be paired with a mentor. They usually ask the mentee to suggest someone they would like to be paired with and then check with the suggested mentor about interest and availability. The Midwest Women’s Caucus has a mentoring program for women in any subfield. They ask individuals interested in mentoring and being mentored to volunteer to participate and then pair them by interest. Other organizations and groups probably offer similar programs.

In seeking a mentor, either formally or informally, you should think about exactly what you want out of the relationship. Are you looking for someone to provide you with general guidance about the profession or are you seeking someone who is willing to read your work from time to time and talk through research challenges when you come across them? Are you in your first year out, feeling lost, and needing help getting back on track or are you close to tenure and looking for guidance on how to navigate the process? Do you want a mentor whose style is to give “pep talks” or “straight talk?” Knowing what you want out of the relationship will help you identify the right person for the job.

Tiffany D. Barnes: Scholars can look for a mentor by signing up for a formal mentor match or by identifying someone in the profession who shares their research interests or professional goals.

A formal mentor match is a good option for identifying someone who is interested in serving as a mentor. Typically the mentoring program will ask you questions about what you are looking for in a mentoring relationship, your research interests, your rank, and your professional interests. The program will try to match you with a mentor based on this information. If you are paired with someone through a program, you can be confident that your mentor wants to help you. These relationships can be very valuable, but, as with all mentor-mentee relationships, they require initiative on the part of the mentee. It is the mentee's responsibility to drive the mentor-mentee relationship. Mentees should identify why they want a mentor, reach out to the mentor, and ask for help in areas where they can benefit the most. One criticism of formal matching programs is that they may not always result in the best "fit." Even if you do not think the match is the best fit, there are still a number of benefits you can derive from the relationship. Your research interests do not have to overlap perfectly for you to benefit from the relationship. Indeed, most successful scholars have a wealth of information, advice, and perspective to offer junior colleagues. It is up to the mentee to identify areas where their needs or interests intersect with the mentor's strengths, experiences, and interests, and to capitalize on these opportunities.

A second option is to develop a more informal mentoring relationship. To do this, mentees should identify someone in the field who has similar research interests or professional goals. Mentees should identify different opportunities to get to know scholars with similar interests and try to develop these relationships from there. For example, you may have the opportunity to establish relationships with scholars when you present research on the same panel, when someone shows interest in your work by offering comments or questions about your research (or vice versa), or even when you have the opportunity to bring a guest speaker to your university. By following up with people after the initial meeting and/or taking them up on their offer to read and comment on your research, you can begin to establish relationships with them. These relationships may take time to develop, and they may be difficult to establish if you are new to the profession or do not know many scholars in your field. Finally, when attempting to establish more informal mentoring relationships, it is important to be self-aware. Some people will show interest in you and be eager to get to know and help you, others will not, and no one is obligated to do so. Respect people's right not to be interested in you and try not to take it personally.

Brett Ashley Leeds: My view is that it is less important to find one person that can be identified as “a mentor” and instead to focus on finding mentoring, even if it comes from a variety of people. I encourage scholars to identify people who have skills, abilities, and/or information that they think would be useful to them– basically people they would like to emulate in particular areas of their work. Approach these folks politely in person or by email (for instance, asking to have coffee at a conference) and ask questions. Some will not be responsive, but many will be responsive and helpful. Follow up with those who are helpful. In some cases a relationship will develop.

R. Michael Alvarez: What are the most important “dos” and “don’ts” for a scholar who is in a mentoring relationship?

Brett Ashley Leeds: Since below I cover some tips for mentors, here are some tips for mentees: (1) Figure out what it is you want to know/learn. Think of both specific and general questions so you are prepared to ask when the opportunity arises. (2) Recognize the time and costs of what you ask and make things as easy as possible for your mentor by reminding him/her of past interactions and explaining the specific feedback you are looking for. (3) Understand that ultimately you are responsible for your own decisions. Ask your mentor to explain why he/she believes a particular action/approach is best, and for major decisions, seek advice from multiple people. (4) Let your mentors know about the outcomes. For example, if a mentor helps you with a paper, send a note when the paper is accepted for publication.

Leslie A. Schwindt-Bayer: For mentees, be assertive and discuss with your mentor when your relationship begins just what you each want from the relationship and are willing to commit to it. If you need something from your mentor, don’t wait for him/her to reach out to you. Email, call, or arrange to meet with your mentor at a conference. Since the mentee is the one who needs the mentoring relationship the most, the mentee needs to take the initiative to ask for help or guidance from the mentor.

Tiffany D. Barnes: Establish clear expectations and boundaries. Tell your mentor what you are hoping to get out of a mentoring relationship, and don't be afraid to ask your mentor for help in areas where you could benefit the most. That said, it is important to acknowledge that your mentor may not always be willing or able to help you in the ways you want. Respect these boundaries and do not take them personally.

When establishing boundaries, it is important to respect your mentor’s time and to be cognizant and courteous with the time you ask of your mentor. For example, if your mentor agrees to meet with you for half an hour, pay attention to the time and wrap up your meeting in a timely manner. Your mentor will likely appreciate not having to cut you short, and, if they know you respect their time, it may make them more likely to make time for you in the future.

Don't expect any single mentor to fulfill all of your mentoring needs. Different people, depending on their experience and expertise, have different things to offer. Try to identify the areas where your mentor is most likely to be of help to you and build on these strengths. Along these same lines, although your mentor likely gives great advice, you cannot expect them to have the answer to all of your questions. It is important to weigh their point of view carefully and to seek out a number of different perspectives.

Seek to develop a number of mentoring relationships. It can be useful to have mentors within your own department, in your university (but outside your department), and in the discipline more broadly. Moreover, it is often just as useful to develop relationships with senior mentors, as it is to develop relationships with peer mentors.

Student With Teacher In Class. © monkeybusinessimages via iStock.

R. Michael Alvarez: What are the responsibilities of a mentor?

Brett Ashley Leeds: (1) Create an environment in which you can provide effective constructive criticism. This tends to require first establishing an environment of mutual respect. (2) Know what you know and what you don't, and know that your experience is not universal. (3) Always explain why you are giving the advice you are giving and be willing to consider alternatives. (4) Recognize that in the end, your mentee should make his/her own decisions and may not always take all of your advice. (5) Recognize how important your opinion may be to your mentee; wield this power responsibly.

Tiffany D. Barnes: A mentor should establish clear boundaries with their mentee. Be honest and upfront about the role you are and are not willing to play as a mentor. Be clear about your time constraints and the amount of time you are willing to commit to your mentee.

Leslie A. Schwindt-Bayer: If it is a formal mentoring relationship, make sure you and your mentee establish ground rules at the beginning about what each of you wants from the relationship and are willing to give to it. Don’t commit to something you aren’t willing to follow through with and be sure to follow through with whatever you commit to do for your mentee. If you can only commit to an hour of time twice a semester, that is fine, but make sure your mentee knows that and agrees that it is sufficient for him/her. If you are willing to provide general guidance but don’t want to read/comment on your mentee’s work, that is fine. But, again, make sure your mentee knows that from the beginning. Keep in mind that your mentee may place very high value on your advice and guidance so give it carefully.

R. Michael Alvarez: What are the personal and professional benefits of being a mentor?

Leslie A. Schwindt-Bayer: Too numerous to list in a short response!

Brett Ashley Leeds: It has often been said that one only really knows something when she can teach it to others. Mentoring gives me an opportunity to clarify and articulate my views on professional issues and research in a way that I otherwise might not. I frequently learn in the act of mentoring. The main benefits, however, are personal, and come from the satisfaction of helping others to achieve their goals and the feeling of paying forward what has been done in the past for me.

R. Michael Alvarez: How can professional organizations (like the Society for Political Methodology) facilitate professional mentoring?

Brett Ashley Leeds: The most important thing that professional organizations can do is provide opportunities that encourage interaction among scholars who don’t already know one another, and particularly between junior and senior scholars. Small conferences, dinners, and receptions help a lot with this. Poster sessions in which junior scholars are matched with senior discussants also help.

Tiffany D. Barnes: In my experience professional organizations play an important role, both formal and informal, in facilitating professional mentoring.

Professional organizations can formally facilitate mentoring relationships by matching mentors with mentees. I have two different successful mentoring relationships that were products of mentoring matches. This is a great way to help young scholars identify someone in the profession who is willing to serve as a mentor.

Professional organizations can also facilitate mentoring by simply providing both professional and social opportunities for junior scholars to meet likeminded senior (and junior!) colleagues. By becoming involved in professional organizations that align with your professional interests you will establish relationships with colleagues in your field. Most of these relationships will emerge naturally and develop slowly over time. Although you may not formally call the individuals you meet here “mentors,” they will become an important part of your mentoring community.

Leslie A. Schwindt-Bayer: One of many ways is a formal mentoring program. The Visions in Methodology mentoring program is a fantastic example, but it is only for women. This is a very positive feature of the program because women in a field with a small representation of women face different and sometimes more challenging sets of obstacles than men. However, plenty of men in the field would also benefit immensely from mentoring and so offering a similar program for men or a program that is open to both women and men, if it does not already exist, would help to facilitate formal professional mentoring in the methods subfield.

10. Replication, data access, and transparency in social science

Improving the transparency of the research published in Political Analysis has been an important priority for Jonathan Katz and me as co-editors of the journal. We spent a great deal of time over the past two years developing and implementing policies and procedures to ensure that all studies published in Political Analysis have replication data available through the journal's Dataverse. At this point in time, we have over 220 studies available in the journal's Dataverse archive, and those studies have had more than 14,400 downloads. We see this as a major accomplishment for Political Analysis.

We are also optimistic that soon many political science journals will join us in implementing similar replication standards. An increasing number of journals developing and implementing replication standards will improve the quality of research in political science, aid in the distribution of materials that can be used in our classrooms, and make the publication process more straightforward for authors.

In late September, Jonathan and I had the opportunity to participate in a two-day “Workshop on Data Access and Research Transparency” at the University of Michigan. The workshop is part of an initiative sponsored by the American Political Science Association (APSA) to develop a discipline-wide discussion of how to improve research transparency in political science. The primary goal was to bring the editors of the primary journals in political science into this conversation. While there is no doubt that there was widespread agreement among the journal editors present that making research more transparent and making data more accessible are important goals, there are still open questions about how such goals can be implemented.

One of the major products of this workshop was a statement of principles for political science journals. While the statement has not yet been released, it contains a short set of principles, the most important of which are that the signing journals will require authors to make replication materials accessible, and that the signing journals will take steps to make the research published in their journal more transparent. Political Analysis is one of the signatories of this statement: we will continue to work to improve the accessibility of data and other research materials for the papers we publish in Political Analysis, as well as assist other journals as they work to develop their own replication and research transparency standards.

As part of this initiative, we have revised our author and reviewer instructions. Our new instructions include:

  1. Updated and clarified standards for how authors should present empirical results in their submissions, in particular tables and figures.
  2. More detailed instructions on our replication requirement.
  3. Encouragement and guidance for authors who wish to pre-register their research studies.

We hope that other journals will follow our lead, and that they will quickly develop strong standards for replication and research transparency. The APSA initiative is laudable, and it is helping to position political science as a leader in these areas, certainly in the social sciences but also throughout the sciences and humanities. We welcome the APSA DART initiative, and will continue to work to position Political Analysis as a leader in developing and implementing data access and research transparency standards.

Headline image credit: Circuit board. CC0 via Pixabay.

11. Political Analysis Letters: a new way to publish innovative research

There’s a lot of interesting social science research these days. Conference programs are packed, journals are flooded with submissions, and authors are looking for innovative new ways to publish their work.

This is why we have started up a new type of research publication at Political Analysis, Letters.

Research journals have a limited number of pages, and many authors struggle to fit their research into the “usual formula” for a social science submission — 25 to 30 double-spaced pages, a small handful of tables and figures, and a page or two of references. Many, and some say most, papers published in social science could be much shorter than that “usual formula.”

We have begun to accept Letters submissions, and we anticipate publishing our first Letters in Volume 24 of Political Analysis. We will continue to accept submissions for research articles, though in some cases the editors will suggest that an author edit their manuscript and resubmit it as a Letter. Soon we will have detailed instructions on how to submit a Letter, the expectations for Letters, and other information, on the journal’s website.

We have named Justin Grimmer and Jens Hainmueller, both at Stanford University, to serve as Associate Editors of Political Analysis — with their primary responsibility being Letters. Justin and Jens are accomplished political scientists and methodologists, and we are quite happy that they have agreed to join the Political Analysis team. Justin and Jens have already put in a great deal of work helping us develop the concept, and working out the logistics for how we integrate the Letters submissions into the existing workflow of the journal.

I recently asked Justin and Jens a few quick questions about Letters, to give them an opportunity to get the word out about this new and innovative way of publishing research in Political Analysis.

Political Analysis is now accepting the submission of Letters as well as Research Articles. What are the general requirements for a Letter?

Letters are short reports of original research that move the field forward. This includes, but is not limited to, new empirical findings, methodological advances, theoretical arguments, as well as comments on or extensions of previous work. Letters are peer reviewed and subjected to the same standards as Political Analysis research articles. Accepted Letters are published in the electronic and print versions of Political Analysis and are searchable and citable just like other articles in the journal. Letters should focus on a single idea and are brief: only 2-4 pages and no more than 1,500-3,000 words.

Why is Political Analysis taking this new direction, looking for shorter submissions?

Political Analysis is taking this new direction to publish important results that do not traditionally fit in the longer article format that is currently the standard in the social sciences, but that fit well with the shorter format often used in the sciences to convey important new findings. In this regard, the model for Political Analysis Letters is the similar format used in top general-interest science journals like Science, Nature, or PNAS, where significant findings are often reported in short reports and articles. Our hope is that these shorter papers will also facilitate an ongoing and faster-paced dialogue about research findings in the social sciences.

What is the main difference between a Letter and a Research Paper?

The most obvious difference is the length and focus. Letters are intended to be only 2-4 pages, while a standard research article might be 30 pages. The difference in length means that Letters are going to be much more focused on one important result. A Letter won't have the long literature review that is standard in political science articles and will have a much briefer introduction, conclusion, and motivation. This does not mean that the motivation is unimportant; it just means that the motivation has to briefly and clearly convey the general relevance of the work and how it moves the field forward. A Letter will typically have 1-3 small display items (figures, tables, or equations) that convey the main results, and these have to be well crafted to clearly communicate the main takeaways from the research.

If you had to give advice to an author considering whether to submit their work to Political Analysis as a Letter or a Research Article, what would you say?

Our first piece of advice would be to submit your work! We’re open to working with authors to help them craft their existing research into a format appropriate for letters. As scholars are thinking about their work, they should know that Letters have a very high standard. We are looking for important findings that are well substantiated and motivated. We also encourage authors to think hard about how they design their display items to clearly convey the key message of the Letter. Lastly, authors should be aware that a significant fraction of submissions might be desk rejected to minimize the burden on reviewers.

You both are Associate Editors of Political Analysis, and you are editing the Letters. Why did you decide to take on this professional responsibility?

Letters provides us an opportunity to create an outlet for important work in Political Methodology. It also gives us the opportunity to develop a new format that we hope will enhance the quality and speed of the academic debates in the social sciences.

Headline image credit: Letters, CC0 via Pixabay.

12. The pros and cons of research preregistration

Research transparency is a hot topic these days in academia, especially with respect to the replication or reproduction of published results.

There are many initiatives that have recently sprung into operation to help improve transparency, and in this regard political scientists are taking the lead. Research transparency has long been a focus of effort of The Society for Political Methodology, and of the journal that I co-edit for the Society, Political Analysis. More recently the American Political Science Association (APSA) has launched an important initiative in Data Access and Research Transparency. It’s likely that other social sciences will be following closely what APSA produces in terms of guidelines and standards.

One way to increase transparency is for scholars to “preregister” their research. That is, they can write up their research plan and publish that prior to the actual implementation of their research plan. A number of social scientists have advocated research preregistration, and Political Analysis will soon release new author guidelines that will encourage scholars who are interested in preregistering their research plans to do so.

However, concerns have been raised about research preregistration. In the Winter 2013 issue of Political Analysis, we published a Symposium on Research Registration. This symposium included two longer papers outlining the rationale for registration: one by Macartan Humphreys, Raul Sanchez de la Sierra, and Peter van der Windt; the other by Jamie Monogan. The symposium included comments from Richard Anderson, Andrew Gelman, and David Laitin.

In order to facilitate further discussion of the pros and cons of research preregistration, I recently asked Jamie Monogan to write a brief essay that outlines the case for preregistration, and I also asked Joshua Tucker to write about some of the concerns that have been raised about how journals may deal with research preregistration.

*   *   *   *   *

The pros of preregistration for political science

By Jamie Monogan, Department of Political Science, University of Georgia

 

Howard Tilton Library Computers, Tulane University by Tulane Public Relations. CC-BY-2.0 via Wikimedia Commons.

Study registration is the idea that a researcher can publicly release a data analysis plan prior to observing a project’s outcome variable. In a Political Analysis symposium on this topic, two articles make the case that this practice can raise research transparency and the overall quality of research in the discipline (Humphreys, de la Sierra, and van der Windt 2013; Monogan 2013).

Together, these two articles describe seven reasons that study registration benefits our discipline. To start, preregistration can curb four causes of publication bias, or the disproportionate publishing of positive, rather than null, findings:

  1. Preregistration would make evaluating the research design more central to the review process, reducing the importance of significance tests in publication decisions. Whether the decision is made before or after observing results, releasing a design early would highlight study quality for reviewers and editors.
  2. Preregistration would mitigate the problem of null findings that stay in the author’s file drawer because the discipline would at least have a record of the registered study, even if no publication emerged. This would convey where past research was conducted that may not have been fruitful.
  3. Preregistration would reduce the ability to add observations to achieve significance because the registered design would signal in advance the appropriate sample size. Without it, it is possible to keep monitoring the analysis and stop data collection only once a positive result emerges; a registered design would prevent that.
  4. Preregistration can prevent fishing, or manipulating the model to achieve a desired result, because the researcher must describe the model specification ahead of time. By sorting out the best specification of a model using theory and past work ahead of time, a researcher can commit to the results of a well-reasoned model.

Additionally, there are three advantages of study registration beyond the issue of publication bias:

  1. Preregistration prevents inductive studies from being written up as deductive studies. Inductive research is valuable, but the discipline is being misled if findings that are observed inductively are reported as if they were hypothesis tests of a theory.
  2. Preregistration allows researchers to signal that they did not fish for results, thereby showing that their research design was not driven by an ideological or funding-based desire to produce a result.
  3. Preregistration provides leverage for scholars who face result-oriented pressure from financial benefactors or policy makers. If the scholar has committed to a design beforehand, the lack of flexibility at the final stage can prevent others from influencing the results.

Overall, there is an array of reasons why the added transparency of study registration can serve the discipline, chiefly the opportunity to reduce publication bias. Whatever you think of this case, though, the best way to form an opinion about study registration is to try it by preregistering one of your own studies. Online study registries are available, so you are encouraged to try the process yourself and then weigh in on the preregistration debate with your own firsthand experience.

*   *   *   *   *

Experiments, preregistration, and journals

By Joshua Tucker, Professor of Politics (NYU) and Co-Editor, Journal of Experimental Political Science

 
I want to make one simple point in this blog post: I think it would be a mistake for journals to come up with any set of standards that involves publicly recognizing some publications as having “successfully” followed their pre-registration design while identifying other publications as not having done so. This could include a special section for articles that matched their pre-registration design, an A, B, C type rating system for how faithfully articles had stuck with the pre-registration design, or even an asterisk for articles that passed a pre-registration faithfulness bar.

Let me be equally clear that I have no problem with the use of registries for recording experimental designs before those experiments are implemented. Nor do I believe that these registries should not be referenced in published works featuring the results of those experiments. On the contrary, I think authors who have pre-registered designs ought to be free to reference what they registered, as well as to discuss in their publications how much the eventual implementation of the experiment might have differed from what was originally proposed in the registry and why.

My concern is much narrower: I want to prevent some arbitrary third party from being given the authority to “grade” researchers on how well they stuck to their original design and then to be able to report that grade publicly, as opposed to simply allowing readers to make up their own minds in this regard. My concerns are three-fold.

First, I have absolutely no idea how such a standard would actually be applied. Would it count as violating a pre-design registry if you changed the number of subjects enrolled in a study? What if the original subject pool was unwilling to participate for the planned monetary incentive, and the incentive had to be increased, or the subject pool had to be changed? What if the pre-registry called for using one statistical model to analyze the data, but the author eventually realized that another model was more appropriate? What if a survey question that was registered on a 1-4 scale was changed to a 1-5 scale? Which, if any, of these would invalidate the faithful application of the registry? Would all of them together? It seems to me the only truly objective way to rate compliance is to have an all or nothing approach: either you do exactly what you said you would do, or you didn’t follow the registry. Of course, then we are lumping “p-value fishing” in the same category as applying a better statistical model or changing the wording of a survey question.

This brings me to my second point, which is a concern that giving people a grade for faithfully sticking to a registry could lead to people conducting sub-optimal research — and stifle creativity — out of fear that it will cost them their “A” registry-faithfulness grade. To take but one example, those of us who use survey experiments have long been taught to pre-test questions precisely because sometimes the ideas we have when sitting at our desks don’t work in practice. So if someone registers a particular technique for inducing an emotional response and then runs a pre-test and figures out their technique is not working, do we really want the researcher to use the sub-optimal design in order to preserve their faithfulness to the registered design? Or consider a student who plans to run a field experiment in a foreign country that is based on the idea that certain last names convey ethnic identity. What happens if the student arrives in the field and learns that this assumption was incorrect? Should the student stick with the bad research design to preserve the ability to publish in the “registry faithful” section of JEPS? Moreover, research sometimes proceeds in fits and starts. If as a graduate student I am able to secure funds to conduct experiments in country A but later as a faculty member can secure funds to replicate these experiments in countries B and C as well, should I fear including the results from country A in a comparative analysis because my original registry was for a single-country study? Overall, I think we have to be careful about assuming that we can have everything about a study figured out at the time we submit a registry design, and that there will be nothing left for us to learn about how to improve the research — or that there won’t be new questions that can be explored with previously collected data — once we start implementing an experiment.

At this point a fair critique to raise is that the points in the preceding paragraph could be taken as an indictment of registries generally. Here we venture more into simply a point of view, but I believe that there is a difference between asking people to document what their original plans were and giving them a chance in their own words — if they choose to do so — to explain how their research project evolved as opposed to having to deal with a public “grade” of whatever form that might take. In my mind, the former is part of producing transparent research, while the latter — however well intentioned — could prove paralyzing in terms of making adjustments during the research process or following new lines of interesting research.

This brings me to my final concern, which is that untenured faculty would end up feeling the most pressure in this regard. For tenured faculty, a publication without the requisite asterisks noting registry compliance might not end up being too big a concern — although I’m not even sure of that — but I could easily imagine junior faculty being especially worried that publications without registry asterisks could be held against them during tenure considerations.

The bottom line is that registries bring with them a host of benefits — as Jamie has nicely laid out above — but we should think carefully about how to best maximize those benefits in order to minimize new costs. Even if we could agree on how to rate a proposal in terms of faithfulness to registry design, I would suggest caution in trying to integrate ratings into the publication process.

The views expressed here are mine alone and do not represent either the Journal of Experimental Political Science or the APSA Organized Section on Experimental Research Methods.

Heading image: Interior of Rijksmuseum research library. Rijksdienst voor het Cultureel Erfgoed. CC-BY-SA-3.0-nl via Wikimedia Commons.

The post The pros and cons of research preregistration appeared first on OUPblog.

13. Q&A with Jake Bowers, co-author of 2014 Miller Prize Paper

Despite what many of my colleagues think, being a journal editor is usually a pretty interesting job. The best part about being a journal editor is working with authors to help frame, shape, and improve their research. We also have many chances to honor specific authors and their work for being of particular importance. One of those honors is the Miller Prize, awarded annually by the Society for Political Methodology for the best paper published in Political Analysis in the preceding year.

The 2014 Miller Prize was awarded to Jake Bowers, Mark M. Fredrickson, and Costas Panagopoulos, for their paper, “Reasoning about Interference Between Units: A General Framework.” To recognize the significance of this paper, it is available for free online access for the next year. The award committee summarized the contribution of the paper:

“..the article tackles a difficult and pervasive problem—interference among units—in a novel and compelling way. Rather than treating spillover effects as a nuisance to be marginalized over or, worse, ignored, Bowers et al. use them as an opportunity to test substantive questions regarding interference … Their work also brings together causal inference and network analysis in an innovative and compelling way, pointing the way to future convergence between these domains.”

In other words, this is an important contribution to political methodology.

I recently posed a number of questions to one of the authors of the Miller Prize paper, Jake Bowers, asking him to talk more about this paper and its origins.

R. Michael Alvarez: Your paper, “Reasoning about Interference Between Units: A General Framework” recently won the Miller Prize for the best paper published in Political Analysis in the past year. What motivated you to write this paper?

Jake Bowers: Let me provide a little background for readers not already familiar with randomization-based statistical inference.

Randomized designs provide clear answers to two of the most common questions that we ask about empirical research: The Interpretation Question: “What does it mean that people in group A act differently from people in group B?” and The Information Question: “How precise is our summary of A-vs-B?” (Or, more defensively, “Do we really have enough information to distinguish A from B?”).

If we have randomly assigned some A-vs-B intervention, then we can answer the interpretation question very simply: “If group A differs from group B, it is only because of the A-vs-B intervention. Randomization ought to erase any other pre-existing differences between groups A and B.”

In answering the information question, randomization alone also allows us to characterize other ways that the experiment might have turned out: “Here are all of the possible ways that groups A and B could differ if we re-randomized the A-vs-B intervention to the experimental pool while entertaining the idea that A and B do not differ. If few (or none) of these differences is as large as the one we observe, we have a lot of information against the idea that A and B do not differ. If many of these differences are as large as the one we see, we don’t have much information to counter the argument that A and B do not differ.”

Of course, these are not the only questions one should ask about research, and interpretation should not end with knowing that an input created an output. Yet, these concerns about meaning and information are fundamental and the answers allowed by randomization offer a particularly clear starting place for learning from observation. In fact, many randomization-based methods for summarizing answers to the information question tend to have validity guarantees even with small samples. If we really did repeat the experiment all the possible ways that it could have been done, and repeated a common hypothesis test many times, we would reject a true null hypothesis no more than α% of the time even if we had observed only eight people (Rosenbaum 2002, Chap 2).
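
To make this logic concrete, here is a minimal sketch of such a randomization test in Python. It is not the authors' code; the eight outcome values and the assignment are invented purely for illustration.

import numpy as np
from itertools import combinations

# Hypothetical outcomes for eight units (all values invented, for illustration only).
y = np.array([12.0, 15.0, 9.0, 14.0, 11.0, 16.0, 10.0, 13.0])
treated = (0, 1, 2, 3)                      # observed assignment: 4 of 8 units treated

def diff_in_means(outcomes, treated_idx):
    """Test statistic: mean of treated outcomes minus mean of control outcomes."""
    mask = np.zeros(len(outcomes), dtype=bool)
    mask[list(treated_idx)] = True
    return outcomes[mask].mean() - outcomes[~mask].mean()

observed = diff_in_means(y, treated)

# Under the sharp null of no effect for anyone, each unit's outcome is unchanged by
# treatment, so we can recompute the statistic under every possible assignment.
all_assignments = list(combinations(range(len(y)), len(treated)))
null_dist = np.array([diff_in_means(y, a) for a in all_assignments])

# Randomization p-value: the share of possible assignments producing a statistic
# at least as extreme as the one actually observed.
p_value = np.mean(np.abs(null_dist) >= abs(observed))
print(f"observed difference = {observed:.2f}, randomization p-value = {p_value:.3f}")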

In fact, a project with only eight cities impelled this paper. Costas Panagopoulos had administered a field experiment on newspaper advertising and turnout in eight US cities, and he and I began to discuss how to produce substantively meaningful, easy to interpret, and statistically valid answers to the question about the effect of advertising on turnout. Could we hypothesize that, for example, the effect was zero for three of the treated cities, and more than zero for one of the treated cities? The answer was yes.

I realized that hypotheses about causal effects do not need to be simple, and, furthermore, they could represent substantive, theoretical models very directly. Soon, Mark Fredrickson and I started thinking about substantive models in which treatment given to one city might have an effect on another city. It seemed straightforward to write down these models. We had read Peter Aronow’s and Paul Rosenbaum’s papers on the sharp null model of no effects and interference, and so we didn’t think we were completely off base to imagine that, if we side-stepped estimation of average treatment effects and focused on testing hypotheses, we could learn something about what we called “models of interference”. But, we had not seen this done before. So, in part because we worried about whether we were right about how simple it was to write down and test hypotheses generated from models of spillover or interference between units, we wrote the “Reasoning about Interference” paper to see if what we were doing with Panagopoulos’ eight cities would scale, and whether it would perform as randomization-based tests should. The paper shows that we were right.

R. Michael Alvarez: In your paper, you focus on the “no interference” assumption that is widely discussed in the contemporary literature on causal models. What is this assumption and why is it important?

Jake Bowers: When we say that some intervention, \(Z_i\), caused some outcome for some person, \(i\), we often mean that the outcome we would have seen for person \(i\) when the intervention is not active, \(Z_i=0\) — written as \(y_{i,Z_i=0}\) — would have been different from the outcome we would have seen if the intervention were active for that same person (at that same moment in time), \(Z_i=1\) — written as \(y_{i,Z_i=1}\). Most people would say that the treatment had an effect on person \(i\) when \(i\) would have acted differently under the intervention than under the control condition, such that \(y_{i,Z_i=1} \neq y_{i,Z_i=0}\). David Cox (1958) noticed that this definition of causal effects involves an assumption that an intervention assigned to one person does not influence the potential outcomes for another person. (Henry Brady’s piece, “Causation and Explanation in Social Science” in the Oxford Handbook of Political Methodology provides an excellent discussion of the no-interference assumption and Don Rubin’s formalization and generalization of Cox’s no-interference assumption.)

As an illustration of the confusion that interference can cause, imagine we had four people in our study, so that \(i \in \{1,2,3,4\}\). When we write that the intervention had an effect for person \(i=1\), \(y_{i=1,Z_1=1} \neq y_{i=1,Z_1=0}\), we are saying that person 1 would act the same when \(Z_1=1\) regardless of how the intervention was assigned to any other person, such that

\(y_{i=1,\{Z_1=1,Z_2=1,Z_3=0,Z_4=0\}} = y_{i=1,\{Z_1=1,Z_2=0,Z_3=1,Z_4=0\}} = y_{i=1,\{Z_1=1,\ldots\}}\)

If we do not make this assumption then we cannot write down a treatment effect in terms of a simple comparison of two groups. Even if we randomly assigned the intervention to two of the four people in this little study, we would have six potential outcomes per person rather than only two (you can see two of person 1’s six potential outcomes above). Randomization does not help us decide what a “treatment effect” means, and six counterfactuals per person poses a challenge for the conceptualization of causal effects.

So, interference is a problem with the definition of causal effects. It is also a problem with estimation. Many folks know about what Paul Holland (1986) calls the “Fundamental Problem of Causal Inference” that the potential outcomes heuristic for thinking about causality reveals: we cannot ever know the causal effect for person \(i\) directly because we can never observe both potential outcomes. I know of three main solutions for this problem, each of which has to deal with problems of interference:

  • Jerzy Neyman (1923) showed that if we change our substantive focus from individual level to group level comparisons, and to averages in particular, then randomization would allow us to learn about the true, underlying, average treatment effect using the difference of means observed in the actual study (where we only see responses to intervention for some but not all of the experimental subjects).
  • Don Rubin (1978) showed a Bayesian predictive approach — a probability model of the outcomes of your study and a probability model for the treatment effect itself allow you to predict the unobserved potential outcomes for each person in your study and then take averages of those predictions to produce an estimate of the average treatment effect.
  • Ronald Fisher (1935) suggested another approach which maintained attention on the individual level potential outcomes, but did not use models to predict them. He showed that randomization alone allows you to test the hypothesis of “no effects” at the individual level. Interference makes it difficult to interpret Neyman’s comparisons of observed averages and Rubin’s comparison of predicted averages as telling us about causal effects because we have too many possible averages.

It turns out that Fisher’s sharp null hypothesis test of no effects is simple to interpret even when we have unknown interference between units. Our paper starts from that idea and shows that, in fact, one can test sharp hypotheses about effects rather than only no effects.

Note that there has been a lot of great recent work trying to define and estimate average treatment effects by folks like Cyrus Samii and Peter Aronow, Neelan Sircar and Alex Coppock, Panos Toulis and Edward Kao, Tyler Vanderweele, Eric Tchetgen Tchetgen and Betsy Ogburn, Michael Sobel, and Michael Hudgens, among others. I also think that interference poses a smaller problem for Rubin’s approach in principle — one would add a model of interference to the list of models (of outcomes, of intervention, of effects) used to predict the unobserved outcomes. (This approach has been used, without formalization in terms of counterfactuals, in both the spatial and network models worlds.) One might then focus on posterior distributions of quantities other than simple differences of averages, or interpret such differences as reflecting the kinds of weightings used in the work that I gestured to at the start of this paragraph.

R. Michael Alvarez: How do you relax the “no interference” assumption in your paper?

Jake Bowers: I would say that we did not really relax an assumption, but rather side-stepped the need to think of interference as an assumption. Since we did not use the average causal effect, we were not facing the same problems of requiring that all potential outcomes collapse down to two averages. However, what we had to do instead was use what Paul Rosenbaum might call Fisher’s solution to the fundamental problem of causal inference. Fisher noticed that, even if you couldn’t say that a treatment had an effect on person \(i\), you could ask whether we had enough information (in our design and data) to shed light on a question about whether or not the treatment had an effect on person \(i\). In our paper, Fisher’s approach meant that we did not need to define our scientifically interesting quantity in terms of averages. Instead, we had to write down hypotheses about no interference. That is, we did not really relax an assumption, but instead we directly modelled a process.

Rosenbaum (2007) and Aronow (2011), among others, had noticed that the hypothesis that Fisher is most famous for, the sharp null hypothesis of no effects, in fact does not assume no interference, but rather implies no interference (i.e., if the treatment has no effect for any person, then it does not matter how treatment has been assigned). So, in fact, the assumption of no interference is not really a fundamental piece of how we talk about counterfactual causality, but a by-product of a commitment to the use of a particular technology (simple comparisons of averages). We took a next step in our paper and realized that Fisher’s sharp null hypothesis implied a particular, and very simple, model of interference (a model of no interference). We then set out to see if we could write other, more substantively interesting models of interference. So, that is what we show in the paper: one can write down a substantive theoretical model of interference (and of the mechanism for an experimental effect to come to matter for the units in the study) and then this model can be understood as a generator of sharp null hypotheses, each of which could be tested using the same randomization inference tools that we have been studying for their clarity and validity previously.
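
To illustrate how a substantive model can generate sharp hypotheses that are testable with the same randomization machinery, here is a deliberately simplified sketch. It is not the model or the code from the Bowers, Fredrickson, and Panagopoulos paper: it assumes a constant direct effect plus a constant spillover to untreated neighbors on a toy network, and every number in it is invented.

import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data for six units on a small chain network (all values invented).
A = np.array([[0, 1, 0, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 0],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 0, 1, 0]])
z_obs = np.array([1, 0, 0, 1, 0, 1])                    # observed assignment
y_obs = np.array([14.0, 11.5, 9.0, 15.0, 12.5, 13.0])   # observed outcomes

def to_uniformity(y, z, tau, delta):
    """Under the hypothesized model (a constant direct effect tau on treated units,
    plus a constant spillover delta on control units with at least one treated
    neighbor), recover the outcomes we would have seen had nobody been treated."""
    exposed = (A @ z > 0) & (z == 0)
    return y - tau * z - delta * exposed

def diff_in_means(y, z):
    return y[z == 1].mean() - y[z == 0].mean()

def p_value(tau, delta, n_perm=10000):
    """Randomization p-value for the sharp hypothesis (tau, delta). If the hypothesis
    were true, the uniformity outcomes would be a fixed vector, so permuting the
    assignment gives a valid reference distribution for the test statistic."""
    y0 = to_uniformity(y_obs, z_obs, tau, delta)
    observed = abs(diff_in_means(y0, z_obs))
    perm = np.array([abs(diff_in_means(y0, rng.permutation(z_obs)))
                     for _ in range(n_perm)])
    return np.mean(perm >= observed)

# Hypotheses with small p-values are implausible; the set of (tau, delta) pairs that
# are not rejected forms a confidence set for the direct and spillover effects.
print(p_value(tau=0.0, delta=0.0), p_value(tau=3.0, delta=1.0))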

R. Michael Alvarez: What are the applications for the approach you develop in your paper?

Jake Bowers: We are working on a couple of applications. In general, our approach is useful as a way to learn about substantive models of the mechanisms for the effects of experimental treatments.

For example, Bruce Desmarais, Mark Fredrickson, and I are working with Nahomi Ichino, Wayne Lee, and Simi Wang on how to design randomized experiments to learn about models of the propagation of treatments across a social network. If we think that an experimental intervention on some subset of Facebook users should spread in some certain manner, then we are hoping to have a general way to think about how to design that experiment (using our approach to learn about that propagation model, but also using some of the new developments in network-weighted average treatment effects that I referenced above). Our very early work suggests that, if treatment does propagate across a social network following a common infectious disease model, you might prefer to assign relatively few units to direct intervention.

In another application, Nahomi Ichino, Mark Fredrickson, and I are using this approach to learn about agent-based models of the interaction of ethnicity and party strategies of voter registration fraud using a field experiment in Ghana. To improve our formal models, another collaborator, Chris Grady, is going to Ghana to do in-depth interviews with local party activists this fall.

R. Michael Alvarez: Political methodologists have made many contributions to the area of causal inference. If you had to recommend to a graduate student two or three things in this area that they might consider working on in the next year, what would they be?

Jake Bowers: About advice for graduate students: Here are some of the questions I would love to learn about.

  • How should we move from formal, equilibrium-oriented, theories of behavior to models of mechanisms of treatment effects that would allow us to test hypotheses and learn about theory from data?
  • How can we take advantage of estimation-based procedures or procedures developed without specific focus on counterfactual causal inference if we want to make counterfactual causal inferences about models of interference? How should we reinterpret or use tools from spatial analysis like those developed by Rob Franzese and Jude Hayes or tools from network analysis like those developed by Mark Handcock to answer causal inference questions?
  • How can we provide general advice about how to choose test statistics to summarize the observable implications of these theoretical models? We know that the KS test used in our article is pretty low-powered. And we know from Rosenbaum (Chap 2, 2002) that certain classes of test statistics have excellent properties in one dimension, but I wonder about general properties of multi-parameter models and test statistics that can be sensitive to multi-way differences in distribution between experimental groups.
  • How should we apply ideas from randomized studies to the observational world? What does adjustment for confounding/omitted variable bias (by matching or “controlling for” or weighting) mean in the context of social networks or spatial relations? How should we do and judge such adjustment? And what might Rosenbaum-inspired sensitivity analysis or Manski-inspired bounds analysis mean when we move away from testing one parameter or estimating one quantity?

R. Michael Alvarez: You do a lot of work with software tool development and statistical computing. What are you working on now that you are most excited about?

Jake Bowers: I am working on two computationally oriented projects that I find very exciting. The first involves using machine learning/statistical learning for optimal covariance adjustment in experiments (with Mark Fredrickson and Ben Hansen). The second involves collecting thousands of hand-drawn maps on Google maps as GIS objects to learn about how people define and understand the places where they live in Canada, the United Kingdom, and the United States (with Cara Wong, Daniel Rubenson, Mark Fredrickson, Ashlea Rundlett, Jane Green, and Edward Fieldhouse).

When an experimental intervention has produced a difference in outcomes, comparisons of treated to control outcomes can sometimes fail to detect this effect, in part because the outcomes themselves are naturally noisy in comparison to the strength of the treatment effect. We would like to reduce the noise that is unrelated to treatment (say, remove the noise related to background covariates, like education) without ever estimating a treatment effect (or testing a hypothesis about a treatment effect). So far, people shy away from using covariates for precision enhancement of this type because every model in which they soak up noise with covariates is also a model in which they look at the p-value for their treatment effect. This project learns from the growing literature in machine learning (aka statistical learning) to turn specification of the covariance adjustment part of a statistical model over to an automated system focused on the control group only, which thus bypasses concerns about data snooping and multiple p-values.
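
The rough idea, fitting the covariance adjustment on the control group only and then running an ordinary randomization test on the residualized outcomes, can be sketched as follows. This is only an illustration of the intuition on simulated data, not the procedure Bowers, Fredrickson, and Hansen are developing, which takes considerably more care to guarantee the validity of the resulting test.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Simulated experiment (all values invented): covariates predict the outcome,
# and the treatment adds a modest constant effect.
n = 400
X = rng.normal(size=(n, 3))                      # background covariates
z = rng.permutation(np.repeat([0, 1], n // 2))   # complete randomization
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.5 * z + rng.normal(size=n)

# Fit the covariance adjustment on the control group only, so the fitting step
# never "sees" a treated outcome or a treatment-effect p-value.
adjuster = RandomForestRegressor(n_estimators=200, random_state=0)
adjuster.fit(X[z == 0], y[z == 0])

# Residualize all outcomes against the control-group fit, then run an ordinary
# randomization test on the residualized outcomes.
e = y - adjuster.predict(X)

def diff_in_means(outcome, assign):
    return outcome[assign == 1].mean() - outcome[assign == 0].mean()

observed = diff_in_means(e, z)
perm = np.array([diff_in_means(e, rng.permutation(z)) for _ in range(2000)])
p = np.mean(np.abs(perm) >= abs(observed))
print(f"adjusted difference = {observed:.3f}, randomization p-value = {p:.4f}")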

The second project involves using Google maps embedded in online surveys to capture hand-drawn maps representing how people respond when asked to draw the boundaries of their “local communities.” So far we have over 7000 such maps from a large survey of Canadians, and we plan to have data from this module carried on the British Election Study and the US Cooperative Congressional Election Study within the next year. We are using these maps and associated data to add to the “context/neighborhood effects” literature, to learn how individuals’ psychological understandings of place relate to Census measurements and also to individual-level attitudes about inter-group relations and public goods provision.

Headline image credit: Abstract city and statistics. CC0 via Pixabay.

The post Q&A with Jake Bowers, co-author of 2014 Miller Prize Paper appeared first on OUPblog.

14. Research replication in social science: reflections from Nathaniel Beck

Introduction from Michael Alvarez, co-editor of Political Analysis:

Questions about data access, research transparency and study replication have recently become heated in the social sciences. Professional societies and research journals have been scrambling to respond; for example, the American Political Science Association established the Data Access and Research Transparency committee to study these issues and to issue guidelines and recommendations for political science. At Political Analysis, the journal that I co-edit with Jonathan N. Katz, we require that all of the papers we publish provide replication data, typically before we send the paper to production. These replication materials get archived at the journal’s Dataverse, which provides permanent and easy access to these materials. Currently we have over 200 sets of replication materials archived there (more arriving weekly), and our Dataverse has seen more than 13,000 downloads of replication materials.

Due to the interest in replication, data access, and research transparency in political science and other social sciences, I’ve asked a number of methodologists who have been front-and-center in political science with respect to these issues to provide their thoughts and comments about what we do in political science, how well it has worked so far, and what the future might hold for replication, data access, and research transparency. I’ll also be writing more about what we have done at Political Analysis.

The first of these discussions are reflections from Nathaniel Beck, Professor of Politics at NYU, who is primarily interested in political methodology as applied to comparative politics and international relations. Neal is a former editor of Political Analysis, chairs our journal’s Advisory Board, and is now heading up the Society for Political Methodology’s own committee on data access and research transparency. Neal’s reflections provide some interesting perspectives on the importance of replication for his research and teaching efforts, and shed some light more generally on what professional societies and journals might consider for their policies on these issues.

Research replication in social science: reflections from Nathaniel Beck

Replication and data access have become hot topics throughout the sciences. As a former editor of Political Analysis and the chair of the Society for Political Methodology‘s Data Access and Research Transparency (DA-RT) committee, I have been thinking about these issues a lot lately. But here I simply want to share a few recent experiences (two happy, one at this moment less so) which have helped shape my thinking on some of these issues. I note that in none of these cases was I concerned that the authors had done anything wrong, though of course I was concerned about the sensitivity of results to key assumptions.

The first happy experience relates to an interesting paper on the impact of having an Islamic mayor on educational outcomes in Turkey by Meyerson published recently in Econometrica. I first heard about the piece from some students, who wanted my opinion on the methodology. Since I am teaching a new (for me) course on causality, I wanted to dive more deeply into the regression discontinuity design (RDD) as used in this article. Coincidentally, a new method for doing RDD was presented at the recent (2014) meetings of the Society for Political Methodology by Rocio Titiunik. I wanted to see how her R code worked with interesting comparative data. All recent Econometrica articles are linked to both replication and supplementary materials on the Econometrica web site. It took perhaps 15 minutes to make sure that I could run Stata on my desktop and get the same results as in the article. So thanks to both Meyerson and Econometrica for making things so easy.

I gained from this process, getting a much better feel for real RDD data analysis so I can say more to my students than “the math is correct.” My students gain by seeing a first rate application that interests them (not a toy, and not yet another piece on American elections). And Meyerson gains a few readers who would not normally peruse Econometrica, and perhaps more cites in the ethnicity literature. And thanks to Titiunik for making her R code easily accessible.

The second happy experience was similar to the first, but also opened my eyes to my own inferior practice. At the same Society meetings, I was the discussant on a paper by Grant and Lebo on using fractional integration methods. I had not thought about such methods in a very long time, and believed (based on intuition and no evidence to the contrary) that using fractional integration methods led to no changes in substantive findings. But clearly one should base arguments on evidence and not intuition. I decided to compare the results of a fractional integration study by Box-Steffensmeier and Smith with the results of a simpler analysis. Their piece had a footnote saying the data were available through the ICPSR (excellent by the standards of 1998). Alas, on going to the ICPSR web site I could not find the data (noting that lots of things have happened since 1998 and who knows if my search was adequate). Fortunately I know Jan so I wrote to her, and she kindly replied that the data were on her Dataverse at Harvard. A minute later I had the data and was ready to try to see if my intuitions might indeed be supported by evidence.

Typing on Keyboard – Male Hand by Dave Dugdale. CC BY-SA 2.0 via Flickr.

This experience made me think: could someone find my replication data sets? For as long as I can remember (at least back to 1995), I have always posted my replication data sets somewhere. Articles written until 2003 sent readers to my public ftp site at UCSD. But UCSD has changed the name and file structure of that server several times since 2003, and for some reason they did not feel obligated to keep my public ftp site going (and I was not worried enough about replication to think of moving that ftp site to NYU). Fortunately I can usually find the replication files if anyone writes me, and if I cannot, my various more careful co-authors can find the data. But I am sure that I am not the only person to have replication data on obsolete servers. Thankfully Political Analysis has required me to put my data on the Political Analysis Dataverse so I no longer have to remember to be a good citizen. And my resolution is to get as many replication data sets from old pieces as possible onto my own Harvard Dataverse. I will feel less hypocritical once that is done. It would be very nice if other authors emulated Jan!

The possibly less happy outcome relates to the recent article in PNAS on a Facebook experiment on social contagion. The authors, in a footnote, said that replication data was available by writing to the authors. I wrote twice, giving them a full month, but heard nothing. I then wrote to the editor of PNAS who informed me that the lead author had both been on vacation and was overwhelmed with responses to the article. I am promised that the check is in the mail.

What editor wants to be bothered by fielding inquiries about replication data sets? What author wants to worry about going on vacation (and forgetting to set a vacation message)? How much simpler the world would have been for the authors, editor, and me, if PNAS simply followed the good practice of Political Analysis, the American Journal of Political Science, the Quarterly Journal of Political Science, Econometrica, and (if rumors are correct) soon the American Political Science Review of demanding that authors post, either on the journal web site or the journal Dataverse, all replication materials before an article is actually published? Why does not every journal do this?

A distant second best is to require authors to post their replication data on their personal websites. As we have seen from my experience, this often leads to lost or non-working URLs. While the simple solution here is the Dataverse, surely at a minimum authors should provide a standard Digital Object Identifier (DOI), which should persist even as machine names change. But the Dataverse solution does this, and so much more, so it seems odd in this day and age for all journals not to use it. And we can all be good citizens and put our own pre-replication-standard datasets on our own Dataverses. All of this is as easy as (and maybe easier than) maintaining private data web pages, and one can rest easy that one’s data will be available until either Harvard goes out of business or the sun burns out.

Featured image: BalticServers data center by Fleshas CC-BY-SA-3.0 via Wikimedia Commons.

The post Research replication in social science: reflections from Nathaniel Beck appeared first on OUPblog.

15. Publishing tips from a journal editor: selecting the right journal

One of the most common questions that scholars confront is trying to find the right journal for their research papers. When I go to conferences, often I am asked: “How do I know if Political Analysis is the right journal for my work?”

This is an important question, in particular for junior scholars who don’t have a lot of publishing experience — and for scholars who are nearing important milestones (like contract renewal, tenure, and promotion). In a publishing world where it may take months for an author to receive an initial decision from a journal, and then many additional months if they need to revise and resubmit their work to one or more subsequent journals, selecting the most appropriate journal can be critical for professional advancement.

So how can a scholar try to determine which journal is right for their work?

The first question an author needs to ask is how suitable their paper is for a particular journal. When I meet with my graduate students, and we talk about potential publication outlets for their work, my first piece of advice is that they should take a close look at the last three or four issues of the journals they are considering. I’ll recommend that they look at the subjects that each journal is focusing on, including both substantive topics and methodological approaches. I also tell them to look closely at how the papers appearing in those journals are structured and how they are written (for example, how long the papers typically are, and how many tables and figures they have). The goal is to find a journal that is currently publishing papers that are most closely related to the paper that the student is seeking to publish, as assessed by the substantive questions typically published, the methodological approaches generally used, paper framing, and manuscript structure.

Potential audience is the second consideration. Different journals have different readers — meaning that authors can have some control over who might be exposed to their paper when they decide which journals to target for their work. This is particularly true for authors who are working on highly interdisciplinary projects, where they might be able to frame their paper for publication in related but different academic fields. In my own work on voting technology, for example, some of my recent papers have appeared in journals that have their primary audience in computer science, while others have appeared in more typical political science journals. So authors need to decide in many cases which audience they want to appeal to, and make sure that when they submit their work to a journal that appeals to that audience, the paper is written in an appropriate manner for that journal.

Peer reviewer for Scientific Review by Center for Scientific Review. Public domain via Wikimedia Commons.

However, most authors will want to concentrate on journals in a single field. For those papers, a third question arises: whether to target a general interest journal or a more specialized field journal. This is often a very subjective question, as it is quite hard to know prior to submission whether a particular paper will be interesting to the editors and reviewers of a general interest journal. As general interest journals often have higher impact factors (I’ll say more about impact factors next), many authors will be drawn to submit their papers to general interest journals even if that is not the best strategy for their work. Many authors will “start high”, that is begin with general interest journals, and then once the rejection letters pile up, they will move to the more specialized field journals. While this strategy is understandable (especially for authors who are nearing promotion or tenure deadlines), it may also be counterproductive — the author will likely face a long and frustrating process getting their work published, if they submit first to general interest journals, get the inevitable rejections, and then move to specialized field journals. Thus, my advice (and my own practice with my work) is to avoid that approach, and to be realistic about the appeal of the particular research paper. That is, if your paper is going to appeal only to readers in a narrow segment of your discipline, then send it to the appropriate specialized field journal.

A fourth consideration is the journal’s impact factor. Impact factors are playing an increasingly important role in many professional decisions, and they may be a consideration for many authors. Clearly, an author should generally seek to publish their work in journals that have higher impact than those that are lower impact. But again, authors should try to be realistic about their work, and make sure that regardless of the journal’s impact factor that their submission is appropriate for the journal they are considering.

Finally, authors should always seek the input of their faculty colleagues and mentors if they have questions about selecting the right journal. And in many fields, journal editors, associate editors, and members of the journal’s editorial board will often be willing to give an author some quick and honest advice about whether a particular paper is right for their journal. While many editors shy away from giving prospective authors advice about a potential submission, giving authors some brief and honest advice can actually save the editor and the journal a great deal of time. It may be better to save the author (and the journal) the time and effort that might get sunk into a paper that has little chance at success in the journal, and help guide the author to a more appropriate journal.

Selecting the right journal for your work is never an easy process. All scholars would like to see their work published in the most widely read and highest impact factor journals in their field. But very few papers end up in those journals, and authors can get their work into print more quickly and with less frustration if they first make sure their paper is appropriate for a particular journal.

Heading image: OSU William Oxley Thompson Memorial Library Stacks by Ibagli. Public Domain via Wikimedia Commons.

The post Publishing tips from a journal editor: selecting the right journal appeared first on OUPblog.

16. Improving survey methodology: a Q&A with Lonna Atkeson

By R. Michael Alvarez


I recently had the opportunity to talk with Lonna Atkeson, Professor of Political Science and Regents’ Lecturer at the University of New Mexico. We discussed her opinions about improving survey methodology and her thoughts about how surveys are being used to study important applied questions. Lonna has written extensively about survey methodology, and has developed innovative ways to use surveys to improve election administration (her 2012 study of election administration is a wonderful example).

In the current issue of Political Analysis is the Symposium on Advances in Survey Methodology, which Lonna and I co-edited; in addition to the five research articles in the Symposium, we wrote an introduction that puts each of the research articles in context and talks about the current state of research in survey methodology. Also, Lonna and I are co-editing the Oxford Handbook on Polling and Polling Methods, which is in initial stages of development.

It’s well known that response rates for traditional telephone surveying have declined dramatically. What’s the solution? How can survey researchers produce quality data given low response rates with traditional telephone survey approaches?

What we’ve learned about response rates is that they are not the be-all and end-all as an evaluative tool for the quality of a survey, which is a good thing because response rates are ubiquitously low! There is mounting evidence that response rates per se are not necessarily reflective of problems of nonresponse. Nonresponse error appears to be more related to the response rate interacting with the characteristics of the nonrespondents. Thus, if survey topic salience leads to response bias, then nonresponse error becomes a problem; but in and of itself the response rate is only indirect evidence of a potential problem. One potential solution to falling response rates is to use mixed-mode surveys and find the best contact and response option for the respondent. As polling becomes more and more sophisticated, we need to consider the best contact and response methods for different types of sample members. Survey researchers need to be able to predict the most likely response option for each individual and pursue that strategy.


Close up of a man smiling on the line through a headset. © cenix via iStockphoto.

Much of your recent work uses “mixed-mode” survey methods. What’s a “mixed-mode” survey? What are the strengths and weaknesses of this approach?

Mixed mode surveys use multiple methods to contact or receive information from respondents. Thus, mixed mode surveys involve both mixtures of data collection and communications with the respondent. For example, a mixed mode survey might contact sample members by phone or mail and then have them respond to a questionnaire over the Internet. Alternatively a mixed mode survey might allow for multiple forms of response. For example, sample frame members may be able to complete the interview over the phone, by mail, or on the web. Thus a respondent who does not respond over the Internet may in subsequent contact receive a phone call or a FTF visit or may be offered a choice of response mode on the initial contact.

When you see a poll or survey reported online or in the news media, how do you determine if the poll was conducted in a way that has produced reliable data? What indicates a high-quality poll?

This is a difficult question because all polls are not created equal, and many reported polls might have problems with sampling, nonresponse bias, question wording, and so on. The point is that error creeps into a survey in many places, not just one, and to evaluate a poll researchers like to think in terms of total survey error; the tools for that evaluation, however, are still in the development stage, and this is an area of opportunity for survey researchers and political methodologists. We also need to consider a total survey error approach to how survey context, which now varies tremendously, influences respondents and what that means for our models and inferences. This is an area for continued research. Nevertheless, the first criterion for examining a poll ought to be its transparency. Polling data should include information on who funded the poll, a copy of the instrument, a description of the sampling frame and sampling design (e.g. probability or non-probability), the study size, estimates of sampling error for probability designs, information on any weighting of the data, and how and when the data were collected. These are basic criteria that are necessary to evaluate the quality of the poll.

Clearly, as our symposium on survey methodology in the current issue of Political Analysis discusses, survey methodology is at an important juncture. What’s the future of public opinion polling?

Survey research is a rapidly changing environment with new methods for respondent contacting and responding. Perhaps the biggest change in the most recent decade is the move away from predominantly interviewer driven data collection methods (e.g. phone, FTF) to respondent driven data collection methods (e.g. mail, Internet, CASI), the greater use of mixed mode surveys, and the introduction of professional respondents who participate over long periods of time in discontinuous panels. We are just beginning to figure out how all these pieces fit together and we need to come up with better tools to assess the quality of data we are obtaining. The future of polling and its importance in the discipline, in marketing, and in campaigns will continue, and as academics we need to be at the forefront of evaluating these changes and their impact on our data. We tend to brush over the quality of data in favor of massaging the data statistically or ignoring issues of quality and measurement altogether. I’m hoping the changing survey environment will bring more political scientists into an important interdisciplinary debate about public opinion as a methodology as opposed to the study of the frequencies of opinions. To this end, I have a new Oxford Handbook, along with my co-editor Mike Alvarez, on polling and polling methods that will take a closer look at many of these issues and be a helpful guide for current and future projects.

In your recent research on election administration, you use polling techniques as tools to evaluate elections. What have you learned from these studies, and based on your research what do you see are issues that we might want to pay close attention to in this fall’s midterm elections in the United States?

We’ve learned so much from our election administration work about designing polling places, training poll workers, mixed-mode surveys, and more generally evaluating the election process. In New Mexico, for example, we have been interviewing both poll workers and voters since 2006, giving us five election cycles, including 2014, that provide an overall picture of the current state of election administration and how it’s doing relative to past election cycles. Our multi-method approach provides continuous evaluation, review, and improvement to New Mexico elections. This fall I think there are many interesting questions. We are interested in some election reform questions about purging voter registration files, open primaries, the straight party ballot option, and felon re-enfranchisement. We are also especially interested in how voters decide whether to vote early or on Election Day and, on Election Day, where they decide to vote if they are using voting convenience centers instead of precincts. This is an important policy question, but where we place vote centers might impact turnout or voter satisfaction or confidence. We are also very interested in election lines and their impact on voters. In 2012 we found that voters on average can fairly easily tolerate lines of about half an hour, but feel there are administrative problems when lines grow longer. We want to continue to drill down on this question and examine when lines deter voters or create poor experiences that reduce the quality of the voting experience.

Lonna Rae Atkeson is Professor of Political Science and Regents’ Lecturer at the University of New Mexico. She is a nationally recognized expert in the area of campaigns, elections, election administration, survey methodology, public opinion and political behavior and has written numerous articles, book chapters, monographs and technical reports on these topics. Her work has been supported by the National Science Foundation, the Pew Charitable Trusts, the JEHT Foundation, the Galisano Foundation, the Bernalillo County Clerk, and the New Mexico Secretary of State. She holds a BA in political science from the University of California, Riverside and a Ph.D. in political science from the University of Colorado, Boulder.

R. Michael Alvarez is a professor of Political Science at Caltech. His research and teaching focuses on elections, voting behavior, and election technologies. He is editor-in-chief of Political Analysis with Jonathan N. Katz.

Political Analysis chronicles the exciting developments in the field of political methodology, with contributions to empirical and methodological scholarship outside the diffuse borders of political science. It is published on behalf of The Society for Political Methodology and the Political Methodology Section of the American Political Science Association. Political Analysis is ranked #5 out of 157 journals in Political Science by 5-year impact factor, according to the 2012 ISI Journal Citation Reports. Like Political Analysis on Facebook and follow @PolAnalysis on Twitter.


The post Improving survey methodology: a Q&A with Lonna Atkeson appeared first on OUPblog.

17. Improving the quality of surveys: a Q&A with Daniel Oberski

By R. Michael Alvarez


Empirical work in political science must be based on strong, scientifically accurate measurements. However, the problem of measurement error hasn’t been sufficiently addressed. Willem Saris and Daniel Oberski’s Survey Quality Prediction software was recently developed to better predict reliability and method variance, and it is receiving the 2014 Warren J. Mitofsky Innovators Award from the American Association for Public Opinion Research. I sat down with Political Analysis contributor Daniel Oberski, a postdoctoral researcher in latent variable modeling and survey methodology at Tilburg University’s department of methodology and statistics, to discuss the software, surveys, latent variables, and interdisciplinary research.

Your “Survey Quality Prediction” (SQP) software (developed with Willem Saris of Pompeu Fabra University in Spain) is receiving the 2014 Warren J. Mitofsky Innovators Award from the American Association for Public Opinion Research. What motivated you and Willem to develop this software?

Survey questions are important measurement instruments, whose design and use we need to approach scientifically. After all, even though nowadays we have “big data” and neuroimaging, one of the most effective ways of finding things out about people remains to just ask them (see also this keynote lecture by Mick Couper). But survey questions are not perfect: we must recognize and account for measurement error. That is the main motivation for SQP.

Willem started working on estimating measurement error together with the University of Michigan’s Frank Andrews in the late 1980s and has made it his life’s work to gather as much information as possible about the quality of different types of survey questions. In physics, it is quite customary to dedicate one’s career to measurement; my father, for example, spent the better part of his career measuring how background radiation might interfere with CERN experiments – just so this interference could later be corrected for. In the social sciences, this kind of project is rare. Thanks to Willem’s decades of effort, however, we were able to perform a meta-analysis over the results of his experiments, linking questions’ characteristics to their reliability so that this could be accounted for.

We then created a web application that allows the user to predict reliability and method variance from a question’s characteristics, based on a meta-analysis of over 3000 questions. The goal is to allow researchers to recognize measurement error, choose the best measurement instruments for their purpose, and account for the effects of errors in their analyses of interest.
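To make the idea concrete, here is a minimal sketch of predicting question reliability from coded characteristics via a meta-regression. The predictors, data values, and the simple linear model are all invented for illustration; SQP’s actual prediction model is far richer and is not reproduced here.

```python
import numpy as np

# Hypothetical meta-analysis data: each row is a survey question coded on
# a few characteristics (all predictors and values here are invented).
# Columns: intercept, number of response categories, offers "don't know" (0/1)
X = np.array([
    [1.0,  5, 0],
    [1.0,  7, 0],
    [1.0,  4, 1],
    [1.0, 11, 0],
    [1.0,  5, 1],
])
# Reliability estimates for those questions (invented stand-ins for experimental results)
r = np.array([0.72, 0.78, 0.65, 0.83, 0.69])

# Fit a simple linear meta-regression by least squares
beta, *_ = np.linalg.lstsq(X, r, rcond=None)

# Predict the reliability of a new question: 7 categories, no "don't know" option
new_question = np.array([1.0, 7, 0])
print(f"Predicted reliability: {new_question @ beta:.2f}")
```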

How can survey researchers use the SQP software package, and why is it an important tool for survey researchers?

People who use surveys are often (usually?) interested in relationships between different variables, and measurement error can wreak havoc on such estimates. There are two possible reactions to this problem.

The first is hope: maybe measurement error is not that bad, or perhaps bias will be in some known direction, for example towards zero. Unfortunately, we see in our experiments that this hope is, on average, unfounded.

The second possibility is to estimate measurement error so that it can be accounted for. Accounting for measurement error can be done using any of a number of well-known and easily available methods, such as structural equation modeling or Bayesian priors. The tricky part lies in estimating the amount of measurement error. Not every researcher will have the resources and opportunity to conduct a multitrait-multimethod experiment, for example. Another issue is how to properly account for the additional uncertainties involved with the measurement error correction itself.

This is where SQP comes in.

Anybody can code the survey question they wish to analyze on a range of characteristics and obtain an estimate of the reliability and amount of common method bias in that question. An estimate of the prediction uncertainty is also given. This information can then be used to correct estimates of relationships between variables for measurement error.
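The logic of that correction can be illustrated with the classical disattenuation formula: measurement error shrinks an observed correlation toward zero, and dividing by the square root of the product of the two questions’ reliabilities recovers, in expectation, the correlation between the underlying variables. The simulation below is a minimal sketch of that idea; the reliability values are invented and are not SQP output.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Latent variables of actual interest, correlated at 0.6
true_corr = 0.6
latent = rng.multivariate_normal([0.0, 0.0],
                                 [[1.0, true_corr], [true_corr, 1.0]], size=n)

# Observed answers = latent value + random measurement error.
# Reliabilities of 0.7 and 0.8 are invented for illustration.
rel_x, rel_y = 0.7, 0.8
x = latent[:, 0] + rng.normal(0.0, np.sqrt((1 - rel_x) / rel_x), n)
y = latent[:, 1] + rng.normal(0.0, np.sqrt((1 - rel_y) / rel_y), n)

observed = np.corrcoef(x, y)[0, 1]              # attenuated toward zero
corrected = observed / np.sqrt(rel_x * rel_y)   # classical disattenuation

print(f"true {true_corr:.2f}  observed {observed:.3f}  corrected {corrected:.3f}")
```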

The SQP software package seems to be part of a general trend, where survey researchers are developing tools and technologies to automate and improve aspects of survey design and implementation. What other aspects of survey research do you see being improved by other tools in the near future?

SQP deals with measurement error, but there is also nonresponse error, coverage error, editing/processing error, etc. Official statistics agencies as well as commercial survey companies have been developing automated tools for survey implementation since the advent of computer-assisted data collection. More recent is the incorporation of survey experiments to aid decisions on survey design. This is at least partly driven by more easily accessible technology, by the growth of survey methodology as it is being discovered by other fields, and by some US institutions’ insistence on survey experiments. All of this means that we are gathering evidence at a fast rate on a wide range of quality issues in surveys, even if that evidence is not always as accessible as we would like it to be.

I can very well imagine that meta-analyses over this kind of information are eventually encoded into expert systems. These would be like SQP, but for nonresponse, noncoverage, and so on. This could allow for the kind of quality evaluations and predictions on all aspects of survey research that is necessary for social science. It would be a large project, but it is doable.

Political Analysis has published two of your papers that focus on “latent variables.” What is the connection between your survey methodology research and your work on latent variables?

I have heard it claimed that all variables of interest are observed and latent variables are useful as an intermediary tool at best. I think the opposite is true. Social science is so difficult because the variables we observe are almost never the variables of actual interest: there is always measurement error. That is why we need latent variable models to model relationships between the variables of actual interest.

On the one hand, latent variable models have been mostly developed within fields that traditionally were not overly concerned with representativeness, although that could be changing now. On the other hand, after some developments in the early 1960s at the US Census Bureau, the survey methodology field has some catching up to do on the advances in measurement error modeling made since then. Part of what I do involves “introducing” different fields’ methods to each other so both are improved. Another part, which concerns the papers in PA, is about solving some of the unique, new problems that come up when you do that.

For example, when correcting measurement error (misclassification) by fixing an estimate of the error rates in maximum likelihood analysis, how does one properly account for the fact that this error rate is itself only an estimate, as it would be when obtained from SQP? We dealt with this for linear models in a paper with Albert Satorra in the journal Structural Equation Modeling, while a PA paper, on which Jeroen Vermunt and I are co-authors with our PhD student Zsuzsa Bakk, deals with categorical latent and observed variables. Another problem in survey methodology is how to decide that groups are “comparable enough” for the purposes at hand (a.k.a. “comparability”, “equivalence”, or “invariance”). I introduced a tool for looking at this problem using latent variable models in this PA paper.
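One simple way to see the problem of an error rate that is “itself only an estimate” is a Monte Carlo sketch: repeat the correction over plausible draws of the reliability from its prediction distribution and look at the spread of the corrected estimate. This is only an illustration of the issue, not the method developed in the papers mentioned above, and all numerical values below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed correlation from the survey data (illustrative value only)
observed_corr = 0.45

# Predicted question qualities and their prediction uncertainty.
# Means and standard errors are invented, not actual SQP predictions.
rel_x_mean, rel_x_se = 0.70, 0.05
rel_y_mean, rel_y_se = 0.80, 0.04

# Monte Carlo: draw plausible reliabilities, disattenuate under each draw
draws = 5_000
rel_x = np.clip(rng.normal(rel_x_mean, rel_x_se, draws), 0.05, 1.0)
rel_y = np.clip(rng.normal(rel_y_mean, rel_y_se, draws), 0.05, 1.0)
corrected = observed_corr / np.sqrt(rel_x * rel_y)

lo, hi = np.percentile(corrected, [2.5, 97.5])
print(f"corrected correlation ~ {corrected.mean():.3f} "
      f"(95% interval {lo:.3f} to {hi:.3f})")
```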

Your research is quite interdisciplinary. What advice would you give to graduate students who are interested in work on interdisciplinary topics like you have? Any tips for how they might seek to publish their work?

I never really set out to be interdisciplinary on purpose. To some extent it is because I am interested in everything, and to another extent it is all part of being a statistician. Tukey supposedly said, “you get to play in everyone’s backyard,” and whether or not he really said it, that is also what I love about it.

I am only in the beginning of my career (I hope!) and not sure I am in a position to hand out any sage advice. But one tip might be: don’t assume something is not related to what you know about just because it sounds different. Ask yourself, “how would what this person is saying translate into my terms?” It helps to master, as much as possible, some general tool for thinking about these things. Mine is structural equation models, latent variable models. But any framework should work just as well: hierarchical Bayesian inference, counterfactuals, missing data, experimental design, randomization inference, graphical models, etc. Each of these frameworks, when understood thoroughly, can serve as a general language for understanding what is being said, and once you get something down in your own language, you usually have some immediate insights about the problem or at least the tools to get them.

As for publishing, I imagine my experience is rather limited relative to a typical graduate student’s advisors’. From what I can tell so far it is mostly about putting yourself in the place of your audience and understanding what makes your work interesting or useful for them specifically. An “interdisciplinary” researcher is by definition a bit of an outsider. That makes it all the more important to familiarize yourself intimately with the journal and with the wider literature in that field on the topic and closely related topics, and show how your work connects with that literature. This way you do not just “barge in” but can make a real contribution to the discussion being held in that field. At Political Analysis I received some help from the reviewers and editor in doing that and I am grateful for it; this ability to welcome work that borrows from “outside” and see its potential for the field is a real strength of the journal.

Daniel Oberski is a postdoctoral researcher at the Department of Methodology and Statistics of Tilburg University in The Netherlands. His current research focuses on applying latent variable models to survey methodology and vice versa. He also works on evaluating and predicting measurement error in survey questions, on variance estimation and model fit evaluation for latent class models (LCM) and structural equation models (SEM), and is interested in the substantive applications of SEM and latent class modeling, for example to the prediction of the decision to vote in elections. His two Political Analysis papers will be available for free download 12-19 May 2014 to honor his AAPOR award. They are “Evaluating Sensitivity of Parameters of Interest to Measurement Invariance in Latent Variable Models” (2014) and “Relating Latent Class Assignments to External Variables: Standard Errors for Correct Inference” (2014).

R. Michael Alvarez is a professor of Political Science at Caltech. His research and teaching focuses on elections, voting behavior, and election technologies. He is editor-in-chief of Political Analysis with Jonathan N. Katz.

Political Analysis chronicles the exciting developments in the field of political methodology, with contributions to empirical and methodological scholarship outside the diffuse borders of political science. It is published on behalf of The Society for Political Methodology and the Political Methodology Section of the American Political Science Association. Political Analysis is ranked #5 out of 157 journals in Political Science by 5-year impact factor, according to the 2012 ISI Journal Citation Reports. Like Political Analysis on Facebook and follow @PolAnalysis on Twitter.

Subscribe to the OUPblog via email or RSS.
Subscribe to only politics and political science articles on the OUPblog via email or RSS.

The post Improving the quality of surveys: a Q&A with Daniel Oberski appeared first on OUPblog.

0 Comments on Improving the quality of surveys: a Q&A with Daniel Oberski as of 5/12/2014 11:06:00 AM
Add a Comment
18. Political Analysis and social media: A case study for journals

By R. Michael Alvarez

After my co-editor, Jonathan N. Katz, and I took over editorship of Political Analysis in January 2010, one of our primary goals was to extend the readership and intellectual reach of our journal. We wished to grow our readership internationally, and to also deepen our reach outside of political science, into other social sciences.

0 Comments on Political Analysis and social media: A case study for journals as of 1/1/1900
Add a Comment