Notifications have been sent out, and authors are now putting the finishing touches on their abstracts for the final CUNY 2020 program. We received 430 abstract submissions this year. Most submissions received three reviews; a few received four, and a few only two.
In our program, we had room for 27 talks and 270 posters. This meant that 133 abstracts were rejected outright. This rejection rate (~31%) was slightly higher than in the last few years, and higher than we would have liked, but we were pinched between space constraints on the number of posters we could accept and the high number of submissions that came in. There was a lot of good work that we couldn’t accept for this year’s CUNY, and it wasn’t fun to make that call.
How did we decide what got a talk? The short answer: we rolled up our sleeves, read lots of abstracts and reviews, and talked it out. It took the full group two full days of meetings to hash it out. This was easily the hardest thing we’ve had to do so far, both in the overall effort it required and in the difficulty of the decisions we had to make.
First off, we had a number of criteria that we agreed on to guide the process:
- High ratings from the reviewers, weighted equally across all of the rating criteria
- Appropriateness for the special session
- Diversity of topics, speakers, institutions, and languages
Equipped with our criteria, our process was simple: we worked our way down the list of abstracts, in order of their average rating across reviewers, and considered each in turn for a talk by reading the abstract and all of its reviews, until we reached our target number of talks. We used the quantitative portion of the reviews to focus our attention on the abstracts that most consistently generated enthusiasm among the reviewers, but we did not use the quantitative ratings as a hard filter on what was and wasn’t a talk; the scales were simply used too differently across reviewers for that to be a reasonable strategy.
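The ordering step can be sketched in a few lines. This is purely illustrative (not the committee’s actual tooling), and the abstract IDs and scores are made up; it assumes each abstract carries a list of numeric ratings on a shared scale:

```python
from statistics import mean

# Hypothetical reviewer scores per abstract (IDs and values invented).
abstracts = {
    "A1": [6, 6, 5],
    "A2": [7, 5, 6],
    "A3": [3, 5, 4],
}

# Order abstract IDs from highest to lowest mean rating; the committee
# then read each abstract and its reviews in this order, stopping once
# the target number of talks was reached.
ranked = sorted(abstracts, key=lambda a: mean(abstracts[a]), reverse=True)
print(ranked)
```

As the paragraph above notes, the mean score only set the reading order; it was not itself the accept/reject criterion.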
Instead, what proved critical in making our decisions was the qualitative content of the reviews in conjunction with our own read of the abstract. In general, the more detail a review provided, the more it influenced our judgment. A set of all 6’s or all 2’s with no comment, or with only short, general comments, carried less weight than a review accompanied by clear justification. Comments like ‘I think this abstract is exceptionally novel and interesting because of their use of method X or because of the novel theoretical insight Y, and it deserves to be heard as a talk’ or ‘I think this abstract is not yet ready for a talk because of concern Z, so I recommend a follow-up to address Z before this would be successful as a platform presentation’ were generally quite useful. A persuasive and concrete review often helped us make the decision one way or the other.
The ‘sample from the highly ranked abstracts and listen to our colleagues’ model ran into multiple snags along the way, as you might expect! For instance, a simple application of this strategy led to more than one talk from the same first author, which seemed to conflict with our ‘diversity of speakers’ goal; in that situation, we made a judgment call about which of the candidate talks would be a better fit in the overall program. In a similar vein, we sought to avoid too many talks on the same narrow topic, and too many talks from a single institution. To arrive at our list of 27 talks, we ended up sampling and considering roughly the top 20% of abstracts, with significantly more consideration given to the top end of that distribution.
Decisions about rejections were made largely on the basis of reviewer scores, averaged across all reviewers and questions, but with some tweaks. In particular, we inspected the variance of the ratings on abstracts that received low average ratings, looking specifically for submissions with two high scores and one low one. We then reconsidered each of those abstracts individually for the main program, based on the contents of the reviews.
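That second-look pass amounts to flagging low-mean, high-spread score sets. The sketch below is a hedged illustration only; the IDs, scores, and both thresholds are invented for the example, not the committee’s actual cutoffs:

```python
from statistics import mean, pstdev

# Hypothetical reviewer scores (invented for illustration).
scores = {
    "B1": [2, 2, 3],  # consistently low: handled by the average alone
    "B2": [5, 5, 1],  # two high scores and one low one: worth a closer read
}

LOW_MEAN = 4.0     # illustrative threshold for a "low average rating"
HIGH_SPREAD = 1.5  # illustrative threshold for reviewer disagreement

# Flag low-average submissions whose ratings disagree sharply;
# each flagged abstract then gets an individual reconsideration.
flagged = [a for a, s in scores.items()
           if mean(s) < LOW_MEAN and pstdev(s) > HIGH_SPREAD]
print(flagged)
```

The point of the check is simply that a low average can hide one strongly enthusiastic reviewer, and those cases deserved a human read rather than an automatic rejection.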