ArXiv, comments, and “quality control”

Those of you who browsed the arXiv recently may have seen a link to a user survey at the top of the page (as of now, apparently no longer online; update: still available here, until April 26). I ignored it a few times, until a friend brought this particular bit to my attention.

[Image: arxiv]

Sure enough, I took the survey. As it turned out, the arXiv was also asking for feedback on what it calls “quality control”: actions such as rejecting “papers that don’t have much scientific value,” flagging papers that have “too much text re-use from an author’s earlier papers” (self-plagiarism) or from papers by other authors (plagiarism), or moderating pingbacks (such as links from blogs or articles) before they appear on the arXiv.

Internet comment sections are in decline everywhere you look. They are mocked, ridiculed, despised. Many websites have closed them already; others have seen their comments become a racist, sexist bog of eternal stench from which any reasonable person is best advised to stay away. I’ve talked about it here at length, with examples and links, and it’s very easy to google up more if you wish.

I’m often told that if a comment section is restricted to “registered” mathematicians posting under their real names, the conversations will be polite and civil, with the rare instances of abuse identified as such and condemned by the community. If that’s what you think, consider that much of what passes for “normal” interactions between mathematicians is viewed as passive-aggressive, if not downright abusive, just about everywhere else. We all know what referee reports can look like, or grant proposal reviews, or MathSciNet blurbs. If you believe that non-anonymity will solve the problem, I could give you many examples of questions from the audience in seminars and conference talks that were at least as problematic as any referee reports I’ve seen.

Women, in particular, get far too many comments questioning our competence, implying that we might not know the basic literature, that we might not really understand our own results, that said results might turn out to be false or trivial if only someone qualified had a look, or some such. We’re also subject to gendered standards of “professionalism” that do not allow us to respond in kind and give as good as we get. But if you tell me that men, too, can get inane, confused, or malicious comments, then yes, I agree. All the more reason to refrain from making the arXiv more like YouTube. There’s enough abusive behaviour in mathematics already, on all sides. We should not mandate a form of discourse that has been shown empirically to promote and escalate it. Nor should we mandate having it attached in perpetuity to our formal publication records.

As for “quality control”: there have been well publicized cases where the arXiv moderators might have overstepped in rejecting papers and blacklisting authors. I’m not a fan of flagging papers for “substantial overlap,” either. We often write several consecutive papers in the same area, introducing the same notation each time, stating the same conjectures or prior results for reference, and so on. We might even reuse parts of the same TeX file for such purposes. None of this amounts to plagiarism or self-plagiarism, nor should it trigger red flags.

Now, here’s what all this might mean for the future of the arXiv. Allow me a little bit of speculation here.

The arXiv has become the universally accepted default repository for mathematicians, not only because it provides a service we need, but also because, in not attempting to do more than that, it gives us no reason not to use it. We don’t have to worry that the paper might not “qualify,” that it’s too long or too short, or too expository, or not sufficiently tailored for the “right” audience. We simply post what we think is right. We expect and welcome feedback (I often post papers on the arXiv prior to journal submission, specifically for that purpose), but the site does not allow public abuse or internet flame wars, so there is no need to worry about that. The bare-bones structure is not a bug; it’s a feature that has been essential for the arXiv’s success.

Currently, the arXiv has little competition. It works well enough for most of us and we have no reason to look elsewhere. That might change. Discontent breeds business opportunities. The competing site viXra, started by physicists who were dissatisfied with the arXiv’s moderation practices, failed to gain much ground; but if the arXiv were to amp up its “quality control” in ways that test our tolerance, and especially if it were to implement comments and ratings, there just might be a critical mass of scientists willing to try such alternatives. I know I would be looking for them, and I’ve heard from others (including well-known mathematicians, and not only women) who feel the same way.

It would be more than ironic if, say, Elsevier or Springer were to set up a competing open access repository where, for a small fee of around $100, authors could post their papers on a site guaranteed to be free of comments and ratings. That would obviously discriminate against those unable to pay $100, but there’s nothing stopping anyone from setting up such a site if there is demand, as I assume there would be. Grant holders in many countries are now subject to open access policies that practically mandate the posting of papers on repositories; should we no longer wish to post on the arXiv, we’ll need an alternative. I can’t promise that I wouldn’t switch to a Springer or Elsevier site in such circumstances. It would be even better if non-profit organizations, such as the AWM, were to set up their own preprint archives where the terms of service would reflect the preferences of the membership.

If comments or ratings are allowed retroactively, on papers already posted to the arXiv, then it’s far from clear to me that the arXiv would be able to hold on to such papers. My contract with the arXiv is, essentially, that the arXiv has my permission to distribute my articles on its website and its mirror sites. It does not have my permission to cross-post them on Reddit and Hacker News. By the same token, it does not have my permission to post them on a future site that might continue to use the arxiv.org URL, but would function in substantially different ways. That would have to be renegotiated. Individual mathematicians may have little power in that regard, but if major publishers become involved as per the above, and if they decide to encourage researchers to move their past publications to their servers, then I could think of some interesting ways in which this could develop.

My crystal ball here may well be less than perfect, but I think that some version of this would have to happen. If the arXiv wishes to remain the universal default repository for scientists in the covered areas, the plain vanilla model is the only one that will allow it to do so. Quality control is better left to journals, and for those authors who wish to have public discussions about their papers, a wide range of blogs and social media is available. Any changes that alienate a substantial group of users will inevitably lead to the rise of competition, and so within a few years we might well see a variety of arXiv-type sites with different functionalities and user bases.

And that would essentially end the arXiv as we know it.

Update, July 8, 2016: for those coming late to it, I’m also quoted in this Wired article by Sarah Scoles.


