Paper ID: 2301.05154
Mephisto: A Framework for Portable, Reproducible, and Iterative Crowdsourcing
Jack Urbanek, Pratik Ringshia
We introduce Mephisto, a framework to make crowdsourcing for research more reproducible, transparent, and collaborative. Mephisto provides abstractions that cover a broad set of task designs and data collection workflows, and provides a simple user experience to make best-practices easy defaults. In this whitepaper we discuss the current state of data collection and annotation in ML research, establish the motivation for building a shared framework to enable researchers to create and open-source data collection and annotation tools as part of their publication, and outline a set of suggested requirements for a system to facilitate these goals. We then step through our resolution in Mephisto, explaining the abstractions we use, our design decisions around the user experience, and share implementation details and where they align with the original motivations. We also discuss current limitations, as well as future work towards continuing to deliver on the framework's initial goals. Mephisto is available as an open source project, and its documentation can be found at www.mephisto.ai.
Submitted: Jan 12, 2023