Crowdsourcing 101: Intro

Welcome to a brief series on crowdsourcing.
(never finished the series... maybe one day. Originally published on Medium on 03.09.2022)
Photo by Jake Nackos on Unsplash
Even though synthetic data is increasingly used to supplement machine learning pipelines, there is often no substitute for the human mind when sorting through data for a machine learning problem. Human computation is central to the creation of many machine learning applications, especially in natural language processing and computer vision.
Data annotation has a long history in the social sciences, especially in linguistics, where it has long been applied to oral linguistic data. With the introduction of Amazon Mechanical Turk in 2005, the practice became industrialized. While many researchers have turned to MTurk and other crowdsourcing platforms for data, there are very few guides on best practices for task design, on translating between tasks that humans understand well and data that serves machine learning pipelines well, on annotation metrics, or on fair labor practices for crowdsourcing.
Several posts will focus on task design and the types of tasks that work well for machine learning. Many task designs have parallels in pedagogical design, as seen in QA datasets for benchmarking large language models such as RACE, which is built from standardized tests. Other, more recent task designs, such as the conversational data tasks demonstrated by ParlAI, require more creative approaches to data elicitation, as well as more control elements and management, than MTurk-style tasks.
For data scientists and ML technologists, metrics for model output are part and parcel of daily work. In data annotation, quality metrics for the labeled data are just as key to avoiding "garbage in, garbage out" in the pipeline. The two most important aspects of measurement in the crowdsourcing process are inter-rater reliability and internal consistency, which I will cover in a dedicated post.
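As a small preview of that post: a common inter-rater reliability measure for two annotators is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. Below is a minimal sketch in plain Python; the `cohens_kappa` helper and the example labels are my own illustration (libraries such as scikit-learn and NLTK ship production implementations).

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labeled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical annotators labeling the same ten items.
annotator_1 = ["pos", "pos", "neg", "pos", "neg", "neg", "pos", "pos", "neg", "pos"]
annotator_2 = ["pos", "neg", "neg", "pos", "neg", "pos", "pos", "pos", "neg", "pos"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))  # → 0.583
```

Here the annotators agree on 8 of 10 items (0.8 observed agreement), but with chance agreement of 0.52 factored out, kappa lands at about 0.58, which most rules of thumb call only moderate agreement.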
Finally, fair labor practices are often overlooked in crowdsourcing. Recent work by Saiph Savage, among others, has highlighted the need to pay crowd workers fairly and to listen to workers' input on the tasks themselves. MTurk workers have organized websites and discussion forums where they discuss these topics and advocate for better pay and working conditions from requesters.
If there is a specific topic within crowdsourcing you would like me to address, please mention it in the comments! I am happy to investigate topics in more depth if there is interest. Stay tuned over the next few weeks for new posts.