"Participation and Contribution in Crowdsourced Surveys," a recent PLOS ONE article, discusses some interesting approaches to crowdsourced surveys: not only are the answers crowdsourced, but the questions themselves are as well. The surveys are seeded with a small number of questions and later augmented with questions supplied by respondents. These questions are curated by hand and presented to respondents in random order.
Approach and comparison with Quizz
The authors accomplished this by setting up three separate websites to collect the data. The only one still active is for Energy Minder, an ongoing research project. The other two surveys were about BMI and personal finance.
The motivation for this work is very similar to Quizz. The authors state:
> The crowdsourcing method employed in this paper was motivated by the hypothesis that there are many questions of scientific and societal interest, for which the general (non-expert) public has substantial insight. For example, patients who suffer from a disease, such as diabetes, are likely to have substantial insight into their condition that is complementary to the knowledge of scientifically-trained experts. The question is, how does one collect and combine this non-expert knowledge to provide scientifically-valid insight into the outcome of interest.
Like Quizz, their system eschews financial incentives for survey completion. Unlike Quizz, new questions are added by the users themselves, rather than by a system. In Quizz, the objective is to complete a knowledge base — responses to questions are point estimates. In this system, the questions serve as features designed to predict a variable of interest, whether it be energy consumption, BMI, or the amount an individual has in their savings. The paper does not explicitly state the measurement level of the outcome variable; it isn’t clear if, for example, energy consumption is a binary variable (high/low), a categorical variable defined by buckets, or a real-valued prediction of kWh.
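To make the setup concrete, here is a minimal sketch (not the authors' code) of what "questions as features" amounts to: each user-submitted question becomes a column in a design matrix, and a model is fit against the outcome of interest. The question names and all data below are invented for illustration.

```python
# A minimal sketch of the "questions as features" setup. Each curated,
# user-submitted question is a column; the outcome is what we predict.
# All names and numbers here are made up for illustration.
import pandas as pd
from sklearn.linear_model import LinearRegression

responses = pd.DataFrame({          # rows: respondents; cols: questions
    "owns_heat_pump":       [1, 0, 0, 1, 1],
    "thermostat_setting_f": [68, 72, 74, 66, 70],
    "household_size":       [2, 4, 3, 1, 2],
})
monthly_kwh = [450, 980, 860, 320, 510]   # outcome (real-valued here)

model = LinearRegression().fit(responses, monthly_kwh)
print(model.predict(responses.head(1)))   # prediction for one respondent
</code>
```

Under this framing, accepting a new crowdsourced question just adds a column, which is why the unstated measurement level of the outcome matters: it determines whether this is regression or classification.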
Questions, Observations, Insights
- Are there any baseline/ground-truth studies? All three surveys ask questions whose answers could be influenced by a variety of biases (e.g., the stereotype that Vermonters are hippies who want a low energy footprint, biases against overweight and poor people, etc.).
- One big advantage of using crowdsourced questions is that they can give insight into how to get around social desirability bias. This isn’t discussed in the paper, but it would be of interest to social scientists.
- Early in the paper they state, “…a real-time machine learning module continuously builds models from this growing store of data to predict the outcome variable, and the resulting predictions are shown to the users.” The machine learning module they refer to is Hod Lipson’s symbolic regression package (see the sketch after this list). It’s not clear to me when the predictions are shown. Aren’t there methodological issues with telling the respondent what you’re trying to predict? Even if disclosure is sometimes harmless, social desirability and other biases could have a significant impact on the outcome variable.
- Related work: Program Boosting uses GP and crowdsourcing.
- “If the user decides to submit their own question, they are then required to respond to it. We have found that being forced to immediately respond to the question helps users to recognize when their questions are unclear or confusing.” Do they have revision data for questions? Or do respondents just re-enter the question if it isn’t clear? Is there feedback on question clarity, or is this something that the human curator determines? It’s not clear to me how this works, but this data might be an interesting feature to use in quality control.
- Beyond surveys, this is an interesting way to collect features for some other task. The questions are basically features here.
- Problems of error propagation could be connected to issues we’re looking at in AutoMan vis-à-vis multiple comparisons.
- The learning module: could we use these techniques to build up blocks dynamically? Learn blocks?
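Since the learning module mentioned above is a symbolic-regression system, here is a minimal sketch of that kind of continuous model-building loop, using gplearn’s SymbolicRegressor as an open-source stand-in for Hod Lipson’s package. This is not the authors’ pipeline; the data and hyperparameters are assumptions for illustration.

```python
# Sketch of a symbolic-regression fit in the spirit of the paper's
# learning module, using gplearn as a stand-in (the paper uses Hod
# Lipson's package, not gplearn). Data and hyperparameters are made up.
import numpy as np
from gplearn.genetic import SymbolicRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))       # answers to three questions
y = X[:, 0] ** 2 - 0.5 * X[:, 1] + 0.1      # synthetic outcome variable

est = SymbolicRegressor(population_size=500, generations=10,
                        parsimony_coefficient=0.01,  # penalizes bloat
                        random_state=0)
est.fit(X, y)
print(est._program)   # the evolved closed-form model over the features
```

In the paper’s setting, X would grow a new column each time a curated user question is accepted, and the fit would be re-run against the growing store of responses.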
Criticisms
- The validation checks (e.g., valid ranges) amount to a very weak adversarial model. Since there are no financial incentives for this survey, a greater threat to validity is inattentive respondents.
- I’d like to see a stronger comparison with the active learning literature. There are issues of compounded error when combining stepwise regression with the kind of user-generated-question dogfooding that’s happening here. I suspect the active learning literature addresses some of these issues and would give insight into how to achieve greater statistical validity.
- Testing for a correlation coefficient different from 0 is too sensitive: a true correlation of exactly zero hardly ever happens, so with enough respondents nearly any question will look “significant.” To guard against this, or at least to establish a kind of prior on false correlations, the authors could inject seemingly unrelated probe questions into the survey (see the sketch after this list). Of course, probes introduce their own biases and could have unintended consequences, so this would have to be thought out carefully. I’m just not satisfied with, “The lack of correlation between participation and contribution falsifies the hypothesis that a higher level of participation is indicative of interest in and knowledge of the subject area.”
- Also asked above: what’s the baseline? What’s to stop the system from predicting the most common answer, given the class? How does this perform against a naive Bayes or decision tree classifier? (A baseline sketch also follows this list.)
- I would like to see some regularization in the modeling. Symbolic regression can be very sensitive to outliers, though I’m not sure what’s in this implementation. The paper would benefit from a discussion of regularization (in GP terms, parsimony pressure on expression size is the usual analogue).
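To make the probe-question idea concrete, here is a sketch of using injected nonsense questions to build an empirical null distribution for correlations. The probe count, the 95th-percentile cutoff, and all data are assumptions for illustration, not anything from the paper.

```python
# Sketch: inject probe questions with no true relation to the outcome,
# and use their observed correlations as an empirical null. A real user
# question is only "interesting" if it clears this bar. Synthetic data.
import numpy as np

rng = np.random.default_rng(1)
n = 300
outcome = rng.normal(size=n)          # e.g., standardized energy use

probes = rng.normal(size=(n, 50))     # 50 injected nonsense questions
null_r = np.array([np.corrcoef(probes[:, j], outcome)[0, 1]
                   for j in range(probes.shape[1])])

threshold = np.quantile(np.abs(null_r), 0.95)   # 95th-percentile bar
print(f"require |r| > {threshold:.3f} before taking a question seriously")
```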
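And on the baseline question: one quick sanity check is to score simple classifiers, including a predict-the-most-common-answer dummy, on the same data. Here is a sketch with scikit-learn on synthetic responses; the data and the binarized outcome are assumptions.

```python
# Sketch: compare a most-frequent-answer dummy against naive Bayes and
# a shallow decision tree, per the criticism above. Synthetic data.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))                    # question responses
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # e.g., high/low BMI

for name, clf in [("most frequent", DummyClassifier(strategy="most_frequent")),
                  ("naive Bayes", GaussianNB()),
                  ("decision tree", DecisionTreeClassifier(max_depth=3))]:
    print(f"{name}: {cross_val_score(clf, X, y, cv=5).mean():.2f}")
```

If the learned model cannot clearly beat the most-frequent baseline, the “predictive” questions aren’t contributing much.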