Paper accepted for HCOMP 2019

The Crowd Lab had a paper, “Second Opinion: Supporting last-mile person identification with crowdsourcing and face recognition,” accepted for the upcoming AAAI Human Computation and Crowdsourcing (HCOMP 2019) conference, to be held at the Skamania Lodge in Stevenson, WA, USA, October 28–30, 2019. The conference had a 25% acceptance rate.

Ph.D. student and lead author Vikram Mohanty will present the paper, co-authored with Dr. Luther and Crowd Lab undergraduate researchers Kareem Abdol-Hamid and Courtney Ebersohl. Here’s the paper’s abstract:

As AI-based face recognition technologies are increasingly adopted for high-stakes applications like locating suspected criminals, public concerns about the accuracy of these technologies have grown as well. These technologies often present a human expert with a shortlist of high-confidence candidate faces from which the expert must select correct match(es) while avoiding false positives, which we term the “last-mile problem.” We propose Second Opinion, a web-based software tool that employs a novel crowdsourcing workflow inspired by cognitive psychology, seed-gather-analyze, to assist experts in solving the last-mile problem. We evaluated Second Opinion with a mixed-methods lab study involving 10 experts and 300 crowd workers who collaborate to identify people in historical photos. We found that crowds can eliminate 75% of false positives from the highest-confidence candidates suggested by face recognition, and that experts were enthusiastic about using Second Opinion in their work. We also discuss broader implications for crowd–AI interaction and crowdsourced person identification.

Two papers accepted for CSCW 2019

The Crowd Lab had two papers accepted for the upcoming ACM Computer Supported Cooperative Work and Social Computing (CSCW 2019) conference in Austin, TX, USA, November 9-13, 2019. The conference had a 31% acceptance rate.

Ph.D. student Sukrit Venkatagiri will be presenting “GroundTruth: Augmenting expert image geolocation with crowdsourcing and shared representations,” co-authored with Jacob Thebault-Spieker, Rachel Kohler, John Purviance, Rifat Sabbir Mansur, and Kurt Luther, all from Virginia Tech. Here’s the paper’s abstract:

Expert investigators bring advanced skills and deep experience to analyze visual evidence, but they face limits on their time and attention. In contrast, crowds of novices can be highly scalable and parallelizable, but lack expertise. In this paper, we introduce the concept of shared representations for crowd–augmented expert work, focusing on the complex sensemaking task of image geolocation performed by professional journalists and human rights investigators. We built GroundTruth, an online system that uses three shared representations—a diagram, grid, and heatmap—to allow experts to work with crowds in real time to geolocate images. Our mixed-methods evaluation with 11 experts and 567 crowd workers found that GroundTruth helped experts geolocate images, and revealed challenges and success strategies for expert–crowd interaction. We also discuss designing shared representations for visual search, sensemaking, and beyond.

Ph.D. student Tianyi Li will be presenting “Dropping the baton? Understanding errors and bottlenecks in a crowdsourced sensemaking pipeline,” co-authored with Chandler J. Manns, Chris North, and Kurt Luther, also from Virginia Tech. Here’s the paper’s abstract:

Crowdsourced sensemaking has shown great potential for enabling scalable analysis of complex data sets, from planning trips, to designing products, to solving crimes. Yet, most crowd sensemaking approaches still require expert intervention because of worker errors and bottlenecks that would otherwise harm the output quality. Mitigating these errors and bottlenecks would significantly reduce the burden on experts, yet little is known about the types of mistakes crowds make with sensemaking micro-tasks and how they propagate in the sensemaking loop. In this paper, we conduct a series of studies with 325 crowd workers using a crowd sensemaking pipeline to solve a fictional terrorist plot, focusing on understanding why errors and bottlenecks happen and how they propagate. We classify types of crowd errors and show how the amount and quality of input data influence worker performance. We conclude by suggesting design recommendations for integrated crowdsourcing systems and speculating how a complementary top-down path of the pipeline could refine crowd analyses.

Congratulations to Sukrit, Tianyi, and their collaborators!

Paper accepted to CHI 2019 HCI + AI workshop

Our paper, “Flud: a hybrid crowd-algorithm approach for visualizing biological networks,” was accepted to the CHI 2019 workshop “Where is the Human? Bridging the Gap Between AI and HCI” in Glasgow, Scotland. Congratulations to Crowd Lab co-authors Aditya Bharadwaj (Ph.D. student) and David Gwizdala (undergraduate researcher), as well as Yoonjin Kim and Aditya’s co-advisor, Dr. T.M. Murali.

Paper accepted for CHI 2019

Congrats to Crowd Lab Ph.D. student Aditya Bharadwaj on having his paper accepted at the upcoming CHI 2019 conference in Glasgow, Scotland, in May. The acceptance rate for this top-tier human-computer interaction conference is 24%. The paper, titled “Critter: Augmenting Creative Work with Dynamic Checklists, Automated Quality Assurance, and Contextual Reviewer Feedback,” was co-authored with colleagues Pao Siangliulue and Adam Marcus at the New York-based startup B12, where Aditya interned last summer. The paper’s abstract is as follows:

Checklists and guidelines have played an increasingly important role in complex tasks ranging from the cockpit to the operating theater. Their role in creative tasks like design is less explored. In a needfinding study with expert web designers, we identified designers’ challenges in adhering to a checklist of design guidelines. We built Critter, which addressed these challenges with three components: Dynamic Checklists that progressively disclose guideline complexity with a self-pruning hierarchical view, AutoQA to automate common quality assurance checks, and guideline-specific feedback provided by a reviewer to highlight mistakes as they appear. In an observational study, we found that the more engaged a designer was with Critter, the fewer mistakes they made in following design guidelines. Designers rated the AutoQA and contextual feedback experience highly, and provided feedback on the tradeoffs of the hierarchical Dynamic Checklists. We additionally found that a majority of designers rated the AutoQA experience as excellent and felt that it increased the quality of their work. Finally, we discuss broader implications for supporting complex creative tasks.

Two papers accepted for IUI 2019

Two members of the Crowd Lab each had a paper accepted for presentation at the upcoming IUI 2019 conference in Los Angeles, CA. The acceptance rate for this conference, which focuses on the intersection of human-computer interaction and artificial intelligence, was 25%.

Crowd Lab Ph.D. student Vikram Mohanty will present “Photo Sleuth: Combining Human Expertise and Face Recognition to Identify Historical Portraits,” co-authored with undergraduate David Thames and Ph.D. student Sneha Mehta. Here is the paper’s abstract:

Identifying people in historical photographs is important for preserving material culture, correcting the historical record, and creating economic value, but it is also a complex and challenging task. In this paper, we focus on identifying portraits of soldiers who participated in the American Civil War (1861–65), the first widely-photographed conflict. Many thousands of these portraits survive, but only 10–20% are identified. We created Photo Sleuth, a web-based platform that combines crowdsourced human expertise and automated face recognition to support Civil War portrait identification. Our mixed-methods evaluation of Photo Sleuth one month after its public launch showed that it helped users successfully identify unknown portraits and provided a sustainable model for volunteer contribution. We also discuss implications for crowd-AI interaction and person identification pipelines.

Crowd Lab Ph.D. student Tianyi Li will present “What Data Should I Protect? A Recommender and Impact Analysis Design to Assist Decision Making,” co-authored with Informatica colleagues Gregorio Convertino, Ranjeet Kumar Tayi, and Shima Kazerooni. Here is the paper’s abstract:

Major breaches of sensitive company data, as for Facebook’s 50 million user accounts in 2018 or Equifax’s 143 million user accounts in 2017, are showing the limitations of reactive data security technologies. Companies and government organizations are turning to proactive data security technologies that secure sensitive data at source. However, data security analysts still face two fundamental challenges in data protection decisions: 1) the information overload from the growing number of data repositories and protection techniques to consider; 2) the optimization of protection plans given the current goals and available resources in the organization. In this work, we propose an intelligent user interface for security analysts that recommends what data to protect, visualizes simulated protection impact, and helps build protection plans. In a domain with limited access to expert users and practices, we elicited user requirements from security analysts in industry and modeled data risks based on architectural and conceptual attributes. Our preliminary evaluation suggests that the design improves the understanding and trust of the recommended protections and helps convert risk information in protection plans.

Congratulations to Vikram, David, Sneha, Tianyi, and their collaborators!

Article published in AI Magazine

Dr. Luther recently published a report about the GroupSight Workshop on Human Computation for Image and Video Analysis in AI Magazine. The workshop, held at the HCOMP 2017 conference in Quebec City, Canada, was co-organized by Dr. Luther, Danna Gurari, Genevieve Patterson, and Steve Branson. More information about the workshop can be found in a Follow the Crowd blog post by Dr. Luther.

Two papers accepted for CSCW 2018

Two members of the Crowd Lab each had a paper accepted for presentation at the CSCW 2018 conference in Jersey City, NJ. The acceptance rate for this top-tier conference was 26%.

Ph.D. student Nai-Ching Wang presented “Exploring Trade-Offs Between Learning and Productivity in Crowdsourced History” with Virginia Tech professor of education David Hicks and Dr. Luther as co-authors. Here is the paper’s abstract:

Crowdsourcing more complex and creative tasks is seen as a desirable goal for both employers and workers, but these tasks traditionally require domain expertise. Employers can recruit only expert workers, but this approach does not scale well. Alternatively, employers can decompose complex tasks into simpler micro-tasks, but some domains, such as historical analysis, cannot be easily modularized in this way. A third approach is to train workers to learn the domain expertise. This approach offers clear benefits to workers, but is perceived as costly or infeasible for employers. In this paper, we explore the trade-offs between learning and productivity in training crowd workers to analyze historical documents. We compare CrowdSCIM, a novel approach that teaches historical thinking skills to crowd workers, with two crowd learning techniques from prior work and a baseline. Our evaluation (n=360) shows that CrowdSCIM allows workers to learn domain expertise while producing work of equal or higher quality versus other conditions, but efficiency is slightly lower.

Ph.D. student Tianyi Li presented “CrowdIA: Solving Mysteries with Crowdsourced Sensemaking” with Dr. Luther and Virginia Tech computer science professor Chris North as co-authors. Here is the paper’s abstract:

The increasing volume of text data is challenging the cognitive capabilities of expert analysts. Machine learning and crowdsourcing present new opportunities for large-scale sensemaking, but we must overcome the challenge of modeling the overall process so that many distributed agents can contribute to suitable components asynchronously and meaningfully. In this paper, we explore how to crowdsource the sensemaking process via a pipeline of modularized steps connected by clearly defined inputs and outputs. Our pipeline restructures and partitions information into “context slices” for individual workers. We implemented CrowdIA, a software platform to enable unsupervised crowd sensemaking using our pipeline. With CrowdIA, crowds successfully solved two mysteries, and were one step away from solving the third. The crowd’s intermediate results revealed their reasoning process and provided evidence that justifies their conclusions. We suggest broader possibilities to optimize each component, as well as to evaluate and refine previous intermediate analyses to improve the final result.

Congratulations to Nai-Ching and Tianyi!

Sukrit Venkatagiri selected as Rita Allen Misinformation Forum Graduate Fellow

Congratulations to Crowd Lab Ph.D. student Sukrit Venkatagiri on his selection as one of 12 Graduate Student Fellows of the Rita Allen Foundation’s Misinformation Solutions Forum, which took place in October 2018 in Washington, DC. As a Graduate Fellow, Sukrit received a travel grant to attend the Forum and co-authored (with Amy Zhang of MIT) an essay that was published in the Forum’s proceedings.

Article accepted for IEEE TVCG

Highlight propagation: entity-based (A) and bicluster-based (B). Highlighted entities are colored in orange. A user hovered or selected entity is marked with a blue border. Entities in the normal state are blue.

Maoyuan Sun, assistant professor of computer and information science at UMass-Dartmouth, recently published an article, “The Effect of Edge Bundling and Seriation on Sensemaking of Biclusters in Bipartite Graphs,” in the journal IEEE Transactions on Visualization and Computer Graphics (TVCG). The co-authors are Jian Zhao, Hao Wu, Dr. Luther, Chris North, and Naren Ramakrishnan. The article’s abstract is as follows:

Exploring coordinated relationships (e.g., shared relationships between two sets of entities) is an important analytics task in a variety of real-world applications, such as discovering similarly behaved genes in bioinformatics, detecting malware collusions in cyber security, and identifying products bundles in marketing analysis. Coordinated relationships can be formalized as biclusters. In order to support visual exploration of biclusters, bipartite graphs based visualizations have been proposed, and edge bundling is used to show biclusters. However, it suffers from edge crossings due to possible overlaps of biclusters, and lacks in-depth understanding of its impact on user exploring biclusters in bipartite graphs. To address these, we propose a novel bicluster-based seriation technique that can reduce edge crossings in bipartite graphs drawing and conducted a user experiment to study the effect of edge bundling and this proposed technique on visualizing biclusters in bipartite graphs. We found that they both had impact on reducing entity visits for users exploring biclusters, and edge bundles helped them find more justified answers. Moreover, we identified four key trade-offs that inform the design of future bicluster visualizations. The study results suggest that edge bundling is critical for exploring biclusters in bipartite graphs, which helps to reduce low-level perceptual problems and support high-level inferences.