Civic Futures Awards

Public dataset and AI for making images more accessible

University of Texas at Austin

School of Information

The problem

People who are blind or have low vision regularly seek visual assistance for a wide range of tasks, including descriptions of pictures they take themselves or find online. Automating this task with artificial intelligence (AI) remains a challenge, so people currently rely on humans to provide descriptions.

Solution / Approach

This work introduces the first publicly available image captioning dataset built from images taken by people who are blind or have low vision, an AI-based interactive assistant for improving the dataset, and interviews with end users to understand how to improve the dataset and its functionality over time.

Core team
  • Dr. Danna Gurari
  • Dr. Kenneth R. Fleischmann
  • Dr. Abigale Stangl
  • Dr. Rachel N. Simons
  • Nilavra Bhattacharya
  • Jaxsen Day
  • Nitin Verma
  • Yanan Wang
  • Xiaoyu Zeng
  • Meng Zhang
  • Yinan Zhou
Key advisors
  • Dr. Meredith Ringel Morris
  • Dr. Ed Cutrell
  • Dr. Neel Joshi
  • Dr. Roy Zimmermann
What can make it sustainable?

The team recognizes the need to promote this work on a broader scale. Following standard research practices, the team is releasing all code and datasets to facilitate future extensions, and publishing the results in conferences and journals.

Advice to others

Interdisciplinary research is exciting and has the potential for great impact, but it takes significant time and effort to become productive. Regular, sustained interaction among team members is a key component of a successful interdisciplinary project.