Abstract
Even though data annotation is extremely important for interpretability, research, and development of artificial intelligence solutions, annotating data remains costly. Research efforts such as active learning or few-shot learning alleviate the cost by increasing sample efficiency, yet the problem of annotating data more quickly has received comparatively little attention. Leveraging a predictor has been shown to reduce annotation cost in practice, but this has not been studied theoretically. We ask the following question: to annotate a binary classification dataset with N samples, can the annotator answer fewer than N yes/no questions? Framing this question-and-answer (Q&A) game as an optimal encoding problem, we find a positive answer given by the Huffman encoding of the possible labelings. Unfortunately, the algorithm is computationally intractable even for small dataset sizes. As a practical method, we propose to minimize a cost function a few steps ahead, similarly to lookahead minimization in optimal control. This solution is analyzed, compared with the optimal one, and evaluated using several synthetic and real-world datasets. The method allows a significant improvement (23–86%) in the annotation efficiency of real-world datasets.
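The Huffman argument in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes, purely for illustration, that a predictor gives an independent probability `p[i]` that sample `i` is positive, and the function names `labeling_probabilities` and `expected_questions` are hypothetical. Each yes/no question is read as one bit of a Huffman code over the 2^N candidate labelings, so the expected code length is the expected number of questions.

```python
import heapq
import itertools
import math


def labeling_probabilities(p):
    """Probability of each of the 2^N labelings (assumes independent labels)."""
    probs = {}
    for labels in itertools.product((0, 1), repeat=len(p)):
        probs[labels] = math.prod(
            pi if y else 1.0 - pi for pi, y in zip(p, labels)
        )
    return probs


def expected_questions(probs):
    """Expected Huffman code length, i.e. expected number of yes/no questions.

    The expected length equals the sum of the weights of all merged
    (internal) nodes created while building the Huffman tree.
    """
    # Heap entries are (probability, tie_breaker) pairs.
    heap = [(q, i) for i, q in enumerate(probs.values())]
    heapq.heapify(heap)
    counter = len(heap)
    total = 0.0
    while len(heap) > 1:
        a, _ = heapq.heappop(heap)
        b, _ = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, (a + b, counter))
        counter += 1
    return total


# Hypothetical predictor outputs for N = 4 samples (illustrative values only).
p = [0.95, 0.9, 0.05, 0.1]
probs = labeling_probabilities(p)
avg = expected_questions(probs)
entropy = -sum(q * math.log2(q) for q in probs.values() if q > 0)
print(f"N = {len(p)}, entropy = {entropy:.3f} bits, expected questions = {avg:.3f}")
```

With a confident predictor the expected number of questions falls below N, while an uninformative predictor (all probabilities 0.5) gives exactly N. The enumeration over 2^N labelings also shows why the exact approach quickly becomes intractable as N grows.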
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence |
| Publisher | Association for the Advancement of Artificial Intelligence |
| Pages | 14336-14343 |
| Number of pages | 8 |
| Volume | 39 |
| Edition | 13 |
| ISBN (Print) | 9781577358978 |
| Publication status | Published - 11 Apr 2025 |
| Externally published | Yes |
| Event | 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States; duration: 25 Feb 2025 → 4 Mar 2025 |
Publication series
| Name | Proceedings of the AAAI Conference on Artificial Intelligence |
|---|---|
| Publisher | Association for the Advancement of Artificial Intelligence |
| Number | 13 |
| Volume | 39 |
| ISSN (Print) | 2159-5399 |
| ISSN (Electronic) | 2374-3468 |
Conference
| Conference | 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 |
|---|---|
| Country/Territory | United States |
| City | Philadelphia |
| Period | 25/02/25 → 4/03/25 |
Bibliographical note
Publisher Copyright: Copyright © 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Funding
This work was supported by a CIFRE grant from ANRT. It was also partially financed by ANII Uruguay. Centre Borelli is also with Université Paris Cité, SSA and INSERM.
Fingerprint
Dive into the research topics of 'Optimal and Efficient Binary Questioning for Accelerated Annotation'. Together they form a unique fingerprint.