Artificial intelligence is advancing rapidly, but a fundamental question remains: how do these systems actually work? Even as progress surges, researchers are grappling with the challenge of explaining the inner workings of their most powerful tools. The gap is most evident in interpretability, a field still so young that researchers are, in effect, searching for the 'electrons' of AI systems: the basic units from which a real science of these models could be built.
At the NeurIPS conference, a gathering of academics, startup founders, and researchers from industry giants, the buzz centered on this very question. Held annually since 1987, the conference focuses on neural networks and their connections to computation, neurobiology, and physics. Once considered an esoteric academic pursuit, neural networks are now the backbone of modern AI systems, and NeurIPS has grown accordingly, drawing a record 26,000 attendees this year and filling the San Diego Convention Center.
The interpretability challenge is a complex one. Most leading AI researchers and CEOs openly admit that they do not fully understand how today's advanced AI systems work. The effort to change that is called interpretability: the attempt to 'interpret' how these models function. Shriyash Upadhyay, an AI researcher and founder of Martian, a company dedicated to interpretability, emphasizes how early the field still is. He contrasts mature sciences, where settled ideas are refined over time, with interpretability today, where even fundamental questions about the nature of AI systems remain open.
Google's interpretability team recently shifted its focus from trying to understand every component of a model to more practical methods that address real-world impact. The change reflects both the pace of AI progress and the realization that some goals may take longer than anticipated. OpenAI's head of interpretability, Leo Gao, by contrast, is doubling down on a deeper, more ambitious form of interpretability that aims to fully comprehend neural networks. AI researcher Adam Gleave, however, is skeptical that model behavior can ever be fully understood, suggesting that deep-learning models may simply not admit simple explanations.
Despite these challenges, researchers remain optimistic. Sanmi Koyejo, a professor at Stanford University, stresses the need for better measurement tools to assess what AI systems can actually do, particularly in hard-to-pin-down areas like intelligence and reasoning. Ziv Bar-Joseph, an expert in biology-specific AI models, agrees that evaluation of these systems is in its early stages, mirroring the infancy of interpretability itself.
Even so, AI's ability to accelerate scientific research is advancing rapidly. Upadhyay draws a parallel to the engineers who built bridges before Newton's physics: a complete understanding of AI systems may not be a prerequisite for significant real-world impact. NeurIPS has dedicated a portion of its agenda to boosting scientific discovery through AI, a track drawing growing interest from researchers eager to explore the field's potential.
The enthusiasm is palpable, with pioneers like Jeff Clune reporting a surge of interest in building AI for scientific innovation. Clune's experience reflects a broader trend, as the world increasingly recognizes AI's potential to address pressing challenges to human well-being. Even as the field races ahead, the quest for interpretability remains a central focus, driving researchers to unlock the secrets of the powerful systems they have built.