AI-Driven Double Materiality: Benchmarking Autonomous Agents Against Human Experts
In corporate sustainability, the Double Materiality Assessment (DMA) is the bedrock of compliance with the EU's Corporate Sustainability Reporting Directive (CSRD) and the European Sustainability Reporting Standards (ESRS). Typically, a comprehensive DMA requires hundreds of hours of consultant-led interviews, stakeholder surveys, and qualitative alignment sessions.
To evaluate whether technology can streamline this bottleneck, ExecutESG, the Hanken School of Economics (Sustainability Transformation Lab), and Ab Stormossen Oy conducted a joint academic study.
The project was executed by a dedicated team of master's students who served as the lead researchers and main contributors: Aleksandra Uzunova, Wen Zhang, Henrik Skjolden, and Trisha Shenoy, under the academic guidance of Professor Martin Fougère and Nikodemus Solitander.
🎓 Hanken Student Research Team & Lead Contributors
The following master's students were the primary researchers and core contributors:
- Aleksandra Uzunova (Master's Student, Hanken)
- Wen Zhang (Master's Student, Hanken)
- Henrik Skjolden (Master's Student, Hanken)
- Trisha Shenoy (Master's Student, Hanken)
Academic Guidance & Advising: Professor Martin Fougère and Nikodemus Solitander (Sustainability Transformation Lab, Hanken School of Economics).
The core research question evaluated was:
Can autonomous, persona-driven AI agents produce a Double Materiality Assessment of comparable quality to human sustainability experts, suitable for board-level decision-making?
To answer this, the research team set up a side-by-side comparison of three completely distinct methodologies applied to Stormossen, a municipal waste management and circular economy actor in Vaasa, Finland.
📊 The Three Methodologies Compared
The study benchmarked three different DMA processes, each leveraging different inputs, degrees of stakeholder involvement, and levels of automation:
| Feature / Dimension | 1️⃣ Human-Driven DMA | 2️⃣ ExecutESG Custom AI DMA | 3️⃣ ChatGPT-Driven DMA (GPT-4) |
|---|---|---|---|
| Primary Inputs | Stakeholder workshops, pre-task surveys, manual discussions | Internal strategic presentations, academic papers, regulatory guidelines | Publicly available company website info, general sector assumptions |
| Stakeholder Input | Lived, qualitative input from 15 distinct stakeholder groups | Simulated by 8 distinct management personas (CEO, COO, HR, etc.) | None (zero-shot generative profiling) |
| Prioritization Method | Stakeholder voting rounds and group consensus | Proprietary Pairwise Comparison Engine (Playwright automation) | Direct top-down AI scoring guesses |
| Timeline | ~3 months | ~2 hours | < 15 minutes |
📈 Key Findings: Merits & Limitations of Each Approach
The study revealed that all three approaches converged on the same core material topics (specifically Circular Economy, Pollution, Climate Change, and Own Workforce). However, they differed significantly in how materiality was constructed, yielding distinct strengths and limitations:
1️⃣ Human-Driven DMA
- Merits: Captures exceptional operational nuance and local context. For example, humans identified that Stormossen's environmental success depends heavily on customer sorting habits and communication logistics—insights that help design effective physical interventions. It also distinguishes differing weights and viewpoints across stakeholder sub-groups (Board vs. local residents).
- Limitations: Highly resource-intensive, taking months of facilitation and incurring substantial advisory costs.
- Strategic Value: The process itself builds essential organizational buy-in and psychological alignment. Discussing, negotiating, and voting on material topics prepares the board and management team to execute the resulting strategy.
2️⃣ ExecutESG Custom AI-Driven DMA
- Merits: Grounded in internal documents, which curbs the generic hallucinations of an ungrounded LLM, and highly systematic. It excelled at linking environmental impacts to financial consequences, for instance by mapping groundwater contamination risks to specific liabilities and insurance premium hikes. It is also fast, running in hours, and structures hundreds of impacts, risks, and opportunities (IROs) consistently.
- Limitations: Treats the company as a single unified entity without the qualitative stakeholder granularity of human workshops. It also requires secondary human double-checks to correct occasional classification errors (e.g., misclassifying an own-workforce impact under circular economy).
- Proprietary Advantage: Rather than asking a model to score each issue directly, ExecutESG uses a forced pairwise comparison algorithm. Agents must make explicit trade-offs ("Which issue is more significant: A or B?"), and the share of comparisons each IRO wins becomes its "Win Rate", which is calibrated to the standard 1–5 ESRS significance ratings. This avoids the rater fatigue and arbitrary absolute scores that affect manual scoring.
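The pairwise idea can be illustrated with a minimal Python sketch. This is not ExecutESG's engine, which is proprietary and not described in detail here. The function names, the `prefer` callback, the placeholder items, and the equal-width mapping from win rate to a 1–5 rating are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Callable, Dict, List


def win_rates(items: List[str], prefer: Callable[[str, str], str]) -> Dict[str, float]:
    """Compare every pair once and return each item's share of wins.

    `prefer(a, b)` is a hypothetical judge (for example, an AI persona)
    that returns whichever of the two items it considers more significant.
    """
    wins: Dict[str, int] = defaultdict(int)
    seen: Dict[str, int] = defaultdict(int)
    for a, b in combinations(items, 2):
        wins[prefer(a, b)] += 1
        seen[a] += 1
        seen[b] += 1
    # An item that was never compared (single-item list) gets a win rate of 0.0.
    return {i: (wins[i] / seen[i] if seen[i] else 0.0) for i in items}


def to_rating(win_rate: float) -> int:
    """Map a win rate in [0, 1] to a 1-5 rating using equal-width bins.

    Assumption: the real calibration is not published; this is one simple option.
    """
    return min(5, 1 + int(win_rate * 5))


# Placeholder items with made-up significance values, used only to drive the judge.
hypothetical_scores = {"A": 4, "B": 3, "C": 2, "D": 1}
judge = lambda a, b: a if hypothetical_scores[a] >= hypothetical_scores[b] else b

rates = win_rates(list(hypothetical_scores), judge)
ratings = {item: to_rating(rate) for item, rate in rates.items()}
```

With several personas, one plausible extension is to pool the comparisons from every persona before computing win rates, so that each rating reflects all judgements rather than a single viewpoint.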
3️⃣ ChatGPT-Driven DMA (GPT-4)
- Merits: Offers a very low barrier to entry for a first-pass ESRS-aligned baseline structure. It is highly capable of generating clean narrative summaries and structured tables.
- Limitations: Lacks connection to company-specific realities, local operational nuances, or internal data. Without stakeholder validation, it remains generalized.
🧠 The Conclusion: AI as a Helper, Not a Replacement
The Hanken report concludes that AI-driven DMAs cannot fully replace human participation.
Because double materiality is fundamentally an interpretive and consensus-building process, human judgement is necessary to validate findings, resolve conflicting interests, and—most importantly—translate reporting outputs into actual strategic execution. Lived stakeholder experiences cannot be simulated entirely in a database.
However, the study finds that an AI-driven DMA adds substantial value as a helper and complement to human work:
- For Resource-Constrained Companies (VSMEs): Smaller businesses often lack the budget and time for months of consultant-led workshops. For these companies, an AI-driven DMA is highly relevant, providing a structured, compliant, and rigorous baseline ("just to have something solid") to kickstart their sustainability journey.
- As a Hybrid Workflow: For larger organizations, the optimal approach is a hybrid model. Use AI to ingest company documents, run initial context screenings, and generate candidate topic longlists. Then, use human workshops to validate, prioritize, and build organizational alignment around those topics.
By combining the structural efficiency and mathematical consistency of ExecutESG's pairwise consensus engine with the qualitative wisdom of human stakeholders, companies can produce superior, audit-ready disclosures in a fraction of the time.
📂 Downloads & Full Reports
You can download the official presentation and complete academic report here: