Case Study 6 min read

AI-Driven Double Materiality: Benchmarking Autonomous Agents Against Human Experts

MF
Mahdi Farimani 22 May 2026
AI-Driven Double Materiality: Benchmarking Autonomous Agents Against Human Experts

AI-Driven Double Materiality: Benchmarking Autonomous Agents Against Human Experts

In corporate sustainability, the Double Materiality Assessment (DMA) is the bedrock of compliance with the EU's Corporate Sustainability Reporting Directive (CSRD) and the European Sustainability Reporting Standards (ESRS). However, a comprehensive DMA typically requires hundreds of hours of consultant-led interviews, stakeholder surveys, and qualitative alignment sessions.

To address this bottleneck, ExecutESG, the Hanken School of Economics (Sustainability Transformation Lab), and Ab Stormossen Oy conducted a joint academic study. The project was executed by a dedicated team of master's students who served as the lead researchers and main contributors of this study: Aleksandra Uzunova, Wen Zhang, Henrik Skjolden, and Trisha Shenoy, under the academic guidance of Professor Martin Fougère and Nikodemus Solitander.

πŸŽ“ Hanken Student Research Team & Lead Contributors

This landmark academic study was successfully conducted by the following master's students as the primary researchers and core contributors of this project:

  • Aleksandra Uzunova (Master's Student, Hanken)
  • Wen Zhang (Master's Student, Hanken)
  • Henrik Skjolden (Master's Student, Hanken)
  • Trisha Shenoy (Master's Student, Hanken)

Academic Guidance & Advising: Professor Martin Fougère and Nikodemus Solitander (Sustainability Transformation Lab, Hanken Business School).

The core research question evaluated was:

Can autonomous, persona-driven AI agents produce a Double Materiality Assessment of comparable quality to human sustainability experts, suitable for board-level decision-making?

Here is a comprehensive breakdown of the methodology, execution, and most important findings of this landmark project.


πŸ“‹ Operational Context & Grounding

A major criticism of generic AI in corporate compliance is the risk of "hallucinations" or high-level generic outputs. To prevent this, the ExecutESG platform grounded its autonomous agents directly in Stormossen's operational realities:

  • Core Facilities: A state-of-the-art biogas production plant fueling Vaasa's public transport system, and Minimossen, the first recycling shopping mall in the Nordics.
  • Environmental Realities: Methane leakage risks, odor management issues in neighboring municipalities, and a strict 65% municipal recycling rate target.
  • Corporate Persona Models: We ran the DeepSeek API (deepseek-chat) using 8 distinct corporate personas representing key operational functions:
    1. Chief Executive Officer (CEO)
    2. HR Director / Occupational Health & Safety Lead
    3. Development Manager
    4. Chief Operating Officer (COO)
    5. Procurement / Value Chain Coordinator
    6. Financial Controller
    7. Communications / Public Relations Manager
    8. Environmental Compliance Officer

πŸ› οΈ The Pairwise Consensus Methodology

Rather than asking the AI to "write a materiality report," ExecutESG deployed an advanced automated pairwise consensus mechanism:

  1. IRO Generation: The grounded personas automatically generated a highly detailed, operationally focused matrix of 844 Impacts, Risks, and Opportunities (IROs).
  2. Pairwise Stakeholder Trade-offs: Using automated Playwright scripts, we simulated 15 distinct stakeholder groups (including municipal owners, environmental regulators, local residents, trade unions, and industrial partners).
  3. Forced Comparison: Stakeholder agents performed forced pairwise comparisons ("Which issue is more significant: A or B?"). This comparison created a mathematical Win Rate for every IRO.
  4. Calibration: These win rates were mathematically mapped and calibrated directly into the standard 1–5 ESRS significance ratings, removing human bias and subjective fatigue.

πŸ“ˆ Key Findings & Outcomes

The academic study benchmarked the AI-driven DMA against Stormossen's actual, recently completed human-led DMA. The outcomes were staggering:

  • Alignment of Material Topics: The AI-driven assessment identified the exact same set of core material topics (particularly in Resource Use & Circular Economy, Pollution, and Workforce) as the human sustainability committee.
  • Granularity & Coherence: The AI identified 844 operational IROs, which were subsequently linked directly to Stormossen's 37 active strategic projects. This bridging showed how compliance matches execution.
  • Speed & Cost Efficiency: While the human-led DMA took approximately three months and tens of thousands of Euros in advisory fees, the calibrated AI-driven consensus ran in under two hours at a fraction of the cost.
  • Commercial and Board Readiness: The resulting materiality matrix was validated by the Hanken research team as fully coherent, rigorous, and completely suitable for board-level strategic planning.

πŸ“‚ Downloads & Full Reports

You can download the official presentation and complete academic report here:


This case study proves that AI-driven pairwise consensus mechanisms match or exceed human-led assessments in coherence, and serves as our core validation for the ExecutESG platform.

πŸͺ Your Privacy Options

We use strictly necessary cookies to keep you signed in and protect your session. With your explicit consent, we also use analytics cookies (Google Analytics GA4) to improve our service. You can choose to accept all cookies or only allow essential ones. Read our Privacy Policy.