Walton Family Foundation
The Vision
The Walton Family Foundation’s Environment Program funds thousands of nature-based solutions (NBS) across the U.S. to mitigate climate risks and improve river and ocean health. Tracking these on-the-ground projects is critical for understanding portfolio impact. However, the data surrounding these projects was largely captured in diverse, unstructured narrative reports and PDFs managed by dozens of different partner organizations.
The Solution
The Foundation needed to aggregate this data to answer portfolio-level questions (e.g., “Show a map of all projects in Louisiana”) without forcing grantees to abandon their established reporting workflows. We built a two-phase approach to extract structured data from narrative reports:
- LLM-Powered Data Extraction: Rather than forcing rigid data-entry systems on grantees, we deployed Large Language Models at the ingestion layer to read and synthesize the rich backlog of narrative progress reports. The LLMs automatically extracted and standardized core data fields—unique identifiers, geographic locations, grant amounts, and projected environmental outcomes (e.g., acres restored, water saved)—transforming unstructured text into a clean schema.
- Standardized Data Warehousing: We established a modern data warehousing workflow on Google Cloud Platform. The extracted data flows directly into BigQuery, which provides a robust, spatially enabled backend. We then connected BigQuery to Looker Studio to build interactive, real-time dashboards.
The Outcome
The Foundation now has a dynamic, automated window into its vast portfolio of nature-based solutions. Program staff can instantly visualize geographic spread, analyze federal funding synergies, and aggregate macro-level ecological benefits. By tapping directly into the wealth of data grantees were already providing in text format, the Foundation gained portfolio-level insight that had previously been locked inside narrative reports.
The takeaway: Narrative grantee reporting is a treasure trove of institutional knowledge. Rather than treating varied reporting formats as a data quality issue to be “fixed” with rigid new software, organizations can use LLMs at the ingestion layer to meet grantees where they are.