
Challenge
Earth Security helps organisations navigate the carbon credit market. Their research team identifies which companies are most likely to invest in specific types of carbon projects — from mangrove restoration to clean energy. The intelligence they need is buried across sustainability reports, annual filings, and regulatory disclosures. Each research cycle used to mean weeks of manual document review.
The brief was tight: build an in-house analysis tool that helps their analysts identify leads and opportunities — discovery to working MVP, under £10k.
Approach
I worked with the analysts through a discovery phase to define the data points, data sources, and queries that would actually move the needle. We landed on 60+ fields per company — climate commitments, carbon credit history, biodiversity targets, financial instruments, SBTi status. Each extraction prompt was specified with validation criteria so the LLM pipeline could reliably pull structured data from unstructured reports.
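Those validation criteria are what make LLM extraction dependable: a record only enters the dataset if every field passes its checks. A minimal sketch of that gate, assuming hypothetical field names and allowed values (this is illustrative, not the production code):

```python
# Illustrative validation gate for one LLM-extracted company record.
# Field names and allowed values are hypothetical examples.
ALLOWED_SBTI_STATUSES = {"committed", "targets set", "none"}

def validate_record(record: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    # Required fields must be present and non-empty.
    for field in ("company_name", "sbti_status", "climate_commitments"):
        if not record.get(field):
            errors.append(f"missing field: {field}")
    # Enumerated fields must use a known value so downstream scoring stays reliable.
    status = record.get("sbti_status", "").lower()
    if status and status not in ALLOWED_SBTI_STATUSES:
        errors.append(f"unknown sbti_status: {status!r}")
    # Every extracted claim must carry a source quote so analysts can verify it.
    for claim in record.get("climate_commitments", []):
        if "source_quote" not in claim:
            errors.append("commitment missing source_quote")
    return errors

record = {
    "company_name": "Acme Plc",
    "sbti_status": "Committed",
    "climate_commitments": [{"text": "Net zero by 2040", "source_quote": "..."}],
}
print(validate_record(record))  # → []
```

Records that fail any check get flagged for re-extraction rather than silently entering the dataset.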
We started with their real research questions, not a product roadmap. The data points, sources, and scoring logic were defined with the analysts who’d use the tool every day.
Built from 7 sources, and growing.
Firmographics, corporate hierarchy, climate commitments, carbon credit history, and contact data — stitched together automatically so analysts never start from a blank spreadsheet.
Sixty data points sounds like a lot. It’s the minimum. Every field maps to a question their analysts were already asking manually — we just made the answers show up before they had to go looking.
The core of the tool is a vector store built on FAISS, indexing sustainability disclosures with OpenAI embeddings. LangChain orchestrates the RAG pipeline — from document ingestion through to the final matching output. Each company is scored against project types with evidence pulled directly from source documents.
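The matching step reduces to nearest-neighbour similarity: embed each disclosure passage and each project-type description with the same model, then keep the best-scoring passage as the evidence behind each score. A toy sketch of that logic — the `embed` function below is a keyword-counting stand-in for the OpenAI embeddings, and the passages and project types are invented:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy stand-in for the embedding call: counts over a tiny fixed vocabulary,
# plus a constant bias dimension so no vector is ever all zeros.
VOCAB = ["mangrove", "restoration", "solar", "energy", "biodiversity"]

def embed(text: str) -> list[float]:
    words = text.lower().replace(".", "").split()
    return [1.0] + [float(words.count(w)) for w in VOCAB]

def score_company(passages: list[str], project_types: dict[str, str]) -> dict:
    """Score one company's passages against each project type, keeping
    the best-matching passage as the evidence for that score."""
    results = {}
    for project, description in project_types.items():
        pvec = embed(description)
        best = max(passages, key=lambda p: cosine(embed(p), pvec))
        results[project] = {
            "score": round(cosine(embed(best), pvec), 3),
            "evidence": best,
        }
    return results

passages = [
    "We fund mangrove restoration along the coast.",
    "Our factories run on solar energy.",
]
project_types = {
    "mangrove restoration": "mangrove restoration projects",
    "clean energy": "solar energy projects",
}
print(score_company(passages, project_types))
```

In production the vectors come from OpenAI's embedding model and live in a FAISS index, so the search scales past a handful of passages, but the shape of the output is the same: a score per project type with the source passage attached.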
Analysts query the vector store in natural language through a Streamlit interface. No SQL, no filters — just ask the question. The system returns ranked matches, each backed by the evidence behind the score.
The best tool is the one your team actually uses. Natural language search meant analysts didn’t need to learn SQL or wrestle with filters — they just asked the question the same way they’d ask a colleague.
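The query path runs the same similarity search in the other direction: embed the analyst's question, rank the indexed passages against it, and return the top hits with their source attached; the Streamlit layer just wraps this in a text box. A stdlib sketch with a crude keyword-overlap stand-in for embedding similarity (companies, passages, and sources below are invented):

```python
def rank_matches(question: str, index: list[dict], top_k: int = 3) -> list[dict]:
    """Rank indexed passages by keyword overlap with the question.
    A toy stand-in for embedding similarity search against a FAISS index."""
    q_words = set(question.lower().replace("?", "").split())
    scored = []
    for entry in index:  # each entry: {"company", "passage", "source"}
        p_words = set(entry["passage"].lower().rstrip(".").split())
        overlap = len(q_words & p_words) / len(q_words)
        scored.append({**entry, "score": round(overlap, 2)})
    scored.sort(key=lambda e: e["score"], reverse=True)
    return scored[:top_k]

index = [
    {"company": "Acme Plc",
     "passage": "Acme funds mangrove restoration in Kenya",
     "source": "Acme Sustainability Report 2023, p. 14"},
    {"company": "Globex Ltd",
     "passage": "Globex invests in solar energy",
     "source": "Globex Annual Report 2023, p. 7"},
]
hits = rank_matches("Who is investing in mangrove restoration?", index, top_k=1)
print(hits[0]["company"])  # → Acme Plc
```

Because every indexed passage keeps its source reference, the ranked answer always arrives with a citation the analyst can check.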
Result
Discovery to MVP in 2 months, under £10k. What used to take weeks of manual document review now takes seconds.
The tool became a core part of the research workflow, giving analysts structured, enriched company intelligence that surfaces the right prospects before they go looking.