Thursday, 29 January 2026

Slides for Deconstructing an LLM: What data is inside?

Slides

On 29 January 2026 I gave a talk on Data for AI at the Enterprise Tech Monthly (ETM) meetup, hosted by LSEG in London.

Here are the slides: Deconstructing an LLM: What data is inside?

Event information: https://www.eventbrite.co.uk/e/deconstructing-an-llm-what-data-is-inside-tickets-1980199014558 (sign up for future ETM events!)

I gave a 10-minute version of the talk last summer: https://youtu.be/hhbH375tV84

Get Involved!

Here are some links to the topics I talked about:

Open Data Institute (ODI) Solid (Social Linked Data)

Email the Solid team at the Open Data Institute on solid@theodi.org

Or read more at https://solidproject.org/ or https://theodi.org/what-we-do/solid/

EDM Association: AI, Data & Analytics Controls (ADAC)

The ADAC Working Group meets on Mondays at 3pm UK time.  Contact Oli on LinkedIn to be added to the calendar invite, or register directly here:

EDM Association ADAC Weekly Zoom meeting link:

https://us06web.zoom.us/j/85769777904?pwd=Jy0NrVbU1eO1MfCR87qKLQpzwzrZQt.1

Please click the link to register in advance for your individual Zoom meeting link.

EDM Association: Data Products (DPROD)

The DPROD group meets weekly on Thursdays at 4.30pm UK time.  Contact Tony Seale to be invited to the Zoom.

FinOS CALM & CCC

Fintech Open Source Foundation (FinOS) - Common Architecture Language Model (CALM)

The Architecture as Code working group develops CALM and meets monthly on Tuesdays, 4-5pm.  See the FinOS community calendar for the registration link.
CALM repo: https://github.com/finos-labs/architecture-as-code

Common Cloud Controls (CCC)

A FinOS project for establishing a common set of automated cloud controls for regulated industries.
Introductory white paper: https://github.com/finos/common-cloud-controls/blob/main/docs/Citi-Contributed-White-Paper/Financial-Services-Common-Cloud-Controls-Standard-v1.0-(for%20publication).pdf

Total Neural Enterprises (TNE.ai)

TNE.ai build enterprise AI that uses CDMC, ADAC and Ai2 OLMo.  Get in touch with Oli Bage on LinkedIn to learn more.

Allen Institute for AI (Ai2) builds the Open Language Model (OLMo) family of frontier models, using open training data.  https://allenai.org/olmo

I traced the lineage of the training data using Solidatus, an industry-leading lineage visualisation tool.  Learn more about Solidatus here: https://www.solidatus.com/

Play with the OLMo2 lineage model, hosted on the EDMConnect instance of Solidatus (registration for EDMConnect is free, and many corporates are already members):
https://edmc.solidatus.com/viewer/685bfbd05c54812d123d1dc8



Thursday, 22 January 2026

Helpful Links: CDMC, ADAC, DPROD, Solid, CALM, CCC, OLMo, ODI, TNE, ETM

I am involved in a wide range of industry activity.  Here are the useful links if you want to find out more:

🌐 CDMC - Cloud Data Management Capabilities, an EDM Council data management best-practices framework with automations for cloud.
CDMC guide: https://edmcouncil.org/frameworks/cdmc/
CDMC Spec: https://github.com/finos/compliant-financial-infrastructure/files/7650814/CDMC_Framework_V1.1.1.pdf

🌐 ADAC - AI, Data & Analytics Controls, an EDM Council group developing an extension of CDMC to cover AI use case governance.
Overview doc: https://docs.google.com/document/d/1q-QKXrjNy7aSRKOjS2-fMtzaOw5_TWRO9Lg4tDSdJtU 
OLMo model from AI2 (ADAC has traced the lineage of the training data): https://allenai.org/olmo

🌐 DPROD - An OMG & EKGF project to extend W3C DCAT to enable Data Product sharing across an enterprise and its data partners.
DPROD spec: https://ekgf.github.io/data-product-spec/dprod


🌐 Solid - Tim Berners-Lee’s W3C and Open Data Institute project to enable individuals to own and manage their personal data, stored in secure data wallets.
Solid homepage: https://solidproject.org/

🌐 Architecture as Code - FinOS project to establish a machine-readable Common Architecture Language Model (CALM) to enable automation.
CALM v1.0 announcement: https://lists.finos.org/g/toc/topic/finos_community_calm_v1_0/114686088
CALM repo: https://github.com/finos-labs/architecture-as-code


🌐 CCC - FinOS project for establishing Common Cloud Controls. We are working to reuse the CDMC controls in this wider framework.
Introductory white paper: https://github.com/finos/common-cloud-controls/blob/main/docs/Citi-Contributed-White-Paper/Financial-Services-Common-Cloud-Controls-Standard-v1.0-(for%20publication).pdf


🌐 Cloud - I have spoken at several conferences on the top lessons learned about at-scale cloud migration: https://www.infoq.com/presentations/lseg-cloud-lessons/

🌐 Enterprise Tech Meetup - I regularly co-host the London Enterprise Tech Monthly meetup with Ian Ellis.

🌐 Emerging Tech Portfolio companies & advisory relationships:
I am an advisor and angel investor for a number of high-potential data companies.
Honeydew - semantic layer for AI & BI: https://honeydew.ai/
Lunar - The Control Layer for Enterprise Agents (inc MCP & API gateway): https://www.lunar.dev/
MagicOrange - advanced ITFM & FinOps platform: https://www.magicorange.com/
Deontic Data - AI for market data licences: https://www.deonticdata.com/