Today is day 1 of Small Data SF!!! We'll kick things off with a day of hands-on workshops.
From Zero to Query: Building Your First Serverless Lakehouse with DuckLake - Jacob Matson from MotherDuck walks through creating a serverless lakehouse with DuckLake, covering ACID transactions, time travel, and schema evolution.
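To give a flavor of what the workshop covers, here is a minimal sketch of DuckLake basics via the DuckDB Python client. This is not Jacob's workshop code: it assumes a recent duckdb with the ducklake extension available, runs against a local catalog file rather than MotherDuck, and all file, table, and version names are illustrative.

```python
import duckdb

con = duckdb.connect()
con.sql("INSTALL ducklake;")
con.sql("LOAD ducklake;")

# Attach a DuckLake catalog (a local file here; the workshop uses MotherDuck).
con.sql("ATTACH 'ducklake:demo.ducklake' AS lake (DATA_PATH 'lake_files/');")
con.sql("USE lake;")

# Each statement is an ACID transaction; every write creates a new snapshot.
con.sql("CREATE TABLE events (id INTEGER, payload VARCHAR);")
con.sql("INSERT INTO events VALUES (1, 'first');")
con.sql("ALTER TABLE events ADD COLUMN source VARCHAR;")  # schema evolution
con.sql("INSERT INTO events VALUES (2, 'second', 'api');")

# Time travel: query the table as of an earlier snapshot
# (the exact version number depends on your snapshot history).
print(con.sql("SELECT * FROM events AT (VERSION => 1);"))
```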
Stop Measuring LLM Accuracy, Start Building Context - Tahlia DeMaio from Hex argues that context, not accuracy, is the real challenge in LLM systems and shows how to build context-aware analytical workflows.
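The talk's core point can be rendered as a toy sketch: the same model call succeeds or fails depending on the context supplied, not on the model's raw accuracy. Everything here is made up for illustration, including the build_prompt helper.

```python
def build_prompt(question: str, context: str | None) -> str:
    """Assemble an analyst prompt, optionally grounded in business context."""
    schema_block = f"\nSchema and definitions:\n{context}" if context else ""
    return f"You are an analyst.{schema_block}\nQuestion: {question}"

question = "What was churn last month?"

# Without context, the model must guess what 'churn' means in this business.
bare = build_prompt(question, None)

# With context, the ambiguity is resolved before the model ever answers.
grounded = build_prompt(
    question,
    "churn = customers whose subscription lapsed; table: billing.subscriptions",
)
```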
Keep it Simple and Scalable: pythonic ELT using dltHub - Thierry Jean from dltHub, joined by Brian Douglas from Continue and elvis kahoro from Chalk, teaches how to build Python-based data ingestion and transformation pipelines.
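As a taste of the pattern, here is a minimal dlt pipeline sketch. The resource, records, and dataset names are made up, and the MotherDuck destination assumes credentials are configured per dlt's docs (swap in "duckdb" to run fully locally).

```python
import dlt

@dlt.resource(table_name="pokemon", write_disposition="replace")
def pokemon():
    # dlt infers and evolves the schema from the yielded records.
    yield from [
        {"id": 1, "name": "bulbasaur"},
        {"id": 2, "name": "ivysaur"},
    ]

pipeline = dlt.pipeline(
    pipeline_name="workshop_demo",
    destination="motherduck",   # or "duckdb" for a local run
    dataset_name="raw",
)
info = pipeline.run(pokemon())
print(info)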
Composable Data Workflows: Building Pipelines That Just Work - Dennis Hume from Dagster Labs covers practical patterns for building reliable, modular pipelines that scale from laptop to production.
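Dagster's asset pattern is the kind of composability the session is about: small steps whose dependency graph comes straight from function signatures. A minimal sketch with illustrative names, not Dennis's workshop code:

```python
import dagster as dg

@dg.asset
def raw_orders() -> list[dict]:
    # In a real pipeline this would pull from an API or warehouse.
    return [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": 17.5}]

@dg.asset
def order_totals(raw_orders: list[dict]) -> float:
    # Depends on raw_orders; Dagster wires the graph from the parameter name.
    return sum(o["amount"] for o in raw_orders)

defs = dg.Definitions(assets=[raw_orders, order_totals])

if __name__ == "__main__":
    # Materialize the whole graph locally; the same code scales to production.
    dg.materialize([raw_orders, order_totals])
```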
Open Data Science Agent - Zain Hasan from Together AI shows how to build an autonomous data science agent using open-source models and the ReAct framework for end-to-end analysis tasks.
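For readers new to ReAct, the loop is: the model reasons in text, names an action, observes the tool's result, and repeats. A bare-bones skeleton of that loop follows; call_llm is a placeholder for any open-model chat completion, and the single toy tool stands in for real data science tools.

```python
import re

def call_llm(prompt: str) -> str:
    """Placeholder: send the prompt to an open model and return its text."""
    raise NotImplementedError("wire this to your model endpoint")

TOOLS = {
    "word_count": lambda text: str(len(text.split())),
}

SYSTEM = (
    "Answer the question. Think step by step. To use a tool, emit a line "
    "'Action: <tool>[<input>]'. When done, emit 'Final Answer: <answer>'."
)

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"{SYSTEM}\nQuestion: {question}\n"
    for _ in range(max_steps):
        reply = call_llm(transcript)          # Reason: model thinks in text
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[-1].strip()
        match = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if match:                             # Act: run the named tool
            tool, arg = match.groups()
            observation = TOOLS[tool](arg)
            transcript += f"Observation: {observation}\n"  # Observe
    return "(no answer within step budget)"
```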
Duck, duck, deploy: Building an AI-ready app in 2 hours - Russell Garner and Rebecca Bruggman from Omni start with a MotherDuck dataset and build a production-ready analytics app using Omni's semantic model and APIs.
From Parsing Nightmares to Production - Upal Saha from bem demonstrates how to transform any unstructured input (PDFs, images, audio, etc.) into clean JSON and load it directly into MotherDuck.
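The extraction itself is bem's job, but the last hop is plain DuckDB: once you have clean JSON, landing it in MotherDuck is a few lines. The 'md:' connection string is real MotherDuck syntax; the database, file, and table names here are illustrative.

```python
import duckdb

# Requires a MotherDuck token (e.g. via the MOTHERDUCK_TOKEN env var).
con = duckdb.connect("md:my_db")

# read_json_auto infers a schema from the extracted records.
con.sql("""
    CREATE OR REPLACE TABLE invoices AS
    SELECT * FROM read_json_auto('extracted_invoices.json');
""")
print(con.sql("SELECT COUNT(*) FROM invoices;"))
```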
Just-in-Time Insights with Estuary - Zulfikar Qureshi from Estuary provides hands-on experience with real-time data streaming, including a lab exercise streaming live data into MotherDuck.