Scaling AI Projects with Dataiku Cobuild on Snowflake
💼 Business How-To

A step‑by‑step guide to building, governing and deploying enterprise AI using Dataiku’s Cobuild feature on the Snowflake data platform

Imagine you’ve just received a mountain of customer data and need insights fast, but the spreadsheets your team relies on are choking under the load. With the right combination of tools, you can turn that data into reliable AI predictions without drowning in spreadsheets or code.

1️⃣ Get your foundations ready

Before the AI magic begins, you need a solid base. Think of Snowflake as a massive, cloud‑based storage locker that grows with your data volume and serves it quickly to authorised users.

  1. Create a Snowflake account – sign up for the free trial or use your organisation’s cloud subscription.
  2. Set up a database and a schema – a database holds schemas and a schema holds tables, much like nested folders that keep your tables organised.
  3. Load your data – import CSVs, connect to your CRM, or run a simple COPY INTO command (Snowflake’s way of bulk‑loading files from a stage, such as cloud storage, into a table).
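
Steps 2 and 3 can be sketched as a handful of SQL statements. The Python helper below only builds the statement text; it is a minimal sketch, and the function name `setup_statements`, the stage name and the CSV file format options are illustrative assumptions rather than anything prescribed by Snowflake or Dataiku. It also assumes the target table and the stage already exist.

```python
import re

# Plain unquoted identifiers only, so user input cannot inject SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def _ident(name: str) -> str:
    """Return the name unchanged if it is a safe unquoted identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def setup_statements(database: str, schema: str, table: str, stage: str) -> list[str]:
    """Build the SQL for creating a database and schema and bulk-loading a table.

    Assumes the table and the named stage have already been created.
    """
    db, sch, tbl, stg = (_ident(n) for n in (database, schema, table, stage))
    return [
        f"CREATE DATABASE IF NOT EXISTS {db}",
        f"CREATE SCHEMA IF NOT EXISTS {db}.{sch}",
        # Hypothetical file format: CSV files with one header row.
        f"COPY INTO {db}.{sch}.{tbl} FROM @{db}.{sch}.{stg} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)",
    ]
```

Each returned string could then be pasted into a Snowflake worksheet or passed to a database cursor's `execute` method, one statement at a time.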

Tip: Keep a naming convention (e.g. sales_2024_q1) so you can find tables quickly later.
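
A naming convention is easier to keep when it is checked automatically. The sketch below assumes the pattern implied by the example above (subject, four‑digit year, quarter); the pattern and the function name `parse_table_name` are assumptions for illustration, so adjust them to your own convention.

```python
import re

# Assumed convention: <subject>_<year>_q<quarter>, e.g. sales_2024_q1
TABLE_NAME = re.compile(r"(?P<subject>[a-z][a-z0-9]*)_(?P<year>\d{4})_q(?P<quarter>[1-4])")


def parse_table_name(name: str):
    """Return the parts of a conforming table name, or None if it does not conform."""
    match = TABLE_NAME.fullmatch(name)
    if match is None:
        return None
    return {
        "subject": match["subject"],
        "year": int(match["year"]),
        "quarter": int(match["quarter"]),
    }
```

For example, `parse_table_name("sales_2024_q1")` yields the subject, year and quarter, while a name such as `Sales2024` is rejected with `None`.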

2️⃣ Connect Dataiku to Snowflake

Dataiku is a data‑science platform that lets non‑programmers build AI models using a visual workflow. The first time you hear “Dataiku”, think of a kitchen where the chef (you) can mix ingredients (data) without needing to know the exact recipe code.

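  1. Add a Snowflake connection in Dataiku – Snowflake is a supported connection type that an administrator sets up in the connection settings; it is the connector that speaks Snowflake’s language.
  2. Enter your Snowflake credentials – a username and password (or whichever authentication method your organisation uses), plus the account URL and the warehouse, database and role to use. Dataiku will store them securely.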
  3. Test the connection – Dataiku will fetch a list of tables so you can confirm everything works.
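
If you want to double‑check the result outside Dataiku, an equivalent manual test is to list the tables yourself. The sketch below is not Dataiku's internal check; it assumes you already have a database cursor that accepts `%s` placeholders (the default style of the Snowflake Python connector), and the function name `list_tables` is hypothetical.

```python
import re


def list_tables(cursor, database: str, schema: str) -> list[str]:
    """Return the table names visible in one schema, sorted alphabetically."""
    # The database name is interpolated into the SQL text, so validate it first.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_$]*", database):
        raise ValueError(f"unsafe database name: {database!r}")
    cursor.execute(
        f"SELECT table_name FROM {database}.information_schema.tables "
        "WHERE table_schema = %s ORDER BY table_name",
        # Unquoted Snowflake identifiers are stored in upper case.
        (schema.upper(),),
    )
    return [row[0] for row in cursor.fetchall()]
```

An empty list usually means the role you connected with cannot see the schema, which is worth ruling out before blaming the connection itself.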

3️⃣ Use Cobuild to create AI models

Cobuild is Dataiku’s “no‑code” model builder. It automatically suggests the best algorithm (like a recipe book recommending a cake flavour) based on the data you feed it.

  1. Create a new Cobuild project – choose “AI model” and point it at a Snowflake table you loaded earlier.
  2. Define your target column – the field you want the model to predict (e.g., churn_status).
  3. Select features – let Cobuild auto‑detect relevant columns or manually pick a few (e.g., last_purchase_date, region).
  4. Run the training – Cobuild will split the data into training and testing sets, train several algorithms, and pick the top performer.
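
To make step 4 less of a black box, here is a toy version of the same idea: split the rows, fit a few candidate models on the training part, score them on the held‑out part and keep the best. This is a minimal sketch in plain Python, not Cobuild's actual implementation; the two candidate "algorithms" and the numeric feature `days_since_last_purchase` are hypothetical stand‑ins chosen to keep the example short.

```python
import random


def split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows reproducibly and split them into train and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, rows, target):
    """Share of rows for which the model predicts the target value."""
    return sum(model(row) == row[target] for row in rows) / len(rows)


def fit_majority(train, target):
    """Baseline: always predict the most common label seen in training."""
    ones = sum(row[target] for row in train)
    label = int(ones * 2 >= len(train))
    return lambda row: label


def fit_threshold(feature):
    """Candidate: predict 1 when `feature` reaches a cut-off learned from training data."""
    def fit(train, target):
        best_cut, best_acc = None, -1.0
        for cut in sorted({row[feature] for row in train}):
            acc = accuracy(lambda row, c=cut: int(row[feature] >= c), train, target)
            if acc > best_acc:
                best_cut, best_acc = cut, acc
        return lambda row: int(row[feature] >= best_cut)
    return fit


def select_best(rows, target, candidates):
    """Fit every candidate on the training split and rank them on the test split."""
    train, test = split(rows)
    scores = {name: accuracy(fit(train, target), test, target)
              for name, fit in candidates.items()}
    return max(scores, key=scores.get), scores
```

Calling `select_best(rows, "churn_status", {"majority": fit_majority, "threshold": fit_threshold("days_since_last_purchase")})` returns the winning candidate's name together with every candidate's test score, which is the kind of comparison a model builder surfaces in its leaderboard.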

What’s happening behind the scenes? Cobuild uses a large language model (LLM) to interpret your instructions. The LLM is built on a transformer architecture (think of it as the AI’s internal brain that pays attention to important words). The prediction model itself is whichever algorithm performed best during the training run in step 4.

4️⃣ Govern, visualise and control

Enterprise AI needs rules so that everyone trusts the output. Dataiku offers built‑in governance tools that act like a traffic cop for data.

  • Version control – each model iteration is saved, so you can roll back if a change breaks accuracy.
  • Access rights – assign read‑only or edit permissions per user or team.
  • Model cards – automatically generated documentation that explains what data was used, how the model was trained, and its performance metrics.
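
To picture what a model card holds, the sketch below models one as a small Python data structure. The field names and the `render` layout are assumptions for illustration only; Dataiku's generated documentation will have its own format.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """Hypothetical minimal model card: what was trained, on what, and how well."""
    model_name: str
    version: int
    training_table: str
    target: str
    algorithm: str
    metrics: dict = field(default_factory=dict)

    def render(self) -> str:
        """Return the card as plain text, one fact per line."""
        lines = [
            f"Model: {self.model_name} (v{self.version})",
            f"Training data: {self.training_table}",
            f"Target: {self.target}",
            f"Algorithm: {self.algorithm}",
        ]
        lines += [f"{name}: {value:.3f}" for name, value in sorted(self.metrics.items())]
        return "\n".join(lines)
```

Storing the version number alongside the metrics is what makes a rollback decision straightforward: you can compare two cards line by line.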

Snowflake adds another layer of visibility with query history and audit logs – you can see who accessed which table and when, which gives auditors the evidence trail that many compliance reviews ask for.
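
As a rough illustration of such a lookup, the sketch below searches Snowflake's `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY` view for recent queries that mention a table. It assumes a database cursor that accepts `%s` placeholders and a role allowed to read that view; the function name `who_queried` is hypothetical. Matching on query text is only a heuristic, and the view is updated with some delay, so treat the result as a starting point rather than a complete audit.

```python
def who_queried(cursor, table_name: str, days: int = 7):
    """Return (user, start time, query text) for recent queries mentioning a table."""
    cursor.execute(
        "SELECT user_name, start_time, query_text "
        "FROM snowflake.account_usage.query_history "
        "WHERE query_text ILIKE %s "
        "AND start_time >= DATEADD(day, %s, CURRENT_TIMESTAMP()) "
        "ORDER BY start_time DESC",
        # A negative offset makes DATEADD look back `days` days from now.
        (f"%{table_name}%", -abs(days)),
    )
    return cursor.fetchall()
```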

5️⃣ Deploy and monitor

Once the model passes validation, push it to production with a single click. Dataiku exposes the model as a REST API (a web address that other software can call). You can then register that endpoint as a Snowflake External Function – a feature that lets a SQL query call a service running outside Snowflake and use its response as the function result.
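
When Snowflake calls an external function, it sends rows in batches as JSON: a `data` array in which each row starts with a row number followed by the function arguments, and it expects the same shape back with one result per row number. The sketch below shows that translation step under the assumption that a small adapter sits in front of the model; `handle_batch` and the `predict` callable are hypothetical names, and the Dataiku endpoint itself may expect a different request shape.

```python
import json


def handle_batch(body: str, predict) -> str:
    """Turn one Snowflake external-function request into a matching response.

    `predict` is any callable that takes the function arguments for one row
    and returns a JSON-serialisable prediction.
    """
    rows = json.loads(body)["data"]
    # Keep each row number so Snowflake can match results to input rows.
    results = [[row[0], predict(*row[1:])] for row in rows]
    return json.dumps({"data": results})
```

Keeping the row numbers intact matters more than the order of the rows: Snowflake uses them to line up each prediction with the query row that requested it.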

  1. Create an API endpoint – Dataiku generates a URL like https://mycompany.dataiku.io/api/v1/predict.
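  2. Register the endpoint in Snowflake – create an API integration (Snowflake’s record of which external endpoints it is allowed to call), then use the CREATE EXTERNAL FUNCTION command; now you can call predict_churn(customer_id) straight from a Snowflake query.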
  3. Set up alerts – Dataiku’s monitoring dashboard can email you if model accuracy drops below a threshold you define.
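
Steps 2 and 3 can be sketched in a few lines. The first helper builds a CREATE EXTERNAL FUNCTION statement; it assumes an API integration already exists, a single text argument and a VARIANT return type, and in practice the URL is usually that of the API gateway placed in front of the model. The second helper shows the logic of a threshold alert. Both function names are hypothetical and neither reflects how Dataiku implements these features.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def external_function_sql(name: str, arg: str, integration: str, url: str) -> str:
    """Build a CREATE EXTERNAL FUNCTION statement for a one-argument function."""
    for ident in (name, arg, integration):
        if not _IDENTIFIER.fullmatch(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    if not url.startswith("https://") or "'" in url:
        raise ValueError("the endpoint must be an https URL without quotes")
    return (
        f"CREATE OR REPLACE EXTERNAL FUNCTION {name}({arg} VARCHAR)\n"
        "  RETURNS VARIANT\n"
        f"  API_INTEGRATION = {integration}\n"
        f"  AS '{url}'"
    )


def needs_alert(correct: int, total: int, threshold: float) -> bool:
    """Return True when measured accuracy has fallen below the chosen threshold."""
    if total == 0:
        return False  # nothing scored yet, so there is nothing to alert on
    return correct / total < threshold
```

For instance, `needs_alert(80, 100, 0.85)` is true because 80% accuracy is below an 85% threshold, which is the moment you would want the email to go out.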

Wrap‑up

Scaling AI doesn’t have to be a technical maze. By pairing Snowflake’s flexible data warehouse with Dataiku’s Cobuild visual builder, you gain a repeatable, governed workflow that anyone in the team can manage. Today, try loading a modest dataset into Snowflake, connect it to Dataiku, and run a simple Cobuild project – you’ll see how quickly raw data can turn into actionable predictions. Happy building!

✦ Original guide written by AI World Co.'s own AI editorial team. Reviewed for accuracy and clarity.
