How to: use code components to pull API data into a Bragi pipeline

Bragi pipelines support custom functionality when SQL alone isn’t enough, whether that’s scraping from an API, transforming with external logic, or integrating with a non-SQL system.

In this example, we’ll use Bragi’s code component to scrape public transport data from an API, combine it with local weather metrics from a CSV file, and prep the data for further modelling. The result is a staged table that can be used to explore possible links between rainfall, temperature, and bus usage.

code component fetching bus usage data

Introduction

We’ll walk through how to set up a code component to scrape data from data.gg, archive the results, load a static weather CSV file, and stage the two datasets together for analysis.

Tips for writing code components in Bragi:

✏️ Use bragiCodeUtil.GetHttpClientFactory() to obtain a preconfigured, secure HTTP client (handles proxies, headers, and security best practices automatically).

✏️ Your code should return a DataTable, not raw JSON. Bragi will automatically store, track, and version the data with full observability, so you don’t need to implement any custom wiring or persistence logic.

This code component will run consistently across environments (Dev → Test → Prod) and can be monitored like any other pipeline step.
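At its simplest, a code component is a class implementing ICodeConfigCode whose CodeImpl method returns a BragiCodeResult wrapping a DataTable. A minimal sketch, using only the types that appear in the full example later in this article (the table and column names here are illustrative):

```csharp
using System.Data;
using System.Threading.Tasks;
using Bragi.CodeExternal;
using Bragi.Util;

public class MinimalComponent : ICodeConfigCode
{
    public Task<BragiCodeResult> CodeImpl(IBragiCodeUtil bragiCodeUtil)
    {
        // Build a one-column DataTable; Bragi handles storage,
        // tracking, and versioning of whatever we return
        var dt = new DataTable("dbo.example");
        dt.Columns.Add("Value", typeof(int));
        dt.Rows.Add(42);

        return Task.FromResult(new BragiCodeResult(BragiCodeResultCode.Success, dt));
    }
}
```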

1. Scraping public transport data

The dataset we’re working with is Guernsey’s public dataset of monthly bus usage.

We’ll use this as our external input and pull it into a structured SQL-compatible format (dbo.buses) so we can treat it like any other internal table.

Source              URL                                   Format
Guernsey bus usage  https://data.gg/api/1.0/buses/usage   JSON

This returns standard JSON that looks like this:

[
    {
        "Year":2011,
        "January":90006,
        "February":92883,
        "March":111688,
        "April":123742, //... etc...
    },
    {
        "Year":2012,
        "January":90022, //... etc... 
    }
]

We’ll write a small Bragi code component in C# that:

  • Calls the API

  • Deserialises the response

  • Transforms the data into a DataTable with rows by year and columns for each month

  • Returns it to Bragi for archiving and staging

Let’s look at the code. It’s standard C# combined with a few of Bragi’s time-saving helper methods:

using System;
using System.Data;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;
using Bragi.CodeExternal;
using Bragi.Util;

public class BusesStats : ICodeConfigCode
{
    private const string Url = "https://data.gg/api/1.0/buses/usage";
    public async Task<BragiCodeResult> CodeImpl(IBragiCodeUtil bragiCodeUtil)
    {
        bragiCodeUtil.LogInformation($"Getting data from {Url}");

        // Get the raw JSON using Bragi's preconfigured HTTP client
        using var client = bragiCodeUtil.GetHttpClientFactory().CreateClient();
        var jsonRaw = await Request(client, Url);
        var yearlyData = JsonSerializer.Deserialize<BusData[]>(jsonRaw)
                         ?? throw new InvalidOperationException("API returned no data");

        // Shape the result into a DataTable; Bragi handles persistence from here
        var dt = new DataTable("dbo.buses");
        dt.AddClassPropertiesToDataTable<BusData>();
        dt.AddRows(yearlyData);

        return new BragiCodeResult(BragiCodeResultCode.Success, dt);
    }

    public class BusData
    {
        public int Year { get; set; }
        public int? January { get; set; }
        public int? February { get; set; }
        public int? March { get; set; }
        public int? April { get; set; }
        public int? May { get; set; }
        public int? June { get; set; }
        public int? July { get; set; }
        public int? August { get; set; }
        public int? September { get; set; }
        public int? October { get; set; }
        public int? November { get; set; }
        public int? December { get; set; }

    }
    
    private static async Task<string> Request(HttpClient client, string requestUrl)
    {
        // Make the request and fail fast if the response isn't successful
        var response = await client.GetAsync(requestUrl);

        if (!response.IsSuccessStatusCode)
        {
            throw new HttpRequestException(
                $"Request to {requestUrl} failed with status {response.StatusCode}");
        }

        // Pull the content down from the site
        return await response.Content.ReadAsStringAsync();
    }
} 
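The AddClassPropertiesToDataTable<BusData>() and AddRows(...) helpers save you the reflection boilerplate of mapping properties to columns by hand. For a sense of what they do, here is a roughly equivalent plain ADO.NET sketch (the helpers' actual behaviour, e.g. around nullable types, may differ):

```csharp
using System;
using System.Data;
using System.Linq;
using System.Reflection;

static DataTable ToDataTable<T>(string tableName, T[] rows)
{
    var dt = new DataTable(tableName);
    var props = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);

    // One column per public property; DataTable columns can't be
    // Nullable<T>, so unwrap the underlying type
    foreach (var p in props)
    {
        var type = Nullable.GetUnderlyingType(p.PropertyType) ?? p.PropertyType;
        dt.Columns.Add(p.Name, type);
    }

    // One DataRow per object, with nulls mapped to DBNull
    foreach (var row in rows)
    {
        dt.Rows.Add(props.Select(p => p.GetValue(row) ?? (object)DBNull.Value).ToArray());
    }

    return dt;
}
```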

When the code is plugged into Bragi, it automatically parses and compiles it, infers the shape of the returned data, and prepares everything needed to bulk-load it into a load table.

a screen with Bragi’s code component containing the code config getting the bus data from data.gg

2. Archiving the scraped data

Once the code component is built, we archive the output using a traditional archive with the following settings:

Setting                           Value                Description
Schema                            dbo                  Database schema where the archive is stored
Name                              arc_buses            Archive name
Description                       (optional)           Description for the archive
Type                              Tracker Traditional  Archive type specifying change tracking method
Update Non-Change Tracked Column  true                 Enables update of columns not tracked by change tracking
Include Expiry Logic              true                 Enables record expiry logic in the archive
Intra-Day Update                  enabled              Allows intra-day updates of the archived data

We don’t expect the API data to change retroactively, but enabling Update Non-Change Tracked Column catches corrections if they do. Including expiry logic helps to manage the lifecycle of archived records, and intra-day updates ensure data is refreshed more frequently within the day.

a screen archiving bus data

This gives us historical tracking and the ability to diff across time if needed.

3. Loading weather data

We’re pairing the bus usage with a simple weather.csv file with just a few columns:

  • Year

  • Month

  • Rainfall (mm)

  • Temperature (°C)

To load it:

  1. Head to Load Configs
  2. Create a Text / Excel File Load
  3. Upload the file or link to it via a file source

Bragi will auto-detect the headers and map the columns for you.

a screen loading weather data

This creates a load config (e.g. load_weather) you can then archive like any other dataset.
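If the weather file lives somewhere a file source can't reach, the same data could also be pulled in via a code component. A rough sketch of parsing such a CSV into a DataTable (this assumes a simple comma-separated file with a header row and no quoted fields, with the columns listed above; the path and table name are illustrative):

```csharp
using System.Data;
using System.IO;
using System.Linq;

static DataTable LoadWeatherCsv(string path)
{
    var dt = new DataTable("dbo.weather");
    dt.Columns.Add("Year", typeof(int));
    dt.Columns.Add("Month", typeof(string));
    dt.Columns.Add("Rainfall", typeof(double));     // mm
    dt.Columns.Add("Temperature", typeof(double));  // °C

    // Skip the header row, then split each data line on commas
    foreach (var line in File.ReadLines(path).Skip(1))
    {
        var cells = line.Split(',');
        dt.Rows.Add(int.Parse(cells[0]), cells[1],
                    double.Parse(cells[2]), double.Parse(cells[3]));
    }

    return dt;
}
```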

4. Archiving the weather data

Just like with the buses, create a Traditional Archive using the following settings:

Setting                           Value
Schema                            dbo
Name                              arc_weather
Description                       (optional)
Type                              Tracker Traditional
Update Non-Change Tracked Column  true
Include Expiry Logic              true
Intra-Day Update                  enabled

Now both datasets are in the archive with proper versioning and lifecycle management.

5. Staging the combined dataset

With both sources in place, we can build a simple stage that joins the two together on Year and Month.


SELECT
  b.Year,
  b.Month,
  b.BusRides,
  w.Temperature,
  w.Rainfall
FROM (
    SELECT Year, 'January'   AS Month, January   AS BusRides FROM arc_buses
    UNION ALL SELECT Year, 'February', February  FROM arc_buses
    UNION ALL SELECT Year, 'March',    March     FROM arc_buses
    UNION ALL SELECT Year, 'April',    April     FROM arc_buses
    UNION ALL SELECT Year, 'May',      May       FROM arc_buses
    UNION ALL SELECT Year, 'June',     June      FROM arc_buses
    UNION ALL SELECT Year, 'July',     July      FROM arc_buses
    UNION ALL SELECT Year, 'August',   August    FROM arc_buses
    UNION ALL SELECT Year, 'September',September FROM arc_buses
    UNION ALL SELECT Year, 'October',  October   FROM arc_buses
    UNION ALL SELECT Year, 'November', November  FROM arc_buses
    UNION ALL SELECT Year, 'December', December  FROM arc_buses
) b
JOIN arc_weather w
  ON b.Year = w.Year
 AND b.Month = w.Month
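If your staging SQL runs on SQL Server, the UNION ALL block above can be expressed more compactly with UNPIVOT (a sketch; note that UNPIVOT silently drops rows where the month's value is NULL, and other dialects will need different syntax):

```sql
SELECT
  u.Year,
  u.Month,
  u.BusRides,
  w.Temperature,
  w.Rainfall
FROM arc_buses
UNPIVOT (
    -- Turn the twelve month columns into (Month, BusRides) rows
    BusRides FOR Month IN (January, February, March, April, May, June,
                           July, August, September, October, November, December)
) AS u
JOIN arc_weather w
  ON u.Year = w.Year
 AND u.Month = w.Month
```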
  

a screen staging the combined bus and weather data

You can now model seasonality, test correlations, or feed the result into forecasting logic!

6. Deploy to Test

Once the code, archives, and stages are configured, you can deploy into Test with one click. On deployment, Bragi:

  • Compiles and packages the code

  • Binds the environment context

  • Makes it repeatable and traceable

If the API structure changes, Bragi will alert you to the failure, you’ll know exactly which version broke, and you can quickly roll out a fix.

Summary

This is a minimal example of how to use Bragi’s custom functionality to pull in third-party data and make it part of a repeatable, governed pipeline.

The same pattern applies to:

  • Calling internal services or APIs

  • Referencing ML models or lookup logic

  • Performing data reshaping not suited to SQL

If something can output a DataTable, you can plug it into Bragi.

More information

Learn more about Bragi’s custom functionality or contact us to see how it can fit your data needs.