
Intro to ingesting data with Tinybird

Easy
Follow this guide to learn the different ways of ingesting data into Tinybird in no more than 10 minutes. From huge CSV files to ultra-fast queries over your Data Sources.

Tinybird Analytics is a powerful tool that lets you ingest, transform and expose large amounts of CSV-structured data in real time. It lets you query the data using SQL, so you don't have to learn a new query language, and create secure API endpoints to consume your data in a matter of seconds instead of days or weeks.

In these Getting Started guides, we'll explore a common use case: real-time analytics of events from an e-commerce store. After reading them, you'll have a much better understanding of how to leverage Tinybird to analyze large amounts of real-time data in other areas, such as SaaS, marketplaces and media.

The data

In Tinybird, the bigger your data, the better. Although all the guides explain common concepts that can be applied to most datasets out there, we've generated two sample e-commerce datasets that we'll use in most guides:

  • One with data for 2.4M products, split into two parts
  • Another with 100M rows of website events, also split into two files

Ingesting products data via the User Interface

This is the simplest way to get your data into Tinybird. For each dataset, you'll need to create a new Data Source from your existing CSV files. After creating a Data Source, you can keep appending to it or replace its data at any time.

Creating a data source

Go to your dashboard and click on the "Add Data Source" button. Download the sample file to your computer and select it, or just paste the file URL in the input.

By default, Tinybird guesses the type of every column in your CSV file. After clicking on "Add", you'll be taken to a screen with a preview of your data, from which you can change the schema if desired.

Change your column names and types in the Data Source preview modal window

For now, this is the only point at which you can change the Data Source schema. After making sure everything looks OK, click on "Continue". As soon as your data starts being ingested (we stream data in, even when it sits in big CSV files), you'll see it in the Data Source modal window.

The Data Source modal window shows everything related to your Data Source

As we will see later, from this modal window you can rename your Data Source, append new data to it, truncate it or delete it. In the following guides, we will cover how to do all these operations programmatically.

{% tip-box title="VIEWING YOUR DATA SOURCE INFORMATION" %}You can always bring back this view by clicking on the Data Source name in the side panel.{% tip-box-end %}

Append data to an existing Data Source

Once you've created a Data Source, you can easily append more data to it through the User Interface or the REST API. Doing it through the UI is as easy as clicking on "Options" and then on "Append data".

A modal window similar to the one you saw when creating the Data Source will appear. Try using this other file this time.

As you can see, when appending data you can't change your Data Source schema. Once your data has been correctly appended you will see a new entry in your Operations Log.

{% tip-box title="WHAT IF IT FAILS?" %}If any rows fail to be ingested, you will see them in the quarantine view. This is especially useful for fixing your data at ingestion time.{% tip-box-end %}

Ingesting events data programmatically

Most developers will want to upload data to Tinybird programmatically. In that case, our REST API is the way to go. It also provides access to some features that aren't available via the UI and that let you fine-tune Tinybird to be faster.

Creating Data Sources via the REST API or the CLI

Let's now add the events data directly from its URL, {% code-line %}https://storage.googleapis.com/tinybird-assets/datasets/guides/events_50M_1.csv{% code-line-end %}. First, make sure you have a token with access to the "Data Sources management" scope. A quick way to create one is from the Auth Tokens section, accessible from the sidebar of your Tinybird dashboard. You can always use your admin token, but do so with care.

Then, a single request will read the events CSV, create a new Data Source with the guessed schema and ingest the data into it. You can make this request through either the REST API or the CLI.
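As a minimal sketch of that request, the Python snippet below builds it with the standard library only. It assumes the Data Sources endpoint is `https://api.tinybird.co/v0/datasources`, that the token is sent as a Bearer header, and that the Data Source name goes in the query string while the CSV `url` goes in the form body. The function names `build_create_request` and `send` are illustrative, not part of any Tinybird client. Check the API reference for your region's host before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public Data Sources endpoint; your workspace may use another host.
API_URL = "https://api.tinybird.co/v0/datasources"
EVENTS_CSV = (
    "https://storage.googleapis.com/tinybird-assets/"
    "datasets/guides/events_50M_1.csv"
)


def build_create_request(token: str, name: str, csv_url: str) -> urllib.request.Request:
    """Build a POST that creates a Data Source from a remote CSV file."""
    query = urllib.parse.urlencode({"name": name})
    # Passing `data` makes urllib send a form-encoded body.
    body = urllib.parse.urlencode({"url": csv_url}).encode("ascii")
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_create_request("<your_token>", "events", EVENTS_CSV)
print(request.get_method(), request.full_url)
# To actually start the import, call: send(request)
```

Building the request separately from sending it lets you inspect the URL and headers before any data is ingested.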

{% tip-box title="Use a token with the right scope" %}Replace {% code-line %}<your_token>{% code-line-end %} with a token whose scope is {% code-line %}DATASOURCES:CREATE{% code-line-end %} or {% code-line %}ADMIN{% code-line-end %}.{% tip-box-end %}

You can also create a Data Source programmatically from a local CSV file, as follows:
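A local upload could look roughly like the sketch below, which sends the file as `multipart/form-data`. It makes several assumptions: the endpoint is `https://api.tinybird.co/v0/datasources`, the file travels in a form field named `csv`, and the path `products.csv` and the helper names are purely hypothetical. The whole file is read into memory, so this suits small files only.

```python
import os
import urllib.parse
import urllib.request
import uuid

# Assumption: public Data Sources endpoint; your workspace may use another host.
API_URL = "https://api.tinybird.co/v0/datasources"


def encode_csv_upload(field: str, filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(token: str, name: str, path: str) -> urllib.request.Request:
    """Build a POST that creates a Data Source from a local CSV file."""
    with open(path, "rb") as csv_file:
        content = csv_file.read()
    # Assumption: the API expects the file in a form field called "csv".
    body, content_type = encode_csv_upload("csv", os.path.basename(path), content)
    query = urllib.parse.urlencode({"name": name})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
        },
    )


# Hypothetical usage; pass the result to urllib.request.urlopen to send it:
# request = build_upload_request("<your_token>", "products", "products.csv")
```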

Appending or replacing data via the REST API or the CLI

To append new data to an existing Data Source, you just need to specify the {% code-line %}mode{% code-line-end %} parameter and set it to {% code-line %}append{% code-line-end %} ({% code-line %}mode{% code-line-end %} is {% code-line %}create{% code-line-end %} by default).
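The sketch below shows how the `mode` parameter could be set from Python. It assumes the same `https://api.tinybird.co/v0/datasources` endpoint and Bearer authentication, and that `mode` is accepted as a query parameter. The source URL `https://example.com/more_events.csv` is a placeholder and `build_ingest_request` is a hypothetical helper. The `replace` value is included on the assumption that it is the mode behind the replace operation this section's title refers to.

```python
import urllib.parse
import urllib.request

# Assumption: public Data Sources endpoint; your workspace may use another host.
API_URL = "https://api.tinybird.co/v0/datasources"
# "create" and "append" come from the guide; "replace" is assumed from its title.
VALID_MODES = {"create", "append", "replace"}


def build_ingest_request(
    token: str, name: str, csv_url: str, mode: str = "create"
) -> urllib.request.Request:
    """Build a POST that ingests a remote CSV using the given mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    query = urllib.parse.urlencode({"name": name, "mode": mode})
    body = urllib.parse.urlencode({"url": csv_url}).encode("ascii")
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder URL: point this at the CSV you want to append.
request = build_ingest_request(
    "<your_token>", "events", "https://example.com/more_events.csv", mode="append"
)
print(request.full_url)
# To send it: urllib.request.urlopen(request)
```

Validating `mode` locally catches a misspelled value before any request leaves your machine.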

{% tip-box title="Use a token with the right scope" %}Replace {% code-line %}<your_token>{% code-line-end %} with a token whose scope is {% code-line %}DATASOURCES:CREATE{% code-line-end %}, {% code-line %}ADMIN{% code-line-end %} or {% code-line %}DATASOURCES:APPEND:events{% code-line-end %}.{% tip-box-end %}

{% tip-box title="APPEND OR REPLACE DATA USING THE API" %}Read more about the different modes in our API reference or check out our guide on replacing or deleting data selectively.{% tip-box-end %}
