
Intro to the CLI and Docker


After following the three previous guides, you now know how to import data, explore it, and create dynamic endpoints with Tinybird using the UI and the REST API. In this guide, you'll learn how we develop real-life data projects with Tinybird, keeping the Data Sources, Pipes and Endpoints defined in a code repository, using a very important tool for this: our CLI.

Creating the project directory and a virtual environment in it

First, create the directory where our data project will live:
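For example, using the same folder name that appears later in this guide (any name will do):

```bash
# Create the project directory and move into it
mkdir ecommerce_guides_project
cd ecommerce_guides_project
```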

Now, create a virtual environment so that the packages we install are isolated from your system Python installation and don't interfere with it. You could use virtualenv, venv, Pipenv or other packaging tools; we'll use venv here:
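A minimal sketch with venv, naming the environment ``.e`` as in the rest of this guide:

```bash
# Create a virtual environment called .e and activate it
python3 -m venv .e
. .e/bin/activate
```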

Note that ``.`` is an alias to ``source``, which in this case reads and executes the content of ``.e/bin/activate`` into the current bash process.

Create a git repository and a ``.gitignore`` file, and add ``.e`` to it so that third-party code doesn't get tracked:
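Something along these lines:

```bash
# Initialize the repository and ignore the virtual environment
git init
echo ".e" >> .gitignore
```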

Installing the CLI and using Docker

If you follow the CLI docs, you'll see that there are two options for installing the CLI. If ``pip install tinybird-cli`` works for you, you can skip this section. Otherwise, this is how to run the CLI using Docker.

You need to have Docker installed and running. Download Docker Desktop from the Docker website, start it, and make sure it's running before you continue.

Once it's running, navigate to your project folder (if you're not in it already) and run the following command.
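Based on the image and volume mount described in the list just below, the command looks like this:

```bash
# Pull the tinybird-cli-docker image, mount the current directory at /mnt/data,
# and start an interactive session inside the container
docker run -v $(pwd):/mnt/data -it tinybirdco/tinybird-cli-docker
```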

If you're new to Docker, this does two things:

  • Downloads the latest version of the Docker image named tinybird-cli-docker from the tinybirdco user on Docker Hub
  • Mounts a volume, setting the current directory (with ``$(pwd)``) as the ``source`` and ``/mnt/data`` as the target. In Docker's words, volumes are the "preferred mechanism for persisting data generated by and used by Docker containers". This will keep data in sync between the local directory (``ecommerce_guides_project``, in our case) and everything under ``/mnt/data`` in the container

Lastly, within the container, run ``cd /mnt/data`` to navigate to the target folder, where we'll have a copy of our local project files.

{% tip-box title="Mounting two volumes" %}If you want to keep your datasets and the Tinybird project files in different folders, you can mount both of them as volumes so that Docker can access them. You'd do it with a command like ``docker run -v $(pwd)/tb_project:/mnt/data -v $(pwd)/datasets:/mnt/datasets -it tinybirdco/tinybird-cli-docker``.{% tip-box-end %}

Using the CLI

We'll only go over the basics here, as you'll see all the CLI functionality in detail in the next guides. To see all the available commands, run ``tb --help``.

Each command also has its own help; if you run, for example, ``tb datasource --help``, you'll see the options available for that command.

Authenticating with your Tinybird account

To be able to use the CLI, run ``tb auth`` first.
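A minimal sketch of the authentication step; the CLI should prompt you for your admin token (the same token mentioned below):

```bash
# Authenticate against your Tinybird account; paste your admin token when prompted
tb auth
```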

If you've followed the previous steps, a ``.tinyb`` file containing your admin token and the host will be created, and thanks to the mounted volume it will also appear in your local directory.

If you're on a Pro or Enterprise plan and your Tinybird account runs dedicated machines, your host will be different. You can provide it with the ``--host`` flag, or change the ``.tinyb`` file directly.
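For example, something like this (the host URL below is just a placeholder; use the one assigned to your dedicated cluster):

```bash
# Placeholder host; replace with the host of your dedicated Tinybird cluster
tb auth --host https://your-dedicated-host.example.com
```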

Initializing the folder layout

Running ``tb init`` will create these folders to keep your Data Sources and Pipes organized.

The idea is that:

  • ``datasources`` contains all your Data Sources definitions
  • ``explorations`` contains exploration Pipes
  • ``pipes`` contains Transformation Pipes where you define materialized views, etc
  • ``endpoints`` contains Pipes where the last node is exposed as an API endpoint
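Putting the list above together, the layout created by ``tb init`` looks roughly like this:

```
├── datasources
├── endpoints
├── explorations
└── pipes
```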

Downloading existing Data Sources and Pipes

This can be done with the ``tb pull`` command. You can see its available options by running ``tb pull --help``.

Let's download the Data Source definition (the schema) of the ``events`` Data Source with the CLI, and save it in the ``datasources`` folder. This can be done by running ``tb pull --match events.datasource --folder datasources``.

After running it, you should see a new file in the specified folder.
You can also download the ``events.datasource`` file by clicking the "Download schema" button in the UI.

And the same could be done for the ``ecommerce_example`` Pipe and endpoint we created in the previous two guides.
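A sketch of what that could look like, assuming the Pipe matches with a ``.pipe`` suffix (by analogy with the ``events.datasource`` example above) and that you want it saved in the ``endpoints`` folder:

```bash
# Pull the ecommerce_example Pipe definition into the endpoints folder
tb pull --match ecommerce_example.pipe --folder endpoints
```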
