GDPR-Compliant Data Deletion in Tinybird
Tinybird provides several methods for deleting user data to maintain GDPR compliance. The best approach depends on your data volume and deletion frequency.
Methods for Data Deletion
1. Using the CLI
For smaller datasets or infrequent deletions, you can use the tb datasource delete command:
tb datasource delete [datasource_name] --sql-condition "user_id = 'user_to_delete'"
This method is suitable for datasets with up to a few million rows and deletion frequencies of up to 1000 operations per week.
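The command above can be scripted when deletion requests arrive from another system. The following is a minimal sketch, not an official integration: the helper name build_delete_command and the allow-list for user IDs are assumptions for illustration, and only the tb datasource delete ... --sql-condition form is taken from the text above.

```python
import re

# Assumption: user IDs consist only of letters, digits, underscores and hyphens.
# Rejecting anything else keeps untrusted input out of the SQL condition.
USER_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def build_delete_command(datasource_name: str, user_id: str) -> list[str]:
    """Build the argument list for `tb datasource delete` (hypothetical helper)."""
    if not USER_ID_PATTERN.fullmatch(user_id):
        raise ValueError(f"unexpected characters in user ID: {user_id!r}")
    condition = f"user_id = '{user_id}'"
    return ["tb", "datasource", "delete", datasource_name, "--sql-condition", condition]


command = build_delete_command("events", "user_to_delete")
# To run it, pass the list to subprocess so no shell quoting is involved:
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids a second layer of shell quoting around the SQL condition.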
2. Using the Delete API Endpoint
For more frequent deletions, you can use the Delete API endpoint, where (.+) stands for the name of the datasource:
POST /v0/datasources/(.+)/delete
This method is more efficient for regular deletion operations.
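A call to this endpoint might be prepared as in the sketch below. Several details are assumptions rather than facts from this page: the API_HOST value (the host depends on your region), the delete_condition form field carrying the SQL condition, and the bearer-token header. Check them against the API reference before relying on this.

```python
import urllib.parse
import urllib.request

# Assumed base URL; the actual host depends on your Tinybird region.
API_HOST = "https://api.tinybird.co"


def build_delete_request(
    datasource_name: str, delete_condition: str, token: str
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the datasource delete endpoint."""
    quoted_name = urllib.parse.quote(datasource_name, safe="")
    url = f"{API_HOST}/v0/datasources/{quoted_name}/delete"
    # Assumption: the SQL condition travels as a form field named `delete_condition`.
    body = urllib.parse.urlencode({"delete_condition": delete_condition}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


request = build_delete_request("events", "user_id = 'user_to_delete'", token="<your_token>")
# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```

Building the request separately from sending it makes the URL and body easy to inspect or log before any data is removed.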
3. Implementing a Deletion Queue
For large-scale applications with frequent deletion requests:
- Create a delete_user table to queue deletion requests.
- Set up a scheduled job (e.g., weekly) to process the queue and delete user data from all relevant datasources.
- Use partitioning on large datasources to optimize deletion operations.
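The scheduled job described in the steps above could look roughly like the following sketch. It assumes the queued IDs have already been read from the delete_user table and batches them into one condition per datasource, so each run issues a single delete per datasource. The names build_batch_condition, process_deletion_queue, and the delete_rows callback are hypothetical; delete_rows stands in for whichever deletion method you use (CLI or API).

```python
from typing import Callable, Iterable


def build_batch_condition(user_ids: Iterable[str]) -> str:
    """Combine queued user IDs into a single `user_id IN (...)` condition."""
    # dict.fromkeys drops duplicate requests while preserving order.
    unique_ids = list(dict.fromkeys(user_ids))
    for user_id in unique_ids:
        # Refuse IDs that could break out of the quoted SQL literal.
        if "'" in user_id or "\\" in user_id:
            raise ValueError(f"unsafe user ID: {user_id!r}")
    quoted = ", ".join(f"'{user_id}'" for user_id in unique_ids)
    return f"user_id IN ({quoted})"


def process_deletion_queue(
    queued_user_ids: Iterable[str],
    datasources: Iterable[str],
    delete_rows: Callable[[str, str], None],
) -> list[str]:
    """Delete all queued users from every datasource; return the IDs handled."""
    unique_ids = list(dict.fromkeys(queued_user_ids))
    if not unique_ids:
        return []  # Nothing queued, so no delete operations are issued.
    condition = build_batch_condition(unique_ids)
    for datasource in datasources:
        delete_rows(datasource, condition)
    return unique_ids
```

The returned list can be used to clear the processed entries from the queue once every datasource has been handled.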
Best Practices
- Ensure deletions cascade to all relevant datasources and materialized views.
- For datasources with millions of rows, consider partitioning to improve deletion performance.
- Monitor the performance impact of deletion operations and adjust your strategy as your data grows.
- Implement TTL (Time to Live) on datasources where appropriate to automatically remove old data.
Considerations
- Deleting data using ALTER TABLE ... DELETE WHERE can have performance implications on large datasets.
- Balance the need for immediate deletion under GDPR against system performance.
- Regularly review and optimize your deletion strategy as your data volume grows.