BI platforms evaluation: Preset
In this opportunity, we bring you a real gem, an open source-based Business Intelligence software with great potential.
Preset Cloud is a cloud-hosted and fully managed service platform for data exploration and visualization built on top of the popular open-source project, Apache Superset.
Our evaluation
8 (out of 10) When we first heard about Preset as a new and pushing tool, we were hesitant and gratefully surprised by it. First off, when we started to read about the tool, we found out that huge companies had adopted Superset, Apache was behind Preset, and it was completely open-sourced.
Some of the outstanding features that caught our eye at a glance were: a robust and flexible security layer through role-based access control and row-level security; a great semantic layer (not a standard feature provided); near 50 ready-to-use visualizations, and a useful SQL editor along with a friendly visualization builder without the need to write a single SQL line.
Last but not least is that it is an open-source tool. You’re not attached to any company (you can have your own instance of a standalone Superset instance), and it is constantly benefited from different third parties collaborations.
Unfortunately, there are some weak links: Preset is still a relatively new tool (not Superset), and we've run into several errors. Though you can build an entire dashboard without SQL, you still need some technical assistance to develop a good semantic layer and manage the security.
Finally, scheduled reports are not flexible enough compared to other tools, but you can still get valuable alerts and reports by Slack or email.
Features Overview
Preset always runs the latest, battle-tested version of Superset, so you have access to all the best features that Superset has to offer. They manage and maintain the newest Superset installation, including the underlying message queue, asynchronous celery workers, multiple caching layers, SMTP email server for supporting alerts & reports, and a well-monitored observability solution.
Apache Superset is an open-source software cloud-native application for data exploration and data visualization. It started as a result of a hack-a-thon back in 2017. It’s a modern, lightweight, cloud-native, free, and open-source BI web application with an advantageous SQLAlchemy python backend, making it scalable and compatible with almost any database technology speaking SQL. Several companies quickly adopted and adapted it as part of their top modern data stack layer (Airbnb, Lyft, Twitter, Netflix, Amex).
Dropbox also adopted Superset, and they have written an interesting article about this adoption and the reasons behind their decision. They centered on Superset’s ability to get answers without using SQL, create viz quickly, the possibility of sharing virtual tables and metrics, the security features, and the flexible API for creating custom visualizations. They shared a table with their findings comparing several tools that you can see here.
Characteristics
Preset comes with six main UIs that are the core of it:
- Workspace Screen: It's a way to group correlated data in database connections, datasets, charts, and dashboards. A workspace is simply a collection of all content shared between yourself and others. There's a limit on the number of workspaces you can have.
- Home Page: It collects all everyday actions such as Recents Panel, Dashboards Panel, Saved Queries Panel, and Charts Panel.
- Dashboard Screen: It's where you'll fix all your charts in a layout, along with filters and other features.
- Charts: It’s used to create a visualization. Preset offers a very vast selection of visualizations that allow you to show your data in different ways.
- SQL Lab: Here you can create queries, access saved queries, and review the query history.
- Data: All your databases and datasets are managed in this section.
1. Types of users
- Administrator: manages access to teams (or organizations), access to workspaces, and has access to all information in workspaces.
- User: connects to a team based on an invitation and access to a workspace based on provided permissions.
The typical workflow for an administrator providing role-access control are:
- Invite a user to a team using an email invite.
- Give the user access to a workspace role. There are several roles for a user in a workspace. New roles can be added on-demand if you contact support.
- Add the user to data access roles: you can create a role, add users to that role, and grant permission on databases, schemas, tables, and/or queries.
- Finally, control data access with row-level security: this powerful feature enables you to exert a granular level of control over who can query—and view—specific data in selected datasets or tables. It is usually combined with data access roles. The two filter types can be applied. Regular filters add where clauses to queries if a user belongs to a role referenced in the filter. Base filters apply filters to all queries except the roles defined in the filter and can be used to define what users can see if no role-level security filters within a filter group apply to them.
2. Database connections
It supports plenty of modern databases such as Google BigQuery, Amazon Redshift, Snowflake, Databricks Hive, Azure Synapse, MySQL, PostgreSQL, Google Sheets, etc.
3. Alerts and Reports
An alert provides a snapshot of a chart or an entire dashboard on-demand and is triggered when an event occurs, such as a threshold value exceeded. This event is defined with an SQL query which results in a unique value such as a count. Alerts are delivered via email and/or Slack. In the notification, a visual snapshot of the chart or dashboard is provided as well as a link to the chart or dashboard's Explore page.
4. Support and documentation
Knowledge Base: the is online documentation and videos for the basics of the tool.
User’s corner: a dedicated youtube channel covering several topics.
Assistance from the tool through a bot: Open a supper ticket, schedule a meeting with a member support team, and chat online with an agent 24/7.
5. Semantic Layer
All datasets you add to Preset can be customized. These customizations will appear in the dataset panel; you can group data as needed (if applying metrics) and then create charts. You can also add the following items to your dataset:
Metrics: define the metrics logic used in aggregated rows or when pivoting rows into columns. You can easily modify the metric’s name, the label, the SQL expression for the calculation, a description, D3 Format of the metric, a warning message, a certification (organization or a person), and certification details. If the metric has a warning or a certification, icons and additional information will appear when you explore the dataset.
Columns: you can select what columns to show or not, an alias, data type, date format, and several boolean settings such as if it's a date if it's filterable, or a dimension.
Calculated Columns: you can use this for transformation, enrichment, or data validation, same editable fields that those in the column are available here, plus SQL expression.
6. Charts
Preset comes with a wide variety of charts (50 and growing). Here is a breakdown of the five categories of featured charts and some examples in each category:
- Time Series Charts: line charts, time series bar charts, and time series tables.
- Composition Charts: bar charts, pie charts, and treemaps.
- Distribution Charts: histograms, box plots, and horizon charts.
- Relationship Charts: pivot tables, heat maps, and bubble charts.
- Geospatial Charts: scatterplots, arcs, grids, and polygons.
Preset comes with a dataset explorer, so you can easily explore the data and play with different charts to create an appropriate visualization. It is a complete and flexible tool to browse your data and export it in several formats (JSON, CSV, image).
Another exciting feature of some charts is the advanced analytics which consists of different functions that can be used with the data after it has been retrieved from the database. The main goal of advanced analytics is to further process the data to get a better visualization:
- Rolling Window: it will enable you to use a statistical value to represent the values, such as mean, sum, standard deviation, and cumulative sum.
- Time Comparison: It compares the same data series with a time shift utilizing actual values, absolute difference, percentage change, or ratio.
- Python Functions: Data fetched from the database are placed in a pandas DataFrame to perform analytics functions to be used in the following scenarios: advanced analytics is not supported, when data is missing, and group data based on Period and show values based on Method.
7. Dashboards
Dashboards are a way to group related charts, and the same chart can be added to more than one dashboard. The Dashboard screen is configured in a grid layout composed of rows and columns, and you can add a row or column indistinctly to create space for your charts.
Within a dashboard, the Filter visualization type enables users to filter data on all the dashboard’s charts directly. Filter comes with a feature to customize which charts will be affected. Another way to filter data is using Preset's Dashboard Filters which comes with an easy-to-use intuitive interface. From any dashboard, you can expand the Filters sidebar to quickly apply a variety of filters across the board to all charts within a dashboard.
Dashboard content can be organized into different tabs for navigation and presentation purposes, giving you the flexibility of arranging charts in a meaningful way.
If you add a markdown component to your dashboard, you can customize it using markdown, CSS, and/or HTML.
You can insert images and completely change the look and feel of a dashboard using CSS.
8. SQL Lab
The SQL Lab is the workspace for users to manage all aspects of SQL usage in their Preset environment. It's a useful tool that gives you a multi-tab environment to work on multiple queries at a time. It allows you to browse your database, support long-time running SQL queries, historical logs of queries, and support templating using the Jinja templating language, which allows for using macros in your SQL code.
The SQL Lab consists of three tools: SQL Editor, Saved Queries, and Query History.
SQL Lab also supports templating (programmatic capabilities) via the Jinja Framework, which is a web template engine for the Python programming language. Besides the flexibility provided in terms of SQL generation, templating is mainly used to increase the power of Preset filter functionality. For instance, if you need to filter data for the currently logged user, change dynamically filters in a dashboard URL, process formatted data coming from a field, or to personalize dashboards.
9. Preset Embedded SDK
With a straightforward deployment, we can integrate dashboards created into external applications to enrich the user experience, turning it into an interactive analytics experience. To protect our infrastructure, Preset offers a variety of security features to provide flexible access control.
10. General Limitations
Preset doesn’t allow creating visualization from multiple tables within its Data Exploration layer.
Although you can design the semantic without using SQL, some definitions need a basic knowledge of SQL. It would be beneficial to have a general technical setting and allow non-SQL users to broaden the models and create charts and dashboards.
It doesn’t support NoSQL data sources. You can easily overcome this limitation by having a staging area in your database where you can import tables and then manipulate those tables to create your models.
Version control isn’t an explicit feature as Superset stores the components of a dashboard separately in its dedicated database. That being said, Preset teased an upcoming feature allowing users to version dashboards as YAML configurations and move between instances through the API.
Dashboards and charts don't count with drill-down functionality. It is said that Preset is working on this feature, and it will be available in future versions.
There are no clear indicators or labels if you're viewing/modifying a data source or a chart. The same thing happens when you are in different workspaces, which can be similar but with another objective.
Scheduled reports don't support filters. You end up sending the report with default filters or creating a different report according to circumstances. Also, no attachments are allowed, like CSV or Excel files. The email/slack message you receive will be just an image and a link to the dashboard, so all the recipients need to have an account created in Preset.
Finally, no programming languages such as Python or R are supported for further and flexible analytics. However, they provide some limited Python analytic functions to be used when creating some specific types of charts.
This finishes our evaluation of Preset and as most of the BI platforms are growing fast we are sure we will need to revisit it pretty soon.
Stay tuned for more BI platform reviews coming soon.
Dynamic Data.