Tuesday, January 21, 2025

Democratizing the Journey of Analytics Development to Scale Up End Product

dbt Labs, the pioneer in analytics engineering, has officially announced the launch of several new features for its dbt Cloud platform.

According to certain reports, the new features are designed to support users across the stages of the analytics development lifecycle. They do so by providing cross-platform flexibility, empowering more people to contribute to the analytics workflow, accelerating speed and productivity, and improving organizational trust in data.

The first of these innovations is dbt Copilot, the AI engine in dbt Cloud that helps users accelerate their analytics workflows. dbt Copilot automates tasks that would otherwise demand repetitive manual work, which should improve productivity, data quality, and stakeholder trust. In practice, it auto-generates tests, documentation, and semantic models (all in beta), and it includes an AI chatbot that lets business stakeholders ask natural-language questions of their data. In the next few months, dbt Copilot will also extend to automating model code generation.

“The data industry has made real progress towards maturity over the past decade,” said Tristan Handy, founder and CEO of dbt Labs. “But real problems persist. Siloed data. Lack of trust. Too much ‘duct tape’ in our operational systems. Our announcements from this week go a long way toward fixing these gaps: One dbt experience that is cross-platform, multi-persona, trusted, and infused with AI. All facilitating a single mature workflow: the Analytics Development Lifecycle.”

The next innovation is cross-platform dbt Mesh, which builds on dbt Mesh’s existing support for cross-project references by enabling cross-platform references, using the Iceberg table format as the underlying transport layer. This setup lets users eliminate silos while maintaining data governance, even in increasingly complex, multi-platform environments. Data teams can also use it to centrally define and maintain data governance standards, see end-to-end lineage across data platforms, and easily find, reference, and reuse existing data assets instead of rebuilding them from scratch.

Closely related to the previous feature, dbt now supports Apache Iceberg, allowing users to create tables in the Iceberg format and benefit from Iceberg’s performance and portability. Snowflake support is now in beta, while support for Athena, Spark, Databricks, Starburst/Trino, and Dremio is generally available.

Also worth mentioning is a new visual editing experience: a low-code, drag-and-drop environment for building and exploring dbt models, designed to open the analytics development lifecycle (ADLC) to more types of users. As with everything else in dbt, these visual models compile down to SQL, and new code must be version controlled before being deployed to production. The new development interface gives downstream users (who already have the most business context) the ability to author analytics code accessibly and safely. Users who are more familiar with SQL can also use the visual editing experience to check their work and explore a visual representation of their models.

Rounding out the release is Advanced CI, which lets users compare code changes as part of the CI process to catch unexpected behavior before new code is merged into production. This improves code quality and helps organizations optimize compute spend by materializing only correct models. Users can also view a summary of their changes in their Git pull request and drill into modified, added, and removed rows and columns within dbt Cloud.
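
To make the idea of a row-level comparison concrete, here is a minimal Python sketch of how rows from a model's production output and its pull-request output might be classified as added, removed, or modified. This is an illustration only, not how dbt Cloud implements Advanced CI; the function name `diff_rows`, the `id` key column, and the sample data are all hypothetical.

```python
from typing import Any

Row = dict[str, Any]


def diff_rows(before: list[Row], after: list[Row], key: str) -> dict[str, list[Any]]:
    """Classify rows as added, removed, or modified, matching them on `key`.

    Hypothetical helper for illustration; it assumes `key` is unique per row.
    """
    before_by_key = {row[key]: row for row in before}
    after_by_key = {row[key]: row for row in after}

    # Keys present only in the new output.
    added = sorted(k for k in after_by_key if k not in before_by_key)
    # Keys present only in the old output.
    removed = sorted(k for k in before_by_key if k not in after_by_key)
    # Keys present in both outputs whose row contents differ.
    modified = sorted(
        k
        for k in before_by_key
        if k in after_by_key and before_by_key[k] != after_by_key[k]
    )
    return {"added": added, "removed": removed, "modified": modified}


# Made-up sample data: the same model built from production code and from a PR branch.
production = [
    {"id": 1, "total": 10},
    {"id": 2, "total": 20},
    {"id": 3, "total": 30},
]
pull_request = [
    {"id": 1, "total": 10},
    {"id": 2, "total": 25},
    {"id": 4, "total": 40},
]

summary = diff_rows(production, pull_request, key="id")
print(summary)
```

A summary of this kind is what lets a reviewer spot an unexpected change (for example, a row that should not have been modified) before the code is merged.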
