Investing In The Next Generation Of DataOps With Unravel Data
Curtis McKee September 27, 2022
There are over 4.66 billion active internet users worldwide, consuming data via social feeds, video streams, and searches at an incredible pace. Estimated internet data consumption will hit 94 zettabytes this year alone. A zettabyte is a unit of digital storage equal to 1,000,000,000,000,000,000,000 [10^21] bytes, so 94 zettabytes is approximately 94 thousand exabytes or 94 billion terabytes. Data growth on the internet exploded during the pandemic as the world moved online and began working remotely more than ever. On WhatsApp alone, about 1.4 million video & voice calls are made and over 41 million text messages are shared every minute. Video accounted for an astounding 82% of total global internet traffic at the end of 2021. It’s anticipated that by the end of 2025, data consumption will double to over 188 zettabytes.
Suffice it to say, we live in a data-driven world, and enterprises today run on real-time decisions made on top of a data technology stack and massively scalable cloud computing infrastructure. The data stack, of course, has had to evolve to keep up. It has changed rapidly since the 80s & 90s, when mainframe computers processed data in on-premises datacenters, through the rise of the Big Data era in the 2000s, when relational databases and distributed processing platforms such as Hadoop and Spark allowed greater scalability for processing petabyte-sized workloads. The rise of cloud computing then brought even greater scalability and broke barriers in performance and cost, allowing more enterprises to focus on using their data as a source of monetary value and ushering in the new era where “Data is the new oil”.
Rise of DataOps and the Modern Data Stack
With the growth of data over the past few decades, enterprises are now moving into an era where the “data cloud” is transforming the landscape for data processing tools and storage platforms. Enterprises are managing data sources across their own infrastructure, as well as in multi-cloud & data warehouse environments. Traditionally, enterprises would ingest data through extract, transform and load (ETL) tools into data storage or Hadoop clusters, but this method is now giving way to a Modern Data Stack where data lakes and data warehouses serve as centralized cloud storage platforms. Additionally, data querying engines like Presto are enabling the movement towards a decentralized model for modern data infrastructure. Data orchestration tools like Apache Airflow handle workflow orchestration, allowing deeper integrations between disparate systems and letting data transformation happen where the data resides.
Monitoring and automating these data operations, or DataOps, from on-premises to the cloud has become a nightmare for IT and data engineering teams. Mission-critical, data-driven applications are challenged to work reliably across hybrid-cloud, multi-cloud, and data cloud landscapes. Large enterprises today typically manage 10+ data systems and potentially 1000s of data apps, each with 100s of bugs every day. Modern enterprise applications are just too data-intensive for first-generation IT performance monitoring solutions to manage alone. With massive Big Data migration projects happening from on-premises to cloud, enterprises need “end-to-end data observability” of their petabyte-sized workloads. Enterprise data engineering teams require full automation to optimize the cost of their clusters, AI to predict data pipeline issues, and a single-pane-of-glass view to accelerate debugging of complex analytics workloads running on complex infrastructure.
Against the backdrop of this exciting transformation in the data world, we are thrilled to announce our investment in Unravel Data, the leader in DataOps Observability for modern data stacks. Unravel Data’s DataOps Observability platform is built for today’s data teams, who require full-stack visibility, automation, and actionable intelligence to meet their needs around data pipeline performance, cost, and quality. Unravel Data leverages artificial intelligence, machine learning, and analytics to offer actionable recommendations and automation for tuning, troubleshooting, and improving performance, enabling businesses to understand and optimize their data-driven applications. AWS, Databricks, Google Cloud Platform, and Microsoft Azure all recommend Unravel Data to their largest customers as an onramp for moving data to the cloud. They love Unravel because it helps their customers move workloads to the cloud and operate more efficiently once they are there.
We have been very impressed with Unravel Data’s ability to move from an early focus on delivering DataOps automation for on-premises Big Data workloads to now being the only DataOps Observability vendor that can automate data pipeline workloads across all clouds, data lakes, warehouses, and on-premises data environments. Customer adoption of their cloud DataOps automation platform has been stellar, growing well over 500% in the past year, with a top-tier list of Fortune 500 customers adopting it. Under the leadership of the co-founders, CEO Kunal Agarwal and CTO Dr. Shivnath Babu, the transformation of the company in recent years is a testament to their brilliant work and deep understanding of customers’ pain points in data workload automation. Third Point Ventures is very excited to be joining the Board with Menlo Ventures & GGV and partnering on the road ahead.