After spending the last decade researching foundation models at Stanford, our founding team at Numbers Station has focused on applying cutting-edge AI technology to data and analytics, decreasing time to business insights. We recently spent time chatting with Mark Nelson, former CEO and EVP of Product at Tableau, to hear his perspective on the data analytics space and the emergence of the modern data stack.
Mark has a wealth of experience in data analytics and IT spanning several decades. Before Tableau, he was the CTO at Concur, where he was responsible for all aspects of product development, as well as hosting operations for their SaaS services and Concur's internal IT function. Mark also spent 17 years as Vice President and Architect at Oracle. He is currently an angel investor in Numbers Station, serves on the board of directors for CircleCI and is a Strategic Director for the VC firm Madrona. Below are some key insights he shared on the past, present and future of data analytics.
Data Analytics Evolution
Q: You have been working with data for the past 30 years, how have you seen the world of data evolve? Where do you see it today and where do you see it moving forward?
Data analytics has come a long way in the last few decades, and our reality now is much like science fiction. We have faster processing, more storage and more data at our fingertips than we ever could have imagined. Ten years ago, the cost of standing up infrastructure and running it, before even considering the expertise required, could easily have been in the millions of dollars. In comparison, modern data software solutions cost a fraction of that. We have entered the golden age of data, where we can access and apply data to any problem.
Q: How do you define the modern data stack? How has it changed the landscape of the data industry?
I have seen the modern data stack emerge with cloud storage and cloud data processing over the last five years. While there is no one definition, the modern data stack is generally cloud based, with data stored in a data lake or warehouse like Snowflake, Redshift or Databricks. With recent advancements in data analytics, traditional approaches to managing and analyzing data are being challenged: there is a change in the paradigm of storing data. Data can be used as an asset that you massage and build up over time, as opposed to the BI paradigm where you have to identify exactly what you need from your data and carefully curate that at high cost.
The classic paradigm of extract, transform, load (ETL), where you transform and store a perfect dataset, is out. We’ve shifted to extract, load, transform (ELT), meaning we no longer need to have our data in perfect condition before getting value from it. Now we can simply store massive amounts of data until we are ready to transform, analyze and visualize. This usually means faster and more flexible insights, as data teams no longer need to wait through long cycles to get data into the warehouse. It also requires more transformation work inside the warehouse, which is why modern tools like dbt have gained a lot of popularity.
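To make the ETL versus ELT distinction concrete, here is a minimal sketch of the ELT pattern. It is an illustration only, not something from the interview: an in-memory SQLite database stands in for a cloud warehouse, and the table, view and function names (`raw_orders`, `customer_totals`, `load_raw`, `transform`) and the sample records are all hypothetical. The raw data is loaded untouched first, and the cleanup happens later as SQL inside the "warehouse".

```python
import sqlite3

# Hypothetical raw records, stored exactly as received:
# inconsistent casing in names and one missing amount.
RAW_ORDERS = [
    ("1", "alice", "19.99"),
    ("2", "BOB", ""),
    ("3", "Alice", "5.01"),
]


def load_raw(conn, rows):
    """Extract + Load: land the data as-is, every column kept as text."""
    conn.execute("CREATE TABLE raw_orders (id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)


def transform(conn):
    """Transform: clean and aggregate inside the warehouse, only when needed."""
    conn.execute(
        """
        CREATE VIEW customer_totals AS
        SELECT LOWER(customer) AS customer,
               ROUND(SUM(CAST(amount AS REAL)), 2) AS total
        FROM raw_orders
        WHERE amount <> ''          -- skip rows with no amount
        GROUP BY LOWER(customer)
        """
    )
    query = "SELECT customer, total FROM customer_totals ORDER BY customer"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse
load_raw(conn, RAW_ORDERS)
totals = transform(conn)
print(totals)
```

Because the raw table is never modified, the transformation can be rewritten later (for example, to treat the missing amount differently) without re-extracting anything from the source system. In an ETL flow the cleaning would happen before the load, and the discarded detail would be gone.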
Q: What gaps remain in the modern data stack?
It's easier than ever to get started, with no need to install, run and administer software, but getting from raw data to the insights you want is still too complex. We have amazing tools up and down the stack, but what users are really looking for is a way to turn them into quicker value for the business. We’ve come a long way, with so many tools available today to help store, transform and visualize data. The barrier to entry for some of these tools, especially visualization tools down the stack, has been lowered. But some of the tools up the stack still suffer from a high barrier to entry, adding too many tedious steps to the process and slowing down the time to insights. Finding the right combination of tools and getting to the desired results is still too hard and expensive.
The Shifting Landscape of the Modern Data Stack
Q: What are some of the traditional and new roles that exist in the data analytics space?
With the development of the modern data stack, we are seeing a continuation of traditional roles in data analytics, primarily data engineers or BI engineers. We rely on their expertise for complex tasks like creating a data warehouse and determining a schema. In addition, we are seeing a rise in less technical users participating in the data analysis workflow: what can be considered the business analyst, or citizen data analyst. However it is labeled, this role, while less technical, is venturing deeper into what a traditional data engineer would do to analyze and articulate the value of data.
Q: What shifts are we seeing within the skill sets in both new and traditional roles?
There is a melting pot of skill levels among data engineers and analysts, but both have the same end goal of extracting business value out of data. With some roles just scratching the surface of technical expertise and others going much deeper, we will continue to see a growing spectrum in technical skills. While there will always be the need for highly technical data engineers, there is a clear appetite for a truly no-code experience as more business users enter the space.
Enterprises really start to gain value by getting data into the hands of business users. While we will always need data engineers, and SQL will be part of this equation forever, we need to keep working to make sure the barrier to entry remains low and to enable access for the users who are looking to answer questions with data.
Furthermore, in a sea of available tools, selecting the right option to solve each organization's unique challenges can be difficult. Flexible tooling that both data engineers and analysts can use to iterate faster and gain insights previously inaccessible is key. Especially in an environment of economic volatility, finding a way to make each function more productive and arrive more quickly at insights leading to business value is critical.
Q: Where does Tableau fit into this shift? What successes and challenges did it face in its product journey?
Tableau's founders had the vision to help everyone see and understand data - to enable anyone to be an analyst and leverage visualizations for impactful business insights. It had a transformative notion that anyone should be able to use visualizations to answer a meaningful business question.
I would like to believe Tableau made it easy to analyze data, but there is still a lot of work to be done. The biggest problem customers face now in getting good answers from data is that they can't find their data, they can’t make sense of it, and they can’t get it into the shape they need in order to use Tableau effectively as an analytics tool. The growing challenge right now is not the visualization software, but the fact that getting data into the right format for reliable use is a difficult and complex undertaking.
Q: What advantages and challenges did you see rolling Tableau into Salesforce?
Tableau’s acquisition by Salesforce, which generates some of the most valuable business data in sales, marketing, and cloud commerce, creates an incredibly valuable offering for customers. In combination with Tableau, users can arrive at insights in the most critical areas for understanding customers.
This was the second large acquisition I had experienced, after Concur’s acquisition by SAP. One challenge is the give and take in merging two incredibly successful organizations. From a business perspective, it's critical to strike a balance between maintaining important pieces of each organization's fabric and being flexible in adapting to a new culture.
After nearly 20 years of established processes and culture building at Tableau, we had to ask of each practice: is this an important part of our DNA, or is this just tactical? It’s okay if that changes, and some of these questions are not easy to answer. One of the biggest challenges is the human aspect, and helping people orient themselves through such a transition can be so rewarding.
The Future of Data Analytics
Q: What areas of data analytics are most ripe for disruption with the new technology wave of foundation models?
Recently there has been a spotlight on foundation models, with services like ChatGPT awing us all with their seemingly endless capabilities. Beyond being entertaining, foundation models can enhance business processes, specifically in the data journey. We can use them to dramatically improve our relationship with data and to manage issues threatening its reliability and value.
Foundation models will make data analysts' lives better and easier, helping to remove the noise in dealing with massive amounts of data. The mundane tasks required to make data ready for insights are currently all done by hand, but many of them could be handled by foundation models.
Q: What areas of data analytics are least likely to be impacted by foundation models?
In terms of data analytics, this new wave needs to be seen as a way to enhance, not replace, human capabilities. There is an opportunity to use cutting edge AI technologies to bridge gaps in the different roles in data analytics and increase productivity. Foundation models don’t have the capacity of human intellect to draw conclusions and find value, but they can dramatically reduce the work required to get there.
After that is where the human intellect is valuable - we can complete the last mile of arriving at amazing insights that other humans can digest, without the wasted time and headache. That’s why I'm so excited about the idea of applying these techniques: taking the best of AI and leaving the human in the loop is where the magic happens.
About Numbers Station
Numbers Station's platform, built on foundation models, uses cutting-edge AI technology to enable data practitioners of all skill levels to rapidly automate workflows in the modern data stack. It reduces time wasted on mundane data analytics tasks in a traditional data workflow and leverages next-generation data automation for rapid execution and insights. Pioneered in the Stanford AI lab and based in Menlo Park, Numbers Station is available today in private beta: simply sign up and connect your data warehouse. See https://www.numbersstation.ai to learn more.