The Rise of Data Analytics in the Startup World

Sarah Igoe, M.D. T’19

May 2nd, 2019

Topics: Big Data / Analytics, Culture, Customer, Entrepreneurial, Tech

Five Experts Share Insights from the Front Lines

The Science of Startups: Embracing Big Data in Entrepreneurship Research

Stakeholder/Expert:

Andreas Schwab PhD, Management and Organization Theory, University of Wisconsin-Madison
Associate Professor, Entrepreneurship and Organizational Learning Processes, Ivy College of Business at Iowa State University

Stakeholder/Expert background:

Dr. Andreas Schwab, currently a Management Professor at the Ivy College of Business at Iowa State University, focused his doctoral research on Strategic Management and Organizational Theory. His current areas of expertise include Entrepreneurial Ecosystems, Corporate Entrepreneurship, Innovation and Organizational Learning, Gender, Race and Culture, and Research Methodology with a focus on Bayesian Statistics. His findings have been published in a variety of Management and Organizational Science journals including Strategic Organization, Industrial and Corporate Change, and the Academy of Management Journal.

Dr. Schwab’s research extends beyond academia by way of the various new solutions and applications that have emerged from his work. He has worked to test hypotheses for Google, Facebook, the National Football League, Major League Baseball, the US and international movie industries, and international IPOs.

As data collection and analysis techniques have pervaded both mainstream corporate environments and the research community, Schwab is hopeful that the divide between academic research and startup culture will continue to blur, opening new doors for the use of empirical research in what has historically been a predominantly qualitative field. Big data research in the startup space poses its own unique set of challenges, but Schwab and his coauthor Zhu Zhang offer guidance to the potentially skeptical academics who likely constitute their readership, promising more thorough and accurate findings if the process is navigated appropriately.

Summary and Analysis of Stakeholder’s Perspective:

The availability of data can be overwhelming: not only is there more available to analyze than ever before, but it comes at us in real time. One rich source of data used in academic research, for example, has been systems that collect data for city and state governments (things like ambulance dispatch data or video surveillance). Of course, smartphones also provide companies with endless data points that become more valuable to the company as they integrate location and time parameters to track and predict customer behavior. From a research perspective, the information is becoming more available and easier to digest; in the right hands, it can and should lead to better and more valuable solutions.

The term “big” data, the authors explain, refers to several key concepts beyond the fact that the data sets we use are increasingly high in volume. Crediting a particularly eloquent overview from colleagues in the Harvard Business Review, they describe the major differences as such: “big data tend to exhibit high variety and velocity—often combined with low veracity.” (McAfee & Brynjolfsson, 2012) Variety here refers to the wide array of different options with regard to sources of data and the tools used to collect it, while velocity alludes to the speed with which data is not only produced and collected, but also distributed and acted upon. Finally, in today’s era of “fake news,” the data through which we must sift as researchers is plummeting in veracity – errors, omissions, biases, and countless other messes abound.

For any business – particularly one with limited resources, like a startup – special care should be taken when building a data analytics program that intends to conduct and interpret “Big Data Studies.” Schwab and Zhang describe the specific methodological capabilities required of such programs as falling into one of two general buckets: data management capabilities and data analysis capabilities.

Data Management capabilities are those that are needed to efficiently clean, restructure, integrate, and combine the massive data sets researchers produce and collect. Especially when the research topic is related to current issues or events, information can be generated faster than any one person’s ability to keep track of it all – all while countless news sources reporting on the event continue to churn out more information.

An obvious challenge for researchers is the fact that their data sets will likely contain mountains of information that was not initially collected in order to answer their specific research question. These irrelevant data are not always easy to identify, and may involve recording protocols that are unfamiliar or poorly documented. Unsurprisingly, data sets that are collected in real time are particularly vulnerable to errors or missing information.
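The data management steps described above (clean, restructure, integrate, combine) can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a method taken from Schwab and Zhang's paper: the field names (`call_id`, `timestamp`, `district`), the sample dispatch rows, and the population figures are all hypothetical and invented for the example.

```python
from datetime import datetime

# Hypothetical raw rows from a real-time dispatch feed (invented for illustration).
dispatch_rows = [
    {"call_id": "A1", "timestamp": "2019-05-02 14:03", "district": " north "},
    {"call_id": "A2", "timestamp": "", "district": "South"},  # missing timestamp
    {"call_id": "A1", "timestamp": "2019-05-02 14:03", "district": "North"},  # duplicate
]

# Hypothetical second source to integrate with the feed.
district_population = {"north": 52000, "south": 48000}


def clean(rows):
    """Drop rows with no timestamp, remove repeated call_ids, normalize fields."""
    seen = set()
    cleaned = []
    for row in rows:
        if not row["timestamp"]:
            continue  # real-time feeds often have gaps; skip incomplete rows
        if row["call_id"] in seen:
            continue  # keep only the first occurrence of each call
        seen.add(row["call_id"])
        cleaned.append({
            "call_id": row["call_id"],
            "timestamp": datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M"),
            "district": row["district"].strip().lower(),
        })
    return cleaned


def integrate(rows, population):
    """Attach each district's population from the second source (None if unknown)."""
    return [dict(row, population=population.get(row["district"])) for row in rows]


records = integrate(clean(dispatch_rows), district_population)
```

Even in this toy case, two of the three incoming rows are discarded before analysis begins, which hints at why manual handling breaks down at real volumes.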

Manual data collection and integration predictably become unfeasible at volumes this high, necessitating research teams that are technologically savvy and proficient in an increasing number of software tools. In fact, research teams will likely face enough unique challenges that they find themselves fed up enough to give up on what is already available and develop new tools from scratch. Creating, coding, iterating and adapting new tools requires major investments of both talent and resources, and the volume and diversity of unstructured information will only continue to grow.

Once a comprehensive and clean data set is prepared, a new set of challenges emerges, and Data Analysis capabilities are required to take them on. Most social scientists are at least somewhat fluent in basic statistical analysis, but the techniques used to analyze classic empirical research data sets have sample size sensitivities that render them effectively useless for Big Data research.

As a result, statistical scholars are developing and testing new solutions as we speak, and good analytics teams in the business world will be forced to keep up as they emerge. Examples of ideas in development today include: “supervised machine learning (e.g., classification, nonparametric regression), unsupervised machine learning (e.g., clustering), spatial-temporal and other forms of data mining, and new ways to visualize data, such as heat maps.”
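To make the distinction between supervised and unsupervised learning in the quoted list concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions rather than anything the authors provide: the toy data (active days per user, with "churned" and "retained" labels) is invented, the supervised half is a simple nearest-centroid classifier, and the unsupervised half is one-dimensional k-means with two clusters.

```python
from statistics import mean

# Hypothetical toy data: (active days last month, known outcome). Invented values.
labeled = [(2, "churned"), (3, "churned"), (4, "churned"),
           (20, "retained"), (22, "retained"), (25, "retained")]


def fit_centroids(examples):
    """Supervised step: learn the mean feature value for each known label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def classify(value, centroids):
    """Assign the label whose learned centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def two_means(values, iterations=10):
    """Unsupervised step: split unlabeled values into two groups (1-D k-means, k=2)."""
    low, high = min(values), max(values)
    if low == high:
        return sorted(values), []  # nothing to separate
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(near_low), mean(near_high)
    return sorted(near_low), sorted(near_high)


centroids = fit_centroids(labeled)                   # uses the labels
clusters = two_means([value for value, _ in labeled])  # ignores the labels
```

The classifier needs outcomes that someone has already recorded, while the clustering step finds the same two groups from the raw values alone; which approach fits depends on whether labeled outcomes exist for the research question.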

Viewing the landscape through an even wider lens, the authors acknowledge that the opportunities associated with Big Data research are reigniting an interest in quantitative research among those that study entrepreneurship. They caution the reader to resist the temptation to classify their own attitudes and strategies as either quantitative or qualitative, but rather capitalize upon the opportunity to develop a perspective that allows for both.

Indeed, the role of data analytics in entrepreneurship research has yet to be fully defined, and Schwab and Zhang invite players at all levels to take advantage of the opportunity to get involved. Individual researchers, for example, could seek out educational options, like workshops and seminars, to increase their proficiency in relevant skills, or make efforts to launch interdisciplinary studies or programs involving colleagues with relevant expertise. Educational institutions could work to integrate Big Data research skills into their graduate curriculums (the authors specifically suggest training with software tools geared for Big Data management and analysis like Python, R and SQL).

Zooming out on the matter one final time, Schwab and Zhang’s insights confirm that academic research in the field of entrepreneurship is alive and well. Their perspective provides an apt introduction to a conversation on how empirical research and innovation will grow and develop together, rather than simply side-by-side.

Bridging the Gap: How to Get Entrepreneurs to Care About Analytics

Stakeholder/Expert:

Tom Davenport PhD, Social Science, Harvard University
Professor, IT and Management, Babson College
Digital Fellow, Massachusetts Institute of Technology
Senior Analytics Advisor, Deloitte Analytics

Stakeholder/Expert background:

Dr. Tom Davenport is an established academic researcher and professor in the Information Technology and Management Department at Babson College, which for 25 consecutive years has enjoyed the #1 distinction for graduate business programs in entrepreneurship. His research interests span a broad variety of social sciences (cognitive science, information management, organizational behavior, and innumerable others) and his continued relevance in academics is undeniable (publications in the British Medical Journal Innovations, Harvard Business Review, and the New England Journal of Medicine can all be found from 2018 alone).

At the same time, Dr. Davenport’s work has expanded beyond academia to broader audiences across the Business News media, with regular contributions to Fortune, the Financial Times, and the Wall Street Journal, and nearly twenty books exploring management, data analytics, and the complex relationship developing between the two. In pursuit of an authority on “Bridging the Gap,” one would have trouble finding a more appropriate source of wisdom.

Summary and Analysis of Stakeholder’s Perspective:

In an April 2018 Forbes article titled “Even Entrepreneurs Need Analytics,” Davenport laid out in no uncertain terms that the startup world, despite some enthusiastic rhetoric to the contrary, does not transcend a dependence on empirical data.

Entrepreneurs by nature, he explains, are “not a highly analytical bunch.” For better or for worse, the startup zeitgeist, if you will, has long placed a high premium on hunches, intuition, and gut feelings. Relying on a hunch that ultimately fails and necessitates sudden course correction is called “pivoting,” and it is treated as a perfectly natural stage in many companies’ growth. Fortunately for those who believe that entrepreneurship and empirical research can have a symbiotic relationship, Davenport has noticed a turning of the tide among his students.

Perhaps the development is less of a change in philosophy and more an inevitable adjustment to the types of products and services most commonly offered by companies in the startup space: many if not most of them exist within the Internet/Technology sector. In some ways, founders and other managers of early stage companies have had no choice but to make room for data analytics; after all, they are the ones generating these massive amounts of data.

Alternatively, perhaps rising entrepreneurs are embracing analytics because the most successful and highly visible technology companies (the so-called FAANG companies in the U.S.: Facebook, Amazon, Apple, Netflix, and Alphabet/Google, along with Chinese online companies like Baidu and Tencent) have all seemed to hold data science in high regard and work to integrate analytics programs into their overarching strategies.

Davenport points out how Babson, his home institution, has responded to the increase in demand for analytics education in business education (particularly in a community where a large portion of students are aspiring founders, innovators, and company builders). New course offerings, including “Competing on Analytics,” “Analytical Managers and Organizations,” and “Cognitive Technologies,” legitimize an educational framework that bridges the analytics-entrepreneurship gap. He even credits Babson’s newest graduate program – a Master of Science in Business Analytics (MSBA) – as “perhaps the greatest indication of the need to combine entrepreneurship and analytics.” As students and researchers explore the question of how to integrate the two fields, we can expect to see results in both academia and industry, ideally ones that benefit from the best of what both worlds have to offer.

Getting Started: Advice for Entrepreneurs on Building an Analytics Program

Stakeholder/Expert:

Florian Zettelmeyer PhD, Marketing, Massachusetts Institute of Technology
Professor and Chair, Marketing Department; Program Director, Data Analytics, Kellogg School of Management, Northwestern University

Stakeholder/Expert background:

Dr. Florian Zettelmeyer, a highly celebrated researcher and professor of Marketing at the Kellogg School of Management at Northwestern, is also the founder and director of the University’s landmark data analytics program. PDAK (the Program for Data Analytics at Kellogg) offers a data analytics curriculum designed specifically for future managers and business leaders. Its creation was based on the philosophy that the utility and overall presence of data science in business is only growing stronger, and the most successful business leaders within our rising generation will need a strong foundation of scientific knowledge with which to leverage its potential.

With not only support from Northwestern but also partnerships with a variety of companies and industry leaders, Zettelmeyer and his colleagues believe programs like theirs are necessary to capture the depth and breadth of potential applications for data science. A passing understanding of basic research methods will no longer suffice in our increasingly data-driven world, and MBAs who sidestep a deep dive into statistical and methodological fundamentals risk being left behind as demand for these skills continues to grow.

Meanwhile, social science scholars with expertise in business and management have no trouble demonstrating the potential value that a thoughtful and calculated integration of business and data science could provide. By curating insights from throughout his network of respected academic colleagues, Zettelmeyer has established what promises to be an enduring conversation on the potential future for a field of study that gives proper attention to both disciplines.

Summary and Analysis of Stakeholder’s Perspective:

In an April 2018 contribution to Kellogg’s monthly magazine, Zettelmeyer teamed up with Eric T. Anderson, Thomas O’Toole, and Steven Franconeri to create a rudimentary how-to guide for business leaders. Aimed at founders and managers looking to build businesses that embrace analytics from Day 1, their piece, “Take 5: A Guide to Getting Started and Succeeding with Data Analytics,” offers five basic insights on how to get the most out of data analytics:

Leaders Need to Understand Analytics, Too
Tips for Building an Analytics Team
Create a Culture of Intellectual Curiosity
The Analytics Paradox
Displaying Your Data

Leaders Need to Understand Analytics, Too

Entrepreneurs and managers can be perfectly effective without a master’s-level understanding of computer science, but the increasingly technological business landscape has raised the bar for a fundamental understanding of data science. Leaders must avoid the temptation to defer to the experts and scientists when it comes to analytics. They must have enough basic education and fluency in the field to ask the right questions, separate good data from bad, and identify how and where data science can add value.

Tips for Building an Analytics Team

Excellent data scientists are in short supply, and identifying and recruiting the best ones is not easy. Dan Wagner, former chief analytics officer for the Obama 2012 campaign, describes how thousands of hours interviewing potential analysts led him to the realization that he simply could not identify who would be good at the job through traditional methods. Much more revealing, in reality, was a simulated exam process designed to showcase relevant skills. Naturally, it was the gifted yet reserved introverts that excelled in this environment, “and that,” he confirms, “is the classic person that you’re trying to hire.”

Create a Culture of Intellectual Curiosity

Making the most of the ability to use data to solve problems involves tapping into all the expert minds at one’s disposal. Naturally, this extends far beyond the confines of a data analytics team, or even those who regularly interact with one. Data science is only as useful as the questions it answers, and these can come from unexpected sources. The authors encourage managers to be explicit in this pursuit: communicate to all employees that intellectual curiosity is no less than a job requirement, and they can demonstrate value by asking new questions and proposing new data-driven processes to pursue solutions.

The Analytics Paradox

An established analytics program with a pipeline of thoughtful questions at its disposal can begin to appear so streamlined it is almost automatic. This may be a consequence, the authors warn readers, of what they call the “Analytics Paradox.” Essentially, as data collection processes become increasingly efficient, the errors and noise predictably fade, and the output becomes increasingly homogeneous. While this seems like a harmless trend, the errors and the noise are actually providing essential inputs that allow the algorithm to test and analyze a variety of data points. The fact that this largely unknown yet fundamentally relevant insight is found buried within one business school’s Internet periodical is remarkable! Proponents of dedicated data science programs in management curriculums truly need look no further for proof positive of the unmet need.
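One way to read the paradox is in terms of variation: an analysis can only measure the effect of an input that actually varies in the data. The sketch below illustrates that reading with an ordinary least-squares slope. It is an assumption-laden toy, not an example from the Kellogg article: the discount and sales figures are invented, and `estimate_slope` is a hypothetical helper written for this illustration.

```python
from statistics import mean


def estimate_slope(xs, ys):
    """Least-squares slope of y on x, or None when x never varies."""
    x_bar, y_bar = mean(xs), mean(ys)
    variation = sum((x - x_bar) ** 2 for x in xs)
    if variation == 0:
        return None  # the input never changed, so its effect cannot be measured
    covariation = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return covariation / variation


# Hypothetical early period: discounts set inconsistently, so they vary widely.
early_discounts = [0, 5, 10, 15, 20]
early_sales = [100, 110, 120, 130, 140]

# Hypothetical later period: the process is "optimized" and always uses 10.
later_discounts = [10, 10, 10, 10, 10]
later_sales = [119, 121, 120, 122, 118]

early_slope = estimate_slope(early_discounts, early_sales)  # estimable
later_slope = estimate_slope(later_discounts, later_sales)  # not estimable
```

In the messy early period the relationship between discount and sales can be estimated; once every decision is identical, the same calculation has nothing to work with, which is one reason a team might deliberately keep some experimentation in an otherwise streamlined process.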

Displaying Your Data

Presentation and packaging are not superfluous concerns. On the contrary, failure to clearly present the hard-won conclusions of a data science investigation would be a disservice to the entire program. Create visualizations (do not ask your audience to read text and listen simultaneously) and make them as different from one another as possible.

Ideally, simple guidelines like these will chip away at the cloak of mystery that surrounds data science, especially in the startup community. As the basic concepts and terminology become more pervasive across businesses of all sizes, even the most free-spirited of entrepreneurs will start looking for new and exciting ways to enhance their business through data science. To use phrasing borrowed from the article’s title, the most successful entrepreneurs may soon be those who see “Succeeding with Data Analytics” as synonymous with “Getting Started.”

Data Science Startups vs. Academia: Lessons Learned from a Career Switch

Stakeholder/Expert:

Brock Ferguson PhD, Cognitive Psychology, Northwestern University
Data Scientist, Strong Analytics

Stakeholder/Expert background:

Dr. Brock Ferguson performed his doctoral research in Cognitive Science at Northwestern University, and participated in a dual degree program in Business Management for Scientists and Engineers offered by Northwestern’s Kellogg School of Management. His cognitive science research approached the discipline from both social science and technological perspectives, drawing novel insights from how infants learn and process language into the linguistic landscape of computer science. He developed and led educational programs using novel data processing tools before transitioning to advisor and consultant roles for various venture-backed startups including popular subscription apparel service StitchFix.

He currently serves as Principal Data Scientist for Strong Analytics, a consulting firm he co-founded with Jacob Zweig, a fellow former academic with a PhD in neuroscience. Their team offers solutions designed to leverage machine learning to improve products and operations for businesses across tech, marketing, manufacturing, and distribution, and is developing an in-house product that optimizes drip email marketing campaigns using AI.

Summary and Analysis of Stakeholder’s Perspective:

Reflecting upon his transition from academia to startups, Dr. Ferguson realized that his journey, while not necessarily unique among PhDs in technical fields, was not often discussed. To shed light on the process for this emerging scientist-turned-founder community, he presented an examination of his first year as a founder in a 2017 article for Medium.com entitled “Leaving academia to start a data science company: Looking back at our first year.”

He recounts that early in his career, he saw academia and business as two equally fulfilling but decidedly separate paths. Academia provided him with intellectual stimulation, while software development satisfied his urge to build new things. Along the way, however, he learned how to thrive in a career that existed at the intersection of both. For like-minded peers, Ferguson offers the following lessons:

Learn what ‘data science’ really means
Learn about the hype around data science
Learn to sell data science consulting
Learn to stay motivated and excited

Learn what ‘data science’ really means

Even as active participants, he and his colleagues initially failed to appreciate the true breadth of skills and tools expected of today’s data scientists. Unlike most sciences, for example, theirs is expected to produce results – scholars and journalists are often found referring to Data as something with a measurable return on investment. Managing these expectations while attempting to deliver the desired results when possible requires a deeper knowledge of business concepts than what most technical roles entail. Perhaps more alarmingly, even the skill requirements on the technical side were more than most would have expected for an entry-level position. (A comprehensive list of “things we use regularly” is provided for reference and also, one assumes, to prove his point.) “In the end,” he writes, “I’ve come to describe a data scientist as someone who both recognizes the opportunities in data and has the skills to capitalize on these opportunities end-to-end.”

Learn about the hype around data science

Thought leaders across various fields are showing increased excitement about data science, which is generally good news for academics (jobs for everyone!), though there are downsides as the field floods with newcomers, like having to compete with flashy new competitors making impossible promises. Overall, however, Ferguson thinks the hype has a net positive effect on the industry – applications in new industries are appearing every day, and the scientists are thrilled to have a new set of questions to answer or problems to solve.

Learn to sell data science consulting

Ferguson recalls the so-called “no touch” sales strategy he devised at the first company he started, which essentially meant that customers found, tested, and decided to buy the product all on their own. He worked hard to enable this process so he could spend his time developing software instead of pushing sales. This was fine for a products business, but in a service business, especially when its market is as hot as analytics is right now, one might quickly find oneself up against an influx of competing firms making similar or bigger promises, outlandish as they may be. He had no choice but to take on the salesman role he’d eschewed for so long and convince his customer base that his offering was the best in the market. Even worse, he would have to do it without any metrics or calculations to support his claim (a scientist’s worst nightmare, of course).

Unfortunately, clients are used to hearing over-the-top predictions and egregious promises from salespeople. In all likelihood, they’ve developed an automatic (likely entirely subconscious) adjustment in their minds to temper expectations. I have named it the Founder Fervor Factor. Good scientists would never allow such chicanery, which leaves them in the awkward position of having their (more realistic and defensible) claims fall flat, causing their company to be passed over for a competitor that ends up being all show and no go.

To address this problem, Ferguson implemented a policy wherein customers are offered a “discovery” phase in which he and his team will conduct a thorough initial review of not only their data, but also things like operations processes, available tools, and business plans. Ferguson’s team spends this phase gathering enough information to understand what their client’s particular collection of data and other tools can do and what it can’t. Then and only then will his team present their recommendations, which by then will justifiably include a clear explanation of the project along with projected milestones, deliverables, and even ROI.

Most clients will see the discovery phase as a can’t-lose option: without shelling out a large upfront check, they can use empirically derived information to inform their ultimate hiring decision, and even if they don’t end up deciding on Ferguson’s team, they walk away with an expert assessment of the potential applications of their data and a fresh set of eyes on their tools, plans and strategies.

Learn to stay motivated and excited

Academic researchers are no strangers to all-nighters and ungodly work hours, and the company was meeting his performance expectations just a few months in, so Ferguson was understandably surprised when he began feeling burnt out so soon. After some introspection, he concluded that since his transition from academia to industry, the impetus he used to have to set aside time for reading academic papers or practicing new skills had been fading away. He urges readers to guard against impending complacency by making time in their schedules to learn about something, as long as it has nothing to do with their current client project and they are excited about the topic.

He closes on a positive note, conceding that the intersection of analytics and the business community (entrepreneurially inclined or otherwise) is home to an abundance of opportunities to learn new skills, ask new questions, and discover new solutions. He seems to have shared this article in hopes of demystifying the business world and building a rich bullpen of top quality data scientists for the future.

Pitfalls and Red Flags: Data Science Mistakes Startups Make

Stakeholder/Expert:

Amanda Richardson, CEO, Rabbit, Inc.
Former Chief Data and Strategy Officer, HotelTonight
MBA, Stanford University Graduate School of Business

Stakeholder/Expert background:

Amanda Richardson received her BS in commerce and worked as an equity analyst before attending Stanford University Graduate School of Business for her MBA. Unlike our other selected experts, she never returned to academia after graduating from Stanford, instead pursuing an impressive career path involving various roles in business development and product management. At Eclipsys Corporation (now Allscripts Group), she managed an analytics software product and was able to capitalize on this early experience by incorporating data science into future leadership endeavors. For example, as VP Product for HotelTonight, Richardson focused on data insights to inform strategy and planning efforts, which likely contributed to her subsequent appointment as Chief Data and Strategy Officer. She is currently CEO of Rabbit, Inc., a Bay Area tech company whose product is a mobile app that combines social networking with streaming media content.

While still in her role as Chief Data and Strategy Officer for HotelTonight, Richardson contributed a data science-focused feature to First Round Review, an online content hub for entrepreneurs that aims to be considered “the Harvard Business Review for startups.” The article, entitled “The Four Cringe-Worthy Mistakes Too Many Startups Make with Data,” provides a clear and comprehensive introduction to analytics for the startup community, cleverly packaged as a simple piece of guidance on What Not to Do. Her lessons take the form of the following four errors:

Mistake #1: Starting with Metrics Instead of a Goal
Mistake #2: Rampant Personalization
Mistake #3: Hiring a Dedicated Data Scientist
Mistake #4: Chasing After the Latest Toolset

Mistake #1: Starting with Metrics Instead of a Goal

Resist the temptation to start collecting and sorting data without a clear vision as to why. Research experts with years of academic experience may find this obvious, but it is equally important that they understand that many people on the business side of their institutions may not. Make sure all stakeholders – not just the analytics team – know the specific question or hypothesis that prompted the data collection to begin with. Entrepreneurs will find that understanding the dangers of an answer-first, question-second approach to research can pay dividends across all stages and functions. Think of your research question as a scorecard; have something written down before starting any new projects. Recall the SMART acronym for setting goals: specific, measurable, achievable, relevant, and timely. To be clear, the importance of a clear objective does not preclude adapting to unexpected findings that your data might produce, but keep top metrics top of mind at all times.

Mistake #2: Rampant Personalization

Especially on a startup’s budget, consider the costs – financial and otherwise – of each choice, and prioritize functionality over customization. Even if personalization is a choice that fits with your product, trying to achieve it too soon can be wasteful and ineffective. “Effectively personalizing a product generally requires a good amount of banked data — data that a new company may not have had time to amass.” Indeed, if you are finding it easy to personalize your product from the outset, ask yourself if you could be making better use of data to understand your customer moving forward. Return to the question of what problem you’re solving and recognize the considerable opportunity costs of misdirected focus.

Mistake #3: Hiring a Dedicated Data Scientist

Consider reframing analytics, at least at first, as a company-wide program rather than a single person’s (or team’s) job. If data science occupies a sequestered space or an insulated department within the company, other employees may fail to contribute to key decisions or conversations. As we have established, it takes multiple inputs to craft a good research strategy. Don’t blind the scientists to the realities of how the company is operating, and don’t coddle the strategists by avoiding the statistical principles needed to generate a valid hypothesis.

Mistake #4: Chasing After the Latest Toolset

Remember “garbage in, garbage out,” meaning the insights achieved from any data analytics program are only as good as the data that were originally put in. In other words, focus on the quality of the data being collected, not the computer program doing the collecting. Simple analytics software can meet nearly any company’s basic needs, and Richardson explains that, for most organizations, there are only three: a central dashboard, accessible data, and flexible tooling.

Richardson urges founders – especially those at brand-new firms that are still building their teams – to view this list of insights as an opportunity for a solid methodological head start on the creation of their own analytics programs. The sooner a team is assembled and the fundamentals are embraced company-wide, the sooner the data can be put to work.

Sarah Igoe, M.D. is an MBA student at the Tuck School of Business at Dartmouth College and a medical doctor, having worked at Harvard University, Yale University, Mass General Hospital and Montefiore Health System.

Featured Papers/Articles:

Richardson, A. “The Four Cringe-Worthy Mistakes Too Many Startups Make with Data.” FirstRound.com, 2018. https://firstround.com/review/the-four-cringe-worthy-mistakes-too-many-startups-make-with-data/

Ferguson, B. “Leaving academia to start a data science company: Looking back at our first year.” Medium.com, 12 Jun 2017. https://medium.com/@brockferguson/leaving-academia-to-start-a-data-science-company-looking-back-at-our-first-year-33dab049d965

Zettelmeyer, F, Anderson, E, O’Toole, T, Franconeri, S. “Take 5: A Guide to Getting Started and Succeeding with Data Analytics.” Kellogg Insight; Kellogg School of Management. 3 Apr 2018. https://insight.kellogg.northwestern.edu/article/take-5-a-guide-to-getting-started-and-succeeding-with-data-analytics

Davenport, T. “Even Entrepreneurs Need Analytics.” 18 Apr 2018. Forbes Magazine. https://www.forbes.com/sites/tomdavenport/2018/04/18/even-entrepreneurs-need-analytics/#60c4003f4ff9

Schwab, A and Zhang, Z. “A New Methodological Frontier in Entrepreneurship Research: Big Data Studies.” Entrepreneurship Theory and Practice. Feb 2018. https://journals.sagepub.com/doi/pdf/10.1177/1042258718760841

Referenced within:

McAfee, A., & Brynjolfsson, E. (2012). Big data: The management revolution. Harvard Business Review, 90(10), 60–66.
