Data Science at Companies

Why does Airbnb have a data scientist on every team? What did it take to build out Thumbtack's data infrastructure? How do Twitch data scientists convince execs to embrace data-informed decision making? Get behind-the-scenes perspectives on how data teams at actual companies tackle questions of building infrastructure, scaling analytics, attaining buy-in, and structuring teams.

The View From The Data

Making data-informed decisions has a lot more to do with people than it does with the actual data. - Karen Roter Davis

Data Science On The Silicon Beach

In this interview, the Chief Data Officer for the city of San Diego discusses his team’s ad-hoc approach, integrating their stack with legacy systems, and his plans for employing data to alleviate traffic congestion. - Partially Derivative

Hiring a data scientist

Hiring a data analyst is no easy task. Wikimedia shares how they drew on existing resources to synthesize a better approach to interviewing and hiring a new member of their data team. - Wikimedia

How Fitbit’s data science team scales machine learning

Workout regimens need to be tailored to each individual. Directional correctness isn’t enough. Fitbit’s head of data science shares how his team builds a model for every user to increase motivation and prevent injuries. - Mixpanel

Scaling Data Science at Stitchfix

Not many companies can say they employ 80 data scientists. The folks at Stitch Fix share their tactics for making data and compute resources more accessible—which in turn keeps data scientists happy and infrastructure healthy. - MultiThreaded

Building & Maintaining a Master Data Dictionary: Part 2

Check out these ideas for structuring key metric definitions to keep everyone at your organization on the same page. - The Data Point

What’s it like to work in sports analytics?

From the outside, crunching numbers for a national sports league seems glamorous. The cold hard truth? It’s an often thankless job with low pay and long hours. The only thing that’ll prevent burnout is a pure love of the game. - StatsbyLopez

Quora Session with Monica Rogati

The Former VP of Data at Jawbone did a Quora session last week. - Quora

How To (Actually) Calculate CAC

Quick: What’s the difference between customer acquisition cost (CAC) and cost per acquisition (CPA)? If you hesitated, this post is for you. - Brian Balfour

Breaking the Vanity Metric Cycle

“[B]reaking free of worthless metrics is hard because it is breaking a psychological reward, not just adopting some new stats.” - Amplitude

The Limitations Of Data And Benchmarks

“All the quantitative analysis in the world won’t lead me to the next great idea for startup. Those figures can’t create empathy, develop the right culture, or hire the right people.” - Tomasz Tunguz

Trust in Data Science

“An untrusted analysis is an unused one, regardless of the quality. So how does one go about building, or rebuilding, trust in the face of challenges and failure?” - Clover Health

8 Data Science Skills That Every Employee Needs

A nice primer to share with your colleagues. - Amplitude

Ten Ways Your Data Project is Going to Fail

“Many companies seem to go through a pattern of hiring a data science team only for the entire team to quit or be fired around 12 months later. Why is the failure rate so high?” - Martin Goodson

Why I’m Teaching Twitch to Predict the Future

Forecasting is a good habit to adopt in the workplace. It’ll help you figure out the odds of delivering on your goals. Plus, having a record of accurate predictions builds trust in your work and analytical thinking in general. - Twitch

Don’t Become a Victim of One Key Metric

“[T]he search for one key metric at all for a complex ecosystem like Pinterest over-simplifies how the ecosystem works and prevents anyone from focusing on understanding the different elements of that ecosystem. You want the opposite to be true.” - Casey Winters

Data Literacy, Product Design and the Many-Faced God

“Building a team that’s doing ‘cutting-edge research in deep learning, machine intelligence, and artificial intelligence’ is not easy—not in this hiring environment. But infusing data thinking throughout a company is orders of magnitude harder. This matters, because data thinking permeates your products and can make them feel ‘smart’—or not.” - Monica Rogati

Tracking Customer Service Metrics With SQL

This guide includes a dozen SQL queries for calculating customer service metrics with raw Intercom data. - Mode

GGPlot2 As a Creativity Engine, and Other Ways R is Transforming the Financial Times Data Journalism

Learn how the Financial Times produces high-quality data visualizations in this presentation, complete with the R code and data used for their piece, “Explore the changing tides of European footballing power.” - Financial Times

Surviving Data Science “at the Speed of Hype”

Complex optimization models work best when they’re asked to deal with stable business problems, like airline scheduling or ad targeting at Google. But at a startup, where the business model is constantly changing, simply summarizing data is a much better way to find answers. - John Foreman

How We Rebuilt the Wall Street Journal’s Graphics Team

The WSJ used to have two Graphics teams—one for print and one for the web. Combining the two has allowed editors to focus on storytelling from the start of projects, instead of the medium. - Source

What I Wish I Knew About Data For Startups

One entrepreneur reflects on his learnings from four years of working with data at a startup. It’s a goldmine of advice on building a strong, scalable data culture. Don’t skip this one. Seriously. - Jean-Nicholas Hould

Simple requirements gathering questions for dashboard design

Next time someone asks you to make a dashboard, pull this list out. It provides a framework for sussing out what’s needed for the dashboard to be useful and effective. - Paint by Numbers

The Data Driven Daily

This newsletter defines business KPIs and explains how to calculate them for your business. This week they’re covering how to determine the size of your potential customer market. The archive is well worth perusing; past segments include revenue calculation and pricing strategy. - Outlier

FiveThirtyEight’s data journalism workflow with R

FiveThirtyEight’s quantitative editor shares the analytical process behind some of their publication’s most popular articles. - useR!

Practical advice for analysis of large, complex data sets

“This document has been read more than anything else I’ve done at Google over the last eleven years. Even four years after the last major update, I find that there are multiple Googlers with the document open any time I check.” - The Unofficial Google Data Science Blog

One year as a Data Scientist at Stack Overflow

The chronicle of one data scientist’s transition from academia to the tech industry, combined with a peek into Stack Overflow’s machine learning and data infrastructure projects. - David Robinson

Whom the Gods Would Destroy, They First Give Real-time Analytics

Every few months, I try to talk someone down from building a real-time product analytics system. When I'm lucky, I can get to them early. - Dan McKinley

Real-time dashboards considered harmful

There’s a certain allure to real-time data: your team can see what’s happening right now and take action immediately. Ultimately, though, most real-time dashboards create a bunch of noise that distracts you from more important metrics. - Basecamp

Boosting Sales With Machine Learning

One developer shares how his team used natural language processing and machine learning in Python to pre-qualify sales leads so reps don’t have to spend hours doing it manually. - Xeneta

Scaling Knowledge at Airbnb

Airbnb’s data team shares their solution to ensuring insights don’t get lost in Google docs or email threads: a centralized knowledge repository. - Airbnb Engineering

Bridging the Gap Between Data Science and Data Engineering

Josh Wills, Director of Data Engineering at Slack, shares his thoughts on how data engineers and data scientists work best together. - Hakka Labs

Statistically Interesting

Craving a new data science podcast? Check out Statistically Interesting, a series of interviews with data science leaders at companies like Twitch, Vimeo, and Weebly. - Statistically Interesting

How to Make Reps Care About Data Quality

When a sales rep fails to record information about her activities or clients, it can lead to incomplete and inaccurate reports and forecasts. These tips and tricks will help sales leaders encourage reps to be vigilant about consistently logging data. - InsightSquared

Building Data Science in Healthcare

Many tech companies have complete control over the format of the data they collect. Healthcare, which relies on external data about patients and their interactions, has no such luxury. Ian Blumenfield, Head of Data Science at Clover Health, shares how they handle messy data and the other unique data challenges the industry faces. - Clover Health

This Is How You Build Products for the New Generation of ‘Data Natives’

We’ve grown used to the idea of digital natives—the toddler who expects everything to be a touchscreen and pinches and swipes her fingers on TVs and magazines. But data natives are something different: they expect “their world to not just be digital, but to be smart and to adjust immediately to their taste and habits.” Monica Rogati, former VP of Data at Jawbone, shares ideas for harnessing data to build products for these new consumers. - First Round Review

So you want to build a data business? Play the long game

Foursquare has demonstrated, once again, that it’s capable of predicting public company earnings with an incredible degree of accuracy based on real world foot traffic data. - Michael Carney

Hot property: How Zillow became the real estate data hub

Zillow is a real estate powerhouse, and one of their biggest competitive advantages is their massive dataset of property listings. The most interesting part of this article goes into how their data science team brings together messy data from disparate sources to create one coherent super-dataset. - InfoWorld

The Art and Science of Storytelling Through Data at Jawbone

Analysis can be worthless if it’s not communicated well. Jawbone data scientist Kirstin Aschbacher shares how she develops a data story that inspires action, from concept to presentation. - Insight Data Science

Doing Data Science Right — Your Most Common Questions Answered

This is a must-read for startup founders who want to build data science teams. It’s packed with details on the inner-workings of data-driven businesses and advice on where to start based on your company’s needs. - First Round Review

How Does the Data Science Team Work at Twitch?

In this interview, Twitch's Director of Science shares how the data science team thinks about mentorship, gaining leverage, and qualitative research. - Mode

Analyzing Your Stripe Data, Part 1: Measuring Subscription MRR

Got raw Stripe data? Want to calculate your subscription monthly recurring revenue? Lucky for you, this post provides the SQL queries you’ll need, tips for data prep, and ways to tailor the analysis to your business. - Analyst Collective

Building a high-throughput data science machine

Scaling is a problem every data science team faces. How do you go from one nomadic analyst roaming between departments to a structured team? The answer is a little different for every company, but this interview introduces some best practices to keep in mind. - O’Reilly

How to Find Correlative Metrics For Conversion Optimization

A thorough walk-through of how to find correlative metrics and leverage them for conversion. It’s jam-packed with examples and advice from experts, plus a handy list of tools. - ConversionXL

Why Airbnb Has a Data Scientist on Every Leadership Team

Airbnb's head of data science shares his keys for success in data and business. - Inc.

Riley Newman on Data Science for Startups

In this interview, Airbnb’s head data scientist Riley Newman talks about building a strong data culture, balancing technical skills with storytelling ability, and scaling data science at a high-growth startup. - Intercom

How does Lumosity use data science?

An inside look at the structure of Lumosity’s data science team and the internal tools and product features they build. - Quora

CAC Payback Period: The Most Misunderstood SaaS Metric

If you’re calculating customer acquisition cost payback period for a SaaS product, keep these two things in mind: payback metrics are about risk, not return, and most SaaS products operate on an annual model, not a monthly one. - Kellblog

The Five-Step Guide to Robust Help Center Metrics

When a documentation manager set out to revamp her company’s help site content, she was surprised to find very few resources on how to measure her project. Thankfully, she documented her journey so we can all learn from it. Great tips in here for anyone looking to make their help center more, well… helpful. - RJMetrics

How to catch million dollar mistakes before they cost you millions of dollars

Are you measuring the impact of back-end updates on user behavior? Failure to do so could cost you big time. - Lucidchart

The Role of Statistical Significance in Growth Experiments

When you run an experiment, you’re looking for statistically significant results. But if you’re running growth experiments on a product—iterating quickly to optimize—the standard rules of statistical significance may not apply. - Medium

Minimum Viable Onboarding for PMs

A product manager from DoorDash shares his thoughts on the most successful employee onboarding process he’s experienced. Spoiler—it involves data analysis. - Charlton Soesanto

7 Steps to Measuring the Success of a Feature

You’ve spent months working on a feature and now it’s live. How do you tell if users actually like it? Dig into your user data and start measuring with this detailed walkthrough. - Amplitude

Highly Effective Data Science Teams

To do great data science work, you need more than a huge heap of data. This article offers 14 criteria for assessing your team’s effectiveness. - Twitch

What BuzzFeed’s Dao Nguyen Knows About Data, Intuition, And The Future Of Media

This entire article is worth reading, but skip to the middle for the real gem—publisher Dao Nguyen’s holistic philosophy on data at BuzzFeed. “[Y]ou can’t only use comments, you can’t only use data, you can’t only use anything. You can’t only use your own intuition, either. It has to be all of those things you use.” - Fast Company

Diligence at Social Capital, Epilogue: Introducing the 8-ball and 'GAAP for Startups'

Figuring out what metrics to present to investors can be a struggle for startups. That’s because there are really no standardized metrics or reporting practices in the startup world. Venture capital firm Social Capital is hoping to change that with their tool for gauging product-market fit at early stage companies. Plug in your own data and give it a whirl. - Jonathan Hsu

Building a business that combines human experts and data science

An insightful interview with Eric Colson about algorithms, human computation, and building data science teams at Stitch Fix and Netflix. - O’Reilly Data Show Podcast

You’re Measuring Daily Active Users Wrong

A high number of daily active users (DAU) may sound impressive, but does it actually mean anything? To make your DAU metric actionable, you need to measure how often users are getting core value out of your product, not how many times they log in. - Amplitude

The Ecommerce Holiday Customer Benchmark

Those new customers from the holiday season are more valuable than you thought. So when should you engage these shoppers to turn them into repeat customers? - RJMetrics

How Toyota Revamped Its Collections Biz with Big Data Analytics

Toyota Financial Services (TFS) used to collect car payments with a one-size-fits-all approach. Then the recession hit. For the first time, 100,000 people a day were behind on their payments. With a massive analytics overhaul, TFS was able to personalize its collection strategies and help 6,000 customers keep their cars from being repossessed. - Datanami

How Instacart Uses Redshift to Drive Growth

In this interview, Fareed Mosavat, growth PM at Instacart, shares how his team combines behavior, shipping, and fulfillment data to inform product decisions. Check out how his team uses SQL to define internal metrics, conduct A/B tests, and discover how many touches it takes before a user makes their first order. - Segment

Wrangle Conference 2015

If you didn’t get to attend the Wrangle Conference in October, now’s a good time to catch up. - Cloudera

Fashion Goes Deep. Data Science at Lyst

Fashion moves quickly. So, too, does the data science that powers e-commerce sites. In this interview, Lyst lead data scientist Eddie Bell shares the ins and outs of their recommendation engine. Learn how his team has tackled the challenges of constantly changing merchandise and kept suggestions fresh using machine learning and image analysis. - Fast Forward Labs

3 Tips for Centralizing Your Analytics Team Structure As You Grow

In practice, centralized analytics teams often report into product while supporting the needs of the entire business. - Mode

Data Down on the Farm

This episode is part of a series about data and food from Andreessen Horowitz. Learn how farmers are using software and analytics programs to monitor crop health and performance, implement agricultural policies, and adopt revenue-focused business practices. - Andreessen Horowitz