
Interesting Data Sets

A robust data set is usually the first step toward answering a question. We've collected articles featuring wacky and useful data sets for training machine learning models, practicing an analytical language, or finding compelling insights.

Census Oddities

So many analyses are built on data from the U.S. Census and American Community Survey, but those datasets have their own quirks you need to watch out for. - Carto Blog

US House PSCI Social Media Ads

Last Thursday, Democratic members of the House Intelligence Committee released 8.8 gigabytes of information about Facebook ads paid for by Russians attempting to interfere in American politics. The data has since been converted to a CSV, so you can explore it for yourself. - data.world

Need a ratings boost? Make a Halloween episode.

This analysis of over 24,000 episode ratings from 184 television shows proves that Halloween TV episodes aren’t just filler. - Kaylin Walker

The Anatomy of a Thousand Typefaces

Say goodbye to endlessly scrolling through the font menu in your word processor. Instead, use this database of typefaces, classified by characteristics like width, spacing, and stroke contrast. - Florian Schulz

9 Elements of Deal-Closing Sales Demos, According to New Data

Forward this one to your sales team. This is yet another good example of a company using its proprietary dataset (in this case, recordings of sales calls) to tell stories and generate interest in its brand. - Gong.io

Quick, Draw! The Data

These doodles are a unique data set that can help developers train new neural networks, help researchers see patterns in how people around the world draw, and help artists create things we haven’t begun to think of. - Google

We’re Sharing A Vast Trove Of Federal Payroll Records

Buzzfeed, via the Freedom of Information Act, got their hands on a dataset comprising four decades of salaries, titles, and demographic details about millions of U.S. government employees, as well as how they moved through the federal bureaucracy. - Buzzfeed

3 Million Instacart Orders, Open Sourced

Instacart has released an anonymized dataset containing a sample of over 3 million grocery orders from more than 200,000 users. Download the data and dig in. - Engineering at Instacart

Executive Office of the President Open Data Archive Backup

Data downloaded from the White House website on January 20, 2017. - Maxwell Ogden

TrumpWorld Data

Buzzfeed put together a dataset to shed light on Trump’s giant network of businesses, investments, and corporate connections. Right now, it includes more than 1,700 people and organizations. Explore the data yourself via Github or Google Sheets. - Buzzfeed

CoolDatasets

Follow this brand new Twitter account for tons of open, online datasets. - Twitter

The DataRefuge Project

DataRescue events create trustworthy copies of federal climate and environmental data, while the Internet Archive, datarefuge.org, and a consortium of major research libraries hold these copies. - PPEH Lab

Academic Torrents

Getting your hands on interesting data can be a chore. Some clever folks at the University of Massachusetts put together a platform for distributing datasets and research papers with BitTorrent technology. - Academic Torrents

20 Weird & Wonderful Datasets for Machine Learning

Getting your hands on a robust dataset is the hardest part of machine learning. Finding interesting datasets is tougher still. From UFO sightings to beautiful Flickr photos, you’re sure to find something to train your model. - Oliver Cameron

San Francisco Housing Construction History

When someone mentions San Francisco’s housing shortage, they usually cite a limited dataset containing San Francisco Chronicle rental listings from 1979-2001. Eric Fischer took it upon himself to collect decades of new information by transcribing Chronicle rental ads from 1948-1979 and Craigslist rental listings from 2001 onward. - Eric Fischer

Zika Data Guide

It’s surprisingly hard to find data on the Zika virus outbreak. That’s why Buzzfeed’s Jeremy Singer-Vine put together a collection of links to Zika datasets for people to contribute to and use for reference. - Buzzfeed

A terrifying and hilarious map of squirrel attacks on the U.S. power grid

Explore this nutty dataset in detail. - Wonkblog

Yahoo News Feed

A collection of 110 billion Yahoo News user actions, and the largest publicly released machine learning dataset to date. - Yahoo Labs