Interesting Data Sets

A robust data set is usually the first step toward answering a question. We've collected articles featuring wacky and useful data sets for training machine learning models, practicing an analytical language, or finding compelling insights.

We’re Sharing A Vast Trove Of Federal Payroll Records

Buzzfeed, via the Freedom of Information Act, got their hands on a dataset comprising four decades of salaries, titles, and demographic details about millions of U.S. government employees, as well as how they moved through the federal bureaucracy. - Buzzfeed

3 Million Instacart Orders, Open Sourced

Instacart has released an anonymized dataset containing a sample of over 3 million grocery orders from more than 200,000 users. Download the data and dig in. - Engineering at Instacart

Executive Office of the President Open Data Archive Backup

Data downloaded from the White House website on January 20, 2017. - Maxwell Ogden

TrumpWorld Data

Buzzfeed put together a dataset to shed light on Trump’s giant network of businesses, investments, and corporate connections. Right now, it includes more than 1,700 people and organizations. Explore the data yourself via GitHub or Google Sheets. - Buzzfeed


Follow this brand new Twitter account for tons of open, online datasets. - Twitter

The DataRefuge Project

DataRescue events create trustworthy copies of federal climate and environmental data, while the Internet Archive and a consortium of major research libraries hold these copies. - PPEH Lab

Academic Torrents

Getting your hands on interesting data can be a chore. Some clever folks at the University of Massachusetts put together a platform for distributing datasets and research papers with BitTorrent technology. - Academic Torrents

20 Weird & Wonderful Datasets for Machine Learning

Getting your hands on a robust dataset is the hardest part of machine learning. Finding interesting datasets is tougher still. From UFO sightings to beautiful Flickr photos, you’re sure to find something to train your model. - Oliver Cameron

San Francisco Housing Construction History

When someone mentions San Francisco’s housing shortage, they usually cite a limited dataset containing San Francisco Chronicle rental listings from 1979-2001. Eric Fischer took it upon himself to collect decades of new information by transcribing Chronicle rental ads from 1948-1979 and Craigslist rental listings from 2001 onward. - Eric Fischer

Zika Data Guide

It’s surprisingly hard to find data on the Zika virus outbreak. That’s why Buzzfeed’s Jeremy Singer-Vine put together a collection of links to Zika datasets for people to contribute to and use for reference. - Buzzfeed

A terrifying and hilarious map of squirrel attacks on the U.S. power grid

Explore this nutty dataset in detail - Wonkblog

Yahoo News Feed

A collection of 110 billion Yahoo News user actions, and the largest publicly released machine learning dataset to date. - Yahoo Labs