Interesting Data Sets

A robust data set is usually the first step toward answering a question. We've collected articles featuring wacky and useful data sets for training machine learning models, practicing an analytical language, or finding compelling insights.

Executive Office of the President Open Data Archive Backup

Data downloaded from the White House website on January 20, 2017. - Maxwell Ogden

TrumpWorld Data

BuzzFeed put together a dataset to shed light on Trump’s giant network of businesses, investments, and corporate connections. Right now, it includes more than 1,700 people and organizations. Explore the data yourself via GitHub or Google Sheets. - BuzzFeed


Follow this brand new Twitter account for tons of open, online datasets. - Twitter

The DataRefuge Project

DataRescue events create trustworthy copies of federal climate and environmental data, while the Internet Archive, datarefuge.org, and a consortium of major research libraries hold these copies. - PPEH Lab

Academic Torrents

Getting your hands on interesting data can be a chore. Some clever folks at the University of Massachusetts put together a platform for distributing datasets and research papers with BitTorrent technology. - Academic Torrents

20 Weird & Wonderful Datasets for Machine Learning

Getting your hands on a robust dataset is the hardest part of machine learning. Finding interesting datasets is tougher still. From UFO sightings to beautiful Flickr photos, you’re sure to find something to train your model. - Oliver Cameron

San Francisco Housing Construction History

When someone mentions San Francisco’s housing shortage, they usually cite a limited dataset containing San Francisco Chronicle rental listings from 1979-2001. Eric Fischer took it upon himself to collect decades of new information by transcribing Chronicle rental ads from 1948-1979 and Craigslist rental listings from 2001 onward. - Eric Fischer

Zika Data Guide

It’s surprisingly hard to find data on the Zika virus outbreak. That’s why BuzzFeed’s Jeremy Singer-Vine put together a collection of links to Zika datasets for people to contribute to and use for reference. - BuzzFeed

A terrifying and hilarious map of squirrel attacks on the U.S. power grid

Explore this nutty dataset in detail. - Wonkblog

Yahoo News Feed

A collection of 110 billion Yahoo News user actions, and the largest publicly released machine learning dataset to date. - Yahoo Labs