Category: 09. Built-in small datasets

  • California Housing price regression dataset

    load_data function Loads the California Housing dataset. This dataset was obtained from the StatLib repository. It is a continuous regression dataset of 20,640 samples, each with 8 features. The target variable is a scalar: the median house value for California districts, in dollars. The 8 input features are the following: This dataset was derived from the 1990 U.S.…

  • Fashion MNIST dataset, an alternative to MNIST

    load_data function Loads the Fashion-MNIST dataset. This is a dataset of 60,000 28×28 grayscale images of 10 fashion categories, along with a test set of 10,000 images. This dataset can be used as a drop-in replacement for MNIST. The classes (label: description) are: 0: T-shirt/top, 1: Trouser, 2: Pullover, 3: Dress, 4: Coat, 5: Sandal, 6…

  • Reuters newswire classification dataset

    load_data function Loads the Reuters newswire classification dataset. This is a dataset of 11,228 newswires from Reuters, labeled over 46 topics. This was originally generated by parsing and preprocessing the classic Reuters-21578 dataset, but the preprocessing code is no longer packaged with Keras. See this GitHub discussion for more info. Each newswire is encoded as a list of…

  • IMDB movie review sentiment classification dataset

    load_data function Loads the IMDB dataset. This is a dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a list of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer “3” encodes the 3rd…
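
    The frequency-based indexing described for IMDB can be illustrated with a minimal sketch. This is not the preprocessing used to build the dataset: the toy corpus and the helper names `build_frequency_index` and `encode` are hypothetical, and ties are broken alphabetically purely to keep the result deterministic.

```python
from collections import Counter


def build_frequency_index(texts):
    """Map each word to its 1-based frequency rank (1 = most frequent)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


def encode(text, index):
    """Encode a text as a list of word indexes, skipping unknown words."""
    return [index[word] for word in text.lower().split() if word in index]


# Hypothetical toy corpus, for illustration only.
reviews = ["the film was great", "the film was dull", "the plot was thin"]
word_index = build_frequency_index(reviews)
encoded = [encode(review, word_index) for review in reviews]
```

    With this scheme, small integers correspond to common words, so a vocabulary can be trimmed to the most frequent words by discarding indexes above a threshold.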

  • CIFAR100 small images classification dataset

    load_data function Loads the CIFAR100 dataset. This is a dataset of 50,000 32×32 color training images and 10,000 test images, labeled over 100 fine-grained classes that are grouped into 20 coarse-grained classes. See more info at the CIFAR homepage. Arguments Returns x_train: uint8 NumPy array of color image data with shape (50000, 32, 32, 3), containing the training data. Pixel…

  • CIFAR10 small images classification dataset

    load_data function Loads the CIFAR10 dataset. This is a dataset of 50,000 32×32 color training images and 10,000 test images, labeled over 10 categories. See more info at the CIFAR homepage. The classes (label: description) are: 0: airplane, 1: automobile, 2: bird, 3: cat, 4: deer, 5: dog, 6: frog, 7: horse, 8: ship, 9: truck. Returns…
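
    The CIFAR10 label table in this entry can be kept as a small lookup for turning integer labels into readable names. This is a sketch: `CIFAR10_CLASSES` and `label_name` are hypothetical names chosen for illustration, not part of any library.

```python
# Class names in label order (0..9), copied from the CIFAR10 entry.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def label_name(label):
    """Return the class name for an integer label in the range 0..9."""
    index = int(label)
    if not 0 <= index < len(CIFAR10_CLASSES):
        raise ValueError(f"label out of range: {label!r}")
    return CIFAR10_CLASSES[index]
```

    For example, `label_name(3)` returns `"cat"`.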

  • MNIST digits classification dataset

    load_data function Loads the MNIST dataset. This is a dataset of 60,000 28×28 grayscale images of the 10 digits, along with a test set of 10,000 images. More info can be found at the MNIST homepage. Arguments Returns x_train: uint8 NumPy array of grayscale image data with shape (60000, 28, 28), containing the training data. Pixel values range from 0…
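
    Each entry above documents a `load_data` function. As a rough sketch of how such a call might be used, the snippet below assumes the result is a pair of `(inputs, labels)` tuples for the training and test splits, as `keras.datasets.mnist.load_data()` returns. The helper names `summarize_splits` and `load_mnist_summary` are hypothetical. Calling `load_mnist_summary` requires Keras to be installed and downloads the data on first use.

```python
def summarize_splits(splits):
    """Return the number of training and test samples in a load_data() result."""
    (x_train, y_train), (x_test, y_test) = splits
    # Inputs and labels of each split must line up one-to-one.
    if len(x_train) != len(y_train) or len(x_test) != len(y_test):
        raise ValueError("inputs and labels have different lengths")
    return {"train": len(x_train), "test": len(x_test)}


def load_mnist_summary():
    """Load MNIST through Keras and summarize its splits."""
    # Imported lazily so summarize_splits stays usable without Keras installed.
    from keras.datasets import mnist

    return summarize_splits(mnist.load_data())
```

    Keeping the summary logic separate from the download step makes it possible to check the helper on small in-memory data before fetching a real dataset.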