The Multilingual Amazon Reviews Corpus Dataset for Machine Learning

Install DagsHub:

pip install dagshub

To stream this data directly from DagsHub:

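from dagshub.streaming import DagsHubFilesystem

# Mount the dataset repository as a virtual filesystem rooted at the current directory
fs = DagsHubFilesystem(".", repo_url="https://test.dagshub.com/DagsHub-Datasets/amazon-reviews-ml-dataset")

# List the contents of the dataset bucket
fs.listdir("s3://amazon-reviews-ml")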

Description

We present a collection of Amazon reviews designed to support research in multilingual text classification. The dataset contains reviews in English, Japanese, German, French, Chinese, and Spanish, collected between November 1, 2015 and November 1, 2019. Each record contains the review text, the review title, the star rating, an anonymized reviewer ID, an anonymized product ID, and the coarse-grained product category (e.g., ‘books’ or ‘appliances’).
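
As a rough illustration of the record layout described above, the following sketch parses a single review into a typed object. It assumes records are stored as one JSON object per line and that the keys are named review_body, review_title, stars, reviewer_id, product_id, and product_category; both the storage format and the key names are assumptions for illustration, and the sample record is invented, so check them against the actual files before relying on this.

```python
import json
from dataclasses import dataclass


@dataclass
class Review:
    """One review record. Field names are assumed, not confirmed by the dataset page."""
    review_body: str       # the review text
    review_title: str      # the review title
    stars: int             # the star rating
    reviewer_id: str       # anonymized reviewer ID
    product_id: str        # anonymized product ID
    product_category: str  # coarse-grained category, e.g. "books"


def parse_review(line: str) -> Review:
    """Parse one JSON-encoded line into a Review, ignoring any extra keys."""
    raw = json.loads(line)
    return Review(
        review_body=raw["review_body"],
        review_title=raw["review_title"],
        stars=int(raw["stars"]),
        reviewer_id=raw["reviewer_id"],
        product_id=raw["product_id"],
        product_category=raw["product_category"],
    )


# Invented sample record, used only to exercise the parser.
sample = json.dumps({
    "review_body": "Arrived quickly and works well.",
    "review_title": "Good value",
    "stars": 4,
    "reviewer_id": "reviewer_0000001",
    "product_id": "product_0000001",
    "product_category": "appliances",
})

review = parse_review(sample)
print(review.stars, review.product_category)
```

For a text classification task, the star rating or the product category would typically serve as the label and the review text (optionally with the title) as the input.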

Additional information

Update frequency

None specified.

Managed by

Amazon

Related datasets

Common Screens

Helpful Sentences from Reviews

Humor Detection from Product Question Answering Systems

Japanese Tokenizer Dictionaries
