AWS Public Blockchain Data Dataset for Machine Learning

Install DagsHub:

pip install dagshub

To stream this data directly from DagsHub:

from dagshub.streaming import DagsHubFilesystem

fs = DagsHubFilesystem(".", repo_url="https://test.dagshub.com/DagsHub-Datasets/aws-public-blockchain-dataset")

print(fs.listdir("s3://aws-public-blockchain"))

Description

The AWS Public Blockchain Data dataset provides data from the Bitcoin and Ethereum blockchains. The blockchain data is transformed into multiple tables, stored as compressed Parquet files and partitioned by date, so that the most common analytics queries can read only the dates they need.
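Because the tables are partitioned by date, a query over a date range only needs to touch the matching date folders. The sketch below builds the folder prefixes for such a range. The layout it assumes (a `v1.0` version folder, then a chain folder such as `btc` or `eth`, a table folder such as `blocks`, and a `date=YYYY-MM-DD` partition folder) is an assumption not stated on this page, and the helper names are hypothetical; verify the layout against a bucket listing before relying on it.

```python
from datetime import date, timedelta

BUCKET = "s3://aws-public-blockchain"


def partition_prefix(chain, table, day, version="v1.0"):
    """Return the folder prefix for one table on one day.

    Assumed layout (not confirmed by this page):
    <bucket>/<version>/<chain>/<table>/date=YYYY-MM-DD/
    """
    return f"{BUCKET}/{version}/{chain}/{table}/date={day.isoformat()}/"


def partition_prefixes(chain, table, start, end):
    """Return one prefix per day from start to end, both inclusive."""
    n_days = (end - start).days
    return [
        partition_prefix(chain, table, start + timedelta(days=i))
        for i in range(n_days + 1)
    ]


if __name__ == "__main__":
    for prefix in partition_prefixes("btc", "blocks", date(2023, 1, 30), date(2023, 2, 2)):
        print(prefix)
```

Each prefix could then be passed to a listing call such as the `fs.listdir` shown earlier, so that only the folders for the requested days are read.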

Additional information

Update frequency

New data is delivered continuously to the current date folder as one Parquet file per block. Intra-day data is aggregated every day at 00:30 UTC.
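A reader of the bucket may want to know whether a given date folder still holds per-block files or has already been aggregated. The sketch below estimates this from the schedule described here. It assumes that the files for day D are aggregated at 00:30 UTC on day D+1, which is one reading of the schedule and is not confirmed by this page; the function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def aggregation_time(day):
    """Return the assumed moment at which a day's files are aggregated.

    Assumption: day D is aggregated at 00:30 UTC on day D+1.
    """
    return datetime.combine(day + timedelta(days=1), time(0, 30), tzinfo=timezone.utc)


def is_aggregated(day, now=None):
    """Return True if the assumed aggregation run for `day` has passed.

    `now` must be timezone-aware; it defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= aggregation_time(day)


if __name__ == "__main__":
    today = datetime.now(timezone.utc).date()
    print(today, "aggregated:", is_aggregated(today))
```

Under this assumption, the current date folder always reports as not yet aggregated, so code reading it should expect many small per-block files.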
