BigQuery is a fully managed, serverless data warehouse offered by Google Cloud Platform (GCP) for storing, querying, and analyzing large datasets with SQL. Through the Google Cloud Public Dataset Program, it also hosts a large and changing collection of public datasets that anyone with a Google Cloud project can query for research, analysis, and machine learning.
The availability of public datasets on BigQuery is constantly evolving as new datasets are added and existing ones are updated or removed. Google actively collaborates with various organizations, academic institutions, and government agencies to make a diverse range of datasets available to the public. These datasets cover a wide spectrum of domains, including social sciences, biology, climate, finance, transportation, and more.
To access the public datasets, users can open BigQuery in the Google Cloud console and browse the public dataset listings from there. Most of these datasets are hosted in a shared project named bigquery-public-data, so their tables are referenced as bigquery-public-data.<dataset>.<table>. Alternatively, users can use the bq command-line tool or the Google Cloud client libraries to access and query the datasets programmatically.
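Because the catalog changes, one way to get a current figure is to count the datasets in the shared project yourself. The following is a minimal sketch, assuming the google-cloud-bigquery Python client library is installed and Application Default Credentials are configured. The helper name count_datasets is hypothetical, and the result only covers datasets hosted in the bigquery-public-data project, not public datasets that other organizations host in their own projects.

```python
def count_datasets(client, project="bigquery-public-data"):
    """Count the datasets visible in a project.

    `client` is expected to be a google.cloud.bigquery.Client (or any object
    with a compatible list_datasets method). The default project name is the
    shared project that hosts most BigQuery public datasets.
    """
    # list_datasets returns an iterator that pages through results lazily,
    # so we count items instead of calling len().
    return sum(1 for _ in client.list_datasets(project=project))


def main():
    # Imported here so the helper above can be reused without the library.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project and credentials
    print(count_datasets(client))
```

Calling main() prints the count at the moment it runs. The number will differ over time, which is why any figure quoted in an answer like this one is only a snapshot.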
The number of public datasets available on BigQuery is not fixed, because datasets are added, updated, and retired over time. Any specific count is therefore only a snapshot, and the current listing in the Google Cloud console is the authoritative place to check it. Some notable examples include:
1. The Open Images Dataset: This dataset contains millions of images annotated with labels from a diverse range of categories. It is a valuable resource for training and evaluating computer vision models.
2. The Global Surface Water Explorer: This dataset provides information on the extent and dynamics of global surface water from 1984 to the present. It is useful for studying water availability, changes in water bodies, and monitoring environmental conditions.
3. The New York City Taxi and Limousine Commission (TLC) Trip Record Data: This dataset contains detailed information about taxi and for-hire vehicle trips in New York City. It can be used for analyzing transportation patterns, optimizing routes, and studying urban mobility.
4. The Bureau of Economic Analysis (BEA) Economic Data: This dataset provides access to a wide range of economic data, including GDP, personal income, trade statistics, and more. It is valuable for conducting economic research and analysis.
5. The NOAA Global Historical Climatology Network (GHCN): This dataset contains historical weather observations from thousands of weather stations worldwide. It is useful for climate analysis, weather modeling, and studying long-term climate trends.
These examples represent just a fraction of the public datasets available on BigQuery. The platform offers a vast collection of datasets that cater to diverse research and analysis needs. Users can leverage these datasets to gain insights, build models, and advance their machine learning projects.
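As an illustration of querying one of these datasets, the sketch below counts the weather stations in the NOAA GHCN data. The table path bigquery-public-data.ghcn_d.ghcnd_stations is an assumption about how that dataset is laid out and should be verified in the console before use. The helper names are hypothetical, and the client is again assumed to be a google.cloud.bigquery.Client.

```python
def build_station_count_query(table="bigquery-public-data.ghcn_d.ghcnd_stations"):
    """Build a SQL statement that counts rows in a stations table.

    The default table path is an assumption; check the dataset in the
    BigQuery console to confirm the actual table name.
    """
    # Backticks are required around fully qualified names that contain hyphens.
    return f"SELECT COUNT(*) AS station_count FROM `{table}`"


def run_scalar_query(client, sql):
    """Run a query and return the first column of the first result row."""
    rows = client.query(sql).result()  # blocks until the query job finishes
    first_row = next(iter(rows))
    return first_row[0]


def main():
    from google.cloud import bigquery

    client = bigquery.Client()
    print(run_scalar_query(client, build_station_count_query()))
```

Queries against public datasets are billed to the project that runs them, so it is worth checking the estimated bytes processed in the console before running larger scans.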
In short, there is no single fixed answer to how many public datasets BigQuery offers. The catalog is large, spans many domains, and changes as datasets are added or removed, so the current listing should be consulted for an up-to-date count.