Creating Custom TensorFlow Datasets
tensorflow.org/datasets/add_dataset
AI-generated article summary
- Overview
- TFDS processes external datasets into a standard format for machine learning pipelines
- Datasets are implemented as subclasses of tfds.core.DatasetBuilder
- TFDS handles most of the preprocessing automatically; the builder only describes the data source, features, splits, and examples
- Dataset Creation Process
- Use the TFDS CLI (tfds new) to generate the dataset template files
- Specify data sources, features, and split configurations
- Download and extract source data using tfds.download.DownloadManager
- Generate examples with the _generate_examples method, which yields (key, example) pairs with unique keys
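The example-generation step can be sketched in plain Python. The block below is a minimal, standalone illustration of the (key, example) shape that a _generate_examples method yields; it does not import TFDS, and the CSV layout with id, text, and label columns is a hypothetical assumption made for the example. In a real builder, this logic would live in a method on the builder subclass, and the dictionary keys would have to match the declared features.

```python
import csv
from pathlib import Path
from typing import Any, Dict, Iterator, Tuple


def generate_examples(csv_path: Path) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (key, example) pairs read from a CSV file.

    Assumption: the file has 'id', 'text', and 'label' columns
    (hypothetical names chosen for this sketch).
    """
    with csv_path.open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # The key identifies the example and must be unique per split.
            key = row["id"]
            # The example is a dict mapping feature names to values.
            example = {"text": row["text"], "label": row["label"]}
            yield key, example
```

Using a stable identifier from the data as the key, rather than a running counter, keeps the keys meaningful if the source file is reordered.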
- Configuration and Versioning
- Datasets can have multiple variants through BuilderConfigs
- A dataset's version covers changes to both the external source data and the TFDS generation code
- Use tfds.core.lazy_imports for extra dependencies that are needed only during dataset generation
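The idea behind lazy imports can be shown with a generic deferred-import pattern. This is a sketch of the general technique using only the standard library, not TFDS's own implementation; the helper names lazy_import and parse_record are hypothetical, and json stands in for a heavier optional dependency.

```python
import importlib
from types import ModuleType


def lazy_import(module_name: str) -> ModuleType:
    """Import a module only when this function is called.

    Hypothetical helper for illustration: it wraps a failed import
    in a message that names the missing optional dependency.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"Optional dependency '{module_name}' is required to "
            "generate this dataset but is not installed."
        ) from err


def parse_record(raw: str) -> dict:
    """Parse one raw record, importing the parser at call time."""
    # 'json' stands in for an optional dependency; it is imported here,
    # during generation, rather than when this module is loaded.
    json = lazy_import("json")
    return json.loads(raw)
```

With this pattern, users who only load an already prepared dataset never trigger the import, so they do not need the extra package installed.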
- Testing and Deployment
- Use the tfds build command to download and generate the dataset
- Register checksums (tfds build --register_checksums) to guarantee determinism and help with documentation
- Test datasets using tfds.testing.DatasetBuilderTestCase
- Import the dataset module in the project's __init__ so that the dataset is registered automatically
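To make the checksum step concrete, the sketch below computes the kind of fingerprint a checksum record relies on: the size of a downloaded file and its SHA-256 digest. It is a standalone illustration using the standard library, under the assumption that a size plus a SHA-256 hash is what gets recorded per download; the function name file_size_and_sha256 is hypothetical and is not part of TFDS.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def file_size_and_sha256(path: Path, chunk_size: int = 1 << 20) -> Tuple[int, str]:
    """Return (size in bytes, hex SHA-256 digest) for a file.

    The file is read in chunks so large downloads do not need
    to fit in memory.
    """
    digest = hashlib.sha256()
    size = 0
    with path.open("rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()
```

Comparing a freshly downloaded file against a previously recorded size and digest is what makes a silent change in the source data detectable.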