This guide provides information about using TensorFlow and PyTorch with the No-Code Classification Toolkit.
The toolkit now supports both:
- TensorFlow (v2.18.0)
- PyTorch (v2.5.1)
**TensorFlow**

Pros:
- More models available in `tf.keras.applications`
- Better TPU support
- Mature ecosystem

Cons:
- Can be slower for some operations
- Larger memory footprint
**PyTorch**

Pros:
- More flexible and Pythonic
- Better for research and experimentation
- Efficient mixed-precision training with AMP
- Better debugging experience

Cons:
- Fewer pre-trained models in `torchvision`
- Model naming conventions differ
**TensorFlow backbones** (`tf.keras.applications`):
- MobileNetV2
- ResNet50V2, ResNet101V2, ResNet152V2
- ResNet50, ResNet101, ResNet152
- Xception
- InceptionV3, InceptionResNetV2
- VGG16, VGG19
- DenseNet121, DenseNet169, DenseNet201
- NASNetMobile, NASNetLarge
- MobileNet
**PyTorch backbones** (`torchvision.models`):
- resnet50, resnet101, resnet152
- vgg16, vgg19
- densenet121, densenet169, densenet201
- mobilenet_v2, mobilenet_v3_large, mobilenet_v3_small
- efficientnet_b0, efficientnet_b1, efficientnet_b2, efficientnet_b3, efficientnet_b4
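The two lists above use different naming conventions for the same architectures (Keras CamelCase vs. torchvision lowercase). A small illustrative mapping for the backbones that appear in both lists; `to_torchvision_name` is a hypothetical helper, not part of the toolkit:

```python
# Mapping between Keras-style and torchvision-style backbone names
# (hypothetical helper for illustration -- not a toolkit API).
KERAS_TO_TORCHVISION = {
    "ResNet50": "resnet50",
    "ResNet101": "resnet101",
    "ResNet152": "resnet152",
    "VGG16": "vgg16",
    "VGG19": "vgg19",
    "DenseNet121": "densenet121",
    "DenseNet169": "densenet169",
    "DenseNet201": "densenet201",
    "MobileNetV2": "mobilenet_v2",
}

def to_torchvision_name(keras_name: str) -> str:
    """Return the torchvision equivalent of a Keras backbone name, if listed."""
    try:
        return KERAS_TO_TORCHVISION[keras_name]
    except KeyError:
        raise ValueError(f"no torchvision equivalent listed for {keras_name!r}")

print(to_torchvision_name("MobileNetV2"))  # -> mobilenet_v2
```

Note that some backbones (Xception, the NASNet family, the V2 ResNets) have no direct torchvision counterpart, which is why the Cons list above mentions fewer pre-trained models in `torchvision`.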
**TensorFlow optimizers**:
- SGD
- RMSprop
- Adam
- Adadelta
- Adagrad
- Adamax
- Nadam
- FTRL
**PyTorch optimizers**:
- SGD
- Adam
- AdamW
- RMSprop
- Adadelta
- Adagrad
**TensorFlow precision**: select from:
- Full Precision (FP32) - Standard training
- Mixed Precision (GPU - FP16) - Faster training on GPUs with Tensor Cores
- Mixed Precision (TPU - BF16) - For Google TPU workloads
**PyTorch precision**: enable/disable using the checkbox:
- Unchecked: Full Precision (FP32)
- Checked: Automatic Mixed Precision (AMP) using `torch.amp`
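Mixed precision pairs FP16 compute with loss scaling because small gradient values underflow to zero in FP16. A NumPy sketch of the effect (the `2 ** 16` scale factor is illustrative, not the toolkit's actual setting):

```python
import numpy as np

grad = 1e-8  # a small gradient value, perfectly representable in FP32

# Cast straight to FP16: the value underflows to zero (FP16's smallest
# subnormal is ~6e-8), so the update would be silently lost.
print(np.float16(grad))  # 0.0

# Loss scaling: multiply before the cast, divide after returning to FP32.
scale = 2.0 ** 16                  # illustrative scale factor
scaled = np.float16(grad * scale)  # now representable in FP16
recovered = np.float32(scaled) / scale
print(recovered)                   # ~1e-8, the gradient survives
```

This is the mechanism behind both `mixed_float16` in TensorFlow and `torch.amp` in PyTorch; BF16 (the TPU option above) has the same exponent range as FP32 and does not need loss scaling.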
Three Docker images are available:

**TensorFlow only** (size: ~5-6 GB). Use when you only need TensorFlow.

```shell
docker build -f Dockerfile.tensorflow -t classifier:tensorflow .
docker run -it --gpus all --net host -v /path/to/data:/data classifier:tensorflow
```

**PyTorch only** (size: ~6-7 GB). Use when you only need PyTorch.

```shell
docker build -f Dockerfile.pytorch -t classifier:pytorch .
docker run -it --gpus all --net host -v /path/to/data:/data classifier:pytorch
```

**Both frameworks** (size: ~10-12 GB). Use when you want the flexibility to switch between frameworks.

```shell
docker build -f Dockerfile.both -t classifier:both .
docker run -it --gpus all --net host -v /path/to/data:/data classifier:both
```
- Dataset Organization: Keep training and validation sets separate
- Minimum Samples: Ensure at least 100 images per class (configurable)
- Image Formats: Use JPG, JPEG, PNG, or BMP
- Naming: Use descriptive folder names as they become class labels
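The layout rules above can be sanity-checked before training. A stdlib-only sketch, where the 100-image minimum and the extension list mirror the defaults described above; `validate_dataset` is a hypothetical helper, not a toolkit API:

```python
from pathlib import Path

ALLOWED = {".jpg", ".jpeg", ".png", ".bmp"}

def validate_dataset(root: str, min_samples: int = 100) -> dict:
    """Count usable images per class folder and warn on classes below the minimum."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # class labels come from folder names, so skip loose files
        n = sum(1 for f in class_dir.iterdir() if f.suffix.lower() in ALLOWED)
        if n < min_samples:
            print(f"warning: class {class_dir.name!r} has only {n} images")
        counts[class_dir.name] = n
    return counts
```

Run it once against the training folder and once against the validation folder, since the two sets are kept separate.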
**TensorFlow tips**:
- Set `TF_FORCE_GPU_ALLOW_GROWTH=true` for dynamic GPU memory allocation
- Use `mixed_float16` for modern NVIDIA GPUs (compute capability >= 7.0)
- Monitor TensorBoard at `http://localhost:6006`

**PyTorch tips**:
- Use `num_workers=4` in DataLoader for optimal performance
- Enable mixed precision (AMP) for faster training on modern GPUs
- PyTorch models use lowercase naming (e.g., `resnet50`, not `ResNet50`)
- Pin memory is enabled by default for faster GPU transfers
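If you set `TF_FORCE_GPU_ALLOW_GROWTH` from Python rather than in the shell, do it before TensorFlow initializes the GPU (in practice, before the import). A minimal sketch:

```python
import os

# Set the variable before importing TensorFlow, so the GPU allocator sees it.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import only after the variable is set
print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```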
- Start with lower learning rates (0.001 or 0.0001)
- Use early stopping to prevent overfitting (built-in)
- Monitor validation accuracy during training
- Best-model saving is enabled automatically
- Use data augmentation for better generalization
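The built-in early stopping follows the usual patience pattern: stop once the validation metric has gone `patience` epochs without improving. A framework-agnostic sketch (the patience and accuracy values are illustrative):

```python
class EarlyStopping:
    """Stop when the monitored metric hasn't improved for `patience` epochs."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = None
        self.bad_epochs = 0

    def step(self, val_accuracy: float) -> bool:
        """Record one epoch's validation accuracy; return True to stop training."""
        if self.best is None or val_accuracy > self.best + self.min_delta:
            self.best = val_accuracy  # improvement: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=2)
for epoch, acc in enumerate([0.70, 0.75, 0.74, 0.73, 0.76]):
    if stopper.step(acc):
        print(f"stopping at epoch {epoch}")
        break
```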
Both frameworks are competitive in performance:
| Feature | TensorFlow | PyTorch |
|---|---|---|
| Training Speed | Fast | Fast |
| Memory Usage | Higher | Lower |
| Ease of Use | Good | Excellent |
| Debugging | Good | Excellent |
| Production | Excellent | Good |
After training completes:

**TensorFlow outputs**:
- Keras Weights: `/app/model/weights/keras/{backbone}_{timestamp}.h5`
- SavedModel: `/app/model/weights/savedmodel/{backbone}_{timestamp}/`
- TensorBoard Logs: `/app/logs/tensorboard/{backbone}_{timestamp}/`

**PyTorch outputs**:
- Model Checkpoint: `/app/model/weights/pytorch/{backbone}_{timestamp}.pth`
- Best Model: `/app/model/weights/pytorch/{backbone}_{timestamp}_best.pth`
- TensorBoard Logs: `/app/logs/tensorboard/{backbone}_{timestamp}/`
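The `{backbone}_{timestamp}` pattern keeps every run's artifacts distinct. A sketch of how such a path can be constructed; the exact timestamp format shown is an assumption, not necessarily what the toolkit uses:

```python
from datetime import datetime

def checkpoint_path(backbone: str,
                    root: str = "/app/model/weights/pytorch") -> str:
    """Build a checkpoint path following the {backbone}_{timestamp} pattern."""
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")  # assumed format
    return f"{root}/{backbone}_{timestamp}.pth"

print(checkpoint_path("resnet50"))
```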
**Out of memory**:
- Reduce the batch size
- Reduce the input image size
- Use mixed precision training

**Slow training**:
- Increase the batch size if you have memory to spare
- Enable mixed precision
- Reduce the number of workers if CPU-bound

**Low accuracy**:
- Increase the dataset size
- Use data augmentation
- Try different learning rates
- Use a different backbone
- Train for more epochs
Models trained in one framework cannot be loaded directly in the other. However, you can:
- Export predictions and compare
- Train similar architectures in both frameworks
- Use ONNX for model conversion (advanced)