Tutorials#
Step-by-step guides to learn torchTextClassifiers through practical examples.
- Binary Classification Tutorial
- Multiclass Classification Tutorial
- Mixed Features Classification
  - Learning Objectives
  - Prerequisites
  - What Are Categorical Features?
  - When to Use Categorical Features
  - Complete Example
  - Step-by-Step Walkthrough
  - Comparison: Text-Only vs. Mixed Features
  - Combination Strategies
  - Real-World Example: AG News with Source
  - Common Issues
  - Customization
  - Best Practices
  - Next Steps
  - Summary
- Model Explainability
  - Learning Objectives
  - Prerequisites
  - What Is Explainability?
  - Complete Example
  - Step-by-Step Walkthrough
  - Complete Visualization Example
  - Interactive Explainability
  - Understanding Attribution Scores
  - Debugging with Explainability
  - Advanced: Custom Attribution Methods
  - Common Issues
  - Best Practices
  - Real-World Use Cases
  - Customization
  - Summary
- Multilabel Classification
  - Learning Objectives
  - Prerequisites
  - Multilabel vs. Multiclass
  - Two Approaches to Multilabel
  - Complete Example: Ragged Lists
  - Complete Example: One-Hot Encoding
  - Step-by-Step Walkthrough
  - Evaluation Metrics
  - Real-World Example: Document Tagging
  - Common Issues
  - Customization
  - Advanced: Probabilistic Labels
  - Best Practices
  - Summary
  - Next Steps
Overview#
These tutorials guide you through common text classification tasks, from basic binary classification to advanced scenarios such as multilabel classification and model explainability.
Available Tutorials#
Getting Started#
Binary Classification Tutorial
Recommended first tutorial.
Build a sentiment classifier for product reviews and learn the complete workflow, from data preparation to evaluation; a rough code sketch follows the card.
What you’ll learn:
Creating and training tokenizers
Configuring models
Training with validation data
Making predictions
Evaluating performance
Difficulty: Beginner | Time: 15 minutes
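As a preview of that workflow, here is a rough sketch. Only ModelConfig and classifier.train(...) appear elsewhere on this page; the other names (TextClassifier, predict, training_config) are placeholders, so follow the tutorial for the exact imports and signatures.
# Rough sketch of the binary workflow -- names other than ModelConfig and
# classifier.train(...) are placeholders; the tutorial gives the real API.
texts = ["great product, works perfectly", "stopped working after two days"]
labels = [1, 0]                      # 1 = positive, 0 = negative
model_config = ModelConfig(
    embedding_dim=64,                # size of the learned embeddings
    num_classes=2,                   # binary classification
)
classifier = TextClassifier(model_config)              # placeholder class name
classifier.train(texts, labels, training_config)       # training_config as in the tutorial
predictions = classifier.predict(["would buy again"])  # placeholder method name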
Intermediate Tutorials#
Multiclass Classification Tutorial
Classify text into three or more categories, with proper handling of class imbalance and suitable evaluation metrics; see the sketch after this card.
What you’ll learn:
Multiclass model configuration
Class distribution analysis
Reproducibility with seeds
Confusion matrices
Advanced evaluation metrics
Difficulty: Intermediate | Time: 20 minutes
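Two of the items above, reproducibility with seeds and confusion matrices, can be previewed with plain PyTorch and scikit-learn. This sketch assumes you already have gold labels and predictions from a trained classifier.
# Fix seeds before training so weight init and shuffling are repeatable,
# then inspect per-class behaviour with a confusion matrix.
import torch
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

torch.manual_seed(42)
np.random.seed(42)

y_true = [0, 2, 1, 2, 0, 1]          # gold labels for 3 classes: 0, 1, 2
y_pred = [0, 2, 1, 1, 0, 1]          # e.g. the classifier's predictions

print(confusion_matrix(y_true, y_pred))       # rows = true class, columns = predicted class
print(classification_report(y_true, y_pred))  # per-class precision, recall, F1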
Mixed Features Classification
Combine text with categorical variables for improved classification performance; the idea is illustrated after this card.
What you’ll learn:
Adding categorical features alongside text
Configuring categorical embeddings
Comparing performance improvements
Feature combination strategies
Difficulty: Intermediate | Time: 25 minutes
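The general idea behind mixed features can be illustrated with plain PyTorch: embed the categorical variable and concatenate it with the text representation before the classification head. This is only a conceptual sketch, not the library's internal implementation.
# Conceptual sketch: concatenate a categorical embedding with a text vector.
import torch
import torch.nn as nn

text_repr = torch.randn(8, 64)                 # batch of 8 text representations (dim 64)
source_ids = torch.randint(0, 4, (8,))         # a categorical feature with 4 levels

cat_embedding = nn.Embedding(num_embeddings=4, embedding_dim=8)
combined = torch.cat([text_repr, cat_embedding(source_ids)], dim=1)  # shape (8, 72)

head = nn.Linear(72, 2)                        # classification head over the combined features
logits = head(combined)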
Advanced Tutorials#
Model Explainability
Understand which words and characters drive your model's predictions; a generic attribution sketch follows the card.
What you’ll learn:
Generating attribution scores with Captum
Word-level and character-level visualizations
Identifying influential tokens
Interactive explainability mode
Difficulty: Advanced | Time: 30 minutes
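To preview what Captum-based attribution looks like, here is a generic sketch on a toy embedding model; the tutorial shows how to wire integrated gradients up to the actual classifier.
# Generic Captum sketch: per-token attribution scores via integrated gradients.
import torch
import torch.nn as nn
from captum.attr import LayerIntegratedGradients

class ToyClassifier(nn.Module):
    def __init__(self, vocab_size=100, dim=16, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, dim)
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, token_ids):
        return self.fc(self.embedding(token_ids).mean(dim=1))  # mean-pool token embeddings

model = ToyClassifier()
token_ids = torch.randint(1, 100, (1, 6))   # one sequence of 6 token ids
baseline = torch.zeros_like(token_ids)      # id 0 acts as the "no information" baseline

lig = LayerIntegratedGradients(model, model.embedding)
attributions = lig.attribute(token_ids, baselines=baseline, target=1)
token_scores = attributions.sum(dim=-1)     # one attribution score per token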
Multilabel Classification
Assign multiple labels to each text sample for complex classification scenarios; the encoding is sketched after this card.
What you’ll learn:
Ragged lists vs. one-hot encoding
Configuring BCEWithLogitsLoss
Multilabel evaluation metrics
Handling variable labels per sample
Difficulty: Advanced | Time: 30 minutes
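The core encoding and loss can be previewed with plain PyTorch: convert ragged label lists into multi-hot targets and score them with BCEWithLogitsLoss, which treats every label as an independent binary decision.
# Ragged lists -> multi-hot targets -> BCEWithLogitsLoss.
import torch
import torch.nn as nn

ragged = [[0, 2], [1]]      # sample 0 has labels {0, 2}; sample 1 has label {1}
num_labels = 3

targets = torch.zeros(len(ragged), num_labels)   # float targets, shape (batch, num_labels)
for i, labels in enumerate(ragged):
    targets[i, labels] = 1.0

logits = torch.randn(len(ragged), num_labels)    # raw model outputs (no sigmoid applied)
loss = nn.BCEWithLogitsLoss()(logits, targets)

# At prediction time, threshold sigmoid probabilities per label.
predicted = (torch.sigmoid(logits) > 0.5).int()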
Learning Path#
We recommend following this learning path:
Start with: Quick Start - Get familiar with the basics
Then: Binary Classification Tutorial - Understand the complete workflow
Next: Multiclass Classification Tutorial - Handle multiple classes
Branch out: Mixed Features Classification for categorical features OR Multilabel Classification for multiple labels
Master: Model Explainability - Understand your model’s predictions
Tutorial Format#
Each tutorial follows a consistent structure:
- Learning Objectives: What you'll be able to do after completing the tutorial
- Prerequisites: What you need to know before starting
- Complete Code: A full working example you can copy and run
- Step-by-Step Walkthrough: A detailed explanation of each step
- Customization: How to adapt the code to your needs
- Common Issues: Troubleshooting tips and solutions
- Next Steps: Where to go after finishing
Tips for Learning#
Run the Code#
Don’t just read - run the examples! Modify them to see what happens:
# Try different values
model_config = ModelConfig(
    embedding_dim=128,  # Was 64 - what changes?
    num_classes=2
)
Start Simple#
Begin with the Quick Start, then move to Binary Classification. Don’t skip ahead!
Use Your Own Data#
Once you understand the examples, try them with your own text data:
# Your data
my_texts = ["your", "text", "samples"]
my_labels = [0, 1, 0]
# Same workflow
classifier.train(my_texts, my_labels, training_config)
Experiment#
Try different tokenizers (WordPiece vs NGram)
Adjust hyperparameters (learning rate, embedding dim)
Compare model sizes
Test different batch sizes
Read the Errors#
Error messages are helpful! They often tell you exactly what’s wrong:
# Error: num_classes=2 but got label 3
# Solution: Check your labels - should be 0, 1 (not 1, 2, 3)
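For example, if your labels are 1-based (or arbitrary values), a quick remap to contiguous 0-based ids fixes this kind of error:
# Remap arbitrary integer labels to contiguous 0-based class ids.
raw_labels = [1, 2, 3, 1, 2]
label_to_id = {label: i for i, label in enumerate(sorted(set(raw_labels)))}
my_labels = [label_to_id[label] for label in raw_labels]   # -> [0, 1, 2, 0, 1]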
Getting Help#
Stuck on a tutorial? Here’s how to get help:
Check Common Issues: Each tutorial has a troubleshooting section
Read the API docs: API Reference for detailed parameter descriptions
Review architecture: Architecture Overview for how components work
Ask questions: GitHub Discussions
Report bugs: GitHub Issues
Additional Resources#
Example Scripts#
All tutorials are based on runnable example scripts in the repository.
Jupyter Notebooks#
Interactive notebooks are also available for hands-on learning.
Contributing#
Want to contribute a tutorial? We welcome:
New use cases
Alternative approaches
Real-world examples
Performance tips
See our contributing guidelines to get started!
What’s Next?#
Ready to start? Choose your path:
New to text classification? Start with Quick Start
Want to dive deeper? Begin with Binary Classification Tutorial
Ready for multiclass? Jump to Multiclass Classification Tutorial
Need API details? Check API Reference