torchTextClassifiers: A unified framework for text classification in PyTorch

9 December 2025

👨‍🔬 Who are we?

Two data scientists from the innovation team at Insee:

  • Cédric Couralet
    • Previously software developer, architect, devops/security engineer… at Insee
    • Currently helping real data scientists bridge the gap between experimentation and production (How to talk to IT)
    • Open Source advocate

  • Meilame Tayebjee
    • Graduated 2024
    • Currently pursuing a part-time PhD in AI applied to health econometrics
    • At Insee:
      • working on deep learning for text classification, computer vision with remote sensing data, MLOps practices
      • exploring explainability, calibration, LLMs, RAGs & agentic workflows

💡 What will we talk about?

  • MLOps pipeline for text classification at Insee
  • The constraint that pushed us to open-source a PyTorch package: why and how

1️⃣ Context

Use case (1/2)

  • NACE automatic coding (code APE) for the national business registry (Sirene)
    • Given an activity description and some additional info, assign one of the ~750 NACE labels
    • Sparse-label extreme multi-class text classification task with categorical variables
  • If the model is not confident, human annotators enter the loop (reprise manuelle, i.e. manual review)
    • The reliability of the automatic coding model is therefore critical
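The confidence-based hand-off to annotators can be sketched as follows. This is only an illustration: the threshold value, the `CodingDecision` type and the `route_prediction` function are hypothetical names, and the talk does not state the actual confidence criterion used in production.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the real threshold is a business decision not given in the talk.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class CodingDecision:
    label: str
    confidence: float
    needs_manual_review: bool


def route_prediction(
    probabilities: dict[str, float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> CodingDecision:
    """Accept the top label automatically only if its probability clears the threshold."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    # Pick the label with the highest predicted probability.
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    return CodingDecision(label, confidence, needs_manual_review=confidence < threshold)


# Illustrative probabilities over three labels (a real model outputs ~750).
confident = route_prediction({"62.01Z": 0.93, "62.02A": 0.05, "63.11Z": 0.02})
uncertain = route_prediction({"62.01Z": 0.45, "62.02A": 0.40, "63.11Z": 0.15})
```

Under these assumptions, `confident` is coded automatically while `uncertain` is sent to manual review. Using the top probability as the confidence score only makes sense if the model is reasonably well calibrated.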

Use case (2/2)

  • A lot of training data, but not necessarily reliable
    • Labels come from earlier classification by a deterministic algorithm (Sicore, 1996-2021) and by fastText (2021-2025)
  • Training on GPU, but non-batched (online) inference on CPU (secured offline environment)
    • Inference latency needs special care: responses must come back in under 200 ms
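One way to check such a latency budget is to time single-item calls, which mimics online CPU inference. The sketch below is not the team's benchmark: `measure_latency_ms` and the `predict_one` callable are hypothetical stand-ins for whatever function serves one request. Only the 200 ms budget comes from the slide.

```python
import math
import statistics
import time
from typing import Callable

LATENCY_BUDGET_MS = 200.0  # budget quoted on the slide


def measure_latency_ms(
    predict_one: Callable[[str], object],
    samples: list[str],
    warmup: int = 3,
) -> dict[str, float]:
    """Time one call per sample, i.e. non-batched inference."""
    if not samples:
        raise ValueError("samples must not be empty")
    # Warm-up calls are not timed (lazy initialisation, caches...).
    for text in samples[:warmup]:
        predict_one(text)
    timings = []
    for text in samples:
        start = time.perf_counter()
        predict_one(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(timings)) - 1)
    return {
        "median_ms": statistics.median(timings),
        "p95_ms": timings[p95_index],
        "max_ms": timings[-1],
    }


# Dummy predictor used only to make the sketch runnable.
stats = measure_latency_ms(lambda text: text.lower(), ["boulangerie artisanale"] * 20)
within_budget = stats["p95_ms"] < LATENCY_BUDGET_MS
```

Comparing a high percentile to the budget, rather than the mean, is a deliberate choice here: occasional slow requests are what end users notice.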

2️⃣ MLOps pipeline

The ideal MLOps pipeline…

… vs ours

Extensive use of MLflow

  • MLflow used as a:
    1. Training monitor
    2. Model store and “versioner”
    3. Model wrapper (using the pyfunc object from MLflow)
    • Models are packaged with all the metadata necessary to run inference

The wrapper

import mlflow.pyfunc
import torch


class MLFlowPyTorchWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self):
        ...

    def predict(self, model_input: list[SingleForm], params=None) -> list[PredictionResponse]:
        # Preprocess the raw forms into a tabular query
        query = self.preprocess_inputs(inputs=model_input)

        # Split the query into text and categorical features
        text = query[self.text_feature].values
        categorical_variables = query[self.categorical_features].values

        ...

        # Forward pass without gradient tracking, batch by batch
        all_scores = []
        with torch.no_grad():
            for batch in dataloader:
                all_scores.append(self.module(batch))
        all_scores = torch.cat(all_scores)
        probs = torch.nn.functional.softmax(all_scores, dim=1)

        ...

        # Post-process the predictions into response objects
        responses = []
        for i in range(len(predictions[0])):
            response = process_response(predictions, i, nb_echos_max, prob_min, self.libs)
            responses.append(response)

        return responses
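The slide does not show what `process_response` does. As a rough illustration of this kind of post-processing, the sketch below turns raw scores into probabilities and keeps a few candidate labels. It assumes that `nb_echos_max` and `prob_min` mean "maximum number of returned candidates" and "minimum probability", which is a guess from the names. `softmax` and `top_candidates` are hypothetical helpers, not the project's code.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of raw scores."""
    shift = max(scores)
    exps = [math.exp(score - shift) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def top_candidates(
    scores: list[float],
    labels: list[str],
    max_candidates: int,
    min_probability: float,
) -> list[tuple[str, float]]:
    """Return at most max_candidates (label, probability) pairs, most likely first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    probabilities = softmax(scores)
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    # Keep only the best candidates that clear the probability floor.
    return [(label, p) for label, p in ranked[:max_candidates] if p >= min_probability]


# Three illustrative labels; a real model scores ~750 of them.
candidates = top_candidates([2.0, 1.0, 0.1], ["A", "B", "C"], max_candidates=2, min_probability=0.2)
```

With these inputs the function keeps labels "A" and "B", since "C" falls outside the top two and below the floor.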

API serving

  • Text classification model served through a containerized REST API:
    • Simplicity for end users
    • Standard query format
    • Scalable
    • Modular and portable
  • Simple design thanks to the MLflow wrapper
  • Continuous deployment with Argo CD

@router.post("/", response_model=List[PredictionResponse])
async def predict(
    credentials: Annotated[HTTPBasicCredentials, Depends(get_credentials)],
    request: Request,
    forms: BatchForms,
    ...
    num_workers: int = 0,
    batch_size: int = 1,
):
    """
    Endpoint for predicting batches of data.

    Args:
        credentials (HTTPBasicCredentials): The credentials for authentication.
        request (Request): The incoming request, used to access the loaded model.
        forms (BatchForms): The input data, as a BatchForms object.
        num_workers (int, optional): Number of worker processes used by the DataLoader. Defaults to 0 (data is loaded in the main process).
        batch_size (int, optional): Size of a batch for batch prediction. Defaults to 1.

    For single predictions, we recommend keeping the defaults (num_workers=0, batch_size=1) for better performance.
    For batched predictions, consider increasing these two parameters (num_workers can range from 4 to 12, batch_size can be increased up to 256) to optimize performance.

    Returns:
        list: The list of predicted responses.
    """
    input_data = forms.forms

    ...

    output = request.app.state.model.predict(input_data, params=params_dict)
    return [out.model_dump() for out in output]

3️⃣ torchTextClassifiers, an open-source package to distribute PyTorch models

Beyond fastText?

  • fastText: a powerful and efficient model that has been used in production since 2021…
  • … but the library's repo was archived on March 19th, 2024

This lack of maintenance is highly problematic in the medium term:

  • Bugs may appear that can no longer be fixed upstream
  • Dependency versions will increasingly conflict
  • Modernization is hindered

Why PyTorch? Some strategic reflections…

💡 Idea: Develop our own PyTorch-based model to:

  • adapt and customize the architecture to our specific needs (text classification with categorical variables)
  • limit dependencies on external libraries and bring maintenance in-house for more long-term robustness
  • tap into the vibrant deep learning / NLP community to develop additional features (explainability with Captum, calibration with torch-uncertainty…)
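To make "text classification with categorical variables" concrete, here is a minimal PyTorch sketch of a fastText-like model that averages token embeddings and adds one embedding per categorical variable. It is not the torchTextClassifiers architecture: the class name, the dimensions and the choice to sum the embeddings (rather than concatenate them, for instance) are assumptions made for illustration.

```python
import torch
from torch import nn


class TextWithCategoricalClassifier(nn.Module):
    """Hypothetical sketch: bag-of-tokens text embedding plus categorical embeddings."""

    def __init__(self, vocab_size, categorical_cardinalities, num_classes, embedding_dim=64):
        super().__init__()
        # EmbeddingBag averages the token embeddings of each document; index 0 is padding.
        self.text_embedding = nn.EmbeddingBag(vocab_size, embedding_dim, mode="mean", padding_idx=0)
        # One embedding table per categorical variable.
        self.categorical_embeddings = nn.ModuleList(
            [nn.Embedding(cardinality, embedding_dim) for cardinality in categorical_cardinalities]
        )
        self.classifier = nn.Linear(embedding_dim, num_classes)

    def forward(self, token_ids, categorical_ids):
        # token_ids: (batch, seq_len) int64; categorical_ids: (batch, n_categorical) int64
        features = self.text_embedding(token_ids)
        for column, embedding in enumerate(self.categorical_embeddings):
            features = features + embedding(categorical_ids[:, column])
        return self.classifier(features)  # raw logits, one per class


torch.manual_seed(0)
model = TextWithCategoricalClassifier(vocab_size=100, categorical_cardinalities=[5, 3], num_classes=750)
token_ids = torch.tensor([[4, 8, 15, 0], [16, 23, 42, 7]])  # 0 pads the first document
categorical_ids = torch.tensor([[1, 2], [4, 0]])
with torch.no_grad():
    logits = model(token_ids, categorical_ids)
probabilities = torch.softmax(logits, dim=1)
```

The 750 output classes mirror the approximate number of NACE labels mentioned earlier. Everything else (vocabulary size, cardinalities, embedding size) is arbitrary.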

Why package the architecture?

  • Conceptually, the model architecture has a life of its own
    • It can be reused for many other use cases
    • It has its own development cycle, versioning, etc.
    • This justifies giving it its own repo, distinct from the training and API repos
  • From an MLOps point of view:
    • The model travels between different teams and repos (training, inference, production): we need a single source of truth for easy deployment and versioning!
      • PyPI plays the role of the shared remote
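One practical consequence of a single source of truth is that each repo can check that the package version it has installed is the one the model was trained with. The sketch below shows such a check using the standard library only. `find_version_mismatches` is a hypothetical helper, and the version numbers are made up for the example.

```python
from importlib import metadata
from typing import Callable, Optional


def find_version_mismatches(
    pinned: dict[str, str],
    get_installed_version: Callable[[str], str] = metadata.version,
) -> dict[str, tuple[str, Optional[str]]]:
    """Map each mismatching package to (expected version, installed version or None)."""
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = get_installed_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


def fake_installed_version(name: str) -> str:
    """Stand-in for the real environment so the sketch runs anywhere (versions are invented)."""
    if name == "torchTextClassifiers":
        return "1.0.0"
    raise metadata.PackageNotFoundError(name)


ok = find_version_mismatches({"torchTextClassifiers": "1.0.0"}, fake_installed_version)
drift = find_version_mismatches({"torchTextClassifiers": "1.1.0"}, fake_installed_version)
```

In a real deployment the default `metadata.version` lookup would be used, and the pinned versions would come from the metadata stored alongside the model.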

Our solution: torchTextClassifiers

  • The package:
    • Breaks a text classification model down into distinct components that can be manipulated flexibly
    • Distributes SOTA architectures, including self-attention layers (you can build your own small BERT!)
    • Makes it easy to instantiate and train those components, while offering additional features such as explainability
      • You can use any tokenizer from Hugging Face or train your own
  • Targets:
    • Anyone who wants to train their own (possibly small) models and customize their architectures, and who cannot deploy large models from Hugging Face

Positioning

Production POV

The different components

Demo

Link to the demo

4️⃣ Perspectives

MLOps

  • Bridging the gap between innovation and production:
    • full automation from data extraction through model training and qualification to model deployment
    • observability of the model in production: logs, continuous annotation…

Roadmap for torchTextClassifiers

  • Additional features:
    • include more architectures: gated attention, text-label cross-attention
    • controlling uncertainty: calibration & conformal prediction
    • quantization
    • push to / pull from Hugging Face (HF)
  • Always fully open-sourced!

Thank you for your attention.

Find here all of our repos on the NACE coding project and our packages: