torchTextClassifiers: A unified framework for text classification in PyTorch

Cédric Couralet and Meilame Tayebjee

9 December 2025

Link to the presentation:

In the chat!

👨‍🔬 Who are we?

Two data scientists from the innovation team at Insee:

Cédric Couralet
- Previously software developer, architect, devops/security engineer… at Insee
- Currently helping real data scientists bridge the gap between experimentation and production (How to talk to IT)
- Open Source advocate

Meilame Tayebjee
- Graduated 2024
- Currently pursuing a part-time PhD in AI applied to health econometrics
- At Insee:
  - working on deep learning for text classification, computer vision with remote sensing data, MLOps practices
  - exploring explainability, calibration, LLMs, RAGs & agentic workflows

💡 What will we talk about?

MLOps pipeline for text classification at Insee
The constraint that pushed us to open-source a PyTorch package: why and how

1️⃣ Context

Use case (1/2)

NACE automatic coding (code APE) for the national business registry (Sirene)
- Given an activity description and some additional info, assign one of the ~750 NACE labels
- Sparse-label extreme multi-class text classification task with categorical variables
If the model is not confident, human annotators step in (manual review, "reprise manuelle")
- The reliability of the automatic coding model is therefore critical
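The human-in-the-loop rule above can be sketched as follows. This is only an illustration, not the production logic: the confidence measure (gap between the two highest probabilities), the 0.5 threshold, and the names `confidence_score` and `route` are all assumptions made for the example.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_score(probs: list[float]) -> float:
    """Assumed confidence measure: gap between the two highest probabilities."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2


def route(logits: list[float], threshold: float = 0.5) -> str:
    """Return "automatic" when the model is confident, else "manual_review"."""
    probs = softmax(logits)
    if confidence_score(probs) >= threshold:
        return "automatic"
    return "manual_review"
```

With roughly 750 labels, the threshold trades annotator workload against the error rate of automatically coded cases, so in practice it would be tuned on held-out data.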

Use case (2/2)

A lot of training data, but not all of it reliable
- Labels come from earlier classification by a deterministic algorithm (Sicore, 1996-2021) and by fastText (2021-2025)
Training on GPU, but non-batched (online) inference on CPU (secured offline environment)
- Inference must stay responsive (<200 ms)
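One way to check a latency budget like this is to time repeated single-item calls and look at a high percentile rather than the mean. The sketch below is an assumption about how such a check could look: `measure_latency_ms` and `within_budget` are hypothetical helpers, and `predict` stands in for any callable that classifies one description.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(
    predict: Callable[[str], object], text: str, n_runs: int = 50
) -> dict[str, float]:
    """Time repeated single-item calls and summarize them in milliseconds."""
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        predict(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    # Nearest-rank style 95th percentile, clamped to the last element
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return {
        "median_ms": statistics.median(timings),
        "p95_ms": timings[p95_index],
        "max_ms": timings[-1],
    }


def within_budget(stats: dict[str, float], budget_ms: float = 200.0) -> bool:
    """Check the 95th-percentile latency against the budget."""
    return stats["p95_ms"] < budget_ms
```

Such a check is only meaningful when run on CPU hardware comparable to the production environment, since that is where inference happens.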

2️⃣ MLOps pipeline

The ideal MLOps pipeline…

… vs ours

Extensive use of MLflow

MLflow is used as a:
1. Training monitor
2. Model store and “versioner”
3. Model wrapper (using the pyfunc object from MLflow)
- Models are packaged with all the metadata necessary to run inference

The wrapper

import mlflow
import torch


class MLFlowPyTorchWrapper(mlflow.pyfunc.PythonModel):
  def __init__(self):
    ...

  def predict(self, model_input: list[SingleForm], params=None) -> list[PredictionResponse]:
    # Preprocess the raw forms
    query = self.preprocess_inputs(
            inputs=model_input,
        )

    # Separate the text feature from the categorical variables
    text = query[self.text_feature].values
    categorical_variables = query[self.categorical_features].values

    ...

    # Run the model batch by batch, without tracking gradients
    all_scores = []
    for batch in dataloader:
        with torch.no_grad():
            scores = self.module(batch).detach()
            all_scores.append(scores)
    all_scores = torch.cat(all_scores)
    # Convert the raw scores to class probabilities
    probs = torch.nn.functional.softmax(all_scores, dim=1)

    ...

    responses = []
    for i in range(len(predictions[0])):
        response = process_response(predictions, i, nb_echos_max, prob_min, self.libs)
        responses.append(response)

    return responses
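The post-processing step is not shown in the wrapper above. As a rough idea of what the `nb_echos_max` and `prob_min` parameters suggest, here is a hypothetical sketch that keeps at most `nb_echos_max` labels whose probability reaches `prob_min`. The function name `top_k_response` and the exact output shape are assumptions; the real `process_response` may differ.

```python
def top_k_response(
    probs: list[float],
    labels: list[str],
    nb_echos_max: int = 5,
    prob_min: float = 0.01,
) -> dict[str, dict]:
    """Keep the best-scoring labels, ranked from most to least probable."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    # Apply the top-k cut first, then drop candidates below the probability floor
    kept = [(code, p) for code, p in ranked[:nb_echos_max] if p >= prob_min]
    return {
        str(rank): {"code": code, "probabilite": p}
        for rank, (code, p) in enumerate(kept, start=1)
    }
```

Keying the result by rank ("1", "2", …) mirrors the way the demo client later in this deck iterates over the returned codes.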

API serving

Text classification model served through a containerized REST API:
- Simplicity for end users
- Standard query format
- Scalable
- Modular and portable
Simple design thanks to the MLflow wrapper
Continuous deployment with Argo CD

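from typing import Annotated, List

from fastapi import Depends, Request
from fastapi.security import HTTPBasicCredentials


@router.post("/", response_model=List[PredictionResponse])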
async def predict(
    credentials: Annotated[HTTPBasicCredentials, Depends(get_credentials)],
    request: Request,
    forms: BatchForms,
    ...
    num_workers: int = 0,
    batch_size: int = 1,
):
    """
    Endpoint for predicting batches of data.

    Args:
        credentials (HTTPBasicCredentials): The credentials for authentication.
        request (Request): The incoming request, used to reach the model stored in the app state.
        forms (BatchForms): The input data as a BatchForms object.
        num_workers (int, optional): Number of worker processes used by the DataLoader. Defaults to 0.
        batch_size (int, optional): Size of a batch for batch prediction. Defaults to 1.

    For single predictions, we recommend keeping the defaults (num_workers=0, batch_size=1) for better performance.
    For batched predictions, consider increasing these two parameters (num_workers can range from 4 to 12, batch size can be increased up to 256) to optimize performance.

    Returns:
        list: The list of predicted responses.
    """
    input_data = forms.forms

    ...

    output = request.app.state.model.predict(input_data, params=params_dict)
    return [out.model_dump() for out in output]
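For illustration, a Python client could build a request for this endpoint as sketched below. The URL, query parameters, and the `description_activity` field are taken from the browser demo in this deck; the helper name `build_prediction_request` is hypothetical, and authentication is left out because the demo does not show how credentials are passed. The request is only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://codification-ape2025-pytorch.lab.sspcloud.fr/predict/"


def build_prediction_request(
    descriptions: list[str],
    nb_echos_max: int = 5,
    prob_min: float = 0.01,
    num_workers: int = 0,
    batch_size: int = 1,
) -> Request:
    """Build (but do not send) a POST request for a batch of descriptions."""
    query = urlencode(
        {
            "nb_echos_max": nb_echos_max,
            "prob_min": prob_min,
            "num_workers": num_workers,
            "batch_size": batch_size,
        }
    )
    # One form per activity description, matching the expected schema
    body = {"forms": [{"description_activity": d} for d in descriptions]}
    return Request(
        url=f"{BASE_URL}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; for larger batches, `num_workers` and `batch_size` can be raised as the endpoint's docstring suggests.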

API serving

async function transformToPost(description, top_k) {
  // Base URL with query parameters
  const baseUrl = `https://codification-ape2025-pytorch.lab.sspcloud.fr/predict/?nb_echos_max=${top_k}&prob_min=0.01&num_workers=0&batch_size=1`;

  // Build the request body according to the expected schema
  const body = {
    forms: [
      {
        description_activity: description
      }
    ]
  };

  // Send the POST request
  const response = await fetch(baseUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json"
    },
    body: JSON.stringify(body)
  });

  // Parse and return the JSON response
  return response.json();
}

viewof activite = Inputs.text({
  label: '',
  value: 'coiffure',
  width: 800
})

viewof prediction = Inputs.button("Run Prediction", {
  reduce: async () => {
    return await transformToPost(activite, 5);
  }
})

// Render the results as a table
prediction_table = {
  if (!prediction || !prediction.length) {
    return html``
  }

  // The response is an array containing a single object
  const result = prediction[0]
  const { IC, MLversion, ...codes } = result

  const rows = Object.values(codes).map(({ code, libelle, probabilite }) => {
    return html`
      <tr>
        <td>${code} – ${libelle}</td>
        <td style="text-align:right;">${probabilite.toFixed(3)}</td>
      </tr>
    `
  })

  return html`
    <table style="border-collapse: collapse; width: 100%;">
      <caption style="margin-bottom: 0.5em;">
        Confidence score : ${(+IC).toFixed(3)}
      </caption>
      <thead>
        <tr>
          <th style="text-align:left;">Description (NA2008)</th>
          <th style="text-align:right;">Probability</th>
        </tr>
      </thead>
      <tbody>
        ${rows}
      </tbody>
    </table>
  `
}

3️⃣ torchTextClassifiers, an open-source package to distribute PyTorch models

Beyond fastText?

fastText: a powerful and efficient model that has been used in production since 2021…
… but the library repo was archived on March 19, 2024

This lack of maintenance is a serious problem in the medium term:

- Bugs may appear that can no longer be fixed
- Dependency versions may conflict
- Modernization is held back

Why PyTorch? Some strategic considerations…

💡 Idea: develop our own PyTorch-based model to:

- adapt and customize the architecture for our specific needs (text classification with categorical variables)
- limit dependencies on external libraries and bring maintenance in-house for long-term robustness
- tap into the active deep learning / NLP community to develop additional features (explainability with Captum, calibration with torch-uncertainty…)

Why package the architecture?

Conceptually, the model architecture has a life of its own
- It can be used for many other use cases
- It has its own development, versioning, etc.
- This justifies giving it its own repo, separate from the training and API repos
From an MLOps point of view:
- The model travels between different teams and repos (training, inference, production): we need a Single Source of Truth for easy deployment and versioning!
  - PyPI plays the role of the remote
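One practical consequence of a single source of truth is that the inference side can verify it runs the same package version that was used for training. The sketch below is an assumption about how such a guard could look: `check_package_version` is a hypothetical helper, and the expected version would come from wherever the training run recorded it (for example the model's metadata).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable


def check_package_version(
    package: str,
    expected: str,
    get_version: Callable[[str], str] = version,
) -> None:
    """Raise if the installed package version differs from the expected one."""
    try:
        installed = get_version(package)
    except PackageNotFoundError as exc:
        raise RuntimeError(f"{package} is not installed") from exc
    if installed != expected:
        raise RuntimeError(
            f"{package} version mismatch: installed {installed}, expected {expected}"
        )
```

The `get_version` parameter defaults to the standard library lookup and exists only so the check can be exercised without installing the package.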

Our solution: torchTextClassifiers

The package:
- Breaks a text classification model down into distinct components that can be combined flexibly
- Distributes SOTA architectures, including self-attention layers (you can make your own small BERT!)
- Makes it easy to instantiate and train those components, and offers additional features such as explainability
  - You can use any tokenizer from HuggingFace or train your own

Targets:
- Anyone who wants to train their own (possibly small) models and customize their architectures, and who cannot deploy large models from HuggingFace

Positioning

Production POV

The different components

Demo

Link to the demo

Link to the doc

https://inseefrlab.github.io/torchTextClassifiers/

4️⃣ Perspectives

MLOps

Bridging the gap between innovation and production:
- full automation from data extraction through model training and qualification to deployment
- observability of the model in production: logs, continuous annotation…

Roadmap for torchTextClassifiers

Additional features:
- include more architectures: gated attention, text-label cross-attention
- controlling uncertainty: calibration & conformal prediction
- quantization
- push to / pull from HuggingFace
Always fully open-sourced!

Thank you for your attention.

All of our repos for the NACE coding project and our packages are available here: