
Overview

The Cell Key Method (CKM) is a statistical technique that protects the confidentiality of tabular data by perturbing every cell of a table. This package provides tools to apply the CKM in R, enabling users to select a suitable set of parameters and to generate perturbed counting tables from microdata.

For more information on the Cell Key Method, refer to section 5.4 of the Handbook on Statistical Disclosure Control.

For the moment, the package perturbs frequency tables only.

Documentation

For detailed documentation, please refer to the package vignette.

The transition matrices are built using the ptable package.

French-speaking readers can also refer to a methodological document for more information on the Cell Key Method.

Package Installation

# install.packages("remotes")
remotes::install_github("inseefrlab/ckm", dependencies = TRUE)

Applying the Cell Key Method Step by Step

Assigning a Random Key to the Microdata

library(ckm)

data("dtest", package = "ckm")

set.seed(4081789) # Ensure reproducibility
dtest_with_keys <- build_individual_keys(dtest)
hist(dtest_with_keys$rkey)
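
To make the idea of a record key concrete, here is a minimal base R sketch. It assumes that a record key is simply one uniform draw on (0, 1) per row; the toy data frame is hypothetical, and this is not necessarily how build_individual_keys() is implemented internally. Only the column name rkey is taken from the example above.

```r
# Illustrative sketch only: toy data, not the package's internal code.
set.seed(123)  # keys are random, so fix the seed for reproducibility

toy <- data.frame(
  SEXE = c("F", "M", "F", "M", "F"),
  AGE  = c("adult", "adult", "child", "child", "adult")
)

# One key per record, drawn uniformly between 0 and 1.
toy$rkey <- runif(nrow(toy))
```

Because the keys are attached to the records once and then reused, the same record contributes the same key to every table built from the microdata.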

Generating the Counting Table with Cell Keys

tab_before <- tabulate_cnt_micro_data(
  df = dtest_with_keys,
  cat_vars = c("DIPLOME", "SEXE", "AGE"),
  hrc_vars = list(GEO = c("REG", "DEP")),
  marge_label = "Total"
)
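
In the standard description of the method, the key of a cell is the fractional part of the sum of the record keys of the units contributing to that cell. The base R sketch below illustrates this on hypothetical toy data with hand-picked keys; the column names n and ckey are made up for the illustration and need not match the output of tabulate_cnt_micro_data().

```r
# Illustrative sketch only: toy records with hand-picked keys.
toy <- data.frame(
  SEXE = c("F", "M", "F", "M", "F"),
  rkey = c(0.62, 0.15, 0.81, 0.40, 0.33)
)

# Count of records and sum of record keys per cell.
counts <- aggregate(rkey ~ SEXE, data = toy, FUN = length)
names(counts)[2] <- "n"
keysum <- aggregate(rkey ~ SEXE, data = toy, FUN = sum)

cells <- merge(counts, keysum, by = "SEXE")
cells$ckey <- cells$rkey %% 1  # cell key = fractional part of the key sum
cells$rkey <- NULL

# The margin is a cell like any other: its key comes from all records.
total <- data.frame(SEXE = "Total", n = nrow(toy), ckey = sum(toy$rkey) %% 1)
cells <- rbind(cells, total)
```

A cell made up of the same records always receives the same cell key, whichever table it appears in, which is what keeps the perturbation consistent across tables.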

Applying the Perturbation

# D is the maximum deviation of the noise and V its variance
res_ckm <- apply_ckm(tab_before, D = 5, V = 2)
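
The sketch below shows, in base R, how a cell key can select the noise added to a count: the key is compared with the cumulative probabilities of a noise distribution. The distribution and the perturb() helper are hypothetical and deliberately simplified. A real transition matrix built with the ptable package holds one distribution per original count, whereas this toy version applies the same one to every count.

```r
# Illustrative sketch only: a made-up noise distribution with maximum
# deviation 2, not a transition matrix produced by ptable.
noise <- c(-2, -1, 0, 1, 2)
prob  <- c(0.1, 0.2, 0.4, 0.2, 0.1)
upper <- cumsum(prob)  # upper bounds of the cell key intervals

# Hypothetical helper: pick the noise whose interval contains the cell key.
perturb <- function(n, ckey) {
  idx <- findInterval(ckey, upper) + 1
  idx <- pmin(idx, length(noise))  # guard against rounding in cumsum()
  n + noise[idx]
}

perturb(3, 0.76)  # key falls in the interval of noise +1, result is 4
perturb(2, 0.55)  # key falls in the interval of noise 0, result is 2
```

Since the noise is a deterministic function of the count and the cell key, the same cell is always perturbed in the same way.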

Applying the Cell Key Method in One Step

Once individual keys have been assigned to the microdata, you can build the perturbed table directly:

res_ckm <- tabulate_and_apply_ckm(
  df = dtest_with_keys,
  cat_vars = c("DIPLOME", "SEXE", "AGE"),
  hrc_vars = list(GEO = c("REG", "DEP")),
  marge_label = "Total",
  D = 5, V = 2
)