Manages the secondary secret of a list of tables
Usage
tab_multi_manager(
  list_tables,
  list_explanatory_vars,
  dir_name = NULL,
  hrc = NULL,
  alt_hrc = NULL,
  totcode = getOption("rtauargus.totcode"),
  alt_totcode = NULL,
  value = "value",
  freq = "freq",
  secret_var = "is_secret_prim",
  cost_var = NULL,
  suppress = "MOD(1,5,1,0,0)",
  ip_start = 10,
  ip_end = 0,
  num_iter_max = 10,
  split_tab = FALSE,
  nb_tab_option = "smart",
  limit = 14700,
  ...
)
Arguments
- list_tables
named list of data.frame or data.table representing the tables to protect
- list_explanatory_vars
named list of character vectors giving the explanatory variables of each table mentioned in list_tables. The names of this list must be the same as the names of the list of tables.
- dir_name
string indicating the path of the directory in which to save all the files (.rda, .hst, .txt, .arb, .csv) generated by the function.
- hrc
Information on hierarchical variables (see section 'Hierarchical variables').
- alt_hrc
named list for alternative hierarchies (useful for non-nested hierarchies)
- totcode
Code(s) which represent the total of a categorical variable (see section 'Specific parameters' for this parameter's syntax). If unspecified for a variable (neither by default nor explicitly), it will be set to rtauargus.totcode.
- alt_totcode
named list for alternative codes
- value
Name of the column containing the value of the cells.
- freq
Name of the column containing the cell frequency.
- secret_var
Name of the boolean variable that specifies the secret, primary or not: TRUE if a cell is subject to the secret, FALSE otherwise. This variable is exported in the apriori file.
- cost_var
Numeric variable used to change the suppression cost of a cell, which is taken into account by the secondary suppression algorithms. By default, the cost is the value of the cell. It can be specified for each cell, and may contain NA for the cells whose cost does not need to be changed.
- suppress
Algorithm for secondary suppression (Tau-Argus batch syntax), along with its parameters, if any.
- ip_start
integer: interval protection level to apply at the first treatment of each table
- ip_end
integer: interval protection level to apply at the subsequent treatments
- num_iter_max
integer: maximum number of treatments to perform on each table (defaults to 10)
- split_tab
boolean: whether to reduce the dimension to 3 when treating a table of dimension 4 or 5 (defaults to FALSE)
- nb_tab_option
strategy used to choose variables automatically when splitting:
"min": minimize the number of tables;
"max": maximize the number of tables;
"smart": minimize the number of tables under the constraint of their row count.
- limit
numeric: row-count limit used to choose which variables to merge (if nb_tab_option = "smart") and to split tables whose number of rows exceeds this limit, in order to avoid Tau-Argus failures
- ...
other arguments of tab_rtauargus2()
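The naming constraint between list_tables and list_explanatory_vars can be checked before calling the function. The following is a minimal sketch in base R and is not part of rtauargus: check_inputs() is a hypothetical helper, and the two small tables are made-up data used only for illustration.

```r
# Hypothetical helper (not part of rtauargus): checks that both lists carry
# the same names and that every explanatory variable exists in its table.
check_inputs <- function(list_tables, list_explanatory_vars) {
  stopifnot(
    !is.null(names(list_tables)),
    setequal(names(list_tables), names(list_explanatory_vars))
  )
  for (tab_name in names(list_tables)) {
    missing_vars <- setdiff(
      list_explanatory_vars[[tab_name]],
      names(list_tables[[tab_name]])
    )
    if (length(missing_vars) > 0) {
      stop(
        "Table '", tab_name, "' lacks column(s): ",
        paste(missing_vars, collapse = ", ")
      )
    }
  }
  invisible(TRUE)
}

# Made-up tables, for illustration only
toy_tables <- list(
  t1 = data.frame(A = c("Total", "a1"), B = c("Total", "b1"), value = c(10, 4)),
  t2 = data.frame(A = c("Total", "a1"), C = c("Total", "c1"), value = c(10, 6))
)
toy_vars <- list(t1 = c("A", "B"), t2 = c("A", "C"))

inputs_ok <- check_inputs(toy_tables, toy_vars)
```

Running such a check first makes a naming mismatch fail early, before any file is written to dir_name.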
Value
The original list of tables, in which the secret status resulting from each iteration is added to each table. For example, the result of the first iteration is called 'is_secret_1' in each table. It is a boolean variable indicating whether the cell has to be masked or not.
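As an illustration of how the added columns might be used, here is a short sketch in base R. The table res_tab is a made-up mock of one returned table (a real result requires Tau-Argus), and treating the highest-numbered 'is_secret_' column as the final mask is an assumption of this sketch, not something stated above.

```r
# Mock of one returned table (made-up data, for illustration only)
res_tab <- data.frame(
  ACTIVITY = c("Total", "A", "B", "C"),
  TOT = c(100, 60, 30, 10),
  is_secret_prim = c(FALSE, FALSE, FALSE, TRUE),
  is_secret_1 = c(FALSE, FALSE, TRUE, TRUE),
  is_secret_2 = c(FALSE, FALSE, TRUE, TRUE)
)

# Find the iteration columns and pick the one with the highest number
iter_cols <- grep("^is_secret_[0-9]+$", names(res_tab), value = TRUE)
iter_num <- as.integer(sub("^is_secret_", "", iter_cols))
last_col <- iter_cols[which.max(iter_num)]

# Assumption: the highest-numbered column holds the final mask
res_tab$TOT_masked <- ifelse(res_tab[[last_col]], NA_real_, res_tab$TOT)

# Cells masked in the final column that were not primary secrets
n_secondary <- sum(res_tab[[last_col]] & !res_tab$is_secret_prim)
```

Selecting the column by its numeric suffix rather than by position keeps the sketch independent of the column order in the table.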
Examples
library(rtauargus)
library(dplyr)
data(turnover_act_size)
data(turnover_act_cj)
data(activity_corr_table)
# 0-Making the hrc file of business sectors ----
hrc_file_activity <- activity_corr_table %>%
  write_hrc2(file_name = "hrc/activity")

# 1-Prepare data ----
# Indicate whether each cell complies with the primary rules.
# The boolean variable created is TRUE if the cell doesn't comply.
# Here the frequency rule is freq in (0;3)
# and the dominance rule is NK(1,85)
list_data_2_tabs <- list(
  act_size = turnover_act_size,
  act_cj = turnover_act_cj
) %>%
  purrr::map(
    function(df){
      df %>%
        mutate(
          is_secret_freq = N_OBS > 0 & N_OBS < 3,
          is_secret_dom = ifelse(MAX == 0, FALSE, MAX/TOT > 0.85),
          is_secret_prim = is_secret_freq | is_secret_dom
        )
    }
  )
if (FALSE) {
  options(
    rtauargus.tauargus_exe =
      "Y:/Logiciels/TauArgus/TauArgus4.2.3/TauArgus.exe"
  )

  res_1 <- tab_multi_manager(
    list_tables = list_data_2_tabs,
    list_explanatory_vars = list(
      act_size = c("ACTIVITY", "SIZE"),
      act_cj = c("ACTIVITY", "CJ")
    ),
    hrc = c(ACTIVITY = hrc_file_activity),
    dir_name = "tauargus_files",
    value = "TOT",
    freq = "N_OBS",
    secret_var = "is_secret_prim",
    totcode = "Total"
  )

  # With the dimension reduction feature
  data("datatest1")
  data("datatest2")

  datatest2b <- datatest2 %>%
    filter(cj == "Total", treff == "Total", type_distrib == "Total") %>%
    select(-cj, -treff, -type_distrib)

  str(datatest2b)

  res <- tab_multi_manager(
    list_tables = list(d1 = datatest1, d2 = datatest2b),
    list_explanatory_vars = list(
      d1 = names(datatest1)[1:4],
      d2 = names(datatest2b)[1:2]
    ),
    dir_name = "tauargus_files",
    value = "pizzas_tot_abs",
    freq = "nb_obs_rnd",
    secret_var = "is_secret_prim",
    totcode = "Total",
    split_tab = TRUE
  )
}