Parameters#
A CMS analysis in PocketCoffea is fully determined by a set of parameters and by an analysis configuration. These two terms, althought they are both referring to metadata, have a specific meaning inside the framework.
In PocketCoffea parameters are all the metadata defining a CMS analysis phasespace in a broader sense:
Triggers
Luminosity and event flags
Object identification, calibration configuration and working points
Scale factors
Jet energy calibration configuration
Note
Analyzers use parameters to define a calibrated set of objects, with clearly defined object preselection and a set of CMS-specific working points.
On top of this, different analysis configurations can be defined to export a set of observables, in specific categories and produce plots, ntuples, measurements.
Goal
The goal of PocketCoffea is to track both analyses parameters and configurations to streamline the sharing of common metadata between groups and make analysis preservation easier.
The configuration format is described in details here. In this page we discuss the format of analyses parameters.
Parameters format#
The choosen format for analysis parameters in PocketCoffea is yaml, given is high readability and flexibility. Most of the CMS metadata can be expressed as list and dictionaries of strings and numbers, therefore the yaml format does not pose any limitation.
The OmegaConf (docs) package has been chosen to handle the yaml parameters file: this allow us to compose different parameter sets and/or being able to dynamically overwrite part of a configuration.
Important
The parameters
object is passed to the Configurator
class (see Configuration), which passes it
inside the Coffea processor and to all the components of the framework. Therefore, the parameters object is the ideal
container for all the the necessary metadata that are not part of the analysis configuration.
Let’s have a look at one of the default parameters set defined in the PocketCoffea defaults in
pocket_coffea/parameters/jet_scale_factors.yaml
jet_scale_factors:
btagSF:
# DeepJet AK4 tagger shape SF
'2016_PreVFP':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2016preVFP_UL/btagging.json.gz
name: "deepJet_shape"
'2016_PostVFP':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2016postVFP_UL/btagging.json.gz
name: "deepJet_shape"
'2017':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2017_UL/btagging.json.gz
name: "deepJet_shape"
'2018':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2018_UL/btagging.json.gz
name: "deepJet_shape"
jet_puId:
# Jet PU ID SF to be applied only on selected jets (pt<50) that are matched to GenJets
'2016_PreVFP':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/JME/2016preVFP_UL/jmar.json.gz
name: PUJetID_eff
'2016_PostVFP':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/JME/2016postVFP_UL/jmar.json.gz
name: PUJetID_eff
'2017':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/JME/2017_UL/jmar.json.gz
name: PUJetID_eff
'2018':
file: /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/JME/2018_UL/jmar.json.gz
name: PUJetID_eff
Tip
PocketCoffea defines a set of default parameters sets for the most common CMS parameters: lumi, jets calibrations, event flags, btagging working points.
The file contains a nested structure splitting the parameters by datataking period. Internally the jet scale factor application
code in PocketCoffea will look for the jet_scale_factors.btagSF
metadata when applying the btaggging scale factors.
Important
The parameters format is free and not validated inside the framework. Internally some part of the code expect the
parameter dictionaries to have certain key as jet_scale_factors
. In case one key is not found the process terminated
with a nice exception telling the user what’s missing in the yaml files.
Default parameters#
PocketCoffea defines a set of default parameters sets for the most common CMS parameters: lumi, jets calibrations, event flags, btagging working points. The user can get a copy of the default set of parameters programmatically:
1>>> from pocket_coffea.parameters import defaults
2>>> default_parameters = defaults.get_default_parameters()
3
4>>> default_parameters.keys()
5dict_keys(['pileupJSONfiles', 'event_flags', 'event_flags_data',
6 'lumi', 'default_jets_calibration', 'jets_calibration',
7 'jet_scale_factors', 'btagging', 'lepton_scale_factors',
8 'systematic_variations'])
9
10>>> default_parameters.jet_scale_factors.btagSF
11{'btagSF': {
12 '2016_PreVFP': {'file': '/cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2016preVFP_UL/btagging.json.gz', 'name': 'deepJet_shape'},
13 '2016_PostVFP': {'file': '/cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2016postVFP_UL/btagging.json.gz', 'name': 'deepJet_shape'},
14 '2017': {'file': '/cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2017_UL/btagging.json.gz', 'name': 'deepJet_shape'},
15 '2018': {'file': '/cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration/POG/BTV/2018_UL/btagging.json.gz',
16 'name': 'deepJet_shape'}},
17}
18
The OmegaConf
parameters object behaves like a python dictionary, where keys can be accessed directly as attributes.
The user can explore programmatically the full parameters set and dinamycally add more keys.
Parameters set can also be loaded directly from yaml files:
1### Using the PocketCoffea interface
2from pocket_coffea.parameters import defaults
3default.compose_parameters_from_files(["params/triggers.yaml", "params/leptons.yaml"])
4
5## Using directly the OmegaConf interface
6pileup = OmegaConf.load('pileup.yaml')
7event_flags = OmegaConf.load('event_flags.yaml')
8lumi = OmegaConf.load('lumi.yaml')
9params = OmegaConf.merge(pileup, event_flags, lumi)
User parameters customization#
User must be able to easily and cleanly modify analysis parameters building up from shared configuration sets. The most direct way to modify parameters is just to load the defaults and manually set attributes in the analysis configuration file or in a script.
1>>> from pocket_coffea.parameters import defaults
2>>> default_parameters = defaults.get_default_parameters()
3# Now the user can customize the params as a dictionary
4>>> a["custom_param"] = {"2018": 3.45, "2017": [2,3,4,5]}
5>>> a.custom_param
6{'2018': 3.45, '2017': [2, 3, 4, 5]}
A best practice is to save parameters customization in yaml files along the analyses configuration.
Some methods have been implemented in the pocket_coffea.parameters.defaults
module to help the user compose the
configuration.
1from pocket_coffea.parameters import defaults
2default_parameters = defaults.get_default_parameters()
3
4parameters = defaults.merge_parameters_from_files(default_parameters,
5 f"{localdir}/params/object_preselection.yaml",
6 f"{localdir}/params/triggers.yaml",
7 update=True)
The method defaults.merge_parameters_from_files
loads the additional parameters from the yaml files passed by the user
and merge them with the default_parameters
object. The update=True
options means that if a key is already present,
the dictionary is updated and not just replaced (default from OmegaConf).
Tip
The parameters used in each analysis run are dumped together with the analysis configuration in order to always track all the metadata used to produce plots and ntuples.
OmegaConf tip and tricks#
The OmegaConf library allows some additional dynamic behaviour in the definition of the yaml file which can be quite useful.
Custom resolvers#
OmegaConf permits the user to define resolvers
which get their value resolved during execution.
For example the location of the default parameters directory depends on the user setup. It can be defined as a
${default_params_dir:}
macro.
lumi:
goldenJSON:
2016_PreVFP: "${default_params_dir:}/datacert/Cert_271036-284044_13TeV_Legacy2016_Collisions16_JSON.txt"
2016_PostVFP: "${default_params_dir:}/datacert/Cert_271036-284044_13TeV_Legacy2016_Collisions16_JSON.txt"
'2017': "${default_params_dir:}/datacert/Cert_294927-306462_13TeV_UL2017_Collisions17_GoldenJSON.txt"
'2018': "${default_params_dir:}/datacert/Cert_314472-325175_13TeV_Legacy2018_Collisions18_JSON.txt"
The macro is defined by default thanks to an helper function in pocket_coffea.parameters.defaults
.
The user can define additional macros using the helper before loading the parameters files. For example:
1from pocket_coffea.parameters import defaults
2default_parameters = defaults.get_default_parameters()
3localdir = os.path.dirname(os.path.abspath(__file__))
4
5# Register a new macro
6defaults.register_configuration_dir("config_dir", localdir+"/params")
Important
Now the loaded parameters can use the "${config_dir:}"
macro without contaminating the parameters files with
user-specific hard coded paths. This is very important to be able to share configurations between users without painful
hardcoded changes.
Cross references#
OmegaConf can also build cross references inside the configuration dictionary.
Existing keys can be referred to just by using the syntax ${other.key.in.the.dictionary}
. N.B.: note the missing
semicolumn at the end of the macro syntax, which is reserved for resolvers
like ${default_dir:}
.
For example, the jet calibration configuration can be built using pieces of the default_jets_calibration
dictionary
defined in
pocket_coffea/parameters/jets_calibration.yaml
. This
helps removing a lot of repetition and boilerplate metadata.
# Default jets calibration for the user used by the processor
jets_calibration:
factory_file: "./jets_calibrator_JES_JER_Syst.pkl.gz"
jet_types:
AK4PFchs: "${default_jets_calibration.factory_configuration.AK4PFchs.JES_JER_Syst}"
AK8PFPuppi: "${default_jets_calibration.factory_configuration.AK8PFPuppi.JES_JER_Syst}"
collection: # this is needed to know which collection is corrected with which jet factory
AK4PFchs: "Jet"
AK8PFPuppi: "FatJet"
jec_name_map: "${default_jets_calibration.jec_name_map}"
Missing values#
Missing values which need to be defined can be included with a ???
string: see
docs. If a user runs trying to use these value, an
exception will be raised.