pocket_coffea.workflows package

Submodules#

pocket_coffea.workflows.base module#

class pocket_coffea.workflows.base.BaseProcessorABC(cfg: Configurator)#

Bases: ProcessorABC, ABC

Abstract class which defines the common operations of a PocketCoffea processor.

abstract apply_object_preselection(variation)#

Function which must be defined by the actual user processor to preselect and clean objects and define the collections as attributes of events. E.g.:

self.events["ElectronGood"] = lepton_selection(self.events, "Electron", self.params)

apply_preselections(variation)#: The function computes all the masks from the preselection cuts and filters out the events to speed up the later computations. N.B.: Preselection happens after the object correction and cleaning.

classmethod available_variations()#: Identifiers of the weight variations available through this processor. By default they are all the weights defined in the WeightsManager.

classmethod available_weights()#: Identifiers of the weights available through this processor. By default they are all the weights defined in the WeightsManager.

compute_weights(variation)#: Function which defines the weights (called after preselection). The WeightsManager is built only for MC, not for data. The user processor can redefine this function to customize the weights computation object.

compute_weights_extra(variation)#: Function that can be defined by user processors to define additional weights to be added to the WeightsManager. To completely redefine the WeightsManager, use the function compute_weights

count_events(variation)#: Count the number of events in each category and also sum their nominal weights (for each sample, by chunk). Store the results in the cutflow and sumw outputs
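A minimal sketch of the bookkeeping described for count_events, assuming plain Python lists stand in for the per-event masks and weights of one chunk. The names `category_masks`, `weights` and the free function below are illustrative only and are not part of the PocketCoffea API.

```python
# Stand-in data: per-event nominal weights and one boolean mask per category.
weights = [1.0, 0.5, 2.0, 1.5]
category_masks = {
    "baseline": [True, True, True, True],
    "one_lepton": [True, False, True, False],
}


def count_events(category_masks, weights):
    """Return (cutflow, sumw): event counts and summed weights per category."""
    cutflow, sumw = {}, {}
    for name, mask in category_masks.items():
        cutflow[name] = sum(mask)  # each True counts as 1
        sumw[name] = sum(w for w, keep in zip(weights, mask) if keep)
    return cutflow, sumw


cutflow, sumw = count_events(category_masks, weights)
print(cutflow)  # {'baseline': 4, 'one_lepton': 2}
print(sumw)     # {'baseline': 5.0, 'one_lepton': 3.0}
```

In the real processor these numbers are accumulated per sample and per chunk; the sketch covers a single chunk of a single sample.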

abstract count_objects(variation)#: Function that counts the preselected objects and saves the counts as attributes of events. The function must be defined by the user processor.
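As an illustration of what a count_objects implementation typically does, here is a sketch that uses a plain dictionary of nested lists in place of the events array. The `n` prefix for the count fields and the collection contents are assumptions made for this example.

```python
# Stand-in for the events object: each collection holds one list of objects per event.
events = {
    "ElectronGood": [[{"pt": 35.0}], [], [{"pt": 50.0}, {"pt": 28.0}]],
    "JetGood": [[{"pt": 40.0}, {"pt": 31.0}], [{"pt": 60.0}], []],
}


def count_objects(events, collections):
    """Store the per-event multiplicity of each collection under 'n<Collection>'."""
    for name in collections:
        events["n" + name] = [len(objects) for objects in events[name]]
    return events


count_objects(events, ["ElectronGood", "JetGood"])
print(events["nElectronGood"])  # [1, 0, 2]
print(events["nJetGood"])       # [2, 1, 0]
```

A real user processor would do the same on the jagged event arrays, so that later cuts can refer to the stored counts instead of recomputing them.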

define_categories(variation)#

The function saves all the cut masks internally, in order to use them later to define categories (groups of cuts).

The categorization object takes care of caching the masks and exposes a common interface.

Moreover, it computes the cut masks defining the subsamples for the current chunk and stores them in the self.subsamples attribute for later use.
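The idea of a category as a group of cuts can be sketched as follows, assuming each cut has already been evaluated once into a boolean mask. The cut names, the `categories` mapping and the `category_mask` helper are hypothetical; the actual categorization object also handles caching and subsamples.

```python
# Hypothetical per-event cut masks, computed once and stored for reuse.
cut_masks = {
    "has_lepton": [True, True, False, True],
    "four_jets": [True, False, False, True],
    "two_btags": [False, False, True, True],
}

# A category is a named group of cuts; an event must pass all of them.
categories = {
    "baseline": ["has_lepton"],
    "signal": ["has_lepton", "four_jets", "two_btags"],
}


def category_mask(cut_names, cut_masks):
    """AND together the stored masks of the cuts belonging to one category."""
    n_events = len(next(iter(cut_masks.values())))
    return [all(cut_masks[cut][i] for cut in cut_names) for i in range(n_events)]


masks = {name: category_mask(cuts, cut_masks) for name, cuts in categories.items()}
print(masks["baseline"])  # [True, True, False, True]
print(masks["signal"])    # [False, False, False, True]
```

Storing the individual cut masks means a cut shared by several categories is evaluated only once.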

define_column_accumulators()#: Define the ColumnsManagers to handle the requested columns from the configuration. If subsamples are defined, a ColumnsManager is created for each of them.

define_column_accumulators_extra()#: This function should be redefined to add column accumulators in the custom processor, if they cannot be defined from the configuration

define_common_variables_after_presel(variation)#: Function that defines common variables employed in analyses and saves them as attributes of events, after preselection. If the user processor does not redefine it, no common variables are built.

define_common_variables_before_presel(variation)#: Function that defines common variables employed in analyses and saves them as attributes of events, before preselection. If the user processor does not redefine it, no common variables are built.

define_custom_axes_extra()#

Function which gets called before the definition of the HistManager. It is used to define extra custom axes for the histograms depending on the current chunk metadata. E.g.: it can be used to add an era axis only for data.

Custom axes needed for all the samples can be added in the user processor constructor, by appending to self.custom_axes.

define_histograms()#: Initialize the HistManager. Redefine it to fully customize the creation of the HistManager. Only one HistManager is created for all the subsamples. The subsample masks are passed to fill_histogram and used internally.

define_histograms_extra()#

Function that gets called after the creation of the HistManager. The user can redefine this function to manipulate the HistManager histogram configuration and add customizations directly to the histogram objects before the filling.

This function should also be redefined to fill the self.custom_histogram_fields that are passed to the histogram filling.

export_skimmed_chunk()#

fill_column_accumulators(variation)#

fill_column_accumulators_extra(variation)#

fill_histograms(variation)#: Function which fills the histograms for each category and variation, through the HistManager.

fill_histograms_extra(variation)#: The function gets called after the filling of the default histograms. Redefine it to fill custom histograms.

get_extra_shape_variations()#

get_shape_variations()#: Generator for shape variations.

load_metadata()#

The function is called at the beginning of the processing for each chunk to load some metadata depending on the chunk sample, year and dataset.

_dataset: name assigned by the user to the fileset configuration: it identifies in a unique way the output and the source of the events.
_sample: category of the events, used in the processing to parametrize things
_samplePart: subcategory of the events, used in the processing to parametrize things

load_metadata_extra()#: Function that can be called by a derived processor to define additional metadata. For example load additional information for a specific sample.

property nevents#: Compute the number of events currently in the chunk. If the property is accessed after skimming or preselection, the number of events is reduced accordingly.

postprocess(accumulator)#

The function is called by coffea at the end of the processing. The default function calls the rescale_sumgenweights function to rescale the histograms and sumw metadata using the sum of the genweights computed without preselections for each dataset.

Moreover the function saves in the output a dictionary of metadata with the full description of the datasets taken from the configuration.

To add further customization, redefine the postprocess function, but remember to include a super().postprocess() call.
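The override pattern can be sketched as below. `BaseSketch` is a stand-in written for this example and is not the real BaseProcessorABC; the assumption that postprocess returns the accumulator, and the `my_summary` key, are also illustrative.

```python
class BaseSketch:
    """Stand-in for the base processor; not the real BaseProcessorABC."""

    def postprocess(self, accumulator):
        # Placeholder for the default rescaling and metadata bookkeeping.
        accumulator["base_postprocess_done"] = True
        return accumulator


class MyProcessor(BaseSketch):
    def postprocess(self, accumulator):
        # Run the default postprocessing first, then customize its result.
        accumulator = super().postprocess(accumulator)
        accumulator["my_summary"] = sorted(accumulator["cutflow"])
        return accumulator


output = MyProcessor().postprocess({"cutflow": {"signal": 1, "baseline": 4}})
print(output["base_postprocess_done"])  # True
print(output["my_summary"])             # ['baseline', 'signal']
```

Calling the parent implementation first keeps the default rescaling and metadata in the output, so the custom code only has to add to it.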

process(events: Array)#

This function gets called by Coffea on each chunk of a NanoAOD file. The processing steps of PocketCoffea are defined in this function.

Customization points for user-defined processors are provided as _extra functions. By redefining those functions the user can change the behaviour of the processor at some predefined points.

The processing goes through the following steps:

load metadata

Skim events (first masking of events):
HLT triggers should be applied here, but their use is left to the configuration, and not hardcoded in the processor.

apply object preselections

count objects

apply event preselection (second masking of events)

define categories

define weights

define histograms

count events in each category
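The combination of a fixed sequence of steps with overridable _extra hooks can be sketched as a template method. The class below is a toy written for this example: the step names mirror the list above, the step bodies only record that they ran, and the exact position of the two hooks shown is an assumption inferred from their names.

```python
class PipelineSketch:
    """Toy template method: fixed step order, with no-op hooks to override."""

    def __init__(self):
        self.calls = []

    def _step(self, name):
        # A real step would transform the events; here it is only logged.
        self.calls.append(name)

    # Customization points: they do nothing unless a subclass redefines them.
    def process_extra_before_skim(self):
        pass

    def process_extra_after_skim(self):
        pass

    def process(self):
        self._step("load_metadata")
        self.process_extra_before_skim()
        self._step("skim_events")
        self.process_extra_after_skim()
        for name in (
            "apply_object_preselection",
            "count_objects",
            "apply_preselections",
            "define_categories",
            "compute_weights",
            "define_histograms",
            "count_events",
        ):
            self._step(name)
        return self.calls


class MyPipeline(PipelineSketch):
    def process_extra_after_skim(self):
        self._step("my_custom_step")


calls = MyPipeline().process()
print(calls[:3])  # ['load_metadata', 'skim_events', 'my_custom_step']
```

The user processor never reorders the steps; it only fills in the hooks, which keeps every analysis on the same processing skeleton.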

process_extra_after_presel(variation)#

process_extra_after_skim()#

process_extra_before_presel(variation)#

process_extra_before_skim()#

rescale_sumgenweights(output)#
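The postprocess description says this function rescales histograms and sumw using the sum of the genweights computed without preselections for each dataset. A sketch of that arithmetic for the sumw part only, assuming the rescaling is a division by the per-dataset total and using made-up dataset names and numbers:

```python
def rescale_by_sum_genweights(sumw, sum_genweights):
    """Divide each dataset's summed weights by its total generator weight."""
    rescaled = {}
    for dataset, per_category in sumw.items():
        norm = sum_genweights[dataset]
        rescaled[dataset] = {cat: value / norm for cat, value in per_category.items()}
    return rescaled


# Illustrative inputs: summed weights after selection, and the total genweight
# of the dataset computed before any selection.
sumw = {"ttbar_2018": {"baseline": 50.0, "signal": 25.0}}
sum_genweights = {"ttbar_2018": 200.0}

rescaled = rescale_by_sum_genweights(sumw, sum_genweights)
print(rescaled)  # {'ttbar_2018': {'baseline': 0.25, 'signal': 0.125}}
```

Because the normalization uses the total before any selection, it can only be applied once all chunks of a dataset have been accumulated, which is why it happens in postprocess.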

skim_events()#

Function which applies the initial event skimming. By default the skimming comprises:

METfilters,

PV requirement: at least 1 good primary vertex

lumi-mask (for DATA): applies the goldenJson selection

requested HLT triggers (from configuration, not hardcoded in the processor)

user-defined skimming cuts

BE CAREFUL: the skimming is done before any object preselection and cleaning. Only collections and branches already present in the NanoAOD before any corrections can be used. If you need to apply a cut on preselected objects, define it at the preselection level, not at the skim level.
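How the default skim requirements combine into a single event mask can be sketched as follows. The per-event dictionaries and their keys (`met_filters`, `n_good_pv`, `in_golden_json`, `hlt`) are hypothetical stand-ins for quantities available in the raw NanoAOD; user-defined skim cuts are left out for brevity.

```python
# Hypothetical raw, uncorrected per-event quantities.
events = [
    {"met_filters": True, "n_good_pv": 2, "in_golden_json": True, "hlt": True},
    {"met_filters": True, "n_good_pv": 0, "in_golden_json": True, "hlt": True},
    {"met_filters": True, "n_good_pv": 1, "in_golden_json": False, "hlt": True},
    {"met_filters": False, "n_good_pv": 1, "in_golden_json": True, "hlt": True},
]


def skim_mask(events, is_data):
    """Return one boolean per event: True if the event survives the skim."""
    mask = []
    for ev in events:
        keep = ev["met_filters"] and ev["n_good_pv"] >= 1 and ev["hlt"]
        if is_data:
            # The lumi mask applies only to data.
            keep = keep and ev["in_golden_json"]
        mask.append(keep)
    return mask


mc_mask = skim_mask(events, is_data=False)
data_mask = skim_mask(events, is_data=True)
print(mc_mask)    # [True, False, True, False]
print(data_mask)  # [True, False, False, False]
```

Every input to the mask is a raw branch, matching the warning above: nothing that depends on corrected or cleaned objects is available at this stage.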

pocket_coffea.workflows.genweights module#

class pocket_coffea.workflows.genweights.genWeightsProcessor(cfg: Configurator)#

Bases: BaseProcessorABC

apply_object_preselection(variation)#

Function which must be defined by the actual user processor to preselect and clean objects and define the collections as attributes of events. E.g.:

self.events["ElectronGood"] = lepton_selection(self.events, "Electron", self.params)

count_objects(variation)#: Function that counts the preselected objects and saves the counts as attributes of events. The function must be defined by the user processor.

load_metadata()#

The function is called at the beginning of the processing for each chunk to load some metadata depending on the chunk sample, year and dataset.

_dataset: name assigned by the user to the fileset configuration: it identifies in a unique way the output and the source of the events.
_sample: category of the events, used in the processing to parametrize things
_samplePart: subcategory of the events, used in the processing to parametrize things

process(events: Array)#

This function gets called by Coffea on each chunk of a NanoAOD file. The processing steps of PocketCoffea are defined in this function.

Customization points for user-defined processors are provided as _extra functions. By redefining those functions the user can change the behaviour of the processor at some predefined points.

The processing goes through the following steps:

load metadata

Skim events (first masking of events):
HLT triggers should be applied here, but their use is left to the configuration, and not hardcoded in the processor.

apply object preselections

count objects

apply event preselection (second masking of events)

define categories

define weights

define histograms

count events in each category

pocket_coffea.workflows.semileptonic_triggerSF module#

class pocket_coffea.workflows.semileptonic_triggerSF.semileptonicTriggerProcessor(cfg)#

Bases: ttHbbBaseProcessor

define_custom_axes_extra()#

Custom axes needed for all the samples can be added in the user processor constructor, by appending to self.custom_axes.

pocket_coffea.workflows.sf_lepton_variations module#

pocket_coffea.workflows.tthbb_base_processor module#

class pocket_coffea.workflows.tthbb_base_processor.ttHbbBaseProcessor(cfg: Configurator)#