pocket_coffea.scripts.dataset package#
Submodules#
pocket_coffea.scripts.dataset.append_genweights module#
pocket_coffea.scripts.dataset.append_parents module#
pocket_coffea.scripts.dataset.build_datasets module#
pocket_coffea.scripts.dataset.dataset_query module#
- class pocket_coffea.scripts.dataset.dataset_query.DataDiscoveryCLI#
Bases: object
- property as_dict#
- do_allowlist_sites(sites=None)#
- do_blocklist_sites(sites=None)#
- do_clear()#
- do_list_replicas()#
- do_list_selected()#
- do_login(proxy=None)#
Log in to the Rucio client. Optionally, a specific proxy file can be passed to the command. If no proxy file is specified, voms-proxy-info is used.
- do_query(query=None)#
- do_query_results()#
- do_regex_sites(regex=None)#
- do_replicas(mode=None, selection=None)#
Query Rucio for replicas.
mode:
None: ask the user about the mode
round-robin: take files randomly from the available sites
choose: ask the user to choose from a list of sites
first: take the first site from the Rucio query
selection: list of indices, or 'all' to run the replicas query for all the selected datasets
- do_save(filename=None)#
Save the replica information in YAML format.
- do_select(selection=None, metadata=None)#
Select datasets from the list of query results. Input a list of indices, which may include ranges such as 4-6, or “all”.
- do_sites_filters(ask_clear=True)#
- do_whoami()#
- extract_era_from_dataset_name(dataset_name)#
- extract_xsec_from_dataset_name(dataset_name)#
- extract_year_from_dataset_name(dataset_name)#
- generate_default_metadata(dataset)#
- is_mc_dataset(dataset_name)#
- load_dataset_definition(dataset_definition, query_results_strategy='all', replicas_strategy='round-robin')#
Initialize the DataDiscoveryCLI by querying the set of datasets defined in dataset_definition, then select query results and replicas according to the options below.
query_results_strategy: “all”, or “manual” to be prompted for a selection
replicas_strategy:
“round-robin”: select randomly from the available sites for each file
“choose”: filter the sites with a list of indices for all the files
“first”: take the first result returned by rucio
“manual”: be prompted for a manual decision, dataset by dataset
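The non-interactive replica strategies listed above can be illustrated with a minimal sketch. This is not the package's implementation: `pick_replicas`, its arguments, and the file-to-sites mapping are hypothetical names introduced only for illustration, and the real class obtains the site lists from Rucio rather than from a plain dictionary.

```python
import random
from typing import Dict, List, Optional


def pick_replicas(
    files_sites: Dict[str, List[str]],
    strategy: str = "round-robin",
    allowed_sites: Optional[List[str]] = None,
    rng: Optional[random.Random] = None,
) -> Dict[str, str]:
    """Hypothetical helper: choose one site per file.

    files_sites maps each file name to the sites holding a replica,
    in the order they were returned by the (assumed) replica query.
    """
    rng = rng or random.Random()
    chosen: Dict[str, str] = {}
    for fname, sites in files_sites.items():
        candidates = list(sites)
        if strategy == "choose":
            # Keep only the sites the user picked for all files.
            candidates = [s for s in sites if s in (allowed_sites or [])]
        if not candidates:
            raise ValueError(f"no usable site for {fname}")

        if strategy == "round-robin":
            # As described above: pick randomly among the available sites.
            chosen[fname] = rng.choice(candidates)
        elif strategy in ("first", "choose"):
            # Take the first (remaining) site in query order.
            chosen[fname] = candidates[0]
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return chosen


example = {
    "file_a.root": ["T2_CH_CERN", "T2_IT_Pisa"],
    "file_b.root": ["T2_IT_Pisa", "T2_DE_DESY"],
}
first_choice = pick_replicas(example, strategy="first")
filtered_choice = pick_replicas(
    example, strategy="choose", allowed_sites=["T2_IT_Pisa"]
)
random_choice = pick_replicas(
    example, strategy="round-robin", rng=random.Random(0)
)
```

The “manual” strategy is omitted here because it depends on interactive prompts.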
- start_cli()#
- pocket_coffea.scripts.dataset.dataset_query.get_indices_query(input_str: str, maxN: int) → List[int]#
- pocket_coffea.scripts.dataset.dataset_query.print_dataset_query(query, dataset_list, console, selected=[])#
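The selection syntax accepted by do_select (a list of indices, ranges such as 4-6, or “all”) can be sketched as follows. This is only an illustration of that input format under stated assumptions, not the body of get_indices_query: the function name `parse_selection`, the 1-based indexing, the acceptance of commas as separators, and the silent dropping of out-of-range values are all assumptions.

```python
from typing import List


def parse_selection(input_str: str, max_n: int) -> List[int]:
    """Hypothetical parser for strings like '1 3 4-6' or 'all'.

    Returns a sorted list of unique indices, assumed to be 1-based
    and limited to the range 1..max_n.
    """
    text = input_str.strip().lower()
    if text == "all":
        return list(range(1, max_n + 1))

    indices = set()
    # Accept both spaces and commas as separators (an assumption).
    for token in text.replace(",", " ").split():
        if "-" in token:
            # A range such as '4-6' includes both endpoints.
            start_s, end_s = token.split("-", 1)
            indices.update(range(int(start_s), int(end_s) + 1))
        else:
            indices.add(int(token))

    # Drop anything outside the valid range instead of raising.
    return sorted(i for i in indices if 1 <= i <= max_n)


mixed = parse_selection("1 4-6", 10)
everything = parse_selection("all", 3)
clipped = parse_selection("2,9-12", 10)
```

Returning a sorted, de-duplicated list keeps the result independent of the order in which the user typed the indices.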