Ploughshare download #127
felixhekhorn
left a comment
Please also run pre-commit.
Co-authored-by: Felix Hekhorn <felixhekhorn@users.noreply.github.com>
…inefarm into add_download_ploughshare
scarlehoff
left a comment
I agree with @felixhekhorn that the run should be its own run:
pinefarm/src/pinefarm/cli/run.py
Line 33 in 0490d3c
pinefarm will then call `run` and `generate_pineappl`. You can make `generate_pineappl` do nothing and put everything in a `postrun.sh` script, which will be run as part of the `postprocess` method:
pinefarm/src/pinefarm/external/interface.py
Line 165 in 0490d3c
which just runs `postrun.sh` and, importantly, burns `metadata.txt` into all grids, which is what we want in general.
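A minimal sketch of the pattern suggested above, assuming a runner object with `run`, `generate_pineappl` and `postprocess` methods. The class `ExternalSketch`, the helper `execute`, the placeholder file name and the call order are illustrative assumptions, not pinefarm's actual interface:

```python
import pathlib
import subprocess


class ExternalSketch:
    """Hypothetical stand-in for an external runner; not pinefarm's real interface."""

    def __init__(self, source: pathlib.Path, dest: pathlib.Path):
        self.source = source  # pinecard folder, may contain postrun.sh
        self.dest = dest  # output folder of this run

    def run(self):
        """Produce the raw grids (for Ploughshare: download and extract)."""
        # Placeholder so that the sketch runs offline.
        (self.dest / "grid.placeholder").write_text("raw grid\n")

    def generate_pineappl(self):
        """Intentionally a no-op: the conversion is delegated to postrun.sh."""

    def postprocess(self):
        """Run the pinecard's postrun.sh, if any, inside the output folder."""
        script = self.source / "postrun.sh"
        if script.exists():
            subprocess.run(["sh", str(script.resolve())], cwd=self.dest, check=True)
        # The real method also burns metadata.txt into the grids (omitted here).


def execute(runner: ExternalSketch) -> None:
    """Assumed call order, following the comment above."""
    runner.run()
    runner.generate_pineappl()
    runner.postprocess()
```

With this layout the download stays in `run`, and everything dataset-specific lives in the pinecard's `postrun.sh`.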
Now
scarlehoff
left a comment
Thanks!
I left a lot of nitpicks, but it seems to work. I tested CMS_2JET_8TEV_3D, although I could not get the grids because of `Error: you need to install pineappl with feature fastnlo`, which I won't do; but I trust it would work.
scarlehoff
left a comment
Thanks for the changes! Once @felixhekhorn has time for a second look, I think we can merge.
As we discussed (when @scarlehoff was not around), I'd first like to have all relevant pinecards in NNPDF/pinecards#197 and NNPDF/pinecards#192 available, so that we are sure we have all the tools and options we need in practice to discuss the different cases there (as the different datasets may need more or less structure).
Is this already the case? I see many runcards there. Is that all of them?
For single jets, I think there's one more runcard I should add. For dijets, I need to make the runcards compatible with the current shape of the Ploughshare module.
The pinecards have been updated so that they are compatible with this module now:
Are all pinecards included now (including the missing one that @achiefa sent the email about last week), and with the right scale choice? I'm looking for a confirmation, a self-certification of sorts, before merging; I won't check them one by one.
The scale choice is right. For dijets, we've got everything. For single jets, we don't have the following pinecards:
Did you add them (the grids)? I mean, I understand that the pinecards we want are the ones for the grids you have added; other grids are a different story.
Yes, all the grids that I was in charge of generating have been:
The pinecards currently contain duplicated information, which should have a single source of truth:
EDIT:
This is now done and tested. The ploughshare link from
Addresses #102 and NNPDF/pinecards#192.
This is the initial implementation of the class `Plough` that downloads the grids from ploughshare and converts them into pineappl format (or does whatever is necessary: cutting bins, renaming grids etc.)

Pinecard structure:

The folder would contain two files: `ploughshare_link.txt` and `process_grids.sh` (optionally `postrun.sh` and `metadata.txt`). Please have a look at the CMS_2JET_8TEV_3D example.

- `ploughshare_link.txt`: contains the link to the file that has to be downloaded
- `process_grids.sh`: responsible for conversion etc.

Class structure

- `Plough.run()`: downloads the (tarball) file and extracts it in the output folder
- `Plough.generate_pineappl()`: runs the `process_grids.sh` script inside the output folder (also tells it the grid names)

These methods are run when the class is initialised. If they were run conventionally (i.e. by `run.py`), then an external-pineappl comparison would have to be made at the pinefarm level - this is already done by `pineappl import`. Also, lines 194-200 of `run.py` assume that only one grid was generated.

The conversion happens inside `process_grids.sh` instead of `postrun.sh`, as `postrun.sh` assumes that only one grid was generated.
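
The two steps described above (download and extract, then run `process_grids.sh`) could look roughly like the sketch below. It uses only the standard library; the function names, the local tarball name `grids.tgz`, and passing the grid names as positional script arguments are assumptions made for illustration, not the code in this PR:

```python
import pathlib
import subprocess
import tarfile
import urllib.request


def download_and_extract(pinecard: pathlib.Path, dest: pathlib.Path) -> list[str]:
    """Fetch the tarball named in ploughshare_link.txt and unpack it into dest."""
    url = (pinecard / "ploughshare_link.txt").read_text().strip()
    dest.mkdir(parents=True, exist_ok=True)
    tarball = dest / "grids.tgz"  # hypothetical local file name
    with urllib.request.urlopen(url) as response:
        tarball.write_bytes(response.read())
    with tarfile.open(tarball) as archive:
        # Remember which regular files the archive provides: these are the grids.
        names = [member.name for member in archive.getmembers() if member.isfile()]
        archive.extractall(dest)
    return names


def process_grids(pinecard: pathlib.Path, dest: pathlib.Path, grid_names: list[str]) -> None:
    """Run process_grids.sh inside dest, handing it the grid names as arguments."""
    script = (pinecard / "process_grids.sh").resolve()
    # check=True turns a failing conversion script into a Python exception.
    subprocess.run(["sh", str(script), *grid_names], cwd=dest, check=True)
```

Because the script receives every grid name, it can loop over several grids, which is the case that `postrun.sh` does not cover.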