Skip to content

Ploughshare download#127

Open
andrpie wants to merge 16 commits into
mainfrom
add_download_ploughshare
Open

Ploughshare download#127
andrpie wants to merge 16 commits into
mainfrom
add_download_ploughshare

Conversation

@andrpie
Copy link
Copy Markdown

@andrpie andrpie commented Apr 14, 2026

Addresses #102 and NNPDF/pinecards#192.

This is the initial implementation of the class Plough that downloads the grids from ploughshare and converts them into pineappl format (or does whatever is necessary: cutting bins, renaming grids etc.)

Pinecard structure:

The folder would contain two files: ploughshare_link.txt and process_grids.sh (optionally postrun.sh and metadata.txt). Please have a look at the CMS_2JET_8TEV_3D example.
ploughshare_link.txt: contains the link to the file that has to be downloaded
process_grids.sh: responsible for conversion etc.

Class structure

Plough.run(): downloads the (tarball) file and extracts it in the output folder
Plough.generate_pineappl(): runs the process_grids.sh script inside the output folder (also tells it the grid names)

These methods are run when the class is initialised. If they were run conventionally (i.e. by run.py), then an external-pineappl comparison would have to be made at the pinefarm level - this is already done by pineappl import. Also, lines 194-200 of run.py assume that only one grid was generated.

The conversion happens inside process_grids.sh instead of postrun.sh, as postrun.sh only assumes that one grid was generated.

@andrpie andrpie requested review from achiefa and scarlehoff April 14, 2026 16:46
Copy link
Copy Markdown
Contributor

@felixhekhorn felixhekhorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also run pre-commit.

Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
andrpie and others added 9 commits April 17, 2026 10:13
Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @felixhekhorn the run should be its own run

def subcommand(pinecard, theory_path, pdf, dry, finalize=None):

pinefarm will then run run and generate_pineappl (you can make generate_pineappl do nothing and put everything in a postrun.sh script that will be run as part of the postprocess method:

def postprocess(self):

which just runs postrun.sh and, importantly, burns in the metadata.txt into all grids which we want in general.

Comment thread src/pinefarm/external/plough.py
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
@andrpie
Copy link
Copy Markdown
Author

andrpie commented Apr 28, 2026

Now Plough is compatible with postrun.sh and its run is not called in __init__. I've added a condition in run.py to make it work. The pinecard structure got updated accordingly. Thanks for the suggestions!

Comment thread src/pinefarm/cli/run.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Left a lot of nitpicks but seems to work (I tested CMS_2JET_8TEV_3D although I could not get grids because of Error: you need to install pineappl with feature fastnlo, which I won't; but I trust it would)

Comment thread src/pinefarm/external/interface.py Outdated
Comment thread src/pinefarm/external/interface.py Outdated
Comment thread src/pinefarm/external/interface.py Outdated
Comment thread src/pinefarm/external/plough.py
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! Once @felixhekhorn has time for a second look we can merge I think

@felixhekhorn
Copy link
Copy Markdown
Contributor

felixhekhorn commented May 11, 2026

As we discussed (when @scarlehoff was not around), I'd first like to have all relevant pinecards in NNPDF/pinecards#197 and NNPDF/pinecards#192 available, such that we are sure we have all tools and options we need in practice to discuss the different cases there (as the different datasets may need more or less structure)

@scarlehoff
Copy link
Copy Markdown
Member

Is this already the case? I see many runcards there. Is that all?

@andrpie
Copy link
Copy Markdown
Author

andrpie commented May 12, 2026

For single jets, I think there's one more runcard I should add. For dijets, I need to make the runcards compatible with the current shape of the Ploughshare module.

@andrpie
Copy link
Copy Markdown
Author

andrpie commented May 12, 2026

The pinecards have been updated so that they are compatible with this module now:
dijets (fastNLO grids): NNPDF/pinecards#192
single jets (APPLgrid): NNPDF/pinecards#197

@scarlehoff
Copy link
Copy Markdown
Member

Are all pinecards included now? (also for the one that @achiefa sent the email last week that was missing?) and for the right scale choice?

(Looking for a confirmation, a self-certification of sorts, before merging, I won't check them one by one)

@andrpie
Copy link
Copy Markdown
Author

andrpie commented May 19, 2026

The scale choice is right. For dijets, we've got everything. For single jets, we don't have the following pinecards:

  • CMS 13TeV
  • ATLAS 7TeV
  • ATLAS 13TeV
    Of course, I can add these. However, the need for them has not been flagged with me before.

@scarlehoff
Copy link
Copy Markdown
Member

scarlehoff commented May 19, 2026

Did you add them (the grids)? (I mean, I understand that the pinecards we want are the ones for the grids you have added, other grids are a different story)

@andrpie
Copy link
Copy Markdown
Author

andrpie commented May 19, 2026

Yes, all the grids that I was in charge of generating have been:

  • added to theories_slim
  • their pinecards are in the two PRs above

@felixhekhorn
Copy link
Copy Markdown
Contributor

felixhekhorn commented May 19, 2026

The pinecards currently contain a double information, which should only have a single source of truth:

  1. in metadata.txt there is a field ploughshare
  2. ploughshare_link.txt

EDIT:
Please keep only the ploughshare_link.txt and reconstruct the metadata link from there

@andrpie
Copy link
Copy Markdown
Author

andrpie commented May 19, 2026

Please keep only the ploughshare_link.txt and reconstruct the metadata link from there

This is now done and tested. The ploughshare link from metadata.txt is removed and it is now constructed at the interface level. The result is a key-value pair in a grid's metadata of the form:
ploughshare_link: https://ploughshare.web.cern.ch/ploughshare/record.php?group=applfast&dataset=applfast-atlas-incjets-appl-arxiv-1706.03192

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants