Mark datasets from Zenodo as static

When retrieving a remote file over HTTP, Snakemake uses the
"last-modified" property in the HTTP header as a proxy for the `mtime`
of the remote file. If this time is more recent than the `mtime` of
the output of the retrieve rule, the rule is triggered and the remote
file is retrieved again (since it was apparently updated).

However, Zenodo periodically updates the "last-modified" property of
records retrieved over HTTP even if those records have not been
updated. This causes Snakemake to falsely assume that the records have
to be downloaded again.

By setting `static=True` for datasets we know don't actually change,
we avoid this problem.
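
The decision described above can be sketched roughly as follows. This
is only an illustration of the comparison, not Snakemake's actual
implementation; the function name `needs_retrieval` and its parameters
are hypothetical.

```python
from email.utils import parsedate_to_datetime


def needs_retrieval(last_modified_header, local_mtime, static=False):
    """Hypothetical helper: decide whether a remote file looks newer.

    last_modified_header: value of the HTTP "Last-Modified" header.
    local_mtime: mtime of the local output file (seconds since epoch).
    static: if True, the remote file is assumed never to change.
    """
    if static:
        # Static inputs are never treated as updated, so a spurious
        # change of "Last-Modified" cannot trigger a new download.
        return False
    # Parse the HTTP date and use it as a proxy for the remote mtime.
    remote_mtime = parsedate_to_datetime(last_modified_header).timestamp()
    return remote_mtime > local_mtime
```

With `static=False`, a bumped "Last-Modified" header alone is enough to
make the file look outdated; with `static=True` it is ignored.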
Koen van Greevenbroek 2021-09-15 11:44:49 +02:00
parent b88322587f
commit 6485b98973

@@ -153,7 +153,7 @@ if config['enable'].get('build_cutout', False):
 if config['enable'].get('retrieve_cutout', True):
     rule retrieve_cutout:
-        input: HTTP.remote("zenodo.org/record/4709858/files/{cutout}.nc", keep_local=True)
+        input: HTTP.remote("zenodo.org/record/4709858/files/{cutout}.nc", keep_local=True, static=True)
         output: "cutouts/{cutout}.nc"
         shell: "mv {input} {output}"
@@ -170,7 +170,7 @@ if config['enable'].get('build_natura_raster', False):
 if config['enable'].get('retrieve_natura_raster', True):
     rule retrieve_natura_raster:
-        input: HTTP.remote("zenodo.org/record/4706686/files/natura.tiff", keep_local=True)
+        input: HTTP.remote("zenodo.org/record/4706686/files/natura.tiff", keep_local=True, static=True)
         output: "resources/natura.tiff"
         shell: "mv {input} {output}"