Commit 68998af

fixing NDArray save method to use its own compression parameters
The 'NDArray.save' method does not pass 'cparams' to the parent class 'copy'; consequently the array is recompressed with the default 'cparams'. To observe the bug, run the script below.

=================================================================
import os

import blosc2
import numpy as np

a = np.arange(10_000_000)
a = a * a

# Non-default compression parameters, so a fallback to defaults is visible
cparams = blosc2.CParams(
    codec=blosc2.Codec.ZSTD,
    clevel=9,
    filters=[blosc2.Filter.BITSHUFFLE],
)
ba = blosc2.asarray(a, cparams=cparams)
print(f"Blosc2 memory: size {ba.cbytes}\t cratio: {ba.cratio}")

outdir = "cache/"
prefix = "save"
fname = outdir + prefix + ".b2nd"
os.makedirs(outdir, exist_ok=True)  # make sure the output directory exists
ba.save(fname, mode="w")

# Compare the on-disk size with the in-memory compressed size
fsize = os.path.getsize(fname)
print(f"Blosc2 array save:\t saved file size = {fsize}\t cratio: {a.nbytes/fsize}")
=================================================================

You should see

~~~~~
Blosc2 memory: size 4370284	 cratio: 18.74477722729232
Blosc2 array save:	 saved file size = 12370369	 cratio: 6.467066584675041
~~~~~

ba.cbytes ---> 4370284
ba.cratio ---> 18.3

I.e. the array in memory has 'cbytes=4370284', whereas the saved file has 12370369 bytes, which gives a 'cratio' of about 6.5. Also, if the array is loaded back from the file, it is easy to see that its 'cparams' differ from those of the original array: they have been reset to the defaults.

After the patch, the saved file size closely matches the in-memory 'cbytes', and the saved 'cparams' match the original ones:

~~~~~
Blosc2 memory: size 4370284	 cratio: 18.74477722729232
Blosc2 array save:	 saved file size = 4370051	 cratio: 18.30642251085856
~~~~~
1 parent eee6997 commit 68998af

1 file changed

Lines changed: 1 addition & 1 deletion

src/blosc2/ndarray.py

@@ -4703,7 +4703,7 @@ def save(self, urlpath: str, contiguous=True, **kwargs: Any) -> None:
         # Add the contiguous parameter
         kwargs["contiguous"] = contiguous
 
-        super().copy(self.dtype, **kwargs)
+        super().copy(self.dtype, cparams=asdict(self.cparams), **kwargs)
 
     def resize(self, newshape: tuple | list) -> None:
         """Change the shape of the array by growing or shrinking one or more dimensions.
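
The mechanism behind the one-line change can be sketched with a toy model. This is not blosc2 code: 'ToyCParams', 'ToyBase', 'ToyArray', 'save_before' and 'save_after' are hypothetical names invented for illustration, and the only assumption carried over from the diff is that a parent 'copy' falls back to default parameters unless 'cparams' is passed explicitly as a dict.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ToyCParams:
    """Hypothetical stand-in for compression parameters (not blosc2.CParams)."""

    codec: str = "default-codec"
    clevel: int = 1


class ToyBase:
    """Hypothetical parent class whose copy() uses defaults unless told otherwise."""

    def __init__(self, cparams: Optional[ToyCParams] = None) -> None:
        self.cparams = cparams if cparams is not None else ToyCParams()

    def copy(self, **kwargs: Any) -> "ToyBase":
        # If the caller does not forward cparams, the copy gets the defaults.
        cparams = kwargs.get("cparams")
        return ToyBase(ToyCParams(**cparams) if cparams is not None else None)


class ToyArray(ToyBase):
    def save_before(self, **kwargs: Any) -> ToyBase:
        # Pre-patch shape of the call: the array's own cparams are dropped.
        return super().copy(**kwargs)

    def save_after(self, **kwargs: Any) -> ToyBase:
        # Post-patch shape of the call: own cparams are forwarded as a dict.
        return super().copy(cparams=asdict(self.cparams), **kwargs)


arr = ToyArray(ToyCParams(codec="zstd", clevel=9))
before = arr.save_before()  # ends up with ToyCParams() defaults
after = arr.save_after()    # keeps codec="zstd", clevel=9
print(before.cparams, after.cparams)
```

One property of this call shape, in the toy and in Python generally: if a caller also supplies 'cparams' through '**kwargs', the call raises a TypeError for a duplicated keyword argument. Whether the real 'save' guards against that is not visible in this diff.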
