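I had the same problem and was disappointed when I read Sven's reply. If you can't keep a huge array in a file and work on only small pieces of it at a time, numpy would seem to be missing some key functionality. Your case looks close to one of the use cases in the original rationale for the .npy format (see: http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt).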
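Then I ran across numpy.lib.format, which seems to be exactly the useful stuff. I have no idea why this functionality isn't available from the root numpy package. The key advantage over HDF5 is that it ships with numpy.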
>>> import numpy.lib.format
>>> print(numpy.lib.format.open_memmap.__doc__)
"""
Open a .npy file as a memory-mapped array.

This may be used to read an existing file or create a new one.

Parameters
----------
filename : str
    The name of the file on disk. This may not be a filelike object.
mode : str, optional
    The mode to open the file with. In addition to the standard file modes,
    'c' is also accepted to mean "copy on write". See `numpy.memmap` for
    the available mode strings.
dtype : dtype, optional
    The data type of the array if we are creating a new file in "write"
    mode.
shape : tuple of int, optional
    The shape of the array if we are creating a new file in "write"
    mode.
fortran_order : bool, optional
    Whether the array should be Fortran-contiguous (True) or
    C-contiguous (False) if we are creating a new file in "write" mode.
version : tuple of int (major, minor)
    If the mode is a "write" mode, then this is the version of the file
    format used to create the file.

Returns
-------
marray : numpy.memmap
    The memory-mapped array.

Raises
------
ValueError
    If the data or the mode is invalid.
IOError
    If the file is not found or cannot be opened correctly.

See Also
--------
numpy.memmap
"""
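
To make the docstring concrete, here is a minimal sketch of the "huge array on file, small pieces at a time" workflow using open_memmap. The file name, array shape, and chunk size are made up for illustration, and the array is kept tiny so the snippet runs quickly; a real case would use a much larger shape.

```python
import os
import tempfile

import numpy as np
from numpy.lib.format import open_memmap

# Hypothetical path and sizes, chosen only for illustration.
path = os.path.join(tempfile.mkdtemp(), "big_array.npy")
n_rows, n_cols, chunk = 1000, 8, 100

# "w+" creates the .npy file on disk with the given dtype and shape.
out = open_memmap(path, mode="w+", dtype=np.float64, shape=(n_rows, n_cols))
for start in range(0, n_rows, chunk):
    stop = min(start + chunk, n_rows)
    # Write one block of rows at a time; each row holds its own row index.
    out[start:stop] = np.arange(start, stop, dtype=np.float64)[:, None]
out.flush()  # push pending changes to the file
del out      # drop the reference to the writable mapping

# Reopen read-only; dtype and shape come from the .npy header.
arr = open_memmap(path, mode="r")
total = 0.0
for start in range(0, n_rows, chunk):
    # Only the requested slice is read from disk.
    total += float(arr[start:start + chunk].sum())
```

For read-only access to an existing file, np.load(path, mmap_mode="r") should give an equivalent memory-mapped view without reaching into numpy.lib.format; open_memmap is mainly needed when creating a new file piece by piece.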