elektronn3.data.sources module
Code related to data sources (HDF5 etc.)
- class elektronn3.data.sources.HDF5DataSource(fname, key, in_memory=False)
Bases: DataSource
An h5py.Dataset wrapper for safe multiprocessing. It opens the file and the dataset on each read or property access and closes them again immediately afterwards.
This works around the data corruption described in https://github.com/pytorch/pytorch/issues/11929 and related issues.
Because no file handles are open when the worker processes are forked, the concurrency issues with HDF5’s global state do not arise.
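The open-on-every-access pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation of HDF5DataSource: the class name LazyHDF5Source is hypothetical, the in_memory option is left out, and only indexing plus the shape and dtype properties are covered.

```python
import h5py
import numpy as np


class LazyHDF5Source:
    """Hypothetical stand-in that only illustrates the pattern."""

    def __init__(self, fname, key):
        # Store only the path and the dataset key; no file handle is kept,
        # so nothing HDF5-related is open when worker processes are forked.
        self.fname = fname
        self.key = key

    def __getitem__(self, idx):
        # Open, read the requested region into a NumPy array, close.
        with h5py.File(self.fname, "r") as f:
            return f[self.key][idx]

    @property
    def shape(self):
        with h5py.File(self.fname, "r") as f:
            return f[self.key].shape

    @property
    def dtype(self):
        with h5py.File(self.fname, "r") as f:
            return np.dtype(f[self.key].dtype)
```

The trade-off in this sketch is that every access pays the cost of opening the file, in exchange for never sharing an open handle between processes.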
- elektronn3.data.sources.slice_3d(src, coords_lo, coords_hi, dtype=<class 'numpy.float32'>, prepend_empty_axis=False, check_bounds=True)
Slice a patch of 3D image data out of a data source.
- Parameters:
src (DataSource) – Source data set from which to read data. The expected data shapes are (C, D, H, W) or (D, H, W).
coords_lo (Sequence[int]) – Lower bound of the coordinates where data should be read from in src.
coords_hi (Sequence[int]) – Upper bound of the coordinates where data should be read from in src.
dtype (type) – NumPy dtype that the sliced array will be cast to if it doesn’t already have this dtype.
prepend_empty_axis (bool) – Prepends a new empty (1-sized) axis to the sliced array before returning it.
check_bounds – If True (default), only indices that are within the bounds of src will be allowed (no negative indices or slices to indices that exceed the shape of src, which would normally just be ignored).
- Return type:
ndarray
- Returns:
Sliced image array.
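To make the parameters concrete, here is a small sketch of the documented behaviour, run on a plain NumPy array instead of a DataSource. The function slice_3d_sketch is a hypothetical re-implementation written from the description above, not the library's code. In particular, raising ValueError on a bounds violation, slicing all channels of a (C, D, H, W) source, and treating the upper bound as exclusive are assumptions.

```python
import numpy as np


def slice_3d_sketch(src, coords_lo, coords_hi, dtype=np.float32,
                    prepend_empty_axis=False, check_bounds=True):
    """Hypothetical illustration of the documented slice_3d behaviour."""
    if src.ndim not in (3, 4):
        raise ValueError("expected shape (C, D, H, W) or (D, H, W)")
    if check_bounds:
        # The last three axes are the spatial ones (D, H, W).
        for lo, hi, size in zip(coords_lo, coords_hi, src.shape[-3:]):
            if lo < 0 or hi > size:
                # Assumption: the exception type is chosen for this sketch.
                raise ValueError(f"[{lo}, {hi}) exceeds axis of size {size}")
    region = tuple(slice(lo, hi) for lo, hi in zip(coords_lo, coords_hi))
    if src.ndim == 4:
        # Assumption: keep every channel of a (C, D, H, W) source.
        region = (slice(None),) + region
    # Casts only if the data does not already have the requested dtype.
    patch = np.asarray(src[region], dtype=dtype)
    if prepend_empty_axis:
        patch = patch[None]
    return patch


volume = np.arange(4 * 5 * 6, dtype=np.uint8).reshape(4, 5, 6)
patch = slice_3d_sketch(volume, (1, 1, 1), (3, 4, 5), prepend_empty_axis=True)
print(patch.shape, patch.dtype)  # (1, 2, 3, 4) float32
```

With check_bounds=False, an upper bound beyond the array shape is silently clipped by ordinary NumPy slicing, which is the "normally just ignored" case that the bounds check guards against.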