Compute
With zero-copy read and write access to data structures, this section presents how to compute in pyAMReX.
Since the pyAMReX data containers are wrappers to the C++ AMReX objects, it is worth reading:
As a very short, simplified overview, this narrows down to:
AMReX decomposes index space into rectangular, block-structured, regular grids,
block are often intentionally slightly over-decomposed, there is >= one block per compute unit (CPU core or GPU),
particles are chunked/tiled and usually decomposed like the field blocks,
refinement levels are represented as (potentially sparse) levels of blocks.
Computations are thus performed (mostly) on whole blocks, which enables to use compute advanced acceleration techniques on CPUs or GPUs.
Fields
The most common data structure to interact with is a MultiFab, which is a collection of boxes with associated field data. The field data can have more than one component (in the slowest varying index), but all components have the same staggering/centering.
This is how to iterate and potentially compute for all blocks assigned to a local process in pyAMReX:
# finest active MR level, get from a
# simulation's AmrMesh object, e.g.:
# finest_level = sim.finest_level
finest_level = 0 # no MR
# iterate over mesh-refinement levels
for lev in range(finest_level + 1):
# get an existing MultiFab, e.g.,
# from a simulation:
# mfab = sim.get_field(lev=lev)
# Config = sim.extension.Config
field_list = mfab.to_xp()
for field in field_list:
field[()] = 42.0
# finest active MR level, get from a
# simulation's AmrMesh object, e.g.:
# finest_level = sim.finest_level
finest_level = 0 # no MR
# iterate over mesh-refinement levels
for lev in range(finest_level + 1):
# get an existing MultiFab, e.g.,
# from a simulation:
# mfab = sim.get_field(lev=lev)
# Config = sim.extension.Config
# grow (aka guard/ghost/halo) regions
ngv = mfab.n_grow_vect
# get every local block of the field
for mfi in mfab:
# global index box w/ guards
bx = mfi.tilebox().grow(ngv)
print(bx)
# numpy/cupy representation: non-
# copying view, w/ guard/ghost
field = mfab.array(mfi).to_xp()
# notes on indexing in field:
# - numpy uses locally zero-based indexing
# - layout is F_CONTIGUOUS by default, just like AMReX
field[()] = 42.0
# finest active MR level, get from a
# simulation's AmrMesh object, e.g.:
# finest_level = sim.finest_level
finest_level = 0 # no MR
# iterate over mesh-refinement levels
for lev in range(finest_level + 1):
# get an existing MultiFab, e.g.,
# from a simulation:
# mfab = sim.get_field(lev=lev)
# Config = sim.extension.Config
# Using global indexing
# Set all valid cells (and internal ghost cells)
mfab[...] = 42.0
# Set a range of cells. Indices are in Fortran order.
# First dimension, sets from first lower guard cell to first upper guard cell.
# - Imaginary indices refer to the guard cells, negative lower, positive upper.
# Second dimension, sets all valid cells.
# Third dimension, sets all valid and ghost cells
# - The empty tuple is used to specify the range to include all valid and ghost cells.
# Components dimension, sets first component.
mfab[-1j:2j, :, (), 0] = 42.0
# Get a range of cells
# Get the data along the valid cells in the first dimension (gathering data across blocks
# and processors), at the first upper guard cell in the second dimensionn, and cell 2 of
# the third (with 2 being relative to 0 which is the lower end of the valid cells of the full domain).
# Note that in an MPI context, this is a global operation, so caution is required when
# scaling to large numbers of processors.
if mfab.n_grow_vect.max > 0:
mfslice = mfab[:, 1j, 2]
# The assignment is to the last valid cell of the second dimension.
mfab[:, -1, 2] = 2 * mfslice
For a complete physics example that uses CPU/GPU agnostic Python code for computation on fields, see:
For many small CPU and GPU examples on how to compute on fields, see the following test cases:
Examples in
test_multifab.py
# -*- coding: utf-8 -*- import math import numpy as np import pytest import amrex.space3d as amr def test_mfab_numpy(mfab): """Used in docs/source/usage/compute.rst""" # Manual: Compute Mfab Detailed START # finest active MR level, get from a # simulation's AmrMesh object, e.g.: # finest_level = sim.finest_level finest_level = 0 # no MR # iterate over mesh-refinement levels for lev in range(finest_level + 1): # get an existing MultiFab, e.g., # from a simulation: # mfab = sim.get_field(lev=lev) # Config = sim.extension.Config # grow (aka guard/ghost/halo) regions ngv = mfab.n_grow_vect # get every local block of the field for mfi in mfab: # global index box w/ guards bx = mfi.tilebox().grow(ngv) print(bx) # numpy/cupy representation: non- # copying view, w/ guard/ghost field = mfab.array(mfi).to_xp() # notes on indexing in field: # - numpy uses locally zero-based indexing # - layout is F_CONTIGUOUS by default, just like AMReX field[()] = 42.0 # Manual: Compute Mfab Detailed END # Manual: Compute Mfab Simple START # finest active MR level, get from a # simulation's AmrMesh object, e.g.: # finest_level = sim.finest_level finest_level = 0 # no MR # iterate over mesh-refinement levels for lev in range(finest_level + 1): # get an existing MultiFab, e.g., # from a simulation: # mfab = sim.get_field(lev=lev) # Config = sim.extension.Config field_list = mfab.to_xp() for field in field_list: field[()] = 42.0 # Manual: Compute Mfab Simple END # Manual: Compute Mfab Global START # finest active MR level, get from a # simulation's AmrMesh object, e.g.: # finest_level = sim.finest_level finest_level = 0 # no MR # iterate over mesh-refinement levels for lev in range(finest_level + 1): # get an existing MultiFab, e.g., # from a simulation: # mfab = sim.get_field(lev=lev) # Config = sim.extension.Config # Using global indexing # Set all valid cells (and internal ghost cells) mfab[...] = 42.0 # Set a range of cells. Indices are in Fortran order. # First dimension, sets from first lower guard cell to first upper guard cell. # - Imaginary indices refer to the guard cells, negative lower, positive upper. # Second dimension, sets all valid cells. # Third dimension, sets all valid and ghost cells # - The empty tuple is used to specify the range to include all valid and ghost cells. # Components dimension, sets first component. mfab[-1j:2j, :, (), 0] = 42.0 # Get a range of cells # Get the data along the valid cells in the first dimension (gathering data across blocks # and processors), at the first upper guard cell in the second dimensionn, and cell 2 of # the third (with 2 being relative to 0 which is the lower end of the valid cells of the full domain). # Note that in an MPI context, this is a global operation, so caution is required when # scaling to large numbers of processors. if mfab.n_grow_vect.max > 0: mfslice = mfab[:, 1j, 2] # The assignment is to the last valid cell of the second dimension. mfab[:, -1, 2] = 2 * mfslice # Manual: Compute Mfab Global END @pytest.mark.skipif(amr.Config.have_gpu, reason="This test only runs on CPU") def test_mfab_loop_slow(mfab): ngv = mfab.n_grow_vect print(f"\n mfab={mfab}, mfab.n_grow_vect={ngv}") for mfi in mfab: bx = mfi.tilebox().grow(ngv) marr = mfab.array(mfi) # print(mfab) # print(mfab.num_comp) # print(mfab.size) # print(marr.size) # print(marr.nComp) # index by index assignment # notes: # - this is AMReX Array4, F-order indices # - even though we iterate by fastest varying index, # such loops are naturally very slow in Python three_comps = mfab.num_comp == 3 if three_comps: for i, j, k in bx: # print(i,j,k) marr[i, j, k, 0] = 10.0 * i marr[i, j, k, 1] = 10.0 * j marr[i, j, k, 2] = 10.0 * k else: for i, j, k in bx: # print(i,j,k) marr[i, j, k] = 10.0 * i # numpy representation: non-copying view, zero-indexed, # includes the guard/ghost region marr_np = marr.to_numpy() # check the values at start/end are the same: first component assert marr_np[0, 0, 0, 0] == marr[bx.small_end] assert marr_np[-1, -1, -1, 0] == marr[bx.big_end] # same check, but for all components for n in range(mfab.num_comp): small_end_comp = list(bx.small_end) + [n] big_end_comp = list(bx.big_end) + [n] assert marr_np[0, 0, 0, n] == marr[small_end_comp] assert marr_np[-1, -1, -1, n] == marr[big_end_comp] # all components and all indices set at once to 42 marr_np[()] = 42.0 # values in start & end still match? assert marr_np[0, 0, 0, 0] == marr[bx.small_end] assert marr_np[-1, -1, -1, -1] == marr[bx.big_end] # all values for all indices match between multifab & numpy view? for n in range(mfab.num_comp): for i, j, k in bx: assert marr[i, j, k, n] == 42.0 def test_mfab_loop(mfab): ngv = mfab.n_grow_vect print(f"\n mfab={mfab}, mfab.n_grow_vect={ngv}") for mfi in mfab: bx = mfi.tilebox().grow(ngv) marr = mfab.array(mfi) # note: offset from index space in numpy # in numpy, we start indices from zero, not small_end # numpy/cupy representation: non-copying view, including the # guard/ghost region marr_xp = marr.to_xp() marr_xp[()] = ( 10.0 # TODO: fill with index value or so as in test_mfab_loop_slow ) def iv2s(iv, comp): return tuple(iv) + (comp,) # check the values at start/end are the same: first component for n in range(mfab.num_comp): assert marr_xp[0, 0, 0, n] == 10.0 assert marr_xp[-1, -1, -1, n] == marr_xp[iv2s(bx.big_end - bx.small_end, n)] # now we do some faster assignments, using range based access # This should fail as out-of-bounds, but does not. # Does NumPy/CuPy not check array access for non-owned views? # marr_xp[24:200, :, :, :] = 42. # all components and all indices set at once to 42 marr_xp[()] = 42.0 # values in start & end still match? for n in range(mfab.num_comp): assert marr_xp[0, 0, 0, n] == 42.0 assert marr_xp[-1, -1, -1, n] == marr_xp[iv2s(bx.big_end - bx.small_end, n)] def test_mfab_simple(mfab): assert mfab.is_all_cell_centered # assert(all(not mfab.is_nodal(i) for i in [-1, 0, 1, 2])) # -1?? assert all(not mfab.is_nodal(i) for i in [0, 1, 2]) for i in range(mfab.num_comp): mfab.set_val(-10 * (i + 1), i, 1) mfab.abs(0, mfab.num_comp) for i in range(mfab.num_comp): assert mfab.max(i) == (10 * (i + 1)) # Assert: None == 10 for i=0 assert mfab.min(i) == (10 * (i + 1)) mfab.plus(20.0, 0, mfab.num_comp) for i in range(mfab.num_comp): np.testing.assert_allclose(mfab.max(i), 20.0 + (10 * (i + 1))) np.testing.assert_allclose(mfab.min(i), 20.0 + (10 * (i + 1))) mfab.mult(10.0, 0, mfab.num_comp) for i in range(mfab.num_comp): np.testing.assert_allclose(mfab.max(i), 10.0 * (20.0 + (10 * (i + 1)))) np.testing.assert_allclose(mfab.min(i), 10.0 * (20.0 + (10 * (i + 1)))) mfab.invert(10.0, 0, mfab.num_comp) for i in range(mfab.num_comp): np.testing.assert_allclose(mfab.max(i), 1.0 / (20.0 + (10 * (i + 1)))) np.testing.assert_allclose(mfab.min(i), 1.0 / (20.0 + (10 * (i + 1)))) @pytest.mark.parametrize("nghost", [0, 1]) def test_mfab_ops(boxarr, distmap, nghost): src = amr.MultiFab(boxarr, distmap, 3, nghost) dst = amr.MultiFab(boxarr, distmap, 1, nghost) src.set_val(10.0, 0, 1) src.set_val(20.0, 1, 1) src.set_val(30.0, 2, 1) dst.set_val(0.0, 0, 1) # dst.add(src, 2, 0, 1, nghost) # dst.subtract(src, 1, 0, 1, nghost) # dst.multiply(src, 0, 0, 1, nghost) # dst.divide(src, 1, 0, 1, nghost) dst.add(dst, src, 2, 0, 1, nghost) dst.subtract(dst, src, 1, 0, 1, nghost) dst.multiply(dst, src, 0, 0, 1, nghost) dst.divide(dst, src, 1, 0, 1, nghost) print(dst.min(0)) np.testing.assert_allclose(dst.min(0), 5.0) np.testing.assert_allclose(dst.max(0), 5.0) # dst.xpay(2.0, src, 0, 0, 1, nghost) # dst.saxpy(2.0, src, 1, 0, 1, nghost) dst.xpay(dst, 2.0, src, 0, 0, 1, nghost) dst.saxpy(dst, 2.0, src, 1, 0, 1, nghost) np.testing.assert_allclose(dst.min(0), 60.0) np.testing.assert_allclose(dst.max(0), 60.0) # dst.lin_comb(6.0, src, 1, # 1.0, src, 2, 0, 1, nghost) dst.lin_comb(dst, 6.0, src, 1, 1.0, src, 2, 0, 1, nghost) np.testing.assert_allclose(dst.min(0), 150.0) np.testing.assert_allclose(dst.max(0), 150.0) def test_mfab_mfiter(mfab): assert iter(mfab).is_valid assert iter(mfab).length == 8 cnt = 0 for _mfi in mfab: cnt += 1 assert iter(mfab).length == cnt @pytest.mark.skipif( amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA" ) def test_mfab_ops_cuda_numba(mfab_device): # https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html from numba import cuda ngv = mfab_device.n_grow_vect # assign 3: define kernel @cuda.jit def set_to_three(array): i, j, k = cuda.grid(3) if i < array.shape[0] and j < array.shape[1] and k < array.shape[2]: array[i, j, k] = 3.0 # assign 3: loop through boxes and launch kernels for mfi in mfab_device: bx = mfi.tilebox().grow(ngv) # noqa marr = mfab_device.array(mfi) marr_numba = cuda.as_cuda_array(marr) # kernel launch threadsperblock = (4, 4, 4) blockspergrid = tuple( [math.ceil(s / b) for s, b in zip(marr_numba.shape, threadsperblock)] ) set_to_three[blockspergrid, threadsperblock](marr_numba) # Check results shape = 32**3 * 8 sum_threes = mfab_device.sum_unique(comp=0, local=False) assert sum_threes == shape * 3 @pytest.mark.skipif( amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA" ) def test_mfab_ops_cuda_cupy(mfab_device): # https://docs.cupy.dev/en/stable/user_guide/interoperability.html import cupy as cp import cupyx.profiler # AMReX -> cupy ngv = mfab_device.n_grow_vect print(f"\n mfab_device={mfab_device}, mfab_device.n_grow_vect={ngv}") # assign 3 with cupyx.profiler.time_range("assign 3 [()]", color_id=0): for mfi in mfab_device: bx = mfi.tilebox().grow(ngv) # noqa marr_cupy = mfab_device.array(mfi).to_cupy(order="C") # print(marr_cupy.shape) # 1, 32, 32, 32 # print(marr_cupy.dtype) # float64 # performance: # https://github.com/AMReX-Codes/pyamrex/issues/55#issuecomment-1579610074 # write and read into the marr_cupy marr_cupy[()] = 3.0 # verify result with a .sum_unique with cupyx.profiler.time_range("verify 3", color_id=0): shape = 32**3 * 8 # print(mfab_device.shape) sum_threes = mfab_device.sum_unique(comp=0, local=False) assert sum_threes == shape * 3 # assign 2 with cupyx.profiler.time_range("assign 2 (set_val)", color_id=1): mfab_device.set_val(2.0) with cupyx.profiler.time_range("verify 2", color_id=1): sum_twos = mfab_device.sum_unique(comp=0, local=False) assert sum_twos == shape * 2 # assign 5 with cupyx.profiler.time_range("assign 5 (ones-like)", color_id=2): def set_to_five(mm): xp = cp.get_array_module(mm) assert xp.__name__ == "cupy" mm = xp.ones_like(mm) * 10.0 mm /= 2.0 return mm for mfi in mfab_device: bx = mfi.tilebox().grow(ngv) # noqa marr_cupy = mfab_device.array(mfi).to_cupy(order="F") # print(marr_cupy.shape) # 32, 32, 32, 1 # print(marr_cupy.dtype) # float64 # performance: # https://github.com/AMReX-Codes/pyamrex/issues/55#issuecomment-1579610074 # write and read into the marr_cupy fives_cp = set_to_five(marr_cupy) marr_cupy[()] = 0.0 marr_cupy += fives_cp # verify with cupyx.profiler.time_range("verify 5", color_id=2): sum = mfab_device.sum_unique(comp=0, local=False) assert sum == shape * 5 # assign 7 with cupyx.profiler.time_range("assign 7 (fuse)", color_id=3): @cp.fuse(kernel_name="set_to_seven") def set_to_seven(x): x[...] = 7.0 for mfi in mfab_device: bx = mfi.tilebox().grow(ngv) # noqa marr_cupy = mfab_device.array(mfi).to_cupy(order="C") # write and read into the marr_cupy set_to_seven(marr_cupy) # verify with cupyx.profiler.time_range("verify 7", color_id=3): sum = mfab_device.sum_unique(comp=0, local=False) assert sum == shape * 7 # TODO: @jit.rawkernel() @pytest.mark.skipif( amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA" ) def test_mfab_ops_cuda_pytorch(mfab_device): # https://docs.cupy.dev/en/stable/user_guide/interoperability.html#pytorch import torch # assign 3: loop through boxes and launch kernel for mfi in mfab_device: marr = mfab_device.array(mfi) marr_torch = torch.as_tensor(marr, device="cuda") marr_torch[:, :, :] = 3 # Check results shape = 32**3 * 8 sum_threes = mfab_device.sum_unique(comp=0, local=False) assert sum_threes == shape * 3 @pytest.mark.skipif( amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA" ) def test_mfab_ops_cuda_cuml(mfab_device): pass # https://github.com/rapidsai/cuml # https://github.com/rapidsai/cudf # maybe better for particles as a dataframe test # import cudf # import cuml # AMReX -> RAPIDSAI cuML # arr_cuml = ... # assert(arr_cuml.__cuda_array_interface__['data'][0] == arr.__cuda_array_interface__['data'][0]) # TODO @pytest.mark.skipif( amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA" ) def test_mfab_dtoh_copy(mfab_device): class MfabPinnedContextManager: def __enter__(self): self.mfab = amr.MultiFab( mfab_device.box_array(), mfab_device.dm(), mfab_device.n_comp, mfab_device.n_grow_vect, amr.MFInfo().set_arena(amr.The_Pinned_Arena()), ) return self.mfab def __exit__(self, exc_type, exc_value, traceback): self.mfab.clear() del self.mfab with MfabPinnedContextManager() as mfab_host: mfab_host.set_val(42.0) amr.dtoh_memcpy(mfab_host, mfab_device) # assert all are 0.0 on host host_min = mfab_host.min(0) host_max = mfab_host.max(0) assert host_min == host_max assert host_max == 0.0 dev_val = 11.0 mfab_host.set_val(dev_val) amr.htod_memcpy(mfab_device, mfab_host) # assert all are 11.0 on device for n in range(mfab_device.n_comp): assert mfab_device.min(comp=n) == dev_val assert mfab_device.max(comp=n) == dev_val # numpy bindings (w/ copy) local_boxes_host = mfab_device.to_numpy(copy=True) assert max([np.max(box) for box in local_boxes_host]) == dev_val del local_boxes_host # numpy bindings (w/ copy) for mfi in mfab_device: marr = mfab_device.array(mfi).to_numpy(copy=True) assert np.min(marr) >= dev_val assert np.max(marr) <= dev_val # cupy bindings (w/o copy) import cupy as cp local_boxes_device = mfab_device.to_cupy() assert max([cp.max(box) for box in local_boxes_device]) == dev_val def test_mfab_copy(mfab): # write to mfab mfab.set_val(42.0) for i in range(mfab.num_comp): np.testing.assert_allclose(mfab.max(i), 42.0) # copy new_mfab = mfab.copy() # write to old mfab mfab.set_val(1.0) for i in range(mfab.num_comp): np.testing.assert_allclose(mfab.max(i), 1.0) # check new mfab is the original data for i in range(new_mfab.num_comp): np.testing.assert_allclose(new_mfab.max(i), 42.0)
Particles
AMReX Particles are stored in the ParticleContainer class.
There are a few small differences to the iteration over a ParticleContainer compared to a MultiFab
:
ParticleContainer
is aware of mesh-refinement levels,AMReX supports a variety of data layouts for particles, the modern pure SoA + runtime attribute layout and the legacy AoS + SoA + runtime SoA attributes layout.
Here is the general structure for computing on particles:
# code-specific getter function, e.g.:
# pc = sim.get_particles()
# Config = sim.extension.Config
# local particles on all levels
df = pc.to_df() # this is a copy!
print(df)
# read
print(df["x"])
# write (into copy!)
df["x"] = 0.30
df["y"] = 0.35
df["z"] = 0.40
df["a"] = df["x"] ** 2
df["b"] = df["x"] + df["y"]
df["c"] = 0.50
# int attributes
# df["i1"] = 12
# df["i2"] = 12
# ...
print(df)
# code-specific getter function, e.g.:
# pc = sim.get_particles()
# Config = sim.extension.Config
# iterate over mesh-refinement levels
for lvl in range(pc.finest_level + 1):
# loop local tiles of particles
for pti in pc.iterator(pc, level=lvl):
# compile-time and runtime attributes
soa = pti.soa().to_xp()
# print all particle ids in the chunk
print("idcpu =", soa.idcpu)
x = soa.real["x"]
y = soa.real["y"]
# write to all particles in the chunk
# note: careful, if you change particle positions, you might need to
# redistribute particles before continuing the simulation step
soa.real["x"][:] = 0.30
soa.real["y"][:] = 0.35
soa.real["z"][:] = 0.40
soa.real["a"][:] = x[:] ** 2
soa.real["b"][:] = x[:] + y[:]
soa.real["c"][:] = 0.50
# ...
# all int attributes
for soa_int in soa.int.values():
soa_int[:] = 12
# code-specific getter function, e.g.:
# pc = sim.get_particles()
# Config = sim.extension.Config
# iterate over mesh-refinement levels
for lvl in range(pc.finest_level + 1):
# loop local tiles of particles
for pti in pc.iterator(pc, level=lvl):
# default layout: AoS with positions and idcpu
# note: not part of the new PureSoA particle container layout
aos = (
pti.aos().to_numpy(copy=True)
if Config.have_gpu
else pti.aos().to_numpy()
)
# additional compile-time and runtime attributes in SoA format
soa = pti.soa().to_xp()
# notes:
# Only the next lines are the "HOT LOOP" of the computation.
# For efficiency, use numpy array operation for speed on CPUs.
# For GPUs use .to_cupy() above and compute with cupy or numba.
# print all particle ids in the chunk
print("idcpu =", aos[:]["idcpu"])
# write to all particles in the chunk
aos[:]["x"] = 0.30
aos[:]["y"] = 0.35
aos[:]["z"] = 0.40
print(soa.real)
for soa_real in soa.real.values():
soa_real[:] = 42.0
for soa_int in soa.int.values():
soa_int[:] = 12
For many small CPU and GPU examples on how to compute on particles, see the following test cases:
Examples in
test_particleContainer.py
# -*- coding: utf-8 -*- import importlib import numpy as np import pytest import amrex.space3d as amr @pytest.fixture() def Npart(): return 21 @pytest.fixture(scope="function") def empty_particle_container(std_geometry, distmap, boxarr): # This fixture includes the legacy AoS layout components, which for CuPy only run on CPU # or require managed memory, see https://github.com/cupy/cupy/issues/2031 if amr.Config.have_gpu: return amr.ParticleContainer_2_1_3_1_managed(std_geometry, distmap, boxarr) else: return amr.ParticleContainer_2_1_3_1_default(std_geometry, distmap, boxarr) @pytest.fixture(scope="function") def empty_soa_particle_container(std_geometry, distmap, boxarr): pc = amr.ParticleContainer_pureSoA_8_0_default(std_geometry, distmap, boxarr) return pc @pytest.fixture(scope="function") def std_particle(): myt = amr.ParticleInitType_2_1_3_1() myt.real_struct_data = [0.5, 0.6] myt.int_struct_data = [5] myt.real_array_data = [0.5, 0.2, 0.3] myt.int_array_data = [1] return myt @pytest.fixture(scope="function") def particle_container(Npart, std_geometry, distmap, boxarr, std_real_box): # This fixture includes the legacy AoS layout components, which for CuPy only run on CPU # or require managed memory, see https://github.com/cupy/cupy/issues/2031 if amr.Config.have_gpu: pc = amr.ParticleContainer_2_1_3_1_managed(std_geometry, distmap, boxarr) else: pc = amr.ParticleContainer_2_1_3_1_default(std_geometry, distmap, boxarr) myt = amr.ParticleInitType_2_1_3_1() myt.real_struct_data = [0.5, 0.6] myt.int_struct_data = [5] myt.real_array_data = [0.5, 0.2, 0.3] myt.int_array_data = [1] iseed = 1 pc.init_random(Npart, iseed, myt, False, std_real_box) # add runtime components: 1 real 2 int pc.add_real_comp(True) pc.add_int_comp(True) pc.add_int_comp(True) # assign some values to runtime components for lvl in range(pc.finest_level + 1): for pti in pc.iterator(pc, level=lvl): soa = pti.soa() soa.get_real_data(2).assign(1.2345) soa.get_int_data(1).assign(42) soa.get_int_data(2).assign(33) return pc @pytest.fixture(scope="function") def soa_particle_container(Npart, std_geometry, distmap, boxarr, std_real_box): pc = amr.ParticleContainer_pureSoA_8_0_default(std_geometry, distmap, boxarr) myt = amr.ParticleInitType_pureSoA_8_0() myt.real_array_data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8] myt.int_array_data = [] iseed = 1 pc.init_random(Npart, iseed, myt, False, std_real_box) # add runtime components: 1 real 2 int pc.add_real_comp(True) pc.add_int_comp(True) pc.add_int_comp(True) # assign some values to runtime components for lvl in range(pc.finest_level + 1): for pti in pc.iterator(pc, level=lvl): soa = pti.soa() soa.get_real_data(8).assign(1.2345) soa.get_int_data(0).assign(42) soa.get_int_data(1).assign(33) return pc def test_particleInitType(): myt = amr.ParticleInitType_2_1_3_1() print(myt.real_struct_data) print(myt.int_struct_data) print(myt.real_array_data) print(myt.int_array_data) myt.real_struct_data = [0.5, 0.7] myt.int_struct_data = [5] myt.real_array_data = [0.5, 0.2, 0.4] myt.int_array_data = [1] assert np.allclose(myt.real_struct_data, [0.5, 0.7]) assert np.allclose(myt.int_struct_data, [5]) assert np.allclose(myt.real_array_data, [0.5, 0.2, 0.4]) assert np.allclose(myt.int_array_data, [1]) def test_n_particles(particle_container, Npart): pc = particle_container assert pc.OK() assert ( pc.num_struct_real == amr.ParticleContainer_2_1_3_1_default.num_struct_real == 2 ) assert ( pc.num_struct_int == amr.ParticleContainer_2_1_3_1_default.num_struct_int == 1 ) assert ( pc.num_array_real == amr.ParticleContainer_2_1_3_1_default.num_array_real == 3 ) assert pc.num_array_int == amr.ParticleContainer_2_1_3_1_default.num_array_int == 1 assert ( pc.number_of_particles_at_level(0) == np.sum(pc.number_of_particles_in_grid(0)) == Npart ) def test_pc_init(): # This test only runs on CPU or requires managed memory, # see https://github.com/cupy/cupy/issues/2031 pc = ( amr.ParticleContainer_2_1_3_1_managed() if amr.Config.have_gpu else amr.ParticleContainer_2_1_3_1_default() ) print("bytespread", pc.byte_spread) print("capacity", pc.print_capacity()) print("number_of_particles_at_level(0)", pc.number_of_particles_at_level(0)) assert pc.number_of_particles_at_level(0) == 0 bx = amr.Box(amr.IntVect(0, 0, 0), amr.IntVect(63, 63, 63)) rb = amr.RealBox(0, 0, 0, 1, 1, 1) coord_int = 1 # RZ periodicity = [0, 0, 1] gm = amr.Geometry(bx, rb, coord_int, periodicity) ba = amr.BoxArray(bx) ba.max_size(32) dm = amr.DistributionMapping(ba) print("-------------------------") print("define particle container") pc.Define(gm, dm, ba) assert pc.OK() assert ( pc.num_struct_real == amr.ParticleContainer_2_1_3_1_default.num_struct_real == 2 ) assert ( pc.num_struct_int == amr.ParticleContainer_2_1_3_1_default.num_struct_int == 1 ) assert ( pc.num_array_real == amr.ParticleContainer_2_1_3_1_default.num_array_real == 3 ) assert pc.num_array_int == amr.ParticleContainer_2_1_3_1_default.num_array_int == 1 print("bytespread", pc.byte_spread) print("capacity", pc.print_capacity()) print("number_of_particles_at_level(0)", pc.number_of_particles_at_level(0)) assert pc.total_number_of_particles() == pc.number_of_particles_at_level(0) == 0 assert pc.OK() print("---------------------------") print("add a particle to each grid") Npart_grid = 1 iseed = 1 myt = amr.ParticleInitType_2_1_3_1() myt.real_struct_data = [0.5, 0.4] myt.int_struct_data = [5] myt.real_array_data = [0.5, 0.2, 0.4] myt.int_array_data = [1] pc.init_random_per_box(Npart_grid, iseed, myt) ngrid = ba.size npart = Npart_grid * ngrid print("NumberOfParticles", pc.number_of_particles_at_level(0)) assert pc.total_number_of_particles() == pc.number_of_particles_at_level(0) == npart assert pc.OK() print(f"Finest level = {pc.finest_level}") print("Iterate particle boxes & set values") # lvl = 0 for lvl in range(pc.finest_level + 1): print(f"at level {lvl}:") for pti in pc.iterator(pc, level=lvl): print("...") assert pti.num_particles == 1 assert pti.num_real_particles == 1 assert pti.num_neighbor_particles == 0 assert pti.level == lvl print(pti.pair_index) print(pti.geom(level=lvl)) # note: cupy does not yet support this # https://github.com/cupy/cupy/issues/2031 aos = pti.aos() aos_arr = aos.to_numpy() aos_arr[0]["x"] = 0.30 aos_arr[0]["y"] = 0.35 aos_arr[0]["z"] = 0.40 # TODO: this seems to write into a copy of the data soa = pti.soa() real_arrays = soa.get_real_data() int_arrays = soa.get_int_data() real_arrays[0] = [0.55] real_arrays[1] = [0.22] int_arrays[0] = [2] assert np.allclose(real_arrays[0], np.array([0.55])) assert np.allclose(real_arrays[1], np.array([0.22])) assert np.allclose(int_arrays[0], np.array([2])) # read-only for lvl in range(pc.finest_level + 1): for pti in pc.const_iterator(pc, level=lvl): assert pti.num_particles == 1 assert pti.num_real_particles == 1 assert pti.num_neighbor_particles == 0 assert pti.level == lvl aos = pti.aos() aos_arr = aos.to_numpy() assert aos[0].x == 0.30 assert aos[0].y == 0.35 assert aos[0].z == 0.40 assert aos_arr[0]["z"] == 0.40 soa = pti.soa() real_arrays = soa.get_real_data() int_arrays = soa.get_int_data() print(real_arrays[0]) print(int_arrays[0]) # TODO: this does not work yet and is still the original data # assert np.allclose(real_arrays[0], np.array([0.55])) # assert np.allclose(real_arrays[1], np.array([0.22])) # assert np.allclose(int_arrays[0], np.array([2])) def test_particle_init(Npart, particle_container): pc = particle_container assert ( pc.number_of_particles_at_level(0) == np.sum(pc.number_of_particles_in_grid(0)) == Npart ) # pc.resizeData() print(pc.num_local_tiles_at_level(0)) lev = pc.get_particles(0) print(len(lev.items())) assert pc.num_local_tiles_at_level(0) == len(lev.items()) for tile_ind, pt in lev.items(): print("tile", tile_ind) real_arrays = pt.get_struct_of_arrays().get_real_data() int_arrays = pt.get_struct_of_arrays().get_int_data() aos = pt.get_array_of_structs() aos_arr = aos.to_numpy() if len(real_arrays) > 0: assert np.isclose(real_arrays[0][0], 0.5) and np.isclose( real_arrays[1][0], 0.2 ) assert isinstance(int_arrays[0][0], int) assert int_arrays[0][0] == 1 assert isinstance(aos_arr[0]["rdata_0"], np.floating) assert isinstance(aos_arr[0]["idata_0"], np.integer) assert np.isclose(aos_arr[0]["rdata_0"], 0.5) and aos_arr[0]["idata_0"] == 5 aos_arr[0]["idata_0"] = 2 aos1 = pt.get_array_of_structs() print(aos1[0]) print(aos[0]) print(aos_arr[0]) assert ( aos_arr[0]["idata_0"] == aos[0].get_idata(0) == aos1[0].get_idata(0) == 2 ) print("soa test") real_arrays[1][0] = -1.2 ra1 = pt.get_struct_of_arrays().get_real_data() print(real_arrays) print(ra1) for ii, arr in enumerate(real_arrays): assert np.allclose(arr.to_numpy(), ra1[ii].to_numpy()) print("soa int test") iarr_np = int_arrays[0].to_numpy() iarr_np[0] = -3 ia1 = pt.get_struct_of_arrays().get_int_data() ia1_np = ia1[0].to_numpy() print(iarr_np) print(ia1_np) assert np.allclose(iarr_np, ia1_np) print( "---- is the particle tile recording changes or passed by reference? --------" ) lev1 = pc.get_particles(0) for tile_ind, pt in lev1.items(): print("tile", tile_ind) real_arrays = pt.get_struct_of_arrays().get_real_data() int_arrays = pt.get_struct_of_arrays().get_int_data() aos = pt.get_array_of_structs() print(aos[0]) assert aos[0].get_idata(0) == 2 assert real_arrays[1][0] == -1.2 assert int_arrays[0][0] == -3 def test_per_cell(empty_particle_container, std_geometry, std_particle): pc = empty_particle_container pc.init_one_per_cell(0.5, 0.5, 0.5, std_particle) assert pc.OK() lev = pc.get_particles(0) assert pc.num_local_tiles_at_level(0) == len(lev.items()) sum_1 = 0 for tile_ind, pt in lev.items(): print("tile", tile_ind) real_arrays = pt.get_struct_of_arrays().get_real_data() sum_1 += np.sum(real_arrays[1]) print(sum_1) ncells = std_geometry.domain.numPts() print("ncells from box", ncells) print("NumberOfParticles", pc.number_of_particles_at_level(0)) assert ( pc.total_number_of_particles() == pc.number_of_particles_at_level(0) == ncells ) print("npts * real_1", ncells * std_particle.real_array_data[1]) assert ncells * std_particle.real_array_data[1] == sum_1 def test_soa_pc_numpy(soa_particle_container, Npart): """Used in docs/source/usage/compute.rst""" pc = soa_particle_container assert pc.number_of_particles_at_level(0) == Npart class Config: have_gpu = False # Manual: Pure SoA Compute PC Detailed START # code-specific getter function, e.g.: # pc = sim.get_particles() # Config = sim.extension.Config # iterate over mesh-refinement levels for lvl in range(pc.finest_level + 1): # loop local tiles of particles for pti in pc.iterator(pc, level=lvl): # compile-time and runtime attributes soa = pti.soa().to_xp() # print all particle ids in the chunk print("idcpu =", soa.idcpu) x = soa.real["x"] y = soa.real["y"] # write to all particles in the chunk # note: careful, if you change particle positions, you might need to # redistribute particles before continuing the simulation step soa.real["x"][:] = 0.30 soa.real["y"][:] = 0.35 soa.real["z"][:] = 0.40 soa.real["a"][:] = x[:] ** 2 soa.real["b"][:] = x[:] + y[:] soa.real["c"][:] = 0.50 # ... # all int attributes for soa_int in soa.int.values(): soa_int[:] = 12 # Manual: Pure SoA Compute PC Detailed END def test_pc_numpy(particle_container, Npart): """Used in docs/source/usage/compute.rst""" pc = particle_container assert pc.number_of_particles_at_level(0) == Npart class Config: have_gpu = False # Manual: Legacy Compute PC Detailed START # code-specific getter function, e.g.: # pc = sim.get_particles() # Config = sim.extension.Config # iterate over mesh-refinement levels for lvl in range(pc.finest_level + 1): # loop local tiles of particles for pti in pc.iterator(pc, level=lvl): # default layout: AoS with positions and idcpu # note: not part of the new PureSoA particle container layout aos = ( pti.aos().to_numpy(copy=True) if Config.have_gpu else pti.aos().to_numpy() ) # additional compile-time and runtime attributes in SoA format soa = pti.soa().to_xp() # notes: # Only the next lines are the "HOT LOOP" of the computation. # For efficiency, use numpy array operation for speed on CPUs. # For GPUs use .to_cupy() above and compute with cupy or numba. # print all particle ids in the chunk print("idcpu =", aos[:]["idcpu"]) # write to all particles in the chunk aos[:]["x"] = 0.30 aos[:]["y"] = 0.35 aos[:]["z"] = 0.40 print(soa.real) for soa_real in soa.real.values(): soa_real[:] = 42.0 for soa_int in soa.int.values(): soa_int[:] = 12 # Manual: Legacy Compute PC Detailed END @pytest.mark.skipif( importlib.util.find_spec("pandas") is None, reason="pandas is not available" ) def test_pc_df(particle_container, Npart): pc = particle_container print(f"pc={pc}") df = pc.to_df() print(df.columns) print(df) assert len(df.columns) == 14 @pytest.mark.skipif( importlib.util.find_spec("pandas") is None, reason="pandas is not available" ) def test_soa_pc_empty_df(empty_soa_particle_container, Npart): pc = empty_soa_particle_container print(f"pc={pc}") df = pc.to_df() assert df is None @pytest.mark.skipif( importlib.util.find_spec("pandas") is None, reason="pandas is not available" ) @pytest.mark.skipif(not amr.Config.have_mpi, reason="Requires AMReX_MPI=ON") def test_soa_pc_df_mpi(soa_particle_container, Npart): pc = soa_particle_container print(f"pc={pc}") df = pc.to_df(local=False) if df is not None: # only rank 0 print(df.columns) print(df) @pytest.mark.skipif( importlib.util.find_spec("pandas") is None, reason="pandas is not available" ) def test_soa_pc_df(soa_particle_container, Npart): """Used in docs/source/usage/compute.rst""" pc = soa_particle_container class Config: have_gpu = False # Manual: Pure SoA Compute PC Pandas START # code-specific getter function, e.g.: # pc = sim.get_particles() # Config = sim.extension.Config # local particles on all levels df = pc.to_df() # this is a copy! print(df) # read print(df["x"]) # write (into copy!) df["x"] = 0.30 df["y"] = 0.35 df["z"] = 0.40 df["a"] = df["x"] ** 2 df["b"] = df["x"] + df["y"] df["c"] = 0.50 # int attributes # df["i1"] = 12 # df["i2"] = 12 # ... print(df) # Manual: Pure SoA Compute PC Pandas END @pytest.mark.skipif( importlib.util.find_spec("pandas") is None, reason="pandas is not available" ) def test_pc_empty_df(empty_particle_container, Npart): pc = empty_particle_container print(f"pc={pc}") df = pc.to_df() assert df is None @pytest.mark.skipif( importlib.util.find_spec("pandas") is None, reason="pandas is not available" ) @pytest.mark.skipif(not amr.Config.have_mpi, reason="Requires AMReX_MPI=ON") def test_pc_df_mpi(particle_container, Npart): pc = particle_container print(f"pc={pc}") df = pc.to_df(local=False) if df is not None: # only rank 0 print(df.columns) print(df) assert len(df.columns) == 14
Examples in
test_aos.py
# -*- coding: utf-8 -*- import numpy as np import amrex.space3d as amr def test_aos_init(): aos = amr.ArrayOfStructs_2_1_default() assert aos.numParticles() == 0 assert aos.numTotalParticles() == aos.numRealParticles() == 0 assert aos.empty() def test_aos_push_pop(): aos = ( amr.ArrayOfStructs_2_1_managed() if amr.Config.have_gpu else amr.ArrayOfStructs_2_1_default() ) p1 = amr.Particle_2_1() p1.set_rdata([1.5, 2.2]) p1.set_idata([3]) aos.push_back(p1) p2 = amr.Particle_2_1() p2.set_rdata([2.1, 25.2]) p2.set_idata([5]) aos.push_back(p2) ## test aos.back() pback = aos.back() assert np.allclose(pback.get_rdata(), p2.get_rdata()) assert np.allclose(pback.get_idata(), p2.get_idata()) assert pback.x == p2.x and pback.y == p2.y and pback.z == p2.z assert aos.numParticles() == aos.numTotalParticles() == aos.numRealParticles() == 2 assert aos.getNumNeighbors() == 0 aos.setNumNeighbors(5) assert aos.getNumNeighbors() == 5 assert not aos.empty() assert aos.size() == 7 assert aos[0].get_rdata() == p1.get_rdata() p3 = amr.Particle_2_1() p3.set_rdata([3.14, -3.14]) p3.set_idata([10]) aos[0] = p3 assert aos[0].get_idata() == p3.get_idata() aos.pop_back() assert aos.numParticles() == aos.numRealParticles() == 1 assert aos.numTotalParticles() == 6 def test_array_interface(): aos = ( amr.ArrayOfStructs_2_1_managed() if amr.Config.have_gpu else amr.ArrayOfStructs_2_1_default() ) p1 = amr.Particle_2_1() p1.setPos([1, 2, 3]) p1.set_rdata([4.5, 5.2]) p1.set_idata([6]) aos.push_back(p1) p2 = amr.Particle_2_1() p2.setPos([8, 9, 10]) p2.set_rdata([11.1, 12.2]) p2.set_idata([13]) aos.push_back(p2) print(aos[0]) print(dir(aos[0])) # print('particle 1 from aos:\n',aos[0]) # print('particle 2 from aos:\n',aos[1]) # print('array interface\n', aos.__array_interface__) arr = aos.to_numpy() assert ( np.isclose(arr[0][0], 1.0) and np.isclose(arr[0][4], 5.2) and np.isclose(arr[0][6], 6) ) assert ( np.isclose(arr[1][2], 10) and np.isclose(arr[1][3], 11.1) and np.isclose(arr[1][6], 13) ) p3 = amr.Particle_2_1(x=-3) p4 = amr.Particle_2_1(y=-5) print(arr) print(aos[0], aos[1]) print("-------") aos[0] = p4 assert aos[0].x == 0 assert aos[0].y == -5 aos[0] = p3 print("array:", arr) print("aos[0]:", aos[0], "aos[1]:", aos[1]) assert aos[0].x == arr[0][0] == -3 assert aos[0].y == arr[0][1] == 0 assert aos[0].z == arr[0][2] == 0 shape = amr.Config.spacedim + amr.Particle_2_1.NReal + amr.Particle_2_1.NInt + 1 for ii in range(shape): arr[1][ii] = 0 arr[1][1] = -5 # np.array([0, -5, 0,0,0,0,0]) print("array:", arr) print("aos[0]:", aos[0], "aos[1]:", aos[1]) assert aos[1].y == arr[1][1] == -5 assert aos[1].x == arr[1][0] == 0 assert aos[1].z == arr[1][2] == 0
Examples in
test_soa.py
# -*- coding: utf-8 -*- import numpy as np import amrex.space3d as amr def test_soa_init(): soa = amr.StructOfArrays_3_1_default() print("--test init --") print("num real components", soa.num_real_comps) print("num int components", soa.num_int_comps) assert soa.num_real_comps == 3 and soa.num_int_comps == 1 soa.define(1, 3, ["x", "y", "z", "w"], ["i1", "i2", "i3", "i4"]) print("--test define --") print("num real components", soa.num_real_comps) print("num int components", soa.num_int_comps) assert soa.num_real_comps == 4 and soa.num_int_comps == 4 print("num particles", soa.num_particles) print("num real particles", soa.num_real_particles) print("num totalparticles", soa.num_total_particles) print("num Neighbors", soa.get_num_neighbors()) print("soa size", soa.size) assert soa.num_particles == soa.num_real_particles == 0 assert soa.size == soa.num_total_particles == 0 assert soa.get_num_neighbors() == 0 soa.resize(5) print("--test resize --") print("num particles", soa.num_particles) print("num real particles", soa.num_real_particles) print("num totalparticles", soa.num_total_particles) print("num Neighbors", soa.get_num_neighbors()) print("soa size", soa.size) assert soa.num_particles == soa.num_real_particles == 5 assert soa.size == soa.num_total_particles == 5 assert soa.get_num_neighbors() == 0 soa.set_num_neighbors(3) print("--test set neighbor num--") print("num particles", soa.num_particles) print("num real particles", soa.num_real_particles) print("num totalparticles", soa.num_total_particles) print("num Neighbors", soa.get_num_neighbors()) print("soa size", soa.size) assert soa.num_particles == soa.num_real_particles == 5 assert soa.size == soa.num_total_particles == 8 assert soa.get_num_neighbors() == 3 def test_soa_from_tile(): pt = ( amr.ParticleTile_2_1_3_1_managed() if amr.Config.have_gpu else amr.ParticleTile_2_1_3_1_default() ) p = amr.Particle_5_2( 1.0, 2.0, 3.0, rdata_0=4.0, rdata_1=5.0, rdata_2=6.0, rdata_3=7.0, rdata_4=8.0, idata_0=9, idata_1=10, ) sp = amr.Particle_5_2(1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1, 9, 10) pt.push_back(p) pt.push_back(sp) soa = pt.get_struct_of_arrays() print("num particles", soa.num_particles) print("num real particles", soa.num_real_particles) print("num totalparticles", soa.num_total_particles) print("num Neighbors", soa.get_num_neighbors()) print("soa size", soa.size) assert soa.num_particles == soa.size == 2 assert soa.num_total_particles == soa.num_real_particles == 2 assert soa.get_num_neighbors() == 0 real_arrays = soa.get_real_data() int_arrays = soa.get_int_data() print(real_arrays) assert np.isclose(real_arrays[0][1], 6.1) and np.isclose(real_arrays[1][1], 7.1) assert isinstance(int_arrays[0][0], int) assert int_arrays[0][1] == 10 real_arrays[1][0] = -1.2 ra1 = soa.get_real_data() print(real_arrays) print(ra1) for ii, arr in enumerate(real_arrays): assert np.allclose(arr.to_numpy(), ra1[ii].to_numpy()) print("soa int test") iarr_np = int_arrays[0].to_numpy() iarr_np[0] = -3 ia1 = soa.get_int_data() ia1_np = ia1[0].to_numpy() print(iarr_np) print(ia1_np) assert np.allclose(iarr_np, ia1_np)
Other C++ Calls
pyAMReX exposes many more C++-written and GPU-accelerated AMReX functions for MultiFab
and particles to Python, which can be used to set, reduce, rescale, redistribute, etc. contained data.
Check out the detailed API docs for more details and use further third party libraries at will on the exposed data structures, replacing the hot loops described above.
You can also leave the Python world in pyAMReX and go back to C++ whenever needed. For some applications, pyAMReX might act as scriptable glue that transports fields and particles from one (C++) function to another without recompilation, by exposing the functions and methods to Python.