Compute

With zero-copy read and write access to data structures, this section presents how to compute in pyAMReX.

Since the pyAMReX data containers are wrappers to the C++ AMReX objects, it is worth reading:

As a very short, simplified overview, this narrows down to:

  • AMReX decomposes index space into rectangular, block-structured, regular grids,

  • block are often intentionally slightly over-decomposed, there is >= one block per compute unit (CPU core or GPU),

  • particles are chunked/tiled and usually decomposed like the field blocks,

  • refinement levels are represented as (potentially sparse) levels of blocks.

Computations are thus performed (mostly) on whole blocks, which enables to use compute advanced acceleration techniques on CPUs or GPUs.

Fields

The most common data structure to interact with is a MultiFab, which is a collection of boxes with associated field data. The field data can have more than one component (in the slowest varying index), but all components have the same staggering/centering.

This is how to iterate and potentially compute for all blocks assigned to a local process in pyAMReX:

# finest active MR level, get from a simulation's AmrMesh object, e.g.:
# finest_level = sim.finest_level
finest_level = 0  # no MR

# iterate over every mesh-refinement levels
for lev in range(finest_level + 1):
    # get an existing MultiFab, e.g., from a simulation:
    # mfab = sim.get_field(lev=lev)  # code-specific getter function

    # grow (aka guard/ghost/halo) regions
    ngv = mfab.n_grow_vect

    # get every local block of the field
    for mfi in mfab:
        # global index space box, including guards
        bx = mfi.tilebox().grow(ngv)
        print(bx)  # note: global index space of this block

        # numpy representation: non-copying view, including the
        # guard/ghost region
        field_np = mfab.array(mfi).to_numpy()

        # notes on indexing in field_np:
        # - numpy uses locally zero-based indexing
        # - layout is F_CONTIGUOUS by default, just like AMReX

        # notes:
        # Only the next lines are the "HOT LOOP" of the computation.
        # For efficiency, use numpy array operation for speed on CPUs.
        # For GPUs use .to_cupy() above and compute with cupy or numba.
        field_np[()] = 42.0

For a complete physics example that uses CPU/GPU agnostic Python code for computation on fields, see:

For many small CPU and GPU examples on how to compute on fields, see the following test cases:

  • MultiFab example

  • Examples in test_multifab.py
    Listing 1 This files is in tests/test_multifab.py.
    # -*- coding: utf-8 -*-
    
    import math
    
    import numpy as np
    import pytest
    
    import amrex.space3d as amr
    
    
    def test_mfab_numpy(mfab):
        """Used in docs/source/usage/compute.rst"""
    
        # Manual: Compute Mfab START
        # finest active MR level, get from a simulation's AmrMesh object, e.g.:
        # finest_level = sim.finest_level
        finest_level = 0  # no MR
    
        # iterate over every mesh-refinement levels
        for lev in range(finest_level + 1):
            # get an existing MultiFab, e.g., from a simulation:
            # mfab = sim.get_field(lev=lev)  # code-specific getter function
    
            # grow (aka guard/ghost/halo) regions
            ngv = mfab.n_grow_vect
    
            # get every local block of the field
            for mfi in mfab:
                # global index space box, including guards
                bx = mfi.tilebox().grow(ngv)
                print(bx)  # note: global index space of this block
    
                # numpy representation: non-copying view, including the
                # guard/ghost region
                field_np = mfab.array(mfi).to_numpy()
    
                # notes on indexing in field_np:
                # - numpy uses locally zero-based indexing
                # - layout is F_CONTIGUOUS by default, just like AMReX
    
                # notes:
                # Only the next lines are the "HOT LOOP" of the computation.
                # For efficiency, use numpy array operation for speed on CPUs.
                # For GPUs use .to_cupy() above and compute with cupy or numba.
                field_np[()] = 42.0
        # Manual: Compute Mfab END
    
    
    def test_mfab_loop(mfab):
        ngv = mfab.n_grow_vect
        print(f"\n  mfab={mfab}, mfab.n_grow_vect={ngv}")
    
        for mfi in mfab:
            bx = mfi.tilebox().grow(ngv)
            marr = mfab.array(mfi)
    
            # print(mfab)
            # print(mfab.num_comp)
            # print(mfab.size)
            # print(marr.size)
            # print(marr.nComp)
    
            # index by index assignment
            # notes:
            # - this is AMReX Array4, F-order indices
            # - even though we iterate by fastest varying index,
            #   such loops are naturally very slow in Python
            three_comps = mfab.num_comp == 3
            if three_comps:
                for i, j, k in bx:
                    # print(i,j,k)
                    marr[i, j, k, 0] = 10.0 * i
                    marr[i, j, k, 1] = 10.0 * j
                    marr[i, j, k, 2] = 10.0 * k
            else:
                for i, j, k in bx:
                    # print(i,j,k)
                    marr[i, j, k] = 10.0 * i
    
            # note: offset from index space in numpy
            #   in numpy, we start indices from zero, not small_end
    
            # numpy representation: non-copying view, including the
            # guard/ghost region
            marr_np = marr.to_numpy()
    
            # check the values at start/end are the same: first component
            assert marr_np[0, 0, 0, 0] == marr[bx.small_end]
            assert marr_np[-1, -1, -1, 0] == marr[bx.big_end]
            # same check, but for all components
            for n in range(mfab.num_comp):
                small_end_comp = list(bx.small_end) + [n]
                big_end_comp = list(bx.big_end) + [n]
                assert marr_np[0, 0, 0, n] == marr[small_end_comp]
                assert marr_np[-1, -1, -1, n] == marr[big_end_comp]
    
            # now we do some faster assignments, using range based access
            #   this should fail as out-of-bounds, but does not
            #     does Numpy not check array access for non-owned views?
            # marr_np[24:200, :, :, :] = 42.
    
            #   all components and all indices set at once to 42
            marr_np[()] = 42.0
    
            # values in start & end still match?
            assert marr_np[0, 0, 0, 0] == marr[bx.small_end]
            assert marr_np[-1, -1, -1, -1] == marr[bx.big_end]
    
            # all values for all indices match between multifab & numpy view?
            for n in range(mfab.num_comp):
                for i, j, k in bx:
                    assert marr[i, j, k, n] == 42.0
    
            # separate test: cupy assignment & reading
            #   TODO
    
    
    def test_mfab_simple(mfab):
        assert mfab.is_all_cell_centered
        # assert(all(not mfab.is_nodal(i) for i in [-1, 0, 1, 2]))  # -1??
        assert all(not mfab.is_nodal(i) for i in [0, 1, 2])
    
        for i in range(mfab.num_comp):
            mfab.set_val(-10 * (i + 1), i, 1)
        mfab.abs(0, mfab.num_comp)
        for i in range(mfab.num_comp):
            assert mfab.max(i) == (10 * (i + 1))  # Assert: None == 10 for i=0
            assert mfab.min(i) == (10 * (i + 1))
    
        mfab.plus(20.0, 0, mfab.num_comp)
        for i in range(mfab.num_comp):
            np.testing.assert_allclose(mfab.max(i), 20.0 + (10 * (i + 1)))
            np.testing.assert_allclose(mfab.min(i), 20.0 + (10 * (i + 1)))
    
        mfab.mult(10.0, 0, mfab.num_comp)
        for i in range(mfab.num_comp):
            np.testing.assert_allclose(mfab.max(i), 10.0 * (20.0 + (10 * (i + 1))))
            np.testing.assert_allclose(mfab.min(i), 10.0 * (20.0 + (10 * (i + 1))))
        mfab.invert(10.0, 0, mfab.num_comp)
        for i in range(mfab.num_comp):
            np.testing.assert_allclose(mfab.max(i), 1.0 / (20.0 + (10 * (i + 1))))
            np.testing.assert_allclose(mfab.min(i), 1.0 / (20.0 + (10 * (i + 1))))
    
    
    @pytest.mark.parametrize("nghost", [0, 1])
    def test_mfab_ops(boxarr, distmap, nghost):
        src = amr.MultiFab(boxarr, distmap, 3, nghost)
        dst = amr.MultiFab(boxarr, distmap, 1, nghost)
    
        src.set_val(10.0, 0, 1)
        src.set_val(20.0, 1, 1)
        src.set_val(30.0, 2, 1)
        dst.set_val(0.0, 0, 1)
    
        # dst.add(src, 2, 0, 1, nghost)
        # dst.subtract(src, 1, 0, 1, nghost)
        # dst.multiply(src, 0, 0, 1, nghost)
        # dst.divide(src, 1, 0, 1, nghost)
    
        dst.add(dst, src, 2, 0, 1, nghost)
        dst.subtract(dst, src, 1, 0, 1, nghost)
        dst.multiply(dst, src, 0, 0, 1, nghost)
        dst.divide(dst, src, 1, 0, 1, nghost)
    
        print(dst.min(0))
        np.testing.assert_allclose(dst.min(0), 5.0)
        np.testing.assert_allclose(dst.max(0), 5.0)
    
        # dst.xpay(2.0, src, 0, 0, 1, nghost)
        # dst.saxpy(2.0, src, 1, 0, 1, nghost)
        dst.xpay(dst, 2.0, src, 0, 0, 1, nghost)
        dst.saxpy(dst, 2.0, src, 1, 0, 1, nghost)
        np.testing.assert_allclose(dst.min(0), 60.0)
        np.testing.assert_allclose(dst.max(0), 60.0)
    
        # dst.lin_comb(6.0, src, 1,
        #             1.0, src, 2, 0, 1, nghost)
        dst.lin_comb(dst, 6.0, src, 1, 1.0, src, 2, 0, 1, nghost)
        np.testing.assert_allclose(dst.min(0), 150.0)
        np.testing.assert_allclose(dst.max(0), 150.0)
    
    
    def test_mfab_mfiter(mfab):
        assert iter(mfab).is_valid
        assert iter(mfab).length == 8
    
        cnt = 0
        for _mfi in mfab:
            cnt += 1
    
        assert iter(mfab).length == cnt
    
    
    @pytest.mark.skipif(
        amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA"
    )
    def test_mfab_ops_cuda_numba(mfab_device):
        # https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html
        from numba import cuda
    
        ngv = mfab_device.n_grow_vect
    
        # assign 3: define kernel
        @cuda.jit
        def set_to_three(array):
            i, j, k = cuda.grid(3)
            if i < array.shape[0] and j < array.shape[1] and k < array.shape[2]:
                array[i, j, k] = 3.0
    
        # assign 3: loop through boxes and launch kernels
        for mfi in mfab_device:
            bx = mfi.tilebox().grow(ngv)  # noqa
            marr = mfab_device.array(mfi)
            marr_numba = cuda.as_cuda_array(marr)
    
            # kernel launch
            threadsperblock = (4, 4, 4)
            blockspergrid = tuple(
                [math.ceil(s / b) for s, b in zip(marr_numba.shape, threadsperblock)]
            )
            set_to_three[blockspergrid, threadsperblock](marr_numba)
    
        # Check results
        shape = 32**3 * 8
        sum_threes = mfab_device.sum_unique(comp=0, local=False)
        assert sum_threes == shape * 3
    
    
    @pytest.mark.skipif(
        amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA"
    )
    def test_mfab_ops_cuda_cupy(mfab_device):
        # https://docs.cupy.dev/en/stable/user_guide/interoperability.html
        import cupy as cp
        import cupyx.profiler
    
        # AMReX -> cupy
        ngv = mfab_device.n_grow_vect
        print(f"\n  mfab_device={mfab_device}, mfab_device.n_grow_vect={ngv}")
    
        # assign 3
        with cupyx.profiler.time_range("assign 3 [()]", color_id=0):
            for mfi in mfab_device:
                bx = mfi.tilebox().grow(ngv)  # noqa
                marr_cupy = mfab_device.array(mfi).to_cupy(order="C")
                # print(marr_cupy.shape)  # 1, 32, 32, 32
                # print(marr_cupy.dtype)  # float64
                # performance:
                #   https://github.com/AMReX-Codes/pyamrex/issues/55#issuecomment-1579610074
    
                # write and read into the marr_cupy
                marr_cupy[()] = 3.0
    
        # verify result with a .sum_unique
        with cupyx.profiler.time_range("verify 3", color_id=0):
            shape = 32**3 * 8
            # print(mfab_device.shape)
            sum_threes = mfab_device.sum_unique(comp=0, local=False)
            assert sum_threes == shape * 3
    
        # assign 2
        with cupyx.profiler.time_range("assign 2 (set_val)", color_id=1):
            mfab_device.set_val(2.0)
        with cupyx.profiler.time_range("verify 2", color_id=1):
            sum_twos = mfab_device.sum_unique(comp=0, local=False)
            assert sum_twos == shape * 2
    
        # assign 5
        with cupyx.profiler.time_range("assign 5 (ones-like)", color_id=2):
    
            def set_to_five(mm):
                xp = cp.get_array_module(mm)
                assert xp.__name__ == "cupy"
                mm = xp.ones_like(mm) * 10.0
                mm /= 2.0
                return mm
    
            for mfi in mfab_device:
                bx = mfi.tilebox().grow(ngv)  # noqa
                marr_cupy = mfab_device.array(mfi).to_cupy(order="F")
                # print(marr_cupy.shape)  # 32, 32, 32, 1
                # print(marr_cupy.dtype)  # float64
                # performance:
                #   https://github.com/AMReX-Codes/pyamrex/issues/55#issuecomment-1579610074
    
                # write and read into the marr_cupy
                fives_cp = set_to_five(marr_cupy)
                marr_cupy[()] = 0.0
                marr_cupy += fives_cp
    
        # verify
        with cupyx.profiler.time_range("verify 5", color_id=2):
            sum = mfab_device.sum_unique(comp=0, local=False)
            assert sum == shape * 5
    
        # assign 7
        with cupyx.profiler.time_range("assign 7 (fuse)", color_id=3):
    
            @cp.fuse(kernel_name="set_to_seven")
            def set_to_seven(x):
                x[...] = 7.0
    
            for mfi in mfab_device:
                bx = mfi.tilebox().grow(ngv)  # noqa
                marr_cupy = mfab_device.array(mfi).to_cupy(order="C")
    
                # write and read into the marr_cupy
                set_to_seven(marr_cupy)
    
        # verify
        with cupyx.profiler.time_range("verify 7", color_id=3):
            sum = mfab_device.sum_unique(comp=0, local=False)
            assert sum == shape * 7
    
        # TODO: @jit.rawkernel()
    
    
    @pytest.mark.skipif(
        amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA"
    )
    def test_mfab_ops_cuda_pytorch(mfab_device):
        # https://docs.cupy.dev/en/stable/user_guide/interoperability.html#pytorch
        import torch
    
        # assign 3: loop through boxes and launch kernel
        for mfi in mfab_device:
            marr = mfab_device.array(mfi)
            marr_torch = torch.as_tensor(marr, device="cuda")
            marr_torch[:, :, :] = 3
    
        # Check results
        shape = 32**3 * 8
        sum_threes = mfab_device.sum_unique(comp=0, local=False)
        assert sum_threes == shape * 3
    
    
    @pytest.mark.skipif(
        amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA"
    )
    def test_mfab_ops_cuda_cuml(mfab_device):
        pass
        # https://github.com/rapidsai/cuml
        # https://github.com/rapidsai/cudf
        #   maybe better for particles as a dataframe test
        # import cudf
        # import cuml
    
        # AMReX -> RAPIDSAI cuML
        # arr_cuml = ...
        # assert(arr_cuml.__cuda_array_interface__['data'][0] == arr.__cuda_array_interface__['data'][0])
        # TODO
    
    
    @pytest.mark.skipif(
        amr.Config.gpu_backend != "CUDA", reason="Requires AMReX_GPU_BACKEND=CUDA"
    )
    def test_mfab_dtoh_copy(mfab_device):
        class MfabPinnedContextManager:
            def __enter__(self):
                self.mfab = amr.MultiFab(
                    mfab_device.box_array(),
                    mfab_device.dm(),
                    mfab_device.n_comp(),
                    mfab_device.n_grow_vect(),
                    amr.MFInfo().set_arena(amr.The_Pinned_Arena()),
                )
                return self.mfab
    
            def __exit__(self, exc_type, exc_value, traceback):
                self.mfab.clear()
                del self.mfab
    
        with MfabPinnedContextManager() as mfab_host:
            mfab_host.set_val(42.0)
    
            amr.dtoh_memcpy(mfab_host, mfab_device)
    
            # assert all are 0.0 on host
            host_min = mfab_host.min(0)
            host_max = mfab_host.max(0)
            assert host_min == host_max
            assert host_max == 0.0
    
            dev_val = 11.0
            mfab_host.set_val(dev_val)
            amr.htod_memcpy(mfab_device, mfab_host)
    
            # assert all are 11.0 on device
            for n in range(mfab_device.n_comp()):
                assert mfab_device.min(comp=n) == dev_val
                assert mfab_device.max(comp=n) == dev_val
    
            # numpy bindings (w/ copy)
            local_boxes_host = mfab_device.to_numpy(copy=True)
            assert max([np.max(box) for box in local_boxes_host]) == dev_val
            del local_boxes_host
    
            # numpy bindings (w/ copy)
            for mfi in mfab_device:
                marr = mfab_device.array(mfi).to_numpy(copy=True)
                assert np.min(marr) >= dev_val
                assert np.max(marr) <= dev_val
    
            # cupy bindings (w/o copy)
            import cupy as cp
    
            local_boxes_device = mfab_device.to_cupy()
            assert max([cp.max(box) for box in local_boxes_device]) == dev_val
    

Particles

AMReX Particles are stored in the ParticleContainer class.

There are a few small differences to the iteration over a ParticleContainer compared to a MultiFab:

  • ParticleContainer is aware of mesh-refinement levels,

  • AMReX supports a variety of data layouts for particles (AoS and SoA + runtime SoA attributes), which requires a few more calls to access.

Here is the general structure for computing on particles:

# code-specific getter function, e.g.:
# pc = sim.get_particles()

# iterate over every mesh-refinement levels (no MR: lev=0)
for lvl in range(pc.finest_level + 1):
    # get every local chunk of particles
    for pti in pc.iterator(pc, level=lvl):
        # default layout: AoS with positions and cpuid
        # note: not part of the new PureSoA particle container layout
        aos = pti.aos().to_numpy()

        # additional compile-time and runtime attributes in SoA format
        soa = pti.soa().to_numpy()

        # notes:
        # Only the next lines are the "HOT LOOP" of the computation.
        # For efficiency, use numpy array operation for speed on CPUs.
        # For GPUs use .to_cupy() above and compute with cupy or numba.

        # print all particle ids in the chunk
        print(aos[()]["cpuid"])

        # write to all particles in the chunk
        aos[()]["x"] = 0.30
        aos[()]["y"] = 0.35
        aos[()]["z"] = 0.40

        for soa_real in soa.real:
            soa_real[()] = 42.0

        for soa_int in soa.int:
            soa_int[()] = 12

For many small CPU and GPU examples on how to compute on particles, see the following test cases:

  • Examples in test_particleContainer.py
    Listing 2 This files is in tests/test_particleContainer.py.
    # -*- coding: utf-8 -*-
    
    import importlib
    
    import numpy as np
    import pytest
    
    import amrex.space3d as amr
    
    
    @pytest.fixture()
    def Npart():
        return 21
    
    
    @pytest.fixture(scope="function")
    def empty_particle_container(std_geometry, distmap, boxarr):
        pc = amr.ParticleContainer_1_1_2_1_default(std_geometry, distmap, boxarr)
        return pc
    
    
    @pytest.fixture(scope="function")
    def std_particle():
        myt = amr.ParticleInitType_1_1_2_1()
        myt.real_struct_data = [0.5]
        myt.int_struct_data = [5]
        myt.real_array_data = [0.5, 0.2]
        myt.int_array_data = [1]
        return myt
    
    
    @pytest.fixture(scope="function")
    def particle_container(Npart, std_geometry, distmap, boxarr, std_real_box):
        pc = amr.ParticleContainer_1_1_2_1_default(std_geometry, distmap, boxarr)
        myt = amr.ParticleInitType_1_1_2_1()
        myt.real_struct_data = [0.5]
        myt.int_struct_data = [5]
        myt.real_array_data = [0.5, 0.2]
        myt.int_array_data = [1]
    
        iseed = 1
        pc.InitRandom(Npart, iseed, myt, False, std_real_box)
        return pc
    
    
    def test_particleInitType():
        myt = amr.ParticleInitType_1_1_2_1()
        print(myt.real_struct_data)
        print(myt.int_struct_data)
        print(myt.real_array_data)
        print(myt.int_array_data)
    
        myt.real_struct_data = [0.5]
        myt.int_struct_data = [5]
        myt.real_array_data = [0.5, 0.2]
        myt.int_array_data = [1]
    
        assert np.allclose(myt.real_struct_data, [0.5])
        assert np.allclose(myt.int_struct_data, [5])
        assert np.allclose(myt.real_array_data, [0.5, 0.2])
        assert np.allclose(myt.int_array_data, [1])
    
    
    def test_n_particles(particle_container, Npart):
        pc = particle_container
        assert pc.OK()
        assert pc.NStructReal == amr.ParticleContainer_1_1_2_1_default.NStructReal == 1
        assert pc.NStructInt == amr.ParticleContainer_1_1_2_1_default.NStructInt == 1
        assert pc.NArrayReal == amr.ParticleContainer_1_1_2_1_default.NArrayReal == 2
        assert pc.NArrayInt == amr.ParticleContainer_1_1_2_1_default.NArrayInt == 1
        assert (
            pc.NumberOfParticlesAtLevel(0) == np.sum(pc.NumberOfParticlesInGrid(0)) == Npart
        )
    
    
    def test_pc_init():
        pc = amr.ParticleContainer_1_1_2_1_default()
    
        print("bytespread", pc.ByteSpread())
        print("capacity", pc.PrintCapacity())
        print("NumberOfParticles", pc.NumberOfParticlesAtLevel(0))
        assert pc.NumberOfParticlesAtLevel(0) == 0
    
        bx = amr.Box(amr.IntVect(0, 0, 0), amr.IntVect(63, 63, 63))
        rb = amr.RealBox(0, 0, 0, 1, 1, 1)
        coord_int = 1  # RZ
        periodicity = [0, 0, 1]
        gm = amr.Geometry(bx, rb, coord_int, periodicity)
    
        ba = amr.BoxArray(bx)
        ba.max_size(32)
        dm = amr.DistributionMapping(ba)
    
        print("-------------------------")
        print("define particle container")
        pc.Define(gm, dm, ba)
        assert pc.OK()
        assert pc.NStructReal == amr.ParticleContainer_1_1_2_1_default.NStructReal == 1
        assert pc.NStructInt == amr.ParticleContainer_1_1_2_1_default.NStructInt == 1
        assert pc.NArrayReal == amr.ParticleContainer_1_1_2_1_default.NArrayReal == 2
        assert pc.NArrayInt == amr.ParticleContainer_1_1_2_1_default.NArrayInt == 1
    
        print("bytespread", pc.ByteSpread())
        print("capacity", pc.PrintCapacity())
        print("NumberOfParticles", pc.NumberOfParticlesAtLevel(0))
        assert pc.TotalNumberOfParticles() == pc.NumberOfParticlesAtLevel(0) == 0
        assert pc.OK()
    
        print("---------------------------")
        print("add a particle to each grid")
        Npart_grid = 1
        iseed = 1
        myt = amr.ParticleInitType_1_1_2_1()
        myt.real_struct_data = [0.5]
        myt.int_struct_data = [5]
        myt.real_array_data = [0.5, 0.2]
        myt.int_array_data = [1]
        pc.InitRandomPerBox(Npart_grid, iseed, myt)
        ngrid = ba.size
        npart = Npart_grid * ngrid
    
        print("NumberOfParticles", pc.NumberOfParticlesAtLevel(0))
        assert pc.TotalNumberOfParticles() == pc.NumberOfParticlesAtLevel(0) == npart
        assert pc.OK()
    
        print(f"Finest level = {pc.finest_level}")
    
        print("Iterate particle boxes & set values")
        # lvl = 0
        for lvl in range(pc.finest_level + 1):
            print(f"at level {lvl}:")
            for pti in pc.iterator(pc, level=lvl):
                print("...")
                assert pti.num_particles == 1
                assert pti.num_real_particles == 1
                assert pti.num_neighbor_particles == 0
                assert pti.level == lvl
                print(pti.pair_index)
                print(pti.geom(level=lvl))
    
                aos = pti.aos()
                aos_arr = aos.to_numpy()
                aos_arr[0]["x"] = 0.30
                aos_arr[0]["y"] = 0.35
                aos_arr[0]["z"] = 0.40
    
                # TODO: this seems to write into a copy of the data
                soa = pti.soa()
                real_arrays = soa.GetRealData()
                int_arrays = soa.GetIntData()
                real_arrays[0] = [0.55]
                real_arrays[1] = [0.22]
                int_arrays[0] = [2]
    
                assert np.allclose(real_arrays[0], np.array([0.55]))
                assert np.allclose(real_arrays[1], np.array([0.22]))
                assert np.allclose(int_arrays[0], np.array([2]))
    
        # read-only
        for lvl in range(pc.finest_level + 1):
            for pti in pc.const_iterator(pc, level=lvl):
                assert pti.num_particles == 1
                assert pti.num_real_particles == 1
                assert pti.num_neighbor_particles == 0
                assert pti.level == lvl
    
                aos = pti.aos()
                aos_arr = aos.to_numpy()
                assert aos[0].x == 0.30
                assert aos[0].y == 0.35
                assert aos[0].z == 0.40
                assert aos_arr[0]["z"] == 0.40
    
                soa = pti.soa()
                real_arrays = soa.GetRealData()
                int_arrays = soa.GetIntData()
                print(real_arrays[0])
                print(int_arrays[0])
                # TODO: this does not work yet and is still the original data
                # assert np.allclose(real_arrays[0], np.array([0.55]))
                # assert np.allclose(real_arrays[1], np.array([0.22]))
                # assert np.allclose(int_arrays[0], np.array([2]))
    
    
    def test_particle_init(Npart, particle_container):
        pc = particle_container
        assert (
            pc.NumberOfParticlesAtLevel(0) == np.sum(pc.NumberOfParticlesInGrid(0)) == Npart
        )
    
        # pc.resizeData()
        print(pc.numLocalTilesAtLevel(0))
        lev = pc.GetParticles(0)
        print(len(lev.items()))
        assert pc.numLocalTilesAtLevel(0) == len(lev.items())
        for tile_ind, pt in lev.items():
            print("tile", tile_ind)
            real_arrays = pt.GetStructOfArrays().GetRealData()
            int_arrays = pt.GetStructOfArrays().GetIntData()
            aos = pt.GetArrayOfStructs()
            aos_arr = aos.to_numpy()
            if len(real_arrays) > 0:
                assert np.isclose(real_arrays[0][0], 0.5) and np.isclose(
                    real_arrays[1][0], 0.2
                )
                assert isinstance(int_arrays[0][0], int)
                assert int_arrays[0][0] == 1
                assert isinstance(aos_arr[0]["rdata_0"], np.floating)
                assert isinstance(aos_arr[0]["idata_0"], np.integer)
                assert np.isclose(aos_arr[0]["rdata_0"], 0.5) and aos_arr[0]["idata_0"] == 5
    
                aos_arr[0]["idata_0"] = 2
                aos1 = pt.GetArrayOfStructs()
                print(aos1[0])
                print(aos[0])
                print(aos_arr[0])
                assert (
                    aos_arr[0]["idata_0"]
                    == aos[0].get_idata(0)
                    == aos1[0].get_idata(0)
                    == 2
                )
    
                print("soa test")
                real_arrays[1][0] = -1.2
    
                ra1 = pt.GetStructOfArrays().GetRealData()
                print(real_arrays)
                print(ra1)
                for ii, arr in enumerate(real_arrays):
                    assert np.allclose(arr.to_numpy(), ra1[ii].to_numpy())
    
                print("soa int test")
                iarr_np = int_arrays[0].to_numpy()
                iarr_np[0] = -3
                ia1 = pt.GetStructOfArrays().GetIntData()
                ia1_np = ia1[0].to_numpy()
                print(iarr_np)
                print(ia1_np)
                assert np.allclose(iarr_np, ia1_np)
    
        print(
            "---- is the particle tile recording changes or passed by reference? --------"
        )
        lev1 = pc.GetParticles(0)
        for tile_ind, pt in lev1.items():
            print("tile", tile_ind)
            real_arrays = pt.GetStructOfArrays().GetRealData()
            int_arrays = pt.GetStructOfArrays().GetIntData()
            aos = pt.GetArrayOfStructs()
            print(aos[0])
            assert aos[0].get_idata(0) == 2
            assert real_arrays[1][0] == -1.2
            assert int_arrays[0][0] == -3
    
    
    def test_per_cell(empty_particle_container, std_geometry, std_particle):
        pc = empty_particle_container
        pc.InitOnePerCell(0.5, 0.5, 0.5, std_particle)
        assert pc.OK()
    
        lev = pc.GetParticles(0)
        assert pc.numLocalTilesAtLevel(0) == len(lev.items())
    
        sum_1 = 0
        for tile_ind, pt in lev.items():
            print("tile", tile_ind)
            real_arrays = pt.GetStructOfArrays().GetRealData()
            sum_1 += np.sum(real_arrays[1])
        print(sum_1)
        ncells = std_geometry.Domain().numPts()
        print("ncells from box", ncells)
        print("NumberOfParticles", pc.NumberOfParticlesAtLevel(0))
        assert pc.TotalNumberOfParticles() == pc.NumberOfParticlesAtLevel(0) == ncells
        print("npts * real_1", ncells * std_particle.real_array_data[1])
        assert ncells * std_particle.real_array_data[1] == sum_1
    
    
    def test_pc_numpy(particle_container, Npart):
        """Used in docs/source/usage/compute.rst"""
        pc = particle_container
    
        # Manual: Compute PC START
        # code-specific getter function, e.g.:
        # pc = sim.get_particles()
    
        # iterate over every mesh-refinement levels (no MR: lev=0)
        for lvl in range(pc.finest_level + 1):
            # get every local chunk of particles
            for pti in pc.iterator(pc, level=lvl):
                # default layout: AoS with positions and cpuid
                # note: not part of the new PureSoA particle container layout
                aos = pti.aos().to_numpy()
    
                # additional compile-time and runtime attributes in SoA format
                soa = pti.soa().to_numpy()
    
                # notes:
                # Only the next lines are the "HOT LOOP" of the computation.
                # For efficiency, use numpy array operation for speed on CPUs.
                # For GPUs use .to_cupy() above and compute with cupy or numba.
    
                # print all particle ids in the chunk
                print(aos[()]["cpuid"])
    
                # write to all particles in the chunk
                aos[()]["x"] = 0.30
                aos[()]["y"] = 0.35
                aos[()]["z"] = 0.40
    
                for soa_real in soa.real:
                    soa_real[()] = 42.0
    
                for soa_int in soa.int:
                    soa_int[()] = 12
        # Manual: Compute PC END
    
    
    @pytest.mark.skipif(
        importlib.util.find_spec("pandas") is None, reason="pandas is not available"
    )
    def test_pc_df(particle_container, Npart):
        pc = particle_container
        print(f"pc={pc}")
        df = pc.to_df()
        print(df.columns)
        print(df)
    
    
    @pytest.mark.skipif(
        importlib.util.find_spec("pandas") is None, reason="pandas is not available"
    )
    def test_pc_empty_df(empty_particle_container, Npart):
        pc = empty_particle_container
        print(f"pc={pc}")
        df = pc.to_df()
        assert df is None
    
    
    @pytest.mark.skipif(
        importlib.util.find_spec("pandas") is None, reason="pandas is not available"
    )
    @pytest.mark.skipif(not amr.Config.have_mpi, reason="Requires AMReX_MPI=ON")
    def test_pc_df_mpi(particle_container, Npart):
        pc = particle_container
        print(f"pc={pc}")
        df = pc.to_df(local=False)
        if df is not None:
            # only rank 0
            print(df.columns)
            print(df)
    
  • Examples in test_aos.py
    Listing 3 This files is in tests/test_aos.py.
    # -*- coding: utf-8 -*-
    
    import numpy as np
    
    import amrex.space3d as amr
    
    
    def test_aos_init():
        aos = amr.ArrayOfStructs_2_1_default()
    
        assert aos.numParticles() == 0
        assert aos.numTotalParticles() == aos.numRealParticles() == 0
        assert aos.empty()
    
    
    def test_aos_push_pop():
        aos = amr.ArrayOfStructs_2_1_default()
        p1 = amr.Particle_2_1()
        p1.set_rdata([1.5, 2.2])
        p1.set_idata([3])
        aos.push_back(p1)
        p2 = amr.Particle_2_1()
        p2.set_rdata([2.1, 25.2])
        p2.set_idata([5])
        aos.push_back(p2)
    
        ## test aos.back()
        pback = aos.back()
        assert np.allclose(pback.get_rdata(), p2.get_rdata())
        assert np.allclose(pback.get_idata(), p2.get_idata())
        assert pback.x == p2.x and pback.y == p2.y and pback.z == p2.z
    
        assert aos.numParticles() == aos.numTotalParticles() == aos.numRealParticles() == 2
        assert aos.getNumNeighbors() == 0
        aos.setNumNeighbors(5)
        assert aos.getNumNeighbors() == 5
        assert not aos.empty()
        assert aos.size() == 7
        assert aos[0].get_rdata() == p1.get_rdata()
        p3 = amr.Particle_2_1()
        p3.set_rdata([3.14, -3.14])
        p3.set_idata([10])
        aos[0] = p3
        assert aos[0].get_idata() == p3.get_idata()
    
        aos.pop_back()
        assert aos.numParticles() == aos.numRealParticles() == 1
        assert aos.numTotalParticles() == 6
    
    
    def test_array_interface():
        aos = amr.ArrayOfStructs_2_1_default()
        p1 = amr.Particle_2_1()
        p1.setPos([1, 2, 3])
        p1.set_rdata([4.5, 5.2])
        p1.set_idata([6])
        aos.push_back(p1)
        p2 = amr.Particle_2_1()
        p2.setPos([8, 9, 10])
        p2.set_rdata([11.1, 12.2])
        p2.set_idata([13])
        aos.push_back(p2)
    
        print(aos[0])
        print(dir(aos[0]))
    
        # print('particle 1 from aos:\n',aos[0])
        # print('particle 2 from aos:\n',aos[1])
        # print('array interface\n', aos.__array_interface__)
        arr = aos.to_numpy()
        assert (
            np.isclose(arr[0][0], 1.0)
            and np.isclose(arr[0][4], 5.2)
            and np.isclose(arr[0][6], 6)
        )
        assert (
            np.isclose(arr[1][2], 10)
            and np.isclose(arr[1][3], 11.1)
            and np.isclose(arr[1][6], 13)
        )
    
        p3 = amr.Particle_2_1(x=-3)
        p4 = amr.Particle_2_1(y=-5)
        print(arr)
        print(aos[0], aos[1])
        print("-------")
        aos[0] = p4
        assert aos[0].x == 0
        assert aos[0].y == -5
        aos[0] = p3
        print("array:", arr)
        print("aos[0]:", aos[0], "aos[1]:", aos[1])
        assert aos[0].x == arr[0][0] == -3
        assert aos[0].y == arr[0][1] == 0
        assert aos[0].z == arr[0][2] == 0
    
        shape = amr.Config.spacedim + amr.Particle_2_1.NReal + amr.Particle_2_1.NInt + 1
        for ii in range(shape):
            arr[1][ii] = 0
        arr[1][1] = -5  # np.array([0, -5, 0,0,0,0,0])
        print("array:", arr)
        print("aos[0]:", aos[0], "aos[1]:", aos[1])
        assert aos[1].y == arr[1][1] == -5
        assert aos[1].x == arr[1][0] == 0
        assert aos[1].z == arr[1][2] == 0
    
  • Examples in test_soa.py
    Listing 4 This files is in tests/test_soa.py.
    # -*- coding: utf-8 -*-
    
    import numpy as np
    
    import amrex.space3d as amr
    
    
    def test_soa_init():
        soa = amr.StructOfArrays_2_1_default()
        print("--test init --")
        print("num real components", soa.num_real_comps)
        print("num int components", soa.num_int_comps)
        assert soa.num_real_comps == 2 and soa.num_int_comps == 1
    
        soa.define(1, 3)
        print("--test define --")
        print("num real components", soa.num_real_comps)
        print("num int components", soa.num_int_comps)
        assert soa.num_real_comps == 3 and soa.num_int_comps == 4
        print("num particles", soa.numParticles())
        print("num real particles", soa.numRealParticles())
        print("num totalparticles", soa.numTotalParticles())
        print("num Neighbors", soa.getNumNeighbors())
        print("soa size", soa.size())
        assert soa.numParticles() == soa.numRealParticles() == 0
        assert soa.size() == soa.numTotalParticles() == 0
        assert soa.getNumNeighbors() == 0
    
        soa.resize(5)
        print("--test resize --")
        print("num particles", soa.numParticles())
        print("num real particles", soa.numRealParticles())
        print("num totalparticles", soa.numTotalParticles())
        print("num Neighbors", soa.getNumNeighbors())
        print("soa size", soa.size())
        assert soa.numParticles() == soa.numRealParticles() == 5
        assert soa.size() == soa.numTotalParticles() == 5
        assert soa.getNumNeighbors() == 0
    
        soa.setNumNeighbors(3)
        print("--test set neighbor num--")
        print("num particles", soa.numParticles())
        print("num real particles", soa.numRealParticles())
        print("num totalparticles", soa.numTotalParticles())
        print("num Neighbors", soa.getNumNeighbors())
        print("soa size", soa.size())
        assert soa.numParticles() == soa.numRealParticles() == 5
        assert soa.size() == soa.numTotalParticles() == 8
        assert soa.getNumNeighbors() == 3
    
    
    def test_soa_from_tile():
        pt = amr.ParticleTile_1_1_2_1_default()
        p = amr.Particle_1_1(1.0, 2.0, 3, rdata_0=4.0, idata_1=5)
        sp = amr.Particle_3_2(
            5.0, 6.0, 7.0, rdata_0=8.0, rdata_1=9.0, rdata_2=10.0, idata_0=11, idata_1=12
        )
        pt.push_back(p)
        pt.push_back(sp)
    
        soa = pt.GetStructOfArrays()
        print("num particles", soa.numParticles())
        print("num real particles", soa.numRealParticles())
        print("num totalparticles", soa.numTotalParticles())
        print("num Neighbors", soa.getNumNeighbors())
        print("soa size", soa.size())
        assert soa.numParticles() == soa.size() == 2
        assert soa.numTotalParticles() == soa.numRealParticles() == 2
        assert soa.getNumNeighbors() == 0
    
        real_arrays = soa.GetRealData()
        int_arrays = soa.GetIntData()
        print(real_arrays)
        assert np.isclose(real_arrays[0][1], 9) and np.isclose(real_arrays[1][1], 10)
        assert isinstance(int_arrays[0][0], int)
        assert int_arrays[0][1] == 12
    
        real_arrays[1][0] = -1.2
    
        ra1 = soa.GetRealData()
        print(real_arrays)
        print(ra1)
        for ii, arr in enumerate(real_arrays):
            assert np.allclose(arr.to_numpy(), ra1[ii].to_numpy())
    
        print("soa int test")
        iarr_np = int_arrays[0].to_numpy()
        iarr_np[0] = -3
        ia1 = soa.GetIntData()
        ia1_np = ia1[0].to_numpy()
        print(iarr_np)
        print(ia1_np)
        assert np.allclose(iarr_np, ia1_np)
    

Other C++ Calls

pyAMReX exposes many more C++-written and GPU-accelerated AMReX functions for MultiFab and particles to Python, which can be used to set, reduce, rescale, redistribute, etc. contained data. Check out the detailed API docs for more details and use further third party libraries at will on the exposed data structures, replacing the hot loops described above.

You can also leave the Python world in pyAMReX and go back to C++ whenever needed. For some applications, pyAMReX might act as scriptable glue that transports fields and particles from one (C++) function to another without recompilation, by exposing the functions and methods to Python.