.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_tutorial_03_api.py:


HeteroCL Compute APIs
=====================

**Author**: Yi-Hsiang Lai (seanlatias@github)

In this tutorial, we will show more HeteroCL compute APIs. These APIs are used
to build the algorithm. Note that in HeteroCL, the compute APIs can be used
along with the imperative DSL.

.. code-block:: default

    import heterocl as hcl

``hcl.compute``
---------------

We have introduced this API before. It returns a **new tensor** whose values
are defined in an elementwise fashion. The API's prototype is shown below.

``compute(shape, fcompute, name, dtype)``

``shape`` defines the shape of the output tensor. ``fcompute`` is a lambda
function that describes the elementwise definition. ``name`` and ``dtype`` are
optional. We show an example below.

.. code-block:: default

    hcl.init()

    A = hcl.placeholder((10,), "A")
    B = hcl.placeholder((10,), "B")

    def compute_example(A, B):
        return hcl.compute(A.shape, lambda x: A[x]+B[x], "C")

    s = hcl.create_schedule([A, B], compute_example)
    print(hcl.lower(s))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    // attr [_top] storage_scope = "global"
    allocate _top[int32 * 1]
    produce _top {
      // attr [0] extern_scope = 0
      produce C {
        // attr [0] extern_scope = 0
        for (x, 0, 10) {
          C[x] = int32((int33(A[x]) + int33(B[x])))
        }
      }
    }

``hcl.update``
--------------

This API is similar to `hcl.compute` in that it defines how you **update a
tensor** in an elementwise fashion. Note that this API does not return a new
tensor; more specifically, the return value is `None`.

``hcl.update(tensor, fupdate, name)``

``tensor`` is the tensor we want to update. ``fupdate`` is a lambda function
that describes the elementwise update behavior. ``name`` is optional.

We show an example below that performs a computation similar to
`compute_example`. The difference is that instead of returning a new tensor
`C`, we pass it in as an input and update it in place. We can see that the
generated IR is almost the same.

.. code-block:: default

    hcl.init()

    A = hcl.placeholder((10,), "A")
    B = hcl.placeholder((10,), "B")
    C = hcl.placeholder((10,), "C")

    def update_example(A, B, C):
        hcl.update(C, lambda x: A[x]+B[x], "U")

    s = hcl.create_schedule([A, B, C], update_example)
    print(hcl.lower(s))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    // attr [_top] storage_scope = "global"
    allocate _top[int32 * 1]
    produce _top {
      // attr [0] extern_scope = 0
      // attr [U] storage_scope = "global"
      allocate U[int32 * 1]
      produce U {
        // attr [0] extern_scope = 0
        for (x, 0, 10) {
          C[x] = int32((int33(A[x]) + int33(B[x])))
        }
      }
    }

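Both examples so far only print the lowered IR. A kernel can also be compiled
and executed on NumPy inputs with `hcl.build` and `hcl.asarray`, which are
used in the last example of this tutorial as well. The following is a minimal
sketch for `update_example`; the random input values and the 10-element
shapes are purely illustrative.

.. code-block:: python

    import numpy as np
    import heterocl as hcl

    hcl.init()

    A = hcl.placeholder((10,), "A")
    B = hcl.placeholder((10,), "B")
    C = hcl.placeholder((10,), "C")

    def update_example(A, B, C):
        hcl.update(C, lambda x: A[x] + B[x], "U")

    s = hcl.create_schedule([A, B, C], update_example)
    f = hcl.build(s)

    np_A = np.random.randint(10, size=(10,))
    np_B = np.random.randint(10, size=(10,))
    hcl_A = hcl.asarray(np_A)
    hcl_B = hcl.asarray(np_B)
    hcl_C = hcl.asarray(np.zeros((10,), dtype=np.int32))

    # run the built kernel; C is updated in place
    f(hcl_A, hcl_B, hcl_C)
    assert np.array_equal(hcl_C.asnumpy(), np_A + np_B)

Since `hcl.update` returns `None`, the result is read back from the `C`
buffer rather than from a returned tensor.
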
``hcl.mutate``
--------------

This API allows users to describe any loop with vector code, even if the loop
body does not follow a common pattern or contains imperative DSL. This API is
useful when we want to perform optimization.

``hcl.mutate(domain, fbody, name)``

``domain`` describes the iteration domain of our original `for` loop.
``fbody`` is the body statement of the `for` loop. ``name`` is optional.

We can describe the same computation as in the previous two examples using
this API.

.. code-block:: default

    hcl.init()

    A = hcl.placeholder((10,), "A")
    B = hcl.placeholder((10,), "B")
    C = hcl.placeholder((10,), "C")

    def mut_example(A, B, C):
        def loop_body(x):
            C[x] = A[x] + B[x]
        hcl.mutate((10,), lambda x: loop_body(x), "M")

    s = hcl.create_schedule([A, B, C], mut_example)
    print(hcl.lower(s))

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    // attr [_top] storage_scope = "global"
    allocate _top[int32 * 1]
    produce _top {
      // attr [0] extern_scope = 0
      // attr [M] storage_scope = "global"
      allocate M[int32 * 1]
      produce M {
        // attr [0] extern_scope = 0
        for (x, 0, 10) {
          C[x] = int32((int33(A[x]) + int33(B[x])))
        }
      }
    }

Note that in this example, we are not allowed to write the assignment
statement directly inside the lambda function; this is forbidden by Python
syntax rules.

Combine Imperative DSL with Compute APIs
----------------------------------------

HeteroCL allows users to write mixed-paradigm applications. This is common
when performing reduction operations. Although HeteroCL provides APIs for
simple reduction operations, such as summation and finding the maximum value,
more complex reduction operations, such as sorting, need to be described
manually. Below we show an example that finds the two largest values in a
tensor.

.. code-block:: default

    hcl.init()

    A = hcl.placeholder((10,), "A")
    M = hcl.placeholder((2,), "M")

    def find_max_two(A, M):
        def loop_body(x):
            with hcl.if_(A[x] > M[0]):
                with hcl.if_(A[x] > M[1]):
                    M[0] = M[1]
                    M[1] = A[x]
                with hcl.else_():
                    M[0] = A[x]
        hcl.mutate(A.shape, lambda x: loop_body(x))

    s = hcl.create_schedule([A, M], find_max_two)
    f = hcl.build(s)

    import numpy as np

    hcl_A = hcl.asarray(np.random.randint(50, size=(10,)))
    hcl_M = hcl.asarray(np.array([-1, -1]))

    f(hcl_A, hcl_M)

    np_A = hcl_A.asnumpy()
    np_M = hcl_M.asnumpy()

    print(np_A)
    print(np_M)

    assert np.array_equal(np_M, np.sort(np_A)[-2:])

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    [33 35 42 31 44 9 32 32 30 31]
    [42 44]

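Finally, for the simple reductions mentioned above (such as summation), a
manual `hcl.mutate` loop is not necessary. The sketch below computes the sum
of a tensor; it assumes HeteroCL's ``hcl.reduce_axis``/``hcl.sum`` reduction
interface, and the tensor names and the one-element output shape are
illustrative choices only.

.. code-block:: python

    import numpy as np
    import heterocl as hcl

    hcl.init()

    A = hcl.placeholder((10,), "A")

    def sum_example(A):
        # r spans the reduction domain [0, 10)
        r = hcl.reduce_axis(0, 10, "r")
        # a one-element tensor that accumulates the sum of A
        return hcl.compute((1,), lambda x: hcl.sum(A[r], axis=r), "S")

    s = hcl.create_schedule([A], sum_example)
    f = hcl.build(s)

    np_A = np.random.randint(10, size=(10,))
    hcl_A = hcl.asarray(np_A)
    hcl_S = hcl.asarray(np.zeros((1,), dtype=np.int32))

    f(hcl_A, hcl_S)
    assert hcl_S.asnumpy()[0] == np_A.sum()

For more involved patterns such as `find_max_two`, the manual `hcl.mutate`
formulation shown above remains the way to express the reduction.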