heterocl.compute_api

heterocl.compute(shape, fcompute[, name, …])

Construct a new tensor based on the shape and the compute function.

heterocl.update(tensor, fcompute[, name])

Update an existing tensor according to the compute function.

heterocl.mutate(domain, fcompute[, name])

Perform a computation repeatedly in the given mutation domain.

heterocl.scalar([init, name, dtype])

Syntactic sugar for a single-element tensor.

heterocl.copy(tensor[, name, dtype])

Syntactic sugar for copying an existing tensor.

heterocl.unpack(tensor[, axis, factor, …])

Unpack a tensor with larger bitwidth to a tensor with smaller bitwidth.

heterocl.pack(tensor[, axis, factor, name, …])

Pack a tensor with smaller bitwidth to a tensor with larger bitwidth.

heterocl.reduce_axis(lower, upper[, name])

Create a reduction axis for reduction operations.

heterocl.reducer(init, freduce[, dtype, name])

Create a reducer for a reduction operation.

heterocl.sum(expr, axis[, where, name, dtype])

Compute the sum of the given expression on axis.

heterocl.max(expr, axis[, where, name, dtype])

Compute the max of the given expression on axis.

compute(shape, fcompute, name=None, dtype=None, attrs={})[source]

Construct a new tensor based on the shape and the compute function.

The API returns a new tensor. The shape must be a tuple; the number of elements in the tuple determines the dimension of the returned tensor. The second field, fcompute, defines the construction rule of the returned tensor and must be callable. The number of arguments to fcompute should match the dimension defined by shape. We do not check this, however, which gives users more programming flexibility.

The compute function specifies how we calculate each element of the returned tensor. It can contain other HeteroCL APIs, including the imperative DSL.

Parameters
  • shape (tuple) – The shape of the returned tensor

  • fcompute (callable) – The construction rule for the returned tensor

  • name (str, optional) – The name of the returned tensor

  • dtype (Type, optional) – The data type of the returned tensor

Returns

Return type

Tensor

Examples

# example 1.1 - anonymous lambda function
A = hcl.compute((10, 10), lambda x, y: x+y)

# equivalent code
for x in range(0, 10):
    for y in range(0, 10):
        A[x][y] = x + y

# example 1.2 - explicit function
def addition(x, y):
    return x+y
A = hcl.compute((10, 10), addition)

# example 1.3 - imperative function definition
@hcl.def_([(), ()])
def addition(x, y):
    hcl.return_(x+y)
A = hcl.compute((10, 10), addition)

# example 2 - undetermined arguments
def compute_tanh(X):
    return hcl.compute(X.shape, lambda *args: hcl.tanh(X[args]))

A = hcl.placeholder((10, 10))
B = hcl.placeholder((10, 10, 10))
tA = compute_tanh(A)
tB = compute_tanh(B)

# example 3 - mixed-paradigm programming
def return_max(x, y):
    with hcl.if_(x > y):
        hcl.return_(x)
    with hcl.else_():
        hcl.return_(y)
A = hcl.compute((10, 10), return_max)
update(tensor, fcompute, name=None)[source]

Update an existing tensor according to the compute function.

This API updates an existing tensor in place; no new tensor is returned. The shape and data type stay the same after the update. For more details on fcompute, please check compute.

Parameters
  • tensor (Tensor) – The tensor to be updated

  • fcompute (callable) – The update rule

  • name (str, optional) – The name of the update operation

Returns

Return type

None
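The reference gives no example for update; as a sketch, its semantics can be modeled in plain Python (illustrative code in the style of the "equivalent code" snippets below, not the HeteroCL API itself):

```python
# Plain-Python model of hcl.update semantics: every element of an
# existing 2-D "tensor" is overwritten in place using the rule fcompute;
# nothing new is returned, mirroring hcl.update.
A = [[0] * 3 for _ in range(3)]

def fcompute(x, y):
    return x + y  # the update rule, same form as in hcl.compute

for x in range(3):
    for y in range(3):
        A[x][y] = fcompute(x, y)
```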

mutate(domain, fcompute, name=None)[source]

Perform a computation repeatedly in the given mutation domain.

This API allows users to write a loop in a tensorized way, which makes it easier to exploit parallelism when performing optimizations. The rules for the computation function are the same as those of compute.

Parameters
  • domain (tuple) – The mutation domain

  • fcompute (callable) – The computation function that will be performed repeatedly

  • name (str, optional) – The name of the operation

Returns

Return type

None

Examples

# this example finds the two largest numbers in A and stores them in M
# (M[0] holds the smaller of the two)

A = hcl.placeholder((10,))
M = hcl.placeholder((2,))

def loop_body(x):
    with hcl.if_(A[x] > M[0]):
        with hcl.if_(A[x] > M[1]):
            M[0] = M[1]
            M[1] = A[x]
        with hcl.else_():
            M[0] = A[x]
hcl.mutate(A.shape, lambda x: loop_body(x))
scalar(init=0, name=None, dtype=None)[source]

Syntactic sugar for a single-element tensor.

This is equivalent to hcl.compute((1,), lambda x: init, name, dtype)

Parameters
  • init (Expr, optional) – The initial value for the returned tensor. The default value is 0.

  • name (str, optional) – The name of the returned tensor

  • dtype (Type, optional) – The data type of the returned tensor

Returns

Return type

Tensor
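As the equivalence above states, hcl.scalar(init) behaves like a one-element tensor. A plain-Python sketch of that behavior (illustrative only, not the real implementation):

```python
# hcl.scalar(init) is documented as hcl.compute((1,), lambda x: init);
# modeled in plain Python, the result is a one-element container.
def scalar(init=0):
    return [init]  # shape (1,)

s = scalar(5)
# read and write through index 0, as with a single-element tensor
s[0] = s[0] + 1
```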

copy(tensor, name=None, dtype=None)[source]

Syntactic sugar for copying an existing tensor.

Parameters
  • tensor (Tensor or list or numpy.ndarray) – The tensor to be copied from

  • name (str, optional) – The name of the returned tensor

  • dtype (Type, optional) – The data type of the returned tensor. For a Python list, the data type is not inferred and should be provided; for a NumPy array, it is determined from the array's dtype

Returns

Return type

Tensor

Examples

# example 1 - copy from a HeteroCL tensor
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B1 = hcl.copy(A, "B1")

# example 2 - copy from a Python list
pA = [[1, 2, 3], [4, 5, 6]]
# The data type is NOT inferred from the list
B2 = hcl.copy(pA, "B2", hcl.Int())

# example 3 - copy from a Numpy array
nA = numpy.array(pA)
# The data type is determined by using nA.dtype
B3 = hcl.copy(nA, "B3")
unpack(tensor, axis=0, factor=None, name=None, dtype=None)[source]

Unpack a tensor with larger bitwidth to a tensor with smaller bitwidth.

This API unpacks the axis-th dimension of tensor into a new tensor according to the given factor or dtype. The number of dimensions stays the same after unpacking. Once factor is specified, dtype is not taken into consideration. If factor is not specified, the data type is determined in the following order: first, the data type specified by the quantization scheme; second, the dtype argument, if given; finally, the data type specified via the heterocl.init API. Since we are performing an unpacking operation, the number of resulting elements is larger than the number of elements in the input tensor. Namely, the factor should be greater than or equal to 1.

Parameters
  • tensor (Tensor) – The tensor to be unpacked

  • axis (int, optional) – The dimension to be unpacked

  • factor (int, optional) – The unpack factor

  • name (str, optional) – The name of the unpacked tensor

  • dtype (Type, optional) – The data type of the unpacked tensor

Returns

Return type

Tensor

Examples

# example 1.1 - unpack with factor
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B = hcl.unpack(A, factor=4)
print(B.shape) # (40,)
print(B.dtype) # "uint8"

# example 1.2 - unpack with dtype
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B = hcl.unpack(A, dtype=hcl.UInt(8))
# the results are the same as example 1.1

# example 1.3 - unpack with quantization scheme
A = hcl.placeholder((10,), "A", hcl.UInt(32))
def unpack_A(A):
    return hcl.unpack(A, name="B")
s = hcl.create_scheme(A, unpack_A)
s.downsize(unpack_A.B, hcl.UInt(8))
# the results are the same as example 1.1

# example 2 - unpack multi-dimensional tensor
A = hcl.placeholder((10, 10), "A", hcl.UInt(32))
B = hcl.unpack(A, factor=4)         # B.shape = (40, 10)
C = hcl.unpack(A, axis=1, factor=4) # C.shape = (10, 40)
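The bit-level behavior of example 1.1 can be sketched in plain Python. The sub-word ordering shown here (element 0 takes the low bits) is an assumption for illustration; the reference above does not state the ordering:

```python
# Plain-Python sketch of unpacking one uint32 word into four uint8 pieces.
# Low-bits-first ordering is an assumption, not documented behavior.
def unpack_word(word, factor=4, width=32):
    small = width // factor       # bitwidth of each piece (8 here)
    mask = (1 << small) - 1
    return [(word >> (small * j)) & mask for j in range(factor)]

pieces = unpack_word(0x11223344)
# low byte first under this assumption: [0x44, 0x33, 0x22, 0x11]
```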
pack(tensor, axis=0, factor=None, name=None, dtype=None)[source]

Pack a tensor with smaller bitwidth to a tensor with larger bitwidth.

This API packs the axis-th dimension of tensor to a new tensor according to the given factor or dtype. The usage is the same as unpack.

Parameters
  • tensor (Tensor) – The tensor to be packed

  • axis (int, optional) – The dimension to be packed

  • factor (int, optional) – The pack factor

  • name (str, optional) – The name of the packed tensor

  • dtype (Type, optional) – The data type of the packed tensor

Returns

Return type

Tensor
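Since the reference says pack mirrors unpack but shows no example, here is a plain-Python sketch of the reverse operation, under the same low-bits-first ordering assumption as the unpack sketch (illustrative only):

```python
# Plain-Python sketch of packing groups of uint8 pieces into uint32 words.
# Low-bits-first ordering is an assumption, not documented behavior.
def pack_words(pieces, factor=4, small=8):
    packed = []
    for i in range(0, len(pieces), factor):
        word = 0
        for j in range(factor):
            word |= pieces[i + j] << (small * j)  # piece j fills bits [small*j, small*(j+1))
        packed.append(word)
    return packed

w = pack_words([0x44, 0x33, 0x22, 0x11])  # one 32-bit word
```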

reduce_axis(lower, upper, name=None)[source]

Create a reduction axis for reduction operations.

The upper- and lower-bound of the range can be arbitrary integers. However, the upper-bound should be greater than the lower-bound.

Parameters
  • lower (Expr) – The lower-bound of the reduction domain

  • upper (Expr) – The upper-bound of the reduction domain

  • name (str, optional) – The name of the reduction axis

Returns

Return type

IterVar

reducer(init, freduce, dtype='int32', name=None)[source]

Create a reducer for a reduction operation.

This API creates a reducer from the initial value init and the reduction function freduce. The initial value can be either an expression or a tensor. With the reducer, users can create a reduction operation, in which they can further specify the input to be reduced (expr), its axis (axis), and a filtering condition (where). The general rule of the reduction operation is shown below. Note that for the reduction function, the first argument is the input and the second argument is the accumulator. Moreover, if the accumulator is an expression, the reduction function should return an expression; if the accumulator is a list or a tensor, the reduction function should not return anything.

# this can be a tensor
output = init
# the specified reduction axis
for i in reduction_domain:
    if (where):
        output = freduce(input[..., i, ...], output)

Users can further specify the data type for the reduction operation. For a multi-dimensional reduction operation, users can have multiple reduce axes. In this case, we can write them together in a list.

Parameters
  • init (Expr or Tensor) – The initial value of the accumulator

  • freduce (callable) – The reduction function that takes in two arguments. The first argument is the new input value and the second argument is the accumulator

  • dtype (Type, optional) – The data type of the accumulator

  • name (str, optional) – The name of the generated reducer

Returns

Return type

callable

See also

sum(), max()

Examples

# example 1.1 - basic reduction : summation
my_sum = hcl.reducer(0, lambda x, y: x+y)
A = hcl.placeholder((10,))
r = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r))

# equivalent code
B[0] = 0
for r in range(0, 10):
    B[0] = A[r] + B[0]

# example 1.2 - with condition
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r, where=A[r]>5))

# equivalent code
B[0] = 0
for r in range(0, 10):
    if A[r] > 5:
        B[0] = A[r] + B[0]

# example 1.3 - with data type specification
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r, dtype=hcl.UInt(4)))
# the output will be downsized to UInt(4)

# example 2 - a more complicated reduction
# x is the input, y is the accumulator
def my_reduction(x, y):
    with hcl.if_(x > 5):
        hcl.return_(y + x)
    with hcl.else_():
        hcl.return_(y - x)
my_sum = hcl.reducer(0, my_reduction)
A = hcl.placeholder((10,))
r = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r))

# equivalent code
B[0] = 0
for r in range(0, 10):
    if A[r] > 5:
        B[0] = B[0] + A[r]
    else:
        B[0] = B[0] - A[r]

# example 3 - multiple reduce axes
A = hcl.placeholder((10, 10))
r1 = hcl.reduce_axis(0, 10)
r2 = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r1, r2], axis=[r1, r2]))

# equivalent code
B[0] = 0
for r1 in range(0, 10):
    for r2 in range(0, 10):
        B[0] = A[r1, r2] + B[0]

# example 4 - write a sorting algorithm with reduction
init = hcl.compute((10,), lambda x: 11)
def freduce(x, Y):
    with hcl.for_(0, 10) as i:
        with hcl.if_(x < Y[i]):
            with hcl.for_(9, i, -1) as j:
                Y[j] = Y[j-1]
            Y[i] = x
            hcl.break_()
my_sort = hcl.reducer(init, freduce)
A = hcl.placeholder((10, 10))
r = hcl.reduce_axis(0, 10)
# note that we need to use the underscore to mark the reduction axis
B = hcl.compute(A.shape, lambda _x, y: my_sort(A[r, y], axis=r))
sum(expr, axis, where=True, name='sum', dtype='int32')

Compute the sum of the given expression on axis.

Parameters
  • expr (Expr) – The expression to be reduced

  • axis (IterVar) – The axis to be reduced

  • where (Expr, optional) – The filtering condition for the reduction

  • name (str, optional) – The name of the accumulator

  • dtype (Type, optional) – The data type of the accumulator

Returns

Return type

Expr

See also

reducer()
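sum follows the general reduction rule documented under reducer, with addition as freduce and 0 as the initial value. A plain-Python sketch of that rule (illustrative only, not the HeteroCL implementation):

```python
# Plain-Python model of the documented reduction rule for sum:
# output = init; for each value on the axis: if where holds,
# output = freduce(value, output), with freduce(x, y) = x + y.
def reduce_sum(values, where=lambda v: True, init=0):
    acc = init
    for v in values:
        if where(v):
            acc = v + acc
    return acc

total = reduce_sum([1, 2, 3, 4])                          # 10
filtered = reduce_sum(range(10), where=lambda v: v > 5)   # 6+7+8+9 = 30
```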

max(expr, axis, where=True, name='max', dtype='int32')

Compute the max of the given expression on axis.

Parameters
  • expr (Expr) – The expression to be reduced

  • axis (IterVar) – The axis to be reduced

  • where (Expr, optional) – The filtering condition for the reduction

  • name (str, optional) – The name of the accumulator

  • dtype (Type, optional) – The data type of the accumulator

Returns

Return type

Expr

See also

reducer()
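Like sum, max follows the general reduction rule under reducer, with the running maximum as the accumulator. A plain-Python sketch (illustrative only; the initial value of negative infinity is an assumption for illustration, not the documented behavior for fixed-width types):

```python
# Plain-Python model of the max reduction rule: each element that
# passes the `where` filter replaces the accumulator when larger.
def reduce_max(values, where=lambda v: True, init=float("-inf")):
    acc = init
    for v in values:
        if where(v) and v > acc:
            acc = v
    return acc

m = reduce_max([3, 7, 2, 9, 5])                              # 9
m_cond = reduce_max([3, 7, 2, 9, 5], where=lambda v: v < 8)  # 7
```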