heterocl.compute_api

- compute: Construct a new tensor based on the shape and the compute function.
- update: Update an existing tensor according to the compute function.
- mutate: Perform a computation repeatedly in the given mutation domain.
- scalar: A syntactic sugar for a single-element tensor.
- copy: A syntactic sugar for copying an existing tensor.
- unpack: Unpack a tensor with larger bitwidth to a tensor with smaller bitwidth.
- pack: Pack a tensor with smaller bitwidth to a tensor with larger bitwidth.
- reduce_axis: Create a reduction axis for reduction operations.
- reducer: Create a reducer for a reduction operation.
- sum: Compute the sum of the given expression on axis.
- max: Compute the max of the given expression on axis.
compute(shape, fcompute, name=None, dtype=None, attrs={})
Construct a new tensor based on the shape and the compute function.
The API returns a new tensor. The shape must be a tuple; the number of elements in the tuple decides the dimension of the returned tensor. The second argument fcompute defines the construction rule of the returned tensor and must be callable. The number of arguments of fcompute should match the dimension defined by shape; this is not checked, however, which gives users more programming flexibility.
The compute function specifies how we calculate each element of the returned tensor. It can contain other HeteroCL APIs, including the imperative DSL.
- Parameters
- Returns
- Return type
Examples
# example 1.1 - anonymous lambda function
A = hcl.compute((10, 10), lambda x, y: x+y)
# equivalent code
for x in range(0, 10):
    for y in range(0, 10):
        A[x][y] = x + y

# example 1.2 - explicit function
def addition(x, y):
    return x+y
A = hcl.compute((10, 10), addition)

# example 1.3 - imperative function definition
@hcl.def_([(), ()])
def addition(x, y):
    hcl.return_(x+y)
A = hcl.compute((10, 10), addition)

# example 2 - undetermined arguments
def compute_tanh(X):
    return hcl.compute(X.shape, lambda *args: hcl.tanh(X[args]))

A = hcl.placeholder((10, 10))
B = hcl.placeholder((10, 10, 10))
tA = compute_tanh(A)
tB = compute_tanh(B)

# example 3 - mixed-paradigm programming
def return_max(x, y):
    with hcl.if_(x > y):
        hcl.return_(x)
    with hcl.else_():
        hcl.return_(y)
A = hcl.compute((10, 10), return_max)
update(tensor, fcompute, name=None)
Update an existing tensor according to the compute function.
This API updates an existing tensor; no new tensor is returned. The shape and data type stay the same after the update. For more details on fcompute, please check compute.
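The in-place semantics can be sketched in plain Python (a hedged illustration, not HeteroCL code; the function and variable names here are hypothetical):

```python
def update(tensor, fcompute):
    # In-place update sketch for a 1-D tensor represented as a Python list:
    # each element is overwritten with fcompute(index), and nothing is
    # returned, mirroring the "no new tensor" behavior described above.
    for x in range(len(tensor)):
        tensor[x] = fcompute(x)

A = [1, 2, 3, 4]
update(A, lambda x: A[x] + 1)  # A is now [2, 3, 4, 5]
```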
mutate(domain, fcompute, name=None)
Perform a computation repeatedly in the given mutation domain.
This API allows users to write a loop in a tensorized way, which makes it easier to exploit parallelism when performing optimizations. The rules for the computation function are the same as those of compute.
- Parameters
- Returns
- Return type
Examples
# this example finds the max two numbers in A and stores them in M
A = hcl.placeholder((10,))
M = hcl.placeholder((2,))
def loop_body(x):
    with hcl.if_(A[x] > M[0]):
        with hcl.if_(A[x] > M[1]):
            M[0] = M[1]
            M[1] = A[x]
        with hcl.else_():
            M[0] = A[x]
hcl.mutate(A.shape, lambda x: loop_body(x))
scalar(init=0, name=None, dtype=None)
A syntactic sugar for a single-element tensor.
This is equivalent to
hcl.compute((1,), lambda x: init, name, dtype)
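As a hedged plain-Python sketch (not HeteroCL code; the names are hypothetical), the sugar amounts to building a one-element container and accessing it through index 0:

```python
def scalar(init=0):
    # Sketch of the sugar: a one-element "tensor" (a Python list here)
    # holding init, i.e. what compute((1,), lambda x: init) would build.
    return [init]

s = scalar(5)
s[0] = s[0] + 1  # read and write through index 0, like a tensor element
```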
copy(tensor, name=None, dtype=None)
A syntactic sugar for copying an existing tensor.
- Parameters
tensor (Tensor or list or numpy.ndarray) – The tensor to be copied from
name (str, optional) – The name of the returned tensor
- Returns
- Return type
Examples
# example 1 - copy from a HeteroCL tensor
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B1 = hcl.copy(A, "B1")

# example 2 - copy from a Python list
pA = [[1, 2, 3], [4, 5, 6]]
# The data type is NOT inferred from the list
B2 = hcl.copy(pA, "B2", hcl.Int())

# example 3 - copy from a Numpy array
nA = numpy.array(pA)
# The data type is determined by using nA.dtype
B3 = hcl.copy(nA, "B3")
unpack(tensor, axis=0, factor=None, name=None, dtype=None)
Unpack a tensor with larger bitwidth to a tensor with smaller bitwidth.
This API unpacks the axis-th dimension of tensor to a new tensor according to the given factor or dtype. The number of dimensions stays the same after unpacking. Once factor is specified, dtype is not taken into consideration. If factor is not specified, the data type is determined in the following order: first, the data type specified by the quantization scheme; second, the dtype argument, if given; finally, the data type specified via the heterocl.init API. Since this is an unpacking operation, the number of resulting elements should be larger than that of the elements in the input tensor. Namely, the factor should be greater than or equal to 1.
- Parameters
- Returns
- Return type
Examples
# example 1.1 - unpack with factor
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B = hcl.unpack(A, factor=4)
print(B.shape)  # (40,)
print(B.dtype)  # "uint8"

# example 1.2 - unpack with dtype
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B = hcl.unpack(A, dtype=hcl.UInt(8))
# the results are the same as example 1.1

# example 1.3 - unpack with quantization scheme
A = hcl.placeholder((10,), "A", hcl.UInt(32))
def unpack_A(A):
    return hcl.unpack(A, name="B")
s = hcl.create_scheme(A, unpack_A)
s.downsize(unpack_A.B, hcl.UInt(8))
# the results are the same as example 1.1

# example 2 - unpack a multi-dimensional tensor
A = hcl.placeholder((10, 10), "A", hcl.UInt(32))
B = hcl.unpack(A, factor=4)           # B.shape = (40, 10)
C = hcl.unpack(A, axis=1, factor=4)   # C.shape = (10, 40)
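The bit-level behavior of example 1.1 can be sketched in plain Python. This is an illustration, not HeteroCL code; the least-significant-bits-first ordering of the slices is an assumption, and the function name is hypothetical:

```python
def unpack_words(values, factor=4, bits=8):
    # Split each wide element into `factor` narrow elements of `bits` bits,
    # taking the least-significant slice first (an assumed ordering).
    out = []
    for v in values:
        for i in range(factor):
            out.append((v >> (i * bits)) & ((1 << bits) - 1))
    return out

unpack_words([0x0A0B0C0D])  # [0x0D, 0x0C, 0x0B, 0x0A]
```

Note that a 10-element input with factor=4 yields 40 elements, matching B.shape in example 1.1.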
pack(tensor, axis=0, factor=None, name=None, dtype=None)
Pack a tensor with smaller bitwidth to a tensor with larger bitwidth.
This API packs the axis-th dimension of tensor to a new tensor according to the given factor or dtype. The usage is the same as that of unpack.
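As a hedged plain-Python sketch (not HeteroCL code; the name and slice ordering are assumptions mirroring the unpack illustration), packing inverts unpacking:

```python
def pack_words(values, factor=4, bits=8):
    # Combine every `factor` consecutive narrow elements into one wide
    # element, least-significant slice first (an assumed ordering),
    # so that packing undoes the unpacking sketch.
    out = []
    for i in range(0, len(values), factor):
        word = 0
        for j in range(factor):
            word |= (values[i + j] & ((1 << bits) - 1)) << (j * bits)
        out.append(word)
    return out

pack_words([0x0D, 0x0C, 0x0B, 0x0A])  # [0x0A0B0C0D]
```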
reduce_axis(lower, upper, name=None)
Create a reduction axis for reduction operations.
The lower and upper bounds of the range can be arbitrary integers; however, the upper bound must be greater than the lower bound.
reducer(init, freduce, dtype='int32', name=None)
Create a reducer for a reduction operation.
This API creates a reducer according to the initial value init and the reduction function freduce. The initial value can be either an expression or a tensor. With the reducer, users can create a reduction operation, in which they can further specify the input to be reduced expr, its axis axis, and the condition where. The general rule of the reduction operation is shown below. Note that for the reduction function, the first argument is the input and the second argument is the accumulator. Moreover, if the accumulator is an expression, the reduction function should return an expression; if the accumulator is a list or a tensor, the reduction function should not return anything.
output = init                # this can be a tensor
for i in reduction_domain:   # the specified reduction axis
    if (where):
        output = freduce(input[..., i, ...], output)
Users can further specify the data type for the reduction operation. For a multi-dimensional reduction operation, users can have multiple reduce axes. In this case, we can write them together in a list.
- Parameters
init (Expr or Tensor) – The initial value of the accumulator
freduce (callable) – The reduction function that takes in two arguments. The first argument is the new input value and the second argument is the accumulator
dtype (Type, optional) – The data type of the accumulator
name (str, optional) – The name of the generated reducer
- Returns
- Return type
callable
Examples
# example 1.1 - basic reduction: summation
my_sum = hcl.reducer(0, lambda x, y: x+y)
A = hcl.placeholder((10,))
r = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r))
# equivalent code
B[0] = 0
for r in range(0, 10):
    B[0] = A[r] + B[0]

# example 1.2 - with condition
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r, where=A[r]>5))
# equivalent code
B[0] = 0
for r in range(0, 10):
    if A[r] > 5:
        B[0] = A[r] + B[0]

# example 1.3 - with data type specification
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r, dtype=hcl.UInt(4)))
# the output will be downsized to UInt(4)

# example 2 - a more complicated reduction
# x is the input, y is the accumulator
def my_reduction(x, y):
    with hcl.if_(x > 5):
        hcl.return_(y + x)
    with hcl.else_():
        hcl.return_(y - x)
my_sum = hcl.reducer(0, my_reduction)
A = hcl.placeholder((10,))
r = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r))
# equivalent code
B[0] = 0
for r in range(0, 10):
    if A[r] > 5:
        B[0] = B[0] + A[r]
    else:
        B[0] = B[0] - A[r]

# example 3 - multiple reduce axes
A = hcl.placeholder((10, 10))
r1 = hcl.reduce_axis(0, 10)
r2 = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r1, r2], axis=[r1, r2]))
# equivalent code
B[0] = 0
for r1 in range(0, 10):
    for r2 in range(0, 10):
        B[0] = A[r1, r2] + B[0]

# example 4 - write a sorting algorithm with reduction
init = hcl.compute((10,), lambda x: 11)
def freduce(x, Y):
    with hcl.for_(0, 10) as i:
        with hcl.if_(x < Y[i]):
            with hcl.for_(9, i, -1) as j:
                Y[j] = Y[j-1]
            Y[i] = x
            hcl.break_()
my_sort = hcl.reducer(init, freduce)
A = hcl.placeholder((10, 10))
r = hcl.reduce_axis(0, 10)
# note that we need to use the underscore to mark the reduction axis
B = hcl.compute(A.shape, lambda _x, y: my_sort(A[r, y], axis=r))
sum(expr, axis, where=True, name='sum', dtype='int32')
Compute the sum of the given expression on axis.
- Parameters
- Returns
- Return type
See also
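As a hedged illustration of the reduction semantics (plain Python, not HeteroCL; the function name is hypothetical), summing over an axis with a where condition behaves like:

```python
def reduce_sum(values, where=lambda v: True, init=0):
    # Sum reduction over one axis: `where` filters which elements
    # enter the accumulator, mirroring the general rule for reducers.
    acc = init
    for v in values:        # iterate over the reduction axis
        if where(v):
            acc = acc + v   # freduce(input, accumulator)
    return acc

reduce_sum([1, 6, 2, 7])                         # 16
reduce_sum([1, 6, 2, 7], where=lambda v: v > 5)  # 13
```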
max(expr, axis, where=True, name='max', dtype='int32')
Compute the max of the given expression on axis.
- Parameters
- Returns
- Return type
See also
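The max reduction follows the same pattern; here is a hedged plain-Python sketch (not HeteroCL code; the -inf initial value is an assumption standing in for the minimum value of the accumulator's data type):

```python
def reduce_max(values, where=lambda v: True, init=float("-inf")):
    # Max reduction over one axis: freduce keeps the larger of the
    # input element and the accumulator.
    acc = init
    for v in values:
        if where(v):
            acc = max(acc, v)
    return acc

reduce_max([3, 9, 1])                        # 9
reduce_max([3, 9, 1], where=lambda v: v < 5) # 3
```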