heterocl.compute_api

- compute: Construct a new tensor based on the shape and the compute function.
- update: Update an existing tensor according to the compute function.
- mutate: Perform a computation repeatedly in the given mutation domain.
- scalar: A syntactic sugar for a single-element tensor.
- copy: A syntactic sugar for copying an existing tensor.
- unpack: Unpack a tensor with larger bitwidth to a tensor with smaller bitwidth.
- pack: Pack a tensor with smaller bitwidth to a tensor with larger bitwidth.
- reduce_axis: Create a reduction axis for reduction operations.
- reducer: Create a reducer for a reduction operation.
- sum: Compute the sum of the given expression on axis.
- max: Compute the max of the given expression on axis.
compute(shape, fcompute, name=None, dtype=None, attrs={})
Construct a new tensor based on the shape and the compute function.
The API returns a new tensor. The shape must be a tuple; the number of elements in the tuple decides the dimension of the returned tensor. The second argument fcompute defines the construction rule of the returned tensor and must be callable. The number of arguments of fcompute should match the dimension defined by shape; this is not checked, however, which gives users more programming flexibility.
The compute function specifies how we calculate each element of the returned tensor. It can contain other HeteroCL APIs, including the imperative DSL.
- Parameters
- Returns
- Return type
Examples
# example 1.1 - anonymous lambda function
A = hcl.compute((10, 10), lambda x, y: x+y)
# equivalent code
for x in range(0, 10):
    for y in range(0, 10):
        A[x][y] = x + y

# example 1.2 - explicit function
def addition(x, y):
    return x+y
A = hcl.compute((10, 10), addition)

# example 1.3 - imperative function definition
@hcl.def_([(), ()])
def addition(x, y):
    hcl.return_(x+y)
A = hcl.compute((10, 10), addition)

# example 2 - undetermined arguments
def compute_tanh(X):
    return hcl.compute(X.shape, lambda *args: hcl.tanh(X[args]))

A = hcl.placeholder((10, 10))
B = hcl.placeholder((10, 10, 10))
tA = compute_tanh(A)
tB = compute_tanh(B)

# example 3 - mixed-paradigm programming
def return_max(x, y):
    with hcl.if_(x > y):
        hcl.return_(x)
    with hcl.else_():
        hcl.return_(y)
A = hcl.compute((10, 10), return_max)
update(tensor, fcompute, name=None)
Update an existing tensor according to the compute function.
This API updates an existing tensor; no new tensor is returned. The shape and data type stay the same after the update. For more details on fcompute, please check compute.
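The in-place semantics can be sketched in plain Python (a hedged illustration, not HeteroCL code; the function and variable names here are hypothetical):

```python
def update(tensor, fcompute):
    # In-place update sketch for a 1-D tensor represented as a Python list:
    # each element is overwritten with fcompute(index), and nothing is
    # returned, mirroring the "no new tensor" behavior described above.
    for x in range(len(tensor)):
        tensor[x] = fcompute(x)

A = [1, 2, 3, 4]
update(A, lambda x: A[x] + 1)  # A is now [2, 3, 4, 5]
```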
mutate(domain, fcompute, name=None)
Perform a computation repeatedly in the given mutation domain.
This API allows users to write a loop in a tensorized way, which makes it easier to exploit parallelism when performing optimizations. The rules for the computation function are the same as those of compute.
- Parameters
- Returns
- Return type
Examples
# this example finds the max two numbers in A and stores them in M
A = hcl.placeholder((10,))
M = hcl.placeholder((2,))
def loop_body(x):
    with hcl.if_(A[x] > M[0]):
        with hcl.if_(A[x] > M[1]):
            M[0] = M[1]
            M[1] = A[x]
        with hcl.else_():
            M[0] = A[x]
hcl.mutate(A.shape, lambda x: loop_body(x))
scalar(init=0, name=None, dtype=None)
A syntactic sugar for a single-element tensor.
This is equivalent to
hcl.compute((1,), lambda x: init, name, dtype)
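As a hedged plain-Python sketch (not HeteroCL code; the names are hypothetical), the sugar amounts to building a one-element container and accessing it through index 0:

```python
def scalar(init=0):
    # Sketch of the sugar: a one-element "tensor" (a Python list here)
    # holding init, i.e. what compute((1,), lambda x: init) would build.
    return [init]

s = scalar(5)
s[0] = s[0] + 1  # read and write through index 0, like a tensor element
```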
copy(tensor, name=None, dtype=None)
A syntactic sugar for copying an existing tensor.
- Parameters
tensor (Tensor or list or numpy.ndarray) – The tensor to be copied from
name (str, optional) – The name of the returned tensor
- Returns
- Return type
Examples
# example 1 - copy from a HeteroCL tensor
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B1 = hcl.copy(A, "B1")

# example 2 - copy from a Python list
pA = [[1, 2, 3], [4, 5, 6]]
# The data type is NOT inferred from the list
B2 = hcl.copy(pA, "B2", hcl.Int())

# example 3 - copy from a Numpy array
nA = numpy.array(pA)
# The data type is determined by using nA.dtype
B3 = hcl.copy(nA, "B3")
unpack(tensor, axis=0, factor=None, name=None, dtype=None)
Unpack a tensor with larger bitwidth to a tensor with smaller bitwidth.
This API unpacks the axis-th dimension of tensor to a new tensor according to the given factor or dtype. The number of dimensions stays the same after unpacking. Once factor is specified, dtype is not taken into consideration. If factor is not specified, the data type is determined in the following order: first, the data type specified by the quantization scheme; second, the dtype argument, if given; finally, the data type specified via the heterocl.init API. Since this is an unpacking operation, the number of resulting elements should be larger than that of the elements in the input tensor. Namely, the factor should be greater than or equal to 1.
- Parameters
- Returns
- Return type
Examples
# example 1.1 - unpack with factor
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B = hcl.unpack(A, factor=4)
print(B.shape)  # (40,)
print(B.dtype)  # "uint8"

# example 1.2 - unpack with dtype
A = hcl.placeholder((10,), "A", hcl.UInt(32))
B = hcl.unpack(A, dtype=hcl.UInt(8))
# the results are the same as example 1.1

# example 1.3 - unpack with quantization scheme
A = hcl.placeholder((10,), "A", hcl.UInt(32))
def unpack_A(A):
    return hcl.unpack(A, name="B")
s = hcl.create_scheme(A, unpack_A)
s.downsize(unpack_A.B, hcl.UInt(8))
# the results are the same as example 1.1

# example 2 - unpack a multi-dimensional tensor
A = hcl.placeholder((10, 10), "A", hcl.UInt(32))
B = hcl.unpack(A, factor=4)           # B.shape = (40, 10)
C = hcl.unpack(A, axis=1, factor=4)   # C.shape = (10, 40)
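The bit-level behavior of example 1.1 can be sketched in plain Python. This is an illustration, not HeteroCL code; the least-significant-bits-first ordering of the slices is an assumption, and the function name is hypothetical:

```python
def unpack_words(values, factor=4, bits=8):
    # Split each wide element into `factor` narrow elements of `bits` bits,
    # taking the least-significant slice first (an assumed ordering).
    out = []
    for v in values:
        for i in range(factor):
            out.append((v >> (i * bits)) & ((1 << bits) - 1))
    return out

unpack_words([0x0A0B0C0D])  # [0x0D, 0x0C, 0x0B, 0x0A]
```

Note that a 10-element input with factor=4 yields 40 elements, matching B.shape in example 1.1.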
pack(tensor, axis=0, factor=None, name=None, dtype=None)
Pack a tensor with smaller bitwidth to a tensor with larger bitwidth.
This API packs the axis-th dimension of tensor to a new tensor according to the given factor or dtype. The usage is the same as that of unpack.
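As a hedged plain-Python sketch (not HeteroCL code; the name and slice ordering are assumptions mirroring the unpack illustration), packing inverts unpacking:

```python
def pack_words(values, factor=4, bits=8):
    # Combine every `factor` consecutive narrow elements into one wide
    # element, least-significant slice first (an assumed ordering),
    # so that packing undoes the unpacking sketch.
    out = []
    for i in range(0, len(values), factor):
        word = 0
        for j in range(factor):
            word |= (values[i + j] & ((1 << bits) - 1)) << (j * bits)
        out.append(word)
    return out

pack_words([0x0D, 0x0C, 0x0B, 0x0A])  # [0x0A0B0C0D]
```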
reduce_axis(lower, upper, name=None)
Create a reduction axis for reduction operations.
The lower and upper bounds of the range can be arbitrary integers; however, the upper bound must be greater than the lower bound.
reducer(init, freduce, dtype='int32', name=None)
Create a reducer for a reduction operation.
This API creates a reducer according to the initial value init and the reduction function freduce. The initial value can be either an expression or a tensor. With the reducer, users can create a reduction operation, in which they can further specify the input to be reduced expr, its axis axis, and the condition where. The general rule of the reduction operation is shown below. Note that for the reduction function, the first argument is the input and the second argument is the accumulator. Moreover, if the accumulator is an expression, the reduction function should return an expression; if the accumulator is a list or a tensor, the reduction function should not return anything.
output = init                # this can be a tensor
for i in reduction_domain:   # the specified reduction axis
    if (where):
        output = freduce(input[..., i, ...], output)
Users can further specify the data type for the reduction operation. For a multi-dimensional reduction operation, users can have multiple reduce axes. In this case, we can write them together in a list.
- Parameters
init (Expr or Tensor) – The initial value of the accumulator
freduce (callable) – The reduction function that takes in two arguments. The first argument is the new input value and the second argument is the accumulator
dtype (Type, optional) – The data type of the accumulator
name (str, optional) – The name of the generated reducer
- Returns
- Return type
callable
Examples
# example 1.1 - basic reduction: summation
my_sum = hcl.reducer(0, lambda x, y: x+y)
A = hcl.placeholder((10,))
r = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r))
# equivalent code
B[0] = 0
for r in range(0, 10):
    B[0] = A[r] + B[0]

# example 1.2 - with condition
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r, where=A[r]>5))
# equivalent code
B[0] = 0
for r in range(0, 10):
    if A[r] > 5:
        B[0] = A[r] + B[0]

# example 1.3 - with data type specification
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r, dtype=hcl.UInt(4)))
# the output will be downsized to UInt(4)

# example 2 - a more complicated reduction
# x is the input, y is the accumulator
def my_reduction(x, y):
    with hcl.if_(x > 5):
        hcl.return_(y + x)
    with hcl.else_():
        hcl.return_(y - x)
my_sum = hcl.reducer(0, my_reduction)
A = hcl.placeholder((10,))
r = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r], axis=r))
# equivalent code
B[0] = 0
for r in range(0, 10):
    if A[r] > 5:
        B[0] = B[0] + A[r]
    else:
        B[0] = B[0] - A[r]

# example 3 - multiple reduce axes
A = hcl.placeholder((10, 10))
r1 = hcl.reduce_axis(0, 10)
r2 = hcl.reduce_axis(0, 10)
B = hcl.compute((1,), lambda x: my_sum(A[r1, r2], axis=[r1, r2]))
# equivalent code
B[0] = 0
for r1 in range(0, 10):
    for r2 in range(0, 10):
        B[0] = A[r1, r2] + B[0]

# example 4 - write a sorting algorithm with reduction
init = hcl.compute((10,), lambda x: 11)
def freduce(x, Y):
    with hcl.for_(0, 10) as i:
        with hcl.if_(x < Y[i]):
            with hcl.for_(9, i, -1) as j:
                Y[j] = Y[j-1]
            Y[i] = x
            hcl.break_()
my_sort = hcl.reducer(init, freduce)
A = hcl.placeholder((10, 10))
r = hcl.reduce_axis(0, 10)
# note that we need to use the underscore to mark the reduction axis
B = hcl.compute(A.shape, lambda _x, y: my_sort(A[r, y], axis=r))
sum(expr, axis, where=True, name='sum', dtype='int32')
Compute the sum of the given expression on axis.
- Parameters
- Returns
- Return type
See also
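As a hedged illustration of the reduction semantics (plain Python, not HeteroCL; the function name is hypothetical), summing over an axis with a where condition behaves like:

```python
def reduce_sum(values, where=lambda v: True, init=0):
    # Sum reduction over one axis: `where` filters which elements
    # enter the accumulator, mirroring the general rule for reducers.
    acc = init
    for v in values:        # iterate over the reduction axis
        if where(v):
            acc = acc + v   # freduce(input, accumulator)
    return acc

reduce_sum([1, 6, 2, 7])                         # 16
reduce_sum([1, 6, 2, 7], where=lambda v: v > 5)  # 13
```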
max(expr, axis, where=True, name='max', dtype='int32')
Compute the max of the given expression on axis.
- Parameters
- Returns
- Return type
See also
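The max reduction follows the same pattern; here is a hedged plain-Python sketch (not HeteroCL code; the -inf initial value is an assumption standing in for the minimum value of the accumulator's data type):

```python
def reduce_max(values, where=lambda v: True, init=float("-inf")):
    # Max reduction over one axis: freduce keeps the larger of the
    # input element and the accumulator.
    acc = init
    for v in values:
        if where(v):
            acc = max(acc, v)
    return acc

reduce_max([3, 9, 1])                        # 9
reduce_max([3, 9, 1], where=lambda v: v < 5) # 3
```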