heterocl.tvm.build_module module

The build utils in python.

This module provides functions that transform a schedule into a LoweredFunc and a compiled Module.

class BuildConfig(handle)[source]

Bases: heterocl.tvm._ffi.node.NodeBase

Configuration scope to set a build config option.

Note

This object is backed by the node system in C++, with arguments that can be exchanged between Python and C++.

Do not construct directly; use build_config instead.

The fields that are backed by the C++ node are immutable once an instance is constructed. See _node_defaults for the fields.

current = build_config(data_alignment=-1, offset_factor=0, double_buffer_split_loop=1, auto_unroll_max_step=0, auto_unroll_max_depth=8, auto_unroll_max_extent=0, unroll_explicit=1, restricted_func=1, detect_global_barrier=0, partition_const_loop=0)
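
The scoping and immutability described above can be pictured with a small stand-alone model. This is only a sketch in plain Python, not the HeteroCL implementation: `SketchConfig` and `sketch_build_config` are hypothetical names, only three fields are modelled (with defaults copied from the `current` value listed above), and restoring the previous configuration when the scope exits is an assumption about how such a scope would normally behave.

```python
class SketchConfig:
    """Hypothetical stand-in for a build-config scope (not the real BuildConfig)."""

    # A few defaults copied from the `current` value shown on this page.
    _defaults = {
        "auto_unroll_max_step": 0,
        "auto_unroll_max_depth": 8,
        "unroll_explicit": True,
    }
    current = None  # innermost active configuration

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(self._defaults)
        if unknown:
            raise AttributeError("unknown config option(s): %s" % sorted(unknown))
        # Bypass __setattr__ so the fields are filled in exactly once.
        object.__setattr__(self, "_fields", {**self._defaults, **kwargs})
        object.__setattr__(self, "_previous", None)

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for config fields.
        try:
            return self.__dict__["_fields"][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        raise AttributeError("config fields are immutable once constructed")

    def __enter__(self):
        # Remember the enclosing config so it can be restored on exit.
        object.__setattr__(self, "_previous", SketchConfig.current)
        SketchConfig.current = self
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        SketchConfig.current = self._previous


def sketch_build_config(**kwargs):
    """Hypothetical factory, playing the role that build_config plays here."""
    return SketchConfig(**kwargs)


SketchConfig.current = sketch_build_config()

with sketch_build_config(auto_unroll_max_step=16):
    inside = SketchConfig.current.auto_unroll_max_step
outside = SketchConfig.current.auto_unroll_max_step
```

In this model the override is visible only inside the `with` block, and any attempt to assign to a field afterwards raises `AttributeError`.
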
class DumpIR[source]

Bases: object

Dump IR for each pass. With it, you can dump the IR just as with gcc/llvm.

with tvm.build_config(dump_pass_ir=True):
    run()
decorate(func)[source]

Decorate the pass function.

decorate_custompass()[source]

Decorate the add_lower_pass passes in BuildConfig.

decorate_irpass()[source]

Decorate ir_pass and ScheduleOps.

enter()[source]

Only decorate the outermost nest.

exit()[source]

Recover the outermost nest.

scope_level = 0
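
The decorate/enter/exit pattern above can be modelled without the library. The following is a rough sketch, not the actual DumpIR code: `SketchDumpIR` is a hypothetical class, passes are plain functions on strings, dumps are recorded in memory rather than written to disk, and the file-name pattern follows the `idx_passname_ir.cc` form mentioned for `dump_pass_ir`.

```python
import functools


class SketchDumpIR:
    """Hypothetical model of a pass dumper; records dumps in memory, writes no files."""

    scope_level = 0  # shared nesting counter, as in the attribute listed above

    def __init__(self, passes):
        self.passes = passes   # dict: pass name -> pass function
        self._original = {}    # undecorated functions, kept for exit()
        self.dumps = []        # (file name, IR text) pairs

    def decorate(self, func):
        """Wrap a pass so that its result is recorded after every call."""
        @functools.wraps(func)
        def dump(*args, **kwargs):
            result = func(*args, **kwargs)
            file_name = "%d_%s_ir.cc" % (len(self.dumps), func.__name__)
            self.dumps.append((file_name, str(result)))
            return result
        return dump

    def enter(self):
        # Only the outermost scope decorates the passes.
        if SketchDumpIR.scope_level == 0:
            self._original = dict(self.passes)
            for name, func in self._original.items():
                self.passes[name] = self.decorate(func)
        SketchDumpIR.scope_level += 1

    def exit(self):
        SketchDumpIR.scope_level -= 1
        # Only the outermost scope restores the original passes.
        if SketchDumpIR.scope_level == 0:
            self.passes.update(self._original)


def simplify(stmt):
    return stmt.replace("x + 0", "x")


def unroll(stmt):
    return stmt + " [unrolled]"


passes = {"simplify": simplify, "unroll": unroll}
dumper = SketchDumpIR(passes)
dumper.enter()
dumper.enter()  # nested scope: nothing is decorated a second time
out = passes["unroll"](passes["simplify"]("y = x + 0"))
dumper.exit()
dumper.exit()
```

Tracking the nesting level means a nested scope never wraps an already wrapped pass, so each pass is dumped once per call.
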
build(sch, args=None, target=None, target_host=None, name='default_function', binds=None, stmt=None)[source]

Build a function with arguments as signature.

Parameters
  • sch (tvm._Schedule, or LoweredFunc) – The schedule to be built

  • args (list of Buffer or Tensor or Var, optional) – The argument lists to the function.

  • target (str or tvm.target.Target, optional) – The target and option of the compilation.

  • target_host (str or tvm.target.Target, optional) – Host compilation target, if target is a device. When TVM compiles a device-specific program such as CUDA, host (CPU) side code is also needed to interact with the driver and to set up the dimensions and parameters correctly. target_host specifies the host-side codegen target. By default, llvm is used if it is enabled; otherwise a stackvm interpreter is used.

  • name (str, optional) – The name of result function.

  • binds (dict, optional) – Dictionary that maps the binding of symbolic buffer to Tensor. By default, a new buffer is created for each tensor in the argument.

Returns

f – The result function.

Return type

Function, or pair of functions

Note

See the note on tvm.target for the target string format.
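
The documented default for target_host can be summarised in a few lines. This is a sketch under stated assumptions: `pick_target_host` and its `llvm_enabled` flag are hypothetical and only restate the rule given in the parameter description; they are not part of this module.

```python
def pick_target_host(target_host=None, llvm_enabled=True):
    """Hypothetical helper restating the documented default for target_host."""
    if target_host is not None:
        # An explicit choice always wins.
        return target_host
    # Documented fallback: llvm if it is enabled, otherwise the stackvm interpreter.
    return "llvm" if llvm_enabled else "stackvm"


default_host = pick_target_host()
fallback_host = pick_target_host(llvm_enabled=False)
explicit_host = pick_target_host("stackvm", llvm_enabled=True)
```
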

build_config(**kwargs)[source]

Configure the build behavior by setting config variables.

Parameters
  • auto_unroll_max_step (int, default=0) – Threshold of number of steps in the loop to be automatically unrolled. This takes inner loop count into consideration.

  • auto_unroll_max_depth (int, default=8) – The maximum nesting level of loops that can be automatically unrolled.

  • unroll_explicit (bool, default=True) – Whether to explicitly unroll the loop. If set to false, the unroll hint is passed to the CodeGen phase, which may generate a pragma unroll hint. Set this to true if CodeGen supports the unroll pragma and when more readable output is desired.

  • detect_global_barrier (bool, default=False) – Whether to detect global barriers.

  • partition_const_loop (bool, default=False) – Whether to partition constant loops.

  • data_alignment (int, optional) – The alignment of data pointer in bytes. If -1 is passed, the alignment will be set to TVM’s internal default.

  • offset_factor (int, default=0) – The factor used in default buffer declaration. If specified as 0, offset field is not used.

  • restricted_func (bool, default=True) – Whether to build a restricted function, that is, one whose buffer arguments are guaranteed not to overlap. This enables more optimization. Corresponds to the restrict keyword in C99.

  • double_buffer_split_loop (int, default=1) – Whether to split the loop with a factor. If it is zero, no splitting happens. If it is greater than one, the loop is split with a factor equal to this integer and the inner loop is unrolled, so that the buffer fetch contains no condition.

  • add_lower_pass (list of tuple (phase, function(Stmt->Stmt)), default=None) – Additional lowering passes to be applied before make_api. phase is an integer indicating the optimization phase at which the pass is applied.

  • dump_pass_ir (bool, default=False) – Whether to dump the IR of each pass into a file named idx_passname_ir.cc.

  • generate_reuse_buffer (bool, default=True) – Lower the Reuse node to reuse buffers

Returns

config – The build configuration

Return type

BuildConfig
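
The shape of the add_lower_pass option, a list of (phase, function) pairs, can be illustrated with a toy runner. This is a sketch and not the lowering pipeline itself: `run_custom_passes` is a hypothetical helper, a Python list stands in for a Stmt, and running the passes in ascending phase order while keeping registration order within a phase is an assumption made for illustration.

```python
def run_custom_passes(stmt, add_lower_pass):
    """Apply (phase, function) pairs in ascending phase order.

    sorted() is stable, so passes that share a phase keep the order
    in which they were registered.
    """
    for _phase, func in sorted(add_lower_pass, key=lambda item: item[0]):
        stmt = func(stmt)
    return stmt


# Each toy pass appends a tag, which makes the execution order visible.
def tag_a(stmt):
    return stmt + ["a"]


def tag_b(stmt):
    return stmt + ["b"]


def tag_c(stmt):
    return stmt + ["c"]


custom = [(1, tag_b), (0, tag_a), (1, tag_c)]
trace = run_custom_passes([], custom)
```
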

build_fpga_kernel(sch, args, target, name='default_function')[source]

Build an FPGA kernel.

Parameters
  • sch (tvm._Schedule, or LoweredFunc) – The schedule to be built

  • args (list of Buffer or Tensor or Var, optional) – The argument lists to the function.

  • target (str) – The target string to indicate the language to be used.

Returns

module – The generated kernel module.

Return type

Module

Note

This function should only be called by build.

get_binds(args, binds=None)[source]

Internal function to get binds and arg_list given arguments.

Parameters
  • args (list of Buffer or Tensor or Var) – The argument lists to the function.

  • binds (dict of Tensor to Buffer, optional) – Dictionary that maps each Tensor to a Buffer, which specifies the data layout requirement of the function. By default, a new compact buffer is created for each tensor in the argument.

Returns

  • binds (dict) – The bind specification

  • arg_list (list) – The list of symbolic buffers of arguments.
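
The relationship between args, binds, and the two return values can be sketched with stand-in types. This is an illustrative model rather than the actual get_binds: `SketchTensor`, `SketchBuffer`, `SketchVar`, and `sketch_get_binds` are hypothetical, and a real buffer declaration would also carry shape, dtype, and alignment information.

```python
class SketchTensor:
    def __init__(self, name):
        self.name = name


class SketchBuffer:
    def __init__(self, name):
        self.name = name


class SketchVar:
    def __init__(self, name):
        self.name = name


def sketch_get_binds(args, binds=None):
    """Return (binds, arg_list) for a list of tensors, buffers and vars."""
    # Copy so the caller's dictionary is left untouched.
    binds = {} if binds is None else dict(binds)
    arg_list = []
    for arg in args:
        if isinstance(arg, SketchTensor):
            if arg not in binds:
                # Default: create a new buffer for a tensor without a binding.
                binds[arg] = SketchBuffer(arg.name)
            arg_list.append(binds[arg])
        elif isinstance(arg, (SketchBuffer, SketchVar)):
            # Buffers and vars are passed through unchanged.
            arg_list.append(arg)
        else:
            raise ValueError("args must be Tensor, Buffer or Var")
    return binds, arg_list


a = SketchTensor("A")
b = SketchTensor("B")
b_buf = SketchBuffer("B_buf")
n = SketchVar("n")
user_binds = {b: b_buf}
binds_out, arg_list = sketch_get_binds([a, b, n], user_binds)
```

Here `A` receives a fresh buffer, `B` reuses the buffer supplied by the caller, and `n` appears in `arg_list` as is.
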

lower(sch, args, name='default_function', binds=None, simple_mode=False, kernel_only=False, stmt=None)[source]

Lowering step before build into target.

Parameters
  • sch (tvm._Schedule) – The schedule to be built

  • args (list of Buffer or Tensor or Var) – The argument lists to the function.

  • name (str, optional) – The name of result function.

  • binds (dict of Tensor to Buffer, optional) – Dictionary that maps each Tensor to a Buffer, which specifies the data layout requirement of the function. By default, a new compact buffer is created for each tensor in the argument.

  • simple_mode (bool, optional) – Whether to output only a simple and compact statement. This skips LoopPartition, API wrapper generation, and Unrolling.

  • kernel_only (bool, optional) – This will skip inserting all checkers and only keep the kernel part when making LoweredFunc.

Returns

f – The result function. If simple_mode is True, the Stmt before make_api is returned instead.

Return type

LoweredFunc or Stmt
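
How simple_mode changes the result can be pictured with a toy pipeline. This is a loose sketch, not the real lowering code: `sketch_lower` is hypothetical, a list of stage names stands in for the statement, a dictionary stands in for a LoweredFunc, and the ordering of stages is an assumption.

```python
def sketch_lower(stmt, simple_mode=False):
    """Toy model of lowering; stmt is a list of the stage names applied so far."""
    stmt = stmt + ["other passes"]
    if simple_mode:
        # Skip LoopPartition, Unrolling and the API wrapper; return the plain statement.
        return stmt
    stmt = stmt + ["LoopPartition", "Unrolling"]
    # Stand-in for wrapping the statement into a function via make_api.
    return {"kind": "LoweredFunc", "body": stmt}


simple_result = sketch_lower([], simple_mode=True)
full_result = sketch_lower([])
```
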