HGQ v2 support (part 1) #1154

Open: wants to merge 69 commits into base: main

Commits (69)
de9ae54  Add hint on import failure (calad0i, Dec 16, 2024)
dd01ff7  import converter dependencies lazily (calad0i, Oct 26, 2024)
c57417a  keras v3 object based parser (calad0i, Nov 7, 2024)
8359d3e  sequential and i/o tensor name parsing fix (calad0i, Nov 8, 2024)
4a98e42  support activation layers (calad0i, Nov 8, 2024)
1799bfa  consistent v2 weight reader behavior (calad0i, Nov 8, 2024)
911a726  add v3 conv handlers (calad0i, Nov 8, 2024)
93f482a  add test (calad0i, Nov 8, 2024)
7c06087  pre-commit fix (calad0i, Dec 17, 2024)
d4487c6  revert keras v2 converter (calad0i, Dec 6, 2024)
9d6d02d  make reshape handler compatiable with keras v3 (calad0i, Nov 13, 2024)
a53b57c  general einsum support for io_parallel and latency (calad0i, Nov 15, 2024)
d034521  add tests for einsumdense (calad0i, Nov 15, 2024)
4be36b4  keras v3 converter clean-up (calad0i, Nov 19, 2024)
282bdc7  add symbolic quantized interval (calad0i, Dec 2, 2024)
8b66a30  preliminary bit-exact precision derivation opt pass (calad0i, Dec 4, 2024)
b369b05  squark layer support start (calad0i, Dec 4, 2024)
9ce7306  fix einsum_dense precision computation (calad0i, Dec 4, 2024)
374b5a3  add leftover (calad0i, Dec 4, 2024)
2dcd001  qdense fix (calad0i, Dec 4, 2024)
cdecefb  support batch_norm (calad0i, Dec 4, 2024)
908f14a  support merge layers (calad0i, Dec 4, 2024)
178a02f  support bit-exact q_einsum and fix precision trace for multi inp layers (calad0i, Dec 5, 2024)
a851865  add einsum test (calad0i, Dec 5, 2024)
d8fc9eb  declare all softmax attrs in layer class (calad0i, Dec 6, 2024)
d0cf465  fix lazy import in handler (calad0i, Dec 6, 2024)
2b51ec0  cleanup einsum handler (calad0i, Dec 6, 2024)
0966587  cleanup einsum handler (calad0i, Dec 6, 2024)
29c82a3  more granular control over softmax for vivado (calad0i, Dec 6, 2024)
4af1529  properly propagate inv/exp_table_size (calad0i, Dec 7, 2024)
025b29a  support bit-exact softmax for stable impl (calad0i, Dec 7, 2024)
19397b2  bit-exact softmax fix and leftovers (calad0i, Dec 7, 2024)
d08dff3  softmax table fixer update (calad0i, Dec 7, 2024)
0a8c11c  support input scaler in softmax (calad0i, Dec 8, 2024)
cb92419  support multidim parallel softmax (calad0i, Dec 8, 2024)
62b87aa  fuse quantizer when possible (calad0i, Dec 8, 2024)
4dde858  partial activation, fix input precision in SAT mode (calad0i, Dec 9, 2024)
98e2350  fix padded convXd precition derivation rule (calad0i, Dec 9, 2024)
36b193b  add unary lut support (calad0i, Dec 9, 2024)
48a5071  fix bit-exact corner case introduced by reverse flow (calad0i, Dec 10, 2024)
3f7b97c  general data_t inference (calad0i, Dec 10, 2024)
cd74f42  softmax compatbility (calad0i, Dec 11, 2024)
25a8efd  fix typo in einsum handler (calad0i, Dec 11, 2024)
07ed776  fix more typos (calad0i, Dec 11, 2024)
a7d222b  MHA :tada: (calad0i, Dec 11, 2024)
1672b64  fix einsum and softmax template typos (calad0i, Dec 11, 2024)
8d60067  assert einsum ops doesnot include direct sum operation (calad0i, Dec 12, 2024)
5026e28  style (calad0i, Dec 13, 2024)
76e3793  fix mha layer indexing (calad0i, Dec 13, 2024)
a557e9c  switch to model opt (calad0i, Dec 14, 2024)
7b76371  pooling layers (calad0i, Dec 15, 2024)
935a8d6  handle stray inputs (calad0i, Dec 15, 2024)
6389175  fix pooling layer accum_t (calad0i, Dec 15, 2024)
09beb48  bit-exact concatenate (calad0i, Dec 15, 2024)
8f08fa6  add comments (calad0i, Jan 18, 2025)
677d65a  skip non-bit-exact compatiable softmax in bit-exact pass (calad0i, Jan 18, 2025)
33132a9  fix activation matching, fix bw edge case handle (calad0i, Jan 20, 2025)
d839571  fixes in bw inference (calad0i, Jan 21, 2025)
49616bf  warn double quantizer (calad0i, Jan 21, 2025)
c007490  fix softmax bw and table_size in non-bit-exact case (calad0i, Jan 22, 2025)
3284338  fix dependency (calad0i, Jan 23, 2025)
dda3f5b  formatting after rebase (calad0i, Jan 23, 2025)
a65bf6b  add kif cache (calad0i, Feb 3, 2025)
a558b49  fix edge cases, allow fixedquantizer bw shrink when possible (calad0i, Feb 5, 2025)
f9e22d5  fix global pooling bw inference (calad0i, Feb 6, 2025)
294a35b  rm linear layer if FixedQuantizer presents (calad0i, Feb 6, 2025)
30b1c3c  add parallelization factor support for squaark QDense (calad0i, Feb 6, 2025)
2e15ac2  warn at bw inference overflow, instead of crash (calad0i, Feb 6, 2025)
95530d8  minor fix on weight type var detection (calad0i, Feb 6, 2025)
6 changes: 5 additions & 1 deletion .pre-commit-config.yaml
@@ -47,7 +47,11 @@ repos:
exclude: docs/conf.py
additional_dependencies: [flake8-bugbear, flake8-print]
args: ['--max-line-length=125', # github viewer width
'--extend-ignore=E203,T201'] # E203 is not PEP8 compliant
'--extend-ignore=E203,T201', # E203 is not PEP8 compliant
'--per-file-ignores=hls4ml/model/optimizer/passes/bit_exact.py:E741,hls4ml/converters/keras_v3/squark/_base.py:E741,__init__.py:F401',
# i for #int w/o sign, I for #int w/ sign when massively processing bw conversions ......
# ignore unused imports in __init__.py .....
]

- repo: https://github.com/mgedmin/check-manifest
rev: "0.50"
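A note on the new suppressions, for readers who have not seen them before: E741 warns about single-character names that are easy to misread (l, O, I), and F401 flags unused imports, which in `__init__.py` files are intentional re-exports. The bit-exact pass keeps the short i/I names described in the inline comment above, so the warning is silenced for those two files only. A hypothetical sketch of the convention (all values made up):

```python
# Hypothetical illustration of the naming that makes flake8 raise E741
# ("ambiguous variable name") in the bit-exact pass. Per the comment in the
# config above: i = integer bits excluding the sign bit, I = integer bits
# including it. Values are made up for the example.
signed = True
i = 3                          # integer bits, sign excluded
I = i + (1 if signed else 0)   # integer bits, sign included; 'I' trips E741
print(i, I)                    # -> 3 4
```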
2 changes: 1 addition & 1 deletion Jenkinsfile
@@ -16,7 +16,7 @@ pipeline {
sh '''#!/bin/bash --login
conda activate hls4ml-py310
conda install -y jupyterhub pydot graphviz pytest pytest-cov
pip install pytest-randomly jupyter onnx>=1.4.0 matplotlib pandas seaborn pydigitalwavetools==1.1 pyyaml tensorflow==2.14 qonnx torch git+https://github.com/jmitrevs/qkeras.git@qrecurrent_unstack pyparsing
pip install pytest-randomly jupyter onnx>=1.4.0 matplotlib pandas seaborn pydigitalwavetools==1.1 pyyaml tensorflow==2.14 qonnx torch git+https://github.com/jmitrevs/qkeras.git@qrecurrent_unstack pyparsing quantizers
pip install -U ../ --user
./convert-keras-models.sh -x -f keras-models.txt
pip uninstall hls4ml -y'''
33 changes: 1 addition & 32 deletions hls4ml/backends/fpga/fpga_backend.py
@@ -7,7 +7,7 @@
import numpy as np

from hls4ml.backends.backend import Backend
from hls4ml.model.attributes import ChoiceAttribute, ConfigurableAttribute, TypeAttribute
from hls4ml.model.attributes import ConfigurableAttribute, TypeAttribute
from hls4ml.model.layers import (
GRU,
LSTM,
@@ -32,16 +32,13 @@
SeparableConv1D,
SeparableConv2D,
SimpleRNN,
Softmax,
)
from hls4ml.model.optimizer import model_optimizer
from hls4ml.model.types import (
ExponentPrecisionType,
FixedPrecisionType,
IntegerPrecisionType,
PrecisionType,
RoundingMode,
SaturationMode,
UnspecifiedPrecisionType,
XnorPrecisionType,
)
@@ -109,34 +106,6 @@ def __init__(self, name):
act_attrs.append(TypeAttribute('table', default=FixedPrecisionType(18, 8), description=descriptions.table_type))
self.attribute_map[Activation] = act_attrs

softmax_attrs = self.attribute_map.get(Softmax, [])
softmax_attrs.append(
ChoiceAttribute(
'implementation',
['latency', 'stable', 'argmax', 'legacy'],
default='stable',
description=descriptions.softmax_implementation,
)
)
softmax_attrs.append(
ConfigurableAttribute('skip', value_type=bool, default=False, description=descriptions.softmax_skip)
)
softmax_attrs.append(
TypeAttribute(
'exp_table',
default=FixedPrecisionType(18, 8, rounding_mode=RoundingMode.RND, saturation_mode=SaturationMode.SAT),
description=descriptions.table_type,
)
)
softmax_attrs.append(
TypeAttribute(
'inv_table',
default=FixedPrecisionType(18, 8, rounding_mode=RoundingMode.RND, saturation_mode=SaturationMode.SAT),
description=descriptions.table_type,
)
)
self.attribute_map[Softmax] = softmax_attrs

def create_layer_class(self, layer_class):
new_attrubutes = []
for cls, attributes in self.attribute_map.items():
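Context for this removal: the backend-level Softmax attribute block (implementation choice, skip flag, exp/inv table types) is deleted here, and commit d8fc9eb in this PR ("declare all softmax attrs in layer class") moves those declarations onto the layer class itself. For readers unfamiliar with the attribute_map mechanism, a minimal sketch of the registration pattern, using only names that appear in this diff; treat it as illustrative rather than the exact backend code:

```python
# Illustrative sketch of the backend attribute_map pattern (mirrors the
# Activation handling visible in the context lines above, not the removed
# Softmax block itself).
from hls4ml.model.attributes import ConfigurableAttribute, TypeAttribute
from hls4ml.model.layers import Activation
from hls4ml.model.types import FixedPrecisionType


def register_activation_attrs(attribute_map: dict) -> None:
    """Extend, rather than overwrite, a layer class's attribute list."""
    act_attrs = attribute_map.get(Activation, [])
    act_attrs.append(TypeAttribute('table', default=FixedPrecisionType(18, 8)))
    act_attrs.append(ConfigurableAttribute('skip', value_type=bool, default=False))
    attribute_map[Activation] = act_attrs
```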
6 changes: 5 additions & 1 deletion hls4ml/backends/fpga/passes/fix_softmax_table_size.py
@@ -6,7 +6,11 @@

class FixSoftmaxTableSize(OptimizerPass):
def match(self, node):
return isinstance(node, Softmax)
if not isinstance(node, Softmax):
return False
if 'inv_table_size' in node.attributes:
return False # handler generating inv_table_size sets it properly
return True

def transform(self, model, node: Layer):
inp_layer = node.get_input_node() # type: ignore
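The added guard makes the table-size fixer defer to handlers that already computed `inv_table_size` (the new softmax handling in this PR propagates exp/inv table sizes itself, per the commits above). For readers new to the optimizer framework, a minimal self-contained sketch of the match/transform contract with this kind of early-exit guard; class name and the placeholder default are invented for illustration:

```python
# Illustrative optimizer pass with an early-exit guard in match(), mirroring
# the change above (class name and default value are placeholders).
from hls4ml.model.layers import Softmax
from hls4ml.model.optimizer import OptimizerPass


class ExampleTableSizeGuard(OptimizerPass):
    def match(self, node):
        if not isinstance(node, Softmax):
            return False
        # A handler that already set inv_table_size knows better; skip it.
        return 'inv_table_size' not in node.attributes

    def transform(self, model, node):
        node.set_attr('inv_table_size', 1024)  # placeholder default
        return False  # attribute-only change, no graph rewrite needed
```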
5 changes: 0 additions & 5 deletions hls4ml/backends/fpga/passes/hgq_proxy_model.py
@@ -52,10 +52,6 @@ def match(self, node: Layer):
return isinstance(node, FixedPointQuantizer)

def transform(self, model, node: FixedPointQuantizer):
if node.fusible:
model.remove_node(node, rewire=True)
return True

if model.config.config['IOType'] != 'io_parallel':
raise NotImplementedError('Heterogenous quantization for activations is only supported with IOType=io_parallel')

@@ -94,7 +90,6 @@ def __init__(self):

def format(self, node):
params = self._default_function_params(node)
node.attributes['result_t'].precision = node.attributes['table_t'].precision
params['config'] = f'unary_lut_config{node.index}'
params['table'] = node.get_weights('table').name

50 changes: 48 additions & 2 deletions hls4ml/backends/vivado/passes/core_templates.py
@@ -150,13 +150,21 @@ def format(self, node):

softmax_config_template = """struct {type}_config{index} : nnet::activ_config {{
static const unsigned n_in = {n_in};
static const unsigned table_size = {table_size};
static const unsigned n_outer = {n_outer};
static const unsigned n_inner = {n_inner};
static const unsigned parallelization_factor = {parallelization_factor};
static const unsigned exp_table_size = {exp_table_size};
static const unsigned inv_table_size = {inv_table_size};
static const unsigned io_type = nnet::{iotype};
static const unsigned reuse_factor = {reuse};
static const unsigned axis = {axis};
static const nnet::softmax_implementation implementation = nnet::softmax_implementation::{implementation};
static constexpr float exp_scale = {exp_scale};
typedef {exp_table_t.name} exp_table_t;
typedef {inv_table_t.name} inv_table_t;
typedef {accum_t.name} accum_t;
typedef {inv_inp_t.name} inv_inp_t;
typedef {inp_norm_t_str} inp_norm_t;
}};\n"""

activ_function_template = 'nnet::{activation}<{input_t}, {output_t}, {config}>({input}, {output});'
@@ -208,10 +216,48 @@ def __init__(self):
super(ActivationConfigTemplate, self).__init__(Softmax) # Skip ActivationConfigTemplate's __init__
self.template = softmax_config_template

def format(self, node):
params = self._default_config_params(node)
params['type'] = node.get_attr('activation')
params.setdefault('exp_table_size', params['table_size'])
params.setdefault('inv_table_size', params['table_size'])
params.setdefault('n_inner', 1)
params.setdefault('n_outer', 1)
params.setdefault('exp_scale', 1.0)
params.setdefault('parallelization_factor', -1)
if params['accum_t'].name == 'model_default_t': # type: ignore
params['accum_t'] = params['exp_table_t']
if params['inv_inp_t'].name == 'model_default_t': # type: ignore
params['inv_inp_t'] = params['exp_table_t']

if 'inp_norm_t' not in params:
input_t = node.get_input_variable().type.precision
width, iwidth = input_t.width, input_t.integer
params['inp_norm_t_str'] = f'ap_fixed<{width}, {iwidth}, AP_RND, AP_SAT>'
else:
params['inp_norm_t_str'] = params['inp_norm_t'].name # type: ignore

return self.template.format(**params)


class SoftmaxFunctionTemplate(FunctionCallTemplate):
def __init__(self):
super().__init__(Softmax, include_header=activ_include_list)
self.template = activ_function_template

def format(self, node):
params = self._default_function_params(node)
use_multidim = node.get_attr('n_inner', 1) > 1 or node.get_attr('n_outer', 1) > 1
use_multidim = use_multidim and node.model.config.get_config_value('IOType') == 'io_parallel'
params['activation'] = 'softmax' if not use_multidim else 'softmax_multidim'
params['config'] = f'softmax_config{node.index}'

return self.template.format(**params)


class ActivationFunctionTemplate(FunctionCallTemplate):
def __init__(self):
super().__init__((Activation, HardActivation, Softmax), include_header=activ_include_list)
super().__init__((Activation, HardActivation), include_header=activ_include_list)
self.template = activ_function_template

def format(self, node):
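The new SoftmaxConfigTemplate fills in every field the enlarged config struct expects, so handler-provided attributes (exp/inv table sizes, outer/inner dims, accum and inverse-input types) flow through while older configs keep working via fallbacks: exp/inv table sizes default to the legacy table_size, accum_t and inv_inp_t fall back to exp_table_t, and inp_norm_t defaults to a rounding/saturating ap_fixed derived from the input precision. A condensed, standalone sketch of that fallback logic; the parameter values are made up and the real code compares type objects rather than strings:

```python
# Condensed illustration of the fallback defaults applied in
# SoftmaxConfigTemplate.format (values made up for the example).
params = {
    'table_size': 1024,
    'exp_table_t': 'softmax_exp_table_t',
    'accum_t': 'model_default_t',
    'inv_inp_t': 'model_default_t',
}
params.setdefault('exp_table_size', params['table_size'])  # keep legacy size
params.setdefault('inv_table_size', params['table_size'])
params.setdefault('n_outer', 1)   # plain softmax: a single outer/inner slice
params.setdefault('n_inner', 1)
for key in ('accum_t', 'inv_inp_t'):
    if params[key] == 'model_default_t':     # type was never specialized
        params[key] = params['exp_table_t']  # reuse the exp table precision
# Without an explicit inp_norm_t, a rounding/saturating fixed type is derived
# from the input precision, e.g. 16 total bits with 6 integer bits.
width, iwidth = 16, 6
params['inp_norm_t_str'] = f'ap_fixed<{width}, {iwidth}, AP_RND, AP_SAT>'
print(params['inp_norm_t_str'])  # -> ap_fixed<16, 6, AP_RND, AP_SAT>
```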
105 changes: 105 additions & 0 deletions hls4ml/backends/vivado/passes/einsum.py
@@ -0,0 +1,105 @@
from math import ceil

from hls4ml.backends.backend import get_backend
from hls4ml.backends.template import FunctionCallTemplate, LayerConfigTemplate
from hls4ml.model.layers import Einsum

from .reshaping_templates import transpose_config_gen

# Shared Dense template
# Einsum template

einsum_config_template = '''
struct config{index} {{
typedef config{index}_tpose_inp0 tpose_inp0_conf;
typedef config{index}_tpose_inp1 tpose_inp1_conf;
typedef config{index}_tpose_out tpose_out_conf;

typedef {accum_t.name} accum_t;

// Layer Sizes
static const unsigned n_free0 = {n_free0};
static const unsigned n_free1 = {n_free1};
static const unsigned n_contract = {n_contract};
static const unsigned n_inplace = {n_inplace};

// Resource reuse info
static const unsigned io_type = nnet::{iotype};
static const unsigned strategy = nnet::{strategy};
static const unsigned reuse_factor = {reuse_factor};
static const unsigned multiplier_limit = {multiplier_limit};
static const bool store_weights_in_bram = false; // NOT USED

template <class x_T, class y_T>
using product = nnet::product::{product_type}<x_T, y_T>;
}};
'''

einsum_function_template = 'nnet::einsum<{input0_t}, {input1_t}, {output_t}, {config}>({input0}, {input1}, {output});'

einsum_include_list = ['nnet_utils/nnet_einsum.h']


class EinsumConfigTemplate(LayerConfigTemplate):
def __init__(self):
super().__init__(Einsum)
self.template = einsum_config_template

def format(self, node: Einsum):
default_params = self._default_config_params(node)

strategy = node.model.config.get_strategy(node)
io_type = node.model.config.get_config_value('IOType')

assert io_type == 'io_parallel', 'EinsumDense layer only supports io_parallel for now'
assert strategy.lower() == 'latency', 'EinsumDense layer only supports Latency strategy for now'

# EinsumDense config
params = default_params.copy()
params['strategy'] = strategy
params['n_free0'] = node.attributes.attributes['n_free0']
params['n_free1'] = node.attributes.attributes['n_free1']
params['n_contract'] = node.attributes.attributes['n_contract']
params['n_inplace'] = node.attributes.attributes['n_inplace']
inp0_t = node.get_input_variable(node.inputs[0]).type.precision
inp1_t = node.get_input_variable(node.inputs[1]).type.precision
params['product_type'] = get_backend('vivado').product_type(inp0_t, inp1_t)

total_mults = params['n_free0'] * params['n_free1'] * params['n_contract'] * params['n_inplace']
params['multiplier_limit'] = ceil(total_mults / params['reuse_factor'])

einsum_conf = self.template.format(**params)

# inp/out transpose config
inp0_shape = node.attributes.attributes['inp0_shape']
inp1_shape = node.attributes.attributes['inp1_shape']
out_interpert_shape = node.attributes.attributes['out_interpert_shape']
inp0_tpose_idxs = node.attributes.attributes['inp0_tpose_idxs']
inp1_tpose_idxs = node.attributes.attributes['inp1_tpose_idxs']
out_tpose_idxs = node.attributes.attributes['out_tpose_idxs']
tpose_inp0_conf_name = f'config{node.index}_tpose_inp0'
tpose_inp1_conf_name = f'config{node.index}_tpose_inp1'
tpose_out_conf_name = f'config{node.index}_tpose_out'

inp0_tpose_conf = transpose_config_gen(tpose_inp0_conf_name, inp0_shape, inp0_tpose_idxs)
inp1_tpose_conf = transpose_config_gen(tpose_inp1_conf_name, inp1_shape, inp1_tpose_idxs)
out_tpose_conf = transpose_config_gen(tpose_out_conf_name, out_interpert_shape, out_tpose_idxs)

return '\n\n'.join((inp0_tpose_conf, inp1_tpose_conf, out_tpose_conf, einsum_conf))


class EinsumFunctionTemplate(FunctionCallTemplate):
def __init__(self):
super().__init__(Einsum, include_header=einsum_include_list)
self.template = einsum_function_template

def format(self, node: Einsum):
params = {}
params['config'] = f'config{node.index}'
params['input0_t'] = node.get_input_variable(node.inputs[0]).type.name
params['input1_t'] = node.get_input_variable(node.inputs[1]).type.name
params['output_t'] = node.get_output_variable().type.name
params['input0'] = node.get_input_variable(node.inputs[0]).name
params['input1'] = node.get_input_variable(node.inputs[1]).name
params['output'] = node.get_output_variable().name
return self.template.format(**params)
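The Einsum lowering transposes both inputs and the output into a canonical layout (the three transpose configs generated above) and budgets multipliers from the loop bounds: multiplier_limit = ceil(n_free0 * n_free1 * n_contract * n_inplace / reuse_factor). A short worked example with made-up shapes:

```python
# Worked example of the multiplier budget computed in EinsumConfigTemplate
# (shapes are made up; the formula matches the code above).
from math import ceil

n_free0, n_free1 = 4, 8    # free (non-contracted) dims of the two inputs
n_contract = 16            # length of the contracted dimension
n_inplace = 1              # batched dimension kept in place on both sides
reuse_factor = 4

total_mults = n_free0 * n_free1 * n_contract * n_inplace   # 512 multiplications
multiplier_limit = ceil(total_mults / reuse_factor)        # 128 parallel multipliers
print(total_mults, multiplier_limit)                       # -> 512 128
```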