Credit System, New Driver, Fixed Accuracy Bug, Support for Resnet18 filters (!23) · Merge requests · Mathew Hall / HPIPE

Mario Doumet requested to merge mario_dev into master Jul 28, 2023

Credit System:

The old backpressure system has been replaced by a new credit system, which prevents the input buffers from overflowing when the backpressure signal propagates back too slowly.
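The difference between the two schemes can be illustrated with a toy cycle-level simulation. This is a minimal sketch and not the HPIPE implementation: the function name, the buffer depth, the feedback delay, and the drain rate are all hypothetical, and the stall flag here asserts only once the buffer is already full, with no almost-full margin.

```python
from collections import deque


def peak_occupancy(use_credits, depth=4, delay=3, cycles=200, drain_every=2):
    """Simulate a sender feeding a receiver buffer and return peak occupancy.

    Feedback from receiver to sender (a stall flag or a returned credit)
    takes `delay` cycles to arrive. All parameters are illustrative.
    """
    occupancy = 0
    peak = 0
    credits = depth                    # one credit per free buffer slot
    feedback = deque([0] * delay)      # models the delayed feedback path

    for cycle in range(cycles):
        arrived = feedback.popleft()   # feedback sent `delay` cycles ago
        if use_credits:
            credits += arrived         # a returned credit frees one slot
            can_send = credits > 0
        else:
            can_send = not arrived     # stale "buffer full" flag

        if can_send:
            occupancy += 1
            if use_credits:
                credits -= 1

        peak = max(peak, occupancy)

        drained = 0
        if occupancy > 0 and cycle % drain_every == 0:
            occupancy -= 1             # receiver consumes one word
            drained = 1

        # Credit mode returns a credit per drained word; backpressure mode
        # reports whether the buffer is currently full.
        feedback.append(drained if use_credits else int(occupancy >= depth))

    return peak


if __name__ == "__main__":
    print("backpressure peak:", peak_occupancy(use_credits=False))
    print("credit peak:      ", peak_occupancy(use_credits=True))
```

With credits, the sender can never have more words outstanding than the buffer has slots, so the peak stays within `depth` regardless of the feedback delay. With the delayed stall flag, the sender keeps transmitting until the flag arrives, so the peak can exceed `depth`.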

Comparisons of the two systems can be found in the tables below:

	Rewritten driver + MV2 Bug Fix + Credit System + Resnet18 changes (@300MHz)
	Throughput (im/s)	Top1-acc	Top5-acc	Latency (first image)	Logic Utilization (ALMs)	Total Block Memory bits	Total RAM Blocks	Total DSP_Prime Blocks
MV1 (3400 DSP)	20674.8	66.482	87.012	0.398 ms	60%	47%	73%	86%
MV2 (2800 DSP)	19795.5	63.81	85.26	0.459 ms	83%	37%	63%	69%
MV3 (2900 DSP)	26356.7	55.124	78.788	0.420 ms	80%	37%	57%	64%

	Rewritten driver + MV2 Bug Fix + Old Backpressure (@300MHz)
	Throughput (im/s)	Top1-acc	Top5-acc	Latency (first image)	Logic Utilization (ALMs)	Total Block Memory bits	Total RAM Blocks	Total DSP_Prime Blocks
MV1 (3400 DSP)	22285.8	66.482	87.012	0.397 ms	62%	47%	73%	86%
MV2 (2800 DSP)	Hangs	63.81	85.26	--	85%	37%	62%	69%
MV3 (2900 DSP)	27490.9	55.124	78.788	0.403 ms	81%	37%	57%	64%
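To put the throughput numbers side by side, the short script below computes the relative change when moving from the old backpressure system to the credit system. The values are copied from the two tables above; the helper name `relative_change` is only illustrative. MV2 is left out because it hangs under the old backpressure system, so there is no baseline to compare against.

```python
# Throughput (im/s) copied from the two tables above.
credit = {"MV1": 20674.8, "MV3": 26356.7}
backpressure = {"MV1": 22285.8, "MV3": 27490.9}


def relative_change(new, old):
    """Percentage change from `old` to `new`."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    for model in credit:
        change = relative_change(credit[model], backpressure[model])
        print(f"{model}: {change:+.1f}%")
```

For these two networks the credit system comes out roughly 7% (MV1) and 4% (MV3) slower, while MV2 goes from hanging to running at 19795.5 im/s.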

New Driver:

The old driver included here operated reliably only at 16,000 im/s and often hung when modified for higher throughput, so this merge request includes a new driver that allows operation at higher speeds.

Accuracy Bug Fixed:

In the previous commit, MV2 and MV3 had poor accuracy because of a bug in the Add layers that caused two RAM blocks to send their outputs at the same time. The bug is fixed in this commit.

Support for Resnet filters in TensorMode:

In the previous commit, kernels of height and width greater than 1 would only work in TensorMode if the number of input channels did not exceed ICP (10 in this case). In other words, we could support a 3x3x10x32 filter, but not a 3x3x11x32 filter. Changes have been added here to support any number of input channels (such as 3x3x64x64). Moreover, we now support convolutions with 1x1 kernels and stride 2.
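The old channel limit and the effect of a stride-2 1x1 convolution can be written down as two small helpers. This is only a sketch of the constraints as described above, not code from HPIPE: both function names are hypothetical, filter shapes are assumed to read height x width x input channels x output channels, and the 56-pixel feature map in the usage lines is an assumed example input.

```python
ICP = 10  # input-channel parallelism quoted in the description


def within_old_channel_limit(in_channels, icp=ICP):
    """Old TensorMode limit for kernels larger than 1x1 (hypothetical helper)."""
    return in_channels <= icp


def conv_output_size(in_size, kernel, stride, padding=0):
    """Standard output-size formula for a strided convolution."""
    return (in_size + 2 * padding - kernel) // stride + 1


if __name__ == "__main__":
    # 3x3x10x32 fit the old limit; 3x3x11x32 and 3x3x64x64 did not.
    for in_channels in (10, 11, 64):
        print(in_channels, within_old_channel_limit(in_channels))

    # A 1x1 kernel with stride 2 and no padding halves a 56-pixel dimension.
    print(conv_output_size(56, kernel=1, stride=2))
```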
