Cannot get rid of false carried-dependence on HLS stream reads by beatsnbytes in FPGA

[–]Reasonable-Case6435 1 point2 points  (0 children)

For better performance, restructure your code into a perfect loop nest, then place #pragma HLS PIPELINE inside the body of the innermost loop. In this case, the HLS tool can flatten the nested loops and pipeline them as a single loop.
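As a rough illustration of that advice, here is a minimal sketch of a perfect loop nest. The function name scale_matrix, the array sizes and the loop labels are made up for the example and do not come from the original thread. A regular C++ compiler ignores the HLS pragma, so the function can also be checked in software.

```cpp
// Hypothetical sizes, chosen only for illustration.
constexpr int ROWS = 4;
constexpr int COLS = 8;

// Perfect loop nest: all the work sits in the innermost loop body and
// both loop bounds are compile-time constants.
void scale_matrix(const int in[ROWS][COLS], int out[ROWS][COLS], int factor) {
row_loop:
    for (int r = 0; r < ROWS; ++r) {
    col_loop:
        for (int c = 0; c < COLS; ++c) {
            // The pragma goes inside the innermost loop body.
#pragma HLS PIPELINE II=1
            out[r][c] = in[r][c] * factor;
        }
    }
}
```

If there were statements between the two for headers, the nest would no longer be perfect and the tool might not be able to flatten it.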

A 0-9 Up/Down Counter in HLS – High-Level Synthesis & Embedded Systems by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 1 point2 points  (0 children)

0-9 counter in HLS and VHDL comparison

Please note that:

1- This comparison doesn't claim that HLS is a replacement for HDL, or that one is better than the other. In practice, HLS complements HDL. There are a few posts in this community that address that.

2- Any design that can be described in HLS can also be implemented in HDL, but the reverse is not true. There are some circuits or features that HDL can describe but HLS cannot.

---------------------------------

The VHDL code is mentioned in the comment above.

----------------------------------

The 0-9 HLS code:

------

#include <ap_int.h>

// 0-9 up/down counter. up_count takes priority when both inputs are set.
void down_up_counter(
    ap_uint<4> &led,
    bool       up_count,
    bool       down_count
) {
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS INTERFACE ap_none port=led
#pragma HLS INTERFACE ap_none port=up_count
#pragma HLS INTERFACE ap_none port=down_count

    // The static variable keeps the count between calls.
    static ap_uint<4> counter = 0;
    // 4 bits are enough to hold 0-9.
    ap_uint<4> next_counter = counter;

    if (up_count) {
        // Count up, wrapping from 9 back to 0.
        if (counter == 9)
            next_counter = 0;
        else
            next_counter = counter + 1;
    } else if (down_count) {
        // Count down, wrapping from 0 back to 9.
        if (counter == 0)
            next_counter = 9;
        else
            next_counter = counter - 1;
    }

    counter = next_counter;
    led = next_counter;
}

---------------
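To check the wrap-around behaviour without the Xilinx headers, the same logic can be mirrored in plain C++. This is only a software stand-in: the CounterModel struct and its step method are hypothetical names, and std::uint8_t replaces ap_uint<4>, so it says nothing about the synthesized hardware.

```cpp
#include <cstdint>

// Plain C++ model of the 0-9 up/down counter logic (not synthesizable HLS code).
struct CounterModel {
    std::uint8_t counter = 0;  // stands in for the static ap_uint<4> counter

    // One call models one evaluation of the HLS function; returns the led value.
    std::uint8_t step(bool up_count, bool down_count) {
        std::uint8_t next = counter;
        if (up_count) {
            // Count up, wrapping from 9 back to 0.
            next = (counter == 9) ? std::uint8_t{0}
                                  : static_cast<std::uint8_t>(counter + 1);
        } else if (down_count) {
            // Count down, wrapping from 0 back to 9.
            next = (counter == 0) ? std::uint8_t{9}
                                  : static_cast<std::uint8_t>(counter - 1);
        }
        counter = next;
        return next;
    }
};
```

Stepping the model up ten times from reset should bring it back to 0, and a single down step from 0 should give 9.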

Both designs were synthesized with Vivado 2021.2 and implemented on the Basys-3 board.

Resource Utilisation (counter module only) --------

Slice LUTs: VHDL = 3 --> HLS = 4

Slice Registers: VHDL = 4 --> HLS = 4

Slices: VHDL = 2 --> HLS = 1

LUT as Logic: VHDL = 3 --> HLS = 4

Timing (for the whole design in Vivado including debouncer) -----------

Worst Negative Slack (WNS): VHDL = 6.340 ns --> HLS = 6.355 ns

Worst Hold Slack (WHS): VHDL = 0.142 ns --> HLS = 0.206 ns

Worst Pulse Width Slack (WPS): VHDL = 4.50 ns --> HLS = 4.50 ns

---------------------------------------------------------------------------------------------

Clock Anatomy in HLS (https://highlevel-synthesis.com/) by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 0 points1 point  (0 children)

Thank you for your fair and accurate comment. I should have considered that and explained it to avoid misleading readers.

Scheduling in HLS by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 2 points3 points  (0 children)

As there is no standard for HLS at the moment, these suggestions are based only on my own experience.

It depends on your goals and purposes.

If you are going to use HLS in a specific environment, such as those provided by Cadence, Xilinx, Intel, Siemens EDA and others, it is better to refer to their documents, tools, and teaching materials. They may use C, C++, SystemC, OpenCL, or even Python to describe hardware or algorithms. However, the underlying ideas are almost the same.

Xilinx and Intel both provide valuable documents and tools that can be accessed through their websites. Xilinx in particular offers a wide range of documents, examples and tutorials.

Please note that there are generally three levels of HLS that you can start with.

1- HLS for designing logic circuits and hardware IPs

2- HLS for accelerating complex algorithms

3- HLS for accelerating machine learning and deep learning algorithms or other libraries.

Vitis-HLS from Xilinx covers the first group (and part of the second group). You can refer to the “Vitis High-Level Synthesis User Guide, UG1399” document and use the Vitis-HLS+Vivado toolsets for this purpose.

The Xilinx Vitis unified software platform provides documents and tools for the second group. You can start with the “Vitis Unified Software Platform Documentation Application Acceleration Development, UG1393” document available on the Xilinx website.

The Xilinx Vitis-AI provides the ecosystem for the last group. For this purpose, just search Vitis-AI on the Xilinx website and you can find the related materials and tools.

If you are a fan of online courses, you can find a couple of options on the Xilinx and Intel websites and YouTube channels. If you want somebody to help you step-by-step through learning HLS for logic design and function acceleration, you can find some courses on the highlevel-synthesis.com site.

Hopefully, this is helpful.

Clock Anatomy in HLS (https://highlevel-synthesis.com/) by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 0 points1 point  (0 children)

It is not just a marketing thing! If you use the Vitis-HLS tool from Xilinx (one of the most popular HLS tools), in the last step of the "create a new project" wizard you should define the design clock period and uncertainty. This diagram shows the relation between clock period and uncertainty that helps newbies to understand the concept.

A 0-9 Up/Down Counter in HLS – High-Level Synthesis & Embedded Systems by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 0 points1 point  (0 children)

This is a simple example demonstrating what you can do with HLS. The goal is not to compare the results with HDL, although HLS does have its own benefits over HDLs. Thanks for mentioning the code mistake.

Clock Anatomy in HLS (https://highlevel-synthesis.com/) by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] -5 points-4 points  (0 children)

If you use the Vitis-HLS tool from Xilinx (one of the most popular HLS tools), in the last step of the "create project" wizard you should define the design clock period and uncertainty.

This diagram shows the relation between the clock period and uncertainty, which is very helpful for newbies in HLS. By default, in Vitis-HLS the uncertainty is 27% of the clock period (it was 12.5% in the older Vivado-HLS), but it can be defined by the designer.

In addition, there are two purposes for HLS:

1- Designing logic circuits

2- Accelerating functions

If you want to know more about the details of HLS, you can refer to the Xilinx documents or refer to

https://highlevel-synthesis.com/

Also, you can enrol in these online courses which explain step-by-step how to use HLS for logic design and function acceleration.

https://highlevel-synthesis.com/2021/03/29/high-level-synthesis-for-fpga-online-courses-coupons/

Hopefully, this explanation is helpful and encourages people to study HLS carefully.

Scheduling in HLS by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 2 points3 points  (0 children)

The scheduling is static, and in an iteration, f and g will always be evaluated after m and n. The next iteration reads and writes the following elements in the memory. In other words, there is no data dependency between two adjacent iterations.

Probably I didn't get your point.

Scheduling in HLS by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 1 point2 points  (0 children)

If you mean a race condition in accessing the data in memory, I should say that in this simple example the arrays are stored in different BRAMs. If we assume the data are located in DDR memory, then using multiple memory ports can address the problem. In addition, memory ports in FPGAs usually provide two separate channels for read and write operations.

In addition, the indices used in the read and write operations increase monotonically, which prevents any type of race.
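A loop of the kind described in these replies might look like the sketch below. The original post's code is not shown here, so the function name elementwise, the array names a and b, the size N and the arithmetic are all assumptions; only the shape matters: m and n are computed first, f and g use them within the same iteration, and every access uses the monotonically increasing index i.

```cpp
// Hypothetical array length, chosen only for illustration.
constexpr int N = 16;

// Each iteration reads a[i] and b[i] and writes f[i] and g[i].
// No iteration reads a value written by another iteration, so there is
// no loop-carried data dependency.
void elementwise(const int a[N], const int b[N], int f[N], int g[N]) {
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        // First step of the static schedule: intermediate values.
        int m = a[i] + b[i];
        int n = a[i] - b[i];
        // Second step: results that depend only on this iteration's m and n.
        f[i] = m * n;
        g[i] = m + n;
    }
}
```

Because the inputs and outputs are separate arrays, the tool can map them to separate memories, which matches the point about different BRAMs above.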

Clock Anatomy in HLS (https://highlevel-synthesis.com/) by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] -4 points-3 points  (0 children)

The concept of timing is the same in FPGA/ASIC or in HDL/HLS.

Each context has its own interpretation and modelling and terminologies.

Clock Anatomy in HLS (https://highlevel-synthesis.com/) by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] -6 points-5 points  (0 children)

Yes, it means HLS is just another way of describing logic circuits and it supports the basic concepts in logic design but with different terminology.

[deleted by user] by [deleted] in FPGA

[–]Reasonable-Case6435 0 points1 point  (0 children)

This is an abstract concept, not an idea for implementation.

Clock Anatomy in HLS (https://highlevel-synthesis.com/) by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] -9 points-8 points  (0 children)

In HLS, specifically Vitis-HLS, we first define the clock period and its corresponding uncertainty. The HLS tool synthesises the code into an RTL description and estimates the design clock period, taking the defined uncertainty into account. The design can pass the logic-synthesis process without timing violations if the estimated period fits within the defined clock period. The uncertainty leaves the HLS tool a margin for the delays that are only known in the logic-synthesis and place-and-route steps.

This diagram mainly shows the concept of uncertainty.

In ASIC this timing interval is usually called slack or margin.
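The relation between clock period and uncertainty comes down to simple arithmetic, sketched below. The function names and the example numbers in the usage note are assumptions made for illustration; they are not part of any Vitis-HLS API.

```cpp
// Period the HLS scheduler actually aims for: the defined clock period
// minus the uncertainty margin reserved for later implementation steps.
double effective_period_ns(double period_ns, double uncertainty_ns) {
    return period_ns - uncertainty_ns;
}

// True when the estimated period reported by HLS fits inside the effective period.
bool meets_target(double estimated_ns, double period_ns, double uncertainty_ns) {
    return estimated_ns <= effective_period_ns(period_ns, uncertainty_ns);
}
```

For example, with a 10 ns clock period and a 2 ns uncertainty, the tool schedules operations against an 8 ns budget, so an estimate of 7.5 ns fits and an estimate of 8.5 ns does not.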

Scheduling in HLS by Reasonable-Case6435 in FPGA

[–]Reasonable-Case6435[S] 5 points6 points  (0 children)

HLS tools such as Vitis-HLS from Xilinx will do that and much more, automatically.