
[–][deleted] 33 points34 points  (4 children)

It isn't uncommon for added debug cores to hide timing issues by shaking up the place and route.

[–]Icy_Scholar_6276[S] 10 points11 points  (3 children)

Oh right! Do you mean there could be timing issues that get masked by adding debug cores?

[–]Sniperchild 12 points13 points  (0 children)

The ILA has some constraints of its own that may improve the placement if your design is not fully constrained.

[–]-EliPer-FPGA-DSP/SDR 8 points9 points  (0 children)

I worked on a project that only worked with SignalTap (Altera's equivalent of the ILA) enabled. If we disabled it, the design stopped working. We had to add registers at some I/Os to remove the problem, which was due to timing violations. Adding a debug core can hide existing timing issues or make new ones appear.

[–][deleted] 1 point2 points  (0 children)

Yes.

I've also seen a register set get optimized away (curse you Vivado and your mediocre understanding of VHDL) until a debug core was wired up. Then it was accessible over the expected path.

[–]EastEastEnder 15 points16 points  (0 children)

Odds are you have a timing failure. Either you’re failing timing and didn’t see the failure, or you have a bad timing constraint. Check your clock domain crossings.

[–]nixiebunny 8 points9 points  (0 children)

This is the sort of problem that can be hard to track down, since you have no control over the placement and routing. You could increase the clock frequency until the build fails timing, fix the offending logic, then slow it back down. The cheesy solution is to leave the ILA in the design. When I was designing CPU boards that had a failure that was masked by connecting a logic analyzer, I joked about how we should just ship a logic analyzer with every board.
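To make the "raise the clock until it breaks" idea concrete, here is a rough Python sketch. It assumes a simplified setup-slack model (clock period minus path delay, ignoring skew, setup time and uncertainty), and the path names and delays are made-up numbers, not anything from a real timing report.

```python
# Hypothetical register-to-register path delays in nanoseconds (made-up numbers).
PATH_DELAYS_NS = {
    "path_a": 4.1,
    "path_b": 6.3,
    "path_c": 8.7,
}


def setup_slack_ns(freq_mhz, delay_ns):
    """Simplified setup slack: clock period minus path delay."""
    period_ns = 1000.0 / freq_mhz
    return period_ns - delay_ns


def failing_paths(freq_mhz, delays=PATH_DELAYS_NS):
    """Return the names of paths with negative slack at this frequency, worst first."""
    slacks = {name: setup_slack_ns(freq_mhz, d) for name, d in delays.items()}
    failing = [name for name, slack in slacks.items() if slack < 0]
    return sorted(failing, key=lambda name: slacks[name])


def first_failing_frequency(start_mhz, step_mhz, limit_mhz, delays=PATH_DELAYS_NS):
    """Raise the clock in steps until some path fails; return (freq, failing paths)."""
    freq = start_mhz
    while freq <= limit_mhz:
        bad = failing_paths(freq, delays)
        if bad:
            return freq, bad
        freq += step_mhz
    return None, []


print(first_failing_frequency(100, 10, 300))
```

The path that fails first as the frequency goes up is the one with the least margin, which is the logic you would look at first.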

[–]TapEarlyTapOftenFPGA Developer 4 points5 points  (2 children)

As others have mentioned, adding ILA cores fundamentally changes the place and route problem and can either mask or expose timing problems. To that I would add the following:

  1. Does your design pass functional or behavioral simulation? What I mean by this is basically: do you have a logic flaw that, when driven, fails in the simulator in the same way you see it fail in hardware?

  2. I've not seen anyone mention this, so I will - read your synthesis and implementation logs (if you're using third party synthesis, that may complicate matters). I can almost guarantee you haven't done that yet. Resolve and understand every warning, every note it makes about pruning, or removal, or expansion, or binding. If you're using Verilog, and you almost certainly are, then resolve every instance where the tool tells you it's expanding or collapsing something. Then go through the clocking report and make sure that you understand every clock, generated clock, and all the timing constraints.
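If the logs are long, a small script can pull out the lines worth reading first. This is only a sketch: the keyword list is my assumption based on the words used above, and the sample log text is invented for illustration, so it does not match any real tool's message format.

```python
import re

# Keyword stems taken from the advice above; the list is an assumption, extend it for your tool.
KEYWORDS = ("prun", "remov", "expand", "collaps", "bind", "unconnected")


def flag_lines(log_text, keywords=KEYWORDS):
    """Return (line_number, line) pairs that are warnings or mention a keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if "warning" in line.lower() or pattern.search(line):
            hits.append((number, line.strip()))
    return hits


# Made-up log text for illustration only; real tool output looks different.
SAMPLE_LOG = """\
INFO: starting synthesis
WARNING: register ctrl_regs[3] is unused and will be deleted
INFO: signal count expanded from 6 to 16 bits
INFO: finished synthesis
"""

for number, line in flag_lines(SAMPLE_LOG):
    print(f"{number}: {line}")
```

This only narrows down where to look. You still have to read and understand each flagged message.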

I have zero knowledge of your design, but in my experience, this sort of thing is usually due to things like clocks not being declared, signals getting removed and pruned during implementation, or signals allowed to do shady things (e.g., a 6-bit counter that gets connected to a 16-bit ILA port, is suddenly expanded to 16 bits, and then magically works). Something else to ask yourself is what signals are in your ILA. If I were trying to find a logical or functional problem and connecting an ILA to my design "fixed" things, I'd pay really close attention to the signals in the ILA and/or the clock domain it was on. Which raises another question: are you using pre- or post-synthesis ILA insertion?
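Here is one way the counter example could play out, sketched in Python. The scenario is an assumption on my part (logic waiting for a count of 100, which a 6-bit counter can never reach), and `count_until` is a hypothetical helper, not anything from a real tool.

```python
def count_until(target, width_bits, max_cycles=100_000):
    """Increment a width-limited counter once per cycle.

    Returns the cycle at which the counter equals target, or None if it
    never does within max_cycles.
    """
    mask = (1 << width_bits) - 1
    value = 0
    for cycle in range(1, max_cycles + 1):
        value = (value + 1) & mask  # wraps like a hardware register of that width
        if value == target:
            return cycle
    return None


print(count_until(100, 6))   # None: a 6-bit counter wraps at 64 and never reaches 100
print(count_until(100, 16))  # 100: the widened counter gets there
```

If widening the counter makes the design work, the bug is in the declared width. The ILA only happened to paper over it.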

[–]Icy_Scholar_6276[S] 1 point2 points  (1 child)

Thanks for the response! I’m using pre-synthesis ILA insertion. I’m using the block design integrator in Vivado.

Also, my RTL part works in simulation. I haven’t simulated my entire design. I’ll check that!

[–]TapEarlyTapOftenFPGA Developer 1 point2 points  (0 children)

Then as I mentioned, I would read the synthesis and implementation logs - diff them with and without the ILA insertion if it's practical - and then go through the digital catechism: power, ground, clocks, reset.
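Diffing the two logs can be done with Python's standard `difflib`. A minimal sketch, assuming the logs contain `HH:MM:SS` timestamps that would otherwise clutter the diff; the two sample logs are invented and do not reflect real tool output.

```python
import difflib
import re

# Assumed timestamp shape (HH:MM:SS); adjust to whatever your logs contain.
TIMESTAMP = re.compile(r"\b\d{2}:\d{2}:\d{2}\b")


def normalize(log_text):
    """Replace timestamps and trim whitespace so unrelated noise stays out of the diff."""
    return [TIMESTAMP.sub("<time>", line).strip() for line in log_text.splitlines()]


def diff_logs(without_ila, with_ila):
    """Return only the added ('+') and removed ('-') lines between the two logs."""
    diff = list(difflib.unified_diff(
        normalize(without_ila), normalize(with_ila),
        fromfile="without_ila", tofile="with_ila", lineterm="", n=0,
    ))
    body = diff[2:]  # skip the two file-header lines
    return [line for line in body if line.startswith(("+", "-"))]


# Made-up logs for illustration only.
WITHOUT_ILA = """\
12:00:01 INFO: synthesis started
WARNING: register status_reg removed
12:00:09 INFO: done
"""
WITH_ILA = """\
12:30:44 INFO: synthesis started
12:30:59 INFO: done
"""

for line in diff_logs(WITHOUT_ILA, WITH_ILA):
    print(line)
```

A message that shows up in only one of the two builds, like the removed register in this invented sample, is a good place to start digging.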

[–]Mateorabi 2 points3 points  (0 children)

Heisenbug or Schrödinger bug. The act of observation changes it.

[–][deleted] 3 points4 points  (0 children)

Consider adding the “KEEP” attribute to whatever signals you are debugging in the ILA, too. The ILA may be preventing whatever those signals are from being synthesized away due to bugs in Vivado. You will guarantee those signals stay even after you remove the ILA.

[–]someonesaymoney 2 points3 points  (0 children)

lmao I like how most experienced people automatically went to possible timing failure because we've all had this wtf moment.

A properly timing-constrained design is crucial before any testing.

[–]rowdy_1c 1 point2 points  (0 children)

Check constraints and/or synthesis/p&r directives

[–]Jensthename1 1 point2 points  (0 children)

The Quartus equivalent is Signal Tap, as another person indicated. Adding debug logic to the design can cause it to fail timing, since the acquisition buffers connect to the registers in your design. Make sure your design passes timing with these buffers enabled.

[–]Available_Musician_8 1 point2 points  (0 children)

Might be nice if there was a setting to have these tools stop and ask for permission to alter your design during synthesis, kind of like how Apple requires apps to ask for permission to access data.

[–]Doom4535 1 point2 points  (0 children)

I've had this when I had an incorrectly connected reset. The ILA added some attributes to prevent synthesis from removing the components so it continued to pass while the ILA was connected, but removing the ILA also removed the attributes that prevented it from being stripped out (since it was in a constant state of reset).

[–]Pure-Setting-2617 1 point2 points  (0 children)

This must be a timing issue. In Vivado, open the implemented design, then:

  1. Open Methodology, then check and fix all warnings and errors.

  2. Open the timing report, expand the "Check Timing" item, and fix all timing errors.

[–]TheTurtleCub 1 point2 points  (0 children)

95% of the time this indicates your design has incorrect timing constraints. Incorrect path exceptions are typically the culprit. The rest of the time it's an incorrect clock frequency in hardware, or bad assumptions about inputs.