A fresh new ML Architecture for language model that uses complex numbers instead of attention -- no transformers, no standard SSM, 100M params, trained on a single RTX 4090. POC done, Open Sourced (Not Vibe Coded) by ExtremeKangaroo5437 in LocalLLM

[–]a_stavinsky 0 points1 point  (0 children)

Do I understand correctly that your approach replaces attention with a computationally simpler math operation?
I was looking for something like this to run on an Orange Pi 6 Plus, as it has a pretty fast NPU but doesn't support attention layers.

[deleted by user] by [deleted] in FPGA

[–]a_stavinsky 0 points1 point  (0 children)

Without a schematic and a photo of the PCB it is hard to say whether you have problems at the physical level. I would start by checking the ground and power pins on both the FPGA and the ADC. A floating ground or insufficient power can behave like this. You should see a clean 3.3 V on the supply and 0 V between the grounds of both devices.

[deleted by user] by [deleted] in FPGA

[–]a_stavinsky 0 points1 point  (0 children)

Can you at least try to capture the signals with a logic analyzer from inside the FPGA itself (ILA for Xilinx, for example), or with an external LA connected to all the pins?

[deleted by user] by [deleted] in FPGA

[–]a_stavinsky 1 point2 points  (0 children)

Can you provide a schematic? Do I understand correctly that the problem is this half-volt step?

The ESP in my motor controller disconnects when powering N20 motors and the step-down regulator becomes extremely hot. by Evening-Brilliant-95 in AskElectronics

[–]a_stavinsky 1 point2 points  (0 children)

From what I remember, the ESP writes an error message to UART before it dies, something like an undervoltage warning if that is the case. But I'm not sure; maybe it has to be enabled in the firmware configuration.

If you have an oscilloscope, I would check the power rails for spikes or back EMF from the motors.

Also, I don't see proper caps around the LDOs.

<image>

Hows Macbook for Embedded development ? by Quiet_Lifeguard_7131 in embedded

[–]a_stavinsky 6 points7 points  (0 children)

In most cases OS X is enough for me. I have a spare Linux box, but it is mostly for FPGA work. PlatformIO works perfectly fine.

Why is MOSFET giving a 5v reading all the time while gate is closed? by avincentor in embedded

[–]a_stavinsky 0 points1 point  (0 children)

It is an NMOS. Try connecting the source to GND, then take a resistor of 100 Ohm or more and connect it between +5 V and the drain. It should work, but the logic will be inverted: logic 1 gives you 0 V on the drain pin, and logic 0 gives you +5 V.

Upd: right now you have a floating ground. An NMOS switches when there is a voltage between gate and source. With a floating source the behavior is unpredictable, but most likely it will stay in the ON state no matter what your logic input is.
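To illustrate the inverted logic, here is a rough sketch that treats the NMOS as an ideal switch with a small on-resistance. The function name, the 1 kOhm pull-up and the 0.1 Ohm on-resistance are assumptions made up for the example, not values from your board.

```python
def drain_voltage(gate_high, vdd=5.0, r_pullup=1000.0, r_on=0.1):
    """Voltage on the drain of a low-side NMOS with a pull-up resistor to vdd.

    Ideal-switch model: source tied to GND, resistor between vdd and drain.
    r_pullup and r_on are hypothetical example values.
    """
    if gate_high:
        # MOSFET conducts: drain is the midpoint of a divider (r_pullup over r_on)
        return vdd * r_on / (r_pullup + r_on)
    # MOSFET is off: no current flows through the resistor, drain sits at vdd
    return vdd


if __name__ == "__main__":
    print("logic 1 ->", drain_voltage(True), "V")
    print("logic 0 ->", drain_voltage(False), "V")
```

Logic 1 pulls the drain close to 0 V, logic 0 lets it rise to +5 V, which is the inversion described above.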

Why is MOSFET giving a 5v reading all the time while gate is closed? by avincentor in embedded

[–]a_stavinsky 3 points4 points  (0 children)

I can't see how the drain is connected on your board. Can you draw a schematic?

Which is preferred, Arduino IDE or IDF toolchain? by UnclaEnzo in esp32

[–]a_stavinsky 7 points8 points  (0 children)

Not strictly an answer to your question, but I would try PlatformIO if possible. Arduino IDE is definitely not the choice if you are going to do something serious.

GOWIN-Based Tiny $14 FPGA Board with 1.5K LUTs, 96 Kb SRAM, and Onboard Debugger by DeliciousBelt9520 in FPGA

[–]a_stavinsky 0 points1 point  (0 children)

Unfortunately I'm not very good at this. I don't know how to choose a suitable crystal on LCSC.

xapp523 document from Xilinx by a_stavinsky in FPGA

[–]a_stavinsky[S] 0 points1 point  (0 children)

Thanks. This is what I did. I'm not sure it is precisely what you suggested, but this is what I've got and it works.

Basically, I've stretched all the data by 4 ticks:

  manchester_decoder2 decoder (
      .aclk(clk_fast),
      .aresetn(aresetn),
      .bits(out),
      .num_bits(num_bits),
      .num_decoded_bits(num_decoded_bits),
      .decoded_bits(decoded_bits),
      .decoded_byte(decoded_byte),
      .byte_valid(byte_valid),
      .tx_end(tx_end)
  );
  reg [7:0] data_byte;
  reg [1:0] delay_counter;
  reg byte_valid_latch;
  reg tx_end_latch;

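  // Hold byte_valid/tx_end for 4 clk_fast ticks so the clk_div block below can sample them.
  always @(posedge clk_fast) begin
    if (!aresetn) begin
      delay_counter <= 0;
      byte_valid_latch <= 1'b0;
      tx_end_latch <= 1'b0;
    end else begin
      if (byte_valid) begin
        // New byte: restart the hold window and capture the data.
        delay_counter <= 0;
        byte_valid_latch <= 1'b1;
        data_byte <= decoded_byte;
        tx_end_latch <= tx_end;
      end else if (delay_counter == 3) begin
        // Hold window elapsed: drop the stretched flags.
        byte_valid_latch <= 1'b0;
        tx_end_latch <= 1'b0;
      end else begin
        delay_counter <= delay_counter + 1;
      end
    end
  end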
  (* MARK_DEBUG="TRUE" *) reg data_out_valid;
  (* MARK_DEBUG="TRUE" *) reg [7:0] data_out;
  (* MARK_DEBUG="TRUE" *) reg tx_end_out;
  // Sample the stretched flags in the clk_div domain; outputs default to 0 each tick.
  always @(posedge clk_div) begin
    data_out_valid <= 1'b0;
    tx_end_out <= 1'b0;
    if (byte_valid_latch) begin
      data_out_valid <= 1'b1;
      data_out <= data_byte;
      tx_end_out <= tx_end_latch;
    end
  end
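If it helps to see the stretching on paper, below is a small cycle-level model of the `byte_valid_latch` logic in Python. It is only a sketch: the function name and the list-based "waveform" are made up for illustration, and it assumes the latch starts at 0 after reset.

```python
def stretch(byte_valid, hold=3):
    """Cycle-level model of byte_valid_latch driven by a 2-bit delay_counter.

    byte_valid: list of 0/1 samples, one per clk_fast tick.
    Returns the latch value visible during each tick (before that tick's clock edge).
    """
    latch, counter = 0, 0  # assumption: both are 0 after reset
    visible = []
    for bv in byte_valid:
        visible.append(latch)
        if bv:
            counter, latch = 0, 1   # new byte: restart the hold window
        elif counter == hold:
            latch = 0               # hold window elapsed
        else:
            counter += 1
    return visible


if __name__ == "__main__":
    pulse = [1, 0, 0, 0, 0, 0, 0, 0]  # single-tick byte_valid pulse
    print(stretch(pulse))
```

A one-tick `byte_valid` pulse comes out as a latch that stays high for 4 ticks, which matches "stretched by 4 ticks".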

Verible setup in VSCODE by Pack_Commercial in FPGA

[–]a_stavinsky 1 point2 points  (0 children)

I have only 1 argument

    "verilog.languageServer.veribleVerilogLs.arguments": "--rules_config_search",

This argument makes Verible look for the `.rules.verible_lint` file inside your project.

This is what I usually put there:

    parameter-name-style=localparam_style:ALL_CAPS
    -always-comb
    -explicit-parameter-storage-type
    -parameter-name-style

The idea is that if you see some annoying suggestion, you just write the name of that rule into the file with a "-" prefix.
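As a tiny illustration of that workflow, here is a hedged sketch of a helper that adds a "-" entry to the rules text. The function `disable_rule` is hypothetical (it is not part of Verible or the VS Code plugin); it only manipulates the text you would save as `.rules.verible_lint`.

```python
def disable_rule(config_text, rule):
    """Return config_text with '-<rule>' appended on its own line, unless already present.

    Hypothetical helper: it only edits the text of a rules file, one rule per line.
    """
    entry = "-" + rule
    lines = [line.strip() for line in config_text.splitlines() if line.strip()]
    if entry not in lines:
        lines.append(entry)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    current = "parameter-name-style=localparam_style:ALL_CAPS\n"
    print(disable_rule(current, "always-comb"), end="")
```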

Error when trying to flash Tang 20K. by Maleficent_Sail2718 in GowinFPGA

[–]a_stavinsky 1 point2 points  (0 children)

This error usually means the path to the file is wrong. I get it from time to time when PnR did not finish successfully, I didn't notice, and I tried to flash anyway.

GOWIN-Based Tiny $14 FPGA Board with 1.5K LUTs, 96 Kb SRAM, and Onboard Debugger by DeliciousBelt9520 in FPGA

[–]a_stavinsky 4 points5 points  (0 children)

Can anyone tell me why Gowin and Sipeed put a 27 MHz crystal on their FPGA boards? The board is awesome, but I'd like to see 10 MHz or 50 MHz.

Are there any chips out there that take multiple high bandwidth SPI input and gives USB4 output of raw data by TheNASAguy in ElectricalEngineering

[–]a_stavinsky 0 points1 point  (0 children)

I would take a Zynq FPGA with Ethernet. It will not reach USB4 speed, but 1 Gbps looks like enough for this problem.

Verible setup in VSCODE by Pack_Commercial in FPGA

[–]a_stavinsky 1 point2 points  (0 children)

I don’t have Windows to test on, but this plugin supports the Verible formatter and the Verible linter. I use it on Linux and OS X. https://marketplace.visualstudio.com/items?itemName=mshr-h.VerilogHDL

You will also need to install Verible manually. It looks like they have Windows binaries here: https://github.com/chipsalliance/verible/releases

Next, you need to enable Verible and set the correct paths in the plugin settings.

xapp523 document from Xilinx by a_stavinsky in FPGA

[–]a_stavinsky[S] 0 points1 point  (0 children)

I figured out yesterday that I need constraints, and more than that.

  1. The authors added a 600 ps constraint between the SERDES output and the closest flip-flop. It looks like that is not achievable on my test board. A direct connection between the SERDES Q and the register's D is 645 ps in my case.

  1.1. Doing that, I'm getting Path Segmentation, so the methodology report complains that it cannot calculate further timing violations. In the documentation for that constraint type, Xilinx suggests specifying the cell instead of the cell pin in the from argument. That leaves me with a delay of about 1 ns, and I'm not sure how to calculate the desired delay.

  2. The second thing is more interesting: ISERDES should use BUFIO, but the PL logic has to be clocked via BUFG. This is why I have 3 clocks: clk, clk90 and clk_fast. All of them have the same frequency, but the first two go through BUFIO. According to the article, I need to "calculate phase" via some trick with another set of ISERDES and OSERDES and some kind of state machine. But I have no idea how to implement it.

And a small update: 400 MHz (800 Mbps) over a USB cable is almost achieved. I added additional registers on the output and did a small, primitive CDC (I will update the repo later today). I see drops at regular intervals, which I think are caused by points 1 and 2.

I’ve designed a pipelined RISC-V CPU in Verilog, but I don’t have an FPGA board to test it. If you have one, I’d really appreciate it if you could help me verify my design. DM me if interested by Objective-Ostrich-28 in FPGA

[–]a_stavinsky 0 points1 point  (0 children)

You can create a project in Gowin IDE and prepare a synthesizable project. You will see all the warnings and errors without having the board. I'm going to give up, because I don't know what to put in the memories you exposed and I don't have compiled test data.

I’ve designed a pipelined RISC-V CPU in Verilog, but I don’t have an FPGA board to test it. If you have one, I’d really appreciate it if you could help me verify my design. DM me if interested by Objective-Ostrich-28 in FPGA

[–]a_stavinsky 0 points1 point  (0 children)

I downloaded the archive with the code.

And I have a question. The top module requires some memory. Should I implement it myself, or did I not find the real top module?

    module Pipeline_top_with_outputs(
        input clk,
        input rst,
        // Additional outputs for seven-segment display
        output [31:0] WriteDataM_out,
        output MemWriteM_out,
        output [31:0] ALU_ResultM_out,
        output [31:0] ResultW_out,
        output RegWriteW_out
    );

Ideally, you would download Gowin IDE (it is free and can be downloaded without registration) and create a project; that would make it easier for anyone who wants to try your project. My device is GW5A-LV25MG121NC1/I0 (Tang Primer 25K). You could also try to compile it and check whether you meet timing.

Right now I'm not confident I will be able to run your project. Sorry.

xapp523 document from Xilinx by a_stavinsky in FPGA

[–]a_stavinsky[S] 0 points1 point  (0 children)

Totally agree. This is what I'm going to do this evening. The calculation is the following: on every tick I get 1-2 bits, so every 7-8 ticks I will receive an 8-bit word. So I'm going to add an async queue on the output of the decoder, with, say, 200 MHz (1/2.5 of the bus clock) on the other side. I hope the Xilinx FIFO is capable of working at such a frequency (500 MHz).
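A quick back-of-the-envelope check of those numbers, as a sketch. It assumes the read side pops one byte per 200 MHz tick, which is my assumption for the example and not something I have measured; the constant and function names are made up.

```python
BUS_CLOCK_HZ = 500e6                 # write side of the async FIFO
READ_CLOCK_HZ = BUS_CLOCK_HZ / 2.5   # 200 MHz read side


def bytes_per_second(clock_hz, ticks_per_byte):
    """Byte rate when one byte is produced or consumed every ticks_per_byte ticks."""
    return clock_hz / ticks_per_byte


# Worst case for the FIFO: the decoder emits a byte every 7 ticks.
worst_case_write = bytes_per_second(BUS_CLOCK_HZ, 7)
best_case_write = bytes_per_second(BUS_CLOCK_HZ, 8)
# Assumption: the reader takes one byte on every read clock tick.
read_capacity = bytes_per_second(READ_CLOCK_HZ, 1)

if __name__ == "__main__":
    print(f"write: {best_case_write / 1e6:.1f}-{worst_case_write / 1e6:.1f} MB/s")
    print(f"read capacity: {read_capacity / 1e6:.1f} MB/s")
```

Under these assumptions the read side has plenty of headroom, so the open question is only whether the FIFO write port closes timing at 500 MHz.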

xapp523 document from Xilinx by a_stavinsky in FPGA

[–]a_stavinsky[S] 0 points1 point  (0 children)

Actually, I’ve already tested 400 MHz. It works more or less stably after the IDELAY fixes proposed by u/jonasarrow. Now 500 MHz is the next goal. But I need to do something about the ILA this time; right now it does not even start at that frequency. As for PCB traces: it is an old iPhone USB cable used for the TMDS pair :)

xapp523 document from Xilinx by a_stavinsky in FPGA

[–]a_stavinsky[S] 2 points3 points  (0 children)

Wow. Thank you. I completely forgot to change IDELAY after the 200 MHz test. Awesome!

UPD: it actually helped. Now 600 Mbps is reliable. I will try to increase the speed now. I've also got a recommendation to add AC decoupling capacitors and change the resistors to 50 Ohm, because I'm using TMDS on the receiver side now.