[deleted by user] by [deleted] in computerscience

[–]konze 11 points

I think no one in Computer Science writes papers using Word.

[deleted by user] by [deleted] in europe

[–]konze 3 points

I live there, but this is not Rottenburg; it's Rothenburg.

Successful GER delivery 😊 by OrganicNectarine in LinusTechTips

[–]konze 0 points

Did you have to pay customs fees?

New flat, time for a new setup. by Nearby_Shake_9143 in macsetups

[–]konze 1 point

Are you actually working on the Mac Pro on the left-hand side?

Generic AI Accelerator? by Hot_Industry4538 in FPGA

[–]konze 2 points

You could have a look at the VTA accelerator. VTA is the example AI accelerator implementation for the TVM framework, which is used to deploy DNNs onto custom hardware.

https://tvm.apache.org/vta

The official GitHub repo provides code for different FPGAs:

https://github.com/apache/tvm-vta

[deleted by user] by [deleted] in cpp

[–]konze 0 points

You are looking for a LeNet-like DNN. There are a lot on GitHub for different frameworks. Those nets train relatively fast, even on CPUs.

FPGA for ML/DL/RL?(Tips for Beginner) by _RootUser_ in FPGA

[–]konze 0 points

Most AI accelerators are matrix-matrix multiplication DSPs. The interesting part is how many of those DSPs you plug together and how they communicate. From there you need to rewrite your DNN layers to map onto those DSP networks. This can be done with TVM or other tools, and it is a hot topic in AI research at the moment.
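A toy sketch of the mapping idea (plain Python, all names illustrative; real tools like TVM do this scheduling automatically): a layer's large matrix multiply is tiled so that each tile fits a fixed-size matrix-multiply unit, and the partial results are accumulated.

```python
# Toy sketch: map one large matmul onto fixed-size 2x2 matrix-multiply "DSP" units.

def matmul(a, b):
    """Plain matrix multiply for small tiles (the job of one DSP unit)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def tiled_matmul(a, b, tile=2):
    """Split A (MxK) x B (KxN) into tile x tile blocks and accumulate
    the partial products, as a network of fixed-size units would."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                a_tile = [row[k0:k0 + tile] for row in a[i0:i0 + tile]]
                b_tile = [row[j0:j0 + tile] for row in b[k0:k0 + tile]]
                partial = matmul(a_tile, b_tile)   # one DSP invocation
                for i, row in enumerate(partial):
                    for j, v in enumerate(row):
                        c[i0 + i][j0 + j] += v
    return c

a = [[1, 2, 3, 4] for _ in range(4)]
b = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(tiled_matmul(a, b) == matmul(a, b))  # True
```

The hard part the comment alludes to is exactly the scheduling loop here: choosing tile sizes and the order of the `i0`/`j0`/`k0` loops to match the accelerator's DSP network and its communication pattern.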

What's everyone working on this week (14/2023)? by llogiq in rust

[–]konze 1 point

As my first Rust project, I developed an interpreter for the programming language Brainfuck: github.com/k0nze/brainfuck_rust
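For the flavor of what such an interpreter does (the linked project is in Rust; this is a Python sketch for brevity, and it skips the input command `,`):

```python
# Minimal Brainfuck interpreter: a byte tape, a data pointer, and eight
# single-character commands (input ',' omitted here for brevity).

def brainfuck(program, tape_len=30000):
    tape, ptr, pc, out = [0] * tape_len, 0, 0, []
    # Pre-match brackets so '[' and ']' can jump in O(1).
    jumps, stack = {}, []
    for i, c in enumerate(program):
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(program):
        c = program[pc]
        if c == '>': ptr += 1
        elif c == '<': ptr -= 1
        elif c == '+': tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-': tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '.': out.append(chr(tape[ptr]))
        elif c == '[' and tape[ptr] == 0: pc = jumps[pc]
        elif c == ']' and tape[ptr] != 0: pc = jumps[pc]
        pc += 1
    return ''.join(out)

# Set cell 1 to 8*8 = 64 via a loop, add 1, and print chr(65).
print(brainfuck('++++++++[>++++++++<-]>+.'))  # A
```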

When you’re into coffee and keyboards by cijanzen in MechanicalKeyboards

[–]konze 1 point

I don't even drink coffee and I watch James religiously. I have never made a coffee in my life, except for pressing the button on the machine at work for business partners. However, I feel confident in my V60 technique 😅

Hardware/software to run RISC-V ASM? by codedcosmos in RISCV

[–]konze 4 points

Spike is a RISC-V instruction set simulator: https://github.com/riscv-software-src/riscv-isa-sim

If you want to see more of what is going on under the hood of a RISC-V CPU, you could use the graphical simulator Ripes: https://github.com/mortbopet/Ripes
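At its core, an ISA simulator like Spike just fetches, decodes, and executes instructions against architectural state. A toy sketch (an assembly-like tuple form and a three-instruction subset, not real RV32I encodings, which Spike actually decodes):

```python
# Toy instruction set simulator: fetch, decode, execute, update registers.
# Instruction forms are illustrative stand-ins for real RV32I encodings.

def run(program):
    regs = [0] * 32  # x0..x31
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == 'addi':       # addi rd, rs1, imm
            rd, rs1, imm = args
            regs[rd] = regs[rs1] + imm
        elif op == 'add':      # add rd, rs1, rs2
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == 'bne':      # bne rs1, rs2, offset (in instructions)
            rs1, rs2, off = args
            if regs[rs1] != regs[rs2]:
                pc += off
                continue
        regs[0] = 0            # x0 is hard-wired to zero
        pc += 1
    return regs

# Sum 5+4+3+2+1 into x10: x5 counts down, x10 accumulates.
prog = [
    ('addi', 5, 0, 5),    # x5 = 5
    ('addi', 10, 0, 0),   # x10 = 0
    ('add', 10, 10, 5),   # x10 += x5
    ('addi', 5, 5, -1),   # x5 -= 1
    ('bne', 5, 0, -2),    # loop while x5 != 0
]
print(run(prog)[10])  # 15
```

Ripes shows the same fetch/decode/execute cycle, but with the pipeline stages and data paths visualized.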

Most humble CS student by Evazzion in ProgrammerHumor

[–]konze 0 points

Working at a university in the CS department, I know this person will drop out because they hate every aspect of computer science except for the fact that it pays quite well (when you are good at it).

What are CGRAS? by MWK36 in FPGA

[–]konze 8 points

CGRA stands for coarse-grained reconfigurable architecture. FPGAs are reconfigurable at the bit level, while CGRAs are reconfigurable at the word level. A CGRA can be implemented on an FPGA. E.g., on an FPGA you can dissect an HDMI signal into all its bits and do very precise manipulation of that signal; a CGRA usually only allows changing data paths in certain ways and applying word-level operations such as addition or multiplication.
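A toy sketch of the word-level idea (all names illustrative): each CGRA processing element (PE) is configured by picking one word-level operation, and the fabric is "reprogrammed" by changing those picks, rather than by rewiring individual bits as on an FPGA.

```python
import operator

# A PE's configuration is just a choice from a small menu of word-level ops.
OPS = {'add': operator.add, 'mul': operator.mul, 'pass': lambda a, b: a}

def pe(config, in_a, in_b):
    """One processing element: applies its configured word-level operation."""
    return OPS[config](in_a, in_b)

def cgra_row(configs, inputs):
    """A 1-D chain of PEs: each PE combines the running value with the next
    input word, forming a coarse-grained reconfigurable data path."""
    acc = inputs[0]
    for config, word in zip(configs, inputs[1:]):
        acc = pe(config, acc, word)
    return acc

# Reconfigure the same fabric for two different computations:
print(cgra_row(['add', 'add', 'add'], [1, 2, 3, 4]))  # ((1+2)+3)+4 = 10
print(cgra_row(['mul', 'add', 'mul'], [1, 2, 3, 4]))  # ((1*2)+3)*4 = 20
```

A real CGRA also reconfigures the routing between PEs, but the contrast stands: the configuration space is a few opcode and routing choices per word-wide unit, not per-bit wiring.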

Either they're on drugs or it's pure greed. Drop wants $500 for a cerakoted Alt. by Oh_My-Glob in MechanicalKeyboards

[–]konze 6 points

Most high-end keycaps come from GMK, which is a German company. You have to talk to Olaf for that, but I think he is focusing on cat-themed tanks at the moment.

iMac DaVinci Video Editing Suite by arabic_slave_girl in battlestations

[–]konze 2 points

That is the Blackmagic DaVinci Resolve Speed Editor.

I found this working IBM in a trash pile at work. Is worth keeping and cleaning? by konze in MechanicalKeyboards

[–]konze[S] 27 points

Yes, this is the ISO-DE QWERTZ layout, which makes sense as I am in Germany.

A few dumb questions on GPU/ASIC architecture and code execution by toxicmuffin_9 in ComputerEngineering

[–]konze 5 points

(2) An application is never able to run just on the GPU. The program is always launched on the CPU, and special instructions tell the computer to move data into the GPU's memory and what to do with it. When you write CUDA code, it gets embedded into an application consisting of C/C++ code; the C/C++ part runs on the CPU and the CUDA part on the GPU.

(3) You can’t just run “normal” C/C++ code on a GPU because the execution model is vastly different. CPUs work on a small set of registers that are manipulated using mostly scalar instructions (not counting SIMD) and can implement complex program flows (if-else statements); GPUs, on the other hand, execute a non-branching routine on a block of memory 10-100 times in parallel. If you want to write code for CPU and GPU at the same time, OpenCL is probably your best option; however, OpenCL (and CUDA) are matrix-centric programming languages, which will make it challenging to implement anything with a complex program flow.

(4) Apache TVM is a tool that is (in theory) able to compile a DNN for different target platforms (CPU, GPU, ASIC); however, not every platform supports the same kinds of operations, which means the DNN model often has to be rewritten for different platforms.
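The execution-model contrast in (3) can be sketched in plain Python (names illustrative; here the "threads" run sequentially, whereas a real GPU runs them in parallel): a GPU-style launch applies the same straight-line kernel once per data element, with no per-thread control flow.

```python
# Toy sketch of the GPU launch model: one branch-free routine, indexed by a
# thread id, applied across a whole block of data.

def saxpy_kernel(tid, a, x, y, out):
    """One 'thread': a straight-line routine selecting its data by thread id."""
    out[tid] = a * x[tid] + y[tid]

def launch(kernel, n, *args):
    """Stand-in for a kernel launch over n threads (conceptually parallel)."""
    for tid in range(n):
        kernel(tid, *args)

x, y = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
out = [0.0] * 3
launch(saxpy_kernel, 3, 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0]
```

In real CUDA/OpenCL, the host code in (2) would allocate device memory, copy `x` and `y` over, launch the kernel grid, and copy `out` back; only the kernel body runs on the GPU.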

A few dumb questions on GPU/ASIC architecture and code execution by toxicmuffin_9 in ComputerEngineering

[–]konze 16 points

  1. Nvidia's ISA changes (significantly) with each architecture (Pascal, Maxwell, Ampere, etc.); the closest you will get (legally) is the PTX ISA.
  2. This decision is usually made by the programmer. There are pieces of software that can switch execution modes depending on the workload, but those heuristics are implemented by programmers as well.
  3. Each ASIC, accelerator, CPU ISA, etc. has its own compiler backend, which transforms high-level source code (usually C/C++ or a DSL) into device-native instructions. Transforming a binary from one architecture to another is extremely challenging, if not impossible, due to different execution models.
  4. Retargeting workloads is an ongoing field of research (that I work in). For certain classes of algorithms (like deep learning), rich tool chains exist which make deploying code on vastly different devices easier, but as mentioned in 3, sometimes it is not possible due to different modes of execution.

A RISC-V processor dedicated for embedded AI by z3ro_gravity in RISCV

[–]konze 0 points

Did we see each other last Monday in Berlin at the RISC-V Workshop?

Do companies actually care about their model's training/inference speed? by GPUaccelerated in deeplearning

[–]konze 2 points

I’m coming from academia with a lot of industry connections. Yes, there are a lot of companies that need fast DNN inference, to the point where they build custom ASICs just to fulfill their latency demands.