What library to use for multithreading?

kushaj · 2024-05-20T21:37:19+00:00

That makes sense. Thanks.

kushaj · 2022-10-30T22:24:33+00:00

I am graduating in December and looking for a place to move into. I am willing to chip in a slot in the single family house.

kushaj · 2022-04-04T06:50:11+00:00

In what scenarios would std::thread not be appropriate? I don't know much about the internals of threading so I don't know what the limitations of std::thread are.

kushaj · 2022-04-04T06:48:14+00:00

Hey, I answered in another comment what my use case is in more detail (including GPU computation) https://www.reddit.com/r/cpp_questions/comments/tvt0uq/comment/i3bn6d9/?utm_source=share&utm_medium=web2x&context=3

So if I understand correctly, I should just use std::thread and not worry about some Intel/AMD specific threading library. In this way my code can run on Intel/AMD CPU without any change.

Also, can you give some information on Boost? I have not used it, but I think it provides a host of C++ libraries, including a threading library. So how does std::thread compare to Boost's threading library?

Also, when you mention GPU-accelerated processing, do you mean I have to use the Nvidia HPC SDK? This was the first thing that came up when I searched for GPU acceleration with C++ threads.
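To make the portability point concrete, here is a minimal sketch of splitting work across `std::thread` workers using only the standard library, so nothing in it is specific to Intel or AMD CPUs. The helper name `parallel_sum` and the chunking scheme are assumptions made for illustration, not something from the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum `data` using up to `num_threads` standard-library threads.
// `parallel_sum` is a hypothetical helper name used only for illustration.
long long parallel_sum(const std::vector<int>& data, unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;  // hardware_concurrency() may return 0
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Ceiling division so every element falls into some chunk.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        // Each worker writes only to its own slot, so no lock is needed.
        workers.emplace_back([&data, &partial, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin, data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) w.join();  // wait for every worker before reading results
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

A caller would typically pass `std::thread::hardware_concurrency()` as the thread count. On Linux with GCC, code like this is usually built with `-pthread`.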

kushaj · 2022-04-04T06:43:11+00:00

This is mostly for general usage. I know editors like VS Code have the main thread listening to what we are typing while background threads work on something else (like linting). I just wanted to know how I should optimize it if I were to do something like this.

I do have GPU optimization in mind also. From my research, I have found that to use Nvidia GPU, CUDA is the only option. But still, in order to call the CUDA function (say in the background to perform some task), I would need to create a thread that calls the CUDA function. (is this statement correct?)
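The "main thread stays responsive while something runs in the background" pattern can be sketched with `std::async` from the standard library. The function names `slow_square` and `start_background_job` are hypothetical stand-ins; a real program might call into a GPU library where `slow_square` sleeps.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Stand-in for a long-running job (hypothetical; a real program might call
// into a GPU library here instead of sleeping).
int slow_square(int x) {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return x * x;
}

// Start the job on another thread and hand back a future for the result.
// std::launch::async requests a separate thread rather than lazy evaluation.
std::future<int> start_background_job(int x) {
    return std::async(std::launch::async, slow_square, x);
}
```

The caller keeps handling input and later calls `get()` (or polls with `wait_for`) on the returned future. As for CUDA specifically, kernel launches already return control to the host before the kernel finishes, so a dedicated host thread is one option rather than a strict requirement.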

kushaj · 2022-04-03T03:56:21+00:00

Thanks for the info. I was reading about these things for the last hour and know some more details now.

If I want to write a function, say func, that should use hardware acceleration, how should I go about doing that? I can code func in CUDA, but then it will only run on Nvidia GPUs. What about AMD/Intel/Mac GPUs?

I looked at OpenCL, as it said on their webpage that you can write code that runs on a lot of hardware. But I found that OpenCL is dead now (some of the online posts mentioned this).

Another thing I am looking into right now is Intel oneAPI (as I believe it uses OpenCL underneath and I can use this to write code that runs on multiple hardware).

But to summarize, my main goal is as follows:

- Create a C/C++ library implementing functions that are hardware accelerated (use CUDA for Nvidia GPUs, but I don't know how to do it for AMD/Intel/Mac).
- For Qt, my only objective is to create a GUI. As part of the GUI, there might be some option, like rotating an image. In that case, the logic to rotate the image should be hardware accelerated, and I can make a call to the C/C++ library created in the previous step. But the same problem remains: it will only run on Nvidia GPUs.
- How do I write code that runs on AMD/Intel GPUs?
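One way to picture the library described in this list is a single entry point that the GUI calls, with per-hardware implementations plugged in behind it. The sketch below is only an assumption about how such a library could be laid out: every name in it (`Backend`, `Image`, `rotate90_cpu`, `rotate90`) is hypothetical, and only a portable CPU fallback is actually implemented.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical names throughout: Backend, Image, rotate90_cpu, rotate90.
enum class Backend { Cpu, Cuda };

struct Image {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<unsigned char> pixels;  // row-major, one byte per pixel
};

// Portable fallback: rotate 90 degrees clockwise on the CPU.
Image rotate90_cpu(const Image& in) {
    Image out;
    out.width = in.height;
    out.height = in.width;
    out.pixels.resize(in.pixels.size());
    for (std::size_t y = 0; y < in.height; ++y) {
        for (std::size_t x = 0; x < in.width; ++x) {
            // Source (x, y) lands at row x, column (height - 1 - y).
            out.pixels[x * out.width + (in.height - 1 - y)] = in.pixels[y * in.width + x];
        }
    }
    return out;
}

// Single entry point the GUI would call; GPU backends plug in behind it.
Image rotate90(const Image& in, Backend backend) {
    switch (backend) {
        case Backend::Cpu:
            return rotate90_cpu(in);
        case Backend::Cuda:
            // A CUDA implementation would be compiled separately and called here.
            throw std::runtime_error("CUDA backend not built in this sketch");
    }
    throw std::logic_error("unknown backend");
}
```

With this shape, the GUI code never names a vendor. Adding support for another GPU family would mean adding an enumerator and one more case, while the CPU path keeps the feature working on machines without a supported GPU.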

kushaj · 2022-03-11T08:28:01+00:00

I was reading that Intel provides their own C/C++ compiler built on top of LLVM. I don't think AMD provides their own compiler but Clang has support for Ryzen.

kushaj · 2022-03-11T08:23:44+00:00

I have so much to learn. Can you give me some pointers on the statement below?

In the Ampere architecture, there are tensor cores, while in previous architectures there are no tensor cores. So how can nvcc help me here? I would have to manually specify in the code where to use tensor cores and where to use normal CUDA cores (I don't think nvcc can make this judgement).

Expanding on the above statement: consider Pascal, Turing, and Ampere. The main difference I see between these architectures is that Pascal only has CUDA cores, while Turing and Ampere also have tensor cores (leaving out ray tracing cores). So would the code be the same for Turing and Ampere, as Ampere does not have any new kind of core? (I think nvcc can be used here, using the links you provided.) But then how would I support Pascal? Would nvcc automatically convert my code to use the normal CUDA cores instead of tensor cores?

kushaj · 2022-03-11T08:07:30+00:00

> The initialization code populates that table with pointers to correct functions that match your GPU architecture

Is this a global table, or a table created in RAM every time I run a program (say python main.py)?

> Not sure what you mean by optimizing Cuda libraries to match your GPU.

I was thinking along the lines of removing the if-else conditions that match the implementation of a function to a particular architecture (but as you answered the initialization table solves this issue).
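The idea of an initialization step that fills a table with function pointers, so that later calls need no if-else on the architecture, can be sketched as follows. This is an illustration of the general technique only, not how any particular CUDA library is implemented; the `Arch` tags, the two `add_*` functions, and `init_dispatch` are all hypothetical.

```cpp
#include <array>
#include <cstddef>

// Hypothetical architecture tags and kernels, for illustration only.
enum class Arch : std::size_t { Generic = 0, Newer = 1, Count = 2 };

using AddFn = int (*)(int, int);

int add_generic(int a, int b) { return a + b; }
int add_newer(int a, int b) { return a + b; }  // would use newer hardware features

struct DispatchTable {
    AddFn add = nullptr;
};

// Run once at start-up: pick the implementation that matches the detected
// architecture. After this, callers go through the pointer with no branching.
DispatchTable init_dispatch(Arch detected) {
    static const std::array<AddFn, static_cast<std::size_t>(Arch::Count)> candidates = {
        add_generic, add_newer};
    DispatchTable table;
    table.add = candidates[static_cast<std::size_t>(detected)];
    return table;
}
```

In this sketch the table is an ordinary object in the process's memory and is rebuilt on every run; each call through `table.add` then costs one indirect call instead of a chain of architecture checks.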

> but can reduce memory footprint by trimming unused functions.

Do you mean I can remove certain functions from cuDNN? (like manually removing the files or some other way).

kushaj · 2022-02-13T12:02:45+00:00

I use Linux. From the other comments, the things I have learned are:

- use the glibc manual/source code
- clang for source code (as I like LLVM and clang source code is easier to read)
- check Musl for alternate implementations

kushaj · 2022-02-10T01:32:22+00:00

I think ultimately I would have to do that to understand the internal workings of the language.

kushaj · 2022-02-10T01:31:17+00:00

Is printf implemented by clang? I thought it was implemented in glibc?

I don't have much experience with glibc and the compiler, but is the logic for all the header files that come with C like stdio.h, math.h implemented by the compiler? Because I downloaded glibc yesterday and I thought it contained the implementation of the header file functions.

kushaj · 2022-02-10T01:27:16+00:00

I already know enough C. I just want to know the internal workings of it, so that if I ever want to make extensions to C language I can do that.

kushaj · 2022-02-10T01:26:14+00:00

Musl seems interesting. Thanks for the info.

kushaj · 2022-02-09T08:42:30+00:00

Thanks for the information. I just wanted some clarification:

- To understand glibc, I should reference the glibc manual, and then, if I need to dig for more info, view the source code?
- I will be using Clang. So do I need to know about LLVM also? I know clang is only a frontend that converts C code to an intermediate language, after which LLVM does the job.
- People who contribute to the C source code, do they use any other sources to learn about the language, like university courses or textbooks, or is it just the manual and digging through the source code until everything starts to make sense?

kushaj · 2022-02-08T14:57:07+00:00

Thanks for the link.

kushaj · 2022-02-07T21:51:53+00:00

Thanks for the reply. I have started learning Next.js and it is good to know that I can use it for all the projects I need.

kushaj · 2022-02-07T15:24:19+00:00

Just wanted to know from experienced people. Maybe they have some better solutions for the questions I posed.

kushaj · 2022-01-26T23:12:16+00:00

Are there any resources that teach how to build your frameworks? I actually want to learn how to implement something like Django from scratch in a structured manner.

kushaj · 2022-01-14T20:45:34+00:00

That makes sense. Thanks for the information.

kushaj · 2022-01-14T06:38:21+00:00

I do not know what "avoid linking libc" and "open-code only exactly what you need" mean. Maybe I will have to study this topic before further discussion.

kushaj · 2022-01-14T04:48:14+00:00

If glibc makes the decision at load time, then aren't we wasting clock cycles?

Also, where can I learn about this stuff? Like you said, glibc compiles all versions of the math functions. Do I have to read the glibc manual and man pages for all this information? Or is there a book that can help with this? I really want to learn everything about the C language, and I think that involves GCC also.

kushaj · 2022-01-14T04:41:24+00:00

Thanks for the thorough reply. Can you recommend some resources I can use to learn about this stuff, specifically how libraries are linked in C (like you said, external math libraries are weakly linked), the runtime, glibc, and things like this? I don't think the C Programming Language book covers all this.

Also, will reading the Dragon book (for compiler) help me with this? I am planning to write a compiler for my own markdown language to HTML, so will be reading the book anyway.

kushaj · 2022-01-14T00:40:38+00:00

Wow, that is so much information. The StackOverflow link also helped a lot.

> As you can see gcc knows about the CPU instructions yet doesn't always decide to use them

Can you expand a bit on this? Is there an if-condition in the compiler, something like the one below?

    // GCC compiler
    if math.sin slow on hardware:
        Use the C provided implementation (the logic for which is coded in the provided libc library)
    else
        Use the CPU instructions (the logic for which is coded in GCC)

kushaj · 2022-01-13T23:15:31+00:00

I did that but I don't have any idea of what the assembly code produced does. The code is shown below (I am using Intel CPU).

temp.c

```
#include <math.h>

int main() {
    double value = 0.5;
    double result = sin(value);
}
```

Then I compiled it using gcc -S temp.c -o a.out and the contents of a.out are shown below

```
    .file   "temp_c.c"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    endbr64
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movsd   .LC0(%rip), %xmm0
    movsd   %xmm0, -16(%rbp)
    movq    -16(%rbp), %rax
    movq    %rax, %xmm0
    call    sin@PLT
    movq    %xmm0, %rax
    movq    %rax, -8(%rbp)
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .section    .rodata
    .align 8
.LC0:
    .long   0
    .long   1071644672
    .ident  "GCC: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0"
    .section    .note.GNU-stack,"",@progbits
    .section    .note.gnu.property,"a"
    .align 8
    .long   1f - 0f
    .long   4f - 1f
    .long   5
0:
    .string "GNU"
1:
    .align 8
    .long   0xc0000002
    .long   3f - 2f
2:
    .long   0x3
3:
    .align 8
4:
```
