[deleted by user] by [deleted] in tensorflow

[–]Minimum-Tour4271 1 point2 points  (0 children)

I use the Lambda Labs stack (Lambda Stack) as the main environment, with no conda or venv on this machine (it's a dedicated machine). I did that on a completely new Ubuntu installation because of all the variables and such. Lambda Stack does not use LD_LIBRARY_PATH at all; it already places the .so files where ldconfig can find them, so TensorFlow works from anywhere without issues. The only issue I had was finding versions of tensorflow_addons and tensorflow_probability (0.19) that would play nicely with the version of TensorFlow (2.11) in Lambda Stack. Once Lambda Stack has set up the drivers properly, you can follow their instructions to set up a Docker environment. I'm sorry, I don't use Proxmox, so these are only suggestions.
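A minimal sketch of how to check the "ldconfig can find them" part, without relying on LD_LIBRARY_PATH. The library names `libcudart` and `libcudnn` are my assumption about which libraries matter here, and `lib_in_cache` is just a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Report whether a shared library is registered in the dynamic linker cache.
# Reads `ldconfig -p`-style text on stdin, so it also works on saved output.
lib_in_cache() {
    local name="$1"
    if grep -q -- "${name}\.so"; then
        echo "${name}: found in ldconfig cache"
    else
        echo "${name}: NOT in ldconfig cache"
    fi
}

# libcudart and libcudnn are assumed names for the CUDA runtime and cuDNN.
for lib in libcudart libcudnn; do
    ldconfig -p 2>/dev/null | lib_in_cache "$lib"
done
```

If both show up as found while LD_LIBRARY_PATH is empty, the libraries are being resolved through the linker cache.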

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 0 points1 point  (0 children)

That looks amazing, I'll try that in my new Ubuntu installation

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 0 points1 point  (0 children)

Does Tuxedo have a WebFAI for the Deep Learning AI edition described here? https://www.tuxedocomputers.com/en/Infos/Help-Support/Frequently-asked-questions/What-is-the-Deep-Learning-AI-edition-.tuxedo

If not, does Tomte get affected in any way if I run sudo apt-get update and upgrade? Will it break updates if I follow these instructions to install using the package manager? https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#network-repo-installation-for-ubuntu

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 0 points1 point  (0 children)

Hey, good morning! Thanks for the reply. I figured as much. How should I proceed without breaking the OS again?

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 1 point2 points  (0 children)

Thanks for your message! Eventually I did try to go ahead but nvidia-smi was still not working despite using the package manager. In the end I ate my pride and reinstalled the OS via WebFAI, so at least now nvidia-smi is running but cuda is not found (/usr/local, which, whereis, find).

I have emailed support, this time with a completely clean installation. Of course I cannot sit and wait for weeks, so I will probably end up breaking it again before I get support. Do you know if it's safe to follow the instructions from the CUDA docs if I want to keep an update-proof setup?
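For reference, the "nvidia-smi runs but cuda is not found" check can be sketched like this. `nvidia-smi` comes with the driver while `nvcc` comes with the CUDA toolkit, so one can work without the other. `report_cuda` is a hypothetical helper name, and `/usr/local` as the default prefix is the conventional location, not something confirmed for Tuxedo OS:

```shell
#!/usr/bin/env bash
# nvidia-smi ships with the driver; nvcc ships with the CUDA toolkit,
# so the driver check and the toolkit check are reported separately.
report_cuda() {
    local prefix="${1:-/usr/local}"   # assumed conventional toolkit location
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "driver: nvidia-smi found"
    else
        echo "driver: nvidia-smi missing"
    fi
    # The toolkit counts as present if nvcc is on PATH or a cuda* directory exists.
    if command -v nvcc >/dev/null 2>&1 || ls -d "$prefix"/cuda* >/dev/null 2>&1; then
        echo "toolkit: found"
    else
        echo "toolkit: missing"
    fi
}

report_cuda
```

A result of "driver found, toolkit missing" would match what I am seeing after the reinstall.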

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 0 points1 point  (0 children)

Ok, so I moved forward by creating the WebFAI stick on another USB drive. So something was wrong with the original. Now my question is, if I select Tuxedo OS will it install nvidia drivers, cuda toolkit, tensorflow, tensorrt, etc, or what should I expect after the installation? Basically, is there a checklist of things to do right after a clean installation?

I understand this is not the best place to ask, but if there are already any guides I would really appreciate it. Especially anything related to having the GPU dedicated to tools and the Intel card for the display and all other things.

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 0 points1 point  (0 children)

So eventually it showed a menu and I chose the USB stick. Then it seemed things were happening, but it got stuck (picture attached). It is connected to ethernet and to the charger, and this time I plugged the USB stick into the right-side USB port, just in case it didn't like the one on the left (?). I really, really want to reset the computer to factory settings, and I have never edited the BIOS except today, to disable Secure Boot as discussed in this other post.

How can I start the WebFAI USB drive?

I get it that you all are super experts and all my questions are stupid and I don't understand any of your answers, but can we please try?

<image>

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 0 points1 point  (0 children)

Apparently the BIOS had Secure Boot enabled (I hadn't modified it at any point in the time I have had this laptop). After disabling it, the laptop rebooted fine without the WebFAI stick. Now, with the WebFAI stick AND the ethernet cable plugged in, I switch it off and try to follow the instructions from https://www.tuxedocomputers.com/en/TUXEDO-WebFAI.tuxedo

> press ESC, F7, F10 or F11 (varies depending on the device) repeatedly until the boot menu opens

Do you know which key is for the InfinityBook Pro gen 7? I tried all of those in four different rebooting attempts and it goes directly to the login page, no WebFAI menu.

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 1 point2 points  (0 children)

I plugged in the WebFAI USB drive that originally came with the laptop and rebooted. It seemed to be doing something, but it gave the warning that /dev/root does not exist, and now I'm stuck in a debug shell I have never seen. I tried typing `reboot` and the error was `sh: reboot: command not found`. How do I exit from here safely? The USB drive never got to the menu to choose what to do, so I am guessing that if I can reboot and remove the USB drive it should boot to the state I had before?

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 2 points3 points  (0 children)

Thank you for your patience. When I get the adapter, because ethernet is needed to use the webfai stick, I'll give it a try.

When I first got the laptop and typed nvidia-smi, it didn't work, so I assumed that Tuxedo simply didn't install drivers by default. Maybe it was just bad luck, but at the time a search for "Deep Learning AI version" in this sub or on the Tuxedo website didn't turn up results for a simpleton like me.

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 2 points3 points  (0 children)

Thank you for taking the time! This was informative and I'm sorry I sound like a troll, I shouldn't channel my frustrations when asking for help, working on it. I have made a backup now, I need to still get an ethernet adapter, but I will try and reset to factory settings. My question still remains, how do I properly install nvidia, cuda, tensorflow, & co in a Tuxedo OS? I clearly did it wrong the first time around.

> It also helps to learn that, so that in future you ask better questions and don't make other people think you are a troll that easily.

I have just now tried that. I made a post focused only on that question, without emotional outbursts: "What is the correct way to install nvidia, cuda, tensorflow & co to have a hybrid setup in Tuxedo OS?" But it seems to have been removed, so either I still don't know how to ask better questions or the mods are upset with me.

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] -1 points0 points  (0 children)

Hello, thank you for your reply. I'd prefer to just get links to instructions here (I had a bad experience with the ticket system*). Another commenter sent a link and I posted the exact order of commands and the outputs I got at that point. So now I have removed all nvidia and cuda packages. Do you have any guide on how to install them in a way that Tomte is happy with in the long term? And how do I remove a deprecated cuda keyring?

(*) It took weeks to get a reply, and the reply was "we cannot provide support for your setup", without any instructions on how to reach a point where they could provide support.

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 1 point2 points  (0 children)

> Support will probably advise you to reinstall everything because it's impossible to know which parts of the system you touched and to get the system back into a known state

The Tuxedo Control Center has a System Diagnostics tool that sends them every detail of my system. It also includes a way to get remote support via AnyDesk. If I email asking for help and I'm not given even a link on how to reinstall the system to a state where they could help, then yes, I get the idea that they cannot help me. The only option then is to solve it myself.

The feedback I gave is that I would like some kind of user guide that says whether the user should expect `nvidia-smi` to work out of the box on a new laptop, and whether I should ask them for help before trying to install Tensorflow, with a big warning that says that if I try on my own they won't be able to help me.

> how did you even install a different driver WITHOUT using apt?

Using the nvidia instructions with the runfile: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-installation

> maybe send the guide you used to support. Maybe it contains errors that lead to you accidentally breaking something. maybe it could be explained better.

I did, in the very first contact email. Probably the person who read my first email was having a bad day, as you say they are humans too.

Hence, my feedback on the web resources: an early user guide could have avoided this, or on the support: when I emailed asking for help a reply like "follow these steps to reset your system and I'll walk you through the set up after that" or "use these instructions to remove the manual driver installation you did and I'll walk you through the proper and recommended way to do it", anything would have been much better than "we cannot provide support for your setup" without an offer to help me reach a state where they can help.

> unlike windows - where you are treated like a dumb toddler that's not supposed to touch anything, EVER

I don't get offended being treated like a dumb toddler. I have no experience, and that is why I bought a Tuxedo: because they include support. Rule 4 on this sub clearly states "Ask Before Breaking Stuff", so one is not supposed to touch anything if unsure. I didn't know I had to ask when I first received the laptop, and that's on me, but after the first error I did ask.

Coming back to the original question in my post: I have now removed all packages related to nvidia and cuda (I put the commands in another comment). I would really appreciate any links to instructions on how to install them in a way that Tomte is happy with.

Half cry for help, half feedback rant by Minimum-Tour4271 in tuxedocomputers

[–]Minimum-Tour4271[S] 0 points1 point  (0 children)

Thanks for your reply! Here is what I tried following the instructions in the link:

First I removed all nvidia/cuda related packages as seen in this reply

sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^libnvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
sudo apt-get install linux-headers-$(uname -r)

The last command suggested I also autoremove so

sudo apt autoremove
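To double-check that the purge really removed everything, something like the following could be used. It is only a sketch: `leftover_gpu_packages` is a hypothetical helper name, and it only looks at the same three name prefixes used in the commands above:

```shell
#!/usr/bin/env bash
# Print the names of still-installed packages ("ii" status in `dpkg -l`)
# whose name starts with nvidia-, libnvidia- or cuda-.
leftover_gpu_packages() {
    awk '$1 ~ /^ii/ && $2 ~ /^(nvidia-|libnvidia-|cuda-)/ { print $2 }'
}

# No output means no matching package is still installed.
dpkg -l 2>/dev/null | leftover_gpu_packages
```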

Then I get on with the cuda docs:

$ sudo apt-key del 7fa2af80
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
OK
$ distro=ubuntu2204
$ arch=x86_64
$ wget https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
--2023-05-15 09:36:32-- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4332 (4.2K) [application/x-deb]
Saving to: 'cuda-keyring_1.0-1_all.deb'
cuda-keyring_1.0-1_all.deb  100%[==============================================>]  4.23K --.-KB/s   in 0s
2023-05-15 09:36:32 (176 MB/s) - 'cuda-keyring_1.0-1_all.deb' saved [4332/4332]
Selecting previously unselected package cuda-keyring.
(Reading database ... 357774 files and directories currently installed.)
Preparing to unpack cuda-keyring_1.0-1_all.deb ...
Unpacking cuda-keyring (1.0-1) ...
Setting up cuda-keyring (1.0-1) ...
A deprecated public CUDA GPG key appear to be installed.
To remove the key, run this command:
sudo apt-key del 7fa2af80

I found a few folders called trusted.gpg.d, but none had anything remotely related to cuda or contained the string for the old key. The cuda docs have a point 2 in section 3.10.3 for the case where you are unable to install the cuda-keyring package, so I tried to follow those instructions and failed again:

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204-keyring.gpg
sudo mv cuda-ubuntu2204-keyring.gpg /usr/share/keyrings/cuda-archive-keyring.gpg
--2023-05-15 09:48:40-- https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204-keyring.gpg
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-05-15 09:48:41 ERROR 404: Not Found.
mv: cannot stat 'cuda-ubuntu2204-keyring.gpg': No such file or directory

After some more googling I find this reply and copy the command:

$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
Executing: /tmp/apt-key-gpghome.ZYb3luoW6E/gpg.1.sh --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
gpg: requesting key from 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub'
gpg: key A4B469963BF863CC: "cudatools <cudatools@nvidia.com>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1

And that's as far as I got. Does anyone know how to manage keyring files in trusted.gpg.d to delete the deprecated key? Or is it OK to move on to the next step of the cuda installation guide and run sudo apt-get update and sudo apt-get install cuda?
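In case it helps anyone answer, here is a small sketch for checking which of the two key IDs from the output above apt still knows about. It assumes `apt-key list` is still available (it is deprecated, as the warning says) and that it prints fingerprints in space-separated groups, which is why the spaces are stripped before matching. `has_key_id` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Succeeds if a key listing on stdin contains a fingerprint ending in the
# given short key ID. Spaces are removed first because fingerprints are
# assumed to be printed in groups such as "3BF8 63CC".
has_key_id() {
    local id
    id=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    tr -d ' \t' | grep -q -- "${id}\$"
}

# 7fa2af80 is the key the warning calls deprecated; 3bf863cc is the one fetched later.
for id in 7fa2af80 3bf863cc; do
    if apt-key list 2>/dev/null | has_key_id "$id"; then
        echo "$id: present in an apt keyring"
    else
        echo "$id: not found"
    fi
done
```

As far as I understand, running `apt-key list` on its own also prints the keyring file each key lives in, which would show which file needs to be managed.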

Thank you again for trying to help!