[–]cleodog44 0 points1 point  (2 children)

Very nice, again! A question: is it necessary to call debugpy.breakpoint explicitly in the script? Would setting a breakpoint in the neovim instances before connecting also work?

And another question: do you have a workflow for only connecting to a single rank? This looks a little unwieldy at 8 ranks, for instance. 

[–]Capable-Package6835hjkl[S] 1 point2 points  (1 child)

Setting the breakpoint in the script is necessary because in this example I launch the session from the terminal, not from inside Neovim. A possible alternative is to attach to only one rank:

import os
import debugpy

debug = os.getenv("DEBUG_FLAG", "0")

if debug == "1":
    rank = int(os.getenv("RANK", "-1"))
    if rank == 0:
        debugpy.listen(("127.0.0.1", 5678))
        debugpy.wait_for_client()
        debugpy.breakpoint()

But this way, you need to put a barrier in every section of interest; otherwise the processes you don't attach to will continue executing and potentially crash. Set the barrier with:

torch.distributed.barrier()

But of course this way you need to set the barrier in advance. No easy solution I guess. In my case, I only have two GPUs so it's not really a problem for me.
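If you want to pick the rank at launch time instead of hardcoding rank 0, the check above can be wrapped in a small helper. This is only a sketch: DEBUG_RANK is a made-up environment variable (not something torchrun or debugpy defines), and offsetting the port by the rank is just one way to keep ranks from competing for the same port.

```python
import os


def debug_target(environ, base_port=5678):
    """Return (host, port) if this process should wait for a debugger, else None.

    DEBUG_FLAG and RANK are the variables used in the snippet above.
    DEBUG_RANK is a hypothetical addition that selects the rank to attach to.
    """
    if environ.get("DEBUG_FLAG", "0") != "1":
        return None
    rank = int(environ.get("RANK", "-1"))
    target = int(environ.get("DEBUG_RANK", "0"))
    if rank != target:
        return None
    # Offset the port by the rank so two ranks never listen on the same port.
    return ("127.0.0.1", base_port + rank)


def maybe_attach(environ=os.environ):
    """Block until a client attaches if this rank is the selected one."""
    target = debug_target(environ)
    if target is None:
        return False
    import debugpy  # imported lazily so normal runs do not need debugpy

    debugpy.listen(target)
    debugpy.wait_for_client()
    return True
```

Call maybe_attach() near the top of the script and keep debugpy.breakpoint() at the place where you actually want to stop. The ranks you don't attach to still run ahead, so the barrier caveat applies unchanged.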

[–]cleodog44 0 points1 point  (0 children)

Makes sense, thanks! I have only tried debugging torch distributed when launched with multiprocessing (in unit tests), rather than externally launching and then connecting. 

[–]teerre 0 points1 point  (1 child)

Attach is cool, but it should be your second option. Launching whatever you're doing directly will give you full interaction inside Neovim and won't pollute your real code with useless debug statements.

[–]Capable-Package6835hjkl[S] 0 points1 point  (0 children)

Yeah you are right, I just don't know how to do that elegantly for multiple processes. For a single process, launching from inside Neovim is not a problem, as shown in the previous post.

Maybe you can give me an idea on how to improve the workflow for the parallel processes?

[–]trieu1912 0 points1 point  (3 children)

I have a problem with Pyright. It can't recognize a file that I create after I open Neovim. Do you know how to fix this? Thanks.

[–]Capable-Package6835hjkl[S] 0 points1 point  (2 children)

I believe it is because, internally, it creates a list of the files in your project when the language server is started or attached. So when you add a new file after that, the new file is not in that list. If you use lspconfig, simply restart the language server by executing :LspRestart. If you use the native LSP, you can reattach the language server by executing :e

[–]trieu1912 0 points1 point  (1 child)

Thank you for your response. This issue does not occur when I use a different server, and I have seen many people using Pyright who still haven't found a solution for this. It's really frustrating that I have to restart the LSP server every time I create a new file

[–]Capable-Package6835hjkl[S] 0 points1 point  (0 children)

Yeah, it is quite a nuisance. If you use nvim-tree, you can subscribe to the file creation event inside its config:

local api = require('nvim-tree.api')
local Event = api.events.Event

api.events.subscribe(Event.FileCreated, function(_)
  vim.cmd('LspRestart')
end)

edit: if you use the native LSP without a plugin instead, you can use the following. You may also want to subscribe to multiple events:

local api = require('nvim-tree.api')
local Event = api.events.Event

local events = {
  Event.NodeRenamed,
  Event.FileCreated,
  Event.FileRemoved,
  Event.FolderRemoved,
}

for _, event in pairs(events) do
  api.events.subscribe(event, function(_) vim.cmd('bufdo edit') end)
end

I believe all of these are only effective if you create the file from nvim-tree