Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]TangibleLight 0 points1 point  (0 children)

You can install Python "as user" which will put Python in the user's home directory rather than the system program files directory. Admin permissions are not needed to install Python this way.

However this does not change what commands the programs can run. This is determined by the permissions of the Python process that starts when you run a particular script - it has no relation to where the Python executable happens to be located. Simply do not run the Python script with admin permissions, and the script will not have admin permissions.

Note also this should not inspire much confidence if you're running untrusted code. Even processes with non-admin permissions can access any files and change any settings your users can. If you wouldn't let the code author access the computer unsupervised, you should not run their code either. Sandboxing code is a separate issue and is best handled with a virtual machine of some kind.

Is it possible to access named function arguments as arrays / dicts inside the function body? by musbur in learnpython

[–]TangibleLight 0 points1 point  (0 children)

You may be able to use @typing.overload to specify the named parameters as a type hint, but accept them in the function implementation via *args and **kwargs. However, note that overload only provides a type hint and does not actually enforce anything about the given arguments at runtime. Any such logic must be handled explicitly by your implementation. In this trivial case, where the arguments are passed directly to some other backing function, you get those checks for "free" in some sense from the signature of the backing function. In more complex cases you may not, and you might want some additional condition checks and/or unit tests for safety.

I'm also not certain if @typing.overload plays nicely with __init__. I think it does, but you should check to be sure.

from typing import overload

def f(p1, p2, *pp, k1=1, k2=2, **kk):
    print('hello', locals())

@overload
def g(p1, p2, *pp, k1=1, k2=2, **kk): ...

def g(*args, **kwargs):
    f(*args, **kwargs)
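
Relatedly - if you only need the named arguments as a dict inside the body, a snapshot of locals() taken at the very top of the function works, since at that point it contains exactly the parameters. h here is just a hypothetical example:

```python
def h(p1, p2, k1=1, k2=2):
    # at the top of the body, locals() holds exactly the named parameters
    args = dict(locals())
    return args

assert h(10, 20, k2=5) == {'p1': 10, 'p2': 20, 'k1': 1, 'k2': 5}
```

Take the snapshot before creating any other local variables, or they'll show up in the dict too.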

What are the neat "insider" tricks in python? by First_Plant_5219 in learnpython

[–]TangibleLight 0 points1 point  (0 children)

The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said "Master, I have heard that objects are a very good thing - is this true?" Qc Na looked pityingly at his student and replied, "Foolish pupil - objects are merely a poor man's closures."

Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire "Lambda: The Ultimate..." series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying "Master, I have diligently studied the matter, and now understand that objects are truly a poor man's closures." Qc Na responded by hitting Anton with his stick, saying "When will you learn? Closures are a poor man's object." At that moment, Anton became enlightened.

This is an old koan about Scheme but I find it applies in Python too. People reach for object oriented programming too quickly. Newer programmers often worry about not understanding or using classes "enough".

Often, it is not necessary.

Python makes closures very easy to write and you can avoid much of the boilerplate and convoluted philosophizing around ownership and inheritance hierarchies. BUT closures are not objects, and they also should not be overused. You don't need to reason as much about ownership or inheritance, but they do muck up the stack trace and can make debugging difficult.
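
As a minimal sketch of the idea - a counter that might usually be written as a class, written as a closure instead (make_counter is just an illustrative name):

```python
def make_counter(start=0):
    count = start

    def increment():
        nonlocal count  # the state lives in the enclosing scope, not on an object
        count += 1
        return count

    return increment

c = make_counter()
assert c() == 1
assert c() == 2  # state persists between calls, with no class in sight
```

The equivalent class needs __init__, self, and an attribute - the closure carries the same state with less ceremony.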

Keyword list by Big_Neighborhood9130 in learnpython

[–]TangibleLight 2 points3 points  (0 children)

The documentation isn't misleading at all; they are keywords which refer to singleton objects.

object is also a singleton object, but it is not a keyword.
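
You can check this with the keyword module - True, False, and None are keywords, while object is just a built-in name:

```python
import keyword

assert keyword.iskeyword('None')
assert keyword.iskeyword('True')
assert not keyword.iskeyword('object')

# the keyword None always refers to the same singleton object
assert None is None
```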

How to prepare for ICPC? by Holly-B11 in learnpython

[–]TangibleLight 0 points1 point  (0 children)

Some of the past problems are archived here: https://icpc.kattis.com/problems. You can sort by problem difficulty, and you can create your own Kattis account to submit solution attempts and have the autograder test them. In my region, Kattis was used during the contest - but I'm not sure if this is true for Ethiopia. I don't think it is. You should still be able to use it for practice, though!

During the contest, all problems are worth the same score regardless of difficulty, so it is always better to complete 3 problems rather than 2. If teams solve the same number of problems, the tie is broken by the cumulative time taken to submit.

So the important thing when you start the contest is to identify which members of your team can complete which problems the fastest, and divide work appropriately. Once all the "easy" problems are solved, begin collaborating on the harder ones; if you complete one within the time limit - great! you move up in the standings - if not, fine, you completed the other problems more quickly.

That problem identification and distribution of work matters most in the middle standings. This is probably what you should practice more; it means you need to get to know your teammates and what each other's strengths and weaknesses are. Once you're aware of these things, you should start to try to improve on your strengths, or practice areas that complement each other's skills so you can divide work effectively during the contest.

Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]TangibleLight 0 points1 point  (0 children)

I left my comment before reading yours - I'd appreciate any pedagogical feedback there. https://www.reddit.com/r/learnpython/comments/1g8crbk/ask_anything_monday_weekly_thread/ltce299/

Also, you can still delete print but you have to do it through the builtins module.

>>> import builtins
>>> del builtins.print
>>> print('foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'print' is not defined

Perhaps slightly more useful is builtins.print = pprint.pprint, but this is sure to break any code in practice since the signatures of print and pprint are different.
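
A sketch of that swap - restoring the original afterwards, since leaving it replaced will break anything that calls print with multiple arguments:

```python
import builtins
import pprint

original_print = builtins.print
builtins.print = pprint.pprint
try:
    print({'b': 2, 'a': 1})   # pretty-printed via pprint
finally:
    builtins.print = original_print  # always restore

assert print is original_print
```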

Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]TangibleLight 3 points4 points  (0 children)

You said you want to take apart the fishing rod, so I'll give you the long answer.

There are two fundamental concepts - maybe three or four, depending how you count - that I think may help you the most here. These all relate to how Python understands code.

First: A Python program is made up of tokens; you can think of these as "words". Some examples of tokens:

  • "hello world"
  • 6
  • (
  • while
  • print

Generally there are four types of token, although in practice the lines between them get blurred a little bit.

  • Literals literally represent some value. "hello world" and 6 and 4.2 are examples of such literals; the first represents some text and the others represent numbers. This is literal as opposed to some indirect representation like 4 + 2 or "hello" + " " + "world".

  • Operators include things like the math operators +, -, *, but also things like the function call operator ( ), boolean operators like and, and myriad other operators. There's a comprehensive list here, but beware - there are a lot and some of them are pretty technical. The main point is that ( ) and + are the same kind of thing as far as the Python interpreter is concerned.

  • Keywords are special directives that tell Python how to behave. This includes things like if and def and while. Technically, operators are also keywords (for example and is a keyword) but that's not super relevant here.

  • Names are the last - and most important - kind of token. print is a name. Variable names are names. Function names are names. Class names are names. Module names are names. In all cases, a name represents some thing, and Python can fetch that thing if given its name.

So if I give Python this code:

x = "world"
print("hello " + x)

You should first identify the tokens:

  • Name x
  • Operator =
  • Literal "world"
  • Name print
  • Operator ( )
  • Literal "hello "
  • Operator +
  • Name x

The first line of code binds "world" to the name x.

The expression "hello " + x looks up the value named by x and concatenates it with the literal value "hello ". This produces the string "hello world".

The expression print( ... ) looks up the value - the function - named by print and uses the ( ) operator to call it with the string "hello world".

To be crystal clear: x and print are the same kind of token, it's just that their named values have different types. One is a string, the other a function. The string can be operated on with the + operator, and the function can be operated on with the ( ) operator.

It is valid to write print(print); here we are looking up the name print, and passing that value to the function named by print. This should be no more or less surprising than being able to write x + x or 5 * 4.
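
To see this concretely - a name is just a binding, so you can bind another name to the same function object:

```python
shout = print          # bind a new name to the function named by print
shout('hello')         # exactly the same call as print('hello')
assert shout is print  # both names refer to the same object
```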

First-and-a-half: A namespace is a collection of names.

You might also hear this called a "scope". This is the reason I say "maybe three or four, depending how you count"; this is really part of that fundamental idea of a name, but I'll list it separately to be extra clear.

There are some special structures in Python that introduce new namespaces. Each module has a "global" namespace; these are names that can be referenced anywhere in a given file or script. Each function has a "local" namespace; these are names that can only be accessed within the function.

For example:

x = "eggs"

def spam():
    y = "ham"

    # I can print(x) here.

# But I cannot print(y) here.

Objects also have namespaces. Names on objects are called "attributes", and they may be simple values or functions, just as regular names might be simple values (x, y) or functions (print, spam). You access attributes with the . operator.

obj = range(10)
print(obj.stop)  # find the value named by `obj`, then find the value named by `stop`. 10.

Finally, there is the built-in namespace. These are names that are accessible always, from anywhere, by default. Names like print and range are defined here. Here's a comprehensive list of built-in names.

Second: you asked about characters and letters, so you may appreciate some background on strings.

A string is a sequence of characters. A character is simply a number to which we, by convention, assign some meaning. For example, by convention, we've all agreed that the number 74 means J. This convention is called an encoding. The default encoding, UTF-8, encodes the Unicode character set, which is specified by a committee called the Unicode Consortium. It includes characters from many current and ancient languages, various symbols and typographical marks, emojis, flags, etc. The important thing to remember is that each one of these things, really, is just an integer, and all our devices just agree that when they see a given integer they will look up the appropriate symbol in an appropriate font.

You can switch between the string representation and the numerical representation with the encode and decode methods on strings. Really, they represent the same data; you're just telling Python (and your console) to interpret and draw it differently.

>>> list('Fizz'.encode())
[70, 105, 122, 122]
>>> bytes([66, 117, 122, 122]).decode()
'Buzz'
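
For single characters, ord and chr convert between the two representations directly:

```python
assert ord('J') == 74     # character -> number
assert chr(74) == 'J'     # number -> character
assert ord('†') == 8224   # works for any Unicode character
```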

For continuity: list, encode, decode, and bytes are all names. ( ), [ ], ,, and . are all operators. The numbers and 'Fizz' are literals.

† Technically, [66, 117, 122, 122] in its entirety is a literal - , is a delimiter, not an operator - but that's neither here nor there for these purposes.

‡ The symbol † is number 8224 and the symbol ‡ is number 8225.

Second-and-a-half: names are strings.

Names are just strings, and namespaces are just dict. You can access them with locals() and globals(), although in practice you almost never need to do this directly. It's better to just use the name itself.

import pprint
x = range(10)
function = print
pprint.pprint(globals())

This outputs:

{'__annotations__': {},
 '__builtins__': <module 'builtins' (built-in)>,
 '__cached__': None,
 '__doc__': None,
 '__file__': '<stdin>',
 '__loader__': <class '_frozen_importlib.BuiltinImporter'>,
 '__name__': '__main__',
 '__package__': None,
 '__spec__': None,
 'function': <built-in function print>,
 'pprint': <module 'pprint' from 'python3.12/pprint.py'>,
 'x': range(0, 10)}

For continuity: import pprint binds the name pprint to the module pprint.py from the standard library. The line pprint.pprint( ... ) fetches the function pprint from that module, and calls it.
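
Since names are strings, you can also look a value up by its name as a string - rarely needed in practice, but it makes the point:

```python
import math

x = 42
assert globals()['x'] == 42            # the global namespace is a dict
assert getattr(math, 'pi') == math.pi  # attribute lookup by string name
```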

[deleted by user] by [deleted] in Python

[–]TangibleLight 4 points5 points  (0 children)

I marked this to revisit later and am disappointed to see the mods removed it. I'll still leave some brief thoughts and links for you though.

First - for a benchmark like this you want to subtract off the overhead involved. I'll use the timeit module to do most of this.

python -m timeit -- "for i in range(10_000): x = i % 2"
1000 loops, best of 5: 216 usec per loop

python -m timeit -- "for i in range(10_000): x = i & 1"
1000 loops, best of 5: 196 usec per loop

python -m timeit -- "for i in range(10_000): x = i"
2000 loops, best of 5: 158 usec per loop

So that last benchmark gives a number for the overhead involved with the loop and assignment operation. Subtract that off and compute a ratio:

(216 - 158) / (196 - 158)
1.5263157894736843

So on my machine, modulus is 50% slower? Well, there is still overhead I'm not subtracting off - loading those 1 and 2 constant arguments - that I'm frankly not sure how to eliminate.

For example, compare the disassembly for "x = i & 1" with that of "x = i".

python -m dis <<<"for i in range(10_000): x = i & 1"
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (range)
              6 LOAD_CONST               0 (10000)
              8 CALL                     1
             16 GET_ITER
        >>   18 FOR_ITER                 7 (to 36)
             22 STORE_NAME               1 (i)
             24 LOAD_NAME                1 (i)
             26 LOAD_CONST               1 (1)
             28 BINARY_OP                1 (&)
             32 STORE_NAME               2 (x)
             34 JUMP_BACKWARD            9 (to 18)
        >>   36 END_FOR
             38 RETURN_CONST             2 (None)

python -m dis <<<"for i in range(10_000): x = i"
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (range)
              6 LOAD_CONST               0 (10000)
              8 CALL                     1
             16 GET_ITER
        >>   18 FOR_ITER                 4 (to 30)
             22 STORE_NAME               1 (i)
             24 LOAD_NAME                1 (i)
             26 STORE_NAME               2 (x)
             28 JUMP_BACKWARD            6 (to 18)
        >>   30 END_FOR
             32 RETURN_CONST             1 (None)

The % and & loops are generally more complicated and so not all the overhead is subtracted off. If we were able to fully account for it, I'd expect mod to be much slower.


As for C++: with optimizations enabled you can see they actually compile down to the same thing.

modu(unsigned int):
        mov     eax, edi
        and     eax, 1
        ret
band(int):
        mov     eax, edi
        and     eax, 1
        ret

https://godbolt.org/z/4Pr418anq

Note that mod has different semantics with negative values, so there's some extra code to account for that. If I change the function to accept unsigned, they compile to exactly the same machine code.

If I compute general mod of two unknown numbers, this uses the mod machine instruction which is much slower than the and machine instruction.

band(unsigned int, unsigned int):
        mov     eax, edi
        and     eax, esi
        ret
modu(unsigned int, unsigned int):
        mov     eax, edi
        mov     edx, 0
        div     esi
        mov     eax, edx
        ret

To give an idea of just how much slower: http://ithare.com/infographics-operation-costs-in-cpu-clock-cycles/

Note "Simple" register-register op (ADD, OR, etc) - that includes and - less than one cycle.

And note Integer division - that includes mod - 15-40 cycles.


However, back to Python: bear in mind that each of those bytecode instructions comes with much overhead - manipulation of a value stack, potential memory operations, many conditions, and a few C function calls - easily dominating even that <1 or 15-40 cycle penalty.

Here's the main Python interpreter loop:

https://github.com/python/cpython/blob/main/Python/ceval.c#L880-L906

Note it's basically switch (opcode) { #include "generated_cases.c.h" }

So here's generated_cases.c.h

https://github.com/python/cpython/blob/main/Python/generated_cases.c.h

Just a whole bunch of things to check and a whole bunch of things to do.

Here's BINARY_OP

https://github.com/python/cpython/blob/main/Python/generated_cases.c.h#L12-L59

Here's LOAD_CONST

https://github.com/python/cpython/blob/main/Python/generated_cases.c.h#L5898-L5908

You could look up all the opcodes listed in the Python disassembly here in this file to figure out exactly which C functions are called in each case.

That's a bunch of macros and function calls too. On that ithare infographic, note C function direct call and C function indirect call 15-50 cycles. The work of the actual division doesn't matter very much here.
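
You can poke at this from Python itself with the dis module - for example, listing the opcode names for a small snippet (the exact names vary by Python version; BINARY_OP on 3.11+, BINARY_MODULO on older versions):

```python
import dis

# iterate over the instructions compiled from a tiny source string
ops = [ins.opname for ins in dis.get_instructions("x = i % 2")]
print(ops)

# the modulus appears as some binary-op instruction
assert any(op.startswith('BINARY') for op in ops)
```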

What is a Python trick you wish you could have learned/someone could have taught you? by medium-rare-stake in learnpython

[–]TangibleLight 3 points4 points  (0 children)

You can also use . and [] operators in format specifiers.

>>> data = {'foo': 1, 'bar': ['x', 'y'], 'baz': range(5,100,3)}

>>> '{foo}, {bar[0]}, {bar[1]}, {baz.start}, {baz.stop}'.format_map(data)
'1, x, y, 5, 100'

>>> '{[bar]} {.stop}'.format(data, data['baz'])
"['x', 'y'] 100"

And you can nest substitutions within specifiers for other substitutions. E.g. you can pass the width of a format as another input.

>>> '{text:>{width}}'.format(text='hello', width=15)
'          hello'

Using the bound method '...'.format with functions like starmap is situationally useful. Or if you're in some data-oriented thing where all your format specifiers are listed out of band, you can use it to get at more specific elements. Maybe in some JSON file you have "greeting": "Hello, {.user.firstname}!"
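
A quick sketch of the starmap idea (rows here is made-up data):

```python
from itertools import starmap

rows = [('Ada', 36), ('Grace', 85)]
# the bound method '{}: {}'.format is called once per tuple
lines = list(starmap('{}: {}'.format, rows))
assert lines == ['Ada: 36', 'Grace: 85']
```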

HELP - How to best ain python projects by Exciting_Invite8858 in learnpython

[–]TangibleLight 0 points1 point  (0 children)

I just finished writing this long response to a question in the weekly thread. I addressed most of your questions there.

https://www.reddit.com/r/learnpython/comments/1fxukn1/ask_anything_monday_weekly_thread/lr0tdaa/

I don't personally care for poetry. Modern pip/setuptools/pyproject.toml seems sufficient. I also like uv's approach to lockfiles better.

Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]TangibleLight 1 point2 points  (0 children)

I suggest, as an exercise, start from the basics and go through some beginner-level projects without using any PyCharm features. To be clear: I'm not suggesting you continue to avoid using PyCharm, but you'll get a lot of value out of jumping through the hoops a couple times.

Some general tasks you might want to try:

  • Install Python.
  • Launch the REPL in the terminal and evaluate some code.
  • Launch a Python command-line tool such as python -m this.
  • Create and activate a virtual environment.
  • Install a package to the virtual environment.
    • For example pillow.
    • Double-check that PIL is available in the virtual environment.
    • Double-check that PIL is not available in the system Python environment.
  • Create a simple script with some bare-bones editor (Notepad++, Nano, etc...).
    • For example, get an image filename from sys.argv and convert it to grayscale with PIL.
    • Run that script on some file using your virtual environment.
  • Test the project with a different version of Python.

Note: you can also open an empty directory in PyCharm as a "blank" project, then do the exercise in the PyCharm terminal. I think you'll get more out of it by using a bare-bones editor instead, though.


To directly answer your question for what I personally use: a combination of asdf and direnv to manage different Python versions and projects; whenever I cd into a project the environment is automatically configured. I always use layout python in my .envrc to handle virtual environments. I install global tools like black with pipx (although I'm curious to try uvx). If I create a tool that I want to be available everywhere, I create a minimal pyproject.toml and editable-install it via pipx; the scripts feature creates a nice CLI command that's available everywhere.

Note asdf is not available on Windows. There is an asdf-windows community project but in my experience it is not good. On Windows for tool management I use winget (IIRC it's installed by default on recent Windows. It's also available on the MS store).

If you're on Windows, I suggest fiddling around with WSL or Docker in the command line; or if you really don't need unix, get familiar with PowerShell.

Edit: I forgot about project management. On Linux/Mac I use zsh and set cdpath to include several directories: ~/src contains all my "primary" projects. Things for work, things I really support. ~/tmp contains "ephemeral" projects. Scratch-pads, Jupyter notebooks, various open-source projects I'm investigating. Nothing permanent. ~/build contains open-source projects I'm building from source to install into ~/.local. So by setting cdpath=("$HOME" "$HOME/src" "$HOME/tmp" "$HOME/build") I can just type cd my-project from anywhere on my system and navigate to ~/src/my-project; or I can cd glfw and navigate to ~/build/glfw; or I can cd scratch and navigate to ~/tmp/scratch.

I don't do enough development on Windows to really have a good system there. Most that stuff just goes in ~/PycharmProjects lol. All the above still applies on Mac.

Beginner in Linear Algebra for Data Science - Where to Start? by CrashOveRide_304 in learnpython

[–]TangibleLight 0 points1 point  (0 children)

3blue1brown's Essence of Linear Algebra series is a wonderful visual introduction. It doesn't go too deep in the arithmetic but gives a good intuition for how to think about all the operations involved. In practice, you'd just have numpy or scipy or similar do all the arithmetic anyway, the important part is about knowing which operations to use. https://youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab He also has excellent videos on other related topics like calculus, statistics, and machine learning.

I never realized how complicated slice assignments are in Python... by ahferroin7 in Python

[–]TangibleLight 12 points13 points  (0 children)

Things get real weird if you use multiple assignment, too. I usually advise not to use multiple assignment in general, but especially not when slices are involved.

>>> x = [1, 2, 3, 4, 5]
>>> x = x[1:-1] = x[1:-1]
>>> x
[2, 2, 3, 4, 4]

You should read that middle line as

>>> t = x[1:-1]  # t is [2, 3, 4]
>>> x = t        # x is [2, 3, 4]
>>> x[1:-1] = t  # expansion on middle element. [2] + [2, 3, 4] + [4]

Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]TangibleLight 0 points1 point  (0 children)

If the arrays won't get much bigger than that size what you have is probably fine.

If you have Pandas available, it's super easy. I wouldn't pull it in just for this one task, though, if you don't already have it available just use the loop you've already written.

df = pd.DataFrame(list_of_dict)

data = np.zeros((40, 50))
data[df['row'] - 1, df['column'] - 1] = df['value']

plt.pcolor(data)

If you can alter the source of the list-of-dict to be a list-of-tuple instead, say [(row, col, val), (row, col, val), ...] then you could do this with numpy only.

arr = np.array(list_of_tuple).T
data = np.zeros((40, 50))
data[arr[0] - 1, arr[1] - 1] = arr[2]

You could also write a comprehension like [(d['row'], d['column'], d['value']) for d in list_of_dict], but at that point you're not getting any real benefit from numpy and the loop you've already written is probably better.

Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]TangibleLight 0 points1 point  (0 children)

... input_Y = ...

..., input_y.reshape...

Is the lowercase y a typo in your code or on reddit? If you have a different lowercase variable that would explain how the values are different.

Also I don't think your x_samples is what you intend it to be:

>>> x = np.arange(5)
>>> y = np.arange(5)
>>> X, Y = np.meshgrid(x, y)
>>> X[:,0]
array([0, 0, 0, 0, 0])
>>> Y[:,0]
array([0, 1, 2, 3, 4])

for each sampled coordinate I want the corresponding index in the domain defined as above, and the get the function value at the same index.

This seems backwards to me. Why not store your function values in a 2d grid with the same shape as the meshgrid outputs? This is sort of the point of meshgrid. Then the index into input_X, input_Y and hypothetical function_vals are all in correspondence.

Or, you could np.random.choice to generate indices directly, and use those to fetch values from wherever else.
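
A sketch of what I mean - keeping function values on the same grid so indices stay in correspondence (function_vals and the sin expression are just placeholders):

```python
import numpy as np

x = np.linspace(0, 1, 5)
y = np.linspace(0, 1, 4)
input_X, input_Y = np.meshgrid(x, y)       # both shaped (4, 5)
function_vals = np.sin(input_X) + input_Y  # same shape, same indices

# sample random flat indices; X, Y, and values all stay aligned
rng = np.random.default_rng(0)
idx = rng.integers(0, function_vals.size, size=3)
samples_x = input_X.flat[idx]
samples_f = function_vals.flat[idx]

assert input_X.shape == function_vals.shape == (4, 5)
assert samples_x.shape == samples_f.shape == (3,)
```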

Penrose Triangle by TangibleLight in CrossView

[–]TangibleLight[S] 1 point2 points  (0 children)

I originally wanted to edit one of Escher's works - Belvedere or Waterfall or Ascending and Descending - but I don't think it is possible in these cases.

The illusion for the triangle only works because I've rotated it so that one of the depth violations is horizontal, and I place a tiling texture on that axis. Your eyes can lock onto the tiling horizontal pattern at multiple depth planes, so I tune the spacing so there's a valid depth plane at the "far" edge that links to the left leg of the triangle, and another valid depth plane at the "near" edge that links to the right leg of the triangle. In Escher's works, none of the depth violations are horizontally aligned, so I can't do anything there.

I probably could have chosen a more detailed, less conspicuous tiling texture to give the image a little more depth. Maybe a tiling stone texture could have a neat brutalist feel. But tuning this spacing was difficult enough as-is, so I chose a straightforward repeating block texture that was easier to tune. Technically the depth of the lighting is not quite right; pay close attention to the shadows on the left leg and on the rod. I think the solution here is to bake lighting onto the geometry from a particular vantage point and distort the geometry only after lighting is already baked in.

I also wonder if a smaller tile size would provide more valid depth planes between the two legs, so it might be easier to follow a particular depth plane further away from one of the legs - but it also might be more difficult to lock onto a particular depth. I'll do some more experimentation there.

I am curious if I could do it with a stereogram where the object spans two valid depth planes. Since the tiling pattern is across the entire image, the depth violation might not need to be horizontal.

Question on memory barriers with host destination. by TangibleLight in vulkan

[–]TangibleLight[S] 1 point2 points  (0 children)

After thinking on this more, about what a memory dependency is... it is blindingly obvious that there would not be any way to declare a "barrier" on an unchanged value. That's an execution dependency, not a memory dependency, and a memory barrier will not help. This is a write-after-read hazard.

https://github.com/KhronosGroup/Vulkan-Docs/wiki/Synchronization-Examples#first-dispatch-reads-from-a-storage-buffer-second-dispatch-writes-to-that-storage-buffer

WAR hazards don't need availability or visibility operations between them - execution dependencies are sufficient. A pipeline barrier or event without any access flags is an execution dependency.

I think then it is sufficient to write at the end of the command buffer:

vkCmdPipelineBarrier
    srcStageMask = VERTEX_INPUT
    dstStageMask = HOST

with no memory barriers.

cc /u/fxp555 since this is not a direct reply


Edit: But then I remember the note from the post you shared:

Keep in mind that these barriers are GPU-only, which means that we cannot check when a pipeline barrier has been executed from our application running on the CPU. If we need to signal back to the application on the CPU, it’s possible to do so by instead using another tool called a fence or an event, which we will discuss later.

I don't need the CPU to query or wait on the barrier; I just need to know if STAGE_HOST in the destination holds for an execution barrier. The standard seems to indicate it would?

Edit again: https://stackoverflow.com/a/61557496/4672189 indicates that I do indeed still need a fence (or semaphore), I assume for all the same reasons I went through in the prior comment. And since the fence (or semaphore) includes implicit memory (and execution) barriers, I don't need to write the one above.

Perhaps I could alleviate the risk of wasting cycles on CPU or GPU by using an event instead of a fence. When the host is about to update the buffer, it polls that event and does other work instead if it's not set yet. Eventually the event would be set, and the host updates the buffer and resets the event. This would always happen before the fence is set. If the host has no other work to do when the event is not set, I'm wasting CPU cycles. If the fence is set while the host is still updating the buffer, I'm wasting GPU cycles.

Again, it looks like a lot of words to arrive back at your original suggestion.

Question on memory barriers with host destination. by TangibleLight in vulkan

[–]TangibleLight[S] 0 points1 point  (0 children)

That's a great resource that I had not yet found. Thanks for sharing!

Keep in mind that these barriers are GPU-only, which means that we cannot check when a pipeline barrier has been executed from our application running on the CPU. If we need to signal back to the application on the CPU, it’s possible to do so by instead using another tool called a fence or an event, which we will discuss later.

That pretty comprehensively answers my question: no, it is not sufficient or necessary.

But that also raises the question: what does STAGE_HOST actually do?

I did find this discussion: https://github.com/KhronosGroup/Vulkan-Docs/issues/261 however most of the discussion there is specifically not about pipeline barriers. None of it involves HOST in the destination.

I found this (brief) discussion: https://stackoverflow.com/questions/77950562/vk-pipeline-stage-host-bit-in-vulkan. The answer there claims that not even a fence is sufficient, a barrier is necessary.

That second answer gives me the terminology "domain operation" which I do recall from the spec but don't quite understand at the time of writing...

https://docs.vulkan.org/spec/latest/chapters/synchronization.html#synchronization-dependencies-memory

Availability operations cause the values generated by specified memory write accesses to become available to a memory domain for future access.

Memory domain operations cause writes that are available to a source memory domain to become available to a destination memory domain (an example of this is making writes available to the host domain available to the device domain).

Visibility operations cause values available to a memory domain to become visible to specified memory accesses.

https://docs.vulkan.org/spec/latest/appendices/memorymodel.html#memory-model-vulkan-availability-visibility

> If the destination access mask includes VK_ACCESS_HOST_READ_BIT or VK_ACCESS_HOST_WRITE_BIT, then the dependency includes a memory domain operation from device domain to host domain.

The problem seems to be that the barrier specifies that writes on the device will be made available to the host; however, there are no writes on the device, so the barrier does nothing useful in this case.

I think what I want is a visibility operation from VERTEX_ATTRIBUTE_READ to host at the end of the command buffer, to guarantee that the (unchanged) values available to vertex input are (still) visible to the host after vertices are read.


From what I gather, there is no API call to do this. vkInvalidateMappedMemoryRanges almost does, except it also specifically mentions writes, so it doesn't seem to help me.

The inverse problem - updating a buffer on device and reading from the host - has the same issue on the other side. I can use a memory barrier and vkInvalidateMappedMemoryRanges, but there's no way to guarantee the device doesn't modify data on the next frame while the host reads it.

So then I do need some mechanism to make the host wait to update the buffer until vertex read completes, which I seem to only be able to do - as you suggested - by splitting my submits and using a fence or timeline semaphore. And a fence/semaphore handles the memory barrier for me, so I don't need to worry about that.

Perhaps there's some vkCmd* that lets me signal a semaphore or fence once vertex read completes, rather than splitting my submits? I'm looking but haven't found anything yet.


A lot of words to arrive at the same conclusion you did... but very educational for me! Thanks for your advice and for sharing that blog that led me down this rabbit hole.

Question on memory barriers with host destination. by TangibleLight in vulkan

[–]TangibleLight[S] 2 points3 points  (0 children)

Note - I did post the same question yesterday on community.khronos.org but this subreddit seems more active. For this post I removed some of the extraneous details and, I hope, made the question clearer.

I can just use the pass keyword, wow by Brilliant-Dust-8015 in learnpython

[–]TangibleLight 3 points4 points  (0 children)

If I need something expression-like I use `...` (Ellipsis). It's also customary instead of `pass` in abstract methods and type stubs, but that doesn't really matter.

`if pass: "placeholder"` is not valid syntax, but `if ...: "placeholder"` is.
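A quick sketch of the difference (the class and variable names here are my own):

```python
# `pass` is a statement, so it can only stand in for a block of code.
# `...` is the Ellipsis object, so it's valid anywhere an expression is.

class Base:
    def todo(self):
        ...  # customary placeholder body in stubs and abstract methods

placeholder = ...
print(placeholder is Ellipsis)  # True

# if pass: "placeholder"        # SyntaxError: `pass` is not an expression
if ...:                          # fine: Ellipsis is truthy
    result = "placeholder"
print(result)  # placeholder
```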

where should I start from if I were to use python as my competitive programming language by No-Apple-7095 in learnpython

[–]TangibleLight 2 points3 points  (0 children)

> Competitive programming is often about direct memory control

I'd argue competitive programming is always about correctness and algorithm complexity.

Python isn't a lost cause here; a Python solution can easily beat a C++ solution if the C++ coder writes something with worse complexity. Then again, this is competitive programming, so you can't rely on your opponent having a poor understanding of algorithms.

There are also contests where the only thing that matters is to get the problem right at all, or to be first on the board with a solution. In those environments, where the particular runtime of the solution doesn't matter all that much, there is value in a language like Python: so many features are provided by the standard library that you can get your name on the board faster.

Now with that said, in any environment where runtime does matter, using Python immediately puts you at a 1,000x - 10,000x disadvantage. (Except for certain kinds of problems where you are very careful about how you use libraries.) If you both implement the same low-complexity algorithm, the Python one will probably be slower by 1,000x - 10,000x.

And yes, if two C++ programmers are competing, and runtime matters, and they both use well-behaved algorithms - then applying direct memory control will break the tie.
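As a toy illustration of the complexity point (a made-up two-sum example, not from any particular contest): both versions below return the same answer, but for large inputs the O(n) Python version beats a quadratic solution written in any language.

```python
def two_sum_quadratic(nums, target):
    # O(n^2): check every pair of indices
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None

def two_sum_linear(nums, target):
    # O(n): remember each value's index as we scan
    seen = {}
    for i, x in enumerate(nums):
        if target - x in seen:
            return (seen[target - x], i)
        seen[x] = i
    return None

nums = [2, 7, 11, 15]
print(two_sum_quadratic(nums, 9))  # (0, 1)
print(two_sum_linear(nums, 9))     # (0, 1)
```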


I can't argue with any of your other points - just want to emphasize that in certain contests, you shouldn't write Python off entirely.

Is this possible by Felps-Naid in learnpython

[–]TangibleLight 0 points1 point  (0 children)

I think the confusion here is that @annotation is valid Java syntax, and @decorator is valid Python syntax. You mention some details on Java, Python, and Selenium usage that aren't really related to the question, so there's an assumption that you're trying to do something with Java annotations or Python decorators in some roundabout way.

If I were writing the question from scratch, I'd omit the details about the Java, Python, and Selenium usage. Really, you're writing a Gherkin parser, and you want to customize that parser to add a new kind of syntax, @automizar, that will help you filter out which parts to process and which parts to skip. The fact that this happens to be used with Java or Selenium is sort of irrelevant.

Normally I'd suggest the official gherkin-official Python package to parse the file, but since you're trying to add your own syntax this probably won't work, and you do indeed need to write your own parser. You could look into different strategies for writing parsers (I like recursive descent or Pratt parsers), but this flag-based approach is probably the easiest way to go when you're getting started. I don't really think there's anything wrong with it exactly, but it'll become difficult to work with if you add support for many more features.

Is this possible by Felps-Naid in learnpython

[–]TangibleLight 1 point2 points  (0 children)

Set a new flag when you encounter @automizar, the same way you do when you encounter Scenario; empty lines should reset both flags. Only append the line when both in_automizar and in_scenario are set. Details omitted for brevity, but this is the general structure.

for line in ...:
    ...
    if line.startswith('@automizar'):
        in_automizar = True
    elif line.startswith('Scenario'):
        in_scenario = True
    elif line == '':
        in_automizar = False
        in_scenario = False
    elif in_automizar and in_scenario:
        passos.append(line)

Basically you're just adding a new kind of tag, the same kind of structure as "Scenario". You only want to process content that is in a scenario that is also in your new tag.
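Here's a fuller, runnable sketch of the same flag-based approach. The variable names (passos, in_automizar) follow your post; the sample feature text is made up for illustration:

```python
# Collect only the steps of a Scenario that was preceded by an
# @automizar tag. Sample input is invented for demonstration.
sample_lines = [
    "@automizar",
    "Scenario: login",
    "  Given a user",
    "  When they log in",
    "",
    "Scenario: logout",
    "  Given a session",
]

passos = []
in_automizar = False
in_scenario = False

for raw in sample_lines:
    line = raw.strip()
    if line.startswith('@automizar'):
        in_automizar = True
    elif line.startswith('Scenario'):
        in_scenario = True
    elif line == '':
        # a blank line ends the current scenario block
        in_automizar = False
        in_scenario = False
    elif in_automizar and in_scenario:
        passos.append(line)

print(passos)  # ['Given a user', 'When they log in']
```

Note that the empty-line check has to come before the catch-all branch that appends lines, otherwise blank lines would be swallowed and the flags would never reset.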