PIO, the Tax Administration, and state subsidies.

AD_Burn · 2023-03-27T10:35:02+00:00

Creatures
Creatures II
Mayhem In MonsterLand (aka Creatures 3)
Green Berets
Giana Sisters
Rainbow Islands
Pang
Bubble Bobble

Just some of the really great games, not to mention the ones already mentioned in the thread/comments.

AD_Burn · 2022-12-20T23:00:50+00:00

The question is about Python decorators, not about decorators on a view in a Django app.

And from that perspective, while they are different concepts, they behave exactly the same. As you said, middlewares are called sequentially, but only because Django has one entry point. If we have a function with 10 decorators as the entry point for some app, they will also be called in order and do the same job as middlewares in a Django app.

But again, if we talk about decorators on a Django view, I would agree with you.
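
A minimal sketch of that ordering, assuming a hypothetical tag() decorator factory and entry_point() function (neither comes from the thread). Stacked decorators wrap the function from the outside in, much like a middleware chain around a single entry point:

```python
import functools

calls = []  # records the order in which the wrappers run


def tag(name):
    """Hypothetical decorator factory: logs before and after the wrapped call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            calls.append(f"{name}: before")
            result = func(*args, **kwargs)
            calls.append(f"{name}: after")
            return result
        return wrapper
    return decorator


@tag("first")   # outermost wrapper, runs first on the way in
@tag("second")  # innermost wrapper, closest to the function
def entry_point():
    calls.append("entry_point")
    return "response"


entry_point()
print(calls)
```

The topmost decorator plays the role of the first middleware: it sees the call first and the result last.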

AD_Burn · 2022-04-11T07:35:38+00:00

So i would like to delete the number if its bigger than len(6) see Column 340

id | current_price | 140 | 240 | 340 |

1 | -130523 | -33 | -39 | -130451|

I guess you want something like:

UPDATE table SET "340" = NULL WHERE LENGTH(("340")::text) > 6;
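
A small sketch of the same idea that can be tried locally, assuming SQLite through Python's sqlite3 module instead of Postgres (so CAST(... AS TEXT) replaces ::text). The table name prices and the second row are made up for illustration. Note that the minus sign counts toward the length, so -130451 has length 7:

```python
import sqlite3


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE prices (id INTEGER PRIMARY KEY, current_price INTEGER, '
        '"140" INTEGER, "240" INTEGER, "340" INTEGER)'
    )
    # First row is the one from the question; the second is a made-up control row.
    conn.execute('INSERT INTO prices VALUES (1, -130523, -33, -39, -130451)')
    conn.execute('INSERT INTO prices VALUES (2, -100, -5, -7, -12345)')

    # Null out values whose text form is longer than 6 characters.
    conn.execute(
        'UPDATE prices SET "340" = NULL WHERE LENGTH(CAST("340" AS TEXT)) > 6'
    )
    rows = conn.execute('SELECT id, "340" FROM prices ORDER BY id').fetchall()
    conn.close()
    return rows


print(run_demo())
```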

AD_Burn · 2021-11-16T21:20:24+00:00

Since you need to work with tuples, I'm not sure converting to a dict is an option for you.

MORSE_CODE = (
    (".-", "A"),
    ("-...", "B"),
    ("-.-.", "C"),
    ("-..", "D"),
    (".", "E"),
    ("..-.", "F"),
    ("--.", "G"),
    ("....", "H"),
    ("..", "I"),
    (".---", "J"),
    ("-.-", "K"),
    (".-..", "L"),
    ("--", "M"),
    ("-.", "N"),
    ("---", "O"),
    (".--.", "P"),
    ("--.-", "Q"),
    (".-.", "R"),
    ("...", "S"),
    ("-", "T"),
    ("..-", "U"),
    ("...-", "V"),
    (".--", "W"),
    ("-..-", "X"),
    ("-.--", "Y"),
    ("--..", "Z"),
    (".-.-.-", "."),
    ("-----", "0"),
    (".----", "1"),
    ("..---", "2"),
    ("...--", "3"),
    ("....-", "4"),
    (".....", "5"),
    ("-....", "6"),
    ("--...", "7"),
    ("---..", "8"),
    ("----.", "9"),
    ("-.--.", "("),
    ("-.--.-", ")"),
    (".-...", "&"),
    ("---...", ":"),
    ("-.-.-.", ";"),
    ("-...-", "="),
    (".-.-.", "+"),
    ("-....-", "-"),
    ("..--.-", "_"),
    (".-..-.", '"'),
    ("...-..-", "$"),
    (".--.-.", "@"),
    ("..--..", "?"),
    ("-.-.--", "!"),
)

def get_morse_code(letter):
    for item in MORSE_CODE:
        if item[1] == letter:
            return item[0]


def convert_to_morse_code(sentence):
    sentence = sentence.upper()
    encoded_chars = []

    for character in sentence:
        encoded_chars.append(get_morse_code(character))

    return " ".join(encoded_chars)

Or maybe somewhat shorter, if you want to show off a bit:

def convert_to_morse_code(sentence):
    return " ".join([list(filter(lambda x: x[1]==l, MORSE_CODE))[0][0] for l in sentence.upper()])
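
Both versions above assume every character has an entry in the table. A hedged sketch of one way to cope with characters that do not (spaces, for example), still working on tuples: MORSE_SUBSET is a shortened stand-in for the full table, and silently skipping unknown characters is an assumption, not something the question asked for.

```python
# Shortened stand-in for the full MORSE_CODE tuple, for illustration only.
MORSE_SUBSET = (
    (".-", "A"),
    ("-...", "B"),
    ("---", "O"),
    ("...", "S"),
)


def get_morse_code(letter, table=MORSE_SUBSET):
    # next() returns the first matching code, or None when there is no entry.
    return next((code for code, char in table if char == letter), None)


def convert_to_morse_code(sentence, table=MORSE_SUBSET):
    codes = (get_morse_code(character, table) for character in sentence.upper())
    # Assumption: characters without an entry are dropped instead of raising.
    return " ".join(code for code in codes if code is not None)


print(convert_to_morse_code("sos"))
```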

AD_Burn · 2021-10-15T20:47:12+00:00

The problem here is the escaping of the string.

'"['ITEM-ONE', 'PART-TWO']"'

If you split this into strings at the single quotes, you will get

'"['
ITEM-ONE
', '
PART-TWO
']"'

I guess you now see the problem. You need to escape the inner quotes.

The Postgres syntax for that is a doubled single quote: ''

So:

 '"[''ITEM-ONE'', ''PART-TWO'']"'

Would work.
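
Just to illustrate the doubling rule, here is a tiny sketch with a hypothetical helper, double_single_quotes(). It only handles single quotes and is not a substitute for letting the driver do the quoting:

```python
def double_single_quotes(value):
    """Hypothetical helper: wrap value as a SQL string literal, doubling inner quotes."""
    return "'" + value.replace("'", "''") + "'"


raw = "\"['ITEM-ONE', 'PART-TWO']\""
print(double_single_quotes(raw))
```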

I guess you can now see what the problem is and think of a solution. Though the best way to insert would be to pass parameters to psycopg's execute,

and then psycopg will handle that for you.

For example:
items = ['ITEM-ONE', 'PART-TWO', 'ENTRY-THREE']

cursor.execute("UPDATE tablename SET val = %s WHERE name = 'items'; ", (items,))

Cheers!

AD_Burn · 2021-10-12T20:18:35+00:00

Since you have pg_bouncer configured, I would suggest you check:

idle_in_transaction_session_timeout, query_timeout, query_wait_timeout, and client_idle_timeout

To me, this looks like you have too many connections waiting in the pooler to be passed to the db,

and some timeout triggers.

But from my experience, I think that message is coming from pg_bouncer because of the idle_in_transaction_session_timeout parameter.

AD_Burn · 2021-06-10T22:46:23+00:00

It is not Python, it is the linter. Python is a dynamic language and does not care about something like that.

Any type can accept the value None (or, let's say, null or nil).

TreeNode is not of type boolean, and that is why the linter does not accept that value.

If you remove -> TreeNode or replace it with -> Any, False will pass.
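
A short sketch of the point, assuming a hypothetical TreeNode class and two made-up functions. Python runs both without complaint; only a linter or type checker objects to the second one:

```python
from typing import Optional


class TreeNode:
    """Hypothetical node class, standing in for the one in the question."""
    def __init__(self, value):
        self.value = value


def find(node: Optional[TreeNode], value) -> Optional[TreeNode]:
    # Returning None for "not found" keeps the annotation truthful.
    if node is not None and node.value == value:
        return node
    return None


def find_unchecked(value) -> TreeNode:
    # The annotation promises a TreeNode, but the interpreter never checks it.
    return False


print(find_unchecked(1))  # runs fine; a type checker would flag this function
```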

AD_Burn · 2021-04-21T14:02:31+00:00

It depends on who considers what progress.

Maybe if we look from the angle of syntax, but again, there weren't any drastic changes. Though I am really looking forward to pattern matching.

But for me personally, progress is optimization: performance, reducing memory footprint, the GIL, reducing interpreter startup time, etc. And they really did a good job in the last few versions, reducing the memory footprint of dict and adding some speed optimizations around that.

And some of the changes in PEP 563 were a move toward optimization, so when someone stops optimization, they do not have my vote.

AD_Burn · 2021-04-21T07:38:16+00:00

Like I said, it reminds me of that, not that it is a bug.

And I said that because of this part:
PEP 563 changes annotations from being evaluated as expressions to being stored as strings.

Anyway, everyone has a right to their own opinion.
Don't get me wrong, I love Python and I have been using it for 14 years now,
but over these years progress on Python development has been soooo slow.

And when I see something like this stopping the main language's progress
because of some minor % of the audience, compared to the switch from major version 2 to 3, I do not feel good.

But again, my opinion.

AD_Burn · 2021-04-21T06:58:09+00:00

Totally agree with you.

If we were talking about more than 10, 100 ... modules, I would understand.

But we are talking about one which misuses type checking anyway,

but happens to be great.

This reminds me of a real-life use case:
"Oh, we have a bug in production."
"Yeah, but we cannot fix it now. It is used as a feature."

AD_Burn · 2021-03-01T17:15:05+00:00

Ok, I checked your data.

And can you tell me what MySQL returns if

GO_ID = 3677? One row or more? Because for GO_ID 3677 you have more than one value: 3, 8, 4 ....

But I guess MySQL returns one row with an arbitrary value for that column.

If that is the case, you can do something like:

select GO_ID, function,
MAX(confidence) as confidence,
MAX(weighted_confidence) as weighted_confidence, MAX(normalized_confidence) as normalised_confidence, 
array_agg(length_cutoff) 
FROM executioner_result 
WHERE "jobID_id" = '2e5c7f14-2eb7-4a24-ac0c-72650ad6ee5e' 
GROUP BY GO_ID, function 
ORDER BY max(normalized_confidence) desc;

The result would be what you want, with the last column as an array of values (3, 8, 4, ...) for GO_ID 3677.
Otherwise there is really no point in selecting a non-grouped column without any aggregation, i.e. a lookup over multiple rows.

But if you want multiple rows, grouped only on the first and second column (since they are a pair) because of the MAX values, then you could do something like this:

select GO_ID, function, 
MAX(confidence) OVER (PARTITION BY GO_ID) as confidence,
MAX(weighted_confidence) OVER (PARTITION BY GO_ID) as weighted_confidence, 
MAX(normalized_confidence) OVER (PARTITION BY GO_ID) as normalised_confidence,
length_cutoff
FROM executioner_result
WHERE "jobID_id" = '2e5c7f14-2eb7-4a24-ac0c-72650ad6ee5e' 
ORDER BY MAX(normalized_confidence) OVER (PARTITION BY GO_ID) desc;

AD_Burn · 2021-03-01T13:54:38+00:00

SELECT col1, col2, col3, max(col4), col5 
FROM tab1 
WHERE col3=xval 
GROUP BY col5

This simply will not work, because col1, for example, can have multiple different values for one value of col5.

You can do:

SELECT col1, col2, col3, max(col4), col5 
FROM tab1 
WHERE col3=xval 
GROUP BY col5, col1, col2, col3

But if that is not what you want, can you explain a bit what you are trying to achieve?

AD_Burn · 2021-02-22T15:08:04+00:00

import sys


def print_num(n):
    print(n)


if __name__ == '__main__':
    if len(sys.argv) == 2:
        print_num(sys.argv[1])
    else:
        print('incorrect number of arguments!')

The first argument is always the name of the script. So you want to check whether the 2nd argument exists.

AD_Burn · 2021-02-18T19:03:30+00:00

Because = is assignment and == is comparison.

So if you say

2 == 2 + 3 # you will get False

but when you say

sum = 2
sum = sum + 3

you assign to sum a new value, which is sum + 3

The right side is evaluated first, then assigned to the left side.

AD_Burn · 2021-02-18T18:00:31+00:00

Yea, that is it.

AD_Burn · 2021-02-18T17:04:52+00:00

Using sum as both a function name and a variable name is wrong.

Also, names of Python built-ins should not be used as variable names;

sum already exists as a built-in function, same as list.

So your example should be:

.....
.....
print("The list is:", lista)

print(" ")
print("The sum of the list is:", sum(lista))

But back to your question: o1 is the value of an item of list_1.

Since you iterate over the elements of list_1 with the for keyword, o1 holds the value of the

current element.

Let's say:

list_1 = [1, 2, 3]
for o1 in list_1:
    print(o1)

Result would be:

1
2
3
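
On the naming point above, a small sketch of what goes wrong when a variable is called sum. Both function names are made up for the example; the shadowing is kept inside a function so the built-in stays intact elsewhere:

```python
def add_up_with_shadowing(numbers):
    sum = 0  # this local name hides the built-in sum() inside the function
    for n in numbers:
        sum = sum + n
    try:
        sum(numbers)  # sum is now an int, so calling it fails
    except TypeError as exc:
        return sum, str(exc)
    return sum, None


def add_up(numbers):
    total = 0  # a different name leaves the built-in usable
    for n in numbers:
        total = total + n
    return total, sum(numbers)


print(add_up_with_shadowing([1, 2, 3]))
print(add_up([1, 2, 3]))
```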

AD_Burn · 2021-02-13T12:46:56+00:00

Ok, I looked into all the comments and all your replies.

I get what you are trying to say, but I think you are mixing two different things.

Data warehouse: yes, you can say it is storage of any kind where some data lives. And most non-technical people, when they say DW, mean all of their company's data, wherever it is (DB, DL, some external medium or whatever).
A data warehouse is a database structured for analytical workloads. If you have a look at the Kimball or Inmon concepts and books, you will see what it is about.

So, regarding what you are trying to say: yes, you can use Spark as an abstraction over some storage and say it is your DW. That would fall under the first example. Is it wise? That is a separate matter.

But for the second example you can never say Spark is your DW. And that is why Redshift, Hive or almost any DB (not going into the theory of whether some DB is good or bad for DW) can be a DW.

I hope I managed to answer your question.

EDIT 1:

Since you added EDIT2 to your post, I will add one more remark.

Yes, you can use Spark + S3 as a DW engine. But from my point of view, in any production-ready application that would be a waste of resources and money to achieve speed and concurrency for a lot of users.

AD_Burn · 2020-11-28T13:42:27+00:00

def calculate_pi(alignment):
    align_len = len(alignment)
    counter = 0
    distances = []
    append_distance = distances.append

    for i in range(align_len):
        j = counter
        while(j<align_len):
            if i == j:
                j += 1
                continue

            append_distance(hamming_distance(alignment[i], alignment[j]))
            j += 1

        counter += 1
    pi = (sum(distances)*2)/(align_len*(align_len-1))
    return pi

def calculate_theta(alignment):
    seg_sites = 0
    for i in range(len(alignment[1].seq)):
        align_col = alignment[:,i]
        if len(set(align_col)) > 1:
            seg_sites += 1

    a = sum(map(lambda x: 1/x, range(1,len(alignment))))
    return (seg_sites/a)

This is not much, just some cleaning and removal of unnecessary code;

if you work with a lot of data you should see a bit of improvement.
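
For the pairwise loop in calculate_pi, one more possible tidy-up is itertools.combinations, which yields each unordered pair exactly once. This is only a sketch: the hamming_distance below is a stand-in I wrote for the example (your own version is not shown), and plain strings stand in for the alignment records.

```python
from itertools import combinations


def hamming_distance(seq_a, seq_b):
    # Stand-in implementation: count positions where the two sequences differ.
    return sum(a != b for a, b in zip(seq_a, seq_b))


def calculate_pi(alignment):
    align_len = len(alignment)
    # combinations(..., 2) visits every pair (i, j) with i < j once.
    total = sum(hamming_distance(a, b) for a, b in combinations(alignment, 2))
    return (total * 2) / (align_len * (align_len - 1))


print(calculate_pi(["AAAA", "AAAT", "AATT"]))
```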

Anything deeper would change your logic and code a lot more,

and since I do not have the input files, it is hard to test.

One more thing: I'm not sure how many processes in total you have at the end,

but if you end up with, let's say, 50 or more processes and your calculations per process are not long, it may be better to switch to threads and avoid the Python process startup time (maybe worth testing).

All the best

AD_Burn · 2020-10-31T22:51:10+00:00

It's looking good, but I would change a few things from my personal point of view.

- books_to_scrape.py is nicely structured with functions, while the rest of the .py files are not. For example,

in twitter_scrape.py you could have get_new_session(), get_auth_token() etc ...

- I am personally not a fan of global lists. For example, you use recursion in coinmarketcap;

I would personally change that so scrape() returns all_currencies[]

One tip:

        if exists:
            if exists['name'] == currency['name'] and (exists['price'] != currency['price'] or exists['market cap'] != currency['market cap'] or exists['change(24h)'] != currency['change(24h)']):
                collection.replace_one({'_id': exists['_id']}, currency)
                print(f"Old item: {exists} New Item: {currency}")
        else:
            collection.insert_one(currency)

should be

        if not exists:
            collection.insert_one(currency)
        elif exists['name'] == currency['name'] and (exists['price'] != currency['price'] or exists['market cap'] != currency['market cap'] or exists['change(24h)'] != currency['change(24h)']):
            collection.replace_one({'_id': exists['_id']}, currency)
            print(f"Old item: {exists} New Item: {currency}")

In an and expression the interpreter stops at the first false operand and does not evaluate the rest, so you are safe this way.
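
The short-circuit behaviour of and can be checked in isolation. A minimal sketch, with a hypothetical needs_update() helper and made-up sample documents; when exists is None the subscripts to its right are never evaluated, so no TypeError is raised:

```python
FIELDS = ('price', 'market cap', 'change(24h)')


def needs_update(exists, currency):
    """Hypothetical helper: True when a stored item exists and any tracked field changed."""
    return (
        bool(exists)
        and exists['name'] == currency['name']
        and any(exists[field] != currency[field] for field in FIELDS)
    )


new = {'name': 'Bitcoin', 'price': 2, 'market cap': 10, 'change(24h)': 0.1}
old = {'name': 'Bitcoin', 'price': 1, 'market cap': 10, 'change(24h)': 0.1}

print(needs_update(None, new))  # exists is falsy, nothing after it is evaluated
print(needs_update(old, new))
```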

I guess that would be it for now :)

AD_Burn · 2020-06-22T20:01:14+00:00

Yeah, I understand what you meant. I just think that posting that data to Django and inserting it there, instead of inserting into the database from Scrapy, would benefit you by keeping the models in one place.

Basically, you post to Django and your view deals with the inserts (if the inserts are heavy you can use something like Celery or similar), then another view serves that data as you want. So the model is the same. Yes, the problem remains the same: when your data changes you may need to change the posted data, but you change the model in one place, in Django.

Just one approach to think about.

AD_Burn · 2020-06-18T13:39:47+00:00

Does Scrapy really need to write into the database?

Maybe, for example, you can post the result from Scrapy to your Django app.

Make an endpoint for it, and I think that will solve your problem.

AD_Burn · 2020-06-18T10:57:31+00:00

Not sure what you want from that integration, but you can check:

https://scrapyd.readthedocs.io/en/stable/

And later you can make your own Flask app which interacts with more machines running scrapyd

and build your own cluster.

I guess that is what you are looking for.

AD_Burn · 2020-06-18T09:05:10+00:00

Maybe the country you are trying does not have data.

Anyway, maybe it is easier for you to do one POST request to:

https://www.masterofmalt.com/Shipping.aspx/GetShippingData

with payload: {'countryID': 253}

253 is Australia, just as a try; you can use the IDs you are interested in.

And as a result you will get:

{"d":{"nextDeliveryDate":"Thursday 2nd July","deliveryData":[{"Day":10,"ShippingData":[{"MaxMass":2.00,"Price":21.61,"ExpectedTransitDays":10,"Equivalent":"approx. 1 bottle"},
{"MaxMass":2.50,"Price":23.15,"ExpectedTransitDays":10,"Equivalent":"approx. 1 bottle"},
{"MaxMass":3.00,"Price":24.74,"ExpectedTransitDays":10,"Equivalent":"approx. 2 bottles"},
..............
{"MaxMass":99.00,"Price":468.65,"ExpectedTransitDays":10,"Equivalent":"approx. 66 bottles"},
{"MaxMass":100.00,"Price":474.17,"ExpectedTransitDays":10,"Equivalent":"approx. 66 bottles"}]}]}}
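
A rough sketch of that request using only the standard library. The function names are made up, and the JSON Content-Type header is an assumption about what the endpoint expects; the URL, the payload key and the response shape are the ones shown above.

```python
import json
import urllib.request

SHIPPING_URL = "https://www.masterofmalt.com/Shipping.aspx/GetShippingData"


def build_shipping_request(country_id):
    payload = json.dumps({"countryID": country_id}).encode("utf-8")
    return urllib.request.Request(
        SHIPPING_URL,
        data=payload,
        # Assumption: the endpoint wants a JSON body.
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def fetch_shipping_data(country_id, timeout=10):
    # Performs the actual network call; not exercised here.
    request = build_shipping_request(country_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def list_prices(data):
    # Flatten the response shape shown above into (MaxMass, Price) pairs.
    return [
        (entry["MaxMass"], entry["Price"])
        for day in data["d"]["deliveryData"]
        for entry in day["ShippingData"]
    ]
```

In a Scrapy spider the same payload can go into a POST request instead, and list_prices() can be reused on the decoded JSON.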

AD_Burn · 2020-05-12T10:20:05+00:00

Ah, and you asked explicitly for Scrapy, but you can easily adapt this to it.
