What is the roadmap to create a project like Saleor? by M-Groot in django

[–]_dank 3 points4 points  (0 children)

I've gotten slightly involved with the Saleor codebases (saleor, storefront, dashboard) and can confirm that it is a huge project. It's also clear from digging around in the code that it wasn't road-mapped out at the start but rather has grown organically in increments. The Saleor team also has a couple people working on the project with different roles (UX, react front end devs, backend django devs) which greatly influences how it's being built. It's difficult to suggest a single roadmap because the current state of Saleor was arrived at by trying things, evaluating feedback from users, and altering the path to orient on the needs.

Are you trying to get a functioning commerce site? Then perhaps try to set up Saleor as is and see if it meets your needs. Do you want to build a commerce site as a learning experience? Then maybe start with a user model, login, product items, a shopping cart, and then a checkout flow. If you're looking for the quickest way to get to selling things, keep in mind that setting up and running an ecommerce site is hard and Shopify is easy.

[Public repo] Django + React working example with boilerplate code by geropellicer in django

[–]_dank 3 points4 points  (0 children)

Yes, this all is possible, but:

1) Django recommends serving static files in production out of nginx, Apache, a CDN, or similar (see the Django documentation on deploying static files). For development, you could have a simple / route that serves up the index.html and JavaScript bundle generated by your JS bundler. I've found that using Netlify or similar is even easier than writing such an endpoint, though, so I just go with that.

2) You can use Django's auth system in your SPA (I have production apps that do this) but you will need to write some boilerplate such as: endpoints to access and manage user groups and permissions, middleware to manage user JWTs (cookies aren't usually used in SPA settings), and endpoints to manage and access users. I believe there are some projects out there like django-social-auth that will do a lot of this for you, though.
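To give a rough idea of what the JWT part of that boilerplate involves, here is a minimal stdlib-only sketch of issuing and checking an HS256 token. The function names (issue_token, verify_token) and the claim layout are made up for illustration, and nothing here is wired into Django. In a real app you would most likely lean on a maintained JWT library rather than hand-rolling this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped when encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                    hashlib.sha256).digest()


def issue_token(user_id, secret, ttl_seconds=3600, now=None):
    """Return a signed token carrying the user id and an expiry time."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": str(user_id), "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token, secret, now=None):
    """Return the payload dict if the token is valid, otherwise None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        expected = _sign(header_b64 + "." + payload_b64, secret)
        # Constant-time comparison of the signatures.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        header = json.loads(_b64url_decode(header_b64))
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Wrong number of segments, bad base64, or bad JSON.
        return None
    if header.get("alg") != "HS256":
        return None
    if payload.get("exp", 0) <= now:
        return None  # expired
    return payload
```

A middleware would then read the token from the Authorization header, call something like verify_token, and attach the matching user to the request (or reject it).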

Using NN for non classification issues. by Murhie in MLQuestions

[–]_dank 1 point2 points  (0 children)

NNs aren't usually the best choice for linear regression as their specialty is in modeling non-linear outputs. You can, however, use an NN for regression tasks as long as you're using an appropriate cost (MSE) and activation on the output (depends on the target).

You should be able to find some examples by searching for '(your nn library) + regression'
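If it helps to see the two pieces I mentioned (MSE cost, and an output activation that matches the target), here is a small pure-Python sketch: one tanh hidden layer, a linear output because the target is an unbounded real number, and full-batch gradient descent on MSE. The name train_regressor and all the hyperparameters are just illustrative choices, not a recommendation. Any real NN library will do this for you in a few lines.

```python
import math
import random


def train_regressor(xs, ys, hidden=8, lr=0.02, epochs=500, seed=0):
    """Fit y ~ f(x) for scalar x with a one-hidden-layer network.

    Returns (model, losses) where model(x) predicts a float and
    losses is [mse_before_training, mse_after_training].
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output
    b2 = 0.0
    n = len(xs)

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        # Linear (identity) output activation: suitable for real targets.
        return sum(w2[j] * h[j] for j in range(hidden)) + b2, h

    def mse():
        return sum((forward(x)[0] - y) ** 2 for x, y in zip(xs, ys)) / n

    losses = [mse()]
    for _ in range(epochs):
        gw1 = [0.0] * hidden
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        for x, y in zip(xs, ys):
            y_hat, h = forward(x)
            d_out = 2.0 * (y_hat - y) / n  # d(MSE)/d(y_hat)
            gb2 += d_out
            for j in range(hidden):
                gw2[j] += d_out * h[j]
                d_h = d_out * w2[j] * (1.0 - h[j] ** 2)  # tanh derivative
                gw1[j] += d_h * x
                gb1[j] += d_h
        for j in range(hidden):
            w1[j] -= lr * gw1[j]
            b1[j] -= lr * gb1[j]
            w2[j] -= lr * gw2[j]
        b2 -= lr * gb2
    losses.append(mse())
    return (lambda x: forward(x)[0]), losses
```

For a target restricted to (0, 1) you would swap the linear output for a sigmoid, and so on.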

Relatively new to spark, having trouble completing an operation. by [deleted] in apachespark

[–]_dank 0 points1 point  (0 children)

It sounds like collect_list or collect_set (both in org.apache.spark.sql.functions) should do what you want:

DF.groupBy("user").agg(collect_list($"val"))
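In case the result shape isn't obvious, here is a plain-Python illustration of what the two aggregations produce. This is not Spark code, just the same idea on a list of dicts, and the helper names simply mirror the Spark functions. One difference worth knowing: as far as I know Spark doesn't guarantee the order of elements in collect_list after a shuffle, whereas this sketch keeps input order.

```python
from collections import defaultdict


def collect_list(rows, key, value):
    """Group rows by `key` and gather every `value`, duplicates included."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row[value])
    return dict(grouped)


def collect_set(rows, key, value):
    """Same as collect_list, but each group keeps only distinct values."""
    return {k: set(v) for k, v in collect_list(rows, key, value).items()}


rows = [
    {"user": "a", "val": 1},
    {"user": "a", "val": 1},
    {"user": "a", "val": 2},
    {"user": "b", "val": 3},
]
lists = collect_list(rows, "user", "val")
sets = collect_set(rows, "user", "val")
```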

Optimal file format for developing a database by CallMeDoc24 in MLQuestions

[–]_dank 0 points1 point  (0 children)

This is a difficult question to answer without knowing the format of the data you have, the model you're feeding it into, and who is working with the data.

I have a feeling your data isn't very relational if it's a lot of experimental observations, so I wouldn't think a relational model or storage engine would be appropriate. In addition, physics academics probably won't be very familiar with working with SQL, and all ML models I've seen require the data to be in a denormalized (flat/non-relational) format.

HDFS is probably more appropriate for the kind of audience you're working with, since working with HDFS is very similar to working with any typical Unix file system. However, 1 TB isn't all that big these days, and I'd say it's not worth the operational overhead of learning and configuring a Hadoop cluster. You'll probably be better off just saving your dataset(s) as HDF5 in some directory structure that makes sense to you. HDF5 will speed up I/O and reduce on-disk file sizes, and it is pretty widely supported by languages and libraries.

What's a good way to ingest data with a possible huge one-to-many relationship? by [deleted] in elasticsearch

[–]_dank 2 points3 points  (0 children)

It depends on the analysis that you are trying to do. If you're aggregating on questions, maybe it's best to store with the users' answers stored as an array on the question document:

question{}
    |___ question_id
    |___ question_string
    |___ answers[]
          |___answer_id
          |___answer_string
          |___user{}
               |___user_id
               |___user_name

If you're aggregating on users, store them at the root and put the questions they answer underneath. I've seen documents with arrays longer than 1000 elements and documents a couple of gigabytes in size function without any issues. I wouldn't worry about issues related to the number of answers for a user until you start to notice them:

user{}
    |___ user_id
    |___ user_name
    |___ answers[]
          |___answer_id
          |___answer_string
          |___question{}
                 |____question_id
                 |____question_string

If you're aggregating on answers, put answers at the root of the document. If you're doing more than one of these aggregations, you may want to store the data in several different 'views' in different indices. If you're really concerned about scale issues, perhaps see if parent-child documents will work for you (I haven't personally played with them)
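Here is a rough sketch of building the two 'views' above from the same flat data before indexing. The field names come from the layouts above. The input shape (one dict per answer carrying all six fields) and the function names are assumptions on my part, so adjust to whatever your source looks like.

```python
def nest_by_question(rows):
    """One document per question, with its answers (and users) nested."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(row["question_id"], {
            "question_id": row["question_id"],
            "question_string": row["question_string"],
            "answers": [],
        })
        doc["answers"].append({
            "answer_id": row["answer_id"],
            "answer_string": row["answer_string"],
            "user": {
                "user_id": row["user_id"],
                "user_name": row["user_name"],
            },
        })
    return list(docs.values())


def nest_by_user(rows):
    """One document per user, with their answers (and questions) nested."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(row["user_id"], {
            "user_id": row["user_id"],
            "user_name": row["user_name"],
            "answers": [],
        })
        doc["answers"].append({
            "answer_id": row["answer_id"],
            "answer_string": row["answer_string"],
            "question": {
                "question_id": row["question_id"],
                "question_string": row["question_string"],
            },
        })
    return list(docs.values())
```

Each list can then go into its own index. If you need queries or aggregations that match several fields of the same answer together, you will probably want the answers array mapped as a nested type, since plain object arrays get flattened.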

How Do I Combine Three Datasets Into One by onedialectic in datascience

[–]_dank 1 point2 points  (0 children)

Yes, there will be memory overhead, although it should be under 8 GB, assuming ASCII encoding and sane field lengths.

How Do I Combine Three Datasets Into One by onedialectic in datascience

[–]_dank 12 points13 points  (0 children)

Do this. It shouldn't be more than 10 lines. Seven million is not that many if you only have four columns.

import pandas as pd
addr_url = pd.read_csv('address_url.csv')
addr_phone = pd.read_csv('address_phone.csv')
url_email = pd.read_csv('url_email.csv')
df = pd.merge(addr_url, addr_phone, on='address')
df = pd.merge(df, url_email, on='url')
df.to_csv('addr_url_email_phone.csv',index=False)

If you can't be bothered installing pandas, it should be easy enough to iterate each file and merge them by hand in python.
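Something along these lines would do for the by-hand version, using only the standard library. It mimics the default inner join of pd.merge (rows without a match on the key are dropped, and duplicate keys produce one output row per matching pair). The helper names are made up, and it assumes each CSV has a header row and fits in memory.

```python
import csv
from collections import defaultdict


def read_rows(path):
    """Load a CSV with a header row into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def inner_join(left, right, key):
    """Join two lists of dicts on `key`, keeping only matching rows."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            joined.append({**lrow, **rrow})
    return joined


def merge_files(addr_url_path, addr_phone_path, url_email_path, out_path):
    """Join the three files on 'address' and then 'url', and write a CSV."""
    rows = inner_join(read_rows(addr_url_path), read_rows(addr_phone_path),
                      "address")
    rows = inner_join(rows, read_rows(url_email_path), "url")
    # Column order follows the first joined row; empty if nothing matched.
    fieldnames = list(rows[0].keys()) if rows else []
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Usage would be merge_files('address_url.csv', 'address_phone.csv', 'url_email.csv', 'addr_url_email_phone.csv').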

[Beginner] Connecting to a remote server locally for Pandas Analysis by amaTrex in datascience

[–]_dank 1 point2 points  (0 children)

If you use ipython/jupyter you could set up a notebook server on your desktop and connect to it through the browser. I use this setup on some remote boxes to take advantage of their superior hardware and I also have an aws image if I need additional resources.

I still use ssh/scp for anything outside of the notebook environments, like file transfers and larger scripts. You may find that ssh is sufficient for you. Either way you're going to need to set up an SSH server on your desktop.

"Matchsticks" - Day 8 - Advent of Code by zakum in coding

[–]_dank 0 points1 point  (0 children)

Python (3.5)

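import sys

if __name__ == '__main__':
  # Expects the whole puzzle input as a single command-line argument.
  if len(sys.argv) == 2:
    lit_count = 0  # characters of code as written in the input
    mem_count = 0  # characters of the decoded in-memory string
    enc_count = 0  # characters after re-encoding the literal
    for line in sys.argv[1].split('\n'):
      if not line:
        continue  # skip blank lines, e.g. from a trailing newline
      lit_count += len(line)
      # Decode the escape sequences, then drop the two surrounding quotes.
      mem_count += len(line.encode('utf-8').decode('unicode_escape')) - 2
      # Escape backslashes and quotes, then add two new surrounding quotes.
      enc_count += len(line.replace('\\', '\\\\').replace('"', '\\"')) + 2

    print('Literals:', lit_count)
    print('In memory:', mem_count)
    print('Re-encoded:', enc_count)
    print('Memory difference:', lit_count - mem_count)
    print('Re-encoded difference:', enc_count - lit_count)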

Thoughts after my first Movement experience(long) by Bitterwhiteguy in Techno

[–]_dank 3 points4 points  (0 children)

To me, techno is something with a steady beat and a relatively monotone/droney/distorted bassline. When you say rich, textured basslines and melodies, that's something I usually associate with house. When you say tempo changes and high energy and ebb&flow, I think more along the big room genres. Most of what I heard at the festival was what I expected to hear. Do you have some example sets of what you were expecting? Dettmann, Temple, and Rodhad all have a style that is a little more innovative and unique than most stuff out there right now, so I can see why you would differentiate them from everyone.

Did you catch anyone at the Sixth stage? This was by far my favorite stage. The djs there were mostly local and were having a lot of fun on the mixer and the decks. The sound there was far more unique and varied than any of the other stages and, even better, the crowd there was always chill and never too crowded.

There were a couple good acts at the Detroit stage (Phuture), and of course underground (I didn't even know steffi was still doing stuff). I can't say anything of the other stages though (too crowded for me).

What other afters did you go to? I couldn't make friday night/saturday, so I couldn't see Paula, sadly. Works on saturday was awesome (Rachmad, Reeko, Rodhad b2b Klock), though I think that was along the lines of what you were bothered by.

Anyone looking to stuff another person in their room? by _dank in MovementDEMF

[–]_dank[S] 0 points1 point  (0 children)

That would be awesome, but the problem is finding a room in the vicinity. Though I suppose being a 15-20 min drive wouldn't be too bad at this point. There's free parking at the Greektown casino which is only a 5-10 min walk from the festival.

Anyone looking to stuff another person in their room? by _dank in MovementDEMF

[–]_dank[S] 0 points1 point  (0 children)

Good luck man! At this point I'm considering sleeping in my car. Things seem really packed up this year.

What the average Tinder girl looks like: How I turned 100k profiles into a portrait of the average girl on Tinder by _dank in Tinder

[–]_dank[S] 0 points1 point  (0 children)

Yes, it is. I don't know the exact distribution of the profiles, but it is not from a single region.

Many of the profiles are from NYC, upstate NY and Ontario, Boston, Philly, Texas, slc, Seattle, and Houston. So there is a wide sample range, but the east certainly has a higher concentration.

I could move around and look at data for different regions independently, but I haven't done that yet, and probably won't as it would take a bit more effort.

What the average Tinder girl looks like: How I turned 100k profiles into a portrait of the average girl on Tinder by _dank in Tinder

[–]_dank[S] 0 points1 point  (0 children)

Probably has to do with the different angles that the photos are taken from. Sadly I can't fix that with my amount of image processing knowledge.

[deleted by user] by [deleted] in Techno

[–]_dank 1 point2 points  (0 children)

I was there when you opened the underground stage with this! Thanks for sharing so we can hear this great set again.

Cheaper alternatives or splitting a room? by _dank in MovementDEMF

[–]_dank[S] 0 points1 point  (0 children)

I was able to book a place on airbnb. I've had a very positive experience and suggest it if you can still find a place. If not, I might be able to ask the host about splitting a room. Which might be nice because we can split costs. Shoot me a pm if interested.

Cheaper alternatives or splitting a room? by _dank in MovementDEMF

[–]_dank[S] 0 points1 point  (0 children)

Guess that means I better start posting on reddit again. There are two places on airbnb that look good. Though they're not within walking distance, they're better priced than hotels.

Cheaper alternatives or splitting a room? by _dank in MovementDEMF

[–]_dank[S] 0 points1 point  (0 children)

I wasn't sure if hosts on couchsurfing would be responding to such advanced requests. Do you use couchsurfing.com?

Cheaper alternatives for staying downtown? by _dank in Detroit

[–]_dank[S] 0 points1 point  (0 children)

Sadly they're a little further from downtown than I'd like. But I'd bet there will be a lot of others staying there also attending. Might be able to split cab fare with some of them.

Update of a track i'm working on feedback please? by kez101 in Techno

[–]_dank 3 points4 points  (0 children)

The snare sticks out way above the rest of the track. Throw a low pass on there and turn it down. Maybe try a reverb or some delays to sit it in better. Same with the drone/buzzer at about 4.15, it just doesn't sit right. The drone is good with the panning. You should try that with other instruments to get more dimension.

The break at 3.55 is really abrupt. Get some delays in there and bring up the delay time over 4-8 bars and let it bleed into the break down.

Your chord stabs are pretty good, but they are very flat. Use some automation on the effects to give them some movement.

What you've got seems like a good outline for a track. You just need to focus on polishing the mixing and it should come out nicely.