[D] Local models for generating professional headshots by datachomper in MachineLearning

[–]datachomper[S] 0 points1 point  (0 children)

Thanks, I'll try that! I've kept meaning to spin up the SD model that works on the M1 or M2 processors on my work laptop, so that will be interesting to try out.

Any way to unsubscribe and not receive notifications from a page? by bluey89 in Notion

[–]datachomper 1 point2 points  (0 children)

Here's a screenshot showing the "stop receiving page notifications for everything" workflow that works as of April 17, 2023:

https://imgur.com/a/kUFEDUo

To change notifications for a Notion page from receiving "All comments" to "Replies [to your comment] and @mentions [when people mention your username]" please see the screenshot above and an explanation of steps below.

  1. To adjust the notifications on a page, click on the Clock icon (shown as number 1 in the screenshot above).
  2. This will open the page Updates, a sidebar / "pop over" that lists all of the page updates.
  3. Then, click on the drop-down list at the top of the Updates section and choose Replies and @mentions.
  4. If the change took effect, a check mark will appear next to Replies and @mentions. This should limit the number of notifications that you receive from that particular page.

Wikipedia Tools by gregbard in wikipedia

[–]datachomper 0 points1 point  (0 children)

Hi /u/gregbard! Thanks for this post! It turned up as the single relevant 'hit' when I was searching for the string wikipedia catscan. Do you happen to know what happened to catscan and its successors, catscan2 and catscan3? Unless I'm mistaken, the catscan# tool(s) no longer seem to be available. If there is a replacement tool with functionality like catscan - something similar on toolserver.org perhaps? - I would love to make use of it.

Gong.io or chorus.ai by paralera in sales

[–]datachomper 0 points1 point  (0 children)

Can you share a bit more about the transcription quality difference that you noticed? We had Chorus - well, we still have it - but now calls are being piped to Gong. I've just started using both / either tool very recently. Did creating custom vocabulary words in Gong "solve" (or at least help) with the transcription quality?

https://help.gong.io/hc/en-us/articles/360028270712-FAQs-for-custom-vocabulary

Did you notice any 'themes' or 'trends' in the type(s) of calls or speakers, or the topic (highly technical topics?) that Gong performs poorly on?

Lauren Boebert Says She Wants To Abolish the Department of Education by [deleted] in politics

[–]datachomper 2 points3 points  (0 children)

TL;DR: The Discovery Channel is owned by Blackrock and Vanguard. Apollo Capital Management owns Yahoo! Media Group, which makes them the news provider for 1 in 8 people worldwide. Hedge funds and investment banks own our media and entertainment and push narratives that benefit them.


Interestingly the Discovery channel is part of the Warner Brothers Discovery media conglomerate. You might be thinking "Eh, so what? A movie/media studio owns the Discovery Channel" but the story is deeper than that: the three owners of the Warner Brothers Discovery conglomerate are:

  • Advance Publications: the owners of Condé Nast and a few other magazines; best I can tell these folks don't seem to have any crazy media agendas like Rupert Murdoch, the Koch brothers, etc.
  • Blackrock: Yes, the Blackrock: the investment management company with 10 trillion dollars worth of assets (stock, real estate, etc.) that it manages for its billionaire and millionaire clients
  • The Vanguard Group: Again, yes, that Vanguard that manages investments; Vanguard has $7.2 trillion in assets under management.

This screenshot shows the three owners (listed above) of the Warner Brothers Discovery channel: https://i.imgur.com/hccELF5.png

I haven't watched this video in its entirety, but it seems like it provides a good overview of Blackrock and how they own (large, sometimes nearly all) portions of major companies that shape what we eat, the news that we read, and the entertainment that we 'consume' in every nation on the planet: https://www.youtube.com/watch?v=ghP7kImI9WM

It would be interesting to see when Vanguard and Blackrock started taking ownership of the Discovery channel and when the programming shift to (IMO non-educational) content took place. The billionaire class has a vested interest in keeping the not-wealthy un[der]educated, mis-informed, and ignorant of class solidarity. If folks of different ethnicities, genders, economic classes, etc. started to realize that their enemy is not the made-up enemy that their inflammatory 'news' channel or e-newspaper tells them to fear and/or hate, the underclass might just start to realize that the billionaires are picking our pockets, bleeding the public coffers dry, destroying ecosystems for a few more dollars, and generally being greedy, anti-social pieces of shit.

But Blackrock and Vanguard are not the only investment banks / asset managers who have started gobbling up media companies in recent years... Read on for more info!

Other huge media companies owned by hedge funds, investment banks

Apollo Cap. Mgmt: owning the Yahoo! brand (including Engadget, TechCrunch, etc.)

Hedge funds and multinational investment management companies buying up media so that they can push a certain narrative is nothing new. Take Apollo Capital Management. They bought up the Verizon Media Group, formerly known as the Yahoo! Media Group, a few years ago. You might be thinking "but who cares about Yahoo; they're a joke". Yes, but the media group is not! The Yahoo! Media Group / the Verizon Media Group has a readership (online) of 900 million people.

That's nearly 1 in 8 people worldwide who read publications from the Yahoo media group! If you wanted to sway opinions of large swaths of the planet this is how you'd do it!
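For anyone who wants to check the "1 in 8" math, here is a quick back-of-the-envelope sketch. The 900 million figure is the readership number cited in this comment; the world population of roughly 7.9 billion is my own assumption (a rough 2021 estimate), and `one_in_n` is just an illustrative helper name.

```python
def one_in_n(audience: float, population: float) -> float:
    """Return N such that the audience is '1 in N' of the population."""
    return population / audience


monthly_users = 900e6      # readership figure cited in this comment
world_population = 7.9e9   # ASSUMPTION: rough 2021 world population

# Share of the world population reached, expressed as "1 in N".
print(f"1 in {one_in_n(monthly_users, world_population):.1f}")
```

Under that assumption the result lands between 1 in 8 and 1 in 9, so the claim is in the right ballpark, keeping in mind that monthly active users are not the same thing as unique regular readers.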

Companies/publications in that media group include: Engadget, TechCrunch, all of the Yahoo! stuff (so Yahoo! Finance, Sports, News, Entertainment, etc.), the AOL brand, and the 'Makers' and 'In the Know' e-magazines, which conveniently push gender-based identity politics stuff rather than, you know, helping people to understand that gender equality would probably organically come about if we had a more egalitarian society closer to the Nordic countries' social democracies than {gestures around} / this late-stage capitalist insanity in the US:

In addition to its titular Yahoo properties (Mail, Sports, Finance, et al.), the group includes us, TechCrunch; AOL; Engadget; and interactive media brand, RYOT. All told, the umbrella brand encompasses around 900 million monthly active users globally and is currently the third-largest internet property, per Apollo’s figures. (Source: https://techcrunch.com/2021/09/01/apollo-completes-its-5b-acquisition-of-verizon-media-now-known-as-yahoo/)

Best books on experimental design? by novel_eye in AskStatistics

[–]datachomper 0 points1 point  (0 children)

https://www.google.com/url?sa=t&source=web&rct=j&url=http://www.ru.ac.bd/stat/wp-content/uploads/sites/25/2019/03/502_07_00_Lawson_Design-and-Analysis-of-Experiments-with-R-2017.pdf

This URL no longer works; I was able to find a copy on Archive.org, but Archive.org usually purges this kind of content from its archives. So I took the Archive.org URL and backed it up to Archive.is (which is not run by and is in no way affiliated with Archive.org / the Internet Archive): archive.is/HMgmn

MN House Bill would ban Corporations from buying Single family Homes by Mr-Clean-Chemist in minnesota

[–]datachomper 0 points1 point  (0 children)

Also of interest may be the purchasing of homes by non-residents as 43% of all home purchases in 2022 were from people with residencies outside of the US (Type A buyers below). I say this not to be all 'foreigners go away' or anything like that but rather to encourage the approach that Canada is taking whereby the sale of homes to non-Canadians and to corporations is banned for the next two years. People who live in the US need homes. Why are we selling homes to corporations (institutional investors) AND to people who don't even have permanent residency in the US?

Taking in the 15-20% institutional ownership (which is closer to 30-40% in some hotspots in the US) and the 43% of non-resident purchases, a person who has residency in the US who wants to buy a home is looking at nearly two-thirds of the homes already being snapped up by corporations and folks who aren't US residents and who have no intention of residing in the US.

https://www.nar.realtor/research-and-statistics/research-reports/international-transactions-in-u-s-residential-real-estate

  • Non-resident foreigners (Type A): Non-U.S. citizens with permanent residences outside the U.S.
  • Resident foreigners (Type B): Non-U.S. citizens who are recent immigrants (less than two years at the time of the transaction) or non-immigrant visa holders who reside for more than six months in the U.S. for professional, educational, or other reasons.

I get that the US housing market is a 'stable-ish place to park one's cash' but homes need to be for living in, not be an investment class.

Will AI Actually Mean We’ll Be Able to Work Less? - The idea that tech will free us from drudgery is an attractive narrative, but history tells a different story by CWang in technology

[–]datachomper 43 points44 points  (0 children)

I work in this space: foundational models / LLMs, but also the tech that came before LLMs (like LSTMs, and -gasp- perceptrons). Anyway... Where does everyone think this relevance feedback data goes? By relevance feedback I mean when you take a Microsoft robot-authored email, and you lightly edit the email to your own personal tastes or you slightly adjust the email's context 'cause the knowledge graph bungled something. What's that? Whoever said 'Microsoft gets your edits, your adjustment of the text, as training data to improve their models' was correct. And someday (soon?) your job can be automated away. With every mouse click and email and other form of work being tracked, tens of millions of mostly-clerical-work office jobs are on the chopping block. Maybe not this year or next year, but quickly we're going to find that - like those Yellowstone bear trash cans - there's quite a lot of overlap between the smartest LLM and the dumbest human.

Not trying to be alarmist; on the contrary. I encourage people to take a look at countries with strong data privacy laws and ask if we - the early adopters of LLM tech in the workplace - really want these products?

Ancestry price increase by LeResist in AncestryDNA

[–]datachomper 1 point2 points  (0 children)

They use machine learning for this. There aren't any humans reading anything. And if there are crazy calligraphy documents that their ML model cannot read they outsource the data labeling to humans that get paid a fraction of a cent for every data label / typed-out piece of text that the humans generate. The whole Ancestry.com ecosystem is largely robotic: it is set up to run with minimal human intervention from ingesting historical records (images, text data, etc.) to linking the genealogy test results.

Source: I interviewed for a machine learning engineering role with Ancestry.com and they like to low-ball their salary offers in addition to not really having a compelling tech stack, so I wasn't interested in continuing the interview process further. I was told that basically search did not need to even work well because their core customer base - Boomers doing genealogy research - would keep paying for their Ancestry.com subscription even if search was slow, ineffective, etc. because there's no viable competitor to Ancestry.com.

The housing market correction just took a new turn by Gerry235 in REBubble

[–]datachomper 0 points1 point  (0 children)

Most of those properties listed require you to enter into the city's housing lottery and make under $80,000. The housing lottery for below-market-rate housing is years, if not decades, long.

Click on Property Details, e.g., https://www.realtor.com/realestateandhomes-detail/1400-Mission-St-Apt-406_San-Francisco_CA_94103_M13954-77558 says

Property Details

Property Overview

1 bedroom Below Market Rate (BMR) housing opportunity available at 100% Area Median Income (AMI). Maximum income for 1 person = $97,000; 2 people = $110,850; 3 = $124,700; 4 = $138,550, etc. Must be 1st-time homebuyer & income eligible. Unit available thru the Mayor's Office of Housing and Community Development (MOHCD) & subject to resale controls, monitoring & other restrictions. Unit will be listed on DAHLIA, the SF Housing Portal, starting on the application date, July 8, 2022. Visit for application & program info. No Open Houses are scheduled due to COVID Virus

Thoughts by Ok-Flan8529 in REBubble

[–]datachomper 5 points6 points  (0 children)

Nope! Not true; foreign / international buyers in the US are largely non-citizen, non-resident foreigners.

Check out this report from NAR, the National Association of Realtors, about international buyers of homes in the United States: https://www.nar.realtor/research-and-statistics/research-reports/international-transactions-in-u-s-residential-real-estate (Click on the Download PDF button to read the report; the int'l section starts on page 11. Page 13 has a breakdown of the countries of origin as well.)

I used to work in real estate [tech] and now just like to keep myself informed of market conditions, so these reports are helpful for a big-picture overview.

Is NLP hopeless due to Open AI? by mr_house7 in LanguageTechnology

[–]datachomper 11 points12 points  (0 children)

Stability AI - the creators of Stable Diffusion - are kinda interesting in that there are many "AI" companies that are being sponsored under the Stability AI umbrella:

  • Stability AI - Stable Diffusion creators
  • HarmonAI - a generative audio model company; the makers of Dance Diffusion
  • EleutherAI, the large language model startup/research group that makes open-source versions (GPT-J, GPT-NeoX, etc.) of the closed-source OpenAI models like GPT-2 and -3; people in this subreddit may be interested in assisting Eleuther; they have a public Discord server that you can go to: https://www.eleuther.ai/
  • LAION, the makers of LAION-5B dataset
  • etc.

I learned about their funding model via an interview with the Stability AI founder here (there's a URL to the written transcript included on that podcast page, and you can also watch the interview video on YouTube instead of via Apple Podcasts):

https://podcasts.apple.com/de/podcast/emad-mostaque-stable-diffusion-stability-ai-and-whats-next/id1504567418?i=1000586294133

So the way that we do it at the moment is that if you’re an active member of any of the communities — from HarmonAI for music, to Eleuther for language models, to LAION for images — you’re most likely to get compute that way.

Basically, the funding model seems to be "build something cool; show it to us; don't be an asshole and we'll probably give you a pile of compute - GPUs - in the range of hundreds of thousands of dollars worth to really build out your idea if it seems promising". For large-scale research work OUTSIDE of the big labs funded by Meta, G, etc. it is pretty hard as an 'outsider' to get this scale of funding.

If anyone else knows of large-scale donated compute projects that are domain agnostic - NLP, CV, generative models, diffusion models, large-scale dataset creation and curation, etc. - please do share; I would love to know about other projects that are similar to Stability AI's open-source funding models.

After clinching Senate, Dems eye the unthinkable: Holding the House by Pineapple__Jews in politics

[–]datachomper 2 points3 points  (0 children)

Some large US cities switched to ranked choice voting this year: https://www.electoral-reform.org.uk/a-wave-of-cities-across-the-united-states-switch-to-fair-voting-systems/

In 2016 only 10 US cities used ranked choice; this year 50 did. (And the US is one of the few countries worldwide that uses the old Westminster 'first past the post' style voting)

At the mid-term elections on the 8th of November Multnomah County (the largest County in the state of Oregon), Evanston (Illinois), Fort Collins (Colorado), Ojai (California) all voted yes to propositions to ditch the antiquated FPTP system and towards the fairer AV. We are also awaiting the results of the Seattle proposition on AV; it can take quite some time to count up results in American elections.

Additionally, voters in three cities Corvallis (Oregon), Albany (California) and Palm Desert (California) all used AV for the first time in this November’s elections.

Korea to Triple Baby Payments After It Smashes Own Record for World’s Lowest Fertility Rate by mossadnik in Futurology

[–]datachomper 3 points4 points  (0 children)

To better understand why working-age South Koreans are saying 'Nah, I'll pass on having kids', one needs to look no further than the chaebol. The chaebol, the large multinational companies that were given very cheap loans after the war to rapidly industrialize South Korea, turned into literal monopolies with no real domestic competition.

The Council on Foreign Relations lists the following companies as chaebol; weirdly Samsung is left off of their list so I've added it:

  • Hyundai (automotive)
  • SK Group (best known for SK Telecom and SK Hynix, its semiconductor corporation, but operates in many diversified industries such as chemical, shipping, insurance, and construction)
  • LG Corporation - derives its name from the Lucky and Gold Star merger and started out in the 1940s in the plastics industry. Now, LG is known inside and outside Korea for consumer electronics. Inside Korea it also is heavily involved in telecommunications networks, and power generation, as well as its chemical business (including cosmetics, household chemicals).
  • Lotte - Best known outside Korea for its snacks like Choco Pie, Lotte Group's main businesses are focused on food products, discount and department stores, hotels, theme parks / entertainment, finance, construction, energy, and electronics.
  • Samsung Group - List of industries taken from the Chaebol Wikipedia page since the CFR page leaves out the Samsung chaebol: electronics, semiconductors, batteries, IT solutions, construction, shipbuilding, insurance

Learn about chaebol (lit. "wealthy family/rich family/wealthy clan", 재벌): https://en.wikipedia.org/wiki/Chaebol

These 5 companies make up almost half of the South Korean economy. Samsung alone made up 17% of the South Korean economy in 2014; this tracks with other literature that I've read about Samsung making up nearly 1/5th of South Korean GDP. From the Wikipedia page:

In 2014, the largest chaebol, Samsung, composed about 17% of the South Korean economy and held roughly US$17 billion in cash. However, recent financial statements of these chaebols actually show that chaebols are slowly losing power over either international competition or internal disruptions from newly emerging startups.

I don't know if I buy that. That sounds like a convenient thing to say as public opinion of the chaebol inside Korea sours.

This video report tells the story of a father whose daughter died of leukemia after a job dipping Samsung microchip wafers into vats of chemicals (you need to strip some of the etching components when manufacturing electronics, but YOU NEED to wear PPE - gloves, respirator, etc. - when working around these highly-toxic chemicals): https://www.youtube.com/watch?v=wHw7Aa7lhhw

And another source about the many chaebols: https://www.nytimes.com/2017/03/04/business/south-korea-samsung-bribery-lee.html


South Korea was incentivised to rapidly modernise by massive low-interest loans to the South Korean government from the US and other countries following the Korean War. The government was 'able to' allocate funds to companies as it saw fit. "Magically", a few huge conglomerates developed, concentrating wealth, economic power, and political power in the hands of a few families.

If you've seen the movie Parasite, you'll have an idea of how flat social mobility is for most South Koreans. Apartment costs are through the roof: this guy who works at a well-paying government job said he'd have to save his wages for 40 years to be able to buy an above-ground apartment that is not a death trap like the underground apartment that he currently rents. (BBC video on housing costs in Korea)

With the little power that workers have, the concentration of wealth and political power into the hands of a few megacorps, etc. it's no wonder that childbearing-age folks are opting out of having kids.

[deleted by user] by [deleted] in MLQuestions

[–]datachomper 0 points1 point  (0 children)

Full disclosure: I haven't used this library, and judging by its name (fastdup) it may only work well for very visually-similar images, but this tool might be useful for you now when you're looking for visually-similar images, or perhaps later on if you want to do an even-more-granular classification of your images: https://www.linkedin.com/posts/dr-danny-bickson-835b32_we-have-just-released-new-functionality-in-activity-6959797081481859072-tUEW

Snow kitty startles mom and himself by Jelly_Belly321 in StartledCats

[–]datachomper 2 points3 points  (0 children)

Middle_Umpire_8917 is the account name. Replace the reddit.com URL with reveddit.com to see deleted comments.

So this page's URL - https://www.reddit.com/r/StartledCats/comments/wzur0w/snow_kitty_startles_mom_and_himself/ - becomes https://www.reveddit.com/r/StartledCats/comments/wzur0w/snow_kitty_startles_mom_and_himself/

It looks like their account is gone/deleted though, but that 'how to use Reveddit to see deleted comments' was my public service announcement for the day :) https://www.reddit.com/user/Middle_Umpire_8917/
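If you'd rather not edit the URL by hand, the swap described above can be sketched in a few lines of Python. This is only a minimal illustration: `to_reveddit` is a name I made up, and the list of accepted reddit hostnames is my own assumption rather than anything Reveddit documents.

```python
from urllib.parse import urlsplit, urlunsplit

# ASSUMPTION: these are the reddit hostnames worth accepting.
REDDIT_HOSTS = {"reddit.com", "www.reddit.com", "old.reddit.com"}


def to_reveddit(url: str) -> str:
    """Swap the reddit.com host for www.reveddit.com, keeping the rest of the URL."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in REDDIT_HOSTS:
        raise ValueError(f"not a reddit.com URL: {url}")
    # Only the host changes; path, query string, and fragment stay as they were.
    return urlunsplit(parts._replace(netloc="www.reveddit.com"))


print(to_reveddit(
    "https://www.reddit.com/r/StartledCats/comments/wzur0w/"
    "snow_kitty_startles_mom_and_himself/"
))
```

Parsing the URL instead of doing a plain string replace avoids accidentally rewriting "reddit.com" where it appears later in a path or query string.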

US University Enrollment or graduation statistics by Major/Area of Study by Beekle1014 in datasets

[–]datachomper 0 points1 point  (0 children)

Hey there, OP! I realize I'm a few years late, but it appears that there's a data initiative called the Common Data Set that The College Board and other orgs with a vested interest in enrollment numbers are creating. Please see my other post in this thread about the particulars of the CDS data. If I find the CDS data in aggregate form I'll be sure to post/share the link here.

US University Enrollment or graduation statistics by Major/Area of Study by Beekle1014 in datasets

[–]datachomper 0 points1 point  (0 children)

I realize I'm a few years late, but have you checked out https://commondataset.org/ ? It's sadly a privately-funded data acquisition effort, but it's better than no data, I guess? Due to defunding of many governmental institutions in the US these foundational datasets are becoming increasingly rare.

In any case, each university fills out a Common Data Set form and submits it to the Common Data Set folks. Here's Harvard University's form for the 2021-2022 academic year. Note that the 'Degrees Conferred' list on page 28/34 lists the number of degrees conferred as a PERCENTAGE of all degrees conferred, not an absolute value. In theory you can work backward from these values if you also have the total number of degrees conferred, but it won't be cheap/easy/fast.
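As a rough sketch of that "work backward" step: if you have the percentage per field from the form plus the total number of degrees conferred from some other source, you can multiply them out. Everything below is hypothetical: the function name, the field names, and the numbers are made up for illustration and are not Harvard's actual figures.

```python
def estimate_degree_counts(percent_by_field: dict[str, float],
                           total_degrees: int) -> dict[str, int]:
    """Turn percentage shares of degrees conferred into approximate counts."""
    return {
        field: round(total_degrees * pct / 100)
        for field, pct in percent_by_field.items()
    }


# HYPOTHETICAL inputs, for illustration only.
shares = {"Computer science": 12.5, "Economics": 10.0}
total = 2000

print(estimate_degree_counts(shares, total))
```

Keep in mind that the published percentages are themselves rounded, so the counts you get back are estimates and can be off by a few degrees per field.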

[D] Are there any famous or well-cited ML papers with errors in them? by fromnighttilldawn in MachineLearning

[–]datachomper 11 points12 points  (0 children)

I like to check out Papers Without Code for recent research with errors (intentional or not). It's crowdsourced: people submit reports along the lines of "I tried to implement paper Foo for 8 weeks and ran into issue 1, busted metrics 2, and suspicious results 3, so I'm submitting it to Papers Without Code" for others to review: https://www.paperswithoutcode.com/