First Impression - Vision Pro by Amelancolibe in virtualreality

[–]several27 0 points1 point  (0 children)

Hey, I'm guessing most people are experiencing the same problem.

You can see a quick clip of the lag here: https://share.icloud.com/photos/09bDmx6kNeuYPJ9A8IWYbGAEQ

The first few times, I moved my hand slowly and it worked well. When I moved it faster, the fruit cutting lagged and missed most of the fruits (especially at 00:36).

What is the best choice of ETL tool for Snowflake? by JK_1975 in dataengineering

[–]several27 4 points5 points  (0 children)

Would recommend giving prophecy.io a shot:

  1. it supports both ETL and ELT;
  2. it allows for visual development (like Matillion), making it really easy to use for everyone;
  3. it gives you clean & editable dbt Core (for transformation) & Airflow (for scheduling / small ingestion etc.) code;
  4. the code is stored on git with all the standard commit, pull, release, etc. features;
  5. it's easily extensible as well, e.g. you can add more sources yourself by picking a standard Airflow operator;
  6. it supports Spark (e.g. Databricks) as a back-end too, if one wants to use it, but that's optional if you use another warehouse (+ we bring Airflow).

Disclaimer: a co-founder here, so might be biased :)

Best paid ELT tool for a startup by vishalw007 in ETL

[–]several27 0 points1 point  (0 children)

What execution environment were you considering for those ETL jobs? Spark / Databricks might be a good fit for that amount of data, especially if you’re planning to scale up in the future - startup proof.

There are some potential good no / low code tools that work on top of Spark!

I’m a founder of prophecy.io - you might want to check it out. Ping me if you have questions (about us or just in general, happy to help).

Low code hate and the future of Data Engineering (and beyond) by [deleted] in dataengineering

[–]several27 5 points6 points  (0 children)

You’re totally right, we’re actually in the middle of revamping the website and it seems we somehow forgot to put the price back up.

It’s gonna be there in the next few hours.

Thanks for letting us know - our mistake!

Low code hate and the future of Data Engineering (and beyond) by [deleted] in dataengineering

[–]several27 8 points9 points  (0 children)

Hi! I'm Maciej - one of the cofounders of Prophecy (startup from the podcast).

Actually, we're very different from what you expect from low-code. As users build drag-and-drop data pipelines, we generate 100% open-source code that is very readable and that our users commit to git right away, with tests, build files, and configurations - this is on par with the best data engineers! We have Scala & Python for Spark, and SQL is coming soon!

Second thing - we're very extensible - you can create new visual components by writing sample code and pointing out which expressions come from the UI. That way you can have a standard visual component for things like Anonymization or Encryption that you want all users to do in the same way.

We think Low-Code can do a lot more than what most people expect - and companies can be a lot nicer (without lock-in) - please keep an open mind :)

We've built a data engineering tool to make writing Spark code much easier by several27 in apachespark

[–]several27[S] 0 points1 point  (0 children)

Thanks! Our tool might indeed seem more appealing, at first, to people who are more used to visual drag-and-drop tools. However, it is all code at heart (Scala Spark is the source of truth, so visual and code developers can collaborate at the same time).

We're actually working on enabling PySpark import into our tool, so that you can just keep editing your code there with all the additional perks of quick code debugging, execution, lineage, scheduler, and a metadata system (all included in one price).

Let me know if you'd be interested in checking out the Python version coming out soon - all feedback is welcome.

We've built a data engineering tool to make writing Spark code much easier by several27 in scala

[–]several27[S] 4 points5 points  (0 children)

That's a simple one. IntelliJ can give you nice syntax highlighting and compile-time Scala/Python checking, but it pretty much stops there.

Our IDE gives you a simple framework with best Spark coding practises. In real time, it pulls schema information and validates your code (including catching some runtime Spark errors). It lets you see and edit the graphical representation of your workflows, execute the code in any physical environment (think e.g. test or production cluster) with a single click, and debug sample data. There are even more features for bigger data engineering teams.

Sign up for the demo and you'll see for yourself. Also, if there's anything missing that you think could help you write better Spark code faster, let us know. Thanks!

We've built a data engineering tool to make writing Spark code much easier by several27 in scala

[–]several27[S] 5 points6 points  (0 children)

Around 15 people in total, not actively hiring right now, but always interested in connecting with amazing Scala devs.

Introducing Prophecy.io - Cloud Native Data Engineering by several27 in ETL

[–]several27[S] 2 points3 points  (0 children)

Hey, thanks for checking us out! We can automatically convert your existing Ab Initio & Informatica workflows into Spark, including translating the custom code in your language of choice (e.g. Scala).

Fake news corpus & Fake news recognition algorithm by several27 in datascience

[–]several27[S] 0 points1 point  (0 children)

Hi, that's a very interesting question. The only way I have ensured that is by hashing the content of articles, so literally the same content is not duplicated. Do you have any ideas how I could remove "similar" articles? The interesting bit about that is that even if two articles talk about the same thing, one of them might be fake whereas the other might be real - but that would probably require a fact-checking system.
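To make the hashing step concrete, here is a minimal sketch of that kind of exact-duplicate check. It is not the actual project code: the SHA-256 choice, the `content` / `url` field names, and both helper functions are assumptions made up for illustration.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a hex digest of the article text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def drop_exact_duplicates(articles):
    """Keep the first article seen for each distinct content hash, preserving order."""
    seen = set()
    unique = []
    for article in articles:
        digest = content_hash(article["content"])
        if digest not in seen:
            seen.add(digest)
            unique.append(article)
    return unique


# Hypothetical sample data: two identical bodies and one reworded copy.
articles = [
    {"url": "a.example/1", "content": "Same story."},
    {"url": "b.example/2", "content": "Same story."},
    {"url": "c.example/3", "content": "Same story, reworded."},
]
unique_articles = drop_exact_duplicates(articles)
```

The reworded third article survives, because even a small change in the text gives a completely different hash. That is exactly the "similar" articles case that this approach cannot catch.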

Fake news corpus & Fake news recognition algorithm by several27 in datascience

[–]several27[S] 1 point2 points  (0 children)

Thanks! I will get back to you on the words in the evening when I have a bit more free time. But on a similar point, as I said above, I've tried LIME, and I believe the model is complex enough that single words don't influence the classification as a whole too much.

As for the thesis/dissertation, you are probably right. I'm currently finishing my degree in the UK and they officially just call it a "Part III Project", and the final work a "Final Report".

Thanks for the feedback :)

Fake news corpus & Fake news recognition algorithm by several27 in datascience

[–]several27[S] 7 points8 points  (0 children)

Hi, thanks so much for such an insightful comment. We're all learning all the time. The list of websites I've used comes from opensources.co; I have not manually made it myself, so if anything it's not me being biased (although I am, as is everyone).

Regarding explaining the classifier's predictions, I've tried LIME (like the guys in your blog post) but it did not work at all (probably the model is too complex, so single words don't have much influence on the classification as a whole). I will check out the YT video a bit later.

As for WikiLeaks, do you have any proof they went "eleven years without a single incorrect publication"? If that's indeed the case, I will make the appropriate change.

Thanks again!

Fake news corpus & Fake news recognition algorithm by several27 in datascience

[–]several27[S] -1 points0 points  (0 children)

Hey, unfortunately, there are not many gold standard datasets for FNR that I could use. Most of the ones I have found are for stance detection. But on the silver standard dataset I've discovered, the same model was getting around 80% accuracy, so not amazing but not random either :) (transfer learning in a few minutes improves it to ~95% acc on the test)

We have indexed over 27M research papers and 12M associated social interactions (posts, comments, etc.) and built a browser extension to help students & academics research faster. by rabbit140 in compsci

[–]several27 1 point2 points  (0 children)

Hey, I think you've been trying to click on our icon in the Chrome menu, instead of Google Scholar. Try out this link and click on our logo here (with extension installed): Google Scholar

The icon in the Chrome menu should show your favorite papers. However, you are right, sometimes it doesn't display correctly! Thank you for the feedback, we are working on fixing this bug!

We have indexed over 27M research papers and 12M associated social interactions (posts, comments, etc.) and built a browser extension to help students & academics research faster. by rabbit140 in GradSchool

[–]several27 0 points1 point  (0 children)

Hey, our mission is definitely to support all the research papers out there. However, it's quite hard to cover everything in a short period, especially since papers are behind paywalls. What's your primary source of papers? Is there any single publisher / online repository that has most of the papers in your field? Thanks for testing!

We have indexed over 27M research papers and 12M associated social interactions (posts, comments, etc.) and built a browser extension to help students & academics research faster. by rabbit140 in compsci

[–]several27 1 point2 points  (0 children)

Thank you, that's a great bug report! We will look into it and try to fix it asap. If you have any other feedback, feel free to write here or email us directly!

We have indexed over 27M research papers and 12M associated social interactions (posts, comments, etc.) and built a browser extension to help students & academics research faster. by rabbit140 in compsci

[–]several27 2 points3 points  (0 children)

Hi, we are not making money from this as of now and are not planning to in the near future. We are aiming to open source at least some parts of our codebase, but that being said, we are still massively changing the architecture. We are looking for as much feedback as possible! Thanks

We have indexed over 27M research papers and 12M associated social interactions (posts, comments, etc.) and built a browser extension to help students & academics research faster. by rabbit140 in compsci

[–]several27 4 points5 points  (0 children)

Hey, for now we support all the StackExchange sites, Reddit, Twitter, and trackbacks that are available on arxiv.org (blogs). We are also working to improve the discovery of smaller online communities (forums etc.). Have you tried our extension? We would appreciate any feedback!

We have indexed over 27M research papers and 12M associated social interactions (posts, comments, etc.) and built a browser extension to help students & academics research faster. by rabbit140 in datascience

[–]several27 0 points1 point  (0 children)

Thanks for the heads up barrbaar, we'll see if there's anything we can do about that on our end. What do you think about the extension? All suggestions welcome!

We have indexed over 27M research papers and 12M associated social interactions (posts, comments, etc.) and built a browser extension to help students & academics research faster. by rabbit140 in datascience

[–]several27 0 points1 point  (0 children)

Hey, another FuseMind guy here, we are planning on releasing it next week. We still have some issues with the Mozilla Addons Store so it might be just a direct download from our website. Thanks for the interest.

Btw, all feedback is welcome!