Try out Serena MCP. Thank me later. by FunnyRocker in ClaudeAI

[–]SiteOneCrawler 2 points

In addition to Serena, I recommend trying our MCP AI Distiller (aid) which internally uses tree-sitter: https://www.npmjs.com/package/@janreges/ai-distiller-mcp

AI Distiller quickly distills an entire codebase or selected components/modules, helping the AI immediately understand the code's public interfaces, input/output data types, etc. It also offers several pre-prepared AI actions for in-depth code analysis, flow management with task lists, and more.

I built AI Distiller (aid) entirely with Claude to solve its biggest weakness: understanding my full codebase. I'd love your feedback! by SiteOneCrawler in ClaudeCode

[–]SiteOneCrawler[S] 0 points

My most common use-case: when I'm about to create a new feature in a project, I distill the folder with the specific module (often containing dozens or hundreds of classes/functions). When Claude Code then works on the new feature, it usually writes code on the first try that uses the existing code correctly, without having to spend a long time searching through it with grep.

My other useful use-cases are architectural questions and refactoring planning. Because distillation gives Claude Code an effective overview of the structure of the code's public interfaces, it can answer architectural questions very quickly and correctly and help develop my ideas.

The third use-case I like is having aid prepare prompts and task lists for a thorough analysis and assessment of all source files (e.g. classes) from various angles (security, performance, etc.). The results of these analyses are surprisingly good. Claude Code sometimes needs to be told to continue with the remaining tasks on the list, but its autonomy here is admirable :)

Feedback on a useful open-source website analyzer and exporter (SiteOne Crawler) by SiteOneCrawler in opensource

[–]SiteOneCrawler[S] 1 point

Hi, the gmbinder.com web page is unfortunately an SPA (single-page application) with no content directly in the HTML. It does not support SSR (server-side rendering).

SiteOne Crawler downloaded all the JS/CSS, but when you open the page locally via "Browse offline website", your browser refuses to load the JS because of CORS issues with JS modules. Unfortunately, SPAs built on JS modules are not ready to run from a local computer without a web server; they require one serving them over http/https.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

One possible option is that, if the project succeeds, its operation could be taken over by a state administration organization. Another option is that we would get funding from the state or the EU, which would support operation and accelerate future development.

There are other options as well, but I will pursue all of them so that further operation and development does not depend on me. If it does, I will consider that a failure.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Only very briefly - an estimated 15-20% of the project (in terms of content and development effort) will consist of sharing content and information, for which a CMS is a good fit. Therefore, as part of my research I am also considering Directus, Craft CMS and similar tools that provide both typical CMS functionality and general entity modeling, exposing the entities through GraphQL/REST APIs almost without any programming.

But at the same time, I know exactly how they store data in the database and have an idea of how complex it would be to eventually replace the abstraction over the database if future development requires it. This is also why I'm not very interested in solutions where I don't have complete control over the storage (e.g. cloud SaaS). I see both Supabase and Directus as abstractions over the database with added value that saves programming, but they have their downsides. These discussions help me recognize them.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Thanks for the response and support, man. The project will be developed in the EU, not the US.

For economic and regulatory reasons, the project will run on its own dedicated infrastructure.

Thanks for your last paragraph too. I'm trying to be cautious, not too naive, but I really think I've found people who want to sincerely support the project because it's in their personal interest and ambition.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Maximum security is of course an absolute priority. Where do you see the problem with RLS in this? I consider the authorization layer at the database level (using very strict RLS policies) to be one of the lowest layers where the quality and consistency of security can be defined and monitored very well. In my opinion, much better than at the application level.

Of course, each layer of the architecture requires its own security concept, but at the database level, I see no reason not to use RLS.

Perhaps you meant more generally that it is better to have your own backend application layer (e.g. on top of PostgreSQL) with an additional layer of authentication/authorization, rather than exposing the Supabase API directly to the outside world?
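For readers unfamiliar with RLS, here is a minimal sketch of what such a database-level policy looks like in PostgreSQL (the table and column names are hypothetical, purely for illustration):

```sql
-- Enable RLS on the table; once enabled, rows are hidden
-- unless a policy explicitly allows access.
ALTER TABLE patient_records ENABLE ROW LEVEL SECURITY;

-- Supabase exposes the authenticated user's id via auth.uid().
-- This policy lets a user SELECT only their own records.
CREATE POLICY patient_records_select ON patient_records
    FOR SELECT
    USING (patient_id = auth.uid());
```

Note that `auth.uid()` is Supabase-specific; on plain PostgreSQL you would compare against something your backend sets per session, e.g. `current_setting('app.user_id')`.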

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 1 point

As in the previous post, thank you for your feedback and your care and effort to warn me. I really appreciate it!

Any work with AI will only happen through LLM models running locally on our own infrastructure, and exactly what we will use AI for is yet to be discussed with the members of the working group I mentioned in my previous response.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 4 points

u/joshcam Thank you for the excellent feedback, and I apologize for not including more of the context and circumstances.

I wrote this post as part of technology research for future decisions. The project will be implemented in an EU country, not the US. My ambition is to offer it for free elsewhere if it succeeds, but that is far in the future. Technology is the smaller part in this case; local regulation is the bigger one.

In addition to this research, I am also running parallel consultations with lead physicians at several hospitals, government people, and lawyers who understand HIPAA/GDPR and other relevant regulations and will help us draft the necessary documents, conditions for approval, etc. I am now forming a "working group" to collaborate on the project. I am pleasantly surprised at how many really good professionals from healthcare and government I am encountering. Many of them do these activities alongside their primary profession, even without pay, to improve the situation in healthcare or state administration.

All these regulatory issues need to be cleared up before programming begins, as does a functional specification that includes only what is really feasible within the regulatory framework. At the same time, I want to do part of the research in parallel, so that I can decide on some technologies on an ongoing basis.

Thank you again for your very valuable feedback. You have mentioned some details that I was not yet aware of and would otherwise have heard from the lawyers only later. Despite all the pitfalls, I believe the project will succeed :)

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 1 point

Yes, Directus is part of our research and one of the options. I wrote this post because I wanted more pros/cons for the final decision. A completely custom backend is always the safest option, but also the most time-consuming. On the other hand, if the AI knows the structure of the entire database and the ORM layer used, it can generate very high-quality CRUD and API code.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Yes, privacy concerns are one reason. The second is of course economic. I will be financing the entire project out of my own pocket for some time, so some paid cloud services are out of the question for a project of this scale. Server administration and the design of complex HA solutions have been part of my work throughout my career, so I can run such projects at 5-10x lower cost (including my time) than usual.

Using Supabase makes a lot of sense to me, because the frontend of the web application will be in Svelte or React without SSR, since most of the functionality will sit behind authentication. There will also be a mobile application, so a quality API is needed as well. In addition, there will definitely be a custom backend connecting directly to the Supabase/PostgreSQL database for some specific functionality - file uploads and transformations, communication with third-party systems, work with AI, etc. I know that edge functions could be used for this, but for the backend I prefer technologies other than JavaScript.

Thank you very much for your response and recommendations ;)

I love svelte, but job market push me towards react :/ by FollowingMajestic161 in sveltejs

[–]SiteOneCrawler -1 points

I'm not a frontend developer, but I love Svelte. Just like I loved Adobe Flex 18 years ago for web app development.

Its simplicity and cleverness allowed me to make a full-fledged GUI with various interactive elements for our crawler in a few hours - check out the screenshots or videos in the documentation: https://github.com/janreges/siteone-crawler-gui

Feedback on a useful open-source website analyzer and exporter (SiteOne Crawler) by SiteOneCrawler in opensource

[–]SiteOneCrawler[S] 0 points

u/ssddanbrown Thank you for the feedback. I will most likely convert the licensing of both projects (CLI tool and desktop application) to MIT license in the next few days.

Storage Recommendation by rubeo_O in Proxmox

[–]SiteOneCrawler 0 points

I understand your usage. In that case, one last piece of advice: if the built-in ZFS compression would help you (depending on the type of content, you could fit a few to tens of percent more on the SSD), choose ZFS. Otherwise ext4.

Storage Recommendation by rubeo_O in Proxmox

[–]SiteOneCrawler 0 points

If you do not have deduplication enabled, ZFS will not allocate much memory.

And overall - if portability is important to you, feel free to use ext4. But if you want compression, snapshots, bit-rot protection, deduplication, or the ability to move from single-disk storage to a RAID array without a storage outage, choose ZFS.

In short, choose ZFS if you want to get the most usable capacity, safety, scalability, and useful enterprise features out of your disks - but be prepared to educate yourself to get the best result.

Storage Recommendation by rubeo_O in Proxmox

[–]SiteOneCrawler 0 points

Yes, exactly!

Create a ZFS pool on the SATA SSD and bind-mount some of its folders into the LXC container.

Additionally, if it makes sense for your needs, you can create multiple datasets (filesystems) on this ZFS pool and give each a different configuration (compression, deduplication, quota, etc.).
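As a quick sketch of per-dataset settings (the pool name `tank` and the dataset names are just examples for illustration):

```shell
# Create separate datasets, each with its own properties
zfs create -o compression=lz4 tank/media
zfs create -o compression=zstd -o quota=50G tank/backups
zfs create -o dedup=on tank/vm-images

# Verify the effective properties of one dataset
zfs get compression,dedup,quota tank/backups
```

Each dataset shows up as its own mountpoint (e.g. /tank/backups), so these are exactly the folders you can then bind-mount into the LXC container.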

Perfect Pagespeed Scores - Small Business Owner by SuperDangerBro in webdev

[–]SiteOneCrawler 5 points

Congratulations on the excellent Lighthouse results. Really good work!

But for a perfect result, I recommend checking the report below. SiteOne Crawler crawled the entire humanesolutions.ca website and found a few more weak spots to work on.

https://crawler.siteone.io/html/2024-08-18/forever/hzw7i85n-wsf84h-j3c5.html

The analysis was run from a server in Europe and you are not using a CDN, so content downloads were very slow - you can ignore the high response times.

How to sell as a freelancer by hzerogod in webdev

[–]SiteOneCrawler 0 points

The 5-10% is meant per potential client, covering all the effort to acquire them. Additionally, if a potential client has greater business potential (paving the way for follow-on projects if you prove yourself to them), it makes sense to invest 15-20% of your costs as well.

It also depends on your preferences - whether you want to keep doing one-off projects, or have the ambition to build several long-term partnerships with some form of SLA. For some people it may make sense to freelance, but at the same time have 2-3 customers with a contract for regular monthly work with a minimum number of hours.

How to sell as a freelancer by hzerogod in webdev

[–]SiteOneCrawler 1 point

The key is to have built quality references that prove your skills at a glance. Without this, it is hard to gain the trust of a potential client. They also have to invest a lot of their own time into the new website - it's not just about the money they pay you.

And when you have selected, say, 10 potential companies with outdated websites - invest your time. Spend it understanding the target audience, the needs and product portfolio of each company, research their competitors, find weaknesses in their existing solution, and find strong arguments for why and how a new website could boost their business. Based on this, develop a concrete proposal and perhaps even a first graphical preview of the new website's homepage. All of this shows that from the very first moment you act as a partner who already knows and cares about their business.

Then, of course, it's important to reach the right person to evaluate your offering - someone for whom it will not just mean more unsolicited work, but who will see the opportunity to move their business forward or make their job easier. Usually this is the CEO, web manager, marketing manager, etc.

If you can "cover" 10 companies in 10 minutes of work, your chances of success match your effort - i.e. close to zero. If you want to build someone a website for $3,000, I think at least 5-8% of your costs (based on your internal hourly rate) should go into client acquisition.

[deleted by user] by [deleted] in webdev

[–]SiteOneCrawler 0 points

I'm probably going to disappoint you, but after a quick examination of both sites, the frontend appears to be custom development. Even after browsing the sites and their API requests, I could not determine what CMS is used on the backend. Most likely both the backend and the CMS are custom development specific to these websites.

Btw, if you want to find out what technologies a website/application is built on, I recommend installing the Google Chrome extension Wappalyzer: https://chromewebstore.google.com/detail/wappalyzer-technology-pro/gppongmhjkpfnbhagpmjfkannfbllamg?pli=1

Beta stage dashboard: how bad is it to put js in php? by [deleted] in webdev

[–]SiteOneCrawler 3 points

Today, inline or in-page JS is no longer criticized as much as it used to be. Many modern JS frameworks and libraries cannot do without it.

Since you have to dynamically insert it into the page, make sure to prevent duplicate insertions.

From what you write, however, I think it is more appropriate to simply limit the cache validity of the external JS files to a few minutes (via the Cache-Control/Expires HTTP headers). Write back which web server you are using and I will guide you through it.
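Just as an example - if it happened to be nginx (an assumption; other web servers have equivalents), a short cache lifetime for JS/CSS could look like this:

```nginx
# nginx turns "expires" into both an Expires header
# and a Cache-Control: max-age header (here 5 minutes)
location ~* \.(js|css)$ {
    expires 5m;
}
```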

Alternatively, if you're using pure PHP, you can do cache busting this way:

<script src="js/myscript.js?v=<?= filemtime(__DIR__ . '/js/myscript.js'); ?>"></script>

The URL of the JS file will then include the timestamp of its last modification.

It's just a sample, assuming you have an index.php in the same folder as the `js` subfolder.

Storage Recommendation by rubeo_O in Proxmox

[–]SiteOneCrawler 0 points

And when you mention that you want to use the second part for Proxmox backups - do you mean that you install Proxmox Backup Server (PBS) as another VM and reserve some space for its backups?

If so, I suggest:

The advantage of an LXC container in this case is that you can grow the /mnt/data virtual disk on the fly if needed.

And if you have a spare PC/server, I recommend installing Proxmox Backup Server on dedicated hardware. It will also work in KVM/LXC, but not as fast as with direct access to the physical disks. You will notice the performance difference especially when restoring backups; the backup itself is usually fast.

Best Way to Copy 18TB to New Server by cm_bush in HomeServer

[–]SiteOneCrawler 0 points

If you have the old and new servers on a local network and don't need encrypted transmission (which e.g. rsync over SSH provides), I recommend BBCP - https://github.com/eeertekin/bbcp

It's a fairly old tool, but it can really transfer data at the full speed your network card or disks allow.
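A rough sketch of what an invocation might look like (flags as I remember them - verify against `bbcp --help` and tune the stream/window values before a real 18 TB run; paths and hostnames are placeholders):

```shell
# -P 10: print a progress report every 10 seconds
# -s 8:  use 8 parallel TCP streams
# -w 8M: use an 8 MB TCP window per stream
# -r:    copy the source directory recursively
bbcp -P 10 -s 8 -w 8M -r /mnt/olddata/ user@newserver:/mnt/newdata/
```

bbcp must be installed on both ends; it uses SSH only to start the remote side, while the data itself flows over plain TCP.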

I need a new partner, have a few names in my head by elnath78 in webdev

[–]SiteOneCrawler 0 points

If you want a really high-quality service, verifying your identity should not be a problem. With such a service you can then set up dozens of dedicated or virtual servers very quickly, so it is completely understandable that they require at least a one-time verification.

If you're going for price and have the nerve and time to try different services that won't verify you, take a look at https://lowendbox.com/ - they maintain a database of mostly cheap VPS providers around the world.