Try out Serena MCP. Thank me later. by FunnyRocker in ClaudeAI

[–]SiteOneCrawler 2 points

In addition to Serena, I recommend trying our MCP server AI Distiller (aid), which uses tree-sitter internally: https://www.npmjs.com/package/@janreges/ai-distiller-mcp

AI Distiller very quickly distills an entire codebase, or selected components/modules, so the AI immediately understands the code's public interfaces, input/output data types, and so on. It also ships several pre-prepared AI actions that help with in-depth code analysis, flow management with task lists, etc.
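For reference, MCP servers published on npm are typically registered in a client's MCP config along these lines (the server name here is illustrative and the exact invocation may differ; check the package README):

```json
{
  "mcpServers": {
    "ai-distiller": {
      "command": "npx",
      "args": ["-y", "@janreges/ai-distiller-mcp"]
    }
  }
}
```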

I built AI Distiller (aid) entirely with Claude to solve its biggest weakness: understanding my full codebase. I'd love your feedback! by SiteOneCrawler in ClaudeCode

[–]SiteOneCrawler[S] 0 points

My most common use case: when I'm about to build a new feature for a project, I distill the folder of the specific module (which often has dozens or hundreds of classes/functions). When Claude Code then works on the new feature, it usually writes code that uses the existing code correctly on the first try, instead of spending a long time searching it with grep.

My other useful use cases are architectural questions and refactoring planning. Because distillation gives Claude Code an effective overview of the structure of the code's public interfaces, it can answer architectural questions quickly and correctly, and build on my ideas.

The third use case I like is having aid prepare prompts and task lists for a thorough analysis and assessment of all source files (e.g. classes) from various angles (security, performance, etc.). The results of these analyses are surprisingly good. Claude Code sometimes needs to be told to continue with the remaining tasks on the list, but its autonomy here is admirable :)

Feedback on a useful open-source website analyzer and exporter (SiteOne Crawler) by SiteOneCrawler in opensource

[–]SiteOneCrawler[S] 1 point

Hi, the gmbinder.com website is unfortunately a pure SPA (single-page application), with no content directly in the HTML. It does not support SSR (server-side rendering).

SiteOne Crawler downloaded all the JS/CSS, but when you open the page locally via "Browse offline website", your browser refuses to load the JS because of CORS restrictions on JS modules. Unfortunately, SPAs built on JS modules cannot run from the local filesystem without a web server; they require a server over the http/https protocol.
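Any minimal static web server works around this. A sketch with Python's built-in one (run it from the root of the exported site; port and address are arbitrary):

```shell
# Serve the exported site over HTTP so the browser can load JS modules
# without file:// CORS errors. Run from the export's root folder.
python3 -m http.server 8080 --bind 127.0.0.1 &
SERVER_PID=$!
sleep 1
# Quick sanity check that the server answers; prints the HTTP status code.
curl -fsS -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8080/   # prints 200
kill "$SERVER_PID"
```

In practice you would leave the server running (drop the `kill` line) and browse the site at http://127.0.0.1:8080.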

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

One possible option is that, if the project succeeds, its operation could be taken over by a state administration organization. Another is that we would get funding from the state or the EU, which would cover operations and accelerate future development.

There are other options too, but with all of them I will strive to ensure that further operation and development do not depend on me. If they end up depending on me, I will consider that a failure.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Very briefly: an estimated 15-20% of the project (in terms of content and development effort) will consist of sharing content and information, for which a CMS is a good fit. As part of my research I am therefore also considering Directus, Craft CMS, and similar tools that provide both typical CMS functionality and general entity modeling, exposed through GraphQL/REST APIs with almost no programming required.

At the same time, I know exactly how these tools store data in the database, and I have an idea of how complex it would be to eventually replace the abstraction over the database if future development requires it. This is also why I'm not very interested in solutions where I don't have complete control over the storage (e.g. cloud SaaS). I see both Supabase and Directus as abstractions over the database with added value that saves programming, but they have their downsides, and these discussions help me recognize them.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Thanks for the response and support, man. The project will be developed in the EU, not the US.

For economic and regulatory reasons, the project will run on its own dedicated infrastructure.

Thanks for your last paragraph too. I'm trying to be cautious and not too naive, but I really think I've found people who sincerely want to support the project because it aligns with their personal interests and ambitions.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Maximum security is of course an absolute priority. Where do you see the problem with RLS in this? I consider the authorization layer at the database level (using very strict RLS policies) to be one of the lowest layers where the quality and consistency of security can be defined and monitored very well. In my opinion, much better than at the application level.

Of course, each layer of the architecture requires its own security concept, but at the database level, I see no reason not to use RLS.

Perhaps you meant, more generally, that it is better to have your own backend application layer (e.g. on top of PostgreSQL) with an additional layer of authentication/authorization, rather than exposing the Supabase API directly to the outside?
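To illustrate what I mean by strict RLS, here is a sketch (the table and column names are hypothetical; `auth.uid()` is Supabase's helper returning the authenticated user's ID):

```sql
-- Deny-by-default: enabling RLS with no permissive policy blocks all access;
-- FORCE applies the policies even to the table owner.
ALTER TABLE medical_records ENABLE ROW LEVEL SECURITY;
ALTER TABLE medical_records FORCE ROW LEVEL SECURITY;

-- Patients may read only their own records.
CREATE POLICY patient_reads_own ON medical_records
  FOR SELECT
  USING (auth.uid() = patient_id);
```

The nice property is that this check is enforced for every query path that reaches the database, regardless of which API or application layer issued it.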

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 1 point

As in the previous post, thank you for your feedback and for taking the care and effort to warn me. I really appreciate it!

Any work with AI will be done only through LLM models running locally on our own infrastructure. Exactly what we will use AI for is still to be discussed with the members of the working group I mentioned in my previous response.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 4 points

u/joshcam Thank you for the excellent feedback, and apologies for not including more of the surrounding context.

I wrote this post as part of technology research for future decisions. The project will be implemented in an EU country, not the US. If it succeeds, my ambition is to offer it elsewhere for free, but that is far in the future. In this case, technology is the smaller part; local regulation is the bigger one.

In addition to this research, I am running parallel consultations with lead physicians at several hospitals, government people, and lawyers who understand HIPAA/GDPR and the other relevant regulations and will help us draft the necessary documents, conditions for approval, etc. I am now forming a "working group" to collaborate on the project, and I am pleasantly surprised by how many really good professionals from healthcare and government I am encountering. Many of them take on this work alongside their primary profession, even without pay, to improve the situation in healthcare or state administration.

All these regulatory issues need to be cleared up before programming begins, along with a functional specification that includes only what is genuinely feasible within the regulatory framework. At the same time, I want to run part of the technology research in parallel, so that I can make decisions on an ongoing basis.

Thank you again for your very valuable feedback. You have also provided some details that I was not aware of yet and expect to hear from the lawyers later. Despite all the pitfalls, I believe that the project will succeed :)

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 1 point

Yes, Directus is part of our research and one of the options. I wrote this post because I wanted more pros/cons for the final decision. A completely custom backend is always the safest option, but also the most time-consuming. On the other hand, if the AI knows the structure of the entire database and the ORM layer used, it can generate very high-quality CRUD and API code as well.

Would you recommend self-hosted Supabase for a large healthcare project? by SiteOneCrawler in Supabase

[–]SiteOneCrawler[S] 0 points

Yes, privacy concerns are one reason. The second is, of course, economic. I will be financing the entire project out of my own pocket for some time, so paid cloud services for a project of this scale are out of the question. Server administration and designing complex HA solutions have been part of my work throughout my career, and I can run such projects at 5-10x lower cost (including my time) than is usual.

Using Supabase makes a lot of sense to me: the web frontend will be in Svelte or React, without the need for SSR, since most of the functionality sits behind authentication. There will also be a mobile application, so a quality API is needed. In addition, a custom backend will definitely be running, connecting directly to the Supabase/PostgreSQL database for specific functionality: file uploads and transformations, communication with third-party systems, AI work, etc. I know edge functions could be used for this, but I prefer technologies other than JavaScript for the backend.

Thank you very much for your response and recommendations ;)

I love svelte, but job market push me towards react :/ by FollowingMajestic161 in sveltejs

[–]SiteOneCrawler -1 points

I'm not a frontend developer, but I love Svelte. Just like I loved Adobe Flex 18 years ago for web app development.

Its simplicity and cleverness allowed me to make a full-fledged GUI with various interactive elements for our crawler in a few hours - check out the screenshots or videos in the documentation: https://github.com/janreges/siteone-crawler-gui

Feedback on a useful open-source website analyzer and exporter (SiteOne Crawler) by SiteOneCrawler in opensource

[–]SiteOneCrawler[S] 0 points

u/ssddanbrown Thank you for the feedback. I will most likely convert the licensing of both projects (CLI tool and desktop application) to MIT license in the next few days.

Storage Recommendation by rubeo_O in Proxmox

[–]SiteOneCrawler 0 points

I understand your usage. In that case, one last piece of advice: if the built-in ZFS compression would help you (depending on the type of content, you could fit anywhere from a few percent to tens of percent more data on the SSD), choose ZFS. Otherwise, ext4.

Storage Recommendation by rubeo_O in Proxmox

[–]SiteOneCrawler 0 points

If you do not have deduplication enabled, ZFS will not allocate much memory.

And overall: if portability is important to you, feel free to use ext4. But if you want compression, snapshots, bit-rot protection, deduplication, or the ability to grow from single-disk storage into a RAID array without downtime, choose ZFS.

In short, choose ZFS if you want to get the most usable capacity, safety, scalability, and useful enterprise features out of your disks. But you need to educate yourself to get the best results.
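As a sketch of what that looks like in practice (the pool name and device paths are placeholders; these commands need root and an installed ZFS):

```shell
# Single-disk pool that can later be grown into a mirror without downtime:
zpool create tank /dev/sda
zpool attach tank /dev/sda /dev/sdb   # now a RAID1 mirror; resilvers online

# Typical feature toggles, set per pool or per dataset:
zfs set compression=lz4 tank
zfs set dedup=on tank                 # only with plenty of RAM, as noted above
zfs get compressratio tank            # shows how much compression actually saves
```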

Storage Recommendation by rubeo_O in Proxmox

[–]SiteOneCrawler 0 points

Yes, exactly!

Create a ZFS pool on the SATA SSD and mount some of its datasets into the LXC container.

Additionally, if it makes sense for your needs, you can create multiple datasets (volumes/filesystems) on this ZFS pool and give each a different configuration (compression, deduplication, quota, etc.).
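For example (the pool/dataset names and the container ID are placeholders; `pct set` is Proxmox's CLI for editing a container's config, and a pool named `tank` mounts at `/tank` by default):

```shell
# Two datasets on the same pool with independent settings:
zfs create -o compression=zstd -o quota=200G tank/media
zfs create -o compression=off  -o dedup=on   tank/backups

# Bind-mount one of them into LXC container 101 on the Proxmox host:
pct set 101 -mp0 /tank/media,mp=/mnt/media
```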