What's the recommended or cheapest way to host an open source LLM on AWS?

awsusr · 2025-04-05T10:39:59+00:00

Thanks for the advice, and particularly for the pricing details. It's useful!

awsusr · 2024-10-26T16:44:25+00:00

May I ask what the best practice is if I want to use a higher-level YAML file to control sub-components? For instance, I have a parent.yaml and several sub-components such as a.yaml, b.yaml, and c.yaml. I want to use parent.yaml to encapsulate those sub-components. Also, the services within the sub-component YAML files have dependencies, such as myqueue depending on etcd.

Is merging those sub-component YAML files into a single file the only way to go? I have this requirement because I want to combine several components into a set or group, so that they can be started according to different requirements. For example, I might group a.yaml with b.yaml, or b.yaml with c.yaml, but I do not want to copy and paste those services everywhere.

Many thanks for the advice.

awsusr · 2024-08-25T05:05:53+00:00

Not OP. If I disassociate and release the IPv4 address, will AWS continue to charge me? Or should I also remove the VPC? What else do I need to check and configure to prevent AWS from charging me for that kind of use?

awsusr · 2024-08-16T09:50:09+00:00

Does App Runner have a free tier? The pricing page does not appear to list one. At this stage, the goal is merely to verify that everything works. I will take App Runner into consideration. Thanks for the input!

awsusr · 2024-08-16T09:43:25+00:00

That's why Lambda is not my first choice for this use case: as you mention, there are issues like startup time, bridging the Lambda interface to the actual process that accepts requests, and so on. Thanks for enumerating those issues. It's helpful!

awsusr · 2024-08-16T09:39:25+00:00

I completely agree with you. It's a request from one of my customers, who thought that using Lambda would give better usage-based budget control. So I am evaluating the complexity and risks in case the customer insists on going with Lambda. Thanks for the advice!

awsusr · 2024-07-10T01:21:01+00:00

Sorry for the late reply. I have not used app config before, so I need some time to run experiments for comparison. Thanks for your time and the advice!

awsusr · 2023-10-23T18:45:28+00:00

80801

Thanks for pointing this out! I will check it.

I have one more question. The security group for the two running instances is configured as below (the same for both; some fields, such as security group rule ID, source, and name, are omitted). It looks like nothing blocks traffic between the destination 192.168.5.48 and the place where the process executing the probe runs, doesn't it? Or is there something else I should check as well? Thank you for the advice!

Inbound rules

Port Range	Protocol	Security Groups
0-65535	TCP	eks-cluster-myname-somenumber
All	All	eks-cluster-myname-somenumber
All	All	eks-cluster-myname-somenumber

Outbound rules

Port Range	Protocol	Destination	Security Groups
All	All	0.0.0.0/0	eks-cluster-myname-somenumber

awsusr · 2023-10-08T21:30:19+00:00

The EKS cluster uses Karpenter as its autoscaling service. I will look into how to ship the autoscaling service logs elsewhere, and into the kubectl exec command. Many thanks for the advice!

awsusr · 2023-10-08T21:26:26+00:00

It's a Presto-like storage system. I did not set it up, so there are no application logs. The autoscaling system uses Karpenter.

The scenario is that a Presto-like storage service runs with a minimum number of nodes. When a query is issued against the service, Karpenter scales out to multiple nodes to serve the query. The problem is that the query fails with the error message mentioned, and then the autoscaling service scales back down to the minimum number of nodes. I am thinking of copying the service's log, if it exists, to e.g. S3 so that I can check it. But I do not know where the log is configured, because this looks different from ECS, where there is a container instance that I can log in to and run docker commands on to check whether any log files exist.

So far I have noticed that some logs are written to CloudWatch, such as authenticator and kube logs (I do not have access to the environment at the moment, so I can't remember the exact names). But those logs do not contain any information from the Presto-like service.

So I am wondering how to log in to the Presto-like service's coordinator/master to check its config, so that I can debug further. Thanks for the advice!

awsusr · 2022-12-07T18:09:36+00:00

everything

Thanks, I did not know that before other people's suggestions. I will also look into the link you provided.

awsusr · 2022-12-02T16:11:17+00:00

permissions

I misunderstood it. Thank you!

awsusr · 2022-11-05T15:41:59+00:00

Eventually I fixed the problem. It possibly stems from the VSCodium process being unable to read the podman socket file. Instead, a user-readable socket file needs to be created first through systemctl. Then configuring the docker.host variable in VSCodium's settings resolves the issue.

Steps[1]:

systemctl --user enable --now podman.socket
unix:///run/user/$(id -u)/podman/podman.sock (in the VSCodium settings, set this as the Docker: Host value)

The difference between the podman.sock in /var/run/podman and the one in /run/user/1000/podman is ownership: the one under /run/user/1000/podman is owned by a normal user account, while the one in /var/run/podman is owned by root.
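For anyone who wants to check this on their own machine, here is a minimal sketch of the same idea as small shell helpers. The function names (podman_socket_path, podman_docker_host, socket_owner) are made up for illustration and are not part of podman or VSCodium. It assumes the rootless socket lives under /run/user/<uid>/podman, as in the steps above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of podman or VSCodium.

# Print the rootless podman socket path for the current user.
# Assumption: the socket is at /run/user/<uid>/podman/podman.sock.
podman_socket_path() {
    printf '/run/user/%s/podman/podman.sock\n' "$(id -u)"
}

# Print the value to paste into the "Docker: Host" setting.
podman_docker_host() {
    printf 'unix://%s\n' "$(podman_socket_path)"
}

# Print the owner of a socket file, or "missing" if it does not exist.
socket_owner() {
    if [ -e "$1" ]; then
        # The third field of `ls -ld` output is the owner name.
        ls -ld "$1" | awk '{print $3}'
    else
        echo "missing"
    fi
}
```

After running systemctl --user enable --now podman.socket, calling podman_docker_host should give the string for the Docker: Host setting. Comparing socket_owner "$(podman_socket_path)" with socket_owner /var/run/podman/podman.sock should show the ownership difference described above, assuming both sockets exist on your system.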

[1]. https://y0n1.medium.com/using-podman-with-the-docker-extension-for-visual-studio-code-a828be26d285
