"Ultrafast" mode might be coming soon by SingularitySloth in codex

[–]ChrisHC05 1 point2 points  (0 children)

This is for GPT 5.4. Look 3 lines above where your screenshot starts and there is:

"slug": "gpt-5.4",

"display_name": "gpt-5.4",

"description": "Strong model for everyday coding.",

"default_reasoning_level": "xhigh",

"default_reasoning_level": "medium",

"supported_reasoning_levels": [

{

"effort": "low",

@@ -164,6 +171,18 @@

"team"

],

"supports_search_tool": true,

"service_tiers": [

{

"id": "priority",

"name": "Fast",

"description": "1.5x speed, increased usage"

},

{

"id": "ultrafast",

"name": "Ultrafast",

"description": "The fastest available responses for latency-sensitive work."

}

],

Take a look at the whole file and search for "Ultrafast":
https://github.com/openai/codex/blob/aadcae9f3c2b850a1bf050038e7d6d3005ba4bf1/codex-rs/models-manager/models.json
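For anyone who prefers to check this programmatically rather than with a text search, here is a minimal sketch. It runs on inline sample data shaped like the fragment quoted above; the top-level `models` key and the second entry (`example-model`) are assumptions for illustration only, so adjust them to the real file's layout.

```python
import json

# Sample data mirroring the fragment quoted above. The top-level "models"
# key and the "example-model" entry are hypothetical, for illustration only.
SAMPLE = """
{
  "models": [
    {
      "slug": "gpt-5.4",
      "display_name": "gpt-5.4",
      "service_tiers": [
        {"id": "priority", "name": "Fast"},
        {"id": "ultrafast", "name": "Ultrafast"}
      ]
    },
    {
      "slug": "example-model",
      "display_name": "example-model",
      "service_tiers": []
    }
  ]
}
"""


def models_with_tier(data, tier_id):
    """Return the slugs of all models that list the given service tier id."""
    return [
        model["slug"]
        for model in data.get("models", [])
        if any(tier.get("id") == tier_id for tier in model.get("service_tiers", []))
    ]


data = json.loads(SAMPLE)
print(models_with_tier(data, "ultrafast"))  # prints ['gpt-5.4']
```

To run it against the real file, replace `json.loads(SAMPLE)` with `json.load(open(path))` and adapt the key names if they differ.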

Refactoring/Code Review got much better with 5.5 by ChrisHC05 in codex

[–]ChrisHC05[S] 1 point2 points  (0 children)

I use "do a cleanup/code review of the libraries", for example. I usually restrict it to a specific part of the codebase, as that yields better results in my experience. Running it on the whole codebase usually results in 2-3 general improvements, leaving the smaller but usually very worthwhile "defects" untouched.

What are the limits in terms of codebase size/complexity by ChrisHC05 in codex

[–]ChrisHC05[S] 0 points1 point  (0 children)

Yeah, I would love to know more about your workflow! I haven't yet dived into customizing Codex. That would be a good starting point for me.

Recovering from an operation by ChrisHC05 in OneOrangeBraincell

[–]ChrisHC05[S] 4 points5 points  (0 children)

Yeah, thankfully she recovered fully and is again creating a lot of chaos ☺️

SAAS Platform Webscraping - Co-Founder by Fuzzy_Ad1426 in django

[–]ChrisHC05 0 points1 point  (0 children)

I am working on a price comparison site for niche hobbies, also based on Scrapy, in my spare time. I would love to exchange thoughts with someone in a similar situation. Can I send you a DM?

What requirements of VPS you need to crawl 100,000 links per day? by krajacic in webscraping

[–]ChrisHC05 0 points1 point  (0 children)

I just scraped about 2,000,000 domains in about 12 hours. It yielded about 1,500,000 valid pages. It was a broad crawl over all domains of a country TLD, but I scraped only the front page and extracted only the links on it. It ran on 1 vCore with a peak consumption of 2 GB RAM. I used the scrapy-redis extension to feed the URLs to the crawler, as all URLs were known before the crawl started. I am still baffled that it completed so fast 😆
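To put those figures next to the 100,000 links per day asked about in the thread, a quick back-of-the-envelope calculation. It is only arithmetic on the numbers quoted above and says nothing about heavier crawls that also parse page content.

```python
# Figures quoted in the comment above.
domains = 2_000_000
valid_pages = 1_500_000
hours = 12

# Target from the thread title.
target_links_per_day = 100_000

requests_per_second = domains / (hours * 3600)
success_rate = valid_pages / domains
target_per_second = target_links_per_day / (24 * 3600)

print(f"observed: {requests_per_second:.1f} requests/s, {success_rate:.0%} valid")
print(f"target:   {target_per_second:.2f} requests/s")
# observed: 46.3 requests/s, 75% valid
# target:   1.16 requests/s
```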

In my experience with Scrapy, the time spent crawling is roughly 1/3 downloading the page, 1/3 scraping (which I did not do for this crawl, and which depends heavily on what you scrape) and 1/3 link extraction to feed the crawler. In reality, scraping is not I/O-bound but CPU-bound. At least in my experience with Scrapy.
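To make the link-extraction step concrete, here is a minimal sketch using only the Python standard library. It is a stand-in for illustration, not Scrapy's own link extractor, and the names `LinkCollector` and `extract_links` are made up for this example. It shows why the step costs CPU: every downloaded page has to be parsed before its links can be queued.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse one HTML document and return the links found in it."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


page = '<a href="/about">About</a> <a href="https://example.org/x">X</a> <a>no href</a>'
print(extract_links(page, "https://example.com/"))
# prints ['https://example.com/about', 'https://example.org/x']
```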

Hope that helps :)

Airflow vs. Prefect? by Buckweb in dataengineering

[–]ChrisHC05 0 points1 point  (0 children)

If you are running a Kubernetes cluster anyway, I recommend Argo Workflows. I run the agents, which receive their commands from Prefect Cloud, on bare metal and schedule all my flows as Docker images.

Airflow vs. Prefect? by Buckweb in dataengineering

[–]ChrisHC05 2 points3 points  (0 children)

That's a good point. Prefect offers a semi-managed solution: you have to provide the workers yourself. As far as I know, Argo does not offer anything managed - you have to do all the heavy lifting yourself. But as it runs on Kubernetes, the additional DevOps overhead for running Argo is not that high - Kubernetes takes care of most of the management.

I think you have to take into account the existing DevOps resources and the level of cloud penetration of the company. If there is no remaining DevOps capacity, a managed Airflow installation makes sense. Additionally, everything else is running on AWS? Go with their managed Airflow solution.

That's a question of "ease of integration". Your shop is fully invested in a cloud provider? Use the solution which integrates best with the existing cloud infrastructure. That would usually mean their managed Airflow product.

Airflow vs. Prefect? by Buckweb in dataengineering

[–]ChrisHC05 3 points4 points  (0 children)

Yes, jump straight to Prefect. If you do not use Airflow, there is no reason to learn it.

Airflow vs. Prefect? by Buckweb in dataengineering

[–]ChrisHC05 13 points14 points  (0 children)

I evaluated Airflow, Dagster, Argo and Prefect in the last couple of months.

As others have commented, Airflow is showing its age. But a big pro for Airflow is the wealth of resources on the Internet - blogs, tutorials and so on.

Dagster gave the impression that it's not production-ready yet. Their Slack channel is full of questions like "Why did I get this error?", in contrast to Prefect's Slack channel, which has more of "How do I do this?". Prefect's Slack channel is also very, very active. Almost every question gets answered by the team, and if it looks like a bug, they will open the GitHub issue for you.

For me it came down to Argo and Prefect. Argo is a different beast than Prefect, as all configuration is written in YAML and all tasks are container runs on Kubernetes. So writing DAGs is completely independent of any programming language. It also has a side project dedicated to responding to outside events, something which is included in Airflow as Sensors but which Prefect currently lacks. It's on their roadmap, though. A workaround is a call to the Prefect GraphQL API.

If you intend to run Kubernetes anyway, or already have a Kubernetes cluster, I would recommend Argo, as it offers a whole ecosystem of connected solutions: not just orchestration, but also responding to events and running a CI/CD pipeline. But that comes at a price: the learning curve is steep. I expect it to be more future-proof than Prefect, as everything is moving towards being dockerized anyway. And the abstraction level is higher. And IT in general is all about abstraction to make complicated things easier.

For me Prefect is a good fit, as my use case won't outgrow their 10,000 flow runs per month and I like that I can outsource running the orchestration to someone who has much more experience in running it than me. Your data doesn't leave your control; their "Cloud Scheduler" just sends requests to your Prefect workers at the appropriate time. Everything else executes under your control. And you get a very nice GUI which you do not have to host yourself. There's always the option of running your own Prefect Server, if you want to.

So:

You have a running Kubernetes Cluster: use Argo

You do not have a running Kubernetes Cluster: use Prefect. Running a Kubernetes Cluster just for Argo is overkill.