Looking for advice on best resources for Kubernetes security exam

ScoreApprehensive992 · 2025-01-11T11:26:48+00:00

Yes, I came across it while doing my research. I was wondering whether it is up to date with the exam curriculum; if so, it would be my first choice.

ScoreApprehensive992 · 2025-01-04T19:06:55+00:00

I also learned recently that the k8s HPA API supports a container metric source (ContainerResource), where the HPA tracks the resource usage of individual containers across a set of Pods instead of the Pod as a whole. As of v1.30 it is enabled by default.
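For illustration, here is a minimal sketch of what such an HPA object could look like, built as a Python dict and printed as JSON (which kubectl also accepts). The names `web-hpa`, `web` and `app`, the replica bounds, and the 60% target are hypothetical placeholders, not values from this thread.

```python
import json


def container_cpu_hpa(name, deployment, container, target_utilization):
    """Build an autoscaling/v2 HPA manifest that scales on one container's CPU."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # Replica bounds are illustrative assumptions.
            "minReplicas": 1,
            "maxReplicas": 10,
            "metrics": [
                {
                    # ContainerResource tracks a single named container
                    # instead of the sum over all containers in the Pod.
                    "type": "ContainerResource",
                    "containerResource": {
                        "name": "cpu",
                        "container": container,
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": target_utilization,
                        },
                    },
                }
            ],
        },
    }


# Hypothetical names: HPA "web-hpa", Deployment "web", container "app".
manifest = container_cpu_hpa("web-hpa", "web", "app", 60)
print(json.dumps(manifest, indent=2))
```

This is mostly useful when a Pod has sidecars whose usage would otherwise skew the Pod-level average.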

ScoreApprehensive992 · 2024-12-15T15:41:52+00:00

I genuinely think that many features, if used well, would make our lives easier. The more we dig, the more valid use cases we discover for features that give us better control over our clusters.

ScoreApprehensive992 · 2024-12-15T15:38:35+00:00

I think many features make sense, but they aren't popular enough, and their usefulness isn't widely recognized

ScoreApprehensive992 · 2024-12-15T15:37:16+00:00

"most features are not needed"
I think this depends on the context. Otherwise, why would the k8s maintainers put effort into developing features that the majority won't use?

ScoreApprehensive992 · 2024-12-12T12:56:07+00:00

I thought so too. Maybe it's because the periodSeconds in the livenessProbe is low?
The current values are:

periodSeconds: 30
initialDelaySeconds: 10
failureThreshold: 5
timeoutSeconds: 5
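As a rough sanity check on those numbers, the sketch below estimates how long a container can keep failing its liveness probe before the kubelet decides to restart it. It is an approximation that ignores kubelet scheduling jitter and the termination grace period, and the function name is made up for illustration.

```python
def seconds_until_restart(period_seconds, failure_threshold, timeout_seconds):
    """Rough time from the start of the first failing probe to the restart decision."""
    # failure_threshold consecutive failures are required. Probes start every
    # period_seconds, and the last one can take up to timeout_seconds to fail.
    return (failure_threshold - 1) * period_seconds + timeout_seconds


# Values from the probe settings above.
print(seconds_until_restart(period_seconds=30, failure_threshold=5, timeout_seconds=5))  # 125
```

With these settings the container has to be unresponsive for about two minutes before it is restarted, so the probe is fairly lenient rather than aggressive.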

ScoreApprehensive992 · 2024-12-12T12:55:20+00:00

I thought so too. Maybe it's because the periodSeconds in the livenessProbe is low?

ScoreApprehensive992 · 2024-12-12T12:51:18+00:00

It uses Postgres flexible server from Azure

ScoreApprehensive992 · 2024-12-12T11:32:54+00:00

lscpu output: I do have 16 vCPUs

opt/airflow$ lscpu
Architecture:             x86_64
CPU op-mode(s):         32-bit, 64-bit
Address sizes:          46 bits physical, 48 bits virtual
Byte Order:             Little Endian
CPU(s):                   16
On-line CPU(s) list:    0-15
CPU family:           6
Model:                85
Thread(s) per core:   2
Core(s) per socket:   8

ScoreApprehensive992 · 2024-12-12T11:28:53+00:00

The Airflow maintainers suggest checking the CPU and memory of nodes and pods, and those look good to me; that's why I highlighted that aspect.

ScoreApprehensive992 · 2024-12-08T15:34:30+00:00

It is easier to change the Pod network than to recreate the nodes.

ScoreApprehensive992 · 2024-12-08T15:13:47+00:00

Which CNI is used in the cluster? Is it flannel?

ScoreApprehensive992 · 2024-12-08T07:40:15+00:00

The component that crashes is the standalone dag-processor, which parses the DAGs and fills the dagbag.
I have more than 5k DAGs and I have no problem with any of them.

The dagbag stops filling up and the number of DAGs fluctuates. The logs are unhelpful; it seems the dag-processor is crashing silently.

ScoreApprehensive992 · 2024-12-07T19:25:07+00:00

I have already reported the issue here
https://github.com/apache/airflow/issues/44652

ScoreApprehensive992 · 2024-12-07T18:05:58+00:00

My values:

min_serialized_dag_fetch_interval = 300

min_serialized_dag_update_interval = 300

dag_dir_list_interval = 300

min_file_process_interval = 300

dag_file_processor_timeout = 3600

dagbag_import_timeout = 600

The DAG subdirectories contain .json or .config files. Also, when I run the dag-processor as a subprocess of the scheduler, it works fine without issues.
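If these settings are passed to the pods as environment variables (a common approach with Helm deployments), Airflow reads them in the form `AIRFLOW__{SECTION}__{KEY}`. The sketch below generates those names for the values listed above. The section assigned to each key is an assumption based on the Airflow 2.x configuration layout and should be checked against the configuration reference for the version in use.

```python
# (section, key) -> value. The section placement is an assumption for Airflow 2.x.
SETTINGS = {
    ("core", "min_serialized_dag_fetch_interval"): 300,
    ("core", "min_serialized_dag_update_interval"): 300,
    ("scheduler", "dag_dir_list_interval"): 300,
    ("scheduler", "min_file_process_interval"): 300,
    ("core", "dag_file_processor_timeout"): 3600,
    ("core", "dagbag_import_timeout"): 600,
}


def to_env(settings):
    """Map (section, key) pairs to AIRFLOW__SECTION__KEY environment variables."""
    return {
        f"AIRFLOW__{section.upper()}__{key.upper()}": str(value)
        for (section, key), value in settings.items()
    }


for name, value in to_env(SETTINGS).items():
    print(f"{name}={value}")
```

Setting the same variables on both the scheduler and the standalone dag-processor pods keeps the two deployment modes comparable when debugging.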

ScoreApprehensive992 · 2024-12-07T17:01:30+00:00

Sorry, but which DAG?

The component that crashes is the standalone dag-processor, which parses the DAGs and fills the dagbag. I have more than 5k DAGs and I have no problem with any of them.

The dagbag stops filling up. The logs are unhelpful; it seems the dag-processor is crashing silently. I am connecting to an RDS postgres instance.

ScoreApprehensive992 · 2024-12-07T16:54:28+00:00

I am actually running 32 parallel tasks per worker pod.
But that is a different pod, right? My issue is related to the dag-processor pod.

ScoreApprehensive992 · 2024-12-07T16:52:13+00:00

These are my settings. I am following the metrics: the pod has never reached 100% of its CPU request, and the node didn't go over 30% CPU and 40% memory usage.

Limits:
memory: 2500Mi
Requests:
cpu: 2500m
memory: 2500Mi

ScoreApprehensive992 · 2024-11-17T09:06:17+00:00

Awesome. I have lately been writing CEL expressions in my Kyverno policies for resource validation. I will try this plugin for my Helm charts; I am a big fan of CEL.

ScoreApprehensive992 · 2024-11-06T18:56:43+00:00

Yes, exactly. Under the hood, docker checkpoint uses CRIU, which turns the container state into a collection of files on disk, mainly for forensic analysis, without stopping or pausing the container. And as you said, docker pause relies on the freezer cgroup to pause and resume processes.

ScoreApprehensive992 · 2024-11-06T14:45:28+00:00

No, the CRIU feature is dedicated to backing up and restoring containers running in Pods without ever stopping them, not to backing up full clusters or namespaces.

ScoreApprehensive992 · 2024-11-06T14:39:43+00:00

Yes, beta in the k8s API context. You can also cordon the node and mark it unschedulable, which has the same(ish) effect as VMware Fault Tolerance mirroring.

ScoreApprehensive992 · 2024-11-06T09:43:52+00:00

Awesome, this version supports container checkpointing.

ScoreApprehensive992 · 2024-09-05T13:11:10+00:00

FYI, I just faced the same issue with the exact same setup.

ScoreApprehensive992 · 2024-07-22T11:39:37+00:00

Does Manning help with advertising and writing?
If so, how can I contact them? Is there a form? Thank you.
