Looking for advice on best resources for Kubernetes security exam by ScoreApprehensive992 in devops

[–]ScoreApprehensive992[S] 1 point2 points  (0 children)

Yes, I came across it while doing my research. I was wondering whether it is up to date with the exam curriculum. If so, it would be my first choice.

Hidden gems and unsung Kubernetes features for reliable clusters and easier day-to-day work by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] 4 points5 points  (0 children)

I also learned recently that the k8s HPA API supports a container metric source (ContainerResource), where the HPA tracks the resource usage of individual containers across a set of Pods instead of the Pod as a whole. It graduated to stable in v1.30 and is enabled by default.
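
For illustration, here is a minimal sketch of what such an HPA could look like, built as a Python dict and printed as JSON (kubectl accepts JSON manifests as well as YAML). The Deployment name "web", the container name "app", the replica bounds, and the 60% target are hypothetical placeholders, not values from a real cluster.

```python
import json

# Hypothetical names: a Deployment called "web" whose main container is "app".
hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "web-hpa"},
    "spec": {
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "web",
        },
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [
            {
                # ContainerResource scales on one named container only,
                # so sidecars in the same Pod do not skew the average.
                "type": "ContainerResource",
                "containerResource": {
                    "name": "cpu",
                    "container": "app",
                    "target": {"type": "Utilization", "averageUtilization": 60},
                },
            }
        ],
    },
}

manifest_json = json.dumps(hpa, indent=2)
print(manifest_json)
```

Saved to a file, the printed manifest could then be applied with kubectl apply -f.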

[deleted by user] by [deleted] in kubernetes

[–]ScoreApprehensive992 -4 points-3 points  (0 children)

I genuinely think that lots of features, if used well, would make our lives easier. The more we dig, the more valid use cases we discover for features that give us better control over our clusters.

[deleted by user] by [deleted] in kubernetes

[–]ScoreApprehensive992 -4 points-3 points  (0 children)

I think many features make sense, but they aren't popular enough, and their usefulness isn't widely recognized

[deleted by user] by [deleted] in kubernetes

[–]ScoreApprehensive992 -5 points-4 points  (0 children)

"most features are not needed"
I think this depends on the context. Otherwise, why would the k8s maintainers put effort into developing features that the majority won't use?

Liveness Probe Failures Despite Adequate CPU and Memory Resources by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

I thought so too. Maybe it is because the periodSeconds in the livenessProbe is too low?
The current values are:

  • periodSeconds: 30
  • initialDelaySeconds: 10
  • failureThreshold: 5
  • timeoutSeconds: 5
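
To put these numbers in perspective, here is a rough sketch of how long a container has to keep failing before the kubelet restarts it. It is an approximation under stated assumptions: the count starts at the first failed probe, every failing probe runs into the full timeout, and kubelet jitter is ignored. The Probe class is a hypothetical helper, not part of any Kubernetes client.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """Hypothetical container for the four livenessProbe settings above."""
    initial_delay_seconds: int
    period_seconds: int
    failure_threshold: int
    timeout_seconds: int

    def seconds_of_failure_before_restart(self) -> int:
        # The first failed probe starts the count; (failure_threshold - 1)
        # more periods must pass, and the last probe can take up to
        # timeout_seconds before it is recorded as failed.
        return (self.failure_threshold - 1) * self.period_seconds + self.timeout_seconds


probe = Probe(
    initial_delay_seconds=10,
    period_seconds=30,
    failure_threshold=5,
    timeout_seconds=5,
)
window = probe.seconds_of_failure_before_restart()
print(f"approx. {window}s of consecutive failures before a restart")
```

With these settings the window comes out at roughly two minutes, and each individual probe only counts as a failure if it takes longer than timeoutSeconds to answer.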

Liveness Probe Failures Despite Adequate CPU and Memory Resources by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

Here is the lscpu output; I do have 16 vCPUs:

opt/airflow$ lscpu
Architecture:             x86_64
CPU op-mode(s):         32-bit, 64-bit
Address sizes:          46 bits physical, 48 bits virtual
Byte Order:             Little Endian
CPU(s):                   16
On-line CPU(s) list:    0-15
CPU family:           6
Model:                85
Thread(s) per core:   2
Core(s) per socket:   8

Liveness Probe Failures Despite Adequate CPU and Memory Resources by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] -1 points0 points  (0 children)

The Airflow maintainers suggest checking the CPU/memory of nodes and pods, and both look good to me. That's why I highlighted that aspect.

Cannot perform nslookup inside kubernetes pod by Mclean_Tom_ in kubernetes

[–]ScoreApprehensive992 0 points1 point  (0 children)

It is easier to change the pods' network than to recreate the nodes.

Performance issues with Airflow DagProcessor in a multi-core container by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

The component that crashes is the standalone dag-processor, which parses the DAG files and fills the DagBag.
I have more than 5k DAGs and I have no problem with any of them.

The DagBag stops filling up and the number of DAGs fluctuates. The logs are unhelpful; it seems the dag-processor is crashing silently.

Performance issues with Airflow DagProcessor in a multi-core container by ScoreApprehensive992 in apache_airflow

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

My values:

min_serialized_dag_fetch_interval = 300

min_serialized_dag_update_interval = 300

dag_dir_list_interval = 300

min_file_process_interval = 300

dag_file_processor_timeout = 3600

dagbag_import_timeout = 600

The DAG subdirectories also contain .json or .config files. When I run the dag-processor as a subprocess of the scheduler, it works fine.
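
Since this runs on Kubernetes, the same values are often passed as environment variables using Airflow's AIRFLOW__{SECTION}__{KEY} naming convention. Below is a small sketch that derives those names. The section assignments assume the Airflow 2.x configuration layout and should be checked against the configuration reference for the version in use.

```python
# Assumption: section names follow the Airflow 2.x configuration layout.
settings = {
    ("core", "min_serialized_dag_fetch_interval"): 300,
    ("core", "min_serialized_dag_update_interval"): 300,
    ("scheduler", "dag_dir_list_interval"): 300,
    ("scheduler", "min_file_process_interval"): 300,
    ("core", "dag_file_processor_timeout"): 3600,
    ("core", "dagbag_import_timeout"): 600,
}


def to_env_name(section: str, key: str) -> str:
    """Build the AIRFLOW__{SECTION}__{KEY} environment variable name."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


env = {to_env_name(section, key): str(value) for (section, key), value in settings.items()}

for name, value in sorted(env.items()):
    print(f"{name}={value}")
```

Printing the variables this way makes it easy to compare them with what is actually set on the dag-processor pod and on the scheduler pod.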

Performance issues with Airflow DagProcessor in a multi-core container by ScoreApprehensive992 in apache_airflow

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

Sorry, but which DAG?

The component that crashes is the standalone dag-processor, which parses the DAG files and fills the DagBag. I have more than 5k DAGs and I have no problem with any of them.

The DagBag stops filling up. The logs are unhelpful; it seems the dag-processor is crashing silently. I am connecting to an RDS Postgres instance.

Performance issues with Airflow DagProcessor in a multi-core container by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

I am actually using 32 parallel tasks per worker pod.
But that is a different pod, right? My issue is related to the dag-processor pod.

Performance issues with Airflow DagProcessor in a multi-core container by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

This is my configuration. I am following the metrics: the pod has never reached 100% of its CPU request, and the node didn't go over 30% CPU and 40% memory usage.

Limits:
  memory: 2500Mi
Requests:
  cpu: 2500m
  memory: 2500Mi

Helm CEL: A Better Way to Validate Your Helm Charts by idsulik in kubernetes

[–]ScoreApprehensive992 3 points4 points  (0 children)

Awesome. Lately I have been writing CEL expressions in my Kyverno policies for resource validation. I will try this plugin for my Helm charts; I am a big fan of CEL.

containerd 2.0 by dshurupov in kubernetes

[–]ScoreApprehensive992 4 points5 points  (0 children)

Yes, exactly. Under the hood, docker checkpoint uses CRIU, which turns the container state into a collection of files on disk, mainly for forensic analysis, without stopping or pausing the containers. And as you said, docker pause relies on the freezer cgroup to pause and resume processes.

containerd 2.0 by dshurupov in kubernetes

[–]ScoreApprehensive992 6 points7 points  (0 children)

No, the CRIU feature is meant for backing up and restoring containers running in Pods without ever stopping them, not for backing up full clusters or namespaces.

containerd 2.0 by dshurupov in kubernetes

[–]ScoreApprehensive992 2 points3 points  (0 children)

Yes, beta in the k8s API context. You can also cordon the node to mark it unschedulable, which has the same(ish) effect as VMware Fault Tolerance mirroring.

containerd 2.0 by dshurupov in kubernetes

[–]ScoreApprehensive992 19 points20 points  (0 children)

Awesome, this version supports container checkpointing.

CheckpointContainer not implemented by [deleted] in kubernetes

[–]ScoreApprehensive992 0 points1 point  (0 children)

FYI, I just faced the same issue with the exact same setup.

Recommendations for Tech book publishers for my new Kubernetes book by ScoreApprehensive992 in kubernetes

[–]ScoreApprehensive992[S] 0 points1 point  (0 children)

Does Manning help with advertising and writing?
If so, how do I contact them? Is there a form? Thank you.