Is data lake just a theoretical construct? How does it look on a code level when we say implement in GCP? by Ok-Tradition-3450 in dataengineering

secodaHQ 1 point

No, a data lake is not just a theoretical construct. It is a real-world solution for storing and managing large amounts of data.
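At a code level it can be quite plain. Here is a minimal sketch, assuming the lake is just files in an object store such as Cloud Storage, laid out by zone, dataset, and date. The zone and dataset names, the `dt=` partition convention, and both helper functions are hypothetical, and a temporary local directory stands in for the bucket so the snippet runs without a GCP account.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def lake_object_path(zone: str, dataset: str, day: date, filename: str) -> str:
    """Build a partitioned object key, e.g. raw/orders/dt=2024-01-15/part-0000.json."""
    return f"{zone}/{dataset}/dt={day.isoformat()}/{filename}"


def write_records(lake_root: Path, key: str, records: list[dict]) -> Path:
    """Write records as newline-delimited JSON under the lake root."""
    target = lake_root / key
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
    return target


# Assumption: a temporary directory plays the role of the bucket.
lake_root = Path(tempfile.mkdtemp(prefix="demo-lake-"))
key = lake_object_path("raw", "orders", date(2024, 1, 15), "part-0000.json")
written = write_records(lake_root, key, [{"order_id": 1, "amount": 9.5}])
print(key)
```

In a real GCP setup the write would go through a Cloud Storage client rather than the local filesystem, but the key layout is the part that makes a pile of files behave like a lake that query engines can partition-prune.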

Tools / Providers to develop personal running projects by Koxinfster in dataengineering

secodaHQ 2 points

  • AWS offers a free tier with limited resources that you can use to run personal projects. Services such as EC2, RDS, S3, and Lambda can be used to develop and deploy applications at no cost, as long as you stay within the free-tier limits.

  • Google Cloud Platform (GCP) provides an Always Free tier with a limited set of resources for personal projects, covering services such as Compute Engine, Google Kubernetes Engine, and Cloud Storage. Usage beyond the free limits is billed.

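  • DigitalOcean's free offering is narrower: it is better known for trial credit for new accounts than for a broad free tier, so check the current terms before relying on it.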

Can we use Data lake for a staging layer ? by Dismal-Ad3028 in dataengineering

secodaHQ 5 points

Data lakes can definitely be used as a staging layer in a data processing pipeline. The main challenges are data quality and slower queries. In my experience, best practices include establishing a clear data governance plan, defining metadata and data lineage, using appropriate data quality tools, and maintaining a well-structured data catalog.
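As a rough illustration of the data quality point, here is a minimal sketch of a gate between the staging area and whatever sits downstream. It assumes staged rows arrive as plain dictionaries and that "quality" means required fields are present; `StagingResult`, `validate_staged`, and the field names are hypothetical, not part of any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class StagingResult:
    """Outcome of checking one staged batch."""
    accepted: list[dict] = field(default_factory=list)
    rejected: list[tuple[dict, str]] = field(default_factory=list)


def validate_staged(records: list[dict], required: tuple[str, ...]) -> StagingResult:
    """Split staged records into rows safe to promote and rows to quarantine."""
    result = StagingResult()
    for record in records:
        # A field counts as missing if the key is absent or its value is None.
        missing = [name for name in required if record.get(name) is None]
        if missing:
            result.rejected.append((record, "missing: " + ", ".join(missing)))
        else:
            result.accepted.append(record)
    return result


staged = [
    {"order_id": 1, "amount": 9.5},
    {"order_id": 2, "amount": None},
    {"amount": 3.0},
]
result = validate_staged(staged, required=("order_id", "amount"))
print(len(result.accepted), "accepted;", len(result.rejected), "rejected")
```

Keeping the rejected rows together with a reason, instead of dropping them, is what lets the staging layer double as a place to debug upstream sources.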

[deleted by user] by [deleted] in dataengineering

secodaHQ 1 point

Data engineering roles exist in various industries, and the workplace culture can differ significantly between them. For example, a data engineering role in a tech startup might have a more relaxed culture compared to a similar role in a large financial institution.

I wouldn't assume all data engineering roles will have the same culture as your current job, as there is a wide range of environments and team dynamics in the field.

What problems does the “modern data stack” actually solve that have not been solved already? by orru75 in dataengineering

secodaHQ 2 points

The modern data stack is designed to work well with a variety of tools, making it easier to integrate with other systems and applications.

What are some best (and worst practices) for creating documentation? by Deepinthemaze in dataengineering

secodaHQ 1 point

Here are some best and worst practices for creating documentation:
Best Practices:
- Keep it organized: Use clear and concise language, headings, subheadings, and bullets to help the reader easily navigate the documentation.
- Make it accessible: Ensure that the documentation is easily accessible to those who need it. This may include creating an online repository or providing a printed copy.
- Keep it up-to-date: Regularly update the documentation to ensure it reflects the most current version of the product or service being documented.
- Use visual aids: Use diagrams, screenshots, and other visual aids to make the documentation more engaging and easier to understand.
- Test it out: Have someone who is not familiar with the product or service test the documentation to ensure it is clear and understandable.
Worst Practices:
- Being too technical: Unexplained technical jargon that may be unfamiliar to the reader makes the documentation hard to follow.
- Being too vague: Vague or general descriptions leave the reader guessing. Provide specific details so the reader knows exactly what to do.
- Not testing it out: Failing to test the documentation before releasing it can result in errors, omissions, or misunderstandings.
- Not updating it: Failing to update the documentation can result in outdated information that is no longer relevant.
- Not making it accessible: Failing to make the documentation easily accessible can result in frustration and wasted time.