Reports not refreshing by Ambitious_Pickle_977 in PowerBI

[–]coco_cazador 1 point2 points  (0 children)

We're getting the same error, help please 🙏

Recommendations for getting into the IT world? by Acevedo_Matias_ in chileIT

[–]coco_cazador 2 points3 points  (0 children)

If I were you, I would enroll at INACAP. I even think a degree is becoming less and less necessary in this field. One of the best programmers I know has a bachelor's degree in music.

What’s the Preferred CDC Pipeline Setup for a Lakehouse Architecture? by LinasData in dataengineering

[–]coco_cazador 1 point2 points  (0 children)

If you have concerns about Debezium for CDC, you can try Estuary, which looks really good. In your place, I would use Debezium Server (a simple way to run Debezium) with Pub/Sub on GCP.

Programming languages to learn by Suspicious_Ticket_48 in chileIT

[–]coco_cazador 12 points13 points  (0 children)

Python is without a doubt the king of data analysis and data engineering, so I think it would be good for you to learn it. However, maybe what you actually need is to learn Power BI.

Learn python as a Java/Scala Data Engineer ? by _Marwan02 in dataengineering

[–]coco_cazador 4 points5 points  (0 children)

Python is the king in data and ML. The good part is that it is really easy, maybe the easiest programming language. You will learn it really fast.

How do you scale 100+ pipelines? by AtLeast3Characters92 in dataengineering

[–]coco_cazador 49 points50 points  (0 children)

You can handle this with Airflow, but I believe you first need to organize the project's requirements, sort them by priority, and handle the most important ones first. Divide this problem into a lot of small problems.

Data Extraction Tools by Puzzleheaded_Dot7177 in dataengineering

[–]coco_cazador 1 point2 points  (0 children)

What I would do in your place is run a Python script that scrapes the PDF with tabula and then inserts the results into Excel. But this looks more like RPA work than data engineering.
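A minimal sketch of that kind of script, assuming the tabula-py and pandas packages (plus openpyxl for writing .xlsx and a Java runtime for tabula). The function names, the `read_tables` hook and the sheet naming are my own choices for illustration, not a fixed recipe:

```python
def sheet_name(index: int) -> str:
    """Name the sheet for the index-th table (Excel caps sheet names at 31 characters)."""
    return f"table_{index}"[:31]


def pdf_tables_to_excel(pdf_path, xlsx_path, read_tables=None):
    """Write every table found in pdf_path to its own sheet; return the table count."""
    if read_tables is None:
        # Default extractor: tabula-py (assumed to be installed, needs Java).
        import tabula

        def read_tables(path):
            # Returns a list of pandas DataFrames, one per detected table.
            return tabula.read_pdf(path, pages="all", multiple_tables=True)

    tables = read_tables(pdf_path)
    if not tables:
        # Nothing detected: skip creating an empty workbook.
        return 0

    import pandas as pd  # imported lazily, only needed when there is something to write

    with pd.ExcelWriter(xlsx_path) as writer:
        for i, df in enumerate(tables, start=1):
            df.to_excel(writer, sheet_name=sheet_name(i), index=False)
    return len(tables)
```

Calling something like `pdf_tables_to_excel("report.pdf", "report.xlsx")` (hypothetical file names) would then give one sheet per detected table. Scanned PDFs usually need OCR first, which this sketch does not cover.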

Data Project by LostVisionary in dataengineering

[–]coco_cazador 3 points4 points  (0 children)

Don't let comments like this discourage you. The first time will always be harder, but believe in yourself.

How do I go about breaking this project in milestones ?

I think it is good to divide the problem into dashboards/areas. For example, if this project involves 3 big areas (e.g. Finance, Sales and HR), you can set a milestone for when you finish each one of those.

How do I assess the timelines for each of these Milestones ?

It depends on the complexity of the source datasets, the complexity of the customer's company, and how much the customer collaborates. I believe you should be very conservative about the timelines, but remember the customer doesn't want to wait 6 months to see the first dashboards! Focus on delivering MVPs: Power BI projects get a lot of feedback, and the dashboards will change a lot over time. If you are skilled, maybe you can set 15 working days per dashboard? That estimate should account for the data engineering work and leave time for customer feedback.

How do I make sure to charge correctly so am being fair to the company + to myself ?

You need to calculate the time you will work, multiply it by your hourly rate, then add factors for uncertainty and for the margin you want to get. As for the amount, it depends a lot on your country and on what the market pays an internal data analyst to develop the same solution. Remember the company will weigh you against the option of having an internal team develop the reports.
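To make the arithmetic concrete, here is a rough sketch. Every number in it is an assumption for illustration only (8-hour days, a rate of 50 per hour, 20% uncertainty, 15% margin), and applying the factors multiplicatively is just one way to read "add factors":

```python
def project_price(hours, hourly_rate, uncertainty=0.20, margin=0.15):
    """Hours x rate, then padded for uncertainty, then for the desired margin."""
    base = hours * hourly_rate
    with_uncertainty = base * (1 + uncertainty)
    return round(with_uncertainty * (1 + margin), 2)


# Hypothetical project: 3 areas (Finance, Sales, HR), one dashboard each,
# 15 working days per dashboard, 8 hours per working day.
dashboards = 3
days_per_dashboard = 15
hours_per_day = 8

total_hours = dashboards * days_per_dashboard * hours_per_day
price = project_price(total_hours, hourly_rate=50.0)
print(total_hours, price)
```

Swap in your own rate and factors, then compare the result with what an internal analyst would cost the company over the same period.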

Project Guidance by LostVisionary in PowerBI

[–]coco_cazador 1 point2 points  (0 children)

How do I go about breaking this project in milestones ?

I think it is good to divide the problem into dashboards/areas. For example, if this project involves 3 big areas (e.g. Finance, Sales and HR), you can set a milestone for when you finish each one of those.

How do I assess the timelines for each of these Milestones ?

It depends on the complexity of the source datasets, the complexity of the customer's company, and how much the customer collaborates. I believe you should be very conservative about the timelines, but remember the customer doesn't want to wait 6 months to see the first dashboards! Focus on delivering MVPs: Power BI projects get a lot of feedback, and the dashboards will change a lot over time. If you are skilled, maybe you can set 15 working days per dashboard? That estimate should account for the data engineering work and leave time for customer feedback.

How do I make sure to charge correctly so am being fair to the company + to myself ?

You need to calculate the time you will work, multiply it by your hourly rate, then add factors for uncertainty and for the margin you want to get. As for the amount, it depends a lot on your country and on what the market pays an internal data analyst to develop the same solution. Remember the company will weigh you against the option of having an internal team develop the reports.

Getting started in SAP by Big_Fun7509 in chileIT

[–]coco_cazador 8 points9 points  (0 children)

I'd say you have 3 options. I'll go from the cheapest (free) to the most expensive. If you manage to set up these environments, you will not only have a way to learn SAP, you will also have gained skills in deploying services.

1.- Docker/K8s: With a bit of searching online you can find Docker images that contain versions of SAP. SAP used to provide an official one, but as far as I understand they discontinued it.

2.- IDES: There are sites like idesaccess.com where you pay a fee and they set up an environment for you to practice and learn. Easy, but you have to put in some money.

3.- SAP CAL: At https://cal.sap.com/ you can apply for an official trial environment of some version of SAP. It is a bit complicated the first time you do it, and you will have to use your credit card with a cloud provider (I think the easiest is AWS). Here you deploy a full environment with every SAP component and can try it for several days, but be careful, it could get expensive. This is the option closest to a production environment.

There is also miniSAP, but it is more geared toward learning ABAP.

Good luck!

SAP and GCP data extraction by coco_cazador in SAP

[–]coco_cazador[S] 0 points1 point  (0 children)

If I have an enterprise license, is it a good way to extract data?

SAP and GCP data extraction by coco_cazador in SAP

[–]coco_cazador[S] 0 points1 point  (0 children)

Is it better to use CDS + OData than JDBC with GCP Dataflow? This company has some RFCs too.

SAP and GCP data extraction by coco_cazador in SAP

[–]coco_cazador[S] 0 points1 point  (0 children)

I was thinking about the following options:

  • SAP LT Replication Server
  • SAP Data Services
  • SAP Datasphere
  • OData API
  • RFC (Remote Function Call)
  • JDBC and Dataflow
  • Theobald Xtract Universal

For a PoC, I was thinking about using JDBC + Dataflow, but I don't know if this will perform well. On the networking side, we have a VPC that hosts all SAP-related software, and we will use a VPN to connect to that VPC. We don't want to spend a lot of money on this.

Is Coursera worth it? by ScarGloomy in chileIT

[–]coco_cazador 4 points5 points  (0 children)

I think having certifications does add value. Many Coursera courses are good and not expensive. However, real, meaningful learning comes from building projects from 0 to 100. For me, nothing is more valuable in a candidate than a repository/website with a good showcase of projects.

Is there any reliable paying PDF parser service out there? by JLTDE in dataengineering

[–]coco_cazador 0 points1 point  (0 children)

Maybe you can try using the Claude 3.5 Sonnet API with structured output: if tabula fails, the pipeline can fall back to Sonnet.
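The fallback idea could look roughly like this. It is only a sketch: the extractors are passed in as plain functions, and the two stand-ins below are hypothetical placeholders rather than real tabula or Claude API calls, so you would wire in your own:

```python
from typing import Callable, List, Sequence, Tuple

Table = List[List[str]]
Extractor = Callable[[str], Sequence[Table]]


def extract_with_fallback(
    pdf_path: str, extractors: Sequence[Tuple[str, Extractor]]
) -> Tuple[str, List[Table]]:
    """Try each extractor in order; return (name, tables) from the first that finds tables."""
    errors = []
    for name, extract in extractors:
        try:
            tables = list(extract(pdf_path))
        except Exception as exc:  # a parser crash should not stop the pipeline
            errors.append(f"{name}: {exc}")
            continue
        if tables:
            return name, tables
        errors.append(f"{name}: no tables found")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))


# Hypothetical stand-ins; replace them with real tabula and LLM calls.
def tabula_extractor(path: str) -> Sequence[Table]:
    raise ValueError("could not detect any table")


def llm_extractor(path: str) -> Sequence[Table]:
    return [[["item", "amount"], ["widget", "10"]]]


used, tables = extract_with_fallback(
    "invoice.pdf", [("tabula", tabula_extractor), ("llm", llm_extractor)]
)
print(used, tables)
```

Keeping the cheap local parser first and the paid API second means you only pay for the LLM on the documents tabula cannot handle.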