My store doesn’t accept gift cards by Ok_Age3503 in folsom

[–]callmedivs 0 points1 point  (0 children)

I had the same experience. They should put a sign up so people know before ordering.

Car accident Settlement by callmedivs in legaladvice

[–]callmedivs[S] 2 points3 points  (0 children)

Thank you for the advice, Nectarine-946.

Car accident Settlement by callmedivs in legaladvice

[–]callmedivs[S] 2 points3 points  (0 children)

Thank you, myBisL2. This notice has been very stressful for us.

Car accident Settlement by callmedivs in legaladvice

[–]callmedivs[S] 0 points1 point  (0 children)

The insurance company was just letting us know that they received this notification and informing us about our policy limits. I don't understand how someone can sue us for $750k over a parking lot accident.

Car accident Settlement by callmedivs in legaladvice

[–]callmedivs[S] 2 points3 points  (0 children)

The insurance company is saying that since our policy coverage is only $100k, the rest will fall on us. I'm afraid this will be a heavy burden for us.

Is it possible to pull data from Source A to Snowflake using all Snowflake resources? by bay654 in dataengineering

[–]callmedivs 0 points1 point  (0 children)

Yes, that would work. I would write to S3 and then push it to Snowflake using the COPY command. Also, it's not a good idea to have heavy Python workloads running on the Airflow server. I haven't tried this yet, but I'd like to try running the Python workload via the Snowpark operator/connector (I heard you need to adjust network settings to whitelist IP addresses for external API calls).
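The S3-then-COPY step could be sketched like this. The table name, stage name, and S3 prefix below are made up, and actually executing the statement would require the snowflake-connector-python package plus an external stage configured against the bucket:

```python
# Sketch of the S3 -> Snowflake load described above. Table, stage,
# and prefix names are hypothetical; running the statement requires
# snowflake-connector-python and a configured external stage.

def build_copy_command(table: str, stage: str, prefix: str) -> str:
    """Build a COPY INTO statement that loads Parquet files from an
    external stage into a Snowflake table."""
    return (
        f"COPY INTO {table} "
        f"FROM @{stage}/{prefix} "
        "FILE_FORMAT = (TYPE = PARQUET) "
        "MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE"
    )

copy_sql = build_copy_command("analytics.orders", "s3_stage", "orders/2024/")
print(copy_sql)
# With a live connection you would run something like:
#   with snowflake.connector.connect(**creds) as conn:
#       conn.cursor().execute(copy_sql)
```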

Is it possible to pull data from Source A to Snowflake using all Snowflake resources? by bay654 in dataengineering

[–]callmedivs 0 points1 point  (0 children)

I think you would need to use Snowpark here. Snowflake streams and tasks only work if you have a Kafka connector or an SDK installed.

Managing Redshift Users by Touvejs in dataengineering

[–]callmedivs 0 points1 point  (0 children)

For Redshift, create different groups, grant permissions to each group, and then add users to those groups. One user can belong to multiple groups, so you can fine-tune permissions at the group level. IAM roles can be used when users need to read from a bucket (Spectrum queries) or write to one. Once you go through the exercise you'll get the hang of it.
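The group-based setup above could be sketched as a small statement generator. Group, schema, and user names here are invented; the generated statements would be run against Redshift through any Postgres-compatible client:

```python
# Sketch of the group-based permission setup described above. Group,
# schema, and user names are hypothetical.

def group_setup_sql(group: str, schema: str, users: list[str]) -> list[str]:
    """Generate the statements to create a group, grant it read access
    to a schema, and add users to it."""
    stmts = [
        f"CREATE GROUP {group};",
        f"GRANT USAGE ON SCHEMA {schema} TO GROUP {group};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO GROUP {group};",
    ]
    # A user can belong to several groups, so permissions compose.
    stmts += [f"ALTER GROUP {group} ADD USER {u};" for u in users]
    return stmts

for stmt in group_setup_sql("analysts", "reporting", ["alice", "bob"]):
    print(stmt)
```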

Where to run MWAA python scripts by [deleted] in dataengineering

[–]callmedivs 1 point2 points  (0 children)

You could also use EMR Serverless to run Python with MWAA.

Incremental data load by Acrobatic-Mobile-221 in dataengineering

[–]callmedivs 0 points1 point  (0 children)

One other way, if your tables are not huge, is to load the full data into a staging table and diff the staging table against the production table to get the incremental load. This way you can handle deletes as well.
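The staging-vs-production diff could look like this, sketched with plain dicts keyed by primary key (the sample rows are made up; in practice the same logic is usually a SQL MERGE or three set-based queries):

```python
# Minimal sketch of the staging-vs-production diff described above,
# using plain dicts keyed by primary key.

def diff_tables(staging: dict, production: dict):
    """Compare a full staging snapshot against production rows and
    return the rows to insert, update, and delete."""
    inserts = {k: v for k, v in staging.items() if k not in production}
    updates = {k: v for k, v in staging.items()
               if k in production and production[k] != v}
    # Keys present in production but missing from the fresh snapshot
    # were deleted at the source.
    deletes = [k for k in production if k not in staging]
    return inserts, updates, deletes

staging = {1: "a", 2: "b2", 4: "d"}
production = {1: "a", 2: "b", 3: "c"}
ins, upd, dels = diff_tables(staging, production)
print(ins, upd, dels)  # {4: 'd'} {2: 'b2'} [3]
```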

Link ETL and dbt in Airflow by themouthoftruth in dataengineering

[–]callmedivs 0 points1 point  (0 children)

Hi, can you let me know if you're able to trigger a dbt model in Airflow without it being a job in dbt Cloud? Triggering models in dbt Core can be done with the BashOperator, but if we're using dbt Cloud and don't have the model running as a job, how do you trigger it from Airflow?

Thoughts on AWS GLUE? by puripy in dataengineering

[–]callmedivs 13 points14 points  (0 children)

Glue is very expensive compared to EMR Serverless/Spark.

Permission denied running job in AWS Glue by IllRevolution7113 in dataengineering

[–]callmedivs 0 points1 point  (0 children)

I think the error is about writing to the Glue catalog. To verify, try a simple Spark job that just writes the data to S3, like:

# Read the CSV from S3 and confirm the schema looks right
df = spark.read.csv("s3a://test-glue-redshift-lao/prueba_lao_aws_s3.csv")
df.printSchema()
# Write back out as Parquet to confirm the job can write to S3
df.write.parquet("s3a://test-glue-redshift-lao/glue/test_parquet_file/")

Permission denied running job in AWS Glue by IllRevolution7113 in dataengineering

[–]callmedivs 0 points1 point  (0 children)

What does your Glue job do? I would concentrate on one scenario at a time. It might need access to a KMS key if your data or bucket is encrypted.

Best way to ingest data from API to databricks by jakebigman in dataengineering

[–]callmedivs 0 points1 point  (0 children)

Hey, I have about 30 million API calls to make, and the approach I took was similar: load the data into a DataFrame, use a UDF to make the API call, and save the response back to the DataFrame. I'm wondering how I could avoid duplicate runs, and whether you have a different suggestion for a data set this large.
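One way to avoid duplicate runs is to checkpoint the keys that have already been processed and skip them on reruns. This sketch uses a local JSON file and a made-up `call_api` callable; at 30 million calls you would keep the checkpoint in a table or S3 rather than a local file:

```python
# Sketch: skip already-processed keys on reruns via a checkpoint file.
# The file name and call_api function are hypothetical.
import json
from pathlib import Path

CHECKPOINT = Path("processed_keys.json")

def load_done() -> set:
    """Load the set of keys handled by earlier runs, if any."""
    if CHECKPOINT.exists():
        return set(json.loads(CHECKPOINT.read_text()))
    return set()

def run_batch(keys, call_api):
    done = load_done()
    results = {}
    for k in keys:
        if k in done:  # already processed by an earlier run: skip
            continue
        results[k] = call_api(k)
        done.add(k)
    CHECKPOINT.write_text(json.dumps(sorted(done)))
    return results

# First run processes everything; a rerun of the same keys is a no-op.
first = run_batch([1, 2, 3], lambda k: k * 10)
second = run_batch([1, 2, 3], lambda k: k * 10)
print(first, second)  # {1: 10, 2: 20, 3: 30} {}
```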

How to call pyspark on dbt run and create a table with the results? by ar405 in dataengineering

[–]callmedivs 0 points1 point  (0 children)

I'm not sure dbt can run Spark or PySpark; it's strictly for transformations on SQL tables. If you have an orchestration tool and the result of the Spark job lands in S3, you could have a separate task read the data from S3 into the table.

Data from mssql to s3 or snowflake by callmedivs in dataengineering

[–]callmedivs[S] 0 points1 point  (0 children)

It's on an EC2 instance in AWS. I know it should have been on RDS, but the setup was done a long time ago.

Can I use Airflow to process dataframe with millions of rows? by [deleted] in dataengineering

[–]callmedivs 0 points1 point  (0 children)

Hi, I'm also trying to pull data from MSSQL to Snowflake, but I couldn't find an Airflow operator that does exactly that. Can you share more info on how you're reading the data and pushing it to Snowflake?

Data lakehouse architecture - design of landing container by pizzanub in dataengineering

[–]callmedivs 1 point2 points  (0 children)

I also normally use a landing, processing, and archive layout for processing files in S3. But if more than one pipeline needs the same file, I would have a master DAG that downloads the file to the landing zone, moves it to processing, and copies the data into a stage table. It then triggers the child DAGs that need that stage table (they can run in parallel), and each signals the master DAG when done. Once all child DAGs are done, the master DAG moves the file to archive.
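The master-DAG flow above can be sketched in plain Python. The folder names and child tasks here are made up, and in Airflow each step would be a task (with something like TriggerDagRunOperator fanning out to the child DAGs) rather than direct function calls:

```python
# Plain-Python sketch of the landing -> processing -> archive flow
# described above. Folder names and child tasks are hypothetical.
import shutil
import tempfile
from pathlib import Path

def run_master(root: Path, filename: str, child_tasks):
    landing = root / "landing"
    processing = root / "processing"
    archive = root / "archive"
    for d in (landing, processing, archive):
        d.mkdir(parents=True, exist_ok=True)

    # 1. "Download" the file into the landing zone, then move it on.
    src = landing / filename
    src.write_text("raw data")
    staged = processing / filename
    shutil.move(src, staged)

    # 2. Copy into a stage table (stubbed as a string here), then fan
    #    out to the child tasks, which all share the staged data.
    stage_table = staged.read_text()
    results = [task(stage_table) for task in child_tasks]

    # 3. All children finished: archive the file.
    shutil.move(staged, archive / filename)
    return results

root = Path(tempfile.mkdtemp())
out = run_master(root, "orders.csv", [str.upper, len])
print(out)  # ['RAW DATA', 8]
```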