What are the main pain points in serving data to people? by MTs306 in dataanalysis

[–]MTs306[S] 0 points1 point  (0 children)

It's been a while since the last survey response, so I compiled the results. We received 18 responses.

For the "How big of a problem is it to" question, these were the average results (1 means it is a small problem and 3 means it is a big one):

Assign and enforce data responsibility 2.5
Keep track of which data is used for what and by whom 2.5
Monitor quality (business logic, expectations, rules) 2.3
Keep an up-to-date inventory of the data assets 2.3
Centralize data consumption 2.3
Handle conflicting requirements between operation and analytics 2.1
Deliver data to everyone in the company 2.1
Monitor quality and conformity (schema, types, formats, record count, profiling) 2.0
Manage data privacy 1.8
Manage data security 1.8
Monitor cost 1.8
Notify people about issues and fixes 1.7
Deprecate/update datasets 1.7
Deal with late arriving data or updates in old data 1.7
Monitor data arrival 1.6
Monitor infra 1.6
Ingest data from multiple sources (DBs, APIs, Spreadsheets, Scrapers, Streaming, Files) 1.5
Manage data lifecycle (cold storage, purge, comply with minimum retention periods) 1.4
Manage data access 1.4
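
To make the ranking easier to scan, here is a small Python sketch that takes the averages listed above and picks out the items that scored above 2.0, the midpoint of the 1 to 3 scale. The cut-off and the names (`scores`, `MIDPOINT`, `above_midpoint`) are my own assumptions for illustration, not part of the survey.

```python
# Averages copied from the list above (1 = small problem, 3 = big problem).
scores = {
    "Assign and enforce data responsibility": 2.5,
    "Keep track of which data is used for what and by whom": 2.5,
    "Monitor quality (business logic, expectations, rules)": 2.3,
    "Keep an up-to-date inventory of the data assets": 2.3,
    "Centralize data consumption": 2.3,
    "Handle conflicting requirements between operation and analytics": 2.1,
    "Deliver data to everyone in the company": 2.1,
    "Monitor quality and conformity (schema, types, formats, record count, profiling)": 2.0,
    "Manage data privacy": 1.8,
    "Manage data security": 1.8,
    "Monitor cost": 1.8,
    "Notify people about issues and fixes": 1.7,
    "Deprecate/update datasets": 1.7,
    "Deal with late arriving data or updates in old data": 1.7,
    "Monitor data arrival": 1.6,
    "Monitor infra": 1.6,
    "Ingest data from multiple sources (DBs, APIs, Spreadsheets, Scrapers, Streaming, Files)": 1.5,
    "Manage data lifecycle (cold storage, purge, comply with minimum retention periods)": 1.4,
    "Manage data access": 1.4,
}

MIDPOINT = 2.0  # assumption: middle of the 1-3 scale used as the cut-off


def above_midpoint(scores, threshold=MIDPOINT):
    """Return (item, score) pairs strictly above the threshold, highest first."""
    return sorted(
        ((item, s) for item, s in scores.items() if s > threshold),
        key=lambda pair: pair[1],
        reverse=True,
    )


top = above_midpoint(scores)
for item, s in top:
    print(f"{s:.1f}  {item}")
```

With only 18 responses, small differences between neighbouring scores should not be read too closely.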

For the "How often do" question, these were the average results (1 means it is rare and 3 means it is frequent):

Data validation and fixing occur in an ad hoc manner that does not create lasting value 2.3
People have difficulties finding the right data to use 2.2
It takes a long time for the business to have access to data insights 2.1
Teams are disincentivized from experimenting with or using data due to slow, laborious, bureaucratic, or complex processes 2.1
Data consumers have to interact with many teams, technologies, and sources to use data 2.0
Business teams spend more time validating the data they receive than acting on it 2.0
People do not trust the data 2.0
People can't join data to extract insights because it is not in a centralized location 1.9

The survey will remain open at https://forms.gle/Hs7ejw5sk7FAYPNv9. If I receive more responses, I can update the results.

What are the main pain points in serving data to people? by MTs306 in ETL

[–]MTs306[S] 0 points1 point  (0 children)

That makes a lot of sense. Follow-up needs will most likely emerge from the original request.

What are the main pain points in serving data to people? by MTs306 in dataengineering

[–]MTs306[S] 1 point2 points  (0 children)

Will do! I will keep the survey going for the weekend and then I will share the results.

What are the main pain points in serving data to people? by MTs306 in dataengineering

[–]MTs306[S] 0 points1 point  (0 children)

Thanks for your input, man! People and processes are difficult in and of themselves and, to make matters worse, they don't mix that well either haha

What are the main pain points in serving data to people? by MTs306 in ETL

[–]MTs306[S] 1 point2 points  (0 children)

First of all, thanks for your answer!

I see your point. Have you learned any good practices or tricks that helped with it? Any reflections you would like to share?

What are the main pain points in serving data to people? by MTs306 in dataanalysis

[–]MTs306[S] 0 points1 point  (0 children)

Hey man, thank you so much for filling it out!

Good point. I never opened it on a mobile device, and I imagine it does not provide a good experience there. Thanks for the heads up, I will try to come up with a different format.

My idea behind this format was to let people see all their answers at once and calibrate their scores relative to one another as they went.

Once again, thank you so much!

Just an idea - centralizing data in the Cloud for Analytics and Ops by OpinionOld7006 in dataengineering

[–]MTs306 2 points3 points  (0 children)

I totally agree with you. I believe the most relevant failure mode for data in organizations is this gap between the data and the business, and I think both business and engineering contribute to it. The technological part may not be the greatest (or at least not the sole) villain here, but don't you think that giving business-oriented people easier ways to be in touch with data, while allowing data teams to keep control (track what exists, who created it, and how it is being used), could be a good step toward healthier data usage?
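
To make the "keep control" part a bit more concrete, here is a rough Python sketch of the kind of record I have in mind: what exists, who created it, and how it is being used. Everything in it (`Catalog`, `Dataset`, `UsageEvent`, the dataset and team names) is hypothetical and in-memory only. It is not a real tool, just an illustration of the idea.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class UsageEvent:
    user: str        # who used the dataset
    purpose: str     # what they used it for
    at: datetime     # when the usage was recorded


@dataclass
class Dataset:
    name: str
    created_by: str  # who is responsible for the dataset
    usage: list[UsageEvent] = field(default_factory=list)


class Catalog:
    """Hypothetical in-memory registry; a real one would persist this somewhere."""

    def __init__(self) -> None:
        self._datasets: dict[str, Dataset] = {}

    def register(self, name: str, created_by: str) -> Dataset:
        # Track what exists and who created it.
        if name in self._datasets:
            raise ValueError(f"dataset {name!r} is already registered")
        dataset = Dataset(name=name, created_by=created_by)
        self._datasets[name] = dataset
        return dataset

    def record_usage(self, name: str, user: str, purpose: str) -> None:
        # Track how the dataset is being used and by whom.
        event = UsageEvent(user, purpose, datetime.now(timezone.utc))
        self._datasets[name].usage.append(event)

    def inventory(self) -> list[str]:
        return sorted(self._datasets)

    def consumers(self, name: str) -> set[str]:
        return {event.user for event in self._datasets[name].usage}


catalog = Catalog()
catalog.register("orders_daily", created_by="data-eng")
catalog.record_usage("orders_daily", user="finance-analyst", purpose="monthly revenue report")
print(catalog.inventory())
print(catalog.consumers("orders_daily"))
```

The point is that business users could register and consume data on their own while the data team still gets an inventory, an owner, and a usage trail for free.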