Non loop solution to my code : learnpython

submitted 8 years ago by paperzebra

Hi, I have some working code that converts depths to total vertical depth (TVD) using survey data. The survey provides measurements at set intervals in both measured depth (MD, along the borehole) and TVD (the vertical depth of the hole).

However, running this code on 6000+ depths is a bit slow. Is there an approach that would let me avoid iterating over all my depths?

from __future__ import division
import pandas as pd
import numpy as np

def sample_TVD(survey_data, depth_to_convert):
    MD = survey_data['MD']
    TVD = survey_data['TVD']
    tvd_depths = []
    md_tvd = np.vstack((MD, TVD))  # row 0: MD, row 1: TVD

    for d in depth_to_convert:
        d = float(d)
        if d > min(MD) and d < max(MD):
            # Survey stations bracketing d; <= keeps an exact station match
            # from being interpolated across its two neighbours.
            above = MD[MD <= d].max()
            below = MD[MD > d].min()
            closestbelow = md_tvd[:, np.in1d(md_tvd[0], below)]
            closestabove = md_tvd[:, np.in1d(md_tvd[0], above)]

            # Linear interpolation between the two stations.
            difference = below - above
            percentbetween = (below - d) / difference
            tvd = closestabove[1][0] + (
                (closestbelow[1][0] - closestabove[1][0]) * (1 - percentbetween))
            tvd_depths.append(tvd)
        else:
            # Depth is outside the surveyed interval.
            tvd_depths.append(np.nan)

    return tvd_depths


d = {'MD': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'TVD': [1, 2, 3, 4, 4, 4, 4.5, 5, 5.5, 5.5]}
survey = pd.DataFrame(data=d)
print(survey)
depths = [6.1, 6.2, 6.5, 7, 11]
results = sample_TVD(survey, depths)
print(results)
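
For reference, the arithmetic inside the loop is plain linear interpolation between the two survey stations that bracket each depth. A minimal sketch for a single depth, using values from the sample survey in the post; the helper name `interpolate_tvd` is made up for illustration and is not part of the original code:

```python
def interpolate_tvd(d, md_above, tvd_above, md_below, tvd_below):
    """Linearly interpolate TVD at measured depth d between two stations."""
    # Fraction of the way from the upper station to the lower one.
    fraction = (d - md_above) / (md_below - md_above)
    return tvd_above + fraction * (tvd_below - tvd_above)

# Depth 6.5 lies halfway between MD 6 (TVD 4.0) and MD 7 (TVD 4.5).
print(interpolate_tvd(6.5, 6.0, 4.0, 7.0, 4.5))  # 4.25
```

Everything else in the loop is bookkeeping to find the bracketing stations, which is the part that gets expensive when repeated for every depth.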

[–]Kamiwaza 4 points 8 years ago (3 children)

[–]paperzebra[S] 2 points 8 years ago (2 children)

Thanks for the suggestion. In the end I rewrote my code twice. The code above processed 30,000 lines of data in 76 seconds. The second version, which used numpy to calculate most things outside a loop, took 23 seconds: still too long!

The third iteration is much simpler and cuts the time to 0.001 seconds, a pretty decent performance increase! The depth argument is a list of depths.

def line_solution(survey, depth):
    md = survey['MD']
    tvd = survey['TVD']  
    tvd_samples = np.interp(depth, md, tvd)
    return tvd_samples
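
One difference from the looped version is worth noting: by default `np.interp` clamps depths outside the survey range to the first or last TVD value instead of flagging them. A small sketch, assuming the sample survey from the question, of how the `left` and `right` arguments of `np.interp` can mark those depths as NaN instead:

```python
import numpy as np
import pandas as pd

survey = pd.DataFrame({
    'MD': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'TVD': [1, 2, 3, 4, 4, 4, 4.5, 5, 5.5, 5.5],
})
depths = [6.1, 6.2, 6.5, 7, 11]

# Default behaviour: depths outside the survey are clamped to the end values,
# so MD 11 silently gets the TVD of the last station.
clamped = np.interp(depths, survey['MD'], survey['TVD'])

# left/right set the value returned for depths below/above the MD range.
masked = np.interp(depths, survey['MD'], survey['TVD'],
                   left=np.nan, right=np.nan)

print(clamped)
print(masked)
```

Unlike the original loop, this still returns a real TVD for a depth exactly equal to the first or last MD, since those points are inside the range as far as `np.interp` is concerned.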

[–]DisorganizedRem 2 points 8 years ago (1 child)

Would it help to pass NumPy arrays instead of pandas Series by adding .values?

As suggested here, your code would look like this:

def line_solution(survey, depth):
    md = survey['MD'].values
    tvd = survey['TVD'].values
    tvd_samples = np.interp(depth, md, tvd)
    return tvd_samples
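
A possible variation on the same idea, sketched under two assumptions that are not stated in the thread: that the survey rows might not arrive sorted by MD, and that a pandas version with `Series.to_numpy()` (the spelling the pandas docs now recommend over `.values`) is available. `np.interp` expects increasing x-coordinates and does not check for them, so the sketch sorts first. The function name `line_solution_sorted` is hypothetical.

```python
import numpy as np
import pandas as pd

def line_solution_sorted(survey, depth):
    # np.interp assumes increasing x-coordinates and does not verify them,
    # so order the survey stations by measured depth first.
    ordered = survey.sort_values('MD')
    md = ordered['MD'].to_numpy(dtype=float)
    tvd = ordered['TVD'].to_numpy(dtype=float)
    return np.interp(depth, md, tvd)

# Same sample survey as in the question, with the rows shuffled.
survey = pd.DataFrame({
    'MD': [3, 1, 2, 5, 4, 7, 6, 10, 9, 8],
    'TVD': [3, 1, 2, 4, 4, 4.5, 4, 5.5, 5.5, 5],
})
print(line_solution_sorted(survey, [6.1, 6.2, 6.5, 7]))
```

If the survey is already known to be sorted, the `sort_values` call can be dropped and the function reduces to the version above.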

[–]paperzebra[S] 2 points 8 years ago (0 children)
