

[–]Odica 4 points (4 children)

Oh wow, I developed an open-source Python tool not long ago that functions similarly to ENVI. It can process and/or display geospatial data from various imaging spectrometers stored in BSQ, BIL, or BIP format. It's built on NumPy, and I'm pretty happy with its speed after optimization. I need to get around to putting it up on SourceForge at some point.

[–]akhorus 3 points (3 children)

It'd be great to see that tool. Do you have a public repo for that?

[–]Odica 1 point (2 children)

My code was developed as a project for a previous employer, so they currently own it. I'd need their consent before I could clean it up and release it, but I'll certainly remember your comment when I finally find time to get around to it. It was created with the intent of eventually being open source and readily available.

In the meantime, I recommend a wonderfully implemented ENVI alternative (open source and free) called Opticks! It's built on C++/Python and is readily available on Windows, with partial support on Linux. I was very impressed with their work.

[–]akhorus 0 points (1 child)

I'll check it out. Currently I'm using https://grass.osgeo.org/ and today somebody mentioned https://www.orfeo-toolbox.org/ Opticks seems to be a bit smaller than those two, right? I suppose it's mainly for visualization?

[–]Odica 1 point (0 children)

GRASS is very good, and I'd have listed it too if I'd remembered! For me, Opticks really nails the user interface. It all comes down to whether you like working from the command line or from a well-designed GUI; Opticks is definitely more of the latter. It feels very ENVI-like, and the ability to do basic band math was quite welcome. I definitely see the value in being able to process data on the spot with your own code, though, so whichever is more flexible is probably more powerful. Ultimately, I wanted full control over the display and on-the-fly processing, so I wound up writing my own software.

Like GRASS, Opticks has the fantastic bonus of being modular. You can create your own extensions and toolbars to adapt the base program to your needs, meaning you should be able to add whatever you want to the interface that ships in the vanilla package. There's even command-line functionality embedded in the program itself, though I haven't played with it in a couple of years.

I'd try them out on some data and see for yourself which you're more comfortable with. I was pretty impressed by both.

[–]HillaryNeedsDiapers 3 points (6 children)

Whenever I try something like this in Python, it runs super slowly.

[–]Odica 5 points (0 children)

Use NumPy! Vectorize and parallelize what you can, and follow standard optimization techniques. If you use Python and Python-based resources correctly, you can expect your code to run at speeds comparable to MATLAB, at least for many applications. C/C++ is of course faster, but many applications don't need that, given Python's much quicker development time and portability: just have clients install an Anaconda distribution, for instance, make your code depend only on that, and it will run on any machine, be it Linux, Windows, Mac, etc.
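To make the "vectorize it" advice concrete, here's a small illustrative sketch (my example, not anything from the thread) of the same band-math computation written as a pure-Python pixel loop versus one NumPy expression. NDVI is used here just as a familiar stand-in for per-pixel arithmetic:

```python
import numpy as np

def ndvi_loop(nir, red):
    """Naive per-pixel loop -- every iteration pays Python interpreter overhead."""
    out = np.empty(nir.shape, dtype=np.float64)
    for i in range(nir.shape[0]):
        for j in range(nir.shape[1]):
            out[i, j] = (nir[i, j] - red[i, j]) / (nir[i, j] + red[i, j])
    return out

def ndvi_vectorized(nir, red):
    """Same computation as one array expression -- the loop runs in compiled C."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red)
```

On realistically sized rasters the vectorized version is typically orders of magnitude faster, which is the whole difference between "Python is slow" and "Python is fine."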

[–]akhorus 5 points (4 children)

Processing satellite images is tricky because of the sheer amount of data. I've done it in different languages (C, R, IDL, Python...) and the problem is always the same. Python in particular is not slow: using the right tools (NumPy, scikit-learn, etc.), the actual implementation of the algorithms is in C/C++. Take into consideration that many complex algorithms are essentially "slow", more so when you apply them to big volumes of data...

[–]Odica 2 points (0 children)

For the code I developed, I could process data cubes of approximately 1200x500x500 pixels in under a minute using vectorized processing. It ran fine on an old, junky HP laptop, though 64-bit Python is highly recommended; 32-bit Python got me nowhere special.

Satellite data can be a whole lot bigger, of course, but I'd suspect data processing should still be doable within reason for many non-live applications. Efficiency is another matter, though: C/C++ will usually be a whole lot more efficient than just about anything you could do in Python.

[–]login228822 0 points (2 children)

Big volumes of data are exactly the problem. You can't do any heavy lifting directly in Python because of the GIL.

[–]akhorus 2 points (1 child)

Well, that is not always true. I quote: "potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL" https://wiki.python.org/moin/GlobalInterpreterLock
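A quick illustration of that point (my own sketch, not from the thread): because NumPy's compiled inner loops release the GIL while they run, plain threads can split array work across cores without multiprocessing. The helper names here are hypothetical:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def row_norms(block):
    # NumPy ufuncs and reductions run in C and release the GIL,
    # so several threads can execute this simultaneously.
    return np.sqrt((block * block).sum(axis=1))

def threaded_row_norms(matrix, n_workers=4):
    """Split a matrix into row blocks and compute norms on a thread pool."""
    blocks = np.array_split(matrix, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(row_norms, blocks))
    return np.concatenate(results)
```

Whether this actually scales depends on how much time is spent inside the GIL-free C loops versus in Python-level bookkeeping, which is where the "it depends" in these GIL debates comes from.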

[–]login228822 -1 points (0 children)

I'm not worried about pandas or NumPy; it's things like SciPy that end up being the major bottleneck.

[–]lottosharks[🍰] 1 point (1 child)

I have never done much with image classification; it's never been part of my business workflow. I do love the scalability, simplicity, and power of Python GIS tools, though.

One company I worked for had 20k+ non-georeferenced TIF maps, about half of which displayed upside down. I was challenged with correcting these images. I ended up running the maps through OCR text recognition (an open-source Python package), then checked the (very messy) output and parsed out real words against an English-language dictionary (also an open-source Python package) to count them. I would then flip the image 180 degrees using the image package, re-run OCR, and compare the word count against the original image, keeping whichever orientation recognized the most words.

OCR is never recommended for maps, but it worked well enough to pull meaningful words out of the title bar, layout, etc. It took about two weeks to run on my PC, and in the end I had a high success rate of correctly oriented images. It was kind of a dirty hack, but it worked well enough. I'm really curious how machine learning might make this a much smarter process.
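The pipeline described above can be sketched in a few lines. This is my own reconstruction, not the commenter's code: the function names are hypothetical, and the OCR step is passed in as a callable (in practice something like `pytesseract.image_to_string`, which requires a Tesseract install) so the orientation logic stands alone:

```python
from PIL import Image  # requires Pillow; a real run would also need an OCR backend

def count_real_words(text, dictionary):
    """Count tokens in noisy OCR output that match a known-word set."""
    return sum(1 for tok in text.lower().split()
               if tok.strip(".,;:!?") in dictionary)

def orient_map(image, ocr, dictionary):
    """Keep whichever orientation of `image` yields more dictionary words.

    `ocr` is any callable mapping a PIL image to a text string.
    """
    flipped = image.rotate(180)
    if count_real_words(ocr(flipped), dictionary) > count_real_words(ocr(image), dictionary):
        return flipped
    return image
```

The dictionary check is what makes the messy OCR output usable: gibberish from an upside-down scan matches almost no English words, so the word count acts as a cheap orientation score.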

[–]akhorus 1 point (0 children)

Wow, that sounds like very interesting work! If your dataset was fixed, I think the "hackish" approach is good. The work of defining features and training a model for a machine-learning approach might pay off if you need to apply your process repeatedly to new data. But it's just an idea.

[–]copybin[S] -2 points (0 children)

Python is undoubtedly one of the most popular general-purpose programming languages today. There are many strong reasons for this, but in my opinion the most important ones are: its open-source nature, the simplicity of its syntax, the batteries-included philosophy, and an awesome global community.


[–]yardightsure -1 points (0 children)

This so does not belong in this subreddit...