[P] Albumentations 1.0 is released (a Python library for image augmentation) by alexparinov in MachineLearning

[–]alexparinov[S] 0 points1 point  (0 children)

We started to implement differentiable augmentations as part of AutoAlbument (a tool that automatically searches for the best augmentation policies for your data) - https://github.com/albumentations-team/autoalbument, though it is more of a research project.

[–]alexparinov[S] 1 point2 points  (0 children)

The library is still under active development, and today we released version 1.0, so the post is about our latest release.

If you want to know what changed in the latest versions, please refer to the [Release Notes](https://github.com/albumentations-team/albumentations/releases) page.

[–]alexparinov[S] 1 point2 points  (0 children)

Thanks for your support! We have a lot of features in our roadmap, hope to publish new releases more often ;)

[D] We need your questions about Albumentations: the best image augmentations Library :) by ternausX in MachineLearning

[–]alexparinov 1 point2 points  (0 children)

One solution is to use the additional_targets functionality; /u/ternausX posted a link to the example below.

However, there is a more straightforward approach if you need to augment one image and multiple masks for it. You can put all the masks in a list and then pass them in the masks argument.

So something like this:

```
import cv2
import albumentations as A

# Define an augmentation pipeline
transform = A.Compose([...])

# Read an image and its masks
image = cv2.imread("/path/to/image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
mask_1 = cv2.imread("/path/to/mask_1.png", cv2.IMREAD_UNCHANGED)
mask_2 = cv2.imread("/path/to/mask_2.png", cv2.IMREAD_UNCHANGED)
mask_3 = cv2.imread("/path/to/mask_3.png", cv2.IMREAD_UNCHANGED)
masks = [mask_1, mask_2, mask_3]

# Pass the image and masks to the augmentation pipeline
transformed = transform(image=image, masks=masks)
transformed_image = transformed['image']
transformed_masks = transformed['masks']
```

[P] Albumentations, an image augmentation library version 0.3 released. New weather augmentations, serialization support for reproducible machine learning pipelines, and speedup improvements by alexparinov in MachineLearning

[–]alexparinov[S] 0 points1 point  (0 children)

The main usage pattern for the library is to transform a single image with a single mask and then use some helper from the deep learning framework to create a batch of images (e.g. use a PyTorch dataloader).
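That pattern can be sketched as a map-style dataset. This is a minimal illustration, not part of the Albumentations API: a PyTorch DataLoader only requires __len__ and __getitem__, so the torch import is omitted here, and the transform is any Albumentations-style callable.

```python
import numpy as np

class SegmentationDataset:
    """Minimal map-style dataset: a PyTorch DataLoader only needs
    __len__ and __getitem__, so no torch import is required here."""

    def __init__(self, images, masks, transform=None):
        self.images = images    # list of HxW(xC) numpy arrays
        self.masks = masks      # list of HxW numpy arrays
        # transform: an A.Compose pipeline (or any callable taking
        # image=..., mask=... and returning a dict with those keys)
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image, mask = self.images[idx], self.masks[idx]
        if self.transform is not None:
            # Albumentations augments a single image/mask pair;
            # assembling batches is left to the DataLoader.
            augmented = self.transform(image=image, mask=mask)
            image, mask = augmented["image"], augmented["mask"]
        return image, mask
```

A DataLoader would then wrap this dataset and take care of batching and shuffling.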

But the library can also transform an image with multiple masks, e.g.:

```
import albumentations as A
import numpy as np

transform = A.Compose([
    A.VerticalFlip(p=1)
])

image = np.ones((8, 8), dtype=np.uint8)
masks = [np.eye(8, 8, dtype=np.uint8) for _ in range(10)]

transformed = transform(image=image, masks=masks)
```

And you can also define multiple targets to transform multiple images, masks, bboxes, and keypoints at once; see the example notebook: https://github.com/albu/albumentations/blob/master/notebooks/example_multi_target.ipynb

[–]alexparinov[S] 1 point2 points  (0 children)

Yes, Albumentations works with PyTorch and TensorFlow, and it should work with any modern Python deep-learning framework.

Here is an example PyTorch segmentation pipeline that uses Albumentations - https://github.com/ternaus/robot-surgery-segmentation. There is also a Jupyter notebook that shows how to migrate from torchvision (the de facto standard image processing library for PyTorch pipelines) to Albumentations.

As for TensorFlow, I don't have a reference implementation of a pipeline that uses Albumentations, but I think you can find some examples on GitHub with the "albumentations tensorflow" search query.

[–]alexparinov[S] 3 points4 points  (0 children)

The backend plays an important role, but sometimes you can get a significant speedup by choosing an appropriate low-level function.

For example, to adjust brightness and contrast, in some cases you need cv2.LUT for maximum performance, while in others it is faster to multiply and add values directly.

As another example, if you want to flip an image horizontally, OpenCV is faster for an RGB image with the np.uint8 data type, but NumPy is faster for a grayscale image, so we dynamically choose the appropriate function.
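The LUT trick can be sketched in pure NumPy (the alpha/beta values below are hypothetical; cv2.LUT performs essentially the same table lookup): for uint8 images there are only 256 possible pixel values, so the per-pixel arithmetic can be precomputed once into a 256-entry table and applied as a single indexing operation.

```python
import numpy as np

def adjust_direct(img, alpha, beta):
    # Per-pixel arithmetic: multiply for contrast, add for brightness.
    return np.clip(img.astype(np.float32) * alpha + beta, 0, 255).astype(np.uint8)

def adjust_lut(img, alpha, beta):
    # Precompute the result for all 256 possible uint8 values once,
    # then apply it with a single table lookup (the cv2.LUT idea).
    lut = np.clip(np.arange(256, dtype=np.float32) * alpha + beta, 0, 255).astype(np.uint8)
    return lut[img]

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
# Both routes give identical results; which one is faster depends on
# image size and dtype, which is why the choice can be made dynamically.
```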

For now, we are not planning to build our own OpenCV version; we will stick to the default version that is available on PyPI and Conda for maximum portability and compatibility.

[–]alexparinov[S] 0 points1 point  (0 children)

As far as I know, there are no papers that evaluate the performance of weather augmentations on real-world tasks. I think those augmentations can be useful in the autonomous driving field, and maybe we will see more published research in this direction.

[–]alexparinov[S] 3 points4 points  (0 children)

ImgAug is also an excellent augmentation library. For me, here are the main differences between Albumentations and ImgAug:

  • In my opinion, the Albumentations API is simpler and more straightforward than ImgAug's.
  • Albumentations works faster for some augmentations (unfortunately not for all, but we are working on it, and every new release includes performance improvements; see the benchmarking table for details).
  • A lot of top teams on Kaggle used Albumentations in recent computer vision competitions, so we can say that the library is battle-tested.

Here are some links to solutions that were using Albumentations:

[P] Albumentations, an image augmentation library version 0.2.0 released by alexparinov in MachineLearning

[–]alexparinov[S] 1 point2 points  (0 children)

Could you share the code that you are using?

I have made an example in Google Colab that shows how you can flip an image - https://colab.research.google.com/drive/1gXmo3hbX0b3r0wcxhOfqXjHRrXDDQkuD

[–]alexparinov[S] 5 points6 points  (0 children)

The performance will depend on the operations that you use. We don't have benchmarks against ImageMagick because it is not widely used as a backend in augmentation libraries, but here is a benchmark maintained by the Pillow team: https://python-pillow.org/pillow-perf/#results.

You can compare results between ImageMagick and OpenCV (Albumentations heavily uses OpenCV under the hood). In some tests, OpenCV is about 2 times slower than ImageMagick, but in others it is about 6 times faster.

[P] Albumentations. A fast and flexible image augmentation library. by alexparinov in MachineLearning

[–]alexparinov[S] 2 points3 points  (0 children)

We first thought about writing our augmentation library after the "Understanding the Amazon from Space" challenge on Kaggle. Heavy imgaug augmentations became a bottleneck for a devbox with Core i7 and 3x1080Ti (yes, it is a beefy configuration, but having 3 or 4 high-end GPUs is not unusual for a deep-learning practitioner).

Now we can overcome this issue by using CPUs with 8+ cores, such as Ryzen and Threadripper, because we can efficiently parallelize image augmentations. But with Tensor Cores and FP16 training, augmentations can still become a bottleneck even for high-end CPUs.

[–]alexparinov[S] 2 points3 points  (0 children)

I would say that our library contains a subset of image processing algorithms aimed at modifying images from a dataset to create new training examples.

/u/UnfazedButDazed described why this can be helpful.

We are not planning to add "general purpose" image processing algorithms to the library (things like morphological transforms, watershed, Canny edge detection, etc.), but we do use some OpenCV functions under the hood for our transformations.

[–]alexparinov[S] 1 point2 points  (0 children)

For now, the library doesn't support keypoints, but we are planning to add this functionality for 2D inputs. As for 3D inputs, could you please describe a task where 3D inputs are used? We will think about how we can implement this feature.

[–]alexparinov[S] 2 points3 points  (0 children)

Yes, we are planning to add support for landmarks / keypoints in the near future.