Course:CPSC312-2023-Image-Blending

From UBC Wiki

Image Blending

Authors: Michelle, Owen, Floria

Application

In this project, we explore whether Haskell is suitable for image processing. As a case study, we develop a Haskell utility that blends two images and a mask into a new combined image, which the user can optionally save to their computer. We then compare our implementation to the Python interfaces more commonly used for such tasks.

What is the problem?

[Figure: Image 1]

The utility reads two RGB images from the user, as well as a mask image specifying which parts of the two images to blend. It then constructs a Laplacian pyramid representation of each image and a Gaussian pyramid representation of the mask, and merges them using a weighted sum. This process gives a single image pyramid that can be used to reconstruct the output image.
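The pipeline described above can be sketched in a few lines of plain Haskell. This is a minimal illustration under stated assumptions, not the project's actual code: it uses a hypothetical `Img` type (one grayscale channel as a list of rows) instead of HIP images, a fixed 3-tap blur instead of a Gaussian kernel with a configurable sigma, and nearest-neighbour upsampling. All function names are made up for the example, and the convention that a mask value of 1 selects the first image is also an assumption.

```haskell
import Data.List (transpose)

-- Hypothetical stand-in for a real image type: one grayscale channel,
-- stored as a non-empty rectangular list of rows.
type Img = [[Double]]

-- Blur one row with the 3-tap kernel [1/4, 1/2, 1/4], repeating edge pixels.
blurRow :: [Double] -> [Double]
blurRow [] = []
blurRow xs@(x : rest) =
  zipWith3 (\l c r -> 0.25 * l + 0.5 * c + 0.25 * r) (x : xs) xs (rest ++ [last xs])

-- Separable blur: rows first, then columns.
blur :: Img -> Img
blur = transpose . map blurRow . transpose . map blurRow

-- Keep the elements at even indices.
everyOther :: [a] -> [a]
everyOther (x : _ : rest) = x : everyOther rest
everyOther xs             = xs

downsample :: Img -> Img
downsample = everyOther . map everyOther

-- Nearest-neighbour upsampling, cropped to the size of a reference image.
upsampleLike :: Img -> Img -> Img
upsampleLike ref = take h . double . map (take w . double)
  where
    h = length ref
    w = case ref of
          (r : _) -> length r
          []      -> 0
    double :: [b] -> [b]
    double = concatMap (\v -> [v, v])

-- Level 0 is the input; each further level is blurred and halved.
-- Assumes levels >= 1.
gaussianPyramid :: Int -> Img -> [Img]
gaussianPyramid levels = take levels . iterate (downsample . blur)

-- Each band is a Gaussian level minus the upsampled next level;
-- the last entry is the coarsest Gaussian level itself.
laplacianPyramid :: Int -> Img -> [Img]
laplacianPyramid levels img = zipWith band gs (drop 1 gs) ++ [last gs]
  where
    gs = gaussianPyramid levels img
    band fine coarse = zipWith (zipWith (-)) fine (upsampleLike fine coarse)

-- Weighted sum per level: mask * a + (1 - mask) * b.
blendPyramids :: [Img] -> [Img] -> [Img] -> [Img]
blendPyramids = zipWith3 (zipWith3 (zipWith3 mix))
  where
    mix m a b = m * a + (1 - m) * b

-- Rebuild an image from a Laplacian pyramid, coarsest level first.
collapse :: [Img] -> Img
collapse = foldr1 (\band coarse -> zipWith (zipWith (+)) band (upsampleLike band coarse))

-- Full pipeline for three images of identical size.
blend :: Int -> Img -> Img -> Img -> Img
blend levels mask a b =
  collapse (blendPyramids (gaussianPyramid levels mask)
                          (laplacianPyramid levels a)
                          (laplacianPyramid levels b))
```

In this sketch, `collapse (laplacianPyramid n img)` reproduces `img` up to floating-point rounding, which is the property that lets the blended pyramid be turned back into an output image. An RGB version would apply the same steps to each channel.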

The images on the side provide an example of the two input images, the mask, and the output.

We credit the inspiration for this project to CPSC 425; our topic is based on a CPSC 425 assignment done in Python.

[Figure: Image 2]
[Figure: Mask]
[Figure: The completed image produced from image 1, image 2 and the initial mask.]

What is the something extra?

Gathering pictures from the web gives users more source material to work with. As our extra feature, the application therefore accepts an HTTP or HTTPS URL, scrapes the images found on that page, and uses them as inputs for the image blending.
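To illustrate the scraping step, here is a deliberately naive sketch of how image addresses might be pulled out of a page's HTML once it has been downloaded. It is not the project's implementation: the page does not say which HTTP or HTML libraries the tool uses, so this example does no networking at all and relies only on `Data.List`. The names `imgSources` and `afterSrc` are hypothetical, and the code assumes lower-case `<img` tags with double-quoted `src` attributes.

```haskell
import Data.List (isPrefixOf, tails)

-- Return the text that follows the first src=" inside a tag, if any.
afterSrc :: String -> Maybe String
afterSrc tag =
  case [drop 5 s | s <- tails tag, "src=\"" `isPrefixOf` s] of
    (rest : _) -> Just rest
    []         -> Nothing

-- Collect the src value of every <img ...> tag, in document order.
-- Tags without a double-quoted src attribute are skipped.
imgSources :: String -> [String]
imgSources html =
  [ takeWhile (/= '"') rest
  | t <- tails html
  , "<img" `isPrefixOf` t
  , Just rest <- [afterSrc (takeWhile (/= '>') t)]
  ]
```

A real scraper would more likely use a proper HTML parser, and would also need to resolve relative addresses against the page URL before downloading the images.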

What did we learn from doing this?

Advantages of Functional Programming

We learned that functional programming is theoretically quite suitable for image processing, where operations are usually transformations of existing images that yield new images. We never needed to modify images, except possibly as an optimization.

We used the Haskell Image Processing (HIP) and JuicyPixels libraries, which let us easily read and display various image formats.

Overall, the Haskell implementation was far more concise than the Python one. Because the image processing operations we used are grounded in mathematical formulas, Haskell could express them very naturally, though this could change if the transformations become more complex. Haskell abstractions were also highly applicable; for example, nearly all operations (elementwise arithmetic, convolution, etc.) could be expressed as a form of map over pixels. Lastly, because images were immutable, we found debugging to be much easier.

Efficiency and Time

Despite these benefits, a major downside to using Haskell was the difficulty of achieving an efficient implementation.

Our initial design, a custom Image data structure backed by a function, could easily express operations such as elementwise addition and convolution by defining new functions.

-- Image: width height (row -> col -> channel -> value)
data Image = Image Int Int (Int -> Int -> Int -> Double)
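As an illustration of how operations can be written against this representation, the sketch below defines elementwise addition and a small convolution purely by wrapping the underlying function. The helper names (`addImages`, `boxBlur`, `pixelAt`) and the choice of a 3×3 box filter with clamped borders are assumptions made for the example, not taken from the project.

```haskell
-- Image: width height (row -> col -> channel -> value)
data Image = Image Int Int (Int -> Int -> Int -> Double)

-- Elementwise addition: a new function that calls both inputs.
-- Assumes the two images have the same dimensions.
addImages :: Image -> Image -> Image
addImages (Image w h f) (Image _ _ g) =
  Image w h (\r c ch -> f r c ch + g r c ch)

-- 3x3 box filter: each output pixel averages nine input pixels,
-- clamping coordinates at the image border.
boxBlur :: Image -> Image
boxBlur (Image w h f) = Image w h blurred
  where
    clamp lo hi = max lo . min hi
    blurred r c ch =
      sum [ f (clamp 0 (h - 1) (r + dr)) (clamp 0 (w - 1) (c + dc)) ch
          | dr <- [-1, 0, 1], dc <- [-1, 0, 1] ] / 9

-- Look up a single pixel value.
pixelAt :: Image -> Int -> Int -> Int -> Double
pixelAt (Image _ _ f) = f
```

Nothing is computed until a pixel is requested through `pixelAt`; each operation only builds a new function on top of the previous ones.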

However, without any optimization, this design produced long chains of deferred evaluation, and runtime grew exponentially with the number of pyramid levels, making the program too slow to run in a practical timeframe.
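One plausible way to see where exponential growth can come from is a simple cost model. It is an assumption for illustration, not a measurement of the project: suppose every filter uses a k × k kernel and nothing is cached, so each output pixel re-evaluates k × k pixels of the image beneath it. Stacking n such filters then costs (k × k)^n evaluations of the original image per output pixel. The function name below is hypothetical.

```haskell
-- Hypothetical cost model: number of base-image evaluations needed for
-- ONE output pixel after stacking several uncached k-by-k filters.
baseEvaluations :: Integer -> Int -> Integer
baseEvaluations kernelWidth stackedFilters =
  (kernelWidth * kernelWidth) ^ stackedFilters
```

Under this model, three stacked 3 × 3 filters already need 729 base evaluations per output pixel, and three stacked 5 × 5 filters need 15625.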

Our final design used an Image data structure backed by an optimized array, as provided by HIP. Though runtime improved greatly, our relatively high-level implementation was still much slower than the Python version. In particular, blending time rose sharply with more iterations, a higher sigma value, or a larger image.
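The benefit of an array-backed image can be sketched without HIP. The example below swaps in `Data.Array` from GHC's `array` package purely for illustration; it is not how HIP stores images, and the helper names (`materialize`, `fromArray`, `cache`) are hypothetical. The idea is that pixel values are stored once and later lookups read the stored value instead of re-running a chain of functions.

```haskell
import Data.Array (Array, bounds, listArray, (!))

-- Image: width height (row -> col -> channel -> value)
data Image = Image Int Int (Int -> Int -> Int -> Double)

-- Store every pixel of a function-backed image in an array.
-- Data.Array is lazy in its elements, so each pixel is computed
-- at most once, the first time it is read.
materialize :: Int -> Image -> Array (Int, Int, Int) Double
materialize channels (Image w h f) =
  listArray ((0, 0, 0), (h - 1, w - 1, channels - 1))
    [ f r c ch | r <- [0 .. h - 1], c <- [0 .. w - 1], ch <- [0 .. channels - 1] ]

-- Wrap an array back into the function-based interface.
fromArray :: Array (Int, Int, Int) Double -> Image
fromArray arr = Image (maxC + 1) (maxR + 1) (\r c ch -> arr ! (r, c, ch))
  where
    (_, (maxR, maxC, _)) = bounds arr

-- Cut an evaluation chain: later lookups hit the array, not the old function.
cache :: Int -> Image -> Image
cache channels = fromArray . materialize channels
```

Calling `cache` between pyramid levels would keep the cost of each level independent of how many operations were stacked before it, at the price of holding the intermediate images in memory.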

Timings for the Python and Haskell versions, measured on the same computer, show the gap: across these three image sets, the Haskell version was roughly 270 to 460 times slower.

Image set                      | Parameters                  | Real time (Python) | Real time (Haskell)
-------------------------------|-----------------------------|--------------------|--------------------
Tomato/Apple (200 × 165)       | Sigma = 2.0, Iterations = 3 | 0.0646 sec         | 23 sec
Blue Cup/Green Cup (384 × 256) | Sigma = 2.0, Iterations = 3 | 0.238 sec          | 64 sec
Orchid/Violet (512 × 341)      | Sigma = 2.0, Iterations = 3 | 0.248 sec          | 113 sec

While we are certain that techniques exist to optimize the code, our experience trying to understand the HIP library's low-level modules suggests that they have a much steeper learning curve than Python's ready-to-use libraries.

Overall Evaluation

Using Haskell for this task let us express the image transformations in a form close to their mathematical definitions. While there was an initial learning curve with tools like Cabal and HIP, the payoff is code with fewer side effects. We conclude that Haskell may suit image processing applications whose developers are experienced and want highly expressive code with few bugs. However, users and programmers without a strong low-level or Haskell background may not want to deal with the slower runtime and the challenge of optimization, and may be better off sticking to Python or another language with well-established high-level libraries.

Link to Repository

https://github.com/zifgu/cpsc312-project-1