Monday, January 28, 2008

Procedural modeling readings: texture generation

Here are some papers on texture synthesis. Please post your reactions as comments on this post.

L.-Y. Wei and M. Levoy. (2000). Fast texture synthesis using tree-structured vector quantization. Proc. ACM SIGGRAPH, 479–488.

V. Kwatra, A. Schoedl, I. Essa, G. Turk, and A. Bobick. (2003). Graphcut textures: Image and video synthesis using graph cuts. Proc. ACM SIGGRAPH, 277–286.



Unknown said...

Wei and Levoy:
This was one of the papers that really began the recent push for texture synthesis techniques. It presents a very simple and easy-to-understand way of doing synthesis, then moves from single-resolution to multi-resolution synthesis using image pyramids. The pyramids are a little harder to understand but make a large difference in the results. The basic idea is to create several versions of the to-be-synthesized texture, halving the height and width each time, and then synthesize in coarse-to-fine order, with each synthesized level feeding into the next finer pyramid level. The power here is that it captures both large-scale structure and fine detail. On my first read of this paper I didn't really appreciate the power of using pyramids, but it makes a huge difference: single-resolution synthesis only captures fine details and often produces bad results.

The other thing of note is the dependency on a causal neighborhood (meaning that, other than at the very start, the neighborhood is always composed of previously synthesized pixels). This poses two problems: in single-resolution synthesis it limits the way synthesis can proceed (basically forcing a raster or spiral path through the texture), and it over-emphasizes the influence of the initial noise (the first few rows are determined by the noise, and those rows in turn determine the rest of the texture). These problems crop up in every pixel-based method I've seen; only the newer non-pixel methods have solved them. The TSVQ portion of the paper, while extremely useful, isn't as interesting to me because optimization isn't typically my thing.
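To make the causal-neighborhood idea concrete, here is a minimal single-resolution sketch in Python. This is my own illustration, not the paper's code: the function names are mine, and the real method adds the image pyramids and TSVQ-accelerated neighborhood search discussed above. Each output pixel is chosen by matching its L-shaped neighborhood of already-synthesized pixels against every candidate neighborhood in the sample.

```python
import numpy as np

def causal_neighborhood(img, y, x, half=2):
    """Collect the L-shaped causal neighborhood around (y, x): the rows
    above, plus the pixels to the left on the current row. Out-of-bounds
    positions are padded with zeros."""
    vals = []
    for dy in range(-half, 1):
        for dx in range(-half, half + 1):
            if dy == 0 and dx >= 0:
                break  # stop before the current pixel: causal only
            yy, xx = y + dy, x + dx
            if 0 <= yy < img.shape[0] and 0 <= xx < img.shape[1]:
                vals.append(img[yy, xx])
            else:
                vals.append(0.0)
    return np.array(vals)

def synthesize(sample, out_shape, half=2, seed=0):
    """Single-resolution synthesis in raster order: each output pixel is
    copied from the sample pixel whose causal neighborhood matches best.
    The output starts as noise, which is why the first rows depend so
    strongly on the initialization."""
    rng = np.random.default_rng(seed)
    out = rng.random(out_shape)
    H, W = sample.shape
    # Precompute candidate neighborhoods from the sample interior.
    cands = [(y, x, causal_neighborhood(sample, y, x, half))
             for y in range(half, H) for x in range(half, W - half)]
    for y in range(out_shape[0]):
        for x in range(out_shape[1]):
            n = causal_neighborhood(out, y, x, half)
            best = min(cands, key=lambda c: np.sum((c[2] - n) ** 2))
            out[y, x] = sample[best[0], best[1]]
    return out
```

Brute-force search over `cands` is exactly what TSVQ accelerates in the paper; the sketch keeps it exhaustive for clarity.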

Kwatra et al.:
This is one of the more impressive papers I've read on synthesis. Both this and "Image Quilting" by Efros and Freeman get away from pixel-based synthesis by stitching patches along minimum-cost seams (though Efros computes the seam in a slightly different way, with dynamic programming rather than a graph cut). Patches of the sample texture are placed and then stitched together using graph cuts. It's a very clever idea and was a giant leap in performance at the time. One important part of the process that isn't discussed very much is error correction. They use two different methods here, and several others exist. What I find interesting is that modern methods, while definitely improving, all seem to add an error-correction step. I am wondering when someone will come up with a method that does not require error correction; it seems like the ideas that make it work could be incorporated into the synthesis algorithm itself.
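For a feel of what seam-based stitching does, here is a sketch of the simpler variant mentioned above: the dynamic-programming minimum-error boundary cut from Image Quilting. This is my own illustration, not either paper's code, and it is the DP seam rather than the graph cut — Kwatra et al.'s formulation generalizes this to arbitrary seam shapes and to refining old seams.

```python
import numpy as np

def min_error_seam(overlap_a, overlap_b):
    """Dynamic-programming minimum-error vertical seam through the overlap
    region of two patches (the Image Quilting boundary cut; graph cuts
    generalize this to arbitrary seam topologies)."""
    err = (overlap_a - overlap_b) ** 2
    H, W = err.shape
    cost = err.copy()
    # Accumulate minimal path cost top-to-bottom, allowing +/-1 column steps.
    for y in range(1, H):
        for x in range(W):
            lo, hi = max(0, x - 1), min(W, x + 2)
            cost[y, x] += cost[y - 1, lo:hi].min()
    # Backtrack the cheapest path from the bottom row.
    seam = [int(np.argmin(cost[-1]))]
    for y in range(H - 2, -1, -1):
        x = seam[-1]
        lo, hi = max(0, x - 1), min(W, x + 2)
        seam.append(lo + int(np.argmin(cost[y, lo:hi])))
    return seam[::-1]  # seam[y] = last column taken from patch A in row y

def stitch(overlap_a, overlap_b):
    """Composite the overlap: pixels up to the seam come from A, the rest
    from B, so the transition follows the lowest-error boundary."""
    seam = min_error_seam(overlap_a, overlap_b)
    out = overlap_b.copy()
    for y, sx in enumerate(seam):
        out[y, :sx + 1] = overlap_a[y, :sx + 1]
    return out
```

The seam naturally runs through wherever the two patches already agree, which is why the stitched result hides the transition so well.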

Stuart Heinrich said...

Graph cut synthesis

I thought the graph cut texture synthesis was pretty cool.
This is something that would probably be fun to implement sometime.
Their method for patching together image segments seemed very good.
However, I think their work could have benefited from additional focus on more optimal or randomized patch-placement algorithms. It is true that adding rotation and translation enlarges the search space, but brute force over that space is not necessary. For example, they could do Monte Carlo sampling over possible patch placements and then use a form of gradient descent to find the best local registration. This is not very difficult to do, and it seems to me it would have overcome some of their limitations. When it comes down to it, though, their results were still inspiring, so I consider the paper a success. This method of texture synthesis is clearly a compromise: the textures will look better, but they will have more repeating features compared to a more random process (like the Wei paper).
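The suggested placement search might look something like the following sketch. This is my own code, not part of the paper: `overlap_cost` is a hypothetical SSD score over the patch footprint, and a greedy hill-climb over integer offsets stands in for the gradient descent mentioned above.

```python
import numpy as np

def overlap_cost(target, patch, ty, tx):
    """SSD between the patch and the target region it would cover
    (the whole footprint, for simplicity)."""
    h, w = patch.shape
    region = target[ty:ty + h, tx:tx + w]
    return float(np.sum((region - patch) ** 2))

def place_patch(target, patch, n_samples=50, seed=0):
    """Monte Carlo placement search followed by a greedy local refinement
    (a hill-climb standing in for gradient descent on the registration)."""
    rng = np.random.default_rng(seed)
    H, W = target.shape
    h, w = patch.shape
    # 1. Monte Carlo: sample random translations, keep the cheapest.
    best = min(((rng.integers(0, H - h + 1), rng.integers(0, W - w + 1))
                for _ in range(n_samples)),
               key=lambda p: overlap_cost(target, patch, *p))
    # 2. Local refinement: step to a strictly cheaper neighboring offset
    #    until no neighbor improves.
    improved = True
    while improved:
        improved = False
        ty, tx = best
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = ty + dy, tx + dx
            if (0 <= ny <= H - h and 0 <= nx <= W - w and
                    overlap_cost(target, patch, ny, nx)
                    < overlap_cost(target, patch, *best)):
                best, improved = (ny, nx), True
    return best
```

A real version would refine over rotation and sub-pixel translation as well, which is where the gradient-descent framing pays off.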

I am suspicious that using a raster scan order for this problem has weakened the quality of the output. Since they are doing Monte Carlo sampling, I would expect them to get better results by sampling in random order. Of course this would slow convergence, but that's just how it is. It would also allow for non-causal neighborhoods — i.e., Gaussian neighborhoods could be used, which are the "correct" neighborhoods for this kind of application. I like the results on this one, too.
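A Gaussian neighborhood in this sense just means weighting the per-pixel differences of a full (non-causal) square window by distance from its center, so nearby pixels matter more than distant ones. A minimal sketch (my own illustration, with hypothetical function names):

```python
import numpy as np

def gaussian_kernel(half, sigma=1.0):
    """(2*half+1) x (2*half+1) Gaussian weights, normalized to sum to 1."""
    ax = np.arange(-half, half + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def weighted_distance(nbhd_a, nbhd_b, kernel):
    """Gaussian-weighted SSD between two full square neighborhoods:
    differences near the center pixel count more than those at the edges."""
    return float(np.sum(kernel * (nbhd_a - nbhd_b) ** 2))
```

Using full windows like this requires pixels on all sides to already have values, which is exactly why it pairs with random-order or iterative update schemes rather than a single raster pass.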