Tuesday, March 31, 2009

Reading a Whole File the Simple Way

Sometimes you want to read an entire file at once, rather than line by line or into a fixed-size buffer. Here is one way to get the job done:

#include <fstream>
#include <sstream>
#include <string>

std::ifstream file("myFile", std::ios_base::binary);
if(file) {
    // Stream the whole file buffer into the string stream in one shot
    std::ostringstream buffer;
    buffer << file.rdbuf();
    file.close();

    std::string data = buffer.str();
}

It's that simple: the file's content gets copied into the ostringstream.
The same code can be used to copy a file; you only need to replace the ostringstream with an ofstream.
Note that the code above aims for brevity and is not necessarily an efficient implementation.
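
For completeness, here is a minimal sketch of the file-copying variant just described (the file names are placeholders):

#include <fstream>

int main()
{
    std::ifstream src("myFile", std::ios_base::binary);
    std::ofstream dst("myFileCopy", std::ios_base::binary);

    // Stream the whole source buffer into the destination file
    if(src && dst)
        dst << src.rdbuf();
}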

Thursday, March 19, 2009

Recommended Visual Studio Add-in - RockScroll

This add-in transforms the otherwise unremarkable scroll bar into a preview of the entire file.

Also, when you double-click to select a word, every occurrence of it is automatically highlighted.



Download RockScroll and install it now!

Friday, March 6, 2009

Accumulative Screen Space Ambient Occlusion

This is my attempt to combine the Real-Time Reprojection Cache with Screen Space Ambient Occlusion. With such a caching scheme, the spatio-temporal coherence of the SSAO algorithm can be exploited. You can download the demo with shader source here.


[Video demo]
The name "Accumulative SSAO" comes from the fact that the occlusion value is accumulated and averaged over a number of frames. The algorithm itself is quite independent of how the occlusion is calculated and here I will assume the reader is familiar with SSAO implementation such as those from Crysis and Startcraft II.

The pipeline

For every frame,

  1. The scene is rendered using a deferred shading technique, producing the color, normal and depth buffers.
  2. A number of random vectors are generated on the CPU (whereas in the usual SSAO these vectors are generated only once at program start-up).
  3. The normal and depth buffers are then used to calculate the occlusion value in the SSAO pass.
  4. Instead of writing the occlusion value to the final output, it is combined with the previous frame's accumulated occlusion value and then written to a second accumulation buffer.
  5. A blur pass can optionally be applied to the most recently updated accumulated occlusion buffer.
  6. The color buffer is then combined with the occlusion value to produce the final result, and the two accumulation buffers are swapped with each other (a skeleton of this loop is sketched below).
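
To make the flow concrete, here is a minimal C++ skeleton of this per-frame loop. The render-pass functions and the RenderTarget type are hypothetical placeholders, not the demo's actual API:

#include <utility>  // std::swap

struct RenderTarget {};  // stands in for a GPU texture / FBO attachment

// Hypothetical render-pass stubs; a real renderer would issue GPU work here.
void renderGBuffer() {}
void generateRandomVectors() {}
void ssaoPass(const RenderTarget&, RenderTarget&) {}
void blurPass(RenderTarget&) {}
void composite(const RenderTarget&) {}

// One frame of the accumulative SSAO pipeline described above.
void renderFrame(RenderTarget& accumLast, RenderTarget& accumCurrent)
{
    renderGBuffer();                    // 1. produce the color, normal and depth buffers
    generateRandomVectors();            // 2. a fresh set of random vectors every frame
    ssaoPass(accumLast, accumCurrent);  // 3-4. compute AO, blend with last frame's accumulation
    blurPass(accumCurrent);             // 5. optional blur on the newest accumulation buffer
    composite(accumCurrent);            // 6. modulate the color buffer by the occlusion
    std::swap(accumLast, accumCurrent); //    ...and swap the two accumulation buffers
}
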
Re-projection

The re-projection happens in the SSAO pass when it tries to access the previous frame's occlusion value. Having the eye-space 3D position of each pixel, we can transform it into a texture coordinate using a matrix (with a perspective division afterward); let's call it the delta matrix. This matrix is calculated on the CPU as:
bias = translation(0.5, 0.5, 0.5) * scale(0.5, 0.5, 0.5)
deltaMatrix = bias * lastFrameProjection * lastFrameView * currentFrameView.inverse()
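
As a concrete illustration, the delta matrix could be built on the CPU as follows. GLM is used here purely as an example math library; it is not necessarily what the demo uses:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

glm::mat4 makeDeltaMatrix(const glm::mat4& lastFrameProjection,
                          const glm::mat4& lastFrameView,
                          const glm::mat4& currentFrameView)
{
    // Maps clip space [-1, 1] to texture space [0, 1] after the w divide.
    const glm::mat4 bias =
        glm::translate(glm::mat4(1.0f), glm::vec3(0.5f)) *
        glm::scale(glm::mat4(1.0f), glm::vec3(0.5f));

    return bias * lastFrameProjection * lastFrameView *
           glm::inverse(currentFrameView);
}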

In simple words, for each pixel in the current frame, we are trying to locate its corresponding pixel coordinate in the last frame. If there is no camera movement, the two coordinates should be the same.
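
In code, the lookup amounts to one matrix transform plus the perspective division. In the demo this runs in the SSAO fragment shader; it is sketched here as CPU-side C++ (again using GLM) for illustration:

#include <glm/glm.hpp>

// Returns the texture coordinate at which the last frame's accumulation
// buffer should be sampled for a current-frame eye-space position.
glm::vec2 reprojectToLastFrame(const glm::mat4& deltaMatrix,
                               const glm::vec3& eyeSpacePos)
{
    glm::vec4 clip = deltaMatrix * glm::vec4(eyeSpacePos, 1.0f);
    return glm::vec2(clip) / clip.w;  // perspective division -> [0, 1] uv
}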

Accumulative AO

With the re-projection working, the current frame's occlusion value can be combined with the previous one using the following accumulation formula:
currentAo = currentAo / 30.0 + lastFrameAo * 29.0 / 30.0;

To make the above equation do something interesting, the current occlusion value should not be the same as the previous one. Therefore, a new set of sampling positions should be generated for each frame; this can be done by re-generating the random unit-sphere samples or the dithering texture every frame (see the sketch below). In a loose sense, this is doing a Monte Carlo integration over the time domain. To achieve better visual quality, more frames should be accumulated over time.
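
For instance, the per-frame unit-sphere samples could be regenerated as below; the distribution details here are illustrative assumptions rather than the demo's exact scheme:

#include <array>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Generates 'count' random samples inside the unit sphere. Calling this once
// per frame makes every frame contribute a different AO estimate to the
// temporal average.
std::vector<std::array<float, 3>> generateSphereSamples(std::size_t count,
                                                        std::mt19937& rng)
{
    std::normal_distribution<float> gauss(0.0f, 1.0f);
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
    std::vector<std::array<float, 3>> samples(count);

    for (auto& s : samples) {
        // A normalized Gaussian vector is uniform on the sphere's surface;
        // the cube-root radius spreads points evenly through its volume.
        float x = gauss(rng), y = gauss(rng), z = gauss(rng);
        float len = std::sqrt(x * x + y * y + z * z);
        float r = std::cbrt(uniform(rng));
        s = { r * x / len, r * y / len, r * z / len };
    }
    return samples;
}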

As each frame's AO value also depends on the last few frames, there will be some time delay before the AO becomes up-to-date in a dynamic scene. However, by changing the numerator and denominator in the equation, the trade-off between quality and responsiveness can be adjusted.
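
As a sketch, the same blend with the frame count exposed as a tunable parameter might look like this (30 being the value used above):

// Exponential moving average form of the accumulation formula.
// A larger frameCount gives smoother, higher-quality AO; a smaller one
// responds faster to dynamic scenes.
float accumulateAo(float currentAo, float lastFrameAo, float frameCount)
{
    return currentAo / frameCount +
           lastFrameAo * (frameCount - 1.0f) / frameCount;
}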

Cache-miss consideration

Up to this point, the cache-miss problem of the re-projection has not yet been addressed. A cache miss happens when a part of the scene that could not be seen before becomes visible now, due to camera or object movement. Such a cache miss can be detected by comparing the current pixel's depth value with its re-projected counterpart: if the two values differ by more than a certain threshold, a cache miss is detected. To do this, the last frame's depth value is needed. Instead of using a separate texture to store the last frame's depth value, the depth can be encoded and stored together with the accumulated AO value in a single 32-bit texture.
// Encode a float value into 3 bytes
// The input value should be in the range [0, 1)
// Reference: http://www.ozone3d.net/blogs/lab/?p=113
vec3 packFloatToVec3i(const float value)
{
 const vec3 bitSh = vec3(256.0 * 256.0, 256.0, 1.0);
 const vec3 bitMsk = vec3(0.0, 1.0/256.0, 1.0/256.0);
 vec3 res = fract(value * bitSh);
 res -= res.xxy * bitMsk;
 return res;
}
float unpackFloatFromVec3i(const vec3 value)
{
 const vec3 bitSh = vec3(1.0/(256.0*256.0), 1.0/256.0, 1.0);
 return dot(value, bitSh);
}

If there is a cache miss, the accumulated AO value is discarded and the instantaneous AO value is used instead. Of course, more samples can be taken in that frame to reduce the visual impact of the cache miss.
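
Putting the depth test and the fallback together, the per-pixel decision might be sketched as follows; the depth threshold is an assumed tuning value, and in the demo this logic lives in the SSAO shader:

#include <cmath>

float resolveAo(float instantAo, float accumulatedAo,
                float currentDepth, float reprojectedDepth,
                float depthThreshold = 0.001f)
{
    // Depths differing by more than the threshold mean the re-projected
    // history belongs to a different surface: discard it (cache miss).
    if (std::fabs(currentDepth - reprojectedDepth) > depthThreshold)
        return instantAo;

    return instantAo / 30.0f + accumulatedAo * 29.0f / 30.0f;
}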

Discussion/improvements

  • Currently, a new independent set of random samples was generated every frame for the above video demo. Other methods of generating samples over time may reduce the noise.
  • As some re-projection cache schemes suggest, a cached value should be cleared after a certain period of time to avoid instability and to respond better to a dynamic environment; here this is handled by the accumulation formula itself.
  • To reduce cache misses due to object movement, each object's last transformation matrix can also be incorporated into the algorithm.
  • The depth encoding scheme also makes the blur pass much more efficient.

Conclusion

The algorithm described here provides a new way to improve the quality and efficiency of traditional SSAO by using the results from a number of frames instead of just one. It also opens up more parameters and sampling patterns to explore.