Component-Based Design Links

Filed under:General — posted by Brian on June 10, 2008 @ 9:15 pm

Master's research is finished, but there are still things to learn. This site never updated much, but I'll be trying harder to keep it going with my various personal projects.

Component-based design has been my latest hot topic. Here are some useful links I've collected (or written):

Component Based Game Architectures - GDNet Post
Outboard component-based entity system architecture - GDNet post
Object Oriented Game Programming: The Behavior System
301: Introduction to Dungeon Siege Architecture
Various Bilas presentations
Some stuff written by Jeremy Chatalaine
The Animal Farm Creations: A Question of Component-Based Design

GLSL Flow Control

Filed under:GPGPU, Rendering, General — posted by Brian on February 11, 2008 @ 1:00 am

The nVidia GeForce 8 Documentation explicitly says that loops should now support run-time condition testing, which to my mind means that non-const variables should be permissible in the condition test. Either way, I should be able to create an infinite loop and have a statement inside which breaks out of the loop once a counter variable reaches a certain, dynamically calculated number. And barring that, I should be able to create a for-loop that executes some max number of iterations, but also has a statement inside which breaks out once the counter variable reaches a certain number.

Of course, none of the above works on my 8800 GTX. I speculated that there was some loop unrolling going on that might be throwing the system off, but a quick look at the generated assembly (using NVemulate, handy little program that it is) indicated otherwise. The last example I gave above is especially weird: I can create a for-loop that executes a max number of iterations and put a break condition inside, but the number I put in as the max number of iterations matters and is very, very touchy, despite that the underlying assembly doesn't change at all. Sometimes I can use 70 just fine, sometimes I can only go up to 10, which is highly restrictive.

This isn't an app-killer; I've got my system basically working and working well, but for some of my requirements the restrictions might get in my way.

OpenGL Debugging

Filed under:GPGPU, Rendering, General — posted by Brian on February 9, 2008 @ 9:58 pm

Alas, the majority of nVidia's profiling/debugging tools only work with DirectX, so I am either forced to insert code myself or find different tools.

GLIntercept looked promising, but it hasn't been updated in a little over two years. There's no logging of geometry shader info in there, which I could really use. It also appears to be mostly non-interactive, preferring to output results to log files (though it does have some very interesting plugins that allow you to dynamically edit shaders or fly through a scene).

glslDevil is much cooler, though very temperamental. It gives dynamic information, allows stepping through GL calls, allows stepping through shaders (very cool), allows putting watches on shader variables (cooler), can show/save the framebuffer, and do a whole lot more. It also gives geometry shader information. It does, however, crash a lot; it crashes when my research program compiles a shader, which is, uh, unfortunate.

GPU Random Number Generation

Filed under:GPGPU, Procedural Content Generation, General — posted by admin on February 8, 2008 @ 12:34 pm

It confuses me a bit why the GPU doesn't have a decent random number generator. Procedural textures have been around for quite some time, and moving their generation to a shader makes sense. There is a noise function in GLSL which doesn't work reliably (everything I've read indicates it always returns a 0 on nVidia cards). I needed some kind of random number generation for my building generation, so I had to write my own.

The Mersenne Twister was the first to catch my attention, but after looking at a few implementations, it doesn't look like a natural fit for a GPU. It's fast, which is good, but it has a few bits of the algorithm (using an array for a state vector, really large numbers that would cause an error on the GPU) that made it prohibitive. If I knew the algorithm well enough, perhaps I could overcome the limitations, but I really don't have any experience in this area.

Someone on GameDev.net recommended Simplex Noise, which resembles Perlin noise. This algorithm seemed fairly involved though, and I don't need Perlin noise - random noise works fine for my purposes. Were I randomly generating textures, this would be something to look into.

I landed on a Linear Congruential Generator, which seems to be the simplest type of generator. It's fast, in my case only requiring a single multiplication and modulus operator. The distribution it generates is reasonable; it could be better, but I'm not doing cryptography here. I had to search a bit on the internet for numbers that worked better (the numbers in the Wikipedia article don't play nice with the GPU), but now I'm pretty much set.

Seeding is slightly awkward on the GPU, since I'm doing some ping-ponging and there's no global memory to speak of. Basically I send in a seed with my first bit of data. Then I generate a few numbers and output a new seed with a building component by adding some component-specific attributes to the current seed. Since those are floats, the inaccuracy may be costing me a degree of determinism. Alternatively, it would probably be more random and more deterministic (odd little combination there) if the new seed I output were just the sum of a few random integers, but that would be slower and I'm not sure that it's necessary for my purposes.

GLSL Lessons Learned

Filed under:GPGPU, Rendering, Procedural Content Generation, General — posted by Brian on February 5, 2008 @ 7:08 pm

So apparently GLSL is very touchy, and instead of giving reasonable error messages or even reporting linking errors, it will choose to simply not work. Sometimes this will be a cryptic failure to link with a meaningless OpenGL error; sometimes the shader will just hang indefinitely. Here are some things I've learned:

Just because you have some varyings and they're used all throughout your code does not mean they're active when you link the shader program. This is problematic when you have to get the location of those varyings so that you can bind them to transform feedback. nVidia is kind enough to provide an extension to force varying variables to be active upon linking. Be sure to use that.

If you attempt to access an invalid array index with a constant, it's a linking error, not a runtime error.

Using variables in loop-terminating conditions is, at best, problematic. If you're very, very lucky, you won't have any problems. What's more likely to happen is that your shader will hang indefinitely. For that scenario, a work-around exists where you use a large (not too large, mind) constant instead of a variable, and then break out of the loop when the loop counter reaches the variable.

It's nice that logs exist, but in my experience, they only report hard syntax errors or blatantly invalid statements (invalid implicit casts, for instance). Subtle nuances like the one above are unreported.

These aren't the only problems, but they're the most prevalent ones currently.

Ping Pong

Filed under:GPGPU, Procedural Content Generation, General — posted by Brian on January 27, 2008 @ 10:11 pm

For complex GPGPU stuff and even with normal graphics rendering, it's sometimes required to "ping pong" input and output repeatedly to achieve a desired effect. The output in one pass becomes the input in the next, and the output of that next pass becomes the input to the next, and so on. I just got this process working using vertex buffers and transform feedback, which will lead naturally into my building generation:

Fill a vertex buffer object (we'll call it "send") with some initial information that you want expanded upon. In the case of building generation, this initial information will be the initial lot.

Turn on transform feedback and bind it to a different vertex buffer ("receive"). Transform feedback is used to take geometry data after it's gone through the vertex and geometry processing and place that data back in a client-side buffer. For our purposes, the vertex shader becomes a simple pass-through shader (it doesn't transform the vertices), and the geometry shader is where the spawning of new information (building components) takes place.

Draw the send buffer such that the results of the geometry shader end up in the receive buffer.

Repeat the above steps N times, where N is the total number of passes you need over the data. When dealing with a split grammar, N might just be the number of rules if no rules go 'backward' in the grammar. However, there's no real guarantee of this - I've written grammars myself that go backward. For the time being, I'm going to fix N at 2 * numRules.

Here are some considerations:

Why am I fixing N? Because I haven't thought of a better way to tell if processing of the entire building has finished. I have a hunch I might be able to pull something off here by doing some trickery with the fragment shader, but I'm not certain yet.

Is this going to be faster than doing everything on the CPU? I have no clue. Geometry shaders aren't fully mature yet. When they do mature, this *should* be faster, but I'm not sure by how much yet.

Should I only process a single component in one shader pass, or should I run through multiple steps? Here I run into the risk of exceeding the maximum outputs of a geometry shader, but I think I can safely run through two or three steps in one pass without that being a problem. Whether this will help or hurt speed, I don't know yet.

The next step is writing the code-generator that takes a grammar and turns it into GLSL code. I'm convinced I can do this in a relatively straight-forward fashion, with only a few awkward places (basically anything involving strings). I do, however, need to find a good (and fast) noise function to generate random numbers. I've heard simplex noise is good for this, but I haven't found a lot of information on it yet.

“Koru Diagrams” and some results

Filed under:Data Mining, Which, General — posted by Zach on January 25, 2008 @ 2:24 pm

All of my diligent research seems to have finally paid off. I now have some results worthy of a Master's Thesis. Dr. Menzies and I had been staring at these results for a long time but thinking they were bad. The reason for this is because we did not know how to interpret what we were looking at. A little bit before Christmas Dr. Menzies suggested what he calls the Koru Diagram. Below is an example of the Koru Diagram and an explanation of it follows the image.

Example Koru Diagram using KC2 data

The X-axis of a Koru Diagram is a percentage of lines of code explored so far. The Y-axis is the percentage of the total defects found in the file. The red line, labeled Oracle in the key, shows the perfect rule. That it is only finds defects. Its output is sorted by module size, literally C functions or C++ methods, depending on the codebase used, and then plotted in this diagram. The Oracle represents the best possible detector, that is it illustrates the least amount of effort required to find all of the defects. Since the items in the data file that each detector says is defective are sorted in ascending order, no line will ever exceed the Oracle line. The X-axis, Y-axis, sort order, and Oracle make up the Koru Diagram. The other lines on the graph show how well detectors generated by the different learners perform. The fact that Launam performs better than Manual in most cases suggests that there is a disproportionate amount of defects in small modules, which is the reason we decided to graph the results using a Koru Diagram in the first place.

A key thing to notice here are the two lines labeled "manual" and "launam." What they really are are manual methods of searching, there is no detector associated with them. Manual simply sorts ALL modules in a file in ascending order and inspects them, whereas Launam sorts them in descending order and inspects them that way.

Another key thing to notice is the yellow line that is which2. which2 is a machine learner that is which with 2 equal frequency binning. That is we have a bottom 50% of the numbers and the top 50% of the numbers and we convert them to bin1 and bin2. In almost all of the cases, which2 is the best version of which. I am not entirely sure what to say about this. I do not know what the fact that 2 bins means. Also notice how nearly perfect which2 performs.

As a final note, a plateau in a curve means the detector said things were defective that were not. A line that has a lot of horizontal segments equates to a detector that has a very high probability of false alarm.

This results were completely typical in all 10 data sets I tested on. I have the results of a 10x3 cross validation, that is divide each file into 3 parts. Each one of those parts contains 66% train data and the detector learned is tested on the remaining 33%. This is repeated 10 times. So 10 data sets times 10 repeats times 3 folds is a total of 300 numbers for each detector. This gives us a very large amount of empirical data to show our results are valid. The results can be found here. The file is a csv file. Opening it in your browser might hurt. An analysis of Mann-Whitney and Quartile Charts was performed on that 3000+ line csv file. Here are the Mann-Whitney results for KC2, the above graph.

       #key, ties, win, loss, win-loss @ 99%
     which2,    0,  10,    0,             10
   manualUp,    0,   9,    1,              8
     which4,    1,   7,    2,              5
     nBayes,    1,   7,    2,              5
 manualDown,    1,   5,    4,              1
       jRip,    3,   3,    4,             -1
     which8,    2,   3,    5,             -2
        j48,    2,   3,    5,             -2
  which8loc,    2,   0,    8,             -8
  which4loc,    2,   0,    8,             -8
  which2loc,    2,   0,    8,             -8

This shows that which won 10 times and lost 0 times and tied 0 times out of 10 repeats. In other words, it was always the winner. All of the Mann-Whitney and Quartile Chart results can be found here.

From Grammars to Shaders

Filed under:GPGPU, Rendering, Procedural Content Generation, General — posted by Brian on January 24, 2008 @ 8:43 am

So here's the current problem I'm working on:

There's no good way to send the rules across into the shader dynamically, so I'm nixing that. Instead, I'm writing a translator that takes the rules (as already parsed by Lex/Yacc) and converts them directly into shader code. The rules fit mostly neatly into a series of if-then statements, with little problems when variables or functions are encountered (these problems don't seem heavy, but enough to make the translator a little awkward in places).

After that, it becomes a problem of ping-ponging the generated building components back and forth in such a way that I don't waste time processing components that have already reached terminal status. I thought using multiple transform feedback targets would work - sending the finished components to a different buffer than the unfinished - but there's a limit on the amount of information I can feedback in one shot such that I can't get out all the information needed for this. The nasty alternative seems to be searching through the feedback buffer for terminals and shifting them out, but I can't imagine that being in any way fast. If only fragment shaders allowed a greater number of outputs (since processing any component might result in 10+ other components, a fragment shader won't be able to work here). Fragment shaders are more mature than geometry shaders, and I'm afraid that the geometry shader might not provide a significant speed boost even if I get my current problems worked out.

The Animal Farm Research Site Started!

Filed under:General — posted by Brian on September 9, 2007 @ 7:49 pm

First post for The Animal Farm Creations's research site!

Basically, Zach and I wanted a place to talk about our research without cluttering up the main site, so we'll be posting all research related stuff here. The main site will still probably have academia stuff there from time to time, but the bulk of it will be located here from now on.

Stay tuned for some wicked-cool updates.



image: detail of installation by Bronwyn Lace