Use Cases

The following use cases help define how the main engine must be modularized, and some of the functionality that it will need to provide.

Storyboard

Comic book format, sketch quality only. Artists need to be able to add visual elements not part of the scene itself, which describe the visual language of film. Need to match storyboard frames with lines in the script, and see overviews of entire scenes. Desirable to ease conversion to Animatic. Uses a small portion of the budget of a production, but may take a fair percentage of the time if many iterations are required before main production begins.

Animatic

Storyboard with visual language elements converted to actual animation, and synchronized to a preliminary soundtrack. Various simple 2D and 3D animation techniques must be easy to specify and synchronize. In 2D, often uses storyboard frames or cutouts from frames; in 3D, uses extremely simplified stand-ins for environment, objects, and actors.

Render to Film

Final output form is digital film, at medium to very high resolution with a fixed (and low) frame rate, for later display on a huge screen. Utter realism is expected. Extremely complex environments are commonplace. Artist-guided physical simulation is necessary to handle complex materials and interactions without astronomical artistic workload. Repeatability must be guaranteed. Offline rendering is acceptable (but no more than "overnight"), and significant pre-render optimization work may pay off. Artists desire interactive rendering as close to final as possible, to improve artistic feedback loop. Content known in advance, and heavy tweaking to fit artistic and directorial vision is expected. Budgets range from ten million to several hundred million dollars.

Render to Video

The final output form will be digital video, at low to medium resolution with a fixed (medium) frame rate, for later display in a home theater environment. Requirements are generally similar to, but simpler than, those of Render to Film. Overnight rendering is often considered too slow, except possibly for the final render. Budgets are generally considerably smaller than film budgets, perhaps 1% as large.

Stream for Broadcast

Special effects and graphical elements are added on the fly to a slightly delayed broadcast, with resolution and frame rate as per Render to Video. Often used for sports and news reporting. Rendering must work in real time. Artists require tools that allow them to select and customize chosen elements in just a few seconds. Optimization based on properties of elements in toolbox is encouraged, but on-the-fly optimization is limited by allowable turnaround time and realtime rendering requirements. Budgets likely even smaller than standard video rendering.

Machinima

Similar to Render to Video, but scripted and rendered entirely within a Computer Game engine. Currently limited to independent films, with attendant extreme budget limits.

Computer Game

Many rendering styles, including cel-rendered, sketched, anime-influenced, hyper-realistic, old film, and so on. Pre-rendered intros and cut scenes can be considered a special case of Render to Video, in which video file size is at a premium and the video is likely to be scaled at odd ratios to fit the player's chosen resolution. Game engine intros and cut scenes can instead be considered Machinima. In either case, these precreated scenes are limited to using only a fraction of the budget of the computer game as a whole, but have the advantage of being amenable to considerable optimization, as well as using much heavier resources to create than can be applied on the user's system in real time.

Resolution and frame rate independence are expected, as well as the ability to gracefully degrade and improve quality along many axes to fit the desires of the user and available system resources. Different genres tend to have vastly different optimization profiles; strategy games require highest available resolution while allowing low frame rates, preferring instead to spend system resources on AI, pathfinding, and so on. Conversely, shooter games depend on very high frame rates to improve responsiveness, and often spend more non-rendering resources on physics simulation than on AI.

Budgets were long relatively small, and still sometimes are in true independent shops, but are now ballooning past Render to Video into Render to Film territory.

Unlike all previous use cases -- except, in a limited way, Stream for Broadcast -- computer games must deal with interactivity. As a consequence, there is a constantly varying line between creator- and user-determined activity. In some games, the entire setting may be changed by the user, prohibiting certain valuable techniques that depend on knowing the content in advance.

Simulation Inputs

It is valuable to move as much control of the virtual world as possible outside the renderer, preferably into one or more dedicated modules. Aside from all of the usual benefits of a modular architecture, this makes it much easier to maintain a consistent shared reality in multiuser systems. The stronger this modularization, the more exact the consistency and "fairness" of the system.

Ideally, the rendering engine would be purely a viewer client for a "world server", which controls every aspect of the virtual world. Unfortunately, the bandwidth, memory, and processing power necessary to make this pure design workable are currently out of reach of the average consumer (and according to various improvement curves, unlikely to ever reach the consumer). While this design can be used for high budget and completely non-interactive purposes, such as final film rendering and scientific simulation, a different approach must be used for interactive situations.

One possible method for mitigating the downsides of the pure architecture is to allow the server to speak to clients in a sort of shorthand; the server merely supplies parameters that clients can extrapolate to determine the current world state. This can be seen as an extremely specialized compression method, compressing all of the information that the server knows and the client needs into a vastly shorter format that the client then "decompresses" for use by the renderer.

Nearly all multiuser games do this, to varying degrees. For example, first person shooter games usually operate within a few premade settings, and the server can merely tell the client at the beginning of a match the name (or even just number!) of the current setting; the client is then expected to load all relevant setting data from this single detail. During play, the server merely relates position, orientation, velocity, and attributes for each moving object, along with miscellaneous other bits of shared state, for each simulation step.
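The "decompression" step on the client can be sketched as simple extrapolation from the last per-object record the server sent. This is only a minimal illustration, assuming constant velocity between updates; the record layout and the names ObjectUpdate and extrapolate_position are hypothetical, not part of any existing protocol.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ObjectUpdate:
    """One per-object record from a server simulation step (hypothetical layout)."""
    object_id: int
    sim_time: float   # server time of this snapshot, in seconds
    position: Vec3
    velocity: Vec3    # world units per second


def extrapolate_position(update: ObjectUpdate, now: float) -> Vec3:
    """Estimate the object's position at time `now`.

    Assumes the velocity has stayed constant since the snapshot; a real
    client would also blend toward the next update when it arrives.
    """
    dt = now - update.sim_time
    x, y, z = (p + v * dt for p, v in zip(update.position, update.velocity))
    return (x, y, z)
```

Between server steps the renderer would call extrapolate_position with its own clock, so the server never has to send a position for every rendered frame.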

Unfortunately, to further hide bandwidth and latency issues that would severely affect interactivity, the server must provide more information about the world state than the client should normally know, such as the exact state of currently hidden enemies. Cheaters use this information to give themselves uncanny perception and targeting accuracy.

Another issue is the question of what state must be shared at all: is it actually important that two clients see mist swirl in exactly the same way at the same moment? (The answer to this may depend on whether the swirling mist interacts with the rest of the world; if the mist is mere decoration, consistency is probably not necessary. On the other hand, if sudden swirling in previously calm mist is a warning of an approaching enemy, consistency would be much more important. This is even more true if the swirling actually reflects the position and movements of the enemy.)

The following are some thoughts on various attributes of a hybrid client-server architecture intended to be most things to most people.

Static Content and Setting

Several methods for transmitting static content and details of the game setting might make sense:

Ship
Every client and server includes a copy of all static game content. Makes for huge installs, but the server and client can refer to this shared data with a very compact protocol.
Download
The server may be extended to handle new settings by downloading new content; clients can be similarly extended by downloading matching content either from the server or directly from the same download location the server used. Downloading from the server has the advantage that the server can be sure that the client has matching data, but has the disadvantage of using server bandwidth probably better spent on clients that have already downloaded the new settings and are engaged in a current match.
Stream
During normal play, the server streams new content to the clients in a low-priority data stream. Once all clients have been updated, the new settings are "enabled" on the server, allowing them to be selected for play. For very large settings, not all content may be streamed to all clients, but merely that which each client is likely to view in the near future.
Generate
Client and server include libraries that can generate complete, detailed settings from simple parameters and/or partial data. Some variation of the Download and Stream methods is used to update the server and clients with new parameters, and then generate the new settings either in background or batch mode.
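Whichever method is used, the client has to decide whether its local copy of a setting matches what the server is about to reference. One possible check is sketched below. The use of a SHA-256 digest is an assumption made for illustration, as are the names content_digest and resolve_setting; the text above does not prescribe any particular verification scheme.

```python
import hashlib
from typing import Dict


def content_digest(data: bytes) -> str:
    """Fingerprint a blob of static content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def resolve_setting(setting_id: int, expected_digest: str,
                    local_content: Dict[int, bytes]) -> str:
    """Decide how the client obtains a setting named by the server.

    Returns "ready" when the installed copy matches the server's digest,
    and "download" when the content is missing or differs.
    """
    data = local_content.get(setting_id)
    if data is not None and content_digest(data) == expected_digest:
        return "ready"
    return "download"
```

With a check like this, the server's match announcement can stay as small as a setting number plus a digest, and only clients that answer "download" need any further content traffic.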

Wake-Up State

When a client has not been connected to the server for the entire lifetime of the simulation, the newly connected client must be updated with deltas against the static content (or most recent shared state) to get back into sync with the server before play can begin. Depending on the importance of the data, and whether it can materially affect gameplay, this may be batch downloaded before the connection is allowed to complete, or streamed to the client during the first few minutes of play.
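A minimal sketch of this wake-up step follows, assuming world state can be modeled as a flat key-value mapping. The functions split_delta and apply_delta and the is_game_affecting predicate are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple


def split_delta(changed: Dict[str, object],
                is_game_affecting: Callable[[str], bool]
                ) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Split a delta into a blocking part and a part that may be streamed later."""
    blocking = {k: v for k, v in changed.items() if is_game_affecting(k)}
    deferred = {k: v for k, v in changed.items() if not is_game_affecting(k)}
    return blocking, deferred


def apply_delta(baseline: Dict[str, object],
                changed: Dict[str, object],
                removed: Iterable[str] = ()) -> Dict[str, object]:
    """Return a new state: the baseline with changes applied and removals dropped."""
    state = dict(baseline)      # never mutate the shared static content
    state.update(changed)
    for key in removed:
        state.pop(key, None)
    return state
```

The blocking part would be applied before the connection completes; the deferred part can be applied with the same apply_delta call whenever it arrives.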

Dynamic Data

Dynamic data can be broken into several different classes:

Shared, Game-Affecting
This must be owned exclusively by the server, and pushed to each client as needed at highest priority. To hide latency, jitter, and packet loss, the protocol will need to include time and error correction.
Shared, Non-Critical
Again, this is owned exclusively by the server, but the data can be handled at lower priority, with little or no need for time correcting protocols.
Peer to Peer
This type of data, typified by taunts during action games, would typically be delivered on a best effort, low priority basis.
Unshared
Some data is useful only for presentation, and has no real effect on the simulated environment or other players. For example, in a combat game an explosion's timing, location, and blast radius would be critical shared data, but details of the explosion's on-screen rendering would not; the latter could safely be generated and consumed entirely on the client.
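The classes above can be mapped onto a prioritized outbound queue, sketched below. The names DataClass and OutboundQueue are hypothetical, and ranking Peer to Peer below Shared, Non-Critical is an assumption; the descriptions above only say that both are low priority. Unshared data never enters the queue at all.

```python
import heapq
from enum import IntEnum
from itertools import count


class DataClass(IntEnum):
    """Lower value means higher send priority (ordering of the last two is assumed)."""
    SHARED_GAME_AFFECTING = 0
    SHARED_NON_CRITICAL = 1
    PEER_TO_PEER = 2


class OutboundQueue:
    """Priority queue that keeps first-in, first-out order within each class."""

    def __init__(self):
        self._heap = []
        self._order = count()   # tie-breaker so payloads are never compared

    def push(self, data_class: DataClass, payload) -> None:
        heapq.heappush(self._heap, (int(data_class), next(self._order), payload))

    def pop(self):
        """Remove and return the most urgent payload."""
        _, _, payload = heapq.heappop(self._heap)
        return payload

    def __len__(self) -> int:
        return len(self._heap)
```

A sender draining this queue each network tick would always transmit game-affecting state first, and could simply drop Peer to Peer entries when bandwidth runs out.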

Generated Data

It's very useful if extrapolated and procedurally generated data has certain properties:

Repeatability
Given the same parameters, the generator should produce exactly the same output. This can only occur if the generator has no implicit parameters, such as the state of a global PRNG. If it is desirable that the output vary slightly from run to run, the generator should take a seed parameter that calling code is free to set as desired. If the seed requires certain randomness properties (for example, not being simply an upcounter), the generator should manually enforce this internally so that callers are free to be lazy.
Predictability
The generator parameters (other than the seed, if any) should have meaning, and tweaking the parameters should have predictable effects. For example, a world generator might have parameters such as "mean mass", "distance from sun", and so on.
Conciseness
The generator should be able to produce very detailed output data from just a few parameters. This necessarily implies a certain level of complexity in the generator code, but see Partial Evaluation.
Subset Generation
The problem with a concise generator is that even if the system only needs a small portion of the generator output, it has to run a complex procedure that produces vast quantities of output, which then must be filtered to find the desired data. This problem goes away if the generator can be designed to produce chosen subsets of the full output without significant wasted effort.
Partial Evaluation
Rather than always generating the final output data directly from base parameters, partial evaluation uses a multi-stage approach. Each stage expands its input parameters to a larger data set, ready for the next stage. This allows a classic time and space tradeoff between generation and caching or transmission. This is especially powerful when combined with Subset Generation in one or more of the final stages (so that all stages beyond a certain point support that feature).
Artist Control
Very concise generators often have the downside of producing something that is almost but not quite what the artist desired. Some generators allow the artist to tweak the generation, either by accepting manual overrides for certain outputs, or by allowing some internally generated intermediate data to be specified as an explicit input. Partial Evaluation generators are especially amenable to the latter approach when the inputs and outputs of each stage are exposed to artist control (rather than hidden behind an opaque parameter such as "evaluation depth").
Pluggability
An especially powerful form of Artist Control allows the internals of the generator to be modified. This could be a matter of subclassing the generator, or could involve hooks between each stage of a Partial Evaluation, allowing the data to be modified programmatically, or could even allow entire portions of the algorithm to be replaced.
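Several of these properties can be illustrated together in a small two-stage terrain generator. This is a sketch under stated assumptions, not a proposed design: the names _mix, RegionParams, expand_world, and generate_tile are hypothetical, the "terrain" is just random heights, and hashing the seed with SHA-256 is one assumed way to let callers pass lazy seeds such as simple counters.

```python
import hashlib
import random
from dataclasses import dataclass
from typing import List


def _mix(*parts: int) -> int:
    """Scramble integers into a 64-bit seed, so callers may pass simple counters."""
    text = ":".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(text).digest()[:8], "big")


@dataclass(frozen=True)
class RegionParams:
    """Stage 1 output: intermediate data that can be cached, sent, or overridden."""
    seed: int
    base_height: float
    roughness: float


def expand_world(seed: int, mean_height: float, roughness: float,
                 regions: int) -> List[RegionParams]:
    """Stage 1 (Partial Evaluation): expand world parameters into region parameters."""
    result = []
    for index in range(regions):
        rng = random.Random(_mix(seed, index))   # local PRNG, no global state
        result.append(RegionParams(
            seed=_mix(seed, index, 1),
            base_height=mean_height + rng.uniform(-roughness, roughness),
            roughness=roughness,
        ))
    return result


def generate_tile(region: RegionParams, tile_x: int, tile_y: int,
                  size: int = 4) -> List[List[float]]:
    """Stage 2 (Subset Generation): build one tile without generating any other."""
    rng = random.Random(_mix(region.seed, tile_x, tile_y))
    return [[region.base_height + rng.uniform(-region.roughness, region.roughness)
             for _ in range(size)]
            for _ in range(size)]
```

Because every PRNG is seeded from explicit parameters, repeated calls with the same arguments give the same output (Repeatability). Because RegionParams is exposed between the stages, an artist could replace a region's base_height before calling generate_tile, which is the second form of Artist Control described above.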