The Perfect Voxel Engine

More Is Better, Right?

What is it about voxels that makes people go crazy? Throughout the past decade, there have been SO many developers obsessed with shrinking them down to have as many as possible (1, 2, 3, 4, 5), admittedly including myself. It’s exciting both as a developer and to the gaming community to see amazing algorithms produce so much detail and think about the possibilities it brings to virtual worlds.

Voxels = destruction!
Voxels = building!
Voxels = the universe!

And yet, there are no commercially available, general-purpose, widely used voxel engines on which games are built. The term “(micro) voxel engine” is basically synonymous with vaporware. We see jaw-dropping showcases that are sometimes accompanied by hyperbolic claims (“UNLIMITED DETAIL”) and then radio silence. But why?

Defining the Problem

A lot of voxel developers base their algorithms on their rendering capabilities. If you can rasterize (dual contouring), ray cast (Voxlap), splat/frustum trace (Euclideon), sphere march (SDFs/isosurfaces), ray trace (sparse voxel octrees), or somehow get billions of voxels from memory to pixels on your screen, then the most important part is done, right?

Let’s briefly consider what additional capabilities a voxel engine should provide in order to create a game:

  • Lighting
  • State serialization and synchronization across a network
  • Physics/collision detection
  • AI features
  • Dynamic objects

These are all systems that operate on the existing data. This isn’t even including the building and modification of the game world. To create anything resembling reality, the world usually also needs:

  • Terrain with interesting features
  • Trees
  • Vegetation
  • Water
  • Artificial structures

Perhaps most importantly, people tend to expect or look forward to voxel engines with the following features:

  • Creating anything
  • Destroying everything
  • Procedural generation
  • Physical interactions on a per-voxel level
  • Alternate simulations (anything else that interacts with voxel data, e.g. vegetation growth, door responses, “logic” like Minecraft’s redstone and pistons, and so on)
  • Voxel characters

And these are really just core features. While not strictly engine-related, there’s still gameplay to integrate that provides the user with a fulfilling way to experience what the engine can do. There’s also art direction, sound design, atmosphere and tone, and other architecture details like system compatibility, performance, codebase management, scalability, the developer experience, mod support, and more. It’s important to think about these things up front because the engine’s design has to leave room for all of them.

It’s All About the Data

While writing an awesome voxel renderer is certainly no easy feat, we can see that it’s only a small part in the grand scheme of systems that need to work together. We need to choose a voxel format that can be rendered efficiently, but how can we make sure it works for everything else as well?

Sparse voxel octrees might be able to hold a couple billion voxels worth of data that we can cast primary and shadow rays against, but how well do they work for collision detection? Global illumination? Path finding? Adding new per-voxel attributes besides just albedo and normals? Dynamic objects?

It’s important to answer these questions ahead of time because most of the systems we build need to incorporate the format into their designs. As it turns out for sparse voxel octrees, storage and rendering are the only things they are acceptable (not even great) at. Moreover, if systems are built on top of the general volume design and that design changes, then a lot of time is going to end up spent updating the entire codebase to work with it.

Modular By Design

In the previous post I talked about how an engine can be designed around ECS principles to enable flexibility and embrace expandability in its systems. This post is going to continue that philosophy to solve the voxel format problem.

The solution is actually rather obvious: to use whatever voxel format is best for the job! This means not having one or two, but as many as are necessary.

Actually, the credit for this idea comes from graphics programming – we see it successfully used with 3D models, both with file formats (obj, ply, fbx, etc.) and with shaders! Game engines are able to use a library like assimp to import basically any model into a common format, and then convert that format into whatever the GPU needs to rasterize the triangles.

Going beyond just graphics programming, many physics and ray tracing libraries are built around triangle data (vertices and indices). That is the core common “raw” format which makes up the physical geometry. Supplementary attributes can be stored alongside this data, like texture coordinates for use in the fragment shader or material identifiers for looking up friction coefficients on the physics side.

A Thought Experiment

Imagine you’re writing a ray tracing API that accepts a list of vertices making up a triangle mesh, which then builds an acceleration structure and bakes the vertex attributes into the leaf nodes. Your input looks like this:

struct vertex_t
{
    vec3 position;
    vec3 normal;
};
acceleration_structure_t* build_structure(vertex_t* vertices, uint32_t num_vertices) { ... }

Suppose one day you wanted to include colors with the vertices as well. Would the correct solution be:

  • A) Add vec3 color; to the vertex structure, modify the build_structure code to account for this new data, and adjust all relevant code to look for the colors at the end of each vertex.
  • B) Adjust build_structure to accept any data type and have the user specify where in this data the position attribute is via offset and stride, and then switch to baking vertex indexes into the acceleration structure instead so code that cares to use the colors can be adjusted.

If you guessed B, you’d be correct! The answer was rather obvious in this silly experiment, but imagine replacing Vertex with Voxel and suddenly with answer A you’ve described Efficient Sparse Voxel Octrees!
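Here is a rough sketch of what option B could look like. Everything beyond vec3 and build_structure is an assumption made for illustration: position_layout_t, colored_vertex_t, and the flat "acceleration structure" (just two arrays) are hypothetical stand-ins, not a real ray tracing API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct vec3 { float x, y, z; };

// Hypothetical: tells the builder where positions live inside opaque vertex data.
struct position_layout_t
{
    std::size_t offset; // byte offset of the position within one vertex
    std::size_t stride; // byte distance between consecutive vertices
};

// Stand-in for a real acceleration structure. It keeps positions plus the
// index of the vertex each one came from, and never sees other attributes.
struct acceleration_structure_t
{
    std::vector<vec3> positions;
    std::vector<std::uint32_t> vertex_indices;
};

acceleration_structure_t build_structure(const void* vertices,
                                         std::uint32_t num_vertices,
                                         position_layout_t layout)
{
    acceleration_structure_t result;
    const auto* bytes = static_cast<const unsigned char*>(vertices);
    for (std::uint32_t i = 0; i < num_vertices; ++i)
    {
        vec3 position;
        std::memcpy(&position, bytes + i * layout.stride + layout.offset, sizeof(vec3));
        result.positions.push_back(position);
        result.vertex_indices.push_back(i); // bake the index, not the attributes
    }
    return result;
}

// A vertex type the builder knows nothing about; color was added later.
struct colored_vertex_t
{
    vec3 color;
    vec3 position;
    vec3 normal;
};
```

Code that cares about colors looks them up through the baked index (vertices[index].color), so adding another attribute later never touches build_structure.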

Now of course, baking in attributes is an implementation detail and follow-up works that attach attributes to DAGs use an index-based approach, but that’s hardly the problem here.

Voxels in the traditional sense use explicit data sampled in an implicit space, meaning the geometry is based on how you assign values in a grid. This means that in many cases, we won’t have a vec3 position; with which to build our ray tracing API, but instead, we’ll have a grid of attributes! Here’s a more realistic example.

struct voxel_grid_t
{
    uint8_t material_ids[16 * 16 * 16];
};

Well that’s all fine and dandy, but at some point we’re going to need more than material identifiers. With small single-colored voxels, we actually need unique per-voxel attributes in addition to a material identifier, like albedo and normal. For gameplay, maybe we want to have procedural vegetation, and “redstone” logic simulated on the voxel scale. Are we going to store that data everywhere?

struct dumb_voxel_grid_t
{
    // Must be a compile-time constant to be usable as an array bound.
    static constexpr size_t size = 16 * 16 * 16;
    uint8_t material_ids[size];
    vec3 albedo[size];
    vec3 normal[size];
    uint8_t vegetation_type[size];
    uint8_t vegetation_state[size];
    uint8_t redstone_strength[size];
};

That would be, as the struct name suggests, dumb. Memory issues aside, how is this supposed to scale up? Sure, we could just have generic data tacked on that’s identified by the material, but that’s wasteful and limited. What if modders want to add voxel data of their own? What if we want to bitcrush the attributes? What if some of the geometry we render is procedural, like fire? What data belongs on the GPU and what doesn’t? What gets serialized? More importantly, what even defines the geometry here?

There’s A Point, I Promise

We’ve done a lot of talking about what the problems are and not a whole lot of proactive thinking to address them. However, I believe understanding the problem is more important than coming up with the solution itself.

I’m going to digress for a moment. When Steve Jobs revealed the iPhone in 2007, he made some remarks about the physical buttons on the smartphones of the day:

“And what happens when you think of a great idea six months from now? You can’t run around and add a button to these things – they’re already shipped! So what do you do? It doesn’t work because the buttons and the controls can’t change. They can’t change for each application, and they can’t change down the road if you think of another great idea you want to add to this product.”

Steve Jobs (2007)

This sounds an awful lot like the problems that we have with our voxel designs but with buttons instead of data. And no, I’m not just trying to make some clichéd remarks. We need a solution that anticipates the developers’ needs and is going to allow us to dynamically build specially optimized systems.

Back To the Format Discussion

Like the OO-ECS concept, my main goal was to design everything in such a way that no system was closed-ended, and that developers could always iterate on or expand the engine without any codebase overhauls.

We’ve seen that the main problems with voxel engines come down to how the data is laid out and processed, and what happens when that layout isn’t optimal for a given job. We’ve also discussed how traditional graphics handles this problem with meshes: by having a common, bare-minimum format that can be converted to whatever other format is needed.

Now it’s time to apply these ideas to voxels. As I previously mentioned, the answer is simply to use whatever format is best for the job. But when working with volumes, it’s not that easy. Even with different formats, we can’t ever just store raw data for the entire world because of memory usage. Conversely, we can’t always work with compressed data because compression takes place after the raw data has been filled out.

Lastly there’s the need to dynamically add attributes. We don’t want to store vegetation growth state deep underground in some cave where vegetation isn’t growing anyway. As part of “future-proofing” the engine, developers need to be able to add any type of attribute they want without worrying about how it fits with the rest of the engine.

In summary, what we need is a way of working with raw/common data when we need it, adding arbitrary per-voxel data only where we want it, and compressing the data when we’re done.
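To make the "only where we want it" part concrete, here is a minimal sketch, assuming a volume is just a bag of named attribute buffers. sparse_volume_t and its methods are hypothetical and only illustrate the idea; they are not the engine's actual types.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical volume that stores only the attributes it has been given.
struct sparse_volume_t
{
    static constexpr std::size_t voxel_count = 16 * 16 * 16;

    // Attribute name -> raw bytes covering every voxel in this volume.
    std::unordered_map<std::string, std::vector<std::uint8_t>> attributes;

    // Allocates zeroed storage for one attribute, in this volume only.
    std::vector<std::uint8_t>& add_attribute(const std::string& name,
                                             std::size_t bytes_per_voxel)
    {
        std::vector<std::uint8_t>& buffer = attributes[name];
        buffer.assign(voxel_count * bytes_per_voxel, 0);
        return buffer;
    }

    bool has_attribute(const std::string& name) const
    {
        return attributes.find(name) != attributes.end();
    }

    // Total bytes held by all attributes of this volume.
    std::size_t memory_in_bytes() const
    {
        std::size_t total = 0;
        for (const auto& entry : attributes)
            total += entry.second.size();
        return total;
    }
};
```

A cave volume that only calls add_attribute("material_id", 1) pays for one byte per voxel, while a surface volume that also adds "vegetation_state" pays for two. Systems that need an attribute check has_attribute first instead of assuming it exists everywhere.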

Three Tools – Allocation, Tagging, Conversion

We can achieve all of the above through a general volume pipeline, described as Allocation, Tagging, and Conversion. I’ll talk about each one below, and give an example of how they’re used to convert some offline terrain data blocks into a common format.

Allocation

Some voxel data needs to exist on the GPU. Some needs to be written to the hard drive. Some may only need to exist for a short time while we work on it. The allocation stage is responsible for creating a buffer that voxel data can be written to and dealing with it afterwards. It isn’t the developer’s responsibility to manage any of the lower-level tasks associated with this allocation besides returning it to the allocator when finished.

auto allocator = ((volume_allocator_context_t*)context_t::m_engine->get_context("volume_allocator_context"))->get_volume_allocator_from_name("cpu_recycled");
volume_data_t* dest_volume = allocator->new_volume();

The cpu_recycled allocator provides the buffer for a short time while we work on it. If we wanted the data to be sent to the GPU, we would specify that allocator instead. But the allocator ends up being generalized and can be swapped out based on our needs.
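As a sketch of how named, swappable allocators might be structured: the interface below borrows new_volume and return_volume from the snippet above, but the class names, the registry, and the recycling strategy (a simple free list) are assumptions for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct volume_data_t { std::vector<std::uint8_t> bytes; };

// Hypothetical common interface: callers only ever borrow and return volumes.
class volume_allocator_t
{
public:
    virtual ~volume_allocator_t() = default;
    virtual volume_data_t* new_volume() = 0;
    virtual void return_volume(volume_data_t* volume) = 0;
};

// One possible "cpu_recycled" strategy: returned volumes go on a free list
// and are handed out again instead of being freed.
class recycling_allocator_t : public volume_allocator_t
{
public:
    volume_data_t* new_volume() override
    {
        if (!m_free.empty())
        {
            volume_data_t* volume = m_free.back();
            m_free.pop_back();
            return volume;
        }
        m_owned.push_back(std::make_unique<volume_data_t>());
        return m_owned.back().get();
    }

    void return_volume(volume_data_t* volume) override
    {
        volume->bytes.clear(); // drop contents, keep the object for reuse
        m_free.push_back(volume);
    }

private:
    std::vector<std::unique_ptr<volume_data_t>> m_owned;
    std::vector<volume_data_t*> m_free;
};

// Hypothetical name -> allocator lookup, mirroring get_volume_allocator_from_name.
class allocator_registry_t
{
public:
    void add(const std::string& name, std::unique_ptr<volume_allocator_t> allocator)
    {
        m_allocators[name] = std::move(allocator);
    }

    volume_allocator_t* get(const std::string& name) const
    {
        auto it = m_allocators.find(name);
        return it == m_allocators.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::string, std::unique_ptr<volume_allocator_t>> m_allocators;
};
```

Because callers depend only on volume_allocator_t, a GPU-backed allocator could be registered under another name and used with the same two calls.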

Tagging

We next assign attributes and tie voxel data to the allocations. Attributes consist of a name, size, data type, and optional data pointer.

// Albedo
dest->header.attributes[0].bits_per_element = sizeof(albedo_t) * 8u;
dest->header.attributes[0].type = volume_attribute_types::to_enum<albedo_t>::value;
dest->header.attributes[0].total_size_in_bytes = aabb_volume * sizeof(albedo_t);
dest->header.attributes[0].set_name("albedo");

// Normal
dest->header.attributes[1].bits_per_element = sizeof(normal_t) * 8u;
dest->header.attributes[1].type = volume_attribute_types::to_enum<normal_t>::value;
dest->header.attributes[1].total_size_in_bytes = aabb_volume * sizeof(normal_t);
dest->header.attributes[1].set_name("normal");

There’s some template magic going on here for albedo_t and normal_t in order to allow us to change data types on the fly.
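One way that to_enum trait could be written is sketched below. The two enum names come from this post; VOLUME_ATTRIBUTE_TYPE_CUSTOM, the attribute_t struct, the local vec3/u8vec4 stand-ins, and describe_attribute are assumptions made to keep the example self-contained.

```cpp
#include <cstddef>
#include <cstdint>

enum volume_attribute_type_t
{
    VOLUME_ATTRIBUTE_TYPE_CUSTOM, // assumed name for the "custom" label
    VOLUME_ATTRIBUTE_TYPE_VEC3,
    VOLUME_ATTRIBUTE_TYPE_U8VEC4
};

// Local stand-ins for the math types so the sketch needs no external library.
struct vec3 { float x, y, z; };
struct u8vec4 { std::uint8_t x, y, z, w; };

namespace volume_attribute_types
{
    // Primary template: any type the engine doesn't know is tagged custom.
    template <typename T>
    struct to_enum
    {
        static constexpr volume_attribute_type_t value = VOLUME_ATTRIBUTE_TYPE_CUSTOM;
    };

    template <>
    struct to_enum<vec3>
    {
        static constexpr volume_attribute_type_t value = VOLUME_ATTRIBUTE_TYPE_VEC3;
    };

    template <>
    struct to_enum<u8vec4>
    {
        static constexpr volume_attribute_type_t value = VOLUME_ATTRIBUTE_TYPE_U8VEC4;
    };
}

// Hypothetical, trimmed-down attribute header entry.
struct attribute_t
{
    std::uint32_t bits_per_element;
    volume_attribute_type_t type;
    std::size_t total_size_in_bytes;
};

// Fills in an attribute entry from nothing but the element type.
template <typename T>
attribute_t describe_attribute(std::size_t voxel_count)
{
    attribute_t attr;
    attr.bits_per_element = static_cast<std::uint32_t>(sizeof(T) * 8u);
    attr.type = volume_attribute_types::to_enum<T>::value;
    attr.total_size_in_bytes = voxel_count * sizeof(T);
    return attr;
}
```

With this in place, changing albedo_t from u8vec4 to another type changes the tag, the bit count, and the buffer size together, with no other edits.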

As mentioned, attributes contain a data type (which mirrors GLSL’s primitive types, or is labeled “custom”), but the engine framework preloads some common types. For albedo and normal, this looks like:

// Albedo
{
	attr.set_name("albedo");
	attr.bits_per_element = sizeof(u8vec4) * 8u;
	attr.type = VOLUME_ATTRIBUTE_TYPE_U8VEC4;
	add_attribute(&attr, "albedo");
}

// Normal
{
	attr.set_name("normal");
	attr.bits_per_element = sizeof(vec3) * 8u;
	attr.type = VOLUME_ATTRIBUTE_TYPE_VEC3;
	add_attribute(&attr, "normal");
}

The pseudo-reflection part happens here:

if (m_albedo_template->type == VOLUME_ATTRIBUTE_TYPE_U8VEC4 && m_normal_template->type == VOLUME_ATTRIBUTE_TYPE_VEC3)
{
	m_standard_conversion = std::bind(&terrain_block_conversion::convert_terrain_block_to_static_standard<u8vec4, vec3>, this, _1, _2, _3);
}

Admittedly, this part is a little weak because it requires pre-coding the supported type combos. We are able to change it on the fly by changing the function pointer, but we still have to declare the template function so the compiler can actually generate the function. While I’m hoping to improve this part down the line, it works great by having compile-time conversions in place.
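The chain of if statements can also be expressed as a lookup table, which at least keeps the supported combos in one place. This is only a sketch of that alternative: the table, conversion_dispatch_t, VOLUME_ATTRIBUTE_TYPE_VEC4, and the stand-in conversion (which just reports how many bytes it would write) are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

enum volume_attribute_type_t
{
    VOLUME_ATTRIBUTE_TYPE_VEC3,
    VOLUME_ATTRIBUTE_TYPE_VEC4, // assumed extra type for the example
    VOLUME_ATTRIBUTE_TYPE_U8VEC4
};

struct vec3 { float x, y, z; };
struct vec4 { float x, y, z, w; };
struct u8vec4 { std::uint8_t x, y, z, w; };

// Stand-in for a templated conversion: reports the bytes it would write.
template <typename albedo_t, typename normal_t>
std::size_t converted_size_in_bytes(std::size_t voxel_count)
{
    return voxel_count * (sizeof(albedo_t) + sizeof(normal_t));
}

using conversion_fn_t = std::function<std::size_t(std::size_t)>;

class conversion_dispatch_t
{
public:
    conversion_dispatch_t()
    {
        // Each supported combination must still be named explicitly so the
        // compiler instantiates the template for it.
        m_table[std::make_pair(VOLUME_ATTRIBUTE_TYPE_U8VEC4, VOLUME_ATTRIBUTE_TYPE_VEC3)] =
            &converted_size_in_bytes<u8vec4, vec3>;
        m_table[std::make_pair(VOLUME_ATTRIBUTE_TYPE_VEC4, VOLUME_ATTRIBUTE_TYPE_VEC3)] =
            &converted_size_in_bytes<vec4, vec3>;
    }

    // Returns an empty std::function when the combination was never compiled in.
    conversion_fn_t select(volume_attribute_type_t albedo, volume_attribute_type_t normal) const
    {
        auto it = m_table.find(std::make_pair(albedo, normal));
        return it == m_table.end() ? conversion_fn_t() : it->second;
    }

private:
    std::map<std::pair<volume_attribute_type_t, volume_attribute_type_t>, conversion_fn_t> m_table;
};
```

This doesn't remove the limitation described above, since unsupported pairs simply come back empty, but the run-time choice becomes a single lookup.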

Conversion

At the heart of the solution is conversion, arguably the most important stage, which is responsible for converting voxel data from one format to the next.

allocator->allocate_volume(dest);

binary_writer_t albedo_writer = binary_writer_t(dest->get_data_from_index(0), dest->header.attributes[0].total_size_in_bytes);
binary_writer_t normal_writer = binary_writer_t(dest->get_data_from_index(1), dest->header.attributes[1].total_size_in_bytes);

for (int x = 0; x < aabb_size.x; x++)
{
	for (int y = 0; y < aabb_size.y; y++)
	{
		for (int z = 0; z < aabb_size.z; z++)
		{
			u8vec4 src_albedo = cell_reader.read_u8vec4();
			albedo_t dest_albedo = attribute_converter<u8vec4, albedo_t>::convert(src_albedo);
			albedo_writer.write(dest_albedo);

			vec3 src_normal = cell_reader.read_vec3();
			normal_t dest_normal = attribute_converter<vec3, normal_t>::convert(src_normal);
			normal_writer.write(dest_normal);
		}
	}
}

This starts off by calling the volume’s allocation function, which performs any behind-the-scenes allocations to prepare the data specified by the attributes.

This terrain block conversion expects there to be a 3D array of albedo and normal data made up of a u8vec4 and vec3, respectively. It then writes to the attributes using whatever type was specified for them. In this instance, the conversion operator is reading the terrain_block format, and converting to the default format, which is simply raw attributes laid out for the entire volume.

Attribute type conversion is handled here as well, but again optimized thanks to templates. By default, values are simply cast, but more often than not we want a more sophisticated conversion. For example, converting a u8vec4 to a vec4:

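template<>
class attribute_converter<glm::u8vec4, glm::vec4>
{
public:
	// Maps each 8-bit channel from [0, 255] to a float in [0, 1].
	static inline glm::vec4 convert(glm::u8vec4 input)
	{
		// Explicit construction also compiles when GLM_FORCE_EXPLICIT_CTOR is defined.
		glm::vec4 v(input);
		v *= 1.0f / 255.0f;
		return v;
	}
};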
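For context, the default "just cast it" behavior would live in the primary template that the specialization above overrides. The sketch below is a guess at its shape, using a scalar unorm8_t type (hypothetical) in place of GLM vectors so it stands alone.

```cpp
#include <cstdint>

// Assumed primary template: with no specialization, values are simply cast.
template <typename src_t, typename dest_t>
class attribute_converter
{
public:
    static inline dest_t convert(src_t input)
    {
        return static_cast<dest_t>(input);
    }
};

// Hypothetical 8-bit normalized channel, used here instead of a GLM vector.
struct unorm8_t { std::uint8_t value; };

// Specialization with real meaning: [0, 255] maps to [0, 1], which a plain
// cast could not express.
template <>
class attribute_converter<unorm8_t, float>
{
public:
    static inline float convert(unorm8_t input)
    {
        return input.value * (1.0f / 255.0f);
    }
};
```

The compiler picks the most specific match, so conversion loops can call attribute_converter<A, B>::convert unconditionally and only the pairs that need special handling get specializations.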

Finally, we just make sure to call the conversion function and clean up afterwards. The source and destination formats are specified here too:

auto fn_conversion = ((volume_conversion_context_t*)(context_t::m_engine->get_context("volume_conversion_context")))->get_conversion("terrain_cell", "default");
...
(*fn_conversion)(src_volume, dest_volume, allocator);
allocator->return_volume(dest_volume);

The usage of function pointers for these things is talked about in the last post, but I’ll mention it here as well – I want these systems to be language-agnostic. The core of the engine will be written in C++ for performance reasons, but content developers may want to whip something up in C# (which everything has bindings for). And so it’s important that that avenue, and thus by extension other languages that can call C functions and generate C function pointers, remains open.

Also, while the above example focused on a rather trivial conversion, by tagging attributes and including their formats, compressors and other conversion operators can be made to work on general data. There are some other indicators that hint how data can be compressed, such as special entries in the type enum and the bits_per_element field. When working with explicit volumes, general format conversion for the entire volume is great, and then separate volumes would be used for the special data.
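As an illustration of a compressor that works on "general data", here is a minimal run-length encoder that needs nothing but the raw bytes and the bits_per_element tag. The rle_run_t layout and the restriction to whole-byte elements are assumptions of this sketch, not the engine's compression scheme.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// One run: `count` consecutive voxels sharing the same element bytes.
struct rle_run_t
{
    std::uint32_t count;
    std::vector<std::uint8_t> element;
};

// Run-length encodes any attribute buffer using only its tagged element size.
// Sketch limitation: returns no runs unless bits_per_element is a non-zero
// multiple of 8.
std::vector<rle_run_t> rle_compress(const std::uint8_t* data,
                                    std::size_t total_size_in_bytes,
                                    std::uint32_t bits_per_element)
{
    std::vector<rle_run_t> runs;
    const std::size_t element_size = bits_per_element / 8u;
    if (element_size == 0 || bits_per_element % 8u != 0)
        return runs;

    for (std::size_t offset = 0; offset + element_size <= total_size_in_bytes; offset += element_size)
    {
        const std::uint8_t* element = data + offset;
        if (!runs.empty() && std::memcmp(runs.back().element.data(), element, element_size) == 0)
            ++runs.back().count; // same value as the previous voxel: extend the run
        else
            runs.push_back({1u, std::vector<std::uint8_t>(element, element + element_size)});
    }
    return runs;
}
```

The same function handles albedo, normals, or a modder's custom attribute, because it never needs to know what the bytes mean.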

What Data Conversion Has To Do With Anything

Data conversion has everything to do with anything. By building systems around formats that can be swapped out at run time or exchanged/changed in the future, your codebase is safe from painful refactors. Experimenting with a new format means just writing a conversion from the old format to the new. Maybe you want to forgo any conversion overhead in a specific system – so extend that system to be compatible with your compressed data.

The power of the system goes beyond just saving your codebase, too. Conversion operators can also be written for:

  • Importing meshes and voxelizing them
  • Importing Minecraft maps to their detailed voxel counterparts
  • Converting CSG operations to voxels (AKA a building system)
  • Importing old versions of game maps
  • Creating more effective compressors
  • Generating collision data for physics processing
  • Voxelizing procedurally generated terrain
  • Converting compressed or raw data to an offline or network-ready format
  • Generating voxel vegetation by converting “seed data” into voxels

So as you can see, conversion operators aren’t just for rearranging data; they’re more like black boxes. Input data of a specific format and let the engine figure out and execute the link to the desired output format. It’s such a simple concept that’s more general than volume conversion, and yet it’s managed to solve our problems.
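A minimal sketch of the lookup behind that black box, assuming conversions are plain function pointers keyed by a (source format, destination format) pair. get_conversion echoes the call shown earlier; add_conversion, conversion_registry_t, copy_conversion, and the trimmed-down volume types are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct volume_data_t { std::vector<std::uint8_t> bytes; };
struct volume_allocator_t {};

// Plain function pointer, so anything that can produce a C-style function
// pointer could register a conversion.
using conversion_fn_t = void (*)(const volume_data_t* src,
                                 volume_data_t* dest,
                                 volume_allocator_t* allocator);

class conversion_registry_t
{
public:
    void add_conversion(const std::string& src_format,
                        const std::string& dest_format,
                        conversion_fn_t fn)
    {
        m_conversions[std::make_pair(src_format, dest_format)] = fn;
    }

    // Returns nullptr when no conversion links the two formats.
    conversion_fn_t get_conversion(const std::string& src_format,
                                   const std::string& dest_format) const
    {
        auto it = m_conversions.find(std::make_pair(src_format, dest_format));
        return it == m_conversions.end() ? nullptr : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, conversion_fn_t> m_conversions;
};

// Trivial example operator: copies the source bytes unchanged.
void copy_conversion(const volume_data_t* src, volume_data_t* dest, volume_allocator_t*)
{
    dest->bytes = src->bytes;
}
```

A mesh voxelizer, a Minecraft importer, or a compressor would each be one more add_conversion call; the caller's code stays the same and only the format names change.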

One Missing Key Component

Rendering: the very thing we started out discussing and are going to close on. How do we take a dynamic voxel format with dynamic attributes and expect to throw it at the GPU?

The solution here will have to wait until the next post where I take a deep dive into the rendering setup (I guess I accidentally lied on the last post). I’ll briefly cover it here though.

The following is generally true for OptiX, DXR, Embree, or Vulkan, but I’ll focus on Vulkan because it’s the greatest programming API to ever exist (more on that in the next post).

The shaders that get executed in the ray tracing pipeline depend on four things:

  • The geometry configuration in the BLAS
  • The SBT offset in the top level instance
  • The shader binding table’s data
  • The geometry that was potentially hit

By tracking the voxel formats that make up a BLAS’ geometries, we can build the SBT with specific intersection shaders that have been tailored to the format design. Callable shaders can also be bound in order to decode attributes that are required in a pipeline. The end result is a very clean and again modular way of handling different formats, and of course the engine takes care of building these – the user just has to link a format with its intersection & callable shaders.
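To show how format tracking could turn into SBT entries, here is a small API-free sketch. The index formula follows the hit shader record addressing rule in the Vulkan ray tracing specification; build_hit_group_records, the format names, and the hit group names are hypothetical, and no actual Vulkan calls are made.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Index of the hit group record used for a hit, per the Vulkan rule:
// instance offset + geometry index * ray stride + ray offset.
std::uint32_t hit_group_record_index(std::uint32_t instance_sbt_offset,
                                     std::uint32_t geometry_index,
                                     std::uint32_t sbt_record_stride,
                                     std::uint32_t sbt_record_offset)
{
    return instance_sbt_offset + geometry_index * sbt_record_stride + sbt_record_offset;
}

// Hypothetical helper: given the voxel format of each geometry in a BLAS
// (in geometry order), emit the hit group each SBT record should reference.
// std::map::at throws std::out_of_range if a format has no shaders linked.
std::vector<std::string> build_hit_group_records(
    const std::vector<std::string>& geometry_formats,
    const std::map<std::string, std::string>& hit_group_for_format)
{
    std::vector<std::string> records;
    for (const std::string& format : geometry_formats)
        records.push_back(hit_group_for_format.at(format));
    return records;
}
```

With a ray stride of 1 and a ray offset of 0, geometry i of an instance lands on record instance_sbt_offset + i, so writing the records in geometry order is enough for each geometry to run the intersection shader that matches its format.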

Closing Remarks

The “Perfect” Voxel Engine, at least to me, is an engine that lets developers just use voxels where they shine without having to deal with structures. There’s a lot that I glossed over, like how to build and deal with a large world that’s volumetric on a higher level, but I think that’s out of scope for this post. The perfect voxel engine should provide the tools on top of which bigger infrastructures can be built, which is where the game engine starts coming into play.

In the next post, we’ll (hopefully for real) dive into the rendering architecture. It’ll be all about Vulkan, which I can’t wait to talk about!

21 comments

  1. > As it turns out for sparse voxel octrees, storage and rendering are the only things they are acceptable (not even great) at.
    Stop teasing lol can’t wait to learn more about the structures that you’re using

    But yeah it’s a great write up and I can’t wait to learn more about your design!

  2. This is amazing work. I’ve worked on a voxel engine in the past using SVOs. You are absolutely right about them. They are way too static and not even great for rendering. I ran into so many issues with them. I’ve thought about using multiple data structures like you do, but never got it to work in my head. This is brilliant and I can’t wait to see more. I’ve always believed voxels are the future of computer graphics. That’s why I wanted to develop a voxel engine. But contrary to my attempt, this looks really promising. Hang in there and continue the good work!

  3. This is just amazing
    i’m still having a hard time figuring out how the hell do you figure out bending with voxels, though
    since it’s volumetric, and thus we operate in a static grid, bending that happens in your video with leaves and in (1) at the 50s mark, there must be voxels that disappear or get generated out of essentially nowhere, unless it’s continuously generated by adapting a vertex object into a voxel object, but then again, you cannot do that and still keep modifications ported to that deforming object:
    if i have a brick which i bend in a 90° curve, there is a crease in which the voxels are denser and an outer crease that’s stretched, so some voxels must disappear and some must be generated out of thin air, right?
    hell, let’s stretch it, since in your game videos we see that when a leaves moves it doesn’t translate, it’s closer to adding new pixels before it and take out some after it, what happens when there’s something in the way but the animation still goes through, does it dig into the static voxels then leaves a hole?
    let’s say we have another animation where a plank stretches and compresses along its length, if we cut a hole when it’s stretched the shape of the hole should change as it compresses, right? But that shouldn’t be possible since the animation isn’t conservative, it’s not treating the stretched item voxel per voxel and arranging them conservatively so that it has the same voxels in the same number and same relative position, because it’s limited by its resolution, so deleting a part of the object must not stay well where it’s meant to be since it has no reference, the deformable shape isn’t a voxel so a hole isn’t a lack of voxels in an area but a constant piercing operation at a specified point on a vertex model it’s referencing, but that means that for each deformable element they have to pull up their source and apply a piercing modifier which is computationally heavy.
    Something Ain’t Right.
    If this is a voxel per voxel stretch and the engine has to figure out what pixel goes where then it cannot be fast, or be undoable by not pinching it anymore, as there’s a good chance the voxels don’t go back to their former place
    i think the best visual explanation would be to take a leaf and color each of its pixels white and black in a chessboard fashion and see how it changes as it deforms with the wind, from low deformations to extreme cases where a leaf can get stretched 3 times its normal length and see the stretching effects in this to figure out what’s happening

  4. This needs support, I was left in the dust when it came to teaching students basic code as a normal thing. But I am learning slowly. I have known where all this is going and this is definitely the next step, especially since VR and holoprojectors are becoming more developed now also. Do you see where I’m going with this? These steps towards total digital immersion are going to be at the forefront of the future. Most people may think, “oh it’s just gaming”. But these simulations and generations of space are going to open up many doors for new and world changing tech and advancement. I am very interested in development myself and hope that nothing gets in your way. This is very good stuff and the coding is elegant. If we can advance the efficiency of the digital flow of information and reduce the ever irritating lag and dropped signals, then the possibilities for infinite microscopic voxel data to be experienced on anyone’s hardware are certain and without wondering if your GPU can hold up. Cloud gaming would be perfect for this engine if we can break through the weak signal barrier. Very impressive work, keep up the great work.

  5. How can I work for you, I want in on this. I’m passionate about building games but I know next to nothing about it, the world and engine you’ve created here is something I dreamed up myself. I want to help you, I want you to teach me your ways sensei. Your world is beautiful and your mind even more so. this is insane, please let me know what I can do. If you want me to leave you alone I understand but please give me some kind of feedback.

  6. John, I honestly can’t wait to see where this goes. You’re taking the industry paradigm of “Know less, do more, get paid” and flipping on its head. You’re thinking ahead and using insight- the industry needs more of that. Between this project, and your micro voxels, you have an incredibly bright future ahead of you. If there’s any way I can support you or either of these projects, please let me know.

  7. This is history in the making. Thank you for these blog posts and for taking the time to work on this amazing project!

  8. Thank you for writing about this.
    I am one who wants to develop a voxel game.
    But in no way can I be a voxel engine dev any time soon.
    It’s pretty apparent from the outset that the undertaking requires new and advanced techniques on the engine level. That’s a lot to ask.
    I hope you can get serious contributors to make something awesome. Of course I’m also a OSS/FOSS nut, but you know.
    Thank you for your work.

  9. Awesome post and thought. I miss your Twitter/YouTube updates John! The wait for the next post about rendering and Vulkan is killing me 🙂 Merry Christmas!

  10. Still waiting on that solution (next post) 😉

    Thank you for all that you’re doing!

    You might not teach for a living- and probably shouldn’t- because I would pay full tuition for just one class with you.

  11. Hello, are you still working on the project? There were no updates since this post and many people are wondering, what happened to this project? It looked absolutely amazing and I personally thought it would be the future of sandboxes! Thank you.

  12. Hello John! I wanted to reach out because I would hate myself if I didn’t. I want to make it clear that I’m not just trying to jump on the band wagon of your success. I have been working on a voxel project of my own for almost a year now, and I don’t plan on stopping any time soon, just because I’m so passionate about it (you can see my YouTube from my website https://www.gaberundlett.com). I wanted to just let you know that (since I see there are so many people that appear to be working on the same kind of thing, completely separate) I think it would benefit the consumer if there was more collaboration between devs, and I wanted to just let you know that I would be absolutely honored if you considered hiring some more talent to work on your project. Whether it’s someone to work on internal tools or to be a technical artist, or even to co-develop engine features, I would love it.

    JSYK, I’m currently happily employed, but I am just putting myself out there, since I love this kind of stuff. If you were to consider my offer, I wouldn’t have a problem working with you full-time, and I understand that may mean no longer working on my personal voxel project(s)

    I am under the assumption that you were picked up by a company or sponsor to create your project, so I understand if it’s not really something you can talk about, so in any case, I’m really excited to see what you put out, and keep up the great work!!

  13. John,

    I saw a video of your micro-voxel engine

    https://www.youtube.com/watch?v=8ptH79R53c0

    You mentioned that you have shelved it and I was wondering if it might be possible to get a copy of the source code so that I can run and maybe do some development work with it.

    Anything that you could do would be greatly appreciated.

  14. Hi John 🙂
    Really love your voxel engine, its very impressive. Especially the scaling which seems to be insanely good.

    Keep up the good work!
