Batching texture-to-texture rendering in Unity on the GPU

I'm working on a 2D sidescroller within Unity 5.1. The terrain is intended to be semi-infinite in all 4 directions, procedurally generated using a noise algorithm.
I'm using a 3x3 grid of quads as "chunks" that I shift/update as the camera pans around. The issue I'm having is related to the dynamic texture I have tied to each chunk. When they need to be created rather than just shifted around, I'm not quite sure how to render individual tiles from a single texture to specific locations on another texture without using system memory.
With SetPixels, I believe the color data is passed to the GPU on every Apply(). Using GUI/Graphics DrawTexture (or any of its variants) with a RenderTexture seems to issue an individual draw per call rather than giving me the opportunity to batch them. Native GameObjects have FAR too much overhead, considering I'm trying to make tiles only 8x8 pixels in size (meaning higher screen res = more on-screen tiles).
So, is it possible within Unity to do a low level texture to texture render/blit/draw/whatever without pulling texture data into system memory AND still batching same material/texture calls together without having to buy some toolkit from the asset store?
If any of this doesn't make sense, please ask anything you need to be able to help me and I'll update as needed.


Answer 1:

It seems to me your best bet would be a dynamic mesh, whose material texture is the source texture, and which is drawn onto the target RenderTexture. The mesh’s UV coordinates correspond to read position, and vertex coordinates correspond to write position. It’d be one draw call, although you’d need to create the mesh as well.
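
As a rough sketch of what that single draw call could look like (not tested against your project): the class and field names below are made up for illustration, and it assumes tileMaterial uses a simple unlit shader with the source atlas as its main texture. Depending on platform and shader you may need Cull Off or a flipped Y axis.

```csharp
using UnityEngine;

// Hypothetical helper; BatchedTileBlit, tileMaterial and target are illustrative names.
public class BatchedTileBlit : MonoBehaviour
{
    public Material tileMaterial;   // assumed: unlit material whose main texture is the tile atlas
    public RenderTexture target;    // the chunk texture to draw into

    // Draws an already-built mesh of tile quads into the target with one draw call.
    // Vertex positions are in target pixels (write position);
    // UVs are 0..1 coordinates into the atlas (read position).
    public void Blit(Mesh tileMesh)
    {
        RenderTexture previous = RenderTexture.active;
        RenderTexture.active = target;

        GL.PushMatrix();
        // Make one unit equal one pixel of the target texture.
        GL.LoadPixelMatrix(0, target.width, 0, target.height);

        // SetPass returns false if the pass cannot be activated.
        if (tileMaterial.SetPass(0))
        {
            Graphics.DrawMeshNow(tileMesh, Matrix4x4.identity);
        }

        GL.PopMatrix();
        RenderTexture.active = previous;
    }
}
```

Everything stays on the GPU here: the atlas is only sampled by the shader and the result lands in the RenderTexture, so no pixel data passes through system memory.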

Still, for a simple mesh, creating the mesh would be very quick. Here are some performance tips if you do want to go down this route:

If your mesh can be of a fixed size (or you have an upper bound on how big it’ll be), you can set it up once and then just update what’s already there through mesh.vertices = changedVertices. A fixed-size mesh lets you avoid calling Clear(), which can really hurt performance and is only necessary if mesh.triangles changes. For example, if you know you’ll never need more than 1000 quads for this batched blit, you’d have a mesh of 1000 quads, some of which may be degenerate (all vertices in the same position) depending on how many quads you actually want to draw. There’s nothing wrong with a few degenerate quads at this scale; it’ll be better than calling Clear() every time the mesh changes.
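
Here is a minimal sketch of such a fixed-capacity mesh. QuadPool, SetQuad and Apply are hypothetical names, not Unity API, and the winding order is an assumption you may have to flip (or sidestep with Cull Off in the shader).

```csharp
using UnityEngine;

// Hypothetical fixed-capacity quad pool; construct it from Awake/Start, not a field initializer.
public class QuadPool
{
    public readonly Mesh mesh;
    readonly Vector3[] vertices;   // our own copies, reused on every update
    readonly Vector2[] uvs;

    public QuadPool(int capacity)
    {
        mesh = new Mesh();
        vertices = new Vector3[capacity * 4];
        uvs = new Vector2[capacity * 4];

        // The indices never change, so they are assigned exactly once.
        int[] triangles = new int[capacity * 6];
        for (int q = 0; q < capacity; q++)
        {
            int v = q * 4, t = q * 6;
            triangles[t] = v;     triangles[t + 1] = v + 1; triangles[t + 2] = v + 2;
            triangles[t + 3] = v; triangles[t + 4] = v + 2; triangles[t + 5] = v + 3;
        }
        mesh.vertices = vertices;   // vertices must be assigned before triangles
        mesh.uv = uvs;
        mesh.triangles = triangles;
    }

    // dst: where to write, in target pixels. src: where to read, in 0..1 atlas UVs.
    public void SetQuad(int index, Rect dst, Rect src)
    {
        int v = index * 4;
        vertices[v]     = new Vector3(dst.xMin, dst.yMin, 0f);
        vertices[v + 1] = new Vector3(dst.xMin, dst.yMax, 0f);
        vertices[v + 2] = new Vector3(dst.xMax, dst.yMax, 0f);
        vertices[v + 3] = new Vector3(dst.xMax, dst.yMin, 0f);
        uvs[v]     = new Vector2(src.xMin, src.yMin);
        uvs[v + 1] = new Vector2(src.xMin, src.yMax);
        uvs[v + 2] = new Vector2(src.xMax, src.yMax);
        uvs[v + 3] = new Vector2(src.xMax, src.yMin);
    }

    // Collapses every quad from usedCount onward to a single point (degenerate),
    // then uploads the arrays. No Clear() is needed because triangles are untouched.
    public void Apply(int usedCount)
    {
        for (int i = usedCount * 4; i < vertices.Length; i++)
        {
            vertices[i] = Vector3.zero;
        }
        mesh.vertices = vertices;
        mesh.uv = uvs;
    }
}
```

Usage would be something like: create new QuadPool(1000) once, call SetQuad for each tile you need this frame, then Apply(tileCount) and draw pool.mesh.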

If you’re updating the mesh very regularly, you can call mesh.MarkDynamic(). This tells Unity to keep the mesh data in buffers that are cheap to update often.

Avoid reading the vertices back from the mesh: var x = mesh.vertices has to build a new copy of the array for you, even though it’s disguised as a simple assignment. Instead, keep your own copy of the vertices array, uvs array, and whatever else you’ll be using.
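
To illustrate the last two tips together, here is a small sketch. The component name is made up, and the per-frame movement is just a placeholder for whatever change you actually make.

```csharp
using UnityEngine;

// Hypothetical component; CachedVertexUpdater is an illustrative name.
[RequireComponent(typeof(MeshFilter))]
public class CachedVertexUpdater : MonoBehaviour
{
    Mesh mesh;
    Vector3[] vertices;   // our own copy; the only array we ever modify

    void Start()
    {
        mesh = GetComponent<MeshFilter>().mesh;
        mesh.MarkDynamic();         // hint that this mesh will be updated often
        vertices = mesh.vertices;   // one-time copy out of the mesh
    }

    void Update()
    {
        // Avoid writing mesh.vertices[i] inside this loop:
        // every access to the property would copy the whole array again.
        for (int i = 0; i < vertices.Length; i++)
        {
            vertices[i].x += Time.deltaTime;   // placeholder modification
        }
        mesh.vertices = vertices;   // single assignment per frame
    }
}
```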

If you’re on mobile you can optimise this even more by double-buffering the changes to the mesh, so the CPU doesn’t have to wait for the GPU to catch up with the changes you’re making to the mesh:

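// meshA, meshB, vertices and meshFilter are fields set up elsewhere
bool on;
Mesh bufferMesh;

void Update () {
  // flip between the two meshes each frame
  bufferMesh = on ? meshA : meshB;
  on = !on;
  bufferMesh.vertices = vertices; // modify the mesh that was not shown last frame
  meshFilter.sharedMesh = bufferMesh;
}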

(…according to the manual under Mobile Optimisations; desktop doesn’t seem to need this)

If you’re worried about the performance implications of updating a small mesh every frame, perhaps it’ll be of comfort that I use a similar technique to have thousands of GPU-simulated particles and decals.

Also, while it might seem like a kind of hacky way to do things, this is probably how I’d do it in OpenGL, too, though I’d look into instancing or perhaps a geometry shader. One way or another, using a mesh to map from space on one texture to space on another is the best way to do this I can think of, assuming I understood your question.