Animation and Game performance tips.
Avoid save restore
Use setTransform as that will negate the need for save and restore.
There are many reasons that save an restore will slow things down and these are dependent on the current GPU && 2D context state. If you have the current fill and/or stroke styles set to a large pattern, or you have a complex font / gradient, or you are using filters (if available) then the save and restore process can take longer than rendering the image.
When writing for animations and games performance is everything, for me it is about sprite counts. The more sprites I can draw per frame (60th second) the more FX I can add, the more detailed the environment, and the better the game.
I leave the state open ended, that is I do not keep a detailed track of the current 2D context state. This way I never have to use save and restore.
ctx.setTransform rather than ctx.transform
Because the transforms functions transform, rotate, scale, translate multiply the current transform, they are seldom used, as i do not know what the transform state is.
To deal with the unknown I use setTransform that completely replaces the current transformation matrix. This also allows me to set the scale and translation in one call without needing to know what the current state is.
ctx.setTransform(scaleX,0,0,scaleY,posX,posY); // scale and translate in one call
I could also add the rotation but the javascript code to find the x,y axis vectors (the first 4 numbers in setTransform) is slower than rotate.
Sprites and rendering them
Below is an expanded sprite function. It draws a sprite from a sprite sheet, the sprite has x & y scale, position, and center, and as I always use alpha so set alpha as well
// image is the image. Must have an array of sprites
// image.sprites = [{x:0,y:0,w:10,h:10},{x:20,y:0,w:30,h:40},....]
// where the position and size of each sprite is kept
// spriteInd is the index of the sprite
// x,y position on sprite center
// cx,cy location of sprite center (I also have that in the sprite list for some situations)
// sx,sy x and y scales
// r rotation in radians
// a alpha value
function drawSprite(image, spriteInd, x, y, cx, cy, sx, sy, r, a){
var spr = image.sprites[spriteInd];
var w = spr.w;
var h = spr.h;
ctx.setTransform(sx,0,0,sy,x,y); // set scale and position
ctx.rotate(r);
ctx.globalAlpha = a;
ctx.drawImage(image,spr.x,spr.y,w,h,-cx,-cy,w,h); // render the subimage
}
On just an average machine you can render 1000 +sprites at full frame rate with that function. On Firefox (at time of writing) I am getting 2000+ for that function (sprites are randomly selected sprites from a 1024 by 2048 sprite sheet) max sprite size 256 * 256
But I have well over 15 such functions, each with the minimum functionality to do what I want. If it is never rotated, or scaled (ie for UI) then
function drawSprite(image, spriteInd, x, y, a){
var spr = image.sprites[spriteInd];
var w = spr.w;
var h = spr.h;
ctx.setTransform(1,0,0,1,x,y); // set scale and position
ctx.globalAlpha = a;
ctx.drawImage(image,spr.x,spr.y,w,h,0,0,w,h); // render the subimage
}
Or the simplest play sprite, particle, bullets, etc
function drawSprite(image, spriteInd, x, y,s,r,a){
var spr = image.sprites[spriteInd];
var w = spr.w;
var h = spr.h;
ctx.setTransform(s,0,0,s,x,y); // set scale and position
ctx.rotate(r);
ctx.globalAlpha = a;
ctx.drawImage(image,spr.x,spr.y,w,h,-w/2,-h/2,w,h); // render the subimage
}
if it is a background image
function drawSprite(image){
var s = Math.max(image.width / canvasWidth, image.height / canvasHeight); // canvasWidth and height are globals
ctx.setTransform(s,0,0,s,0,0); // set scale and position
ctx.globalAlpha = 1;
ctx.drawImage(image,0,0); // render the subimage
}
It is common that the playfield can be zoomed, panned, and rotated. For this I maintain a closure transform state (all globals above are closed over variables and part of the render object)
// all coords are relative to the global transfrom
function drawGlobalSprite(image, spriteInd, x, y, cx, cy, sx, sy, r, a){
var spr = image.sprites[spriteInd];
var w = spr.w;
var h = spr.h;
// m1 to m6 are the global transform
ctx.setTransform(m1,m2,m3,m4,m5,m6); // set playfield
ctx.transform(sx,0,0,sy,x,y); // set scale and position
ctx.rotate(r);
ctx.globalAlpha = a * globalAlpha; (a real global alpha)
ctx.drawImage(image,spr.x,spr.y,w,h,-cx,-cy,w,h); // render the subimage
}
All the above are about as fast as you can get for practical game sprite rendering.
General tips
Never use any of the vector type rendering methods (unless you have the spare frame time) like, fill, stroke, filltext, arc, rect, moveTo, lineTo as they are an instant slowdown. If you need to render text create a offscreen canvas, render once to that, and display as a sprite or image.
Image sizes and GPU RAM
When creating content, always use the power rule for image sizes. GPU handle images in sizes that are powers of 2. (2,4,8,16,32,64,128....) so the width and height have to be a power of two. ie 1024 by 512, or 2048 by 128 are good sizes.
When you do not use these sizes the 2D context does not care, what it does is expand the image to fit the closest power. So if I have an image that is 300 by 300 to fit that on the GPU the image has to be expanded to the closest power, which is 512 by 512. So the actual memory footprint is over 2.5 times greater than the pixels you are able to display. When the GPU runs out of local memory it will start switching memory from mainboard RAM, when this happens your frame rate drops to unusable.
Ensuring that you size images so that you do not waste RAM will mean you can pack a lot more into you game before you hit the RAM wall (which for smaller devices is not much at all).
GC is a major frame theef
One last optimisation is to make sure that the GC (garbage collector) has little to nothing to do. With in the main loop, avoid using new (reuse and object rather than dereference it and create another), avoid pushing and popping from arrays (keep their lengths from falling) keep a separate count of active items. Create a custom iterator and push functions that are item context aware (know if an array item is active or not). When you push you don't push a new item unless there are no inactive items, when an item becomes inactive, leave it in the array and use it later if one is needed.
There is a simple strategy that I call a fast stack that is beyond the scope of this answer but can handle 1000s of transient (short lived) gameobjects with ZERO GC load. Some of the better game engines use a similar approch (pool arrays that provide a pool of inactive items).
GC should be less than 5% of your game activity, if not you need to find where you are needlessly creating and dereferencing.