I often find myself forgetting the conventions for coordinate spaces in DirectX, and for some reason I haven't found MSDN all that helpful. The DirectX 9 Projection Transform document has a nice diagram, but it's specific to DirectX 9, which is now three versions out of date. So I wrote a tool to help me see what the conventions are. This document explains the tool and what I found. It also links together a bunch of the official MSDN documentation in a way that I find more intuitive.
DirectX provides several configuration settings that modify the behavior described here; this document describes only the default behavior. I assume that if you're playing with culling or depth test settings, you know how those settings change the pipeline's behavior.
Most of the logic is in the vertex shader; the C++ code just sets up the pipeline and draws 6 vertices, and the pixel shader simply outputs the color it receives from the vertex shader. The actual vertex data is hard-coded into the shader (see Using the Input-Assembler Stage without Buffers for how this works). Note that by default depth testing is enabled and the comparison is "strictly less than"; since we clear the depth buffer to 1.0f, we won't see any vertices whose Z-coordinate is 1.0.
Here's what we see:
The vertex shader outputs position coordinates in clip space. Clip space is a cuboid, but I'll describe the XY plane (parallel to the monitor screen) and the Z axis (normal to the monitor screen) separately.
The XY plane of clip space ranges from (-1, -1) in the bottom left to (1, 1) in the top right.
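The Z coordinate of clip space ranges from 0 (near the viewer) to 1 (far from the viewer). By default, the far plane is effectively exclusive: geometry with a Z-coordinate of exactly 1 is not clipped, but it fails the "strictly less than" depth test against a depth buffer cleared to 1.0, so it is never drawn.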
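To make the far-plane behavior concrete, here is a minimal C++ sketch of the default depth comparison, assuming the depth buffer was cleared to 1.0f as described earlier. `PassesDefaultDepthTest` is a hypothetical helper written for illustration, not a Direct3D function; it only models the "strictly less than" rule.

```cpp
#include <cassert>

// Hypothetical helper (not part of Direct3D) that models the default
// depth comparison: a fragment is kept only if its depth is strictly
// less than the value already stored in the depth buffer.
bool PassesDefaultDepthTest(float incoming, float stored)
{
    // Depth values are expected to lie in [0, 1].
    assert(incoming >= 0.0f && incoming <= 1.0f);
    assert(stored >= 0.0f && stored <= 1.0f);
    return incoming < stored;
}
```

With a freshly cleared buffer, `PassesDefaultDepthTest(1.0f, 1.0f)` is false while `PassesDefaultDepthTest(0.5f, 1.0f)` is true, which is why geometry at exactly Z = 1 never shows up.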
This configuration is a left-handed coordinate system.
With DirectX we also set a viewport. The viewport extents position the viewport in screen space (see below), and then clip space maps onto the viewport extents. Getting Started with the Rasterizer Stage is the best documentation I've found for how the viewport fits in.
The viewport maps clip space to screen space by placing the render target in screen space. The extents of clip space (the cuboid from (-1, -1, 0) to (1, 1, 1)) map to the extents of the viewport. For example, say we've set up a viewport as follows:
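    D3D11_VIEWPORT viewport;
    viewport.Width = 200;
    viewport.Height = 300;
    viewport.TopLeftX = 100;
    viewport.TopLeftY = 200;
    viewport.MinDepth = 0.5f;
    viewport.MaxDepth = 0.7f;
Then the clip space point (-1, 1, 0), the top left corner of the near plane, will map to (100, 200) in screen space, with a depth of 0.5. The clip space point (1, -1, 1), the bottom right corner of the far plane, will map to (300, 500) in screen space, with a depth of 0.7. Note that the Y axis flips along the way: Y points up in clip space but down in screen space.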
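The mapping can be written out as a small function. The following is a sketch, not code from the tool: `Viewport` is a hypothetical stand-in for `D3D11_VIEWPORT` so that the example compiles without `d3d11.h`, and `ClipToScreen` and `ScreenPoint` are names invented here. It assumes w = 1 (so no perspective divide is needed) and the Direct3D convention that clip space Y points up while screen space Y points down.

```cpp
#include <cassert>

// Hypothetical stand-in for D3D11_VIEWPORT (same field names and order).
struct Viewport
{
    float TopLeftX, TopLeftY, Width, Height, MinDepth, MaxDepth;
};

// Illustrative result type: a screen space position plus a depth value.
struct ScreenPoint
{
    float x, y, depth;
};

// Maps a clip space point (assuming w = 1) onto the viewport extents.
ScreenPoint ClipToScreen(const Viewport& vp, float x, float y, float z)
{
    assert(vp.Width > 0.0f && vp.Height > 0.0f);
    ScreenPoint p;
    // X: -1 maps to the left edge, +1 to the right edge.
    p.x = vp.TopLeftX + (x + 1.0f) * 0.5f * vp.Width;
    // Y: +1 maps to the top edge, -1 to the bottom edge (the axis flips).
    p.y = vp.TopLeftY + (1.0f - y) * 0.5f * vp.Height;
    // Z: 0 maps to MinDepth, 1 maps to MaxDepth.
    p.depth = vp.MinDepth + z * (vp.MaxDepth - vp.MinDepth);
    return p;
}
```

For the example viewport, the point in the middle of clip space, (0, 0, 0.5), lands in the middle of the viewport at (200, 350) with a depth of about 0.6.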
Screen space is the 2-dimensional coordinate system of the pixels the viewer sees (these may or may not be 1:1 with physical pixels on the monitor depending on resolution settings). It ranges from (0, 0) in the top left to (Width, Height) in the bottom right.
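Going the other way is sometimes handy, for example when working out which clip space position sits under a given screen coordinate. The sketch below inverts the XY part of the viewport mapping under the same assumptions (Y up in clip space, Y down in screen space). `ViewportRect`, `ClipXY`, and `ScreenToClipXY` are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical subset of D3D11_VIEWPORT: only the fields needed for XY.
struct ViewportRect
{
    float TopLeftX, TopLeftY, Width, Height;
};

// Illustrative result type: the XY part of a clip space position.
struct ClipXY
{
    float x, y;
};

// Maps a screen space position back to clip space XY (assuming w = 1).
ClipXY ScreenToClipXY(const ViewportRect& vp, float sx, float sy)
{
    assert(vp.Width > 0.0f && vp.Height > 0.0f);
    ClipXY c;
    // Left edge of the viewport is -1, right edge is +1.
    c.x = (sx - vp.TopLeftX) / vp.Width * 2.0f - 1.0f;
    // Top edge of the viewport is +1, bottom edge is -1.
    c.y = 1.0f - (sy - vp.TopLeftY) / vp.Height * 2.0f;
    return c;
}
```

For a 200 by 300 viewport whose top left is at (100, 200), the screen space point (200, 350) is the viewport's center and maps back to (0, 0) in clip space.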