Spatial Representations
Quick tips
- To transform between frames, you need the homogeneous transformation. A rotation alone is not sufficient, because the frames generally have different origins.
Notation Introduction
The important notation choice is how we represent a frame of reference. We use the pre-superscript to denote the frame in which something is expressed, i.e. ${}^A P$ represents some point $P$ defined in terms of frame $\{A\}$. The pre-superscript always shows us how something is defined. Use a pre-subscript to reference the represented frame.
- ${}^A \hat{x}_B$: some vector (the x-direction unit vector of $\{B\}$) as defined in frame $\{A\}$
- ${}^A_B R$: orientation of frame $\{B\}$ with respect to $\{A\}$’s coordinates
- ${}^A P$ is a vector that points from the origin of $\{A\}$ to a point in space, $P$.
- We generally use $P$ to reference a point. If this point happens to be the origin of a frame, you use a normal subscript. So $P_B$ would be the origin point of frame $\{B\}$, and ${}^A P_B$ would be the representation of that origin in $\{A\}$.
- We may also use the format to indicate that this is the position of a point in FOV
Manipulator Structure
Let’s formalize the definition of a robot manipulator and its parts. A manipulator has a number of rigid body links connected through joints. The last link interacts with the environment and is known as the end-effector. The first link is connected to some solid surface and is known as the base.
A manipulator with $n$ joints has $n + 1$ links. Because one of the links (the base) is fixed, there are $n$ moving links.
Joints
We assume that the joints only have one degree of freedom. This yields two possible joint types.
- Prismatic joints create linear motion.
- Revolute joints create rotational motion.
Joints have limits that must be respected.
Generalized Coordinates / Joint Coordinates
Degrees of freedom
Each mobile link can be described by 6 parameters: 3 for position and 3 for orientation (such as Euler angles). Therefore, we say that a free link has 6 degrees of freedom, and the $n$ moving links can be described by $6n$ total parameters.
However, each joint only affords a single degree of freedom relative to the previous link, so it imposes 5 constraints. Put differently, 5 out of the 6 parameters per link are redundant. As such, the total degrees of freedom of a manipulator is $6n - 5n = n$. In other words, each segment adds a single degree of freedom.
Creating joint coordinates
The discussion on degrees of freedom leads us in the direction of deriving a concise set of coordinates that can represent the entire system uniquely. The number of coordinates is the same as the DOF, which should make intuitive sense.
You end up with a joint space where each axis is a joint angle (or linear actuation depth), and each point is therefore a robot state.
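As a minimal sketch of what a point in joint space looks like, assume a hypothetical three-joint arm (two revolute joints and one prismatic joint). The `Joint` class, the `ARM` list, the `within_limits` helper, and all limit values are made up for illustration and do not come from any library or from a specific robot.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Joint:
    kind: str      # "revolute" (angle in rad) or "prismatic" (displacement in m)
    lower: float   # lower joint limit
    upper: float   # upper joint limit

# Hypothetical 3-joint arm; the limit values are illustrative only.
ARM = [
    Joint("revolute", -3.14, 3.14),
    Joint("revolute", -1.57, 1.57),
    Joint("prismatic", 0.0, 0.5),
]

def within_limits(q, joints=ARM):
    """Return True if configuration q (one value per joint) respects every limit."""
    if len(q) != len(joints):
        raise ValueError("q needs exactly one coordinate per joint")
    return all(j.lower <= qi <= j.upper for qi, j in zip(q, joints))

q = [0.3, -1.0, 0.25]                    # one point in the 3-dimensional joint space
print(within_limits(q))                  # every coordinate is inside its limits
print(within_limits([0.3, 2.0, 0.25]))   # the second joint exceeds its limit
```

The number of coordinates in `q` equals the number of joints, which matches the degrees of freedom of the arm.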
Operational Coordinates
Instead of describing the positions of all the joints, we may also just describe the position and orientation of the end-effector. If the robot has enough degrees of freedom, the end-effector will have all six degrees of freedom. If we are less fortunate, it may have fewer.
We can plot the operational space very similarly to the joint space, but each axis corresponds to a degree of freedom of the end-effector.
Redundancy
If the number of joints is greater than the maximum degrees of freedom of the end-effector (typically six), we say that the robot is redundant. The degree of redundancy is the number of additional degrees of freedom.
As a brief hint of what’s to come, notice that operational coordinates may not map 1:1 to joint coordinates if there is redundancy.
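To make the non-1:1 mapping concrete, here is a small sketch under simplifying assumptions: a planar arm with three unit-length revolute links, and a task that only asks for the 2D tip position (so the task has 2 degrees of freedom rather than the typical six). The helper names `degree_of_redundancy` and `planar_tip` are hypothetical.

```python
import math

def degree_of_redundancy(n_joints, m_task):
    """Extra degrees of freedom beyond what the end-effector task needs."""
    return max(0, n_joints - m_task)

def planar_tip(q, link_lengths=(1.0, 1.0, 1.0)):
    """Tip position (x, y) of a planar arm with revolute joints (assumed model)."""
    x = y = angle = 0.0
    for qi, length in zip(q, link_lengths):
        angle += qi                      # joint angles accumulate along the chain
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y

# Three joints, but the task here only asks for a 2D position.
print(degree_of_redundancy(3, 2))

q_a = (0.0, math.pi / 2, -math.pi / 2)
q_b = (math.pi / 2, -math.pi / 2, 0.0)
print(planar_tip(q_a), planar_tip(q_b))  # two joint-space points, same tip position
```

Both configurations place the tip at approximately (2, 1), so one operational-space point corresponds to more than one joint-space point.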
Representing Rigid Bodies
One big lesson is to consider points / vectors as different from their representations. Points and vectors and, well, everything, can exist without numbers. However, as soon as you’re trying to represent them, you run into the need for references. That’s where the math comes in.
What’s agnostic to representation?
This is repeating some material but it’s important, so it’s worth it. Certain things are agnostic to representation, some of the time. In fact, everything is technically agnostic to representation until we want to talk about it (this is almost a philosophical point). Operationally, let’s start with points. These are agnostic, and therefore vectors defined by these points are agnostic, and therefore whole frames are also agnostic.
Therefore, certain operations are agnostic to representation, like vector operations, etc.
However, in this state, we can’t really represent these things. Implicitly, points need to be defined in terms of an origin, and vectors as well. Therefore, the second we want to talk about them, we need to define such an origin, and this is where the pre-superscript and pre-subscript notation come from. This is also where the rotation matrices gain some meaning, etc. This is a very nuanced point…maybe it will become more clear as we become acquainted with the notation and the subject.
Vectors
If we have a reference point $O$, any point $P$ in space can be denoted as the vector $\mathbf{p}$ from $O$ to $P$. Now, here’s some nuance: the vector $\mathbf{p}$ exists without any reference point; it’s something that points somewhere. To get the point $P$, you need to define where it starts: the reference $O$.
However, the components of $\mathbf{p}$ are not agnostic to the reference frame (rotate the frame and all the components change). The components are defined in terms of the frame’s unit vectors.
So, to put this concisely: $\mathbf{p}$ is a direction with a length, and these directions are agnostic to points of reference. If you are only dealing with vectors, there’s no need to specify an origin. The upshot is that you can perform operations like dot products without worrying about the origin, as long as both vectors are expressed in the same frame.
However, the representation of the vector is sensitive to the frame of reference. Because we often do care about the components, we will typically use the notation ${}^A P$ to specify the frame.
Looking at a rigid object
If you think about it, a rigid object is just another frame of reference. You can describe this second frame of reference in terms of your original frame of reference. The second frame needs an origin point (a position, as described above) and an orientation. We figured out how to talk about position, but what about orientation?
Rotation Matrix
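We denote ${}^A_B R$ as the rotation matrix that tells us the orientation of frame $\{B\}$ with respect to $\{A\}$. Mechanically, ${}^A_B R$ represents the three unit vectors of $\{B\}$ as seen in $\{A\}$:
$$
{}^A_B R = \begin{bmatrix} {}^A\hat{x}_B & {}^A\hat{y}_B & {}^A\hat{z}_B \end{bmatrix}
$$
The rotation matrix is orthonormal, which means that the transpose is the inverse:
$$
{}^A_B R^{-1} = {}^A_B R^T = {}^B_A R
$$
Here’s a good interpretation
We are expressing the orientation of $\{B\}$ in $\{A\}$, which is why the columns are expressed in the $\{A\}$ frame. The rows, therefore, must go the other way: each row expresses a unit vector of $\{A\}$ in $\{B\}$.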
What is a valid rotation matrix?
A rotation matrix must be a rigid, non-inverting transformation. Therefore, these conditions are necessary:
- $\det(R) = 1$ (not negative 1)
- $R^T R = I$, i.e. $R^{-1} = R^T$
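The two conditions for a valid rotation matrix (determinant of +1 and orthonormality) can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`transpose`, `matmul`, `det3`, `is_rotation`) and the tolerance are assumptions chosen for illustration.

```python
import math

def transpose(R):
    return [list(col) for col in zip(*R)]

def matmul(A, B):
    # Each entry is a row of A dotted with a column of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det3(R):
    (a, b, c), (d, e, f), (g, h, i) = R
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_rotation(R, tol=1e-9):
    """Check R^T R = I and det(R) = +1 within a tolerance."""
    RtR = matmul(transpose(R), R)
    orthonormal = all(
        abs(RtR[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(3) for j in range(3)
    )
    return orthonormal and abs(det3(R) - 1.0) <= tol

c, s = math.cos(0.7), math.sin(0.7)
rot_z = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
mirror = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # orthonormal, det = -1

print(is_rotation(rot_z))   # satisfies both conditions
print(is_rotation(mirror))  # fails the determinant condition
```

The `mirror` matrix shows why the determinant condition matters: it is orthonormal, yet it inverts the space instead of rotating it.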
Changing Descriptions using Rotation Matrix
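If we have anything in $\{B\}$, we can easily get it into $\{A\}$ by computing
$$
{}^A P = {}^A_B R \, {}^B P
$$
This works because ${}^B P$ is in terms of the unit vectors of $\{B\}$, and ${}^A_B R$ tells us how to express each of those unit vectors in $\{A\}$.
It follows, therefore, that you can chain rotations together, because each rotation is just a collection of column vectors:
$$
{}^A_C R = {}^A_B R \, {}^B_C R
$$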
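A quick numerical sketch of changing descriptions and chaining rotations, under assumed frames: $\{B\}$ is $\{A\}$ rotated 90 degrees about z, and $\{C\}$ is $\{B\}$ rotated 90 degrees about x. The helpers `rot_z`, `rot_x`, `matmul`, and `matvec` are illustrative, not from a library.

```python
import math

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(R, v):
    return [sum(r * x for r, x in zip(row, v)) for row in R]

R_ab = rot_z(math.pi / 2)    # orientation of {B} in {A} (assumed)
R_bc = rot_x(math.pi / 2)    # orientation of {C} in {B} (assumed)
R_ac = matmul(R_ab, R_bc)    # chained: orientation of {C} in {A}

p_c = [0.0, 0.0, 1.0]                          # a vector described in {C}
stepwise = matvec(R_ab, matvec(R_bc, p_c))     # {C} -> {B} -> {A}
direct = matvec(R_ac, p_c)                     # {C} -> {A} in one step

print(stepwise)
print(direct)
```

Both routes give the same components up to floating-point rounding, approximately (1, 0, 0).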
Rotation Matrices as Projections
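Thinking back to linear algebra, we note that ${}^A\hat{x}_B$ and the other two columns are just projections of the unit vectors of $\{B\}$ onto the unit vectors of $\{A\}$. As such, this vector can be written as dot products:
$$
{}^A\hat{x}_B = \begin{bmatrix} \hat{x}_B \cdot \hat{x}_A \\ \hat{x}_B \cdot \hat{y}_A \\ \hat{x}_B \cdot \hat{z}_A \end{bmatrix}
$$
Note how we don’t care about the reference points of the unit vectors, because they are just vectors and we only care about the dot product.
The whole rotation matrix can therefore be expressed as projections:
$$
{}^A_B R = \begin{bmatrix}
\hat{x}_B \cdot \hat{x}_A & \hat{y}_B \cdot \hat{x}_A & \hat{z}_B \cdot \hat{x}_A \\
\hat{x}_B \cdot \hat{y}_A & \hat{y}_B \cdot \hat{y}_A & \hat{z}_B \cdot \hat{y}_A \\
\hat{x}_B \cdot \hat{z}_A & \hat{y}_B \cdot \hat{z}_A & \hat{z}_B \cdot \hat{z}_A
\end{bmatrix}
$$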
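The projection view can be sketched directly: given the unit vectors of both frames written in one shared frame, every entry of the rotation matrix is a dot product. The shared "world" frame, the 30 degree angle, and the helper `rotation_from_axes` are assumptions for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rotation_from_axes(axes_a, axes_b):
    """Entry (i, j) is the projection of the j-th axis of {B} onto the i-th axis of {A}."""
    return [[dot(a_i, b_j) for b_j in axes_b] for a_i in axes_a]

# Axes of both frames written in one shared (hypothetical) world frame.
axes_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = math.radians(30)
axes_b = [
    [math.cos(t), math.sin(t), 0.0],    # x axis of {B}: {A} rotated 30 degrees about z
    [-math.sin(t), math.cos(t), 0.0],   # y axis of {B}
    [0.0, 0.0, 1.0],                    # z axis of {B}
]

R_ab = rotation_from_axes(axes_a, axes_b)   # columns are the axes of {B} seen in {A}
R_ba = rotation_from_axes(axes_b, axes_a)   # swapping the roles gives the transpose

for row in R_ab:
    print(row)
```

Swapping the two frames in the call yields the transpose, which matches the observation that the rows of the matrix describe $\{A\}$ as seen in $\{B\}$.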
Transforms
Previously, we talked about how to represent positions and orientations with respect to some base. Now, let’s talk about how we can take some representation and move it into another basis. That’s what we call a transformation.
Pure translation
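Say we had some point $P$ and we know its representation ${}^B P$ with respect to $\{B\}$, a frame whose axes are parallel to those of $\{A\}$ but whose origin is different. Suppose that we also knew the vector ${}^A P_B$ from the origin of $\{A\}$ to the origin of $\{B\}$. From this, you can derive that
$$
{}^A P = {}^B P + {}^A P_B
$$
This should feel like a “duh” moment, but we are trying to be rigorous such that adding rotations won’t feel like a huge jump.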
General Transformation
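A general transformation is a combination of a rotation and a translation. Let’s discuss this using our two interpretations:
- Frame shift: you have a point in $\{B\}$. To get it to the base $\{A\}$, we first have to align its description with $\{A\}$ (pure rotation), and then perform a pure translation.
- Physical motion: you have something at $\{A\}$. You move it to the origin of $\{B\}$ (pure translation), and then you align it with the orientation of $\{B\}$ (pure rotation).
The general transformation equation, therefore, is a composition of a rotation and a translation:
$$
{}^A P = {}^A_B R \, {}^B P + {}^A P_B
$$
It helps to rotate first into $\{A\}$’s orientation because then we can use the displacement vector ${}^A P_B$, which is already expressed in $\{A\}$.
If you were to translate and then rotate, you would need the displacement expressed in $\{B\}$ instead, namely ${}^B P_A$, which is fine: ${}^A P = {}^A_B R \, ({}^B P - {}^B P_A)$. Either order works as long as the displacement is expressed in the matching frame, but the two steps cannot be swapped while reusing the same vector. (Compositions of rotations are not commutative either, but we’ll get to those later.)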
Homogeneous Transform
Matrix representation of a general transformation
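Because this general transformation can be a little intimidating, it’s also possible to create a shorthand that does it all in one step. The shorthand takes the form of a 4x4 matrix:
$$
\begin{bmatrix} {}^A P \\ 1 \end{bmatrix} =
\begin{bmatrix} {}^A_B R & {}^A P_B \\ 0 \; 0 \; 0 & 1 \end{bmatrix}
\begin{bmatrix} {}^B P \\ 1 \end{bmatrix}
$$
You feed in a representation with respect to $\{B\}$, multiply by the matrix ${}^A_B T$, and you get the same point with respect to $\{A\}$.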
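As a minimal sketch of the 4x4 shorthand, the block below assembles a homogeneous matrix from a rotation and a displacement and applies it to a point. The frames, the numbers, and the helper names `make_transform` and `apply_transform` are assumptions for illustration.

```python
import math

def make_transform(R, p):
    """Assemble the 4x4 matrix [[R, p], [0 0 0 1]] from a 3x3 rotation and a 3-vector."""
    return [list(R[i]) + [p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def apply_transform(T, point):
    """Map a 3D point by appending 1, multiplying by T, and dropping the last entry."""
    h = list(point) + [1.0]
    return [sum(T[i][k] * h[k] for k in range(4)) for i in range(3)]

c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R_ab = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]  # {B} rotated 90 degrees about z of {A}
p_ab = [1.0, 2.0, 0.0]                                # origin of {B} expressed in {A}

T_ab = make_transform(R_ab, p_ab)
p_in_b = [1.0, 0.0, 0.0]
p_in_a = apply_transform(T_ab, p_in_b)
print(p_in_a)  # the rotation gives about (0, 1, 0); adding p_ab gives about (1, 3, 0)
```

The single matrix product reproduces the rotate-then-translate equation, which is the whole point of the shorthand.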
Three interpretations of homogeneous transformation
As such, there are three ways of interpreting a homogeneous transform ${}^A_B T$:
- Frame description: ${}^A_B T$ fully represents a frame $\{B\}$ with respect to a base $\{A\}$.
- Transform mapping: ${}^A_B T$ will map ${}^B P \to {}^A P$.
- Transform operator: the same $T$ represents the motion from $\{A\}$ to $\{B\}$, which means that we can apply this transformation to any points or vectors, etc.
Transforms as Operators
A transformation can be interpreted as a mapping from $\{B\}$ to $\{A\}$, but it can also be interpreted as an operator (physical motion). Let’s flesh this out a little bit.
- Mappings are always static—they deal with the same points and map between different frames of reference
- Operators are dynamic—they deal with moving some point within the same frames of reference
Building up intuition
Let’s flesh out a critical duality of all transformations. A transformation ${}^A_B T$ can serve two purposes. It can:
- Map a point described in $\{B\}$ to its description in $\{A\}$. In this case, nothing is moved, but the reference is changed.
- Move frame $\{A\}$ onto $\{B\}$, carrying some vector along with this rigid transformation. This is the operator definition.
Note that the operator definition doesn’t need any frame of reference; it refers to a general motion that is agnostic to reference. So an operator doesn’t have a reference point.
So any composition of transforms, such as ${}^A_B T \, {}^B_C T \, {}^C_D T$, has two ways of interpreting it:
- Right to left: the operator interpretation. You’re imagining three motions and applying them consecutively to a point, starting with the rightmost.
- Left to right: you’re moving a world frame from $\{A\} \to \{B\} \to \{C\} \to \{D\}$. Ultimately, this gives you a motion from $\{A\} \to \{D\}$, which can be used to define ${}^A_D T$. Note how this isn’t an operator definition; you can’t think about what a vector would do in this circumstance.
Rotations
A rotation matrix about a target axis moves the other two axes and leaves the target axis fixed. For a rotation by $\theta$ about the z-axis, for example, it has the form
$$
R_z(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$
To reinforce our prior insight, let the base be $\{A\}$ and the rotated space be $\{B\}$; the matrix above is then ${}^A_B R$. This is a physical action that impacts the whole world. So any vector is impacted, and you can multiply the matrix by any vector to get the transformed vector. Again, it’s worth noting that because we don’t use the origin here, it doesn’t matter that we start with a vector described in $\{A\}$.
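The operator reading can be sketched with a rotation about the z-axis (the choice of axis and angle is an assumption for illustration, as are the helper names). The vector stays described in the same frame throughout; only the vector itself moves.

```python
import math

def rot_z(theta):
    """Rotation operator about the z-axis by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotate(R, v):
    return [sum(r * x for r, x in zip(row, v)) for row in R]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

v = [1.0, 0.0, 2.0]
v_rotated = rotate(rot_z(math.pi / 2), v)

print(v_rotated)                 # x and y components turn by 90 degrees; z is untouched
print(norm(v), norm(v_rotated))  # a rotation preserves length
```

The component along the rotation axis is unchanged and the length is preserved, which is what a rigid, non-inverting motion should do.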
Homogeneous Operators
As before, we can represent a homogeneous translation + rotation as one matrix, and this can be interpreted as a single matrix multiplication that moves a point $P_1$ to a new point $P_2$ within the same frame: ${}^A P_2 = T \, {}^A P_1$.
Inverse Transforms
We saw previously that we can invert a rotation very easily. We can also invert a pure translation very easily (just negate the vector!). But what about a homogeneous transformation? Well, the homogeneous matrix isn’t orthonormal, so we can’t just take the transpose.
It turns out that if we have a homogeneous transformation
$$
{}^A_B T = \begin{bmatrix} {}^A_B R & {}^A P_B \\ 0 \; 0 \; 0 & 1 \end{bmatrix}
$$
we can express the inverse transform as the following:
$$
{}^B_A T = {}^A_B T^{-1} = \begin{bmatrix} {}^A_B R^T & -{}^A_B R^T \, {}^A P_B \\ 0 \; 0 \; 0 & 1 \end{bmatrix}
$$
If you think about this, it actually makes a ton of sense. To show $\{B\}$ in base $\{A\}$, we needed everything in the matrix to be in $\{A\}$. To reverse the process and show $\{A\}$ in base $\{B\}$, we need everything in the matrix to be in $\{B\}$. That’s easy for the rotation (transpose), but for the displacement, we need to first convert ${}^A P_B$ to frame $\{B\}$. Then, we can negate.
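The inverse can be sketched without a general matrix inverse: transpose the rotation block, re-express the displacement in the other frame, and negate it. The example transform and all helper names below are assumptions for illustration.

```python
import math

def transpose(R):
    return [list(col) for col in zip(*R)]

def matvec(R, v):
    return [sum(r * x for r, x in zip(row, v)) for row in R]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def make_transform(R, p):
    """Assemble [[R, p], [0 0 0 1]] from a 3x3 rotation R and a displacement p."""
    return [list(R[i]) + [p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def invert_transform(T):
    """Invert a homogeneous transform using R^T and -R^T p."""
    R = [row[:3] for row in T[:3]]
    p = [row[3] for row in T[:3]]
    R_t = transpose(R)                      # inverse of the rotation block
    p_inv = [-x for x in matvec(R_t, p)]    # displacement converted to {B}, then negated
    return make_transform(R_t, p_inv)

c, s = math.cos(0.4), math.sin(0.4)
T_ab = make_transform([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]], [1.0, -2.0, 0.5])
T_ba = invert_transform(T_ab)

product = matmul(T_ab, T_ba)   # should be the 4x4 identity up to rounding
for row in product:
    print([round(x, 12) for x in row])
```

Multiplying the transform by its computed inverse recovers the identity, which confirms the formula for this example.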
Composition of Transforms
Here’s a general rule of thumb: math always operates right to left, but you can imagine going from left to right. Consider this chain:
$$
{}^A_D T = {}^A_B T \, {}^B_C T \, {}^C_D T
$$
This can be interpreted as expressing $\{D\}$ in terms of $\{A\}$, or you can think of transforming from $\{A\} \to \{D\}$.
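A short sketch of composing a chain of transforms, assuming four hypothetical frames $\{A\}$ through $\{D\}$ related by rotations about z plus displacements. The helper names and all numbers are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transform(theta_z, p):
    """4x4 transform with a rotation about z by theta_z and a displacement p."""
    c, s = math.cos(theta_z), math.sin(theta_z)
    return [
        [c, -s, 0.0, p[0]],
        [s, c, 0.0, p[1]],
        [0.0, 0.0, 1.0, p[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(T, point):
    h = list(point) + [1.0]
    return [sum(t * x for t, x in zip(row, h)) for row in T[:3]]

T_ab = transform(math.pi / 2, [1.0, 0.0, 0.0])    # {B} described in {A}
T_bc = transform(0.0, [0.0, 2.0, 0.0])            # {C} described in {B}
T_cd = transform(-math.pi / 2, [0.0, 0.0, 3.0])   # {D} described in {C}

T_ad = matmul(T_ab, matmul(T_bc, T_cd))           # {D} described in {A}

p_d = [1.0, 0.0, 0.0]                             # a point described in {D}
stepwise = apply(T_ab, apply(T_bc, apply(T_cd, p_d)))   # rightmost transform acts first
direct = apply(T_ad, p_d)

print(stepwise)
print(direct)
```

The stepwise mapping applies the rightmost transform first, while the composite matrix reads left to right as $\{A\} \to \{B\} \to \{C\} \to \{D\}$. Both give the same point, approximately (0, 0, 3).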