Owing to the continual growth of multimodal data (or feature spaces), there has been rising interest in multimedia applications (e.g., object classification and search) over these heterogeneous data. However, the accuracy of classification and search tasks depends heavily on the distance estimation between data samples, and the simple Euclidean (EU) distance has been shown to be inadequate. Previous research has therefore focused on learning a robust distance metric to quantify the relationships among data samples. In this context, existing distance metric learning (DML) algorithms mainly leverage label information in the target domain for model training and may fail when label information is scarce. As an improvement, transfer metric learning (TML) approaches have been proposed to leverage information from other related domains. However, current TML algorithms assume that different domains share the same representation; thus, they are not applicable in heterogeneous settings where the data representations of different domains vary. In this research, we propose xTML, a novel unified heterogeneous transfer metric learning framework, to improve the distance estimation in the domains of interest (i.e., the target domains in classification and search tasks) when only limited label information, complemented by extensive unlabeled data, is available for model training. We further illustrate how the proposed framework can be applied to a selected list of multimedia applications, including opinion mining, deception detection, and online product search. © 2005-2012 IEEE.