Title: | Multidimensional Scaling of Asymmetric Proximities |
---|---|
Description: | Multidimensional scaling models and methods for the visualization and analysis of asymmetric proximity data <doi:10.1111/j.2044-8317.1996.tb01078.x>. An asymmetric data matrix has the same number of rows and columns, and these rows and columns refer to the same set of objects. At least some elements in the upper triangle differ from the corresponding elements in the lower triangle. An example of an asymmetric matrix is a student migration table, where the rows correspond to the countries of origin of the students and the columns to the destination countries. This package provides algorithms for three multidimensional scaling models: the slide-vector model <doi:10.1007/BF02294474>, a scaling model with unique dimensions, and the asymscal model for asymmetric multidimensional scaling. Furthermore, a heat map for skew-symmetric data and the decomposition of asymmetry are provided for the exploratory analysis of asymmetric tables. |
Authors: | Berrie Zielman |
Maintainer: | Berrie Zielman <[email protected]> |
License: | GPL (>=3) |
Version: | 2.0.4 |
Built: | 2025-03-05 04:33:44 UTC |
Source: | https://github.com/berriez/asymmetry |
Asymmetry in general, and in proximity relations in particular, means that the relation from $i$ to $j$ is not equal to the relation in the opposite direction, that is, from $j$ to $i$. This package offers functions for the analysis of asymmetry. For instance, the hmap function draws a heatmap of the skew-symmetric part of the data. Other functions fit the slide-vector model, the asymscal model, and a multidimensional scaling model with unique dimensions. A cornerstone of this package is the decomposition of an asymmetric matrix into a symmetric part and a skew-symmetric part. This well-known mathematical decomposition is used extensively in this package.
The analysis of asymmetry was developed as an extension to a symmetric method such as multidimensional scaling. We start with the definition of an asymmetric matrix. An asymmetric data matrix has the same number of rows and columns, and these rows and columns refer to the same set of objects. For a matrix to be asymmetric, at least some elements in the upper triangle must differ from the corresponding elements in the lower triangle.
The decomposition into a symmetric and a skew-symmetric part is usually applied to the data, so that the two components can be studied separately, but it can also be applied to model parameters. Here, it is also applied to residuals, that is, to the deviations of the data from a fitted model.
Zielman, B., and Heiser, W. J. (1993), The analysis of asymmetry by a slide-vector, Psychometrika, 58, 101-114.
This function fits a weighted multidimensional scaling model known as the asymscal model. This model is an extension of the symmetric Euclidean distance model proposed by Young (1975). Here the model is fitted in a stress majorization framework called SMACOF, whereas Young fitted it using a least squares algorithm. Asymmetry is modelled by differential weighting of the dimensions of a multidimensional scaling configuration: when a subject compares object i to j, he or she may use different weights than when comparing object j to i. In addition to these weights, the locations of the objects are jointly estimated from the data.
asymscal(data, ndim = 2, start = NULL, verbose = FALSE, itmax = 10000, eps = 1e-10)
data | Asymmetric dissimilarity matrix |
ndim | Number of dimensions |
start | Optional configuration with starting values, the default is a random start configuration |
verbose | If TRUE, stress values during the iterations are printed |
itmax | Maximum number of iterations |
eps | Convergence criterion for Stress |
This function exploits a connection between the INDSCAL model and the asymscal model. It inherits the methods for plotting and printing from smacofIndDiff in the smacof package. Basically, asymscal takes two steps. First, it sets up the appropriate dissimilarity and missing data structure for a three-way multidimensional scaling model; then it calls smacofIndDiff from the imported package smacof. After correcting for the normalization applied to the data by smacofIndDiff, the results can be displayed and plotted by the methods in the package smacof.
The original algorithm for fitting the asymscal model fits squared distances. This function is based on majorization and fits distances, not squared distances. The configuration matrix is normalized so that the sum of squares of each of its columns equals one.
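To illustrate how row-specific dimension weights produce asymmetry, here is a minimal base R sketch. It does not call the package. The configuration `X`, the weight matrix `W` and the helper `weighted_dist` are hypothetical, and the sketch assumes that the weights of row object i multiply the coordinate differences before the Euclidean norm is taken; the text above does not state the exact form used by asymscal.

```r
# Hypothetical configuration: 3 objects in 2 dimensions
X <- matrix(c(0, 0,
              1, 2,
              3, 1), ncol = 2, byrow = TRUE)

# Hypothetical weights: every row object uses (1, 1) except object 2
W <- matrix(1, nrow = 3, ncol = 2)
W[2, ] <- c(1.35, 0.25)

# Assumed distance: the weights of row object i scale the differences x_i - x_j
weighted_dist <- function(X, W) {
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      D[i, j] <- sqrt(sum((W[i, ] * (X[i, ] - X[j, ]))^2))
    }
  }
  D
}

D <- weighted_dist(X, W)
D[1, 3] - D[3, 1]  # zero: both objects use unit weights
D[1, 2] - D[2, 1]  # nonzero: object 2 weights the dimensions differently
```

In this sketch, pairs of objects that both use unit weights stay symmetric, and asymmetry appears only in pairs involving an object with different weights.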
delta | Observed dissimilarities |
obsdiss | List of observed dissimilarities, normalized |
gspace | Joint configurations aka group stimulus space |
cweights | Configuration weights |
stress | Stress-1 value |
resmat | Matrix with residuals |
rss | Residual sum-of-squares |
spp | Stress per point |
ndim | Number of dimensions |
model | Type of the asymmetric scaling model |
niter | Number of iterations |
nobj | Number of objects |
Young, F. W. (1975). An asymmetric Euclidean model for multi-process asymmetric data. Paper presented at the U.S.-Japan Seminar on Multidimensional scaling, San Diego, U.S.A.
## Not run:
data("asymscalexample")
t <- asymscal(asymscalexample, ndim = 2, itmax = 10000, eps = 1e-10)
t$cweights
round(t$cweights, 3)
plot(t, plot.type = "confplot")
plot(t, plot.type = "bubbleplot")
plot(t, plot.type = "stressplot")
## End(Not run)
This is an artificial dataset. The data are distances from a two-dimensional model, and because of this construction the asymscal model fits these data exactly. Two rows of this matrix have weights different from (1, 1): the fifth subject has weights (1.35, 0.25), and the 15th subject has weights (1.65, 0.425).
data("asymscalexample")
A matrix with 15 rows and 15 columns.
A data matrix with 8 rows and 8 columns. The data are distances between eight English towns; the matrix is made asymmetric by adding a linear skew-symmetric matrix. In this dataset, asymmetry is thus imposed by perturbing the data.
data("Englishtowns")
Constantine, A.G. & Gower, J.C. (1978). Graphical Representation of Asymmetric Matrices. Appl. Statist, 27, 297-304.
data(Englishtowns)
This heatmap displays the values of a skew-symmetric matrix by colors. The option dominance orders the rows and columns of the matrix in such a way that the values in the upper triangle are positive and the values in the lower triangle are negative. The order is calculated from the row sums of the signs obtained from the skew-symmetric matrix.
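The ordering by row sums of signs can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by hmap: the matrix `A` holds made-up values, and sorting in decreasing order is an assumption chosen so that positive values end up in the upper triangle.

```r
# Hypothetical skew-symmetric matrix for four objects (illustrative values)
A <- matrix(c( 0, -2,  1, -3,
               2,  0,  4, -1,
              -1, -4,  0, -5,
               3,  1,  5,  0), nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

# Row sums of the signs: how many objects each object dominates,
# minus how many it is dominated by
score <- rowSums(sign(A))

# Assumed convention: most dominant object first
ord <- order(score, decreasing = TRUE)
A_ord <- A[ord, ord]

A_ord
all(A_ord[upper.tri(A_ord)] > 0)  # upper triangle positive after reordering
```

The upper triangle becomes entirely positive here because the dominance relation in this small example is transitive; with intransitive data some negative values would remain above the diagonal.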
hmap(x, dominance = FALSE, ...)
x | A square matrix, either skew-symmetric or asymmetric, or an object of class |
dominance | If true the signs of the skew-symmetric matrix are shown in the heatmap, if set to false the values in this matrix are shown. |
... | Further plot arguments: see |
data(studentmigration)
hmap(studentmigration, dominance = TRUE, col = c("red", "white", "blue"))
This asymmetric MDS model was proposed by Holman (1979) and is related to a constrained scaling model developed by Bentler & Weeks (1982). The model has two sets of dimensions: shared (common) dimensions that apply to all objects in the analysis, and unique dimensions that each apply to one object only. A unique dimension has a nonzero value for only one object; the coordinates of the other objects on that dimension are zero. There are as many unique dimensions as there are objects. An asymmetric version of this model has two sets of unique dimensions: one for the rows and one for the columns. The distance in this model is the Euclidean distance over the shared dimensions, extended with the unique dimension of the row object and the unique dimension of the column object.
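Read literally, the description above implies a distance of the form $d_{ij} = \sqrt{\sum_{s}(x_{is} - x_{js})^2 + r_i^2 + u_j^2}$, where $r_i$ is the unique row dimension of object i and $u_j$ the unique column dimension of object j. This formula is reconstructed from the text, not quoted from the package. The base R sketch below checks it against an explicit full configuration; all values and the names `X`, `r`, `u`, `rows_full` and `cols_full` are hypothetical.

```r
# Hypothetical parameters: n = 3 objects, p = 2 shared dimensions
X <- matrix(c(0, 0,
              1, 0,
              0, 2), ncol = 2, byrow = TRUE)  # shared coordinates
r <- c(0.5, 1.0, 0.2)  # unique row dimensions, one per object
u <- c(0.3, 0.4, 0.6)  # unique column dimensions, one per object
n <- nrow(X)

# Full configurations: each object gets its own extra axis as a row point
# and another extra axis as a column point; all other entries are zero
rows_full <- cbind(X, diag(r), matrix(0, n, n))
cols_full <- cbind(X, matrix(0, n, n), diag(u))

# Euclidean distances from row points to column points in the full space
D_all <- as.matrix(dist(rbind(rows_full, cols_full)))
D_full <- unname(D_all[1:n, (n + 1):(2 * n)])

# The same distances from the reconstructed closed form
D_closed <- unname(sqrt(as.matrix(dist(X))^2 + outer(r^2, u^2, "+")))

all.equal(D_full, D_closed)
D_closed[1, 2] - D_closed[2, 1]  # nonzero: the distances are asymmetric
```

The asymmetry comes entirely from the unique dimensions, because the shared part of the distance is the same in both directions.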
mdsunique(data, weight = NULL, ndim = 2, verbose = FALSE, itmax = 125, eps = 1e-12)
data | Asymmetric dissimilarity matrix |
weight | Optional non-negative matrix with weights, if no weights are given all weights are set equal to one |
ndim | Number of dimensions |
verbose | If true, prints the iteration history to screen |
itmax | Maximum number of iterations |
eps | Convergence criterion for Stress |
ndim | Number of dimensions of the configuration |
fulldim | Number of dimensions of the full model, this equals |
stress | The raw stress for this model |
confi | Returns the configuration matrix of shared dimensions of this multidimensional scaling model |
X | Returns the configuration matrix of the full model consisting of shared and unique dimensions |
niter | The number of iterations for the algorithm to converge |
nobj | The number of objects in this model |
resid | A matrix with raw residuals |
model | Name of this asymmetric multidimensional scaling model |
row | The unique dimensions for the rows |
col | The unique dimensions for the columns |
unique | The unique dimensions |
## Not run:
data("studentmigration")
mm <- studentmigration
mm[mm == 0] <- .5  # replace zeroes by a small number
mm <- -log(mm / sum(mm))  # convert similarities to dissimilarities
v <- mdsunique(mm, ndim = 2, itmax = 2100, verbose = FALSE, eps = .0000000001)
plot(v, yplus = .3)
## End(Not run)
Method for a two-dimensional plot of the model. Available rownames are plotted as labels above the points. The slide-vector is shown as an arrow.
## S3 method for class 'slidevector'
plot(x, plot.dim = c(1, 2), yplus = 0, xlab, ylab, ...)
## S3 method for class 'mdsunique'
plot(x, plot.dim = c(1, 2), yplus = 0, xlab, ylab, ...)
x | Object of class |
plot.dim | A vector with dimensions to be plotted |
yplus | Parameter to adjust the vertical position of the label |
xlab | Label of x-axis. |
ylab | Label of y-axis. |
... | Further plot arguments: see |
## 2D plot for the slide-vector model on generated data
dis <- matrix(c(1, 2, 3, 4, 5, 6, 2, 8, 9, 3), nrow = 5, ncol = 2)  # configuration
a <- rbind(dis, dis + 1.5)  # generate slide-vector
test <- as.matrix(dist(a))[1:5, 6:10]  # extract data
v <- slidevector(test, ndim = 2, itmax = 250, eps = .001)
plot(v)
This plotting method provides a multidimensional representation of skew-symmetry based on the singular value decomposition (SVD). The properties of the SVD of a skew-symmetric matrix were given by Gower (1977), who also described guidelines for interpreting the diagrams obtained by plotting pairs of singular vectors. The singular vectors of a skew-symmetric matrix come in pairs with equal singular values. The diagrams are not interpreted by comparing distances between points, as is usual in multidimensional scaling, but by comparing areas formed by two points and the origin. Each pair of singular vectors spans a plane, and the area of the triangle between two points and the origin represents skew-symmetry. The sign of the skew-symmetry between two points is modelled by a direction in the plane: going clockwise, the area between two points and the origin is negative; going counter-clockwise, the area is positive.
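Both properties can be checked numerically with a short base R sketch. It uses only `svd()` on made-up data and does not call the plotting method; the vectors `u` and `v` and the helper `area` are hypothetical, and the coordinates plotted by the package may be scaled differently.

```r
# Singular values of a random skew-symmetric matrix come in equal pairs
set.seed(1)
M <- matrix(rnorm(36), 6, 6)
A <- (M - t(M)) / 2
d <- svd(A)$d
pairs_equal <- all(abs(d[c(1, 3, 5)] - d[c(2, 4, 6)]) < 1e-8)
pairs_equal

# Rank-2 illustration: object i is the point (u[i], v[i]) in one plane
u <- c(1, 0, -1, 2)
v <- c(0, 1, 1, 1)
A2 <- outer(u, v) - outer(v, u)  # skew-symmetric matrix of rank 2

# Signed area of the triangle formed by the origin, point i and point j;
# positive when moving counter-clockwise from point i to point j
area <- function(i, j) 0.5 * (u[i] * v[j] - u[j] * v[i])

A2[1, 2]    # skew-symmetry between objects 1 and 2
area(1, 2)  # half of A2[1, 2]
area(2, 1)  # same magnitude, opposite sign
```

In this rank-2 case each entry of the skew-symmetric matrix is exactly twice the signed triangle area, which is why areas rather than distances carry the interpretation.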
## S3 method for class 'skewsymmetry'
plot(x, plot.plane = 1, yplus = 0, xlab, ylab, ...)
x | An object of class skewsymmetry |
plot.plane | Integer indicating which plane to plot |
yplus | Offset for the labels above the object points |
xlab | Label for the x-axis |
ylab | Label for the y-axis |
... | Further plot arguments |
Gower, J.C. (1977). The analysis of asymmetry and orthogonality. In: Recent Developments in Statistics (J. Barra, F. Brodeau, G. Romier & B. van Cutsem, Eds.), 109-123. North Holland, Amsterdam.
The Rapid Alert System for dangerous non-food products (RAPEX) notifies EU member states about risks of products to the health and safety of consumers. Risks for the consumer include choking, strangulation and fire, to name just a few. Products in this database include power banks, clothing, toys and lighters. Dozens of products in the EU are withdrawn from the market every month because they pose a risk to users' health and safety. Market surveillance authorities in EU member states are expected to inform other countries about dangerous products, so that they are removed from the market in other countries. These data are maintained in an exchange system known as RAPEX. Countries can register unsafe products in the RAPEX database; this process is called notification. Other countries may then act on a notification made by one of the other countries. This table is derived from the RAPEX database. The entries in the table give the number of products removed from the row country, that is acted upon by the column country.
The decomposition of an asymmetric matrix into a symmetric matrix and a skew-symmetric matrix is an elementary result from mathematics that is the cornerstone of this package. The decomposition into a skew-symmetric and a symmetric component is written as $Q = S + A$, where $Q$ is an asymmetric matrix, $S$ is a symmetric matrix, and $A$ is a skew-symmetric matrix. This decomposition provides a justification for separate analyses of $S$ and $A$. It is a useful tool for data analysis and graphical representation by areas. A second application is to the study of an asymmetric matrix of residuals, obtained after fitting an MDS model.
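The decomposition itself takes two lines of base R. The sketch below uses a made-up matrix `Q` and does not call the package; it only shows the standard identities $S = (Q + Q^\top)/2$ and $A = (Q - Q^\top)/2$, with the names S and A chosen to match the components returned by skewsymmetry.

```r
# Hypothetical 3 x 3 asymmetric matrix (illustrative values only)
Q <- matrix(c(0, 4, 7,
              2, 0, 5,
              3, 9, 0), nrow = 3, byrow = TRUE)

S <- (Q + t(Q)) / 2  # symmetric part: S equals t(S)
A <- (Q - t(Q)) / 2  # skew-symmetric part: A equals -t(A)

S
A
all.equal(Q, S + A)  # the two parts add up to the original matrix
```

Each off-diagonal pair of the data is thus summarized by its mean (in S) and by half its difference (in A).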
skewsymmetry(x)
x | Asymmetric matrix |
S | The symmetric part of the matrix |
A | The skew-symmetric part of the matrix |
linear | The row means of the skew-symmetric matrix, this amounts to fitting a linear model with row and column effects to the skew-symmetric matrix |
sv | The singular vectors of the skew-symmetric matrix |
sval | A vector containing the singular values of the skew-symmetric part of the data matrix |
nobj | The number of objects |
data("Englishtowns")
Q <- skewsymmetry(Englishtowns)
# the skew-symmetric part
Q$A
The slide-vector model is a multidimensional scaling model for asymmetric proximity data. An asymmetric distance model is fitted to the data, where the asymmetry in the data is represented by the projections of the coordinates of the objects onto the slide-vector. The slide-vector points in the direction of large asymmetries in the data. The distance from $i$ to $j$ is larger than the distance from $j$ to $i$ if point $i$ has a higher projection onto the slide-vector than point $j$. If the line connecting two points is perpendicular to the slide-vector, the difference between the two projections is zero, and the distance between the two points is symmetric. The algorithm for fitting this model is derived from the majorization approach to multidimensional scaling.
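The role of the projections can be illustrated with a small base R sketch. It does not call slidevector. It assumes the model distance has the form $d_{ij} = \lVert x_i - x_j + z \rVert$ with slide-vector $z$, a sign convention chosen here to agree with the statement about projections above; the configuration `X`, the vector `z` and the helper `slide_dist` are hypothetical.

```r
# Hypothetical configuration (rows are objects) and slide-vector
X <- matrix(c(0, 0,
              2, 0,
              0, 3), ncol = 2, byrow = TRUE)
z <- c(1, 0)

# Assumed model distance: d_ij = || x_i - x_j + z ||
slide_dist <- function(X, z) {
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      D[i, j] <- sqrt(sum((X[i, ] - X[j, ] + z)^2))
    }
  }
  D
}

D <- slide_dist(X, z)
proj <- drop(X %*% z)  # projections of the points onto the slide-vector

proj               # object 2 projects highest; objects 1 and 3 project equally
D[2, 1] - D[1, 2]  # positive: object 2 has the higher projection
D[1, 3] - D[3, 1]  # zero: the line from 1 to 3 is perpendicular to z
```

Under this assumed form, the difference of squared distances in the two directions is proportional to the difference of the two projections, which is why perpendicular pairs come out symmetric.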
slidevector(data, weight = NULL, ndim = 2, verbose = FALSE, itmax = 125, eps = 1e-12, rotate = TRUE)
data | Asymmetric dissimilarity matrix |
weight | Optional non-negative matrix with weights, if no weights are given all weights are set equal to one |
ndim | Number of dimensions |
verbose | If TRUE, print the history of iterations |
itmax | Maximum number of iterations |
eps | Convergence criterion for the algorithm |
rotate | If TRUE, the slide-vector is aligned with the first dimension of the configuration |
The slide-vector model is a special case of the unfolding model. Therefore, the algorithm for fitting this model is a constrained version of an unfolding algorithm. The coordinates of the objects are calculated by minimizing a least squares loss function, which is called stress in the multidimensional scaling literature. The stress is minimized by a version of the SMACOF algorithm. The main outputs are the configuration of points and the slide-vector.
ndim | Number of dimensions |
stress | The raw stress for this model |
confi | Returns the configuration matrix of this multidimensional scaling model |
niter | The number of iterations for the algorithm to converge |
nobj | The number of observations in this model |
resid | A matrix with raw residuals |
slvec | Coordinates of the slide-vector |
model | Name of this asymmetric multidimensional scaling model |
Zielman, B., and Heiser, W. J. (1993), The analysis of asymmetry by a slide-vector, Psychometrika, 58, 101-114.
## asymmetric distances between English towns
data(Englishtowns)
v <- slidevector(Englishtowns, ndim = 2, itmax = 250, eps = .001, rotate = TRUE)
plot(v)
The table lists the home and destination countries of 268,142 students who participated in the Erasmus program in the academic year 2012-2013. The 33 rows of this table refer to the home countries, whereas the 33 columns refer to the destination countries. The table gives the number of inbound and outbound students between every pair of countries, and the entries in the table are read as follows: 32 students from Bulgaria studied in the Netherlands, and 18 students from the Netherlands studied in Bulgaria. Macedonia (MK) was excluded from the published table because only one student from Macedonia studied abroad and this country did not receive any students.
data(studentmigration)
A matrix of 33 rows by 33 columns
The Erasmus program is a student exchange program from the European Union. Three million students had taken part since the start of the program in 1987. To join the program, a student has to study for at least three months or do an internship of at least two months in another country. The two-letter country codes are supplied by the ISO (International Organization for Standardization).
Macedonia has been removed from this table because only one student from this country participated in the program, and no students moved to Macedonia.
https://education.ec.europa.eu
data(studentmigration)
hmap(studentmigration)
Prints a decomposition of the sum of squares of an asymmetric matrix. The first column gives the sum of squares, and the second column gives the percentages of the two components. This decomposition can be applied to data, but also to a matrix of residuals obtained from a fitted model.
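The sum of squares splits cleanly because the symmetric and skew-symmetric parts are orthogonal, so their sums of squares add up to the sum of squares of the full matrix. The base R sketch below reproduces such a two-column table for a made-up matrix `Q`; it does not call the summary method, and the row and column labels are hypothetical.

```r
# Hypothetical 3 x 3 asymmetric matrix (illustrative values only)
Q <- matrix(c(0, 4, 7,
              2, 0, 5,
              3, 9, 0), nrow = 3, byrow = TRUE)

S <- (Q + t(Q)) / 2  # symmetric part
A <- (Q - t(Q)) / 2  # skew-symmetric part

# Sum of squares of each component and its share of the total
ss <- c(symmetric = sum(S^2), skew = sum(A^2))
tab <- cbind(SSQ = ss, Percent = 100 * ss / sum(ss))

tab
sum(ss) - sum(Q^2)  # zero: the two components account for the whole matrix
```

A small skew-symmetric percentage indicates that most of the variation in the table is symmetric, which is the kind of reading the printed summary supports.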
## S3 method for class 'skewsymmetry'
summary(object, ...)
object | An object of class |
... | Further parameters |
data(Englishtowns)
q <- skewsymmetry(Englishtowns)
summary(q)