I have XY data (a 2D tSNE embedding of high-dimensional data) that I'd like to show as a scatter plot. The data are assigned to several clusters, so I'd like to color-code the points by cluster and then add a single label for each cluster that has the same color as its cluster and is located outside the cluster's points (as much as possible).
Any idea how to do this in R, using either ggplot2 with ggrepel, or plotly?
Here's the example data (the XY coordinates and cluster assignments are in df, and the labels are in label.df) and the ggplot2 part of it:
library(dplyr)
library(ggplot2)

set.seed(1)

# Simulate five clusters of 50 points each, centered at (i, i) for i = 1, 5, 9, 13, 17
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i) {
  data.frame(x = rnorm(50, mean = i, sd = 1), y = rnorm(50, mean = i, sd = 1), cluster = i)
}))
df$cluster <- factor(df$cluster)

# One label per cluster
label.df <- data.frame(cluster = levels(df$cluster), label = paste0("cluster: ", levels(df$cluster)))

ggplot(df, aes(x = x, y = y, color = cluster)) +
  geom_point() +
  theme_minimal() +
  theme(legend.position = "none")
The geom_label_repel() function in the ggrepel package lets you add labels to a plot while "repelling" them so that they do not overlap each other or the data points they are attached to. It takes only a slight addition to your existing code: summarize the data to get the coordinates where each label should go (here I chose the upper-left-ish region of each cluster, i.e. the minimum of x and the maximum of y), and merge the result with your existing data frame containing the cluster labels. Pass this data frame to geom_label_repel() via its data argument, and inside aes() map the label aesthetic to the column that holds the label text.
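library(dplyr)
library(ggplot2)
library(ggrepel)

set.seed(1)

# Same simulated data as in the question
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i) {
  data.frame(x = rnorm(50, mean = i, sd = 1), y = rnorm(50, mean = i, sd = 1), cluster = i)
}))
df$cluster <- factor(df$cluster)
label.df <- data.frame(cluster = levels(df$cluster), label = paste0("cluster: ", levels(df$cluster)))

# Label position per cluster: min of x and max of y (upper-left-ish corner),
# joined to the label text on the cluster column
label.df_2 <- df %>%
  group_by(cluster) %>%
  summarize(x = min(x), y = max(y)) %>%
  left_join(label.df, by = "cluster")

# The label layer inherits color = cluster, so each label matches its cluster's color
ggplot(df, aes(x = x, y = y, color = cluster)) +
  geom_point() +
  theme_minimal() +
  theme(legend.position = "none") +
  ggrepel::geom_label_repel(data = label.df_2, aes(label = label))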
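As a variation on the approach above, here is a minimal sketch that anchors each label at its cluster centroid and lets ggrepel push it away from the points. It relies on the behavior described in the ggrepel documentation that rows with an empty-string label are not drawn but still repel the non-empty labels. The names centroids and repel.df and the values chosen for box.padding and min.segment.length are illustrative assumptions, not part of the original answer, and max.overlaps assumes a reasonably recent ggrepel version.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

set.seed(1)

# Same simulated data as above
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i) {
  data.frame(x = rnorm(50, mean = i, sd = 1), y = rnorm(50, mean = i, sd = 1), cluster = i)
}))
df$cluster <- factor(df$cluster)

# One row per cluster, positioned at the cluster centroid (illustrative choice)
centroids <- df %>%
  group_by(cluster) %>%
  summarize(x = mean(x), y = mean(y), .groups = "drop") %>%
  mutate(label = paste0("cluster: ", cluster))

# Every data point gets an empty label: it is not drawn, but it still repels
# the five real labels, which pushes them out of the point clouds
repel.df <- bind_rows(mutate(df, label = ""), centroids)

p <- ggplot(df, aes(x = x, y = y, color = cluster)) +
  geom_point() +
  geom_label_repel(
    data = repel.df,
    aes(label = label),
    box.padding = 1,          # assumed value; extra room around each label
    min.segment.length = 0,   # always draw the segment back to the centroid
    max.overlaps = Inf        # do not drop labels in crowded regions
  ) +
  theme_minimal() +
  theme(legend.position = "none")

p
```

How far the labels end up from their clusters depends on the plot size and on the padding values, so these settings may need tuning for real tSNE output.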