I have a folder with 142 tab-delimited text files. Each file has 19 variables followed by a number of rows (usually no more than 30, but it varies). I want to do several things with these files in R automatically, but I can't get exactly what I want with my code. I am new to loops; I took both sections of code below from previous posts here on Stack Overflow, but I can't figure out how to combine what they do.
I want to turn the file name into a variable when reading the files into R, so that each row carries the name of the file it came from.
I also want to concatenate all files (with the file name variable and no header) into one data frame with dimensions Y x 19, where Y is the total number of resulting rows.
I am able to create a list of the 142 data frames using this code:
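# full.names=TRUE returns paths that can be read from any working directory
myFiles <- list.files(path="~/Documents/ForR/", pattern="\\.txt$", full.names=TRUE)
# The files are tab-delimited, so the separator is "\t"
data <- lapply(myFiles, read.table, sep="\t", header=FALSE)
names(data) <- basename(myFiles)
# Tag every row of each data frame with the file it came from
for (i in names(data)) {
  data[[i]]$Source <- i
}
do.call(rbind, data)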
I am also able to create the data frame I want with 19 variables, but the file name is not present:
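folder <- "~/Documents/ForR"
files <- list.files(path=folder, pattern="\\.txt$")
DF <- NULL
for (f in files) {
  # list.files() returns bare file names here, so prepend the folder when reading
  dat <- read.csv(file.path(folder, f), header=FALSE, sep="\t", na.strings="", colClasses="character")
  DF <- rbind(DF, dat)
}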
How do I add the file name (without .txt, if possible) as a variable inside the loop?
Add this line to the loop; it keeps the part of the file name before the first ".": dat$file <- unlist(strsplit(f, split=".", fixed=TRUE))[1]
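folder <- "~/Documents/ForR"
files <- list.files(path=folder, pattern="\\.txt$")
DF <- NULL
for (f in files) {
  # list.files() returns bare file names here, so prepend the folder when reading
  dat <- read.csv(file.path(folder, f), header=FALSE, sep="\t", na.strings="", colClasses="character")
  # Keep the part of the file name before the first "." (drops ".txt")
  dat$file <- unlist(strsplit(f, split=".", fixed=TRUE))[1]
  DF <- rbind(DF, dat)
}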
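For comparison, here is a minimal sketch of the same idea written with lapply instead of growing DF inside a loop. It is not part of the answer above: the temporary folder, the two demo files, and the helper name read_one are all made up for illustration, and tools::file_path_sans_ext is swapped in for the strsplit call so that file names containing more than one dot keep everything before the extension.

```r
# Hypothetical demo data: two small tab-delimited files in a temporary folder
demo_dir <- file.path(tempdir(), "ForR_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("a\t1", "b\t2"), file.path(demo_dir, "first.txt"))
writeLines("c\t3", file.path(demo_dir, "second.txt"))

# full.names = TRUE returns usable paths regardless of the working directory
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# read_one is a hypothetical helper: read one file and tag its rows
read_one <- function(f) {
  dat <- read.table(f, header = FALSE, sep = "\t", na.strings = "",
                    colClasses = "character")
  # File name without directory or extension, repeated for every row
  dat$file <- tools::file_path_sans_ext(basename(f))
  dat
}

# Read every file, then stack the resulting data frames row-wise
DF <- do.call(rbind, lapply(files, read_one))
print(DF)
```

With the real data, demo_dir would presumably be replaced by the folder holding the 142 files, and the three lines that create the demo files would be dropped.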
Shouldn't the row names produced by do.call(rbind, ...) have the form names(list)[n].i, where i runs from 1 to the number of rows of data frame n? If the list is named after the files, you can simply build a column from the row names:
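data <- lapply(myFiles, read.table, sep="\t", header=FALSE)
names(data) <- basename(myFiles)  # rbind builds the row names from the list names
combined.data <- do.call(rbind, data)
# Row names look like "a.txt.1"; strip ".txt" and the row index to keep the file name
combined.data$file_origin <- sub("\\.txt(\\.[0-9]+)?$", "", row.names(combined.data))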
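To see the row-name behaviour without touching any files, the following sketch uses two small in-memory data frames as stand-ins for files that have already been read. The list names and contents are invented for the example. The regular expression treats the trailing row index as optional, because a one-row data frame may keep the bare list name as its row name.

```r
# Hypothetical stand-ins for two files that have already been read in
data <- list(
  "first.txt"  = data.frame(V1 = c("a", "b"), V2 = c("1", "2")),
  "second.txt" = data.frame(V1 = "c", V2 = "3")
)

# rbind on a named list derives the row names from the list names
combined <- do.call(rbind, data)
print(row.names(combined))

# Drop ".txt" plus the optional ".<row index>" suffix to recover the file name
combined$file_origin <- sub("\\.txt(\\.[0-9]+)?$", "", row.names(combined))
print(combined$file_origin)
```

This approach depends on the list being named before rbind is called; with an unnamed list the row names are plain integers and carry no file information.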