How to write files in parallel using the TPL?

2023-09-04 12:58:30 Author: 无视沵的存在

I am trying to save a list of strings to multiple files, each string in a different file, and to do it simultaneously. I do it like this:

public async Task SaveToFilesAsync(string path, List<string> list, CancellationToken ct)
{
    int count = 0;
    foreach (var str in list)
    {
        string fullPath = path + @"\" + count.ToString() + "_element.txt";
        using (var sw = File.CreateText(fullPath))
        {
            await sw.WriteLineAsync(str);
        }
        count++;

        NLog.Trace("Saved in thread: {0} to {1}", 
           Environment.CurrentManagedThreadId,
           fullPath);

        if (ct.IsCancellationRequested)
            ct.ThrowIfCancellationRequested();
    }
}

And I call it like this:

try
{
   var savingToFilesTask = SaveToFilesAsync(@"D:\Test", myListOfString, ct);
}
catch(OperationCanceledException)
{
   NLog.Info("Operation has been cancelled by user.");
}

But in the log file I can clearly see that the saving always happens on the same thread ID, so no parallelism is going on. What am I doing wrong? How do I fix it? My goal is to make all the saving as fast as possible, using all of the computer's cores.

Recommended answer

Essentially, your problem is that foreach is synchronous: it iterates an IEnumerable, which is synchronous, and each iteration awaits its write before the next one starts.
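To illustrate the difference, here is a minimal sketch that compares awaiting inside a loop with starting all operations first and awaiting them together. `FakeWriteAsync` is a hypothetical stand-in for a file write (it only delays), and the class and method names are invented for this example.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class AwaitOrderDemo
{
    // Hypothetical stand-in for an asynchronous file write.
    private static Task FakeWriteAsync() => Task.Delay(200);

    // Awaits each operation before starting the next, as the loop in the question does.
    public static async Task<TimeSpan> RunSequentiallyAsync(int n)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++)
        {
            await FakeWriteAsync();
        }
        return sw.Elapsed;
    }

    // Starts every operation first, then awaits them all together.
    public static async Task<TimeSpan> RunConcurrentlyAsync(int n)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, n).Select(_ => FakeWriteAsync()).ToList();
        await Task.WhenAll(tasks);
        return sw.Elapsed;
    }

    public static async Task Main()
    {
        Console.WriteLine("Sequential: {0}", await RunSequentiallyAsync(5));
        Console.WriteLine("Concurrent: {0}", await RunConcurrentlyAsync(5));
    }
}
```

With five operations of roughly 200 ms each, the sequential run should take about five times as long as the concurrent one, since the delays overlap only in the second case.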

To work around this, first encapsulate the loop body in an asynchronous function.

public async Task WriteToFile(
        string path,
        string str,
        int count)
{
    var fullPath = string.Format("{0}\\{1}_element.txt", path, count);
    using (var sw = File.CreateText(fullPath))
    {
        await sw.WriteLineAsync(str);
    }

    NLog.Trace("Saved in TaskID: {0} to \"{1}\"", 
       Task.CurrentId,
       fullPath);
}

Then, instead of looping synchronously, project the sequence of strings to a sequence of tasks, each performing your encapsulated loop body. This is not an asynchronous operation in itself, but the projection does not block, i.e. there is no await.

Then wait for all of the tasks to finish, in an order determined by the task scheduler.

public async Task SaveToFilesAsync(
        string path,
        IEnumerable<string> list,
        CancellationToken ct)
{
    await Task.WhenAll(list.Select((str, count) => WriteToFile(path, str, count)));
}
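Putting the pieces above together, a self-contained sketch might look like the following. It makes a few assumptions that are not in the original: the NLog call is left out, `Path.Combine` replaces the manual backslash concatenation, the cancellation token parameter is dropped, and the target directory under the temp folder is purely illustrative. Note that the caller awaits the returned task, so that any exception from the writes surfaces at the call site.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSaveSketch
{
    // Writes one string to its own file, named after its position in the list.
    private static async Task WriteToFileAsync(string path, string str, int count)
    {
        var fullPath = Path.Combine(path, count + "_element.txt");
        using (var sw = File.CreateText(fullPath))
        {
            await sw.WriteLineAsync(str);
        }
    }

    // Projects each string to a write task, then waits for all of them.
    public static Task SaveToFilesAsync(string path, IEnumerable<string> list)
    {
        return Task.WhenAll(list.Select((str, count) => WriteToFileAsync(path, str, count)));
    }

    public static async Task Main()
    {
        // Illustrative directory; any writable folder would do.
        var dir = Path.Combine(Path.GetTempPath(), "parallel_save_sketch");
        Directory.CreateDirectory(dir);

        await SaveToFilesAsync(dir, new List<string> { "alpha", "beta", "gamma" });

        // prints "beta"
        Console.WriteLine(File.ReadAllText(Path.Combine(dir, "1_element.txt")).Trim());
    }
}
```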

There is nothing to cancel, so there is no point in passing the cancellation token down.

I've used the indexing overload of Select to provide the count value.
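As a small illustration of that overload, the selector receives each element together with its zero-based position. The class and method names below are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectIndexDemo
{
    // Builds one file name per item from the item's zero-based index.
    public static List<string> BuildFileNames(IEnumerable<string> items)
    {
        return items.Select((item, index) => index + "_element.txt").ToList();
    }

    public static void Main()
    {
        foreach (var name in BuildFileNames(new[] { "a", "b" }))
        {
            Console.WriteLine(name); // prints "0_element.txt", then "1_element.txt"
        }
    }
}
```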

I've changed your logging code to use the current Task ID, which avoids any confusion around scheduling.