I am using FlatFileItemReader and have extended AbstractResource to return a stream from an Amazon S3 object.
S3Object amazonS3Object = s3client.getObject(new GetObjectRequest(bucket, file));
// Return the object's content stream directly; the caller is responsible for closing it.
return amazonS3Object.getObjectContent();
In my batch job I have also implemented a MultiFileResourcePartitioner, to which I gave the bucket so that all of its files are partitioned. I am able to read only part of a few files, after which I get a socket reset error. See the relevant pieces of the error below:
.ResourcelessTransactionManager$ResourcelessTransaction@122ba881]
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=9
2015-08-24 23:24:03 DEBUG StepContextRepeatCallback:68 - Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@252ce07a
2015-08-24 23:24:03 DEBUG StepContextRepeatCallback:76 - Chunk execution starting: queue size=0
2015-08-24 23:24:03 DEBUG ResourcelessTransactionManager:367 - Creating new transaction with name [null]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT
2015-08-24 23:24:03 DEBUG RepeatTemplate:464 - Starting repeat context.
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=1
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=2
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=3
2015-08-24 23:24:03 DEBUG RepeatTemplate:366 - Repeat operation about to start at count=4
2015-08-24 23:24:03 DEBUG DefaultClientConnection:160 - Connection 0.0.0.0:58171<->10.37.135.39:8099 shut down
2015-08-24 23:24:03 DEBUG DefaultClientConnection:176 - Connection 0.0.0.0:58171<->10.37.135.39:8099 closed
Caused by: org.springframework.batch.item.file.NonTransientFlatFileException: Unable to read from resource: [null]
at org.springframework.batch.item.file.FlatFileItemReader.readLine(FlatFileItemReader.java:220)
at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:173)
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.read(AbstractItemCountingItemStreamItemReader.java:83)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:133)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:121)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207)
at com.sun.proxy.$Proxy17.read(Unknown Source)
at org.springframework.batch.core.step.item.SimpleChunkProvider.doRead(SimpleChunkProvider.java:91)
at org.springframework.batch.core.step.item.FaultTolerantChunkProvider.read(FaultTolerantChunkProvider.java:87)
... 22 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:196)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
My requirement is to process files with millions of records from an S3 bucket, and the application runs on AWS. I have already configured the S3 client with retries and a larger number of open connections, which didn't help much.
As @Michael Minella mentioned, one option is to use the Spring Cloud AWS project to obtain the resources:
@Autowired
private ResourcePatternResolver resourcePatternResolver;
public void resolveAndLoad() throws IOException {
    Resource[] allTxtFilesInFolder = this.resourcePatternResolver.getResources("s3://bucket/name/*.txt");
    Resource[] allTxtFilesInBucket = this.resourcePatternResolver.getResources("s3://bucket/**/*.txt");
    Resource[] allTxtFilesGlobally = this.resourcePatternResolver.getResources("s3://**/*.txt");
}
Then pass the resources to your MultiFileResourcePartitioner and see whether the exception is resolved.
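To make the "one partition per resource" idea concrete, here is a minimal, dependency-free sketch of what a partitioner does with the resolved resources. It is not Spring Batch code and not your MultiFileResourcePartitioner: the class name `ResourcePartitionSketch`, the key names, and the example file names `a.txt` and `b.txt` are assumptions made for illustration. In a real job, each plain map would be an ExecutionContext handed to a step-scoped FlatFileItemReader.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Dependency-free model of "one partition per resource"; not Spring Batch code.
class ResourcePartitionSketch {

    // Key names are assumptions chosen for this sketch.
    static final String PARTITION_PREFIX = "partition";
    static final String FILE_NAME_KEY = "fileName";

    // Builds one small context per resource URL, keyed partition0, partition1, ...
    static Map<String, Map<String, String>> partition(List<String> resourceUrls) {
        Map<String, Map<String, String>> partitions = new LinkedHashMap<>();
        int index = 0;
        for (String url : resourceUrls) {
            Map<String, String> context = new LinkedHashMap<>();
            // Each worker step would read exactly the file named in its own context.
            context.put(FILE_NAME_KEY, url);
            partitions.put(PARTITION_PREFIX + index, context);
            index++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical object keys standing in for the resolved S3 resources.
        List<String> urls = Arrays.asList("s3://bucket/name/a.txt", "s3://bucket/name/b.txt");
        partition(urls).forEach((name, ctx) -> System.out.println(name + " -> " + ctx));
    }
}
```

Because each partition carries only a URL, every worker opens its own stream when its step starts rather than sharing one long-lived connection. Whether that removes the connection reset in your setup is something you would still need to verify.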