FPGrowth algorithm in Spark

2023-09-11 05:52:29 Author: 小怪兽爱上了奥特曼


I am trying to run an example of the FPGrowth algorithm in Spark; however, I am running into an error. This is my code:

import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.fpm.{FPGrowth, FPGrowthModel}

val transactions: RDD[Array[String]] = sc.textFile("path/transations.txt").map(_.split(" ")).cache()

val fpg = new FPGrowth().setMinSupport(0.2).setNumPartitions(10)

val model = fpg.run(transactions)

model.freqItemsets.collect().foreach { itemset => println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)}


The code works up until the last line, where I get the error:

WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID 16, ip-10-0-0-###.us-west-1.compute.internal): 
com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Can not set 
final scala.collection.mutable.ListBuffer field org.apache.spark.mllib.fpm.FPTree$Summary.nodes to scala.collection.mutable.ArrayBuffer
Serialization trace:
nodes (org.apache.spark.mllib.fpm.FPTree$Summary)


I have even tried to use the solution that was proposed here: SPARK-7483


I haven't had any luck with that either. Has anyone found a solution to this? Or does anyone know of a way to just view the results, or save them to a text file?


Any help would be greatly appreciated!


I also found the full source code for this algorithm - http://mail-archives.apache.org/mod_mbox/spark-commits/201502.mbox/%3C1cfe817dfdbf47e3bbb657ab343dcf82@git.apache.org%3E

Recommended answer


I got the same error. It is caused by the Spark version: this is fixed in Spark 1.5.2, but I was using 1.3. I worked around it by doing the following:


I switched from using spark-shell to spark-submit and then changed the Kryo serializer configuration. Here is my code:


import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.fpm.FPGrowth
import scala.collection.mutable.ArrayBuffer
import scala.collection.mutable.ListBuffer


object fpgrowth {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark FPGrowth")
    // Register the buffer classes involved in the serialization error with Kryo
    conf.registerKryoClasses(Array(classOf[ArrayBuffer[String]], classOf[ListBuffer[String]]))

    val sc = new SparkContext(conf)

    val data = sc.textFile("<path to file.txt>")

    val transactions: RDD[Array[String]] = data.map(s => s.trim.split(' '))

    val fpg = new FPGrowth()
      .setMinSupport(0.2)
      .setNumPartitions(10)
    val model = fpg.run(transactions)

    model.freqItemsets.collect().foreach { itemset =>
      println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
    }
  }
}
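On the question's other ask, viewing or saving the results: the frequent itemsets can be formatted as strings and written out. Here is a minimal sketch of the formatting step; the `Itemset` case class and the sample data are stand-ins for `model.freqItemsets.collect()` so the logic can be shown without a running SparkContext, and the output path in the final comment is only an example.

```scala
// Stand-in for org.apache.spark.mllib.fpm.FPGrowth.FreqItemset,
// used here so the formatting can run without Spark.
case class Itemset(items: Array[String], freq: Long)

// Hypothetical sample itemsets, in place of model.freqItemsets.collect()
val itemsets = Seq(Itemset(Array("a", "b"), 3L), Itemset(Array("c"), 5L))

// Same formatting the answer prints to the console, e.g. "[a,b], 3"
val lines = itemsets.map(i => i.items.mkString("[", ",", "]") + ", " + i.freq)
lines.foreach(println)

// With a real model, the same mapping can be applied to the RDD and
// written to disk directly instead of collecting to the driver:
// model.freqItemsets
//   .map(i => i.items.mkString("[", ",", "]") + ", " + i.freq)
//   .saveAsTextFile("output/freq-itemsets")
```

Mapping the RDD and calling `saveAsTextFile` avoids pulling every itemset onto the driver, which matters once the set of frequent itemsets gets large.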