I need to implement a Bloom filter, and I am stuck on one point.
With a fixed number of hash functions, how can I calculate the size of a Bloom filter for a given probability of false positives?
For example, I want the filter to have a 10% false positive rate, and I know the number of hash functions and the number of elements in the set.
How can I calculate the size of the Bloom filter that matches that false positive probability?
The formula for this is on Wikipedia. Assuming you have enough hash functions available, you need about 4.8 bits per element for the false positive rate of 0.1 that you specified.
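To make the calculation concrete, here is a minimal sketch based on the standard approximation p ≈ (1 - e^(-kn/m))^k, where m is the number of bits, n the number of elements, k the number of hash functions and p the false positive rate. The function names are made up for illustration, and the result is only as accurate as that approximation. Solving for m with k fixed gives m = -kn / ln(1 - p^(1/k)); if k can be chosen freely, the minimum is m = -n ln(p) / (ln 2)^2.

```python
import math

def false_positive_rate(m, n, k):
    """Approximate false positive rate for m bits, n elements, k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k

def bits_for_fixed_k(n, p, k):
    """Smallest m (in bits) that reaches rate p when k is fixed."""
    return math.ceil(-k * n / math.log(1.0 - p ** (1.0 / k)))

def bits_for_optimal_k(n, p):
    """Smallest m (in bits) when k may be chosen freely."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))

def optimal_k(m, n):
    """Real-valued k that minimises the rate for given m and n."""
    return (m / n) * math.log(2)

if __name__ == "__main__":
    n, p = 1000, 0.1
    for k in (2, 3, 4):
        m = bits_for_fixed_k(n, p, k)
        print(k, m, m / n, false_positive_rate(m, n, k))
    m_min = bits_for_optimal_k(n, p)
    print(m_min, m_min / n, optimal_k(m_min, n))
```

For n = 1000 and p = 0.1 this should come out at roughly 4.8 bits per element in the optimal case, with slightly more needed when k is pinned to 3 or 4.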
In this case the optimal number of hash functions works out to about 3.3, so 3 or 4 would be close to optimal. Note that more hash functions aren't always better: if there are very many hash functions relative to the size of the filter, you quickly set almost all the bits, and you get lots of false positives.
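The effect of too many hash functions can be seen with a quick sketch. It assumes the usual approximation p ≈ (1 - e^(-k / b))^k, where b is the number of bits per element; the helper name `rate_for_bits_per_element` is hypothetical, and 4.8 bits per element is the figure from above.

```python
import math

def rate_for_bits_per_element(bits_per_element, k):
    """Approximate false positive rate with k hashes at a fixed bits/element."""
    return (1.0 - math.exp(-k / bits_per_element)) ** k

# Hold the filter size at 4.8 bits per element and vary only k.
for k in (1, 2, 3, 4, 8, 16):
    print(k, round(rate_for_bits_per_element(4.8, k), 3))
```

With the size held constant, the rate should bottom out around k = 3 or 4 and then climb again, passing 0.5 by k = 16, because most of the bits end up set.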