What is the correct algorithm for a logarithmic distribution curve between two points?

2023-09-12 21:19:45 Author: 西决丶丶

I've read a bunch of tutorials about the proper way to generate a logarithmic distribution of tag cloud weights. Most of them group the tags into steps. This seems somewhat silly to me, so I developed my own algorithm, based on what I've read, that dynamically distributes each tag's count along the logarithmic curve between the threshold and the maximum. Here's the essence of it in Python:

from math import log
count = [1, 3, 5, 4, 7, 5, 10, 6]
def logdist(count, threshold=0, maxsize=1.75, minsize=.75):
    countdist = []
    # mincount is the threshold, or the smallest count if that is larger
    mincount = max(threshold, min(count))
    maxcount = max(count)
    spread = maxcount - mincount
    # the slope of the line (rise over run) between (mincount, minsize) and (maxcount, maxsize)
    delta = (maxsize - minsize) / float(spread)
    for c in count:
        logcount = log(c - (mincount - 1)) * (spread + 1) / log(spread + 1)
        size = delta * logcount - (delta - minsize)
        countdist.append({'count': c, 'size': round(size, 3)})
    return countdist

Basically, without the logarithmic calculation of the individual count, it would generate a straight line between the points (mincount, minsize) and (maxcount, maxsize).
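To make the straight-line case concrete, here is a minimal sketch of the same mapping with the logarithm left out. The name `lineardist` is hypothetical and not part of the original code, and the sketch assumes the counts are not all equal.

```python
def lineardist(count, threshold=0, maxsize=1.75, minsize=.75):
    """Map counts linearly onto [minsize, maxsize] (hypothetical helper)."""
    mincount = max(threshold, min(count))
    maxcount = max(count)
    # slope of the line between (mincount, minsize) and (maxcount, maxsize);
    # assumes maxcount > mincount, otherwise this divides by zero
    delta = (maxsize - minsize) / float(maxcount - mincount)
    return [round(minsize + delta * (c - mincount), 3) for c in count]

sizes = lineardist([1, 3, 5, 4, 7, 5, 10, 6])
```

With these sample counts, the smallest count (1) maps to minsize and the largest (10) maps to maxsize, with everything else spaced evenly in between.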

The algorithm gives a good approximation of the curve between the two points, but it suffers from one drawback. The mincount is a special case: the logarithm term for it is zero, because log(c - (mincount - 1)) is log(1) when c equals mincount. This means the size computed for the mincount is less than minsize. I've tried cooking up numbers to solve this special case, but can't seem to get it right. Currently I just treat the mincount as a special case and add " or 1" to the logcount line.
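The size of the shortfall can be checked by substituting c = mincount into the loop body: logcount becomes 0, so size comes out as minsize - delta. The sketch below uses the question's sample counts and default sizes; `size_for` is a hypothetical wrapper around the two lines of the loop body.

```python
from math import log

count = [1, 3, 5, 4, 7, 5, 10, 6]
minsize, maxsize = .75, 1.75
# threshold is 0 here, so mincount is simply the smallest count
mincount, maxcount = min(count), max(count)
spread = maxcount - mincount
delta = (maxsize - minsize) / float(spread)

def size_for(c):
    # same two lines as the loop body in the question (hypothetical wrapper)
    logcount = log(c - (mincount - 1)) * (spread + 1) / log(spread + 1)
    return delta * logcount - (delta - minsize)

low = size_for(mincount)    # minsize - delta, i.e. below minsize
high = size_for(maxcount)   # maxsize, up to floating-point error
```

So the upper endpoint is hit, while the lower endpoint is off by exactly one delta.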

Is there a more correct algorithm to draw a curve between the two points?

Update Mar 3: If I'm not mistaken, I am taking the log of the count and then plugging it into a linear equation. To describe the special case another way: for y = ln x, y = 0 at x = 1. This is what happens at the mincount. But the mincount can't be zero, because a tag is never used 0 times.

Try the code and plug in your own numbers to test. Treating the mincount as a special case is fine by me; I have a feeling it would be easier than whatever the actual solution to this problem is. I just feel like there must be a solution to this, and that someone has probably already come up with it.

Update Apr 6: A simple Google search turns up many of the tutorials I've read, but this is probably the most complete example of stepped tag clouds.

Update Apr 28: In response to antti.huima's solution: when graphed, the curve that your algorithm creates lies below the line between the two points. I've been juggling the numbers around but still can't come up with a way to flip that curve to the other side of the line. I'm guessing that if the function were changed to some form of logarithm instead of an exponent, it would do exactly what I need. Is that correct? If so, can anyone explain how to achieve this?

Recommended answer

Thanks to antti.huima's help, I rethought what I was trying to do.

Taking his method of solving the problem, I want an equation in which the logarithmic curve agrees with the linear equation between the two points at the mincount, that is, weight(MIN) = min_weight:

weight(MIN) = ln(MIN-(MIN-1)) + min_weight
min_weight = ln(1) + min_weight

While this gives me a good starting point, I need to make the curve pass through the point (MAX, max_weight). That requires a constant:

weight(x) = ln(x-(MIN-1))/K + min_weight

Solving for K, we get:

K = ln(MAX-(MIN-1))/(max_weight - min_weight)
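The algebra behind this value of K is not written out in the answer. Spelled out, it comes from requiring weight(MAX) = max_weight in the equation for weight(x):

```latex
\begin{align*}
\text{max\_weight} &= \frac{\ln(\text{MAX} - (\text{MIN} - 1))}{K} + \text{min\_weight} \\
\text{max\_weight} - \text{min\_weight} &= \frac{\ln(\text{MAX} - (\text{MIN} - 1))}{K} \\
K &= \frac{\ln(\text{MAX} - (\text{MIN} - 1))}{\text{max\_weight} - \text{min\_weight}}
\end{align*}
```

Note that K is zero when MAX equals MIN, since ln(1) = 0, so weight(x) is undefined in that case.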

So, putting all of this back into Python code:

from math import log
count = [1, 3, 5, 4, 7, 5, 10, 6]
def logdist(count, threshold=0, maxsize=1.75, minsize=.75):
    countdist = []
    # mincount is the threshold, or the smallest count if that is larger
    mincount = max(threshold, min(count))
    maxcount = max(count)
    constant = log(maxcount - (mincount - 1)) / (maxsize - minsize)
    for c in count:
        size = log(c - (mincount - 1)) / constant + minsize
        countdist.append({'count': c, 'size': round(size, 3)})
    return countdist
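As a sanity check, the sketch below re-implements the function with two guards and compares its output against the straight line between the endpoints. The name `logdist_safe`, the clamping of counts below mincount, and the fallback for all-equal counts are assumptions added for illustration; none of them appear in the original answer.

```python
from math import log

def logdist_safe(count, threshold=0, maxsize=1.75, minsize=.75):
    """Variant of logdist with guards for two edge cases (hypothetical name).

    Assumptions not in the original answer: counts below mincount are
    clamped to minsize, and if no count exceeds mincount every tag
    gets minsize.
    """
    mincount = max(threshold, min(count))
    maxcount = max(count)
    if maxcount <= mincount:
        # log(1) == 0 would make the constant zero; avoid dividing by it
        return [{'count': c, 'size': minsize} for c in count]
    constant = log(maxcount - (mincount - 1)) / (maxsize - minsize)
    countdist = []
    for c in count:
        clamped = max(c, mincount)  # keeps the log argument >= 1
        size = log(clamped - (mincount - 1)) / constant + minsize
        countdist.append({'count': c, 'size': round(size, 3)})
    return countdist

def chord(c, mincount=1, maxcount=10, minsize=.75, maxsize=1.75):
    # straight line between (mincount, minsize) and (maxcount, maxsize)
    return minsize + (maxsize - minsize) * (c - mincount) / float(maxcount - mincount)

count = [1, 3, 5, 4, 7, 5, 10, 6]
sizes = {d['count']: d['size'] for d in logdist_safe(count)}
```

For the sample data, the count 1 maps to 0.75 and the count 10 maps to 1.75, so both endpoints are hit. Every intermediate size sits on or above the chord, which is the side of the line asked for in the Apr 28 update.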