项目欧拉第14号哈斯克尔斯克、欧拉、项目

2023-09-11 23:15:33 作者:﹎葬¦龙王¦

我试图来解决项目欧拉( http://projecteuler.net/problem=14 )和我打使用哈斯克尔是死路一条。

I'm trying to resolve problem 14 of Project Euler (http://projecteuler.net/problem=14) and I hit a dead end using Haskell.

现在,我知道这些数字可能是足够小,我可以做一个蛮力,但是这不是我的工作的目的。 我试图记住中间结果在地图类型地图整型(布尔,整数)与意义

Now, I know that the numbers may be small enough and I could do a brute force, but that isn't the purpose of my exercise. I am trying to memorize the intermediate results in a Map of type Map Integer (Bool, Integer) with the meaning of:

- the first Integer (the key) holds the number
- the Tuple (Bool, Interger) holds either (True, Length) or (False, Number) 
                                           where Length = length of the chain
                                                 Number = the number before him

例如:

  for 13: the chain is 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
  My map should contain : 
  13 - (True, 10)
  40 - (False, 13)
  20 - (False, 40)
  10 - (False, 20)
  5  - (False, 10)
  16 - (False, 5)
  8  - (False, 16)
  4  - (False, 8)
  2  - (False, 4)
  1  - (False, 2)

现在,当我搜索其他一些像 40 我知道,环比(10 - 1)长度等等。 我现在想,如果我搜索了10,不仅要告诉我10的长度(10 - 3)长度和更新地图,而且我想更新20,40的情况下,他们仍然是(假,_)

Now when I search for another number like 40 i know that the chain has (10 - 1) length and so on. I want now, if I search for 10, not only to tell me that length of 10 is (10 - 3) length and update the map, but also I want to update 20, 40 in case they are still (False, _)

我的code:

import Data.Map as Map

solve :: [Integer] -> Map Integer (Bool, Integer)
solve xs    = solve' xs Map.empty
    where
        solve' :: [Integer] -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
        solve' []     table = table
        solve' (x:xs) table =
            case Map.lookup x table of
                Nothing     -> countF x 1 (x:xs) table
                Just     (b, _) ->
                    case b of
                        True    -> solve' xs table
                        False   -> {-WRONG-} solve' xs table

        f :: Integer -> Integer
        f x
            | x `mod` 2 == 0    = x `quot` 2
            | otherwise     = 3 * x + 1

        countF :: Integer -> Integer -> [Integer] -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
        countF n cnt (x:xs) table
            | n == 1    = solve' xs (Map.insert x (True, cnt) table)
            | otherwise = countF (f n) (cnt + 1) (x:xs) $ checkMap (f n) n table

        checkMap :: Integer -> Integer -> Map Integer (Bool, Integer) -> Map Integer (Bool, Integer)
        checkMap n rez table    =
            case Map.lookup n table of
                Nothing -> Map.insert n (False, rez) table
                Just _  -> table

在{-WRONG-}一部分,我们应该更新所有的值就像在下面的例子:

At the {-WRONG-} part we should update all the values like in the following example:

--We are looking for 10:
  10 - (False, 20)
     |
     V                                   {-finally-} update 10 => (True, 10 - 1 - 1 - 1)
  20 - (False, 40)                                      ^
     |                                                  |
     V                                  update 20 => 20 - (True, 10 - 1 - 1)
  40 - (False, 13)                          ^
     |                                      |
     V                      update 40 => 40 - (True, 10 - 1)
  13 - (True, 10)              ^
     |                         |
     ---------------------------

现在的问题是,我不知道,如果它可以做到在功能2之类的更新数量,并继续recurence。在 C 喜欢语言,我可以做这样的事情(伪code):

The problem is that I don't know if its possible to do 2 things in a function like updating a number and continue the recurence. In a C like language I may do something like (pseudocode):

void f(int n, tuple(b,nr), int &length, table)
{
      if(b == False) f (nr, (table lookup nr), 0, table);
      // the bool is true so we got a length
      else
      {
            length = nr;
            return;
      }
      // Since this is a recurence it would work as a stack, producing the right output
      table update(n, --cnt);
}

最后一条指令会的工作,因为我们都通过引用发送CNT。此外,我们一直都知道它会完成在某些时候和CNT不应< 1。

The last instruction would work since we are sending cnt by reference. Also we always know that it will finish at some point and cnt should not be < 1.

推荐答案

最简单的优化(如您已确定)的记忆化。您试图创建一个记忆化系统自己,但所遇到的问题,对如何存储memoized值。有解决方案,在一个维护的方式这样做,如使用国单子或 STArray 。然而,有一个更简单的解决问题的方法 - 利用Haskell的现有记忆化。哈斯克尔默认记得常数值,因此,如果您创建一个值,该值存储在Collat​​z值,它会自动memoized!

The easiest optimization (as you have identified) is memoization. You have attempted create a memoization system yourself, however have come across issues on how to store the memoized values. There are solutions to doing this in a maintainable way, such as using a State monad or a STArray. However, there is a much simpler solution to your problem - use haskell's existing memoization. Haskell by default remembers constant values, so if you create a value that stores the collatz values, it will be automatically memoized!

这方面的一个简单的例子是以下斐波纳契定义:

A simple example of this is the following fibonacci definition:

fib :: Int -> Integer
fib n = fibValues !! n where
  fibValues = 1 : 1 : zipWith (+) fibValues (tail fibValues)

fibValues​​ [整数] ,并且因为它仅仅是一个恒定值,它是memoized。但是,这并不意味着它是所有memoized一次,因为它是一个infinte名单,这将永远不会结束。相反,在需要时值只计算,因为Haskell是懒惰。

The fibValues is a [Integer], and as it is just a constant value, it is memoized. However, that doesn't mean it is all memoized at once, since as it is an infinte list, this would never finish. Instead, the values are only calculated when needed, as haskell is lazy.

所以,如果你与你的问题类似的东西,你会得到记忆化没有很多的工作。然而,使用一个列表像上面不会在解决方案工作得很好。这是因为的Collat​​z算法使用许多不同的值,以得到的结果为给定数目,因此使用将需要随机存取的容器是有效的。最明显的选择是一个数组。

So if you do something similar with your problem, you will get memoization without a lot of the work. However, using a list like above won't work well in your solution. This is because the collatz algorithm uses many different values to get the result for a given number, so the container used will require random access to be efficient. The obvious choice is an array.

collatzMemoized :: Array Integer Int

接下来,我们需要填写正确的值的数组。我会写这个函数pretending一个在Collat​​z 函数存在,用于计算任何n个在Collat​​z值。另外请注意,阵列是固定的尺寸,因此一个值需要被用来确定最大数量memoize的。我会用一万美元,但任何值都可以使用(它是一个内存/速度的权衡)。

Next, we need to fill up the array with the correct values. I'll write this function pretending a collatz function exists that calculates the collatz value for any n. Also, note that arrays are fixed size, so a value needs to be used to determine the maximum number to memoize. I'll use a million, but any value can be used (it is a memory/speed tradeoff).

collatzMemoized = listArray (1, maxNumberToMemoize) $ map collatz [1..maxNumberToMemoize] where
  maxNumberToMemroize = 1000000

这是pretty的直白,在 listArray 给定范围,并在此范围内的所有在Collat​​z值的列表是考虑到它。请记住,这将不会计算所有在Collat​​z值直线距离,因为价值观是懒惰的。

That is pretty straightforward, the listArray is given bounds, and the a list of all the collatz values in that range is given to it. Remember that this won't calculate all the collatz values straight away, as the values are lazy.

现在,的Collat​​z函数可以写成。最重要的部分是只检查 collat​​zMemoized 数组,如果被检查的数量是它的范围之内:

Now, the collatz function can be written. The most important part is to only check the collatzMemoized array if the number being checked is within its bounds:

collatz :: Integer -> Int
collatz 1 = 1
collatz n
  | inRange (bounds collatzMemoized) nextValue = 1 + collatzMemoized ! nextValue
  | otherwise = 1 + collatz nextValue
  where
    nextValue = case n of
      1 -> 1
      n | even n -> n `div` 2
        | otherwise -> 3 * n + 1

在ghci中,你可以看到现在的记忆化的有效性。尝试在Collat​​z 200000 。这将需要大约2秒完成。不过,如果你再次运行它,它会立即完成。

In ghci, you can now see the effectiveness of the memoization. Try collatz 200000. It will take about 2 seconds to finish. However, if you run it again, it will complete instantly.

最后,该溶液可以发现:

Finally, the solution can be found:

maxCollatzUpTo :: Integer -> (Integer, Int)
maxCollatzUpTo n = maximumBy (compare `on` snd) $ zip [1..n] (map collatz [1..n]) where

再印:

main = print $ maxCollatzUpTo 1000000

如果您运行的主,结果将被打印在大约10秒。

If you run main, the result will be printed in about 10 seconds.

现在,一个小问题,这种方法是它使用了大量的堆栈空间。它会正常工作在ghci中(这似乎采用更灵活的问候堆栈空间)。但是,如果你编译它,并尝试运行可执行文件,它会崩溃(带堆栈空间溢出)。因此,要运行该程序,您必须指定更多的时候,你编译它。这可以通过将 -with-rtsopts ='K64m的编译选项来完成。这增加了堆栈64MB的。

Now, a small problem with this approach is it uses a lot of stack space. It will work fine in ghci (which seems to use be more flexible with regards to stack space). However, if you compile it and try to run the executable, it will crash (with a stack space overflow). So to run the program, you have to specify more when you compile it. This can be done by adding -with-rtsopts='K64m' to the compile options. This increases the stack to 64mb.

现在该程序可以编译和运行:

Now the program can be compiled and ran:

> ghc -O3 --make -with-rtsopts='-K6m' problem.hs

运行 ./问题将给出结果,在不到一秒钟。

Running ./problem will give the result in less than a second.