红宝石:字符串重建算法中,工作只是部分红宝石、字符串、算法、部分

2023-09-11 23:31:31 作者:可怜沒ィ嗳

我工作的一个字符串重建算法(动态编程示例的经典,把空间较少的文本为正常间距的文本)的红宝石。 的code以下是纯粹的红宝石,你可以复制粘贴,并开始测试立即,它的工作80%的时间,往往打破,大词典的。我已经与更多然后80K字字典测试,它的工作原理不太好,大约70%的时间。

如果有一种方法,使其工作100%,如果这个词是present在字典中,请告诉我。

这里的code:(这是很好间隔应该很可读)的

 在纯Ruby#部分工作串重建算法中

#字典
高清字典(someWord)
  myArray的=,最好,域,我的,成功,图像,缩放工具,高,高科技,犯罪,单位,姓名, 边缘,时代,寻找,A,讨价还价,自由,意气风发,风格,我,命令,走出去,直接,到,吊带,中,力]
  返回!!(myArray.index(someWord))
结束


#由http://cseweb.ucsd.edu/classes/wi12/cse202-a/lecture6-final.pdf启发

##,请取消对任何一个你想使用
#
#(使用的所有的话是present在字典上图)
#
#工作句子
  X =''+harnesstheforce
#X =''+hightechcrimeunit
#
#非工作句子
#X =''+findabargain
#X =''+的ICommand

把试图重建#{X}

我们将#有用的变量在我们的算法中使用
N = x.length
K = Array.new(N)
S = Array.new(N)
断点= Hash.new
validBreakpoints = Hash.new

开始

  #让我们填写ķ
  在0到n-1我
    K表[我] =我
  结束

  #核心算法中从这里开始
  S [0] =真
  对于k在1..N-1

    S [K] = FALSE

    对于j在1..k
      如果s [J-1]安培;&安培;字典(X [j..k])
        S [K] =真

        #使用哈希只是一招,不重复
        breakpoints.store(K,真)
      结束
    结束

  结束

  #调试
  把断点:#{breakpoints.inspect}的#{X}

  #让我们创建一个有效的破发点矢量

  I = 1
  而I< = n-1个做

    #我们选择最长的有效字
    breakpoints.keys.sort.each做| K |

      如果我> = K
        下一个
      结束

      #调试:当算法中休息,它所以这里并进入无限循环
      #puts×[#{我} ..#[KP}]:#{X [i..k]}
      如果字典(X [i..k])
        validBreakpoints [I] = K
      结束
    结束

    如果validBreakpoints [I]
      I = validBreakpoints〔Ⅰ〕+ 1
    结束

  结束

  #调试
  把validBreakpoints:#{validBreakpoints.inspect}的#{X}

  #我们在插入了有效的断点定义的地方的空间
  X = x.strip
  I = 0
  validBreakpoints.each_key做|关键|
    validBreakpoints [关键] = validBreakpoints [键] + I
    I + = 1
  结束

  validBreakpoints.each_value做|价值|
    x.insert(价值,'')
  结束

  把调试:X:#{X}


  #我们捕捉CTRL-C
救援SignalException
  退出

救援#结束
结束
 

解决方案 字节跳动 50道高频算法题

请注意,您的算法不能包含单字符单词串。这是一个关闭的情况的一个差错。你是忽略这样的话后,断点,所以你结束了一个字(abargain)不包含在你的字典。

修改

 如果我> = K
     下一个
   结束
 

 如果我> ķ
     下一个
   结束
 

以上的红宝石般

 接下来,如果我> ķ
 

还要注意的是,你正在运行到一个无限循环,只要您的字符串包含一些东西,不是一个字:

 如果validBreakpoints [I]#会是假的
  I = validBreakpoints〔Ⅰ〕+ 1#我不递增,所以重新开始在相同的位置
结束
 

您更好地将此视为一个错误

 返回'<没有解析>除非validBreakpoints [I]#或者如果你是不是在一个函数抛出
I = validBreakpoints〔Ⅰ〕+ 1
 

inotifier的问题是算法的不足。总是选择最长的单词是不好的。在这种情况下,检测到的第一个有效的断点是,在后这让你的无字otifier

I'm working on a string reconstruction algorithm (a classic in dynamic programming examples, turning space less text into normal spaced text) in Ruby. The code below is pure ruby, you can copy paste and start testing immediately, it's working 80% of the time and tends to break, the larger the dictionary becomes. I've tested it with more then 80k words dictionaries and it works less good, about 70% of the time.

If there's a way to make it work 100% if the word is present in the dictionary, please show me.

Here's the code: (it's well spaced and should be very readable)

# Partially working string reconstruction algo in pure Ruby

# the dictionary
def dict(someWord)
  myArray = [" ", "best", "domain", "my", "successes", "image", "resizer", "high", "tech", "crime", "unit", "name", "edge", "times", "find", "a", "bargain", "free", "spirited", "style", "i", "command", "go", "direct", "to", "harness", "the", "force"]
  return !!(myArray.index(someWord))
end


# inspired by http://cseweb.ucsd.edu/classes/wi12/cse202-a/lecture6-final.pdf

## Please uncomment the one you wanna use
#
# (all the words used are present in the dictionary above)
#
# working sentences
  x = ' ' + "harnesstheforce"
# x = ' ' + "hightechcrimeunit"
#
# non working sentences
# x = ' ' + "findabargain"
# x = ' ' + "icommand"

puts "Trying to reconstruct #{x}"

# useful variables we're going to use in our algo
n = x.length
k = Array.new(n)
s = Array.new(n)
breakpoints = Hash.new
validBreakpoints = Hash.new

begin

  # let's fill k
  for i in 0..n-1
    k[i] = i
  end

  # the core algo starts here
  s[0] = true
  for k in 1..n-1

    s[k] = false

    for j in 1..k
      if s[j-1] && dict(x[j..k])
        s[k] = true

        # using a hash is just a trick to not have duplicates
        breakpoints.store(k, true)
      end
    end

  end

  # debug
  puts "breakpoints: #{breakpoints.inspect} for #{x}"

  # let's create a valid break point vector

  i=1
  while i <= n-1 do

    # we choose the longest valid word
    breakpoints.keys.sort.each do |k|

      if i >= k
        next
      end

      # debug: when the algo breaks, it does so here and goes into an infinite loop
      #puts "x[#{i}..#{k}]: #{x[i..k]}"
      if dict(x[i..k])
        validBreakpoints[i] = k
      end
    end

    if validBreakpoints[i]
      i = validBreakpoints[i] + 1
    end

  end

  # debug
  puts "validBreakpoints: #{validBreakpoints.inspect} for #{x}"

  # we insert the spaces at the places defined by the valid breakpoints
  x = x.strip
  i = 0
  validBreakpoints.each_key do |key|
    validBreakpoints[key] = validBreakpoints[key] + i
    i += 1
  end

  validBreakpoints.each_value do |value|
    x.insert(value, ' ')
  end

  puts "Debug: x: #{x}"


  # we capture ctrl-c
rescue SignalException
  abort

# end of rescue
end

解决方案

Note that your algorithm fails for strings containing single-character words. This is an off-by-one error. You are ignoring the breakpoints after such words, thus you end up with a word ("abargain") not contained in your dictionary.

Change

   if i >= k
     next
   end

to

   if i > k
     next
   end

or more Ruby-like

   next if i > k

Note also that you are running into an endless loop whenever your string contains something which is not a word:

if validBreakpoints[i]        # will be false 
  i = validBreakpoints[i] + 1 # i not incremented, so start over at the same position
end

You better treat this as an error

return '<no parse>' unless validBreakpoints[i] # or throw if you are not in a function
i = validBreakpoints[i] + 1

The problem with "inotifier" is a deficiency of your algorithm. Always choosing the longest word is not good. In this case, the first "valid" breakpoint detected is after the "in" which leaves you the non-word "otifier".