Regular expression - formatting text in a block - IM

2023-09-11 04:14:36 Author: 沧桑记忆

Hello, I am trying to figure out a regular expression to replace text in an innerHTML block, to provide local text formatting similar in operation to Google IM.

Where: 
_Italics_
!Underline!
*Bold*
-Strike-

Part of the condition is that the text must be wrapped by the symbol, but if a space immediately follows the opening symbol then the trigger is voided; so * bold* would not be bolded, and in * notbold*but this is bold* only "but this is bold" would be bolded.

The innerHTML will contain URLs which have already been converted to hrefs, so in order not to mess with them I have added the following to the front of my regex.

    (?!(?!.*?<a)[^<]*<\/a>)

The following JavaScript does not capture all the results, and the results vary depending on the order in which I run the replacements.

    var boldPattern          = /(?!(?!.*?<a)[^<]*<\/a>)\*([^\s]+[\s\S]?[^\s]+)\*([\s_!-]?)/gi;
    var italicsPattern       = /(?!(?!.*?<a)[^<]*<\/a>)_([^\s]+[\s\S]?[^\s]+)_([\s-!\*]?)/gi;
    var strikethroughPattern = /(?!(?!.*?<a)[^<]*<\/a>)-([^\s]+[\s\S]?[^\s]+)-([\s_!\*]?)/gi;
    var underlinePattern     = /(?!(?!.*?<a)[^<]*<\/a>)!([^\s]+[\s\S]?[^\s]+)!([\s-_\*]?)/gi;
    str = str.replace(strikethroughPattern, '<span style="text-decoration:line-through;">$1</span>$2');
    str = str.replace(boldPattern, '<span style="font-weight:bold;">$1</span>$2');
    str = str.replace(underlinePattern, '<span style="text-decoration:underline;">$1</span>$2');
    str = str.replace(italicsPattern, '<span style="font-style:italic;">$1</span>$2');

The test data, covering every ordered choice of 3 of the 4 styles, looks like:

    1 _-*ISB*-_ 2 _-!ISU!-_ 3 _*-IBS-*_ 4 _*!IBU!*_
    5 _!-IUS-!_ 6 _!*IUB*!_ 7 -_*SIB*_- 8 -_!SIU!_-
    9 -*_SBI_*- 10 -*!SBU!*- 11 -!_SUI_!- 12 -!*SUB*!-
    13 *_-BIS-_* 14 *_!BIU!_* 15 *-_BSI_-* 16 *-!BSU!-*
    17 *!_BUI_!* 18 *!-BUS-!* 19 !_-UIS-_! 20 !_*UIB*_!
    21 !-_USI_-! 22 !-*USB*-! 23 !*_UBI_*! 24 !*-UBS-*!
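
The test data above can also be generated rather than typed by hand. The following is a minimal sketch, assuming the same symbol-to-letter mapping the question uses (I, S, B, U); the helper names `permutations` and `wrapCase` are hypothetical and not part of the original script.

```javascript
// Symbol/letter pairs, in the order used by the hand-written test data.
const styles = [
  { symbol: '_', letter: 'I' },
  { symbol: '-', letter: 'S' },
  { symbol: '*', letter: 'B' },
  { symbol: '!', letter: 'U' },
];

// All ordered selections of k items from the list (k-permutations).
function permutations(items, k) {
  if (k === 0) return [[]];
  const out = [];
  items.forEach((item, i) => {
    const rest = items.slice(0, i).concat(items.slice(i + 1));
    for (const tail of permutations(rest, k - 1)) {
      out.push([item, ...tail]);
    }
  });
  return out;
}

// Wrap a label in the opening symbols, then close them in reverse order.
function wrapCase(perm) {
  const open = perm.map(s => s.symbol).join('');
  const close = perm.map(s => s.symbol).reverse().join('');
  const label = perm.map(s => s.letter).join('');
  return open + label + close;
}

const testCases = permutations(styles, 3).map(wrapCase);
console.log(testCases.length); // 24
console.log(testCases[0]);     // _-*ISB*-_
```

Passing 4 instead of 3 to `permutations` gives the 24 four-deep cases, such as -!_*SUIB*_!-.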

Can you even have a 4-level-deep nested style span, like any of the 24 permutations where all 4 modes are selected, such as:

    -!_*SUIB*_!-

Thanks, I've been fighting this for about a week.

Bonus points for avoiding bad feedback from Mozilla for "Markup should not be passed to innerHTML dynamically." (I don't see how that might be possible when one is changing the formatting).

Thanks a million regex wizards! I am in your debt.

mwolfe.

UPDATE

Using the same href detection as above, and with @talemyn's help, we are now at:

    var boldPattern          = /(?!(?!.*?<a)[^<]*<\/a>)\*([^\s][^\*]*)\*/gi;
    var italicsPattern       = /(?!(?!.*?<a)[^<]*<\/a>)_([^\s][^_]*)_/gi;
    var strikethroughPattern = /(?!(?!.*?<a)[^<]*<\/a>)-([^\s][^-]*)-/gi;
    var underlinePattern     = /(?!(?!.*?<a)[^<]*<\/a>)!([^\s][^!]*)!/gi;
    str = str.replace(strikethroughPattern, '<s>$1</s>');
    str = str.replace(italicsPattern, '<span style="font-style:italic;">$1</span>');
    str = str.replace(boldPattern, '<strong>$1</strong>');
    str = str.replace(underlinePattern, '<u>$1</u>');

Which seems to cover an extreme example:

    _wow *a real* !nice *person! on -stackoverflow* figured- it out_ cool beans.

I think one could use the style spans and a regex lookback to find the previous unclosed span, close it, open a new span with the old format plus the new attribute, close that when required, and open a new span to finish the formatting. But that could get messy, or be impossible to do with regular expressions, as @NovaDenizen points out.
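
To illustrate the close-and-reopen idea, here is a rough sketch that swaps the regular expressions for a plain character scanner: each marker toggles a style, and every run of text is emitted in its own flat span carrying all styles active at that point, so overlapping markers never produce badly nested tags. The names `MARKERS`, `styleFor` and `formatFlat` are hypothetical. The sketch deliberately ignores the whitespace rule, anchors, hyphens inside ordinary words, and HTML escaping, so it is not a drop-in replacement for the patterns above.

```javascript
const MARKERS = new Set(['*', '_', '-', '!']);

// Build one style attribute from the set of active markers.
// Both decorations share the text-decoration property, so they are merged.
function styleFor(active) {
  const parts = [];
  if (active.has('*')) parts.push('font-weight:bold');
  if (active.has('_')) parts.push('font-style:italic');
  const decorations = [];
  if (active.has('-')) decorations.push('line-through');
  if (active.has('!')) decorations.push('underline');
  if (decorations.length) parts.push('text-decoration:' + decorations.join(' '));
  return parts.join(';');
}

function formatFlat(text) {
  const active = new Set(); // markers currently switched on
  let out = '';
  let run = '';             // text collected since the last marker

  // Emit the pending run, wrapped in a span if any style is active.
  const flush = () => {
    if (run === '') return;
    out += active.size
      ? '<span style="' + styleFor(active) + '">' + run + '</span>'
      : run;
    run = '';
  };

  for (const ch of text) {
    if (MARKERS.has(ch)) {
      flush(); // close the current span before the style set changes
      if (active.has(ch)) active.delete(ch); else active.add(ch);
    } else {
      run += ch;
    }
  }
  flush();
  return out;
}

console.log(formatFlat('*a _b* c_'));
```

For the overlapping input `*a _b* c_` this produces three sibling spans (bold, bold plus italic, italic) instead of crossed tags.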

Thank you for all your help. If there are any improvements please let me know. NB: I was unable to use and as the CSS on the site would not render it. Can that be overloaded? [This is for a firefox/greasemonkey/chrome plugin]

UPDATE (almost) FINAL

Using my 'broken' test phrase as an example: as @MikeM correctly stated, it would render correctly (minus the underline) in Google IM whether nested properly or not. Looking at the HTML output for the text in Google IM, I noticed that it happily did not preformat the string but simply did a substitution as required.

So after looking at the site code, which was using resetcss to remove the default styling, I needed to insert the CSS formatting via JavaScript. Stack Overflow to the rescue: http://stackoverflow.com/questions/707565/how-do-you-add-css-with-javascript and http://stackoverflow.com/questions/20107/yui-reset-css-makes-strongemthis-not-work-em-strong

So my solution now looks like:

    ....
    var css = document.createElement("style");
    css.type = "text/css";
    css.innerHTML = "strong, b, strong *, b * { font-weight: bold !important; } \
                em, i, em *, i * { font-style: italic !important; }";
    document.body.appendChild(css);
    ....
    var boldPattern          = /(?!(?!.*?<a)[^<]*<\/a>)\*([^\s][^\*]*)\*/gi;
    var italicsPattern       = /(?!(?!.*?<a)[^<]*<\/a>)_([^\s][^_]*)_/gi;
    var strikethroughPattern = /(?!(?!.*?<a)[^<]*<\/a>)-([^\s][^-]*)-/gi;
    var underlinePattern     = /(?!(?!.*?<a)[^<]*<\/a>)!([^\s][^!]*)!/gi;
    str = str.replace(strikethroughPattern, '<s>$1</s>');
    str = str.replace(italicsPattern, '<i>$1</i>');
    str = str.replace(boldPattern, '<b>$1</b>');
    str = str.replace(underlinePattern, '<u>$1</u>');
    .....

Ta-da, it mostly works!

UPDATE: FINAL SOLUTION

After a last-minute simplification of the anchor element check from @MikeM, and combining the conditions from another Stack Overflow post, we have arrived at a complete working solution.

I also needed to add a check for a one-character style with a closing symbol, since we were replacing trigger tokens side by side.

As @acheong87 reminded me, be careful with \w as it includes the _, so _ was added to the wrapping conditionals for all but the italicsPattern.

    var boldPattern          = /(?![^<]*<\/a>)(^|<.>|[\s\W_])\*(\S.*?\S)\*($|<\/.>|[\s\W_])/g;
    var italicsPattern       = /(?![^<]*<\/a>)(^|<.>|[\s\W])_(\S.*?\S)_($|<\/.>|[\s\W])/g;
    var strikethroughPattern = /(?![^<]*<\/a>)(^|<.>|[\s\W_])-(\S.*?\S)-($|<\/.>|[\s\W_])/gi;
    var underlinePattern     = /(?![^<]*<\/a>)(^|<.>|[\s\W_])!(\S.*?\S)!($|<\/.>|[\s\W_])/gi;
    str = str.replace(strikethroughPattern, '$1<s>$2</s>$3');
    str = str.replace(italicsPattern, '$1<i>$2</i>$3');
    str = str.replace(boldPattern, '$1<b>$2</b>$3');
    str = str.replace(underlinePattern, '$1<u>$2</u>$3');
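
As a quick sanity check, the four final patterns can be wrapped in one function and run against a few short strings. This is only a sketch: the function name `formatIM` and the sample inputs are made up for illustration, while the patterns and the replacement order are copied from the code above.

```javascript
// Hypothetical wrapper around the final patterns; replacement order as above.
function formatIM(str) {
  const boldPattern          = /(?![^<]*<\/a>)(^|<.>|[\s\W_])\*(\S.*?\S)\*($|<\/.>|[\s\W_])/g;
  const italicsPattern       = /(?![^<]*<\/a>)(^|<.>|[\s\W])_(\S.*?\S)_($|<\/.>|[\s\W])/g;
  const strikethroughPattern = /(?![^<]*<\/a>)(^|<.>|[\s\W_])-(\S.*?\S)-($|<\/.>|[\s\W_])/gi;
  const underlinePattern     = /(?![^<]*<\/a>)(^|<.>|[\s\W_])!(\S.*?\S)!($|<\/.>|[\s\W_])/gi;
  str = str.replace(strikethroughPattern, '$1<s>$2</s>$3');
  str = str.replace(italicsPattern, '$1<i>$2</i>$3');
  str = str.replace(boldPattern, '$1<b>$2</b>$3');
  str = str.replace(underlinePattern, '$1<u>$2</u>$3');
  return str;
}

// A plain wrapped word.
console.log(formatIM('this is *bold* text'));
// Two nested styles: the <.> alternative lets bold match right after <i>.
console.log(formatIM('_*both*_'));
// Text inside an anchor is skipped; text after it is still formatted.
console.log(formatIM('<a href="#">go to _top_ now</a> and _this_'));
```

The second case shows why the order of the replacements matters less with these patterns: once the italics pass has produced `<i>…</i>`, the `<.>` and `<\/.>` alternatives allow the bold pass to match directly against the inserted tags.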

Thank you so much everyone (@MikeM, @talemyn, @acheong87, et al.)

mwolfe.

Recommended answer

I recommend that you remove the inner negative look-aheads from your negative look-aheads:

    /(?!(?!.*?<a)[^<]*<\/a>)_it_/.test( ' _it_ <a></a>' );         // true  (correct)
    /(?!(?!.*?<a)[^<]*<\/a>)_it_/.test( '<a> _it_ </a>' );         // false (correct)
    /(?!(?!.*?<a)[^<]*<\/a>)_it_/.test( '<a> _it_ </a> <a></a>' ); // true  (wrong)

    /(?![^<]*<\/a>)_it_/.test( ' _it_ <a></a>' );                  // true  (correct)
    /(?![^<]*<\/a>)_it_/.test( '<a> _it_ </a>' );                  // false (correct)
    /(?![^<]*<\/a>)_it_/.test( '<a> _it_ </a> <a></a>' );          // false (correct)