Automatically detect and fix HTML errors on comments/posts. December 6, 2005 2:45 PM
In case of invalid HTML, it will automatically try to fix the markup using HTML Tidy, list the errors, and force a preview.
I don't see any ports of Tidy for ColdFusion.
posted by mathowie (staff) at 3:20 PM on December 6, 2005
Actually, this looks like an ugly hack for it, but if anyone spots anything better, feel free to post it.
posted by mathowie (staff) at 3:23 PM on December 6, 2005
I dont <see any?/
posted by blue_beetle at 5:05 PM on December 6, 2005
Hmm.. how is it a hack?
I just say that because it looks like a simple Java call, albeit with a lot of lines of wrapper code. Doesn't look like anything can be trimmed though.
posted by holloway at 6:02 PM on December 6, 2005
By the way, Radium's been going through comment validation issues in rewriting the SA forum software.
posted by holloway at 6:29 PM on December 6, 2005
Why not just use body.onLoad regexes on the comments?
posted by Civil_Disobedient at 6:44 PM on December 6, 2005
Regexes don't catch half the things Tidy/SGML/XML parsers do. Try making a regex that understands what's wrong in this:
<table><tr><td>
&bsp; <table><tr><td>
</td></tr></table>
posted by holloway at 6:55 PM on December 6, 2005
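For comparison, here is a minimal sketch of the parser-based kind of check holloway is pointing at, using only Python's standard library. The names NestingChecker and check are made up for illustration, and this is not how HTML Tidy works internally; it only shows that a tag stack sees both the unknown entity and the unclosed outer table, which a flat pattern match has no way to track.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser

# A few elements that never take a closing tag (deliberately incomplete).
VOID_TAGS = {"br", "hr", "img"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: keeps a stack of open tags and records problems."""

    def __init__(self):
        # convert_charrefs=False so entity references reach handle_entityref.
        super().__init__(convert_charrefs=False)
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.stack:
            self.problems.append(f"stray </{tag}>")
            return
        # Everything opened after the matching tag was left unclosed.
        while self.stack[-1] != tag:
            self.problems.append(f"<{self.stack.pop()}> not closed before </{tag}>")
        self.stack.pop()

    def handle_entityref(self, name):
        if name not in name2codepoint:
            self.problems.append(f"unknown entity &{name};")


def check(fragment):
    """Return a list of human-readable problems found in an HTML fragment."""
    checker = NestingChecker()
    checker.feed(fragment)
    checker.close()
    # Whatever is still on the stack at the end was never closed.
    for tag in reversed(checker.stack):
        checker.problems.append(f"<{tag}> never closed")
    return checker.problems


if __name__ == "__main__":
    sample = "<table><tr><td>\n&bsp; <table><tr><td>\n</td></tr></table>"
    for problem in check(sample):
        print(problem)
```

Run on the snippet above, it reports the bad &bsp; entity plus the outer td, tr and table that never get closed.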
When did the concept "automatically fix" actually start working? I missed it. I thought that was all a Microsoft beat-off fantasy.
posted by scarabic at 6:59 PM on December 6, 2005
Try making a regex that understands what's wrong in this...
Oh, that's easy. Just replace the < and > with &lt; and &gt; -- users should be making tables in their comments anyway. :)
posted by Civil_Disobedient at 7:09 PM on December 6, 2005
shouldn't, that is.
posted by Civil_Disobedient at 7:14 PM on December 6, 2005
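The escape-everything approach is a one-liner in most languages. A minimal sketch in Python, where escape_comment is a hypothetical name and html.escape is the standard-library call doing the work:

```python
import html


def escape_comment(text):
    """Turn &, < and > into entities so the browser shows them literally."""
    return html.escape(text, quote=False)


if __name__ == "__main__":
    print(escape_comment("<table><tr><td>"))
```

The catch is that this turns off all markup, so links, bold and italics in comments stop working too.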
how is it a hack?
It turns text strings into files before running through tidy. Seems that doing that many thousands of times a day here could be problematic, given all that filesystem and memory use.
posted by mathowie (staff) at 7:17 PM on December 6, 2005
OK then, as you're being awkward, how about this:
<small><b><small>what</b><i></small></i>
posted by holloway at 7:19 PM on December 6, 2005
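One simple way to repair overlapping tags like these is to re-emit the markup through a stack, closing whatever is still open before honouring a close tag and dropping close tags that match nothing. The sketch below does that in Python with the standard library. Rebalancer and rebalance are illustrative names, attributes are dropped for brevity, and HTML Tidy's real repair rules are more involved than this.

```python
from html import escape
from html.parser import HTMLParser

# A few elements that never take a closing tag (deliberately incomplete).
VOID_TAGS = {"br", "hr", "img"}


class Rebalancer(HTMLParser):
    """Hypothetical helper: re-emits a fragment with properly nested tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(f"<{tag}>")  # attributes are dropped in this sketch
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray close tag: drop it
        # Close everything opened after the matching tag, then the tag itself.
        while True:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.out.append(escape(data, quote=False))


def rebalance(fragment):
    """Return the fragment with every open tag closed in stack order."""
    parser = Rebalancer()
    parser.feed(fragment)
    parser.close()
    while parser.stack:
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)


if __name__ == "__main__":
    print(rebalance("<small><b><small>what</b><i></small></i>"))
```

On the fragment above this yields `<small><b><small>what</small></b><i></i></small>`: the inner small is closed before the b, and the trailing `</i>` is dropped because its i was already closed along with the outer small.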
Ah, right. Good point about the temp files. There's a function named tidyParseString in the .NET bindings -- maybe there's something similar for Java.
posted by holloway at 7:28 PM on December 6, 2005
BTW, thanks for that link Holloway. Interesting stuff.
posted by smackfu at 9:42 PM on December 6, 2005
Yeah, that site's pretty good. He's coming at it from the standpoint of writing code Knuth would love. Parsing variables the fewest number of times, and with something that understands *ML, rather than adding another line of regex replacement.
It's that kind of thinking that got me to give up on PHP. It's better than old-style ASP, CFMX, and the rest, but that's not the test anymore. It's up against Ruby and Python, .NET and Perl. 'Cause when I want to write state machines and use SAX/STX, those former languages just get in the way.
I've been writing a cached XML pipeline streaming system based around SAX/STX. Like the 2 previous versions of Phpilfer it's XML, but not so pure as Apache Cocoon, so it can be fast. It's built to avoid the filesystem and to scale across boxes (memcached). Each subsequent version has less code and has been easier to program - it's a really good feeling :)
I don't know whether Radium is a great programmer but his approach is inspiring and it's helped me appreciate algorithms again. Years writing commercial software gave me the idea that a good engineer would pragmatically concentrate on architecture and UI because that's what users want. That's true, but unbalanced. thx radium.
posted by holloway at 1:46 AM on December 7, 2005
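For anyone who hasn't used SAX: the parser streams events to a handler instead of building a tree, and it can read straight from a string in memory, so nothing touches the filesystem. Below is a minimal Python sketch with a cache in front of it. This is not holloway's system. TagCounter, count_tags and count_tags_cached are made-up names, and the plain dict only stands in for something like memcached.

```python
import hashlib
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start tags as the parser streams past them."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(xml_text):
    """Parse XML from an in-memory string; raises SAXParseException if malformed."""
    handler = TagCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


# Stand-in for a shared cache such as memcached (assumption for illustration).
_cache = {}


def count_tags_cached(xml_text):
    """Parse each distinct document once, keyed by a hash of its content."""
    key = hashlib.sha1(xml_text.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = count_tags(xml_text)
    return _cache[key]


if __name__ == "__main__":
    print(count_tags_cached("<post><p>hi</p><p>there</p></post>"))
```

Because the parser insists on well-formed input, a badly nested comment raises an error here instead of slipping through.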
There's a better code sample here
There are also quite a few alternatives to JTidy here
posted by Sharcho at 3:36 AM on December 7, 2005
posted by gleuschk at 3:17 PM on December 6, 2005