java.lang.Object
  org.sandev.basics.util.XMLTextProcessing
public class XMLTextProcessing
Provides raw text translation services for XML.
This class uses StringCharacterIterator combined with Character.isWhitespace to do its work. It does not make use of StringTokenizer or StreamTokenizer (not that those two share anything in common either).
| Constructor Summary | |
|---|---|
| XMLTextProcessing() | |
| Method Summary | |
|---|---|
| static java.lang.String | convertFromXML(java.lang.String text): Performs the inverse of the convertToXML character escapes. |
| static java.lang.String | convertToHTML(java.lang.String text, boolean linkHref, boolean linkEmail, boolean translateFormat): Like convertToXML, except less stringent about things like apostrophes, quotes and ampersands. |
| static java.lang.String | convertToXML(java.lang.String text, boolean linkHref, boolean linkEmail, boolean translateFormat): Convert the given text to valid XML plaintext. |
| static void | escapeCharacter(java.lang.StringBuffer buf, char currChar, boolean stringentEscape): Append the character or the equivalent XML escape string to the given buffer. |
| static java.lang.String | getPrefix(java.lang.String token): Return the open parenthesis or other prefix this token starts with, or the empty string if it is unprefixed. |
| static java.lang.String | getSuffix(java.lang.String token): Return the close parenthesis or other suffix this token ends with, or the empty string if it is unsuffixed. |
| static java.lang.String | getXMLTagValue(java.lang.String tagname, java.lang.String input): Given some XML input, retrieve the value of the given tag. |
| static java.lang.String | processToXML(java.lang.String text, boolean linkHref, boolean linkEmail, boolean translateFormat, boolean stringentEscape): Workhorse for the convertToXML and convertToHTML methods. |
| static java.lang.String | translateToken(java.lang.String token, boolean linkHref, boolean linkEmail): If the given token looks like an email address or a hyperlink then make it into one. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public XMLTextProcessing()
| Method Detail |
|---|
public static java.lang.String convertToXML(java.lang.String text,
boolean linkHref,
boolean linkEmail,
boolean translateFormat)
public static java.lang.String convertToHTML(java.lang.String text,
boolean linkHref,
boolean linkEmail,
boolean translateFormat)
public static java.lang.String processToXML(java.lang.String text,
boolean linkHref,
boolean linkEmail,
boolean translateFormat,
boolean stringentEscape)
The tough part about this is that in an HTML display, a space between two characters gets displayed, while a space at the beginning of a line is typically ignored. So "blah &nbsp;blah" is two spaces whereas " &nbsp;blah" at the beginning of a line is one space. As a result, an indented list in text loses its first space character, and cut-and-paste into an editor loses one level of indenting. To avoid this we would need to track whether we are at the beginning of a new line, which doesn't seem worth it. The relative positions look OK.
This also caused annoyances when a sentence ended with two spaces, since the HTML would wrap the &nbsp; onto the next line, indenting it, which looks weird. To avoid that, we skip counting one hard space directly after the end of a sentence.
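The space handling described above can be illustrated with a minimal sketch. It is not the actual processToXML code: the class name HardSpaceSketch, the helpers endsSentence and hardenSpaces, and the choice of '.', '!' and '?' as sentence terminators are all assumptions made for illustration.

```java
final class HardSpaceSketch {

    /** Hypothetical helper: true if c plausibly ends a sentence (an assumption). */
    static boolean endsSentence(char c) {
        return c == '.' || c == '!' || c == '?';
    }

    /**
     * Keep the first space of each run as a plain space and turn the
     * remaining spaces into &nbsp; entities. Directly after
     * sentence-ending punctuation, one hard space is skipped.
     */
    static String hardenSpaces(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c != ' ') {
                out.append(c);
                i++;
                continue;
            }
            // Measure the run of consecutive spaces starting at i.
            int runStart = i;
            while (i < text.length() && text.charAt(i) == ' ') {
                i++;
            }
            int hard = i - runStart - 1; // spaces beyond the first one
            if (hard > 0 && runStart > 0
                    && endsSentence(text.charAt(runStart - 1))) {
                hard--; // skip one hard space after the end of a sentence
            }
            out.append(' ');
            for (int k = 0; k < hard; k++) {
                out.append("&nbsp;");
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, a run of two spaces between words becomes a plain space followed by one &nbsp;, while two spaces after a period collapse to a single plain space. Like the behavior described above, the sketch does not track line starts, so a leading run still loses its first space when rendered.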
matching on newlines
We have to match on backslash n explicitly when recognizing newlines, or text values that are created programmatically don't always get formatted. In other words, if you explicitly set the value of a large text field to be a string with an embedded backslash n, then it won't be translated (at least on Windows). So the upshot is that either the Unicode newline character or an explicit backslash n will be recognized as a newline. That said, a CRLF needs to be recognized as a single newline or we end up double spaced.
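The newline rules above might look roughly like the following sketch. The class NewlineSketch, the method replaceNewlines, and the replacement parameter are hypothetical names, not part of XMLTextProcessing, and the actual implementation may differ.

```java
final class NewlineSketch {

    /**
     * Replace each recognized newline with the given replacement text
     * (for example "<br/>"). Recognized forms, per the notes above:
     * CRLF (counted once), the newline character itself, and the
     * two-character sequence backslash followed by 'n'.
     * A lone carriage return is left untouched in this sketch.
     */
    static String replaceNewlines(String text, String replacement) {
        StringBuilder out = new StringBuilder();
        int n = text.length();
        int i = 0;
        while (i < n) {
            char c = text.charAt(i);
            boolean hasNext = i + 1 < n;
            if (c == '\r' && hasNext && text.charAt(i + 1) == '\n') {
                out.append(replacement); // CRLF is a single newline
                i += 2;
            } else if (c == '\n') {
                out.append(replacement); // real newline character
                i++;
            } else if (c == '\\' && hasNext && text.charAt(i + 1) == 'n') {
                out.append(replacement); // literal backslash + n
                i += 2;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

One trade-off of matching the literal two-character sequence is that any text legitimately containing a backslash followed by n, such as a Windows path, is also treated as a newline.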
public static java.lang.String translateToken(java.lang.String token,
boolean linkHref,
boolean linkEmail)
public static java.lang.String getPrefix(java.lang.String token)
public static java.lang.String getSuffix(java.lang.String token)
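As a rough illustration of how translateToken, getPrefix and getSuffix could cooperate, here is a sketch that peels surrounding punctuation off a token before deciding whether to link it. Everything in it is an assumption: the class TokenLinkSketch, the prefix and suffix character sets, and the tests for what "looks like" a hyperlink or an email address are illustrative choices, not the documented behavior.

```java
final class TokenLinkSketch {

    // Assumed punctuation sets; the real class may recognize others.
    static final String PREFIX_CHARS = "([";
    static final String SUFFIX_CHARS = ")].,;:!?";

    /** Leading punctuation of the token, or "" if there is none. */
    static String prefixOf(String token) {
        int i = 0;
        while (i < token.length()
                && PREFIX_CHARS.indexOf(token.charAt(i)) >= 0) {
            i++;
        }
        return token.substring(0, i);
    }

    /** Trailing punctuation of the token, or "" if there is none. */
    static String suffixOf(String token) {
        int start = prefixOf(token).length(); // never overlap the prefix
        int i = token.length();
        while (i > start
                && SUFFIX_CHARS.indexOf(token.charAt(i - 1)) >= 0) {
            i--;
        }
        return token.substring(i);
    }

    /** Wrap the core of the token in an anchor if it looks linkable. */
    static String linkToken(String token, boolean linkHref, boolean linkEmail) {
        String prefix = prefixOf(token);
        String suffix = suffixOf(token);
        String core = token.substring(prefix.length(),
                token.length() - suffix.length());
        if (linkHref
                && (core.startsWith("http://") || core.startsWith("https://"))) {
            return prefix + "<a href=\"" + core + "\">" + core + "</a>" + suffix;
        }
        int at = core.indexOf('@');
        boolean looksLikeEmail = at > 0
                && at == core.lastIndexOf('@')
                && core.indexOf('.', at) > at + 1;
        if (linkEmail && looksLikeEmail) {
            return prefix + "<a href=\"mailto:" + core + "\">" + core
                    + "</a>" + suffix;
        }
        return token; // not linkable: return the token unchanged
    }
}
```

Separating the prefix and suffix first keeps a closing parenthesis or sentence period out of the generated link. The sketch does not escape the core text, so it assumes character escaping is applied elsewhere.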
public static void escapeCharacter(java.lang.StringBuffer buf,
char currChar,
boolean stringentEscape)
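A minimal sketch of what character escaping with a stringent flag could look like follows. It assumes that the less stringent mode passes apostrophes, quotes and ampersands through unchanged, which is one reading of the convertToHTML summary; the class EscapeSketch and its methods are hypothetical and are not the real escapeCharacter.

```java
final class EscapeSketch {

    /**
     * Append c, or its XML escape, to buf. Angle brackets are always
     * escaped; ampersands, quotes and apostrophes are escaped only in
     * stringent mode (an assumption about the non-stringent behavior).
     */
    static void appendEscaped(StringBuffer buf, char c, boolean stringent) {
        if (c == '<') {
            buf.append("&lt;");
        } else if (c == '>') {
            buf.append("&gt;");
        } else if (stringent && c == '&') {
            buf.append("&amp;");
        } else if (stringent && c == '"') {
            buf.append("&quot;");
        } else if (stringent && c == '\'') {
            buf.append("&apos;");
        } else {
            buf.append(c);
        }
    }

    /** Convenience wrapper: escape every character of text. */
    static String escape(String text, boolean stringent) {
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < text.length(); i++) {
            appendEscaped(buf, text.charAt(i), stringent);
        }
        return buf.toString();
    }
}
```

Appending into a caller-supplied buffer, as the documented signature does, lets a caller escape character by character while walking the input without allocating a string per character.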
public static java.lang.String convertFromXML(java.lang.String text)
public static java.lang.String getXMLTagValue(java.lang.String tagname,
java.lang.String input)
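For orientation, retrieving a tag value from raw XML text can be sketched with plain string searching. This is an illustration under stated assumptions, not the getXMLTagValue implementation: the class TagValueSketch is hypothetical, it returns null when the tag is missing (the real return value in that case is not documented here), and it ignores attributes and nested tags of the same name.

```java
final class TagValueSketch {

    /**
     * Return the text between the first <tagname> and the following
     * </tagname>, or null if either is absent (an assumed convention).
     */
    static String tagValue(String tagname, String input) {
        String open = "<" + tagname + ">";
        String close = "</" + tagname + ">";
        int start = input.indexOf(open);
        if (start < 0) {
            return null; // opening tag not found
        }
        start += open.length();
        int end = input.indexOf(close, start);
        if (end < 0) {
            return null; // closing tag not found
        }
        return input.substring(start, end);
    }
}
```

For input such as `<rec><name>Ann</name></rec>`, asking for `name` gives `Ann`. Text retrieved this way is still escaped, so a caller would typically pass it through something like convertFromXML afterwards.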