So the task is like this: you have a text T, like "cat-1 dog-1 cat-1 elephant-1 cat-2 dog-2 cat-3".
Suppose we want to change the numerals attached to the word "cat" to their word representations: "1" to "one", "2" to "two", and so on.
One straightforward way would be to match all "cat-([0-9]+)" subsequences and then run a replace operation on T.
So the code would look something like this:
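import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String T = "cat-1 dog-1 cat-1 elephant-1 cat-2 dog-2 cat-3";
Pattern catPattern = Pattern.compile("cat-([0-9]+)");
// The matcher keeps scanning the original string, even though T is reassigned below.
Matcher catMatcher = catPattern.matcher(T);
Map<String, String> numToWord = new HashMap<>();
numToWord.put("1", "one");
numToWord.put("2", "two");
numToWord.put("3", "three"); // ...
while (catMatcher.find())
{
    // Replaces the first occurrence of the number anywhere in T, not just after "cat-".
    T = T.replaceFirst(catMatcher.group(1), numToWord.get(catMatcher.group(1)));
}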
This code produces:
cat-one dog-one cat-1 elephant-1 cat-two dog-2 cat-three
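This is wrong in two ways: the second "cat-1" is left untouched, and "dog-1" has been changed instead, because replaceFirst replaces the first "1" it finds anywhere in T. OK, let's use replaceAll instead and make sure we touch only cats: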
while (catMatcher.find())
{
    T = T.replaceAll("cat-" + catMatcher.group(1), "cat-" + numToWord.get(catMatcher.group(1)));
}
which produces what we want:
cat-one dog-1 cat-one elephant-1 cat-two dog-2 cat-three
But now the loop body is logically out of sync with the loop condition: we iterate over individual matches, yet each iteration calls replaceAll, which rewrites every occurrence at once. It is probably inefficient too, since replaceAll rescans T even for duplicate matches that have already been replaced.
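There is also a correctness risk once numbers have more than one digit: the pattern built inside the loop is just "cat-1", and that is a prefix of "cat-12" as well. Here is a minimal sketch of the problem. The class name ReplaceAllPitfall, the method replaceInLoop, the sample input and the null check are my own assumptions for illustration, not part of the code above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceAllPitfall {
    // Same idea as the replaceAll loop above, wrapped in a method (hypothetical helper).
    static String replaceInLoop(String text, Map<String, String> numToWord) {
        Matcher m = Pattern.compile("cat-([0-9]+)").matcher(text);
        String result = text;
        while (m.find()) {
            String word = numToWord.get(m.group(1));
            if (word == null) {
                continue; // assumption: leave numbers without a known word alone
            }
            // "cat-1" is treated as a regex and also matches the start of "cat-12".
            result = result.replaceAll("cat-" + m.group(1), "cat-" + word);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> numToWord = new HashMap<>();
        numToWord.put("1", "one");
        // prints "cat-one cat-one2": the second cat was damaged
        System.out.println(replaceInLoop("cat-1 cat-12", numToWord));
    }
}
```

The replacement is decided by string content rather than by the position of the match, so anything that happens to start with the same characters gets rewritten too.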
Is there a more elegant and correct solution?
Yes! It is called Matcher.appendReplacement:
// T is the original, unmodified text from the beginning of the post.
Pattern catPattern = Pattern.compile("cat-([0-9]+)");
Matcher catMatcher = catPattern.matcher(T);
Map<String, String> numToWord = new HashMap<>();
numToWord.put("1", "one");
numToWord.put("2", "two");
numToWord.put("3", "three"); // ...
StringBuffer sb = new StringBuffer();
while (catMatcher.find())
{
    System.out.println("Match:" + catMatcher.group(1));
    // Appends the text since the previous match, then the replacement for this match.
    catMatcher.appendReplacement(sb, "cat-" + numToWord.get(catMatcher.group(1)));
}
// Appends whatever follows the last match.
catMatcher.appendTail(sb);
Now sb.toString() contains:
cat-one dog-1 cat-one elephant-1 cat-two dog-2 cat-three
If you add System.out.println(sb.toString()); inside the while loop, you will also see that the replacements happen in step with the loop: each iteration handles exactly the match that find() has just returned.
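For reuse, the same loop can be wrapped in a method. The following is only a sketch under a few assumptions of mine: the class name CatNumerals and the method spellOut are hypothetical, numbers missing from the map are left as they are, and the replacement is passed through Matcher.quoteReplacement so that a "$" or "\" in a replacement word would not be interpreted as a group reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CatNumerals {
    private static final Pattern CAT = Pattern.compile("cat-([0-9]+)");

    // Hypothetical helper: spells out the number after every "cat-" in text.
    static String spellOut(String text, Map<String, String> numToWord) {
        Matcher m = CAT.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String word = numToWord.get(m.group(1));
            // Assumption: keep the original match when no word is known.
            String replacement = (word == null) ? m.group() : "cat-" + word;
            // quoteReplacement makes "$" and "\" in the replacement literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb); // copy the text after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> numToWord = new HashMap<>();
        numToWord.put("1", "one");
        numToWord.put("2", "two");
        numToWord.put("3", "three");
        String t = "cat-1 dog-1 cat-1 elephant-1 cat-2 dog-2 cat-3";
        // prints "cat-one dog-1 cat-one elephant-1 cat-two dog-2 cat-three"
        System.out.println(spellOut(t, numToWord));
    }
}
```

Because each replacement is tied to the position of its match, an input such as "cat-1 cat-12" with no entry for "12" comes out as "cat-one cat-12".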