I'm using Java's regex libraries, and I can't get grouping to work. Or at least I can't get it to work the way I want.
What I want is to match from the beginning of the string up to, but not including, any number of trailing semicolon characters. I expected that grouping would make the first group contain the characters I want.
But no.
Here is a code snippet:
static final Pattern pat = Pattern.compile("^(.*?);*$");
private static final String[] list = {
    "abc;",
    "N:Berger;Gary;;;",
    "EMAIL;type=INTERNET;type=pref:halberman@alum.mit.edu"};
private void bar(String arg) {
    Matcher m = pat.matcher(arg);
    int count = 0;
    while(m.find()) {
        count++;
        System.out.println("Match number "+count);
        System.out.println("start(): "+m.start());
        System.out.println("end(): "+m.end());
        System.out.println(arg.substring(m.start(), m.end()));
        for (int i = 0; i < m.groupCount(); i++) {
            System.out.println(m.group(i));
        }
    }
}
Any pointers greatly appreciated.
Hi, there are two small problems with what you're doing.
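1. Your regular expression ("^(.*?);*$") is saying: match the smallest string that is followed by zero or more semicolons at the end of the input.
If what you need is the text before the first semicolon, it should be "^(.*?);.*$": match the smallest string that is followed by a semicolon and then zero or more of any character. Note that this version stops at the first semicolon, so for "N:Berger;Gary;;;" group 1 would be "N:Berger", not "N:Berger;Gary".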
2. When you do find a match, groupCount() returns 1 because you have only one set of parentheses, so your loop ...
for (int i = 0; i < m.groupCount(); i++)
{
    System.out.println(m.group(i));
}
...will never show m.group(1). You need to change it to ...
for (int i = 0; i <= m.groupCount(); i++)
{
    System.out.println(m.group(i));
}
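Note: group(0) is the entire match, which here is the whole string you're testing against because the pattern is anchored with ^ and $. It is not counted by groupCount().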
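To see both points side by side, here is a small self-contained sketch. The class name GroupDemo, the field names TRAILING and FIRST, and the helper firstGroup are made up for illustration; only Pattern, Matcher, find(), group() and groupCount() come from java.util.regex. It runs the pattern from the question and the first-semicolon variant over the same three strings, using the inclusive loop bound.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Pattern from the question: lazy capture, then any trailing semicolons.
    public static final Pattern TRAILING = Pattern.compile("^(.*?);*$");
    // Variant from the answer: the capture stops at the first semicolon.
    public static final Pattern FIRST = Pattern.compile("^(.*?);.*$");

    // Hypothetical helper: group(1) of the first match, or null if there is no match.
    public static String firstGroup(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] list = {
            "abc;",
            "N:Berger;Gary;;;",
            "EMAIL;type=INTERNET;type=pref:halberman@alum.mit.edu"};
        for (String s : list) {
            Matcher m = TRAILING.matcher(s);
            if (m.find()) {
                // <= so that group(1) is printed as well as group(0).
                for (int i = 0; i <= m.groupCount(); i++) {
                    System.out.println("group(" + i + "): " + m.group(i));
                }
            }
            System.out.println("first-semicolon variant: " + firstGroup(FIRST, s));
        }
    }
}
```

With the question's pattern, group(1) for "N:Berger;Gary;;;" comes out as "N:Berger;Gary", while the variant gives "N:Berger". Which one you want depends on whether inner semicolons should be kept.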