Re: complex regex
Funny, I think I found my answer. This way seemed to do the trick. Is
it possible to do the same thing with just Matcher.replaceAll()?
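// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String mat = "(<html><script><p><font><script>";
String pat = "<[^>]*>";  // any single tag: "<", then everything up to the next ">"
StringBuffer sb = new StringBuffer(mat);   // untouched copy, indexed by the matcher offsets
StringBuffer sb2 = new StringBuffer(mat);  // working copy that the tags are deleted from
Pattern pattern = Pattern.compile(pat);
Matcher matcher = pattern.matcher(mat);
int start, end = 0;
int newStart = 0;  // number of characters deleted from sb2 so far
while (matcher.find()) {
    start = matcher.start();
    end = matcher.end();
    System.out.println("old string --- " + sb.substring(start, end));
    if (sb.substring(start, end).indexOf("script") > -1) {
        // Shift the match offsets left by the amount already removed from sb2.
        System.out.println("new string --- " + sb2.delete(start - newStart, end - newStart).toString());
        newStart = sb.length() - sb2.length();
    }
    System.out.println(start + " " + end + " " + newStart);
}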
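For what it's worth, a single replaceAll() call should be able to do the same job if the pattern itself is narrowed to script tags, so the loop no longer has to test each match for "script". The sketch below is only one way to do it: the class name StripScriptTags, the helper stripScriptTags, and the pattern "</?script\b[^>]*>" are assumptions made for illustration, not something taken from the code above. Like the loop, it removes only the tags themselves, not whatever sits between <script> and </script>. Unlike the loop, it would leave a tag such as <noscript> alone.

```java
import java.util.regex.Pattern;

public class StripScriptTags {
    // Hypothetical pattern: an opening or closing script tag, with optional
    // attributes. [^>]* stops at the first ">", so a match never runs past
    // the end of its own tag.
    private static final Pattern SCRIPT_TAG =
            Pattern.compile("</?script\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns a copy of html with every script tag removed.
    public static String stripScriptTags(String html) {
        return SCRIPT_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String mat = "(<html><script><p><font><script>";
        System.out.println(stripScriptTags(mat));  // prints (<html><p><font>
    }
}
```

Because replaceAll() builds a new string, there is no offset bookkeeping like newStart to maintain.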
On Oct 9, 9:59 pm, "carlberna...@gmail.com" <carlberna...@gmail.com>
wrote:
Hi,
I am new to the java.util.regex package, which I am using to detect
each occurrence of the JavaScript tag in an HTML file and delete it. I
tried the following code to find tags such as the ones below, but
instead it matches from the first occurrence of "<" to the last
occurrence of ">", which is not what I am looking for.
<script>
<script src="script.js">
</script>
String mat = "<html><script><p><font></script>";
String pat = "<*[\\x00-\\x7f]*jscript*[a-z0-9]*>";
Pattern pattern = Pattern.compile(pat);
Matcher matcher = pattern.matcher(mat);
while (matcher.find()) {
    System.out.println("Match: " + matcher.group() + " Start:" + matcher.start() + " End:" + matcher.end());
}
output:
Match: <html><script><p><font><script> Start:0 End:39
I would be looking for an output of:
Match: <script> Start:6 End:18
Match: <script> Start:27 End:18
Appreciate any input,
Carl