Category: Regex Programming
Replacing plain text URLs with links dilemma

Okay, replacing plain text URLs with links is pretty easy, but I've run into an issue. The way I'm replacing them (in a kind of BBCode class), the pattern also reads into the HTML.

So if I have: http://wat.com
and <img src="http://wat.com">

It would replace both, and then I'd end up with:
<img src="<a href="http://wat.com" height="y" width="x" />

Etc. Right now I'm using

/((((http|https|ftp):\/\/)|(www\.))(.*?)([,:%#&\/?=\w+\.-]+))/is
To find and replace the links, but I need to ignore it if it has src=" behind it. How can I do that? I've tried adding ^[src.*?] but it has no effect.
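
To make the problem concrete, here is a minimal sketch that reproduces it. It uses Python's re module as a stand-in for PHP's preg_replace; the replacement template and the helper name linkify_naive are assumptions for illustration, since the post doesn't show the replacement code.

```python
import re

# The pattern from the post without the PHP delimiters.
# The /is modifiers become re.IGNORECASE | re.DOTALL.
URL_RE = re.compile(
    r"((((http|https|ftp)://)|(www\.))(.*?)([,:%#&/?=\w+.-]+))",
    re.IGNORECASE | re.DOTALL,
)

def linkify_naive(text):
    # Hypothetical helper: wraps every match in an anchor tag,
    # with no regard for where in the markup the match sits.
    return URL_RE.sub(r'<a href="\1">\1</a>', text)

print(linkify_naive("So if I have: http://wat.com"))
# So if I have: <a href="http://wat.com">http://wat.com</a>

print(linkify_naive('<img src="http://wat.com">'))
# <img src="<a href="http://wat.com">http://wat.com</a>">
```

The second call shows the broken markup: the URL inside the src attribute is wrapped in a link just like the plain text one.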

If it doesn't get more complex, this might work:


/(?<!(src|href)=(['"]|))((((http|https|ftp):\/\/)|(www\.))(.*?)([,:%#&\/?=\w+\.-]+))/is

A regular-expression-based BBCode parser isn't a good idea with every regex library, though. If we're talking about PHP here, for example, you'll probably hit a wall pretty soon.

Regards, Jens

What I do in situations like this is split the string into HTML and non-HTML parts (with PHP it's preg_split (http://php.net/preg-split)). Then I can apply some replacements to HTML text and other replacements to non-HTML text.
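
A rough sketch of that split-and-replace idea, written in Python for illustration (re.split with a capturing group behaves much like preg_split with the PREG_SPLIT_DELIM_CAPTURE flag). The tag pattern, the replacement template and the name linkify_outside_tags are assumptions, not code from this thread; the URL pattern is the one from the question.

```python
import re

# Assumed, deliberately simple tag matcher. The capturing group makes
# re.split keep the tags in the result list instead of dropping them.
TAG_RE = re.compile(r"(<[^>]*>)")

# URL pattern from the question (/is -> re.IGNORECASE | re.DOTALL).
URL_RE = re.compile(
    r"((((http|https|ftp)://)|(www\.))(.*?)([,:%#&/?=\w+.-]+))",
    re.IGNORECASE | re.DOTALL,
)

def linkify_outside_tags(html):
    parts = TAG_RE.split(html)
    # With exactly one capturing group, even indexes hold the text
    # between tags and odd indexes hold the tags themselves.
    for i in range(0, len(parts), 2):
        parts[i] = URL_RE.sub(r'<a href="\1">\1</a>', parts[i])
    return "".join(parts)

print(linkify_outside_tags('See http://wat.com and <img src="http://wat.com">'))
# See <a href="http://wat.com">http://wat.com</a> and <img src="http://wat.com">
```

This only protects URLs inside tags. URL text that already sits between <a> and </a> would still be wrapped again, and a > inside an attribute value would confuse the simple tag pattern, so treat it as a starting point.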

I don't see why, but the pattern Jens posted won't work.

And I'm applying this after the BBCode class has replaced everything, but maybe I'll look into another way. Right now I'm matching the 'tags', listing them, and then using a switch() to replace the attributes, the inner text and the tags themselves.

Is that still an inefficient way to do it?

It seems PHP's regex extension doesn't accept lookbehind assertions of variable length. Try this instead:


/(?<!src=['"])(?<!href=['"])((((http|https|ftp):\/\/)|(www\.))(.*?)([,:%#&\/?=\w+\.-]+))/is
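
The difference between the two patterns can be checked with a short sketch. It is written in Python, whose re module happens to have the same fixed-width restriction on lookbehinds, so it only approximates what PHP does; the replacement template and the name linkify are assumptions for illustration.

```python
import re

# URL part shared by both patterns in this thread.
URL = r"((((http|https|ftp)://)|(www\.))(.*?)([,:%#&/?=\w+.-]+))"
FLAGS = re.IGNORECASE | re.DOTALL

# First attempt: one lookbehind whose length varies (src vs. href,
# quote vs. no quote). Python's re rejects it when compiling.
try:
    re.compile(r"(?<!(src|href)=(['\"]|))" + URL, FLAGS)
    variable_ok = True
except re.error:
    variable_ok = False
print(variable_ok)  # False

# Second attempt: two separate lookbehinds, each of fixed length.
FIXED_RE = re.compile(r"(?<!src=['\"])(?<!href=['\"])" + URL, FLAGS)

def linkify(text):
    # Hypothetical helper using an assumed replacement template.
    return FIXED_RE.sub(r'<a href="\1">\1</a>', text)

print(linkify("http://wat.com"))
# <a href="http://wat.com">http://wat.com</a>

print(linkify('<img src="http://wat.com">'))
# <img src="http://wat.com">
```

One caveat: the lookbehinds only guard the position directly after the opening quote. A URL such as src="http://www.wat.com" can still be matched starting at www., and unquoted attribute values are not covered at all.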

As for the last part of your post: yes, that should work better.

Regards, J.

That it did :)

Awesome :-P good job









