Regex for Extracting URLs in Plain Text


Here is a regex for extracting URLs from plain text. It only picks up links that are not already hyperlinked and are not the source attributes of images or iframes.

This example is in PHP. I was trying to format a WordPress page to auto-hyperlink URLs but preserve embedded images, iframes, etc.

WordPress does some funky stuff with regard to formatting, which is why I ended up going with the nl2br function and then removing the doubled <br /> tags.

A little hacky but it seems to work.

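  // Match a line that starts with "http", up to (not including) the next "<".
  $regex = '~^http(.*?)(?=[\<])~im';
  // nl2br() puts a <br /> before each newline, giving the lookahead a "<" to stop at.
  $content = nl2br(get_the_content());
  $content = preg_replace($regex, '<a href="$0" target="_blank">$0</a>', $content);

  // Escape, decode, and run nl2br() again, then collapse the doubled <br /> tags.
  $html = nl2br(html_entity_decode(esc_html($content)));
  $html = str_replace('<br /><br />', '<br/>', $html);
  echo $html;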
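
To see what the regex does on its own, here is a minimal sketch that runs outside WordPress. The function name autolink_bare_urls and the sample text are made up for illustration; only nl2br and preg_replace are real PHP functions, and the pattern is the same one as above.

```php
<?php
// Hypothetical helper (not part of WordPress): link URLs that sit at the start of a line.
function autolink_bare_urls(string $text): string
{
    // Same pattern as above: "http" at the start of a line, lazily up to the next "<".
    $regex = '~^http(.*?)(?=[\<])~im';

    // nl2br() inserts "<br />" before each newline, which supplies the "<" the lookahead needs.
    $withBreaks = nl2br($text);

    // $0 is the whole match, used as both the href and the link text.
    return preg_replace($regex, '<a href="$0" target="_blank">$0</a>', $withBreaks);
}

// Made-up sample standing in for get_the_content().
$sample = "Check this out:\n"
        . "http://example.com/page\n"
        . "<img src=\"http://example.com/pic.png\">\n"
        . "The end.";

$linked = autolink_bare_urls($sample);
echo $linked;
// Prints:
// Check this out:<br />
// <a href="http://example.com/page" target="_blank">http://example.com/page</a><br />
// <img src="http://example.com/pic.png"><br />
// The end.
```

The image source is left alone because that line starts with "<", not "http". Two limits follow from the pattern itself: a URL in the middle of a line is skipped, and a URL on the very last line with no trailing newline is skipped too, since nl2br adds no <br /> after it for the lookahead to find.
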
Tagged w/ #php #regex #wordpress