Regex for Extracting URLs in Plain Text

Written by Sean Behan on Fri Apr 14th 2017

Here is a regex for extracting URLs from text. It only picks up URLs that are not already hyperlinked and are not the source attributes of images or iframes.

This example is in PHP. I was trying to format a WordPress page to auto-hyperlink URLs but preserve embedded images, iframes, etc.

WordPress does some funky stuff with regard to formatting, which is why I ended up going with the nl2br function and then removing the doubled <br /> tags.
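The doubling can be seen in isolation with a minimal sketch. The sample string is made up for illustration, and this only shows what two nl2br passes followed by the str_replace do to plain text, not anything WordPress does itself.

```php
<?php
// Hypothetical two-line input, standing in for post content.
$text = "one\ntwo\n";

// First pass: nl2br() inserts "<br />" before every newline.
$once = nl2br($text);

// Second pass: the newlines are still there, so each gets another "<br />".
$twice = nl2br($once);

// Collapse each doubled tag into a single one.
$clean = str_replace('<br /><br />', '<br/>', $twice);

echo $clean;
```

After the second pass every line ends in `<br /><br />`, which is the exact string the replacement looks for.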

A little hacky but it seems to work.

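<?php
  // Match a line that starts with "http", up to (but not including) the next "<",
  // i.e. the <br /> that nl2br() inserts at the end of each line.
  $regex = '~^http(.*?)(?=[\<])~im';
  $content = nl2br(get_the_content());
  // Wrap each matched URL in a link; $0 is the whole match.
  $content = preg_replace($regex, '<a href="$0" target="_blank">$0</a>', $content);

  // The second nl2br() pass doubles the <br /> tags, so collapse them afterwards.
  $html = nl2br(html_entity_decode(esc_html($content)));
  $html = str_replace('<br /><br />', '<br/>', $html);
  echo $html;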
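
To see what the pattern picks up outside of WordPress, here is a small standalone sketch. The sample text and the example.com / example.org URLs are made up for illustration, and a plain string stands in for get_the_content().

```php
<?php
// Hypothetical post content: two bare URLs on their own lines and one image tag.
$text = "Some intro text\n"
      . "http://example.com/page\n"
      . "<img src=\"http://example.com/a.png\" />\n"
      . "https://example.org/?q=1\n";

// Same pattern as in the post.
$regex = '~^http(.*?)(?=[\<])~im';

// nl2br() puts a "<br />" at the end of each line, which gives the
// lookahead (?=[\<]) something to stop at.
$content = nl2br($text);

// Collect the matches; $matches[0] holds the full matched strings.
preg_match_all($regex, $content, $matches);
print_r($matches[0]);

// Wrap each match in a link, as the original snippet does.
$linked = preg_replace($regex, '<a href="$0" target="_blank">$0</a>', $content);
echo $linked;
```

Only the two bare URLs are matched. The URL inside the img tag is skipped because its line starts with `<img`, not `http`. Two limits of the pattern show up when experimenting with it: it matches everything from `http` to the next `<` on that line, so it assumes each URL sits alone at the start of its own line, and a URL on a final line with no trailing newline gets no `<br />` after it and is not matched.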

Tagged with:
#PHP #Regex #Wordpress
