Special Characters in HTML with PHP’s htmlspecialchars() Function

Posted December 8th, 2009 by Barnaby Knowles in Content Management, Website Development

Introduction

Certain characters should not be used as plain text in HTML markup but should instead be represented by their respective HTML entities in order to preserve their meanings. When writing HTML this is a straightforward process – you type the HTML entity rather than the special character. But what happens when you have some plain text containing these characters (out of a database, for example) that you need to display? PHP has a function that will take a string and convert special characters to their HTML entities for you.

What are special characters?

Some characters have special significance in HTML and in order to preserve their meanings they must be represented by HTML entities. One of the most common examples of this is the ampersand (&).

Why convert special characters to HTML entities?

Ampersands are used in HTML to begin an entity reference, and should therefore not be used as punctuation in standard text. Rather, ampersands should be converted to their HTML entity (&amp;) so that HTML interpreters (such as web browsers) know that they are not preceding an entity reference.

When should you convert special characters to their HTML entities?

Special characters should be converted to HTML entities whenever they are meant to appear as literal text rather than be interpreted as markup.

Using the ampersand as an example again, if you leave it unconverted in HTML markup as a punctuation symbol, an HTML interpreter may treat it as the start of an entity reference and could parse your document incorrectly.
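To make this concrete, here is a small sketch. The sample string is invented for illustration; it relies on the fact that browsers still accept a handful of legacy entity names, such as &copy, even without the closing semicolon.

```php
<?php
// Hypothetical sample text: the author meant a literal ampersand,
// but "&copy" is also a legacy entity name that a browser may render as ©.
$raw = 'Cut &copy the snippet';

// Convert the ampersand so it can only be read as a literal character.
$safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $safe, "\n"; // Cut &amp;copy the snippet
```

Once the ampersand has been converted, the browser displays the text exactly as it was typed.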

Another example is the double quotation mark ("), commonly referred to as “double quotes”. Double quotes are used in HTML markup to enclose attribute values, amongst other things. Therefore, if your value includes a double quote that is not converted to its HTML entity (&quot;), the HTML interpreter will assume that it marks the end of the value.
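The following sketch shows the problem and the fix side by side. The value is made up for illustration; imagine it has come out of a database.

```php
<?php
// Hypothetical value, as it might come out of a database.
$value = 'The "best" offer';

// Unescaped: the attribute value ends at the first inner double quote,
// so the browser sees value="The " followed by stray markup.
$broken = '<input type="text" value="' . $value . '" />';

// Escaped: the inner quotes become &quot; and the attribute stays intact.
$fixed = '<input type="text" value="'
    . htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
    . '" />';

echo $broken, "\n"; // <input type="text" value="The "best" offer" />
echo $fixed, "\n";  // <input type="text" value="The &quot;best&quot; offer" />
```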

How to convert special characters to HTML entities

PHP has a function called htmlspecialchars() that takes a string and returns the string with special characters converted to HTML entities. Using htmlspecialchars() the following translations are performed:

  • '&' (ampersand) becomes '&amp;'
  • '"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
  • ''' (single quote) becomes '&#039;' only when ENT_QUOTES is set (ENT_QUOTES is part of the default flags from PHP 8.1 onwards).
  • '<' (less than) becomes '&lt;'
  • '>' (greater than) becomes '&gt;'

This is not a complete list of HTML character entities, just the most useful for everyday web programming.
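As a rough illustration of the list above, this sketch runs one sample string through htmlspecialchars() with each of the three quote flags. The sample string is invented, and the flags and the encoding are passed explicitly so that the result does not depend on your PHP version's defaults.

```php
<?php
// Sample input containing all five characters from the list above.
$text = '<a href=\'x\'>Tom & "Jerry"</a>';

// ENT_NOQUOTES: only &, < and > are converted.
$noQuotes = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');

// ENT_COMPAT: double quotes are converted as well.
$compat = htmlspecialchars($text, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both double and single quotes are converted.
$quotes = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

echo $noQuotes, "\n"; // &lt;a href='x'&gt;Tom &amp; "Jerry"&lt;/a&gt;
echo $compat, "\n";   // &lt;a href='x'&gt;Tom &amp; &quot;Jerry&quot;&lt;/a&gt;
echo $quotes, "\n";   // &lt;a href=&#039;x&#039;&gt;Tom &amp; &quot;Jerry&quot;&lt;/a&gt;
```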

You might use htmlspecialchars() in a similar way to this:

<input type="text" name="anything" value="<?php echo htmlspecialchars($row['value']); ?>" />

If the text that you are displaying contains any special characters such as ampersands or double quotes they will be converted to HTML entities and will not invalidate or otherwise affect your HTML code.
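If you escape text in many places, you might wrap the call in a small helper. The sketch below is one possible approach rather than an established convention: the function name e() and the sample row are assumptions made for this example, and the type declarations assume PHP 7 or later.

```php
<?php
/**
 * Hypothetical helper (the name e() is an assumption, not part of PHP):
 * escapes text for use in HTML content or quoted attribute values.
 */
function e(string $text): string
{
    // ENT_QUOTES converts both quote styles; ENT_SUBSTITUTE replaces invalid
    // byte sequences with U+FFFD instead of returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Stand-in for a database row; the column name matches the example above.
$row = ['value' => 'Tom & Jerry\'s "Greatest" Hits'];

$html = '<input type="text" name="anything" value="' . e($row['value']) . '" />';

echo $html, "\n";
// <input type="text" name="anything" value="Tom &amp; Jerry&#039;s &quot;Greatest&quot; Hits" />
```

Passing the flags and the encoding explicitly keeps the behaviour the same across PHP versions, whose defaults have changed over time.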

What about other special characters?

As explained previously, htmlspecialchars() only converts the characters listed into their HTML entities. If you want to perform full entity translation, PHP also has the htmlentities() function, which will convert all applicable characters to HTML entities.
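The difference is easiest to see side by side. This sketch assumes UTF-8 text and PHP 7 or later (for the "\u{...}" escapes); the sample string is invented.

```php
<?php
// "\u{...}" escapes keep this example independent of the file's own encoding:
// \u{e9} is é and \u{a9} is ©.
$text = "Caf\u{e9} \u{a9} 2009 & Co.";

// htmlspecialchars() only touches the ampersand here.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts é and ©, which have named entities.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";  // Café © 2009 &amp; Co.
echo $entities, "\n"; // Caf&eacute; &copy; 2009 &amp; Co.
```

If your pages are served as UTF-8, the output of htmlspecialchars() is usually sufficient, because characters such as é can be sent to the browser as they are.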

Conclusion

You should always use HTML entities if you want to preserve the meaning of special characters. If you’re writing HTML yourself you should code these into your markup. If you’re inserting text dynamically, simply run it through PHP’s htmlspecialchars() function.
