MySQL, PHP and garbage characters in a database
Posted by rbTech Staff, Last modified by rbTech Staff on 20 May 2014 03:18 PM

This one took me a good long while to figure out, so I thought I'd put a quick post up so the next poor sod who is beating their head on a wall about it can do so for perhaps slightly shorter than I did.

The issue:  An older LAMP application, presenting garbage characters.  The underlying data may or may not be stored correctly in the database.

In this instance, the "Section" symbol (§) was giving me particular fits:  Sometimes the § itself made it into the database intact but surrounded by other garbage, like so: Â§Â
Sometimes it looked instead like this: ï¿½ which was even uglier.
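
For anyone wondering where sequences like these come from, here is a minimal sketch. It assumes the usual culprit, UTF-8 bytes being read as ISO-8859-1 somewhere along the way; I can't say for certain that is what happened in this application. It also assumes PHP's mbstring extension is loaded, and misreadAsLatin1() is a made-up helper name.

```php
<?php
// Sketch: reproduce the two garbage sequences by misreading UTF-8 bytes
// as ISO-8859-1 (Latin-1). Assumes the mbstring extension is loaded.

// Hypothetical helper: treat a UTF-8 string's bytes as Latin-1 characters
// and re-encode each of them as UTF-8.
function misreadAsLatin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
}

$section     = "\xC2\xA7";      // "§" in UTF-8 (two bytes)
$replacement = "\xEF\xBF\xBD";  // U+FFFD, the Unicode replacement character

$garbledSection     = misreadAsLatin1($section);
$garbledReplacement = misreadAsLatin1($replacement);

echo bin2hex($garbledSection), "\n";      // c382c2a7, which displays as "Â§"
echo bin2hex($garbledReplacement), "\n";  // c3afc2bfc2bd, which displays as "ï¿½"
```

If that is the mechanism, the ï¿½ case is the nastier one: the original character had already been swapped for a replacement character before the misreading happened, so the original can't be recovered from the stored bytes.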

After an enormous amount of consulting with my friend the Google, I was able to do some simple REPLACE queries[1] on the database to get rid of the garbage so it looked right from a MySQL command line.  However, the characters still weren't displaying right in the <textarea>, and worse, they were getting re-corrupted on save.

I was able to get my textarea to display the corrected data alright, by wrapping the field in htmlentities($field).  However, it was still getting corrupted on save (which would ripple out to not displaying correctly again, which would send me back into a tailspin).

I wrapped the value going into the INSERT statement in htmlentities() as well, but it was still storing incorrectly - instead of a single § character, it was storing Â§ .  What gives?!!
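
One plausible explanation for the stray Â, offered as a guess rather than a diagnosis: before PHP 5.4, htmlentities() assumed ISO-8859-1 unless told otherwise, so a UTF-8 § (two bytes) came out as two entities. A minimal sketch, passing the charset explicitly so the result does not depend on the PHP version:

```php
<?php
// Sketch: the same UTF-8 "§" run through htmlentities() with two different
// charset arguments. The third argument is what matters here.

$section = "\xC2\xA7"; // "§" encoded as UTF-8

// Treating the bytes as ISO-8859-1 sees two characters, Â and §.
$asLatin1 = htmlentities($section, ENT_QUOTES, 'ISO-8859-1');

// Treating the bytes as UTF-8 sees the single character §.
$asUtf8 = htmlentities($section, ENT_QUOTES, 'UTF-8');

echo $asLatin1, "\n"; // &Acirc;&sect;
echo $asUtf8, "\n";   // &sect;
```

If the data really is UTF-8, spelling out 'UTF-8' in every htmlentities() call is cheap insurance on an older install.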

Finally, I realized that the old school addslashes() was not my friend in this instance:  addslashes() was messing with the characters as they went into the database.  The original code was using addslashes($notes), and I had modified it to use addslashes(htmlentities($notes)).  Once I removed the addslashes() altogether, leaving just htmlentities($notes), the characters were finally being displayed, stored, and saved correctly.
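
With addslashes() gone, something else has to keep quotes in the notes from breaking the query. A minimal sketch of one option, a PDO prepared statement, which is not what the original application did. The table name tickets, its columns, and the function saveNotes() are all hypothetical. For MySQL, the connection charset can be set in the DSN (for example charset=utf8mb4) so the bytes are not reinterpreted on the way in.

```php
<?php
// Sketch: save notes through a bound parameter instead of addslashes().
// "tickets", "id", "notes" and saveNotes() are made-up names for illustration.

function saveNotes(PDO $pdo, int $id, string $notes): void
{
    // The driver takes care of quoting the bound values.
    $stmt = $pdo->prepare('UPDATE tickets SET notes = :notes WHERE id = :id');
    $stmt->execute([':notes' => $notes, ':id' => $id]);
}

// Example connection for MySQL (placeholder host, database and credentials):
// $pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', 'user', 'secret');
// saveNotes($pdo, 42, "See § 5 of O'Brien's memo");
```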

TL;DR: don't try to wrap htmlentities() with addslashes() (or the other way around) or you're going to have a bad time!

[1] Queries (you can cut and paste these, swapping in your own table and column names, and they may fix you right up):
UPDATE table SET column = REPLACE(column, '�', '§');
UPDATE table SET column = REPLACE(column, '§½', '§');
UPDATE table SET column = REPLACE(column, 'Â§', '§');
