James Cohen's Blog: How to Avoid Character Encoding Problems in PHP
by Chris Cornutt April 25, 2011 @ 14:13:14
James Cohen has a recent post to his blog looking at a way you can avoid some of the character encoding problems in PHP that can come with working with multiple character sets.
Character sets can be confusing at the best of times. This post aims to explain the potential problems and suggest solutions. Although this is applied to PHP and a typical LAMP stack you can apply the same principles to any multi-tier stack.
He includes a "boring history" section (which he recommends skipping if you just want the good stuff) that covers character sets and how computer systems have handled them over time. His recommendation is to use UTF-8 throughout to ease your character encoding woes: configure your editor to support it, make sure your browsers understand it and set up your MySQL database connection to use it.
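The kind of problem the post targets can be reproduced in a few lines. The sketch below is in Python rather than PHP (the post itself notes that the principles carry over to any stack) and is not taken from James Cohen's article. It only shows what happens when one tier writes UTF-8 and another reads the same bytes as ISO-8859-1.

```python
# Illustration only: one tier encodes text as UTF-8, another wrongly
# decodes the same bytes as ISO-8859-1 (Latin-1).
text = "caf\u00e9"  # "café"

utf8_bytes = text.encode("utf-8")          # what a UTF-8 editor or database sends
misread = utf8_bytes.decode("iso-8859-1")  # what a Latin-1 consumer displays

# Decoding with the same charset that was used for encoding round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")

print(len(text), len(utf8_bytes))  # 4 characters, 5 bytes
print(misread)                     # the accented letter shows up as two characters
print(round_trip == text)
```

Agreeing on a single encoding at every tier, as the post recommends with UTF-8, removes the mismatched decode step entirely.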
character encoding issue mysql browser editor ide
Brian Swan's Blog: SQL Server Driver for PHP Connection Options CharacterSet
by Chris Cornutt February 28, 2011 @ 12:15:33
Brian Swan has posted another in his series looking at connection options for the SQL Server driver for PHP. In his latest he looks at the "CharacterSet" setting, which defines the encoding of the data sent to the database server.
One thing that helped me understand the CharacterSet option was to realize that its name is a bit misleading (although it seems to be inline with other uses of CharacterSet or charset). It is used to specify the encoding of data that is being sent to the server, not the character set. With that in mind, the possible values for the option begin to make sense: SQLSRV_ENC_CHAR, SQLSRV_ENC_BINARY, and UTF-8.
He looks at each of these three options in more detail - SQLSRV_ENC_CHAR being the default, SQLSRV_ENC_BINARY when binary data is needed and UTF-8 when, obviously, you need UTF-8 data transfer between the client and server.
sqlserver connection option characterset encoding
WebReference.com: Create a Localized Web Page with PHP
by Chris Cornutt October 21, 2010 @ 13:21:23
On WebReference.com there's a new tutorial posted about localizing your website by defining a character set to use for your content.
The process of making your applications/websites usable in many different locales is called internationalization, while customizing your code for different locales is called localization. Localization is the process of making your applications or websites local to where they are being viewed. For example, you can make a website more local to a particular place by converting its text to the predominant language of that location and by displaying the local time (e.g. German for people living in Germany or French for people living in France).
They show how to define constants for the character set and language that your application can use. Their examples use two major encodings, UTF-8 and ISO-8859-1, to display a sample "welcome" message in different languages. There's also a simple page that shows how to switch between languages if you'd like to give your visitors the option.
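A rough Python sketch of the idea, not the tutorial's PHP code: each language carries its own charset and greeting, and the page body is encoded to match the charset it declares. The `LOCALES` table, its entries, `DEFAULT_LANG` and `render_welcome` are all hypothetical names chosen for illustration.

```python
# Hypothetical per-language settings: (charset, welcome message).
LOCALES = {
    "en": ("ISO-8859-1", "Welcome"),
    "fr": ("ISO-8859-1", "Bienvenue \u00e0 tous"),
    "de": ("UTF-8", "Sch\u00f6n, dass Sie da sind"),
}
DEFAULT_LANG = "en"


def render_welcome(lang):
    """Return (Content-Type header value, encoded body) for a language."""
    charset, message = LOCALES.get(lang, LOCALES[DEFAULT_LANG])
    header = f"text/html; charset={charset}"
    # The body bytes must use the same charset the header announces.
    body = message.encode(charset)
    return header, body


for lang in ("en", "fr", "de", "xx"):  # "xx" falls back to the default
    print(lang, *render_welcome(lang))
```

Keeping the charset next to the message makes it harder for the declared encoding and the actual bytes to drift apart.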
localize tutorial language encoding character
Kevin Schroeder's Blog: You want to do WHAT with PHP? Chapter 3
by Chris Cornutt August 31, 2010 @ 13:44:32
Kevin Schroeder has posted another excerpt from his "You Want to Do WHAT with PHP?" book to his blog today. This time it's from the third chapter that looks at character encodings like UTF-8 or ISO-8859-1.
I realized that while this 3.5-year PHP consultant knew Unicode, UTF-8, character encodings such as ISO-8859-1 or ISO-8859-7, I didn't understand them as well as I thought I had. With that I threw this chapter in the book. Knowing about character encoding is what many developers have. Not as many truly understand it. In this chapter I try to de-mystify character encoding as a whole.
The excerpt introduces character encoding and what it really is: a translation that lets the computer handle human language. The problem comes when multiple tools define the same letters/characters in different ways. He gives an example of a "hello world" string in normal ASCII versus one in EBCDIC, and shows how the latter would be rendered by a browser that expects ASCII.
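The ASCII versus EBCDIC contrast is easy to reproduce. This is a Python sketch rather than the book's own example, and it assumes the `cp037` code page as a stand-in for EBCDIC; the excerpt may use a different variant.

```python
# The same text encoded two ways. cp037 is one EBCDIC code page that
# ships with Python's standard codecs; the excerpt may use another.
text = "hello world"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

print(ascii_bytes.hex(" "))
print(ebcdic_bytes.hex(" "))

# An ASCII-oriented reader interpreting the EBCDIC bytes gets nonsense.
# Latin-1 is used here so that every byte value decodes to something.
misread = ebcdic_bytes.decode("latin-1")
print(repr(misread))

# Decoding with the matching code page recovers the original text.
print(ebcdic_bytes.decode("cp037"))
```

The bytes themselves carry no label saying which table produced them, which is why both sides have to agree on the encoding out of band.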
character encoding book excerpt ascii example
Evert Pot's Blog: Filesystem encoding and PHP
by Chris Cornutt April 21, 2010 @ 12:19:39
Evert Pot has a new post to his blog about working with files in your applications, more specifically in dealing with filesystem encodings other than some of the defaults.
Many PHP applications save files to a local filesystem. Most of the times for the bulk of readers here you'll likely only ever store files using US-ASCII encoding, either because your filenames are simply based on database fields (as you should try in most cases), or simply because most of your users never have a need for non-english characters. When you do though, it's important to know how operating systems cope with these characters. Unsurprising, all of them do this differently.
He talks about encoding issues in three major operating system types - Windows, OS X and Linux - with some code snippets included to illustrate how each handles the different encodings.
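One way filename handling can differ, sketched in Python rather than PHP: the same visible name can be stored as different byte sequences depending on Unicode normalization. HFS+ on OS X is known for storing decomposed names, while most Linux filesystems keep whatever bytes they were given. Whether Evert's post covers exactly this case is an assumption, and `same_filename` is a hypothetical helper.

```python
import unicodedata

name = "r\u00e9sum\u00e9.txt"  # "résumé.txt"

composed = unicodedata.normalize("NFC", name)    # accented letter as one code point
decomposed = unicodedata.normalize("NFD", name)  # base letter + combining accent

print(composed == decomposed)
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))


def same_filename(a, b):
    """Compare two names after normalizing both to the same form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_filename(composed, decomposed))
```

Normalizing names before comparing or storing them is one way to avoid surprises when the same files move between operating systems.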
filesystem encoding osx linux windows
Zend Developer Zone: PHP DOM XML extension encoding processing
by Chris Cornutt September 02, 2009 @ 09:48:18
On the Zend Developer Zone today Alexander Veremyev shares some helpful hints he discovered about the DOM XML extension for PHP that could come in handy when working with different character encodings.
I recently worked with PHP's DOM XML extension while working on Zend Framework's Zend_Search_Lucene HTML highlighting capabilities, and uncovered some undocumented features and issues with the extension in regards to character encoding. The information contained in this article should also apply to other libxml-based DOM implementations, as PHP's DOM extension simply wraps that library.
He shares five tips:
- Internal document encoding is always UTF-8
- Input data is always treated as UTF-8
- Text nodes and CDATA are stored as UTF-8 without transformations
- Document encoding does not affect loading behavior
- Save/dumping operations and encoding
He describes each of the points and includes some sample code and XML to parse to help illustrate each.
tutorial dom extension character encoding
Sean Coates' Blog: UTF WTF?
by Chris Cornutt November 24, 2008 @ 09:31:04
Sean Coates has reposted an article that was originally published in php|architect magazine covering UTF-8 and proper Unicode encoding.
If I had to guess, I would estimate that I've spent somewhere in the range of 40 hours wrangling UTF-8 in the past 3 months, which is not only expensive for my employer, but also disheartening as a developer who's got real work to do. Admittedly, this number is inflated, due to the heavy development cycle we completed with the launch of our new site.
Sean goes on to talk about Unicode issues in general (partially supported in some places, too many points of failure) and some of his other experiences with "the UTF-8 monster" that have given him trouble over time.
utf8 utf character encoding unicode
Danne Lundqvist's Blog: Two weeks with Zend Studio for Eclipse
by Chris Cornutt October 20, 2008 @ 08:48:03
New on the DotVoid blog Danne Lundqvist has posted about the experience of two weeks with the Zend Studio for Eclipse software as a primary editor.
After more than ten years with Emacs and terminal flipping as my primary development environment, whether for C, PHP, WSDL, HTML/CSS or javascript, I decided to try (I mean really really try) an IDE for a while. As PHP is my main focus these days I have been looking towards Zend Studio for Eclipse. I figured Eclipse with its maturity must work well enough on linux.
Danne talks about the transition from editor to IDE (shortcuts? features? where is everything?), including importing a project from a Subversion repository. He had a few issues as he started out (technical glitches, problems with Subversion integration and encoding support) but found lots of good things too (phpDocumentor support, code folding, inline errors/warnings).
zendstudio eclipse ide editor subversion encoding shortcut phpdocumentor
SitePoint PHP Blog: Character Encoding Issues with Cultural Integration
by Chris Cornutt September 10, 2008 @ 12:07:06
On the SitePoint PHP Blog Troels Knak-Nielsen points out some "cultural integration issues" he's seen when it comes to character encoding in his PHP applications.
The gold standard solution is to convert everything to utf-8. Since utf-8 covers the entire unicode range, it is capable of representing any character that latin1 can. Unfortunately, that's a lot easier to do from the outset, than with a big, running application. And even then, there may be third party code and extensions, which assume latin1. I'd much rather continue with latin1 being the default, and only jump through hoops at the few places where I actually need full utf-8 capacity.
He came up with a (relatively) simple solution: keep the data in the latin1 encoding he already has but serve the pages as utf-8, embedding utf-8 inside the latin1 where needed. He gives the code for both, making use of output buffering and the utf8 encoding functions to make it all work.
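The serving half of that approach can be approximated as below, in Python rather than the PHP output buffering the post uses. This is a sketch of the general idea, not Troels' code: `to_utf8_response` is a hypothetical name, and the way the post embeds extra utf-8 characters inside latin1 data is not reproduced here.

```python
def to_utf8_response(latin1_body):
    """Re-encode a buffered latin1 page as UTF-8 just before sending it."""
    # Every latin1 byte maps to exactly one Unicode code point, so this
    # decode step cannot fail.
    text = latin1_body.decode("iso-8859-1")
    headers = {"Content-Type": "text/html; charset=utf-8"}
    return headers, text.encode("utf-8")


# Page content as the application stores it internally (latin1 bytes).
page = "Sm\u00f8rrebr\u00f8d og \u00e6bleskiver".encode("iso-8859-1")

headers, body = to_utf8_response(page)
print(headers["Content-Type"])
print(len(page), len(body))  # non-ASCII characters grow to two bytes each
print(body.decode("utf-8"))
```

Converting once at the output boundary leaves the rest of the application, including third-party code that assumes latin1, untouched.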
character encoding cultural integration utf8 latin1 tutorial