Leonid Mamchenkov has a quick new post on his blog sharing a regular expression that can be used to check that a string contains only English (basic Latin) characters, with no other Unicode characters allowed.
Today at work I came across a task which turned out to be much easier and simpler than I originally thought it would. We have a site with some user registration forms. The site is translated into a number of languages, but due to the regulatory procedures, we have to force users to input their registration details in English only, using Latin characters, numbers, and punctuation.
Thankfully, the PCRE regular expression engine bundled with PHP makes this simple: the check is a standard regular expression with nothing special added to accommodate Unicode characters. He notes that adding the "/u" modifier, which makes the pattern and subject strings be treated as UTF-8, makes the expression "totally malfunction". If you'd like an example of some of the tricks that go into supporting Unicode in a regex, see this comment in the PHP manual.
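
As a rough illustration of this kind of check, here is a minimal sketch. It is not the expression from Leonid's post: the function name `isPlainEnglish` and the chosen range (printable ASCII, 0x20 through 0x7E) are assumptions made for this example. It relies on the fact that, without the "/u" modifier, PCRE matches byte by byte, and every byte of a multi-byte UTF-8 character is 0x80 or higher, so it falls outside the allowed range.

```php
<?php

/**
 * Hypothetical helper (not from the original post): returns true when
 * $input consists only of printable ASCII bytes, i.e. Latin letters,
 * digits, punctuation, and spaces.
 */
function isPlainEnglish(string $input): bool
{
    // No "/u" modifier: the subject is matched as raw bytes.
    // \A and \z anchor to the very start and end of the string, so a
    // trailing newline is rejected as well. An empty string is accepted.
    return preg_match('/\A[\x20-\x7E]*\z/', $input) === 1;
}

var_dump(isPlainEnglish('John Smith, 42 Main St.')); // bool(true)
var_dump(isPlainEnglish("caf\xC3\xA9"));             // bool(false), UTF-8 "e" with acute accent
var_dump(isPlainEnglish("\xD0\x9B\xD0\xB5\xD0\xBE")); // bool(false), UTF-8 Cyrillic letters
```

Whether control characters such as tabs or newlines should be allowed depends on the form field; widening the byte range in the character class would be one way to permit them.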