The Java lowercase conversion surprise in Turkey

You can convert Strings to lowercase in Java with String.toLowerCase().

The following code will work anywhere in the world except Turkey:

import java.util.Locale;

public class LocaleTest {

  public void test() {
    Assert.assertEquals("title", "TITLE".toLowerCase());


In Turkey, this will give you:

org.junit.ComparisonFailure: expected:<t[i]tle> but was:<t[?]tle>
	at org.junit.Assert.assertEquals(
	at org.junit.Assert.assertEquals(
	at com.garygregory.util.LocaleTest.testBug2(
	... 24 more

Why is that?

Under the covers, the call to toLowerCase() becomes toLowerCase(Locale.getDefault()), and of course getDefault() changes depending on your country. So what really happens in Turkey is toLowerCase(Locale.forLanguageTag("TR"))).

In the Turkish locale, the Unicode LATIN CAPITAL LETTER I becomes a LATIN SMALL LETTER DOTLESS I. That’s not a lowercase “i”.

If you index maps with String keys that include the letter I and you normalize keys to lowercase, your program will have a bug when running on the Turkish locale.

How are you supposed to deal with that?! Well, luckily, you can use a Locale that does not perform such conversions, namely Locale.ENGLISH and more generically, Locale.ROOT.

You defensively written code becomes one of:

Assert.assertEquals("title", "TITLE".toLowerCase(Locale.ENGLISH));
Assert.assertEquals("title", "TITLE".toLowerCase(Locale.ROOT));

If you write your code “in English”, the Locale.ENGLISH is fine, but consider Locale.ROOT for more general support.

Java defines Locale.ROOT as:

 * Useful constant for the root locale. The root locale is the locale whose
 * language, country, and variant are empty ("") strings. This is regarded
 * as the base locale of all locales, and is used as the language/country
 * neutral locale for the locale sensitive operations.
 * @since 1.6
static public final Locale ROOT = createConstant("", "");

As a reminder, Java provides localization support with the java.util.Locale class, from the Javadoc:

A Locale object represents a specific geographical, political, or cultural region. An operation that requires a Locale to perform its task is called locale-sensitive and uses the Locale to tailor information for the user. For example, displaying a number is a locale-sensitive operation— the number should be formatted according to the customs and conventions of the user’s native country, region, or culture.

The Locale class implements IETF BCP 47 which is composed of RFC 4647 “Matching of Language Tags” and RFC 5646 “Tags for Identifying Languages” with support for the LDML (UTS#35, “Unicode Locale Data Markup Language”) BCP 47-compatible extensions for locale data exchange.

For further experimentation, the following Java system properties affect how Java initializes the default locale:

  • user.language
  • user.region
  • user.script
  • user.variant

Here is the full source code code for the test:

package com.garygregory.util;

import java.util.Locale;

import org.junit.Assert;
import org.junit.Assume;
import org.junit.ComparisonFailure;
import org.junit.Test;

public class LocaleTest {

  private static final Locale TURKISH = Locale.forLanguageTag("tr");

  public void testBug1() {
    Assume.assumeTrue(Locale.getDefault() != TURKISH);
    Assert.assertEquals("title", "TITLE".toLowerCase());
    Assert.assertEquals("title", "TITLE".toLowerCase(Locale.getDefault()));

  @Test(expected = ComparisonFailure.class)
  public void testBug2() {
    Assert.assertEquals("title", "TITLE".toLowerCase(TURKISH));

  public void testFix() {
    Assert.assertEquals("title", "TITLE".toLowerCase(Locale.ENGLISH));
    Assert.assertEquals("title", "TITLE".toLowerCase(Locale.ROOT));


In summary, use toLowerCase(Locale.ROOT) instead of toLowerCase() for internal maps.

Happy Coding,
Gary Gregory


2 thoughts on “The Java lowercase conversion surprise in Turkey

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s