UTF-8

This group is for testing UTF-8 capabilities of PmWiki. Specific things that need to be tested:

  • Non-Western characters in pagenames
  • Languages written in non-Latin scripts

Note: if you cannot see the Chinese (Russian, Urdu, ...) characters below, your browser either does not support Unicode or is using the wrong character encoding.
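
The effect of a wrong encoding can be reproduced outside the browser. The sketch below is only an illustration (the helper name misread is made up here, and ISO-8859-1 is just one encoding a browser might wrongly assume): it encodes text as UTF-8 and then decodes the same bytes with a different encoding.

```python
def misread(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong encoding.

    Hypothetical helper for illustration: it mimics a browser that receives
    UTF-8 bytes but interprets them as another single-byte encoding.
    """
    raw = text.encode("utf-8")
    return raw.decode(wrong_encoding)


sample = "é"
print(sample.encode("utf-8").decode("utf-8"))  # read correctly
print(misread(sample))                         # read with the wrong encoding
```

If the characters on this page look like the second line rather than the first, the encoding setting is the more likely culprit, not missing Unicode support.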


Question

Is the mbstring extension enabled or disabled on this site? If it is enabled, then I think we are not actually testing the new utf8toupper() function from the xlpage-utf-8.php script here. If we are not, where can we test it?


(Advanced) Making your national characters convert to upper-case properly

You need to know a bit of PHP to do this. If you do, the notes below may be useful:

  • you must add your national characters to the $CaseConversions array at the bottom of the scripts/xlpage-utf-8.php file; its format looks a bit strange at first sight (see below for an explanation);
  • the Character Map in MS Windows can help: when you click a character in it, its Unicode code point appears in the lower-left corner of the window (e.g. é has the code point E9 (or e9), shown as U+00E9).
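
As a rough aid for preparing such entries, the sketch below prints the UTF-8 bytes of a lower-case character and of its upper-case form as PHP "\x.." escapes. It assumes that $CaseConversions pairs the UTF-8 byte string of each lower-case character with that of its upper-case counterpart; check the existing entries in scripts/xlpage-utf-8.php before relying on this. The helper names php_escape and case_pair are invented for this example, and Python is used only as a convenient calculator.

```python
def php_escape(ch: str) -> str:
    """Return the UTF-8 bytes of ch written as PHP double-quoted \\x escapes."""
    return "".join(f"\\x{b:02x}" for b in ch.encode("utf-8"))


def case_pair(lower: str) -> str:
    """Format one lower => upper line (assumed layout, see the note above)."""
    upper = lower.upper()
    # Skip characters with no single-character upper-case form (e.g. ß).
    if upper == lower or len(upper) != 1:
        raise ValueError(f"no one-to-one upper-case form for {lower!r}")
    return (f'"{php_escape(lower)}" => "{php_escape(upper)}", '
            f"// U+{ord(lower):04X} -> U+{ord(upper):04X}")


for ch in "éąž":
    print(case_pair(ch))
```

For é this gives the bytes c3 a9 (U+00E9) paired with c3 89 (U+00C9), which matches the code point shown by the Character Map.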

* {(toupper abcdefghijklmnopqrstuvwxyz)}
* {(toupper ABCDEFGHIJKLMNOPRQSTUVWXYZ)}
* {(toupper áâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ)}
* {(toupper ąčęėįšųūž)}
* {(toupper "430 абвгдежзийклмноп")}
* {(toupper "440 рстуфхцчшщъыьэюя")}
* {(toupper "450 ѐёђѓєѕіїјљњћќѝўџ")}
* {(toupper "48a ҊҋҌҍҎҏ")}
* {(toupper "490 ҐґҒғҔҕҖҗҘҙҚқҜҝҞҟ")}
* {(toupper "4A0 ҠҡҢңҤҥҦҧҨҩҪҫҬҭҮү")}
* {(toupper "4B0 ҰұҲҳҴҵҶҷҸҹҺһҼҽҾҿ")}
* {(toupper "4C0 ӀӁӂӃӄӅӆӇӈӉӊӋӌӍӎ")}
* {(toupper "4D0 ӐӑӒӓӔӕӖӗӘәӚӛӜӝӞӟ")}
* {(toupper "4E0 ӠӡӢӣӤӥӦӧӨөӪӫӬӭӮӯ")}
* {(toupper "4F0 ӰӱӲӳӴӵӶӷӸӹӺӻӼӽӾӿ")}
  • ABCDEFGHIJKLMNOPQRSTUVWXYZ
  • ABCDEFGHIJKLMNOPRQSTUVWXYZ
  • ÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙÚÛÜÝÞŸ
  • ĄČĘĖĮŠŲŪŽ
  • 430 АБВГДЕЖЗИЙКЛМНОП
  • 440 РСТУФХЦЧШЩЪЫЬЭЮЯ
  • 450 ЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏ
  • 48A ҊҊҌҌҎҎ
  • 490 ҐҐҒҒҔҔҖҖҘҘҚҚҜҜҞҞ
  • 4A0 ҠҠҢҢҤҤҦҦҨҨҪҪҬҬҮҮ
  • 4B0 ҰҰҲҲҴҴҶҶҸҸҺҺҼҼҾҾ
  • 4C0 ӀӁӁӃӃӅӅӇӇӉӉӋӋӍӍ
  • 4D0 ӐӐӒӒӔӔӖӖӘӘӚӚӜӜӞӞ
  • 4E0 ӠӠӢӢӤӤӦӦӨӨӪӪӬӬӮӮ
  • 4F0 ӰӰӲӲӴӴӶӶӸӸӺӺӼӼӾӾ
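
One way to cross-check rows like these is to compare the wiki output with an independent upper-casing routine. The sketch below uses Python's built-in str.upper() as that reference, which is an assumption of this example and not part of PmWiki; the function name compare_upper is hypothetical. It lists every position where the observed output differs from the reference.

```python
def compare_upper(source: str, observed: str):
    """Return (char, expected, observed) triples where observed != upper(char).

    The reference is Python's str.upper(); strings whose upper-case form
    changes length are rejected to keep positions aligned.
    """
    expected = source.upper()
    if not (len(source) == len(expected) == len(observed)):
        raise ValueError("strings must have the same length")
    return [(s, e, o) for s, e, o in zip(source, expected, observed) if e != o]


# The Lithuanian row from the list above: input and rendered output.
print(compare_upper("ąčęėįšųūž", "ĄČĘĖĮŠŲŪŽ"))
```

An empty list means the row agrees with the reference; any triple points at a character that may still be missing from the conversion table.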


рст уф сту ф туф уф

Visiškai lietuviškų žodžių šičia


Arabic in a div with class rtl (the rtl class adds the style direction:rtl;):

يوركشر - مقاطعة سابقة في شمالي إنجلترا

هَلْ أَنْتَ مُتَزَوِّجٌ؟


Armenian:

կդֆ թսդ թլսաաէպրո լթսդֆկթդսլկոպէէւոպլթսդ թլսդկպչէ ԼԹԹԼՍԹԼԴ պչՌՌՒԷպչպչւ


Bulgarian:

абвгдежзийклмнопрстуфхцчшщъьюя

АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЬЮЯ?

ЧфываваШфвыафывафвы


Chinese:

寫個中文試試看


Czech: (cs)

Příliš žluťoučký kůň úpěl ďábelské ódy
PŘÍLIŠ ŽLUŤOUČKÝ KŮŇ ÚPĚL ĎÁBELSKÉ ÓDY
Příliš žluťoučký kůň úpěl ďábelské ódy?


Danish: (da)

Rødgrød med mælk og fløde på
RØDGRØD MED MÆLK OG FLØDE PÅ

æøåÆØÅ?


Devanagari: (Hindi, Nepali and so on)

देवनागरी लिपी -- हिन्दी / नेपाली


Estonian:

õäöü? ÕÄÖÜ?


Farsi (Persian) in div with class rtl:

راست به جهت نوشتار از چپ را می توان با RTL کلاس سبک اجرا


Greek:

αβγδεζηθικλμνξοπρστυφχψω άέήίόύώϊϋ
ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ ΆΈΉΊΌΎΏ

Unfortunately, there are still known issues with searching UTF-8 Greek strings.


Hebrew:

The original entry showed:

אני לא אוהב את המוסיקה של מדונה?

Hmm, RTL behaves strangely (note the placement of the question mark).

אני אוהב את המוסיקה של סטינג

אני אוהב את המוסיקה של סטינג 2?

The new entry, however:

  • RTL seems OK when using class rtl
  • It even works inside another directive, like >>rframe rtl<<
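
A likely explanation for the question-mark placement noted above is that punctuation has no direction of its own in Unicode, so a trailing mark follows the direction of the surrounding paragraph rather than that of the Hebrew text. The sketch below is only an illustration of that idea (the helper name bidi_classes is made up); it uses Python's standard unicodedata module to print the bidirectional class of each character.

```python
import unicodedata


def bidi_classes(text: str):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Last word of the second Hebrew test line, followed by " 2?".
for ch, cls in bidi_classes("סטינג 2?"):
    print(repr(ch), cls)
```

Hebrew letters report class R, the digit reports EN, and the question mark reports ON (other neutral). With class rtl the paragraph direction itself becomes right-to-left, which would account for the mark then appearing at the expected end.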

Japanese:

日本語のテスト


Polish:

ąćśżźęółń? ĄćśŻźęóŁń? Ąćś Żźęó Łń?


Thai:

  • เป็นมนุษย์ สุดประเสริฐ เลิศคุณค่า
  • กว่าบรรดา ฝูงสัตว์ เดรัจฉาน
  • จงฝ่าฟัน พัฒนา วิชาการ?
  • อย่าล้างผลาญ ฤๅเข่นฆ่า บีฑาใคร
  • ไม่ถือโทษ โกรธแช่งซัด ฮึดฮัดด่า
  • หัดอภัย เหมือนกีฬา อัชฌาสัย?
  • ปฏิบัติ? ประพฤติกฎ กำหนดใจ
  • พูดจาให้ จ๊ะๆ จ๋าๆ น่าฟังเอยฯ

Turkish:

Elazıg Adapazarı Çetinkaya Aydın Denizli Eskişehir Eğridir Dyarbakır Çatalağzı?


Urdu (in div with class rtl)

بائیں متن کی سمت کا حق طرز کلاس RTL ساتھ نافذ کیا جا سکتا ہے


УкраїнськаМова?


Hrvatski šŠĐČĆŽ? Ššđčćž


Korean:

한국에서 쓰는?피엠위키