public final class UScript extends Object
java.lang.Object | |
↳ | android.icu.lang.UScript |
Constants for ISO 15924 script codes, and related functions.
The current set of script code constants supports at least all scripts that are encoded in the version of Unicode which ICU currently supports. The names of the constants are usually derived from the Unicode script property value aliases. See UAX #24 Unicode Script Property (http://www.unicode.org/reports/tr24/) and http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt .
Starting with ICU 3.6, constants for most ISO 15924 script codes are included, for use with language tags, CLDR data, and similar. Some of those codes are not used in the Unicode Character Database (UCD). For example, there are no characters that have a UCD script property value of Hans or Hant. All Han ideographs have the Hani script property value in Unicode.
Private-use codes Qaaa..Qabx are not included.
Starting with ICU 55, script codes are only added when their scripts have been or will certainly be encoded in Unicode, and have been assigned Unicode script property value aliases, to ensure that their script names are stable and match the names of the constants. Script codes like Latf and Aran that are not subject to separate encoding may be added at any time.
Nested classes | |
---|---|
enum | UScript.ScriptUsage Script usage constants. |
Constants | |
---|---|
int | AFAKA ISO 15924 script code |
int | AHOM ISO 15924 script code |
int | ANATOLIAN_HIEROGLYPHS ISO 15924 script code |
int | ARABIC Arabic |
int | ARMENIAN Armenian |
int | AVESTAN ISO 15924 script code |
int | BALINESE ISO 15924 script code |
int | BAMUM ISO 15924 script code |
int | BASSA_VAH ISO 15924 script code |
int | BATAK ISO 15924 script code |
int | BENGALI Bengali |
int | BLISSYMBOLS ISO 15924 script code |
int | BOOK_PAHLAVI ISO 15924 script code |
int | BOPOMOFO Bopomofo |
int | BRAHMI ISO 15924 script code |
int | BRAILLE Braille Script in Unicode 4 |
int | BUGINESE Script in Unicode 4.1 |
int | BUHID Buhid |
int | CANADIAN_ABORIGINAL Unified Canadian Aboriginal Symbols |
int | CARIAN ISO 15924 script code |
int | CAUCASIAN_ALBANIAN ISO 15924 script code |
int | CHAKMA ISO 15924 script code |
int | CHAM ISO 15924 script code |
int | CHEROKEE Cherokee |
int | CIRTH ISO 15924 script code |
int | COMMON Common |
int | COPTIC Coptic |
int | CUNEIFORM ISO 15924 script code |
int | CYPRIOT Cypriot Script in Unicode 4 |
int | CYRILLIC Cyrillic |
int | DEMOTIC_EGYPTIAN ISO 15924 script code |
int | DESERET Deseret |
int | DEVANAGARI Devanagari |
int | DUPLOYAN ISO 15924 script code |
int | EASTERN_SYRIAC ISO 15924 script code |
int | EGYPTIAN_HIEROGLYPHS ISO 15924 script code |
int | ELBASAN ISO 15924 script code |
int | ESTRANGELO_SYRIAC ISO 15924 script code |
int | ETHIOPIC Ethiopic |
int | GEORGIAN Georgian |
int | GLAGOLITIC Script in Unicode 4.1 |
int | GOTHIC Gothic |
int | GRANTHA ISO 15924 script code |
int | GREEK Greek |
int | GUJARATI Gujarati |
int | GURMUKHI Gurmukhi |
int | HAN Han |
int | HANGUL Hangul |
int | HANUNOO Hanunoo |
int | HARAPPAN_INDUS ISO 15924 script code |
int | HATRAN ISO 15924 script code |
int | HEBREW Hebrew |
int | HIERATIC_EGYPTIAN ISO 15924 script code |
int | HIRAGANA Hiragana |
int | IMPERIAL_ARAMAIC ISO 15924 script code |
int | INHERITED Inherited |
int | INSCRIPTIONAL_PAHLAVI ISO 15924 script code |
int | INSCRIPTIONAL_PARTHIAN ISO 15924 script code |
int | INVALID_CODE Invalid code |
int | JAPANESE ISO 15924 script code |
int | JAVANESE ISO 15924 script code |
int | JURCHEN ISO 15924 script code |
int | KAITHI ISO 15924 script code |
int | KANNADA Kannada |
int | KATAKANA Katakana |
int | KATAKANA_OR_HIRAGANA Script in Unicode 4.0.1 |
int | KAYAH_LI ISO 15924 script code |
int | KHAROSHTHI Script in Unicode 4.1 |
int | KHMER Khmer |
int | KHOJKI ISO 15924 script code |
int | KHUDAWADI ISO 15924 script code |
int | KHUTSURI ISO 15924 script code |
int | KOREAN ISO 15924 script code |
int | KPELLE ISO 15924 script code |
int | LANNA ISO 15924 script code |
int | LAO Lao |
int | LATIN Latin |
int | LATIN_FRAKTUR ISO 15924 script code |
int | LATIN_GAELIC ISO 15924 script code |
int | LEPCHA ISO 15924 script code |
int | LIMBU Limbu Script in Unicode 4 |
int | LINEAR_A ISO 15924 script code |
int | LINEAR_B Linear B Script in Unicode 4 |
int | LISU ISO 15924 script code |
int | LOMA ISO 15924 script code |
int | LYCIAN ISO 15924 script code |
int | LYDIAN ISO 15924 script code |
int | MAHAJANI ISO 15924 script code |
int | MALAYALAM Malayalam |
int | MANDAEAN ISO 15924 script code |
int | MANDAIC ISO 15924 script code |
int | MANICHAEAN ISO 15924 script code |
int | MATHEMATICAL_NOTATION ISO 15924 script code |
int | MAYAN_HIEROGLYPHS ISO 15924 script code |
int | MEITEI_MAYEK ISO 15924 script code |
int | MENDE Mende Kikakui ISO 15924 script code |
int | MEROITIC ISO 15924 script code |
int | MEROITIC_CURSIVE ISO 15924 script code |
int | MEROITIC_HIEROGLYPHS ISO 15924 script code |
int | MIAO ISO 15924 script code |
int | MODI ISO 15924 script code |
int | MONGOLIAN Mongolian |
int | MOON ISO 15924 script code |
int | MRO ISO 15924 script code |
int | MULTANI ISO 15924 script code |
int | MYANMAR Myanmar |
int | NABATAEAN ISO 15924 script code |
int | NAKHI_GEBA ISO 15924 script code |
int | NEW_TAI_LUE Script in Unicode 4.1 |
int | NKO ISO 15924 script code |
int | NUSHU ISO 15924 script code |
int | OGHAM Ogham |
int | OLD_CHURCH_SLAVONIC_CYRILLIC ISO 15924 script code |
int | OLD_HUNGARIAN ISO 15924 script code |
int | OLD_ITALIC Old Italic |
int | OLD_NORTH_ARABIAN ISO 15924 script code |
int | OLD_PERMIC ISO 15924 script code |
int | OLD_PERSIAN Script in Unicode 4.1 |
int | OLD_SOUTH_ARABIAN ISO 15924 script code |
int | OL_CHIKI ISO 15924 script code |
int | ORIYA Oriya |
int | ORKHON ISO 15924 script code |
int | OSMANYA Osmanya Script in Unicode 4 |
int | PAHAWH_HMONG ISO 15924 script code |
int | PALMYRENE ISO 15924 script code |
int | PAU_CIN_HAU ISO 15924 script code |
int | PHAGS_PA ISO 15924 script code |
int | PHOENICIAN ISO 15924 script code |
int | PHONETIC_POLLARD ISO 15924 script code |
int | PSALTER_PAHLAVI ISO 15924 script code |
int | REJANG ISO 15924 script code |
int | RONGORONGO ISO 15924 script code |
int | RUNIC Runic |
int | SAMARITAN ISO 15924 script code |
int | SARATI ISO 15924 script code |
int | SAURASHTRA ISO 15924 script code |
int | SHARADA ISO 15924 script code |
int | SHAVIAN Shavian Script in Unicode 4 |
int | SIDDHAM ISO 15924 script code |
int | SIGN_WRITING ISO 15924 script code for Sutton SignWriting |
int | SIMPLIFIED_HAN ISO 15924 script code |
int | SINDHI ISO 15924 script code |
int | SINHALA Sinhala |
int | SORA_SOMPENG ISO 15924 script code |
int | SUNDANESE ISO 15924 script code |
int | SYLOTI_NAGRI Script in Unicode 4.1 |
int | SYMBOLS ISO 15924 script code |
int | SYRIAC Syriac |
int | TAGALOG Tagalog |
int | TAGBANWA Tagbanwa |
int | TAI_LE Tai Le Script in Unicode 4 |
int | TAI_VIET ISO 15924 script code |
int | TAKRI ISO 15924 script code |
int | TAMIL Tamil |
int | TANGUT ISO 15924 script code |
int | TELUGU Telugu |
int | TENGWAR ISO 15924 script code |
int | THAANA Thaana |
int | THAI Thai |
int | TIBETAN Tibetan |
int | TIFINAGH Script in Unicode 4.1 |
int | TIRHUTA ISO 15924 script code |
int | TRADITIONAL_HAN ISO 15924 script code |
int | UCAS Unified Canadian Aboriginal Symbols (alias) |
int | UGARITIC Ugaritic Script in Unicode 4 |
int | UNKNOWN ISO 15924 script code |
int | UNWRITTEN_LANGUAGES ISO 15924 script code |
int | VAI ISO 15924 script code |
int | VISIBLE_SPEECH ISO 15924 script code |
int | WARANG_CITI ISO 15924 script code |
int | WESTERN_SYRIAC ISO 15924 script code |
int | WOLEAI ISO 15924 script code |
int | YI Yi syllables |
Public methods | |
---|---|
static final boolean | breaksBetweenLetters(int script) Returns true if the script allows line breaks between letters (excluding hyphenation). |
static final int[] | getCode(ULocale locale) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static final int[] | getCode(String nameOrAbbrOrLocale) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static final int[] | getCode(Locale locale) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static final int | getCodeFromName(String nameOrAbbr) Returns the script code associated with the given Unicode script property alias (name or abbreviation). |
static final String | getName(int scriptCode) Returns the long Unicode script name, if there is one. |
static final String | getSampleString(int script) Returns the script sample character string. |
static final int | getScript(int codepoint) Gets the script code associated with the given codepoint. |
static final int | getScriptExtensions(int c, BitSet set) Sets code point c's Script_Extensions as script code integers into the output BitSet. |
static final String | getShortName(int scriptCode) Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script. |
static final UScript.ScriptUsage | getUsage(int script) Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax. |
static final boolean | hasScript(int c, int sc) Do the Script_Extensions of code point c contain script sc? If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc. |
static final boolean | isCased(int script) Returns true if case distinctions are customary in modern (or most recent) usage of the script. |
static final boolean | isRightToLeft(int script) Returns true if the script is written right-to-left. |
Inherited methods | |
---|---|
From class java.lang.Object |
int ANATOLIAN_HIEROGLYPHS
ISO 15924 script code
Constant Value: 156 (0x0000009c)
int BASSA_VAH
ISO 15924 script code
Constant Value: 134 (0x00000086)
int BLISSYMBOLS
ISO 15924 script code
Constant Value: 64 (0x00000040)
int BOOK_PAHLAVI
ISO 15924 script code
Constant Value: 124 (0x0000007c)
int BRAILLE
Braille Script in Unicode 4
Constant Value: 46 (0x0000002e)
int CANADIAN_ABORIGINAL
Unified Canadian Aboriginal Symbols
Constant Value: 40 (0x00000028)
int CAUCASIAN_ALBANIAN
ISO 15924 script code
Constant Value: 159 (0x0000009f)
int CUNEIFORM
ISO 15924 script code
Constant Value: 101 (0x00000065)
int CYPRIOT
Cypriot Script in Unicode 4
Constant Value: 47 (0x0000002f)
int DEMOTIC_EGYPTIAN
ISO 15924 script code
Constant Value: 69 (0x00000045)
int EASTERN_SYRIAC
ISO 15924 script code
Constant Value: 97 (0x00000061)
int EGYPTIAN_HIEROGLYPHS
ISO 15924 script code
Constant Value: 71 (0x00000047)
int ESTRANGELO_SYRIAC
ISO 15924 script code
Constant Value: 95 (0x0000005f)
int GLAGOLITIC
Script in Unicode 4.1
Constant Value: 56 (0x00000038)
int HARAPPAN_INDUS
ISO 15924 script code
Constant Value: 77 (0x0000004d)
int HIERATIC_EGYPTIAN
ISO 15924 script code
Constant Value: 70 (0x00000046)
int IMPERIAL_ARAMAIC
ISO 15924 script code
Constant Value: 116 (0x00000074)
int INSCRIPTIONAL_PAHLAVI
ISO 15924 script code
Constant Value: 122 (0x0000007a)
int INSCRIPTIONAL_PARTHIAN
ISO 15924 script code
Constant Value: 125 (0x0000007d)
int KATAKANA_OR_HIRAGANA
Script in Unicode 4.0.1
Constant Value: 54 (0x00000036)
int KHAROSHTHI
Script in Unicode 4.1
Constant Value: 57 (0x00000039)
int KHUDAWADI
ISO 15924 script code
Constant Value: 145 (0x00000091)
int LATIN_FRAKTUR
ISO 15924 script code
Constant Value: 80 (0x00000050)
int LATIN_GAELIC
ISO 15924 script code
Constant Value: 81 (0x00000051)
int LINEAR_B
Linear B Script in Unicode 4
Constant Value: 49 (0x00000031)
int MANICHAEAN
ISO 15924 script code
Constant Value: 121 (0x00000079)
int MATHEMATICAL_NOTATION
ISO 15924 script code
Constant Value: 128 (0x00000080)
int MAYAN_HIEROGLYPHS
ISO 15924 script code
Constant Value: 85 (0x00000055)
int MEITEI_MAYEK
ISO 15924 script code
Constant Value: 115 (0x00000073)
int MENDE
Mende Kikakui ISO 15924 script code
Constant Value: 140 (0x0000008c)
int MEROITIC_CURSIVE
ISO 15924 script code
Constant Value: 141 (0x0000008d)
int MEROITIC_HIEROGLYPHS
ISO 15924 script code
Constant Value: 86 (0x00000056)
int NABATAEAN
ISO 15924 script code
Constant Value: 143 (0x0000008f)
int NAKHI_GEBA
ISO 15924 script code
Constant Value: 132 (0x00000084)
int NEW_TAI_LUE
Script in Unicode 4.1
Constant Value: 59 (0x0000003b)
int OLD_CHURCH_SLAVONIC_CYRILLIC
ISO 15924 script code
Constant Value: 68 (0x00000044)
int OLD_HUNGARIAN
ISO 15924 script code
Constant Value: 76 (0x0000004c)
int OLD_NORTH_ARABIAN
ISO 15924 script code
Constant Value: 142 (0x0000008e)
int OLD_PERMIC
ISO 15924 script code
Constant Value: 89 (0x00000059)
int OLD_PERSIAN
Script in Unicode 4.1
Constant Value: 61 (0x0000003d)
int OLD_SOUTH_ARABIAN
ISO 15924 script code
Constant Value: 133 (0x00000085)
int OSMANYA
Osmanya Script in Unicode 4
Constant Value: 50 (0x00000032)
int PAHAWH_HMONG
ISO 15924 script code
Constant Value: 75 (0x0000004b)
int PALMYRENE
ISO 15924 script code
Constant Value: 144 (0x00000090)
int PAU_CIN_HAU
ISO 15924 script code
Constant Value: 165 (0x000000a5)
int PHOENICIAN
ISO 15924 script code
Constant Value: 91 (0x0000005b)
int PHONETIC_POLLARD
ISO 15924 script code
Constant Value: 92 (0x0000005c)
int PSALTER_PAHLAVI
ISO 15924 script code
Constant Value: 123 (0x0000007b)
int RONGORONGO
ISO 15924 script code
Constant Value: 93 (0x0000005d)
int SAMARITAN
ISO 15924 script code
Constant Value: 126 (0x0000007e)
int SAURASHTRA
ISO 15924 script code
Constant Value: 111 (0x0000006f)
int SHAVIAN
Shavian Script in Unicode 4
Constant Value: 51 (0x00000033)
int SIGN_WRITING
ISO 15924 script code for Sutton SignWriting
Constant Value: 112 (0x00000070)
int SIMPLIFIED_HAN
ISO 15924 script code
Constant Value: 73 (0x00000049)
int SORA_SOMPENG
ISO 15924 script code
Constant Value: 152 (0x00000098)
int SUNDANESE
ISO 15924 script code
Constant Value: 113 (0x00000071)
int SYLOTI_NAGRI
Script in Unicode 4.1
Constant Value: 58 (0x0000003a)
int TRADITIONAL_HAN
ISO 15924 script code
Constant Value: 74 (0x0000004a)
int UCAS
Unified Canadian Aboriginal Symbols (alias)
Constant Value: 40 (0x00000028)
int UGARITIC
Ugaritic Script in Unicode 4
Constant Value: 53 (0x00000035)
int UNWRITTEN_LANGUAGES
ISO 15924 script code
Constant Value: 102 (0x00000066)
int VISIBLE_SPEECH
ISO 15924 script code
Constant Value: 100 (0x00000064)
int WARANG_CITI
ISO 15924 script code
Constant Value: 146 (0x00000092)
int WESTERN_SYRIAC
ISO 15924 script code
Constant Value: 96 (0x00000060)
boolean breaksBetweenLetters (int script)
Returns true if the script allows line breaks between letters (excluding hyphenation). Such a script typically requires dictionary-based line breaking. For example, Hani and Thai.
Parameters | |
---|---|
script | int: script code |
Returns | |
---|---|
boolean | true if the script allows line breaks between letters |
int[] getCode (ULocale locale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Parameters | |
---|---|
locale | ULocale: ULocale |
Returns | |
---|---|
int[] | The script codes array, or null if the code cannot be found. |
int[] getCode (String nameOrAbbrOrLocale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Note: To search by short or long script alias only, use getCodeFromName(String) instead. That does a fast lookup without accessing the locale data.
Parameters | |
---|---|
nameOrAbbrOrLocale | String: name of the script or ISO 15924 code or locale |
Returns | |
---|---|
int[] | The script codes array, or null if the code cannot be found. |
int[] getCode (Locale locale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Parameters | |
---|---|
locale | Locale: Locale |
Returns | |
---|---|
int[] | The script codes array, or null if the code cannot be found. |
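Because the getCode overloads are documented to return null when nothing matches, callers need a null check before indexing the result. The following is a minimal sketch of such a check; `ScriptCodes` and `firstOrDefault` are hypothetical names that are not part of UScript, and the `codes` argument stands for whatever a getCode call returned.

```java
final class ScriptCodes {
    private ScriptCodes() {}

    /**
     * Returns the first script code in codes, or fallback when codes is
     * null or empty. The array is assumed to come from a getCode(...) call;
     * this helper itself never touches UScript.
     */
    static int firstOrDefault(int[] codes, int fallback) {
        if (codes == null || codes.length == 0) {
            return fallback;
        }
        return codes[0];
    }
}
```

On Android, a call could look like `ScriptCodes.firstOrDefault(UScript.getCode("en_US"), UScript.INVALID_CODE)`, which keeps the null handling in one place.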
int getCodeFromName (String nameOrAbbr)
Returns the script code associated with the given Unicode script property alias (name or abbreviation). Short aliases are ISO 15924 script codes. Returns MALAYALAM given "Malayalam" or "Mlym".
Parameters | |
---|---|
nameOrAbbr | String: name of the script or ISO 15924 code |
Returns | |
---|---|
int | The script code value, or INVALID_CODE if the code cannot be found. |
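Since a failed lookup is signalled by a sentinel value rather than an exception, it can help to convert the result into an optional at the call site. The sketch below shows one way to do that. `AliasLookup` and `find` are hypothetical names, the `lookup` parameter stands in for `UScript::getCodeFromName`, and `invalidCode` is the sentinel to treat as "not found" (on Android, `UScript.INVALID_CODE`).

```java
import java.util.OptionalInt;
import java.util.function.ToIntFunction;

final class AliasLookup {
    private AliasLookup() {}

    /**
     * Runs lookup on alias and maps the sentinel invalidCode to an empty
     * OptionalInt; any other value is returned as the script code.
     */
    static OptionalInt find(String alias, ToIntFunction<String> lookup, int invalidCode) {
        int code = lookup.applyAsInt(alias);
        return code == invalidCode ? OptionalInt.empty() : OptionalInt.of(code);
    }
}
```

A call such as `AliasLookup.find("Mlym", UScript::getCodeFromName, UScript.INVALID_CODE)` would then give the caller `isPresent()` instead of a comparison against the sentinel.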
String getName (int scriptCode)
Returns the long Unicode script name, if there is one. Otherwise returns the 4-letter ISO 15924 script code. Returns "Malayalam" given MALAYALAM.
Parameters | |
---|---|
scriptCode | int: int script code |
Returns | |
---|---|
String | long script name as given in PropertyValueAliases.txt, or the 4-letter code |
Throws | |
---|---|
IllegalArgumentException | if the script code is not valid |
String getSampleString (int script)
Returns the script sample character string. This string normally consists of one code point but might be longer. The string is empty if the script is not encoded.
Parameters | |
---|---|
script | int: script code |
Returns | |
---|---|
String | the sample character string |
int getScript (int codepoint)
Gets the script code associated with the given codepoint. Returns UScript.MALAYALAM given 0x0D02.
Parameters | |
---|---|
codepoint | int: UChar32 codepoint |
Returns | |
---|---|
int | The script code |
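A common use of a per-code-point lookup like this is to tally which scripts occur in a string. The sketch below is one possible way to do that, not an ICU facility. `ScriptHistogram` and `count` are hypothetical names, and the `scriptOf` parameter stands in for `UScript::getScript` so that the helper does not depend on the ICU classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

final class ScriptHistogram {
    private ScriptHistogram() {}

    /**
     * Counts how many code points of text map to each script code.
     * Keys are script codes as returned by scriptOf; values are counts.
     */
    static Map<Integer, Integer> count(String text, IntUnaryOperator scriptOf) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        // Iterate by code point, not by char, so that a supplementary
        // character (a surrogate pair) is looked up and counted once.
        text.codePoints()
            .forEach(cp -> counts.merge(scriptOf.applyAsInt(cp), 1, Integer::sum));
        return counts;
    }
}
```

On Android, `ScriptHistogram.count(text, UScript::getScript)` would plug in the real lookup.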
int getScriptExtensions (int c, BitSet set)
Sets code point c's Script_Extensions as script code integers into the output BitSet.
If c is not a valid code point, then the one UNKNOWN code is put into the set and also returned.
Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.
The Script_Extensions property is provisional. It may be modified or removed in future versions of the Unicode Standard, and thus in ICU.
Parameters | |
---|---|
c | int: code point |
set | BitSet: set of script code integers; will be cleared, then bits are set corresponding to c's Script_Extensions |
Returns | |
---|---|
int | negative number of script codes in c's Script_Extensions, or the non-negative single Script value |
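The sign of the return value carries two different meanings, which is easy to get wrong at the call site. The helper below is a small sketch of how the documented convention could be decoded. `ScriptExtensionsResult` and its methods are hypothetical names; `result` stands for the value returned by getScriptExtensions and `set` for the BitSet it filled. It assumes that for a non-negative result the set holds exactly that one script code.

```java
import java.util.BitSet;

final class ScriptExtensionsResult {
    private ScriptExtensionsResult() {}

    /** True if the result is a negative count, that is, c has explicit Script_Extensions. */
    static boolean hasExplicitExtensions(int result) {
        return result < 0;
    }

    /** Number of script codes the filled BitSet is expected to hold for this result. */
    static int expectedCount(int result) {
        // A negative result is the negated count; a non-negative result
        // is a single script code, so the count is assumed to be one.
        return result < 0 ? -result : 1;
    }

    /** Copies the indices of the set bits (the script codes) into an int array. */
    static int[] toCodes(BitSet set) {
        return set.stream().toArray();
    }
}
```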
String getShortName (int scriptCode)
Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script. Returns "Mlym" given MALAYALAM.
Parameters | |
---|---|
scriptCode | int: int script code |
Returns | |
---|---|
String | short script name (4-letter code) |
Throws | |
---|---|
IllegalArgumentException | if the script code is not valid |
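Both getName and getShortName are documented to throw IllegalArgumentException for an invalid script code, so code that handles untrusted integers may want a guarded wrapper. The following is a sketch under that assumption. `SafeScriptName` and `nameOr` are hypothetical names, and `nameOf` stands in for `UScript::getName` or `UScript::getShortName`.

```java
import java.util.function.IntFunction;

final class SafeScriptName {
    private SafeScriptName() {}

    /**
     * Returns nameOf.apply(scriptCode), or fallback if the lookup rejects
     * the code with an IllegalArgumentException.
     */
    static String nameOr(int scriptCode, IntFunction<String> nameOf, String fallback) {
        try {
            return nameOf.apply(scriptCode);
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }
}
```

A call such as `SafeScriptName.nameOr(code, UScript::getShortName, "?")` would then never propagate the exception to the caller.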
UScript.ScriptUsage getUsage (int script)
Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax.
Returns NOT_ENCODED if the script is not encoded in Unicode.
Parameters | |
---|---|
script | int: script code |
Returns | |
---|---|
UScript.ScriptUsage | script usage |
boolean hasScript (int c, int sc)
Do the Script_Extensions of code point c contain script sc? If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc.
Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.
The Script_Extensions property is provisional. It may be modified or removed in future versions of the Unicode Standard, and thus in ICU.
Parameters | |
---|---|
c | int: code point |
sc | int: script code |
Returns | |
---|---|
boolean | true if sc is in Script_Extensions(c) |
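One way to apply this test is to check whether every code point of a string is usable with a single script. The sketch below illustrates the idea. `SingleScriptCheck`, `HasScript`, and `allInScript` are hypothetical names, and the `HasScript` interface only mirrors the shape of `UScript.hasScript(int, int)` so a method reference can be passed in. By the rule stated above, a code point whose Script value is Common and which has no explicit Script_Extensions would fail a check against any other script, so callers may want to filter such characters first.

```java
final class SingleScriptCheck {
    private SingleScriptCheck() {}

    /** Hypothetical functional shape matching hasScript(int c, int sc). */
    interface HasScript {
        boolean test(int codePoint, int script);
    }

    /**
     * Returns true if hasScript accepts every code point of text for the
     * given script. An empty string trivially passes.
     */
    static boolean allInScript(String text, int script, HasScript hasScript) {
        return text.codePoints().allMatch(cp -> hasScript.test(cp, script));
    }
}
```

On Android this could be invoked as `SingleScriptCheck.allInScript(text, UScript.LATIN, UScript::hasScript)`.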
boolean isCased (int script)
Returns true if case distinctions are customary in modern (or most recent) usage of the script. For example, Latn and Cyrl.
Parameters | |
---|---|
script | int: script code |
Returns | |
---|---|
boolean | true if the script is cased |
boolean isRightToLeft (int script)
Returns true if the script is written right-to-left. For example, Arab and Hebr.
Parameters | |
---|---|
script | int: script code |
Returns | |
---|---|
boolean | true if the script is right-to-left |
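Combined with a per-code-point script lookup, this predicate can give a rough answer to whether a string contains any right-to-left text. The sketch below shows that combination and is not a substitute for the Unicode Bidirectional Algorithm. `RtlDetector` and `containsRightToLeft` are hypothetical names; `scriptOf` stands in for `UScript::getScript` and `isRtl` for `UScript::isRightToLeft`.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

final class RtlDetector {
    private RtlDetector() {}

    /**
     * Returns true if at least one code point of text maps, via scriptOf,
     * to a script code that isRtl reports as right-to-left.
     */
    static boolean containsRightToLeft(String text, IntUnaryOperator scriptOf, IntPredicate isRtl) {
        return text.codePoints()  // iterate by code point
                   .map(scriptOf) // code point -> script code
                   .anyMatch(isRtl);
    }
}
```

On Android this could be invoked as `RtlDetector.containsRightToLeft(text, UScript::getScript, UScript::isRightToLeft)`.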