JavaScript Lecture

A surrogate pair regex is a good alternative to split('') when splitting a string into characters

split('') returns an unexpected array when the original text contains emojis.
const word = 'AB😀C'

const items = word.split('')

console.log(items)
// ['A', 'B', '\uD83D', '\uDE00', 'C']

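😀 is split into two UTF-16 code units. JavaScript strings are sequences of UTF-16 code units, and 😀 (U+1F600) lies outside the Basic Multilingual Plane, so it is encoded as a surrogate pair. The length property makes this visible, because it counts code units rather than visible characters: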

const word = 'AB😀C'

console.log(word.length)
// 5
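
To see where the length of 5 comes from, the individual code units can be printed. This is a minimal sketch using only the built-in charCodeAt and codePointAt methods; the variable names units and codePoint are chosen here for illustration.

```javascript
const word = 'AB😀C'

// Each index addresses one UTF-16 code unit, not one visible character.
const units = []
for (let i = 0; i < word.length; i++) {
  units.push(word.charCodeAt(i).toString(16).toUpperCase())
}

console.log(units)
// ['41', '42', 'D83D', 'DE00', '43']

// codePointAt reads the whole code point when the index is on a high surrogate.
const codePoint = word.codePointAt(2).toString(16).toUpperCase()

console.log(codePoint)
// 1F600
```

The two middle entries, D83D and DE00, are the same values that split('') returned as separate array items.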

An alternative to the split function

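A regular expression combined with match splits the string so that an emoji counts as a single element. The pattern first tries to match a high surrogate followed by a low surrogate, and otherwise falls back to any single character.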

const word = 'AB😀C'

const array = word.match(/([\uD800-\uDBFF][\uDC00-\uDFFF])|./g)

console.log(array)
// ['A', 'B', '😀', 'C']
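
The same idea can be wrapped in a small function. The sketch below is one possible variation, not the only way to do it: splitChars is a hypothetical name, and it makes two assumptions beyond the snippet above. It uses [\s\S] instead of . so that line breaks are not dropped, and it returns an empty array when match returns null.

```javascript
// Hypothetical helper name; not part of any library.
function splitChars(text) {
  // Try a surrogate pair first, then fall back to any single code unit.
  // [\s\S] is used instead of . so that line breaks are kept too.
  const matched = text.match(/[\uD800-\uDBFF][\uDC00-\uDFFF]|[\s\S]/g)
  // match returns null when nothing matches, e.g. for an empty string.
  return matched === null ? [] : matched
}

console.log(splitChars('AB😀C'))
// ['A', 'B', '😀', 'C']

console.log(splitChars('A\n😀'))
// ['A', '\n', '😀']

console.log(splitChars(''))
// []

// For comparison, the built-in string iterator also walks by code point.
console.log(Array.from('AB😀C'))
// ['A', 'B', '😀', 'C']
```

For this input, Array.from gives the same result as the regex, so it may be the simpler choice when no custom matching is needed.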

A surrogate pair consists of a high surrogate followed by a low surrogate. The two ranges (in hexadecimal) are:

High: D800 - DBFF
Low: DC00 - DFFF
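
These ranges can be checked directly in code, and the two halves can be combined back into the original code point with the standard UTF-16 decoding formula. The following is a minimal sketch; isHighSurrogate, isLowSurrogate and decodePair are hypothetical names introduced only for this example.

```javascript
// Hypothetical helper names, used only for this illustration.
const isHighSurrogate = (unit) => unit >= 0xD800 && unit <= 0xDBFF
const isLowSurrogate = (unit) => unit >= 0xDC00 && unit <= 0xDFFF

// Combine a high and a low surrogate into the code point they encode.
function decodePair(high, low) {
  return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000
}

const emoji = '😀'
const high = emoji.charCodeAt(0) // 0xD83D
const low = emoji.charCodeAt(1) // 0xDE00

console.log(isHighSurrogate(high), isLowSurrogate(low))
// true true

console.log(decodePair(high, low).toString(16).toUpperCase())
// 1F600

console.log(decodePair(high, low) === emoji.codePointAt(0))
// true
```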