Idiomatically determine all the characters that can be used for symbols
Any Unicode character or combination of characters can be used for symbols in Perl 6. Here are some counting rods and some cuneiform:
sub postfix:<𒋦>($n) { say "$n trilobites" }
sub term:<𝍧> { unival('𝍧') }
𝍧𒋦
Output:
8 trilobites
And here is a Zalgo-text symbol:
sub Z̧̔ͩ͌͑̉̎A̢̲̙̮̹̮͍̎L̔ͧ́͆G̰̬͎͔̱̅ͣͫO͙̔ͣ̈́̈̽̎ͣ ($n) { say "$n COMES" }
Z̧̔ͩ͌͑̉̎A̢̲̙̮̹̮͍̎L̔ͧ́͆G̰̬͎͔̱̅ͣͫO͙̔ͣ̈́̈̽̎ͣ 'HE'
Output:
HE COMES
Of course, as in other languages, most of the characters you'll typically see in names are alphanumerics from ASCII (or maybe Unicode). That is a convention, not a limitation: the syntactic category notation demonstrated above can introduce any sequence of characters as a term or operator.
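As a further illustration of the same notation, here is a minimal sketch that defines an infix and a postfix operator. The symbols ± and ‼ and the meanings given to them are chosen purely for illustration and do not come from the examples above; the sketch assumes neither symbol is already defined by the compiler in use.

```raku
# Hypothetical operators, for illustration only.

# infix:<±> returns both the difference and the sum of its operands.
sub infix:<±>($a, $b) { $a - $b, $a + $b }

# postfix:<‼> computes a factorial by reducing 1..$n with multiplication.
sub postfix:<‼>(Int $n) { [*] 1..$n }

say 10 ± 3;   # (7 13)
say 5‼;       # 120
```

Other syntactic categories such as prefix and circumfix are declared with the same sub category:<symbol> form.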
Actually, "any sequence of characters" is a slight prevarication. The syntactic category notation does not allow whitespace in the definition of a new symbol. But that still leaves far more characters allowed than disallowed. Hence, it is much easier to enumerate the characters that cannot be used in symbols:
say .fmt("%4x"),"\t", uniname($_)
    if uniprop($_,'Z')
        for 0..0x1ffff;
Output:
20 SPACE
a0 NO-BREAK SPACE
1680 OGHAM SPACE MARK
2000 EN QUAD
2001 EM QUAD
2002 EN SPACE
2003 EM SPACE
2004 THREE-PER-EM SPACE
2005 FOUR-PER-EM SPACE
2006 SIX-PER-EM SPACE
2007 FIGURE SPACE
2008 PUNCTUATION SPACE
2009 THIN SPACE
200a HAIR SPACE
2028 LINE SEPARATOR
2029 PARAGRAPH SEPARATOR
202f NARROW NO-BREAK SPACE
205f MEDIUM MATHEMATICAL SPACE
3000 IDEOGRAPHIC SPACE
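The 'Z' in the test above stands for the Unicode separator categories. As a hedged variant, assuming the one-argument form of uniprop returns the short General_Category name (such as Zs, Zl or Zp), the following sketch prints that subcategory next to each excluded character. The variable names $cp and $gc are illustrative only.

```raku
# Sketch: list the separator code points together with their subcategory.
for 0..0x1ffff -> $cp {
    # Assumption: uniprop($cp) yields the short General_Category, e.g. "Zs".
    my $gc = uniprop($cp);
    say $cp.fmt("%4x"), "\t", uniname($cp), "\t", $gc
        if $gc.starts-with('Z');
}
```

This should select the same code points as the listing above while showing which of them are space separators and which are line or paragraph separators.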
We enforce the whitespace restriction to prevent insanity in the readers of programs.
That being said, even the whitespace restriction is arbitrary, and can be bypassed by deriving a new grammar and switching to it. We view all other languages as dialects of Perl 6, even the insane ones. :-)