
| From: Stewart C. Russell via talk <talk@gtalug.org>

| tr on Mac OS seems to assume input is valid UTF-8 text (if locale is
| suitably UTF-8).

To amplify this, not all byte sequences are valid UTF-8. Random byte
sequences will sometimes be invalid.

Off the top of my head, I think that the following are invalid:

- A continuation byte (0x80 through 0xBF) not preceded by a lead byte

- A string that ends partway through a multi-byte sequence

- A lead byte followed by the wrong number of continuation bytes (a
  sequence is at most 4 bytes long)

Each valid character is either a single byte without the high bit on
(plain ASCII), or a lead byte (0xC2 through 0xF4) followed by one to
three continuation bytes (0x80 through 0xBF). The payload bits of the
lead byte and the continuation bytes are concatenated to form the
UTF-32 value. Overlong encodings, surrogates, and values above
U+10FFFF are forbidden.

On the other hand, UTF-8 is UTF-8, whether you are in the US or the CA
locale. So the different behaviours between the two UTF-8 locales
would seem to be a bug. (In theory, collating sequences could differ,
so ranges in tr could differ, but I would not expect that to affect
the ASCII subset you are using in your ranges.)

Using the C locale should give you 8-bit characters, not UTF-8. So it
should work.

This (untested) small change to Giles' script should work:

    dd if=/dev/urandom bs=1 count=256 2>/dev/null | LC_ALL=C tr -dc 'A-Za-z0-9!@$%^&*(){}[]=+-_/?\|~`' | head -c 32

LC_ALL might be overkill. I don't know.

I'd probably add an echo to put a newline at the end.
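
Following up on the echo idea, here is a rough sketch of the pipeline wrapped in a function with the newline added. The name genpass is made up for illustration and is not from Giles' script. One assumption on my part: inside a tr set, '+-_' reads as a range running from '+' to '_', which also lets through characters such as ',' '.' ':' ';' '<' and '>'. I am guessing the three characters were meant literally, so the sketch moves the hyphen to the end of the set, where tr takes it as a plain character. If the wider range was intended, keep the original set.

```shell
#!/bin/sh
# genpass is a hypothetical helper name, not part of the original script.
genpass() {
    # Read 256 random bytes; roughly a third survive the filter,
    # which leaves comfortably more than the 32 we keep.
    # LC_ALL=C makes tr work on single bytes instead of decoding UTF-8.
    # Assumption: the hyphen is meant literally, so it sits last in the
    # set rather than forming the range '+-_'.
    dd if=/dev/urandom bs=1 count=256 2>/dev/null |
        LC_ALL=C tr -dc 'A-Za-z0-9!@$%^&*(){}[]=+_/?\|~`-' |
        head -c 32
    echo    # finish the output with a newline
}

genpass
```

Each run prints a different 32-character string followed by a newline.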