Encryption for non alphabetic languages

SLAM: debunk creationism, pseudoscience, and superstitions. Discuss logic and morality.

Moderator: Alyrium Denryle

mr friendly guy
The Doctor
Posts: 11235
Joined: 2004-12-12 10:55pm
Location: In a 1960s police telephone box somewhere in Australia

Encryption for non alphabetic languages

Post by mr friendly guy »

I subscribe to sci show, and they had a nice video on encryption techniques. However these applied to English and by extension languages which have alphabets.

https://www.youtube.com/watch?v=-yFZGF8FHSg

How would one encrypt a language that doesn't have an alphabet, for example Chinese.
Never apologise for being a geek, because they won't apologise to you for being an arsehole. John Barrowman - 22 June 2014 Perth Supernova.

Countries I have been to - 14.
Australia, Canada, China, Colombia, Denmark, Ecuador, Finland, Germany, Malaysia, Netherlands, Norway, Singapore, Sweden, USA.
Always on the lookout for more nice places to visit.
fnord
Jedi Knight
Posts: 950
Joined: 2005-09-18 08:09am
Location: You're not cleared for that

Re: Encryption for non alphabetic languages

Post by fnord »

Just off the top of my head, without having viewed the link, encipher (frinstance) the Unicode representation - as far as encipherment's concerned, it's a string of bytes. Unicode imposes a mapping on top of that, so as long as it round trips - same bytes come out after decipherment that were fed into encipherment - afaik, you're good.

Of course, I could be talking utter bollocks - someone who knows more, please correct me.
A mad person thinks there's a gateway to hell in his basement. A mad genius builds one and turns it on. - CaptainChewbacca
Jub
Sith Marauder
Posts: 4396
Joined: 2012-08-06 07:58pm
Location: British Columbia, Canada

Re: Encryption for non alphabetic languages

Post by Jub »

fnord wrote:Just off the top of my head, without having viewed the link, encipher (frinstance) the Unicode representation - as far as encipherment's concerned, it's a string of bytes. Unicode imposes a mapping on top of that, so as long as it round trips - same bytes come out after decipherment that were fed into encipherment - afaik, you're good.

Of course, I could be talking utter bollocks - someone who knows more, please correct me.
You have the right of it. To a computer everything is bits, and because of this encryption doesn't really care what the data going in means. You literally just do some transformative steps on incoming strings of bits and then use a decryption key at the other end.
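
A minimal sketch of that round trip in Python. The repeating-key XOR is only a toy stand-in for a real cipher (it is not secure), and the names `xor_bytes`, `message` and `key` are made up for illustration. The point is just that the cipher only ever sees UTF-8 bytes, never "Chinese characters".

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte with a repeating key. NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = "你好，世界"                    # any Unicode text works the same way
plaintext = message.encode("utf-8")      # text -> string of bytes
key = b"example key"                     # hypothetical shared key

ciphertext = xor_bytes(plaintext, key)   # encipher the bytes
recovered = xor_bytes(ciphertext, key)   # XOR with the same key undoes it

# Same bytes out as went in, so the text decodes back unchanged.
assert recovered == plaintext
print(recovered.decode("utf-8"))
```

Swapping the toy XOR for a real cipher would not change the encode/decode steps around it.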
mr friendly guy
The Doctor
Posts: 11235
Joined: 2004-12-12 10:55pm
Location: In a 1960s police telephone box somewhere in Australia

Re: Encryption for non alphabetic languages

Post by mr friendly guy »

So you guys are saying you can encrypt non alphabetic languages with computers? Ok, what about before computers? One of the encryption techniques mentioned was developed in the 16th century, and I can't see how that would apply to non alphabetic languages.
Terralthra
Requiescat in Pace
Posts: 4741
Joined: 2007-10-05 09:55pm
Location: San Francisco, California, United States

Re: Encryption for non alphabetic languages

Post by Terralthra »

Ciphers in Chinese and other logographic languages could be encrypted with grille ciphers, secret sharing (cutting a message into vertical strips, thus breaking up sentences), mixing and matching the syllables involved in logographs to include a separate message (steganography, in other words), and a couple other techniques. You're right that alphabet-substitution techniques like Caesar and Vigenere ciphers would not work particularly well.
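
A rough sketch of the strip-cutting idea, assuming the message is written as a rectangular block of characters. The function names and the two-line sample message are invented for this example; each strip on its own shows only a sliver of every sentence.

```python
def cut_into_strips(lines, strip_width):
    """Cut a rectangular block of text into vertical strips of strip_width columns."""
    width = len(lines[0])
    assert all(len(line) == width for line in lines), "pad lines to equal length first"
    return [[line[i:i + strip_width] for line in lines]
            for i in range(0, width, strip_width)]

def reassemble(strips):
    """Lay the strips side by side, in order, to rebuild the original lines."""
    return ["".join(parts) for parts in zip(*strips)]

lines = ["今晚三更", "城門相見"]       # hypothetical message, one sentence per line
strips = cut_into_strips(lines, 1)   # four strips, one column each

print(strips[0])                     # what the holder of the first strip sees
assert reassemble(strips) == lines
```

Reassembly here assumes the holders know the strip order; without that, the strips have to be tried in different arrangements.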
mr friendly guy
The Doctor
Posts: 11235
Joined: 2004-12-12 10:55pm
Location: In a 1960s police telephone box somewhere in Australia

Re: Encryption for non alphabetic languages

Post by mr friendly guy »

Terralthra wrote:Ciphers in Chinese and other logographic languages could be encrypted with grille ciphers, secret sharing (cutting a message into vertical strips, thus breaking up sentences), mixing and matching the syllables involved in logographs to include a separate message (steganography, in other words), and a couple other techniques. You're right that alphabet-substitution techniques like Caesar and Vigenere ciphers would not work particularly well.
Before the advent of computers, how well would these techniques work? For example, if we have two equally good mathematicians, one who only knows English and the other who only knows Chinese, who would find it easier to crack a simple message with an equivalent number of words in their respective language, using a cipher method like Caesar or Vigenere for English and one of those methods for Chinese?

I know it's going to be hard to answer, but I thought I would try.
Terralthra
Requiescat in Pace
Posts: 4741
Joined: 2007-10-05 09:55pm
Location: San Francisco, California, United States

Re: Encryption for non alphabetic languages

Post by Terralthra »

Eh... that is hard to answer. Vigenere polyalphabetic ciphers were effectively unbroken until the age of Charles Babbage and Kasiski, by which time the underpinnings of cryptanalysis using mathematics (which would go on to be computer-assisted) began to be known. By the 1850s, a shorter key-length Vigenere could be broken by hand (and was), but a Vigenere whose key is random, as long as the message, and never reused is effectively just another way of saying "one time pad" and is more or less unbreakable (and still is, barring weakness in the random number generator). Caesar ciphers were solved problems using frequency analysis by the 9th Century CE.
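
To show why a Caesar cipher falls to frequency analysis, here is a small Python sketch. It uses the crudest possible heuristic (assume the most common ciphertext letter stands for "e"), which can fail on short or unusual texts; the sample sentence and function names are made up for the example.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase

def caesar(text, shift):
    """Shift each lowercase letter by `shift` places, leaving other characters alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(ciphertext):
    """Guess the shift by assuming the most frequent letter is an enciphered 'e'."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    most_common = counts.most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index("e")) % 26

secret = caesar("meet me here when the evening bell echoes", 3)
shift = guess_shift(secret)
print(shift, caesar(secret, -shift))
```

A real attack would compare the whole letter distribution rather than a single letter, but the principle is the same: a fixed shift preserves the frequency profile of the language.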

Grille ciphers and steganography rely as much on the encryption being undetected as the actual message involved. If I give you a large piece of paper with a bunch of Chinese characters, and the characters are written in multiple colors, there are effectively an arbitrarily large number of messages hidden in it, based solely on the holes in the paper and what color filter is in those holes. How easy is that to solve? I dunno. Depends a lot on what outside information you have on what sort of knowledge you're seeking. Secret sharing likewise relies a lot on the words not being assembleable without outside knowledge - if you have all n strips and some basic idea what the message is, I can't imagine it being too hard. Steganography, likewise, if you know there's a message hidden, it's not hard to find.
Beowulf
The Patrician
Posts: 10619
Joined: 2002-07-04 01:18am
Location: 32ULV

Re: Encryption for non alphabetic languages

Post by Beowulf »

Terralthra wrote:Ciphers in Chinese and other logographic languages could be encrypted with grille ciphers, secret sharing (cutting a message into vertical strips, thus breaking up sentences), mixing and matching the syllables involved in logographs to include a separate message (steganography, in other words), and a couple other techniques. You're right that alphabet-substitution techniques like Caesar and Vigenere ciphers would not work particularly well.
Not quite mentioned, but before the advent of computers, you could still find codebooks that translated your plaintext into another form. Sometimes they were secret, and sometimes not (commercial codes typically weren't). The usual output of a codebook would be in an alphabet of sorts. A common use for these would be for telegraphy. In fact, there's a standard Chinese telegraph code, which maps characters to 4 digit numbers. From there, you could do your standard encryption algorithms. Note: this is now equivalent to getting the Unicode equivalent of the characters and encrypting that, but that's because Unicode is a non-secret codebook.
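
A sketch of the two-step process in Python: a public codebook turns characters into 4-digit groups, then a secret key is applied to the digits. The codes below are invented for the example and are NOT the real Chinese telegraph code values; the digit-by-digit key addition is likewise just one simple choice of "standard encryption".

```python
# Hypothetical 4-digit codes, made up for this example (not the real telegraph code).
CODEBOOK = {"今": "0001", "晚": "0002", "見": "0003"}
DECODEBOOK = {code: ch for ch, code in CODEBOOK.items()}

def encode(text):
    """Public step: characters -> string of digits, four per character."""
    return "".join(CODEBOOK[ch] for ch in text)

def decode(digits):
    """Public step reversed: read the digits back four at a time."""
    return "".join(DECODEBOOK[digits[i:i + 4]] for i in range(0, len(digits), 4))

def add_key(digits, key, sign=1):
    """Secret step: add (or subtract) repeating key digits, modulo 10, no carrying."""
    return "".join(str((int(d) + sign * int(key[i % len(key)])) % 10)
                   for i, d in enumerate(digits))

digits = encode("今晚見")
cipher = add_key(digits, "3141")                  # hypothetical secret key
plain = decode(add_key(cipher, "3141", sign=-1))
print(digits, cipher, plain)
```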
"preemptive killing of cops might not be such a bad idea from a personal saftey[sic] standpoint..." --Keevan Colton
"There's a word for bias you can't see: Yours." -- William Saletan
Ziggy Stardust
Sith Devotee
Posts: 3114
Joined: 2006-09-10 10:16pm
Location: Research Triangle, NC

Re: Encryption for non alphabetic languages

Post by Ziggy Stardust »

Terralthra wrote:Ciphers in Chinese and other logographic languages could be encrypted with grille ciphers, secret sharing (cutting a message into vertical strips, thus breaking up sentences), mixing and matching the syllables involved in logographs to include a separate message (steganography, in other words), and a couple other techniques. You're right that alphabet-substitution techniques like Caesar and Vigenere ciphers would not work particularly well.
Not to be pedantic, but technically in Chinese specifically wouldn't secret sharing entail cutting the message into HORIZONTAL strips, since they traditionally wrote vertically?
Terralthra
Requiescat in Pace
Posts: 4741
Joined: 2007-10-05 09:55pm
Location: San Francisco, California, United States

Re: Encryption for non alphabetic languages

Post by Terralthra »

Ziggy Stardust wrote:
Terralthra wrote:Ciphers in Chinese and other logographic languages could be encrypted with grille ciphers, secret sharing (cutting a message into vertical strips, thus breaking up sentences), mixing and matching the syllables involved in logographs to include a separate message (steganography, in other words), and a couple other techniques. You're right that alphabet-substitution techniques like Caesar and Vigenere ciphers would not work particularly well.
Not to be pedantic, but technically in Chinese specifically wouldn't secret sharing entail cutting the message into HORIZONTAL strips, since they traditionally wrote vertically?
Yes, it would, and yes, it's pedantic. :)
Sea Skimmer
Yankee Capitalist Air Pirate
Posts: 37389
Joined: 2002-07-03 11:49pm
Location: Passchendaele City, HAB

Re: Encryption for non alphabetic languages

Post by Sea Skimmer »

Digital information is a 1 or a 0. What a human reads as a script is irrelevant to the encryption method; code in your operating system handles the conversion from the 1/0 crap to a language. The point is how you scramble the 1/0 stuff.

Now if you're talking about older pre-computer material then yeah, it can get annoying, but really all a language like Mandarin Chinese means is that your code book will be much thicker for any given method vs English. Any number of strategies will work (against a non-computerized enemy) to provide a useful cypher. Once computers are involved the language really doesn't matter; the complexity of possible cypher methods is far greater than that of the languages.
"This cult of special forces is as sensible as to form a Royal Corps of Tree Climbers and say that no soldier who does not wear its green hat with a bunch of oak leaves stuck in it should be expected to climb a tree"
— Field Marshal William Slim 1956
Zixinus
Emperor's Hand
Posts: 6663
Joined: 2007-06-19 12:48pm
Location: In Seth the Blitzspear

Re: Encryption for non alphabetic languages

Post by Zixinus »

Computers have to have "alphabetized" non-alphabetic characters like Chinese characters; they must, in order to render them at all. To a computer, they are just characters. Encryption, whether digital or not, should actually be somewhat easier because you have more raw variety of information to jumble around (which is roughly what encryption is). It means messages would be bigger, but that's already a given with such writing systems.
Credo!
Chat with me on Skype if you want to talk about writing, ideas or if you want a test-reader! PM for address.
Sea Skimmer
Yankee Capitalist Air Pirate
Posts: 37389
Joined: 2002-07-03 11:49pm
Location: Passchendaele City, HAB

Re: Encryption for non alphabetic languages

Post by Sea Skimmer »

The message size is kinda irrelevant; good encryption methods always employed lots of padding so that the enemy cannot infer the message meaning by its length or format, or easily exploit partly broken codes. A classic human-processed example of how to do that is to attach a bunch of names from the phone book to each end of the original text, easily ignored once decrypted. Prior to fully computerized systems, though, one's ability to use padding was more constrained, because for important communications encryption/decryption time begins to matter, say Morse radio communications between admirals at sea during operations. Errors also become a problem.

If you're working by hand or simple machine, Chinese etc. style characters are going to be a pain in the ass to work with as a practical matter, which will increase the probability of errors. Or of code operators doing things they shouldn't, like using the same code book page each day to make life easier. Japan was incredibly bad at this in WW2. The codes themselves were pretty good, but operator discipline was very poor, particularly on civilian ships. Amusingly though, they were also pretty bad at precise navigation, so many US decrypts of ship positions proved to be useless to US submarines, because the ship was that wrong about where it was!

Once you go digital computer, problems like this are much lessened, but not eliminated.
"This cult of special forces is as sensible as to form a Royal Corps of Tree Climbers and say that no soldier who does not wear its green hat with a bunch of oak leaves stuck in it should be expected to climb a tree"
— Field Marshal William Slim 1956
Zeropoint
Jedi Knight
Posts: 581
Joined: 2013-09-14 01:49am

Re: Encryption for non alphabetic languages

Post by Zeropoint »

You could just make a list of characters that assigns a numeric code to each character, and then freely distribute that list, and just encrypt the sequence of numeric codes that make up a message. I would assume that making a list of all relevant characters would be a pain, but probably worth it for secure communications.

Wow, 4000 characters. The list is going to be a small book on its own.
I'm a cis-het white male, and I oppose racism, sexism, homophobia, and transphobia. I support treating all humans equally.

When fascism came to America, it was wrapped in the flag and carrying a cross.

That which will not bend must break and that which can be destroyed by truth should never be spared its demise.
Beowulf
The Patrician
Posts: 10619
Joined: 2002-07-04 01:18am
Location: 32ULV

Re: Encryption for non alphabetic languages

Post by Beowulf »

Zeropoint wrote:You could just make a list of characters that assigns a numeric code to each character, and then freely distribute that list, and just encrypt the sequence of numeric codes that make up a message. I would assume that making a list of all relevant characters would be a pain, but probably worth it for secure communications.

Wow, 4000 characters. The list is going to be a small book on its own.
Chinese telegraphic code? 7000 characters in a 100 page book. It's organized similarly to Chinese dictionaries.
"preemptive killing of cops might not be such a bad idea from a personal saftey[sic] standpoint..." --Keevan Colton
"There's a word for bias you can't see: Yours." -- William Saletan
LadyTevar
White Mage
Posts: 23184
Joined: 2003-02-12 10:59pm

Re: Encryption for non alphabetic languages

Post by LadyTevar »

There was also the 'shared codebook' route, which would work for non-alphabetic languages. Use a famous book of poetry, or a treatise that wouldn't seem out of place in anyone's home or office. The code is a series of numbers that refers to the page and word/character on that page, or perhaps a whole phrase, which then has hidden meanings of its own.
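
A small Python sketch of the page-and-position scheme. Two lines of a well-known classical poem (Li Bai's 靜夜思) stand in for the "pages" of the shared book; the function names are made up, and this toy version always points at the first occurrence of a character, which a careful user would vary.

```python
# Each string plays the role of one page of the shared book.
PAGES = ["床前明月光疑是地上霜", "舉頭望明月低頭思故鄉"]

def build_index(pages):
    """Map each character to the (page, position) of its first occurrence, 1-based."""
    index = {}
    for p, page in enumerate(pages, start=1):
        for pos, ch in enumerate(page, start=1):
            index.setdefault(ch, (p, pos))
    return index

def encipher(message, pages):
    """Replace each character with a (page, position) pair; KeyError if not in the book."""
    index = build_index(pages)
    return [index[ch] for ch in message]

def decipher(code, pages):
    """Look each (page, position) pair up in the same book."""
    return "".join(pages[p - 1][pos - 1] for p, pos in code)

code = encipher("月上故鄉", PAGES)
print(code)
print(decipher(code, PAGES))
```

Anything not in the book cannot be sent, which is one reason to pick a long text or to agree on whole phrases with prearranged meanings.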
Nitram, slightly high on cough syrup: Do you know you're beautiful?
Me: Nope, that's why I have you around to tell me.
Nitram: You -are- beautiful. Anyone tries to tell you otherwise kill them.

"A life is like a garden. Perfect moments can be had, but not preserved, except in memory. LLAP" -- Leonard Nimoy, last Tweet
U.P. Cinnabar
Sith Marauder
Posts: 3845
Joined: 2016-02-05 08:11pm
Location: Aboard the RCS Princess Cecile

Re: Encryption for non alphabetic languages

Post by U.P. Cinnabar »

LadyTevar wrote:There was also the 'shared codebook' route, which would work for non-alphabetic languages. Use a famous book of poetry, or a treatise that wouldn't seem out of place in anyone's home or office. The code is a series of numbers that refers to the page and word/character on that page, or perhaps a whole phrase, which then has hidden meanings of its own.
"To Serve Man" is an excellent cookbook.
"Beware the Beast, Man, for he is the Devil's pawn. Alone amongst God's primates, he kills for sport, for lust, for greed. Yea, he will murder his brother to possess his brother's land. Let him not breed in great numbers, for he will make a desert of his home and yours. Shun him, drive him back into his jungle lair, for he is the harbinger of Death.."
—29th Scroll, 6th Verse of Ape Law
"Indelible in the hippocampus is the laughter. The uproarious laughter between the two, and their having fun at my expense.”
---Doctor Christine Blasey-Ford
Cykeisme
Jedi Council Member
Posts: 2416
Joined: 2004-12-25 01:47pm

Re: Encryption for non alphabetic languages

Post by Cykeisme »

From a purely analytical standpoint, as was already said, generally as long as a language can be encoded into purely numeric sequences, then you can encrypt it, even prior to the advent of computers and formalized methods of representing text in binary sequences.
Once that encoding method is standardized, then encrypting any language would basically be the same.

So, for the example of the Vigenere algorithm, you'd simply need a method to encode any language into numbers, after which you can apply the same encryption algorithm.

Like Sea Skimmer said, this becomes increasingly impractical when fast encryption and decryption are required, because the additional time overhead of performing the encoding and decoding (which is in addition to the encryption/decryption) could become a hindrance during military operations of any era. So, preferably, the encoding would have to be simple.

For example, it would be possible to simplify the language-to-numerals encoding by skipping written characters and assigning numeric values to spoken syllables (even taking into account intonation) for Chinese. However there is a chance of misinterpretation; with the recipient using the context of entire words, phrases and/or sentences, this is unlikely, but still possible.
I suppose the Chinese telegraphic code would be a lossless, but more time consuming (if done without computers) method of encoding?


For the English alphabet, the encoding method for letters is obvious and simple: simply replacing each letter with its position number in the alphabet.
With the simple encoding out of the way, the actual encryption is the Vigenere algorithm itself (adding each plaintext character's sequence number to that of the corresponding character in the key text, wrapping around).
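
As a sketch, the English case looks like this in Python. One assumption differs slightly from the description above: letters are numbered from 0 ("a" = 0) rather than from 1, which makes the wrap-around a plain modulo 26. Lowercase letters only; the function name is made up.

```python
A = ord("a")

def vigenere(text, key, decrypt=False):
    """Add (or subtract) each key letter's number to the text letter's number, mod 26."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        p = ord(ch) - A                      # encoding step: letter -> number
        k = ord(key[i % len(key)]) - A       # key repeats as needed
        out.append(chr((p + sign * k) % 26 + A))
    return "".join(out)

cipher = vigenere("attackatdawn", "lemon")
print(cipher)                                # lxfopvefrnhr
print(vigenere(cipher, "lemon", decrypt=True))
```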


LadyTevar wrote:There was also the 'shared codebook' route, which would work for non-alphabetic languages. Use a famous book of poetry, or a treatise that wouldn't seem out of place in anyone's home or office. The code is a series of numbers that refers to the page and word/character on that page, or perhaps a whole phrase, which then has hidden meanings of its own.
Another good thing is that, as long as the language-to-numbers encoding method for each language is already established (either being well-known, or simply understood by both parties), the plaintext and the codebook don't even have to be in the same language (they can even be languages that share no written characters at all).
After all, once encoded, both plaintext and key text are both just sequences of numbers.
I imagine that in a real situation, this might add just a little bit more obfuscation to counter-intelligence and codebreaking efforts, particularly if one of the language encoding methods is not a commonly known accepted standard.
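
A sketch of that mixed-language case in Python, assuming both sides agree to use Unicode code points as the language-to-numbers encoding and to wrap around at the size of the Unicode code space. Both choices, and the sample plaintext and key, are assumptions made for the example.

```python
MODULUS = 0x110000   # number of Unicode code points; an assumed wrap-around value

def encode(text):
    """Language-to-numbers step: one code point per character."""
    return [ord(ch) for ch in text]

def encrypt(plain_nums, key_nums):
    """Vigenere-style addition of repeating key numbers, wrapping at MODULUS."""
    return [(p + key_nums[i % len(key_nums)]) % MODULUS
            for i, p in enumerate(plain_nums)]

def decrypt(cipher_nums, key_nums):
    """Subtract the same key numbers to undo encrypt()."""
    return [(c - key_nums[i % len(key_nums)]) % MODULUS
            for i, c in enumerate(cipher_nums)]

plain = "今晚見"        # Chinese plaintext
key = "lemon"          # English key text

cipher = encrypt(encode(plain), encode(key))
recovered = "".join(chr(n) for n in decrypt(cipher, encode(key)))
print(cipher, recovered)
```

The ciphertext is kept as a list of numbers rather than turned back into characters, since not every number below the modulus is a printable character.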
"..history has shown the best defense against heavy cavalry are pikemen, so aircraft should mount lances on their noses and fly in tight squares to fend off bombers". - RedImperator

"ha ha, raping puppies is FUN!" - Johonebesus

"It would just be Unicron with pew pew instead of nom nom". - Vendetta, explaining his justified disinterest in the idea of the movie Allspark affecting the Death Star